Keyword Normalization

Keyword normalization automatically detects variations, abbreviations, and misspellings of important terms and replaces them with their canonical forms. This keeps industry jargon, product names, and other domain-specific terminology consistent across transcriptions.
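
To make the idea concrete, here is a minimal local sketch of mapping variations to canonical forms. It is only an illustration: the VARIANTS table and the normalize_keywords function are hypothetical, and the service detects variations on its own rather than relying on a hand-written table like this one.

```python
import re

# Hypothetical variant table for illustration only (keys must be lowercase).
VARIANTS = {
    "cust serv": "customer service",
    "rep": "representative",
    "tech support": "technical support",
}

def normalize_keywords(text, variants):
    """Replace each known variant with its canonical form, matching whole words only."""
    # Put longer variants first so multi-word phrases win over shorter ones.
    ordered = sorted(variants, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(v) for v in ordered) + r")\b",
        re.IGNORECASE,
    )
    # Look up the canonical form by the lowercased match.
    return pattern.sub(lambda m: variants[m.group(1).lower()], text)

print(normalize_keywords("The cust serv rep helped me with tech support", VARIANTS))
# The customer service representative helped me with technical support
```

The word-boundary anchors keep short variants such as "rep" from matching inside longer words such as "reply".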

You can apply keyword normalization during transcription or to existing text:

Option 1: Keyword Normalization During Transcription

Request:

Replace "your-api-key" in the request below with your own secret API key.
import requests

url = "https://tb2.shunyalabs.ai/v1/transcriptions"
headers = {"X-API-Key": "your-api-key"}

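# Upload the audio as multipart form data; the options travel as ordinary form fields.
with open("customer_call.wav", "rb") as f:
    files = {"file": f}
    data = {
        # Form values are strings: the flag is "true" and the keyword list is a JSON array string.
        "enable_keyterm_normalization": "true",
        "keyterm_keywords": '["customer service", "representative", "technical support"]'
    }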

    response = requests.post(url, headers=headers, files=files, data=data)
    print(response.json())
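
The keyterm_keywords value in the request above is a JSON array written by hand as a string. A minimal sketch of building the same form fields from a Python list follows, assuming the endpoint accepts exactly the string format shown; build_normalization_fields is a hypothetical helper name, not part of the API.

```python
import json

def build_normalization_fields(keywords):
    """Return the form fields that turn on keyword normalization (hypothetical helper)."""
    keywords = list(keywords)
    if not keywords:
        raise ValueError("keywords must contain at least one term")
    return {
        "enable_keyterm_normalization": "true",
        # json.dumps produces a valid JSON array string with double-quoted terms.
        "keyterm_keywords": json.dumps(keywords),
    }

fields = build_normalization_fields(
    ["customer service", "representative", "technical support"]
)
print(fields["keyterm_keywords"])
```

The returned dictionary can be passed as data= to requests.post in place of the hand-written one, which avoids quoting mistakes when the keyword list changes.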

Example Output:

{
  "success": true,
  "text": "I contacted the customer service representative about my account and they proved helpful",
  "segments": [
    {
      "start": 0.0,
      "end": 8.5,
      "text": "I contacted the customer service representative about my account and they proved helpful",
      "speaker": "SPEAKER_00"
    }
  ]
}

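Note: Spoken variations such as "cust serv rep" or "tech support" are automatically normalized to the canonical forms listed in keyterm_keywords. In the output above, for example, the text reads "customer service representative".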
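
The example output above can also be read segment by segment. This is a short sketch, assuming responses carry the segments, start, end, text, and speaker fields shown in the example; format_segments is a hypothetical helper, and the sample dictionary is a shortened copy of the example output.

```python
def format_segments(result):
    """Return one readable line per segment: time range, speaker, normalized text."""
    lines = []
    for seg in result.get("segments", []):
        lines.append(
            f'[{seg["start"]:.1f}s-{seg["end"]:.1f}s] {seg["speaker"]}: {seg["text"]}'
        )
    return lines

# Shortened copy of the example output, used here instead of a live response.
sample = {
    "success": True,
    "segments": [
        {
            "start": 0.0,
            "end": 8.5,
            "text": "I contacted the customer service representative",
            "speaker": "SPEAKER_00",
        }
    ],
}

for line in format_segments(sample):
    print(line)
```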

Option 2: Keyword Normalization with Standalone Text

Request:

Replace "your-api-key" in the request below with your own secret API key.
import requests

url = "https://tb.shunyalabs.ai/v1/text_intelligence"
headers = {"X-API-Key": "your-api-key"}

data = {
    "text": "The cust serv rep helped me with tech support",
    "keywords": '["customer service", "representative", "technical support"]'
}

response = requests.post(url, headers=headers, data=data)
print(response.json())

Example Output:

{
  "processed_text": "The customer service representative helped me with technical support"
}
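
One simple way to sanity-check a result is to see which canonical keywords actually appear in processed_text. The sketch below assumes the response has the processed_text field shown above; missing_keywords is a hypothetical helper. A keyword can be absent simply because the input never mentioned it, so treat the result as a hint rather than an error.

```python
def missing_keywords(response_json, keywords):
    """Return the canonical keywords that do not appear in processed_text."""
    text = response_json.get("processed_text", "").lower()
    return [kw for kw in keywords if kw.lower() not in text]

# Copy of the example output, used here instead of a live response.
sample = {
    "processed_text": "The customer service representative helped me with technical support"
}

print(missing_keywords(sample, ["customer service", "representative", "technical support"]))
# []
```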

Use Cases for Keyword Normalization

  • Brand Consistency: Standardize company and product name mentions
  • Medical Documentation: Normalize medical terminology and abbreviations
  • Technical Documentation: Standardize technical terms and acronyms
  • Legal Transcription: Ensure consistent use of legal terminology
  • Customer Service: Normalize common product references and shorthand
  • Academic Content: Standardize scientific and academic terminology