Transcript Enrichment

Keyterm Normalization

Normalises domain-specific terms (for example, "emi" to "EMI") using Gemini. The corrected transcript is returned in nlp_analysis.normalized_text; the original text is left unchanged.

Python SDK

python

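# Assumes `client` is an already-initialised SDK client and that this
# snippet runs inside an async function (it uses `await`).
config = TranscriptionConfig(
    model="zero-indic",
    enable_keyterm_normalization=True,
    keyterm_keywords=["EMI", "NACH mandate", "bounce charge"],
)
result = await client.asr.transcribe("audio.wav", config=config)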

print(result.text)                           # original, unchanged
# aapki emi ki tarikh paanch august hai
print(result.nlp_analysis.normalized_text)  # corrected
# aapki EMI ki tarikh paanch august hai
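
To see which terms normalization actually changed, the two strings can be compared token by token. The following is a minimal sketch using only the Python standard library; the helper name `changed_terms` is hypothetical and not part of the SDK, and it assumes both strings are whitespace-separated.

```python
from difflib import SequenceMatcher


def changed_terms(original: str, normalized: str) -> list[tuple[str, str]]:
    """Return (before, after) pairs for every token run that differs.

    Hypothetical helper, not part of the SDK. Assumes whitespace tokenisation.
    """
    a, b = original.split(), normalized.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Join each differing run back into a phrase for readability.
            changes.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


original = "aapki emi ki tarikh paanch august hai"
normalized = "aapki EMI ki tarikh paanch august hai"
print(changed_terms(original, normalized))
# [('emi', 'EMI')]
```

In practice the two arguments would be result.text and result.nlp_analysis.normalized_text from the call above.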

REST API

terminal

curl -X POST https://asr.shunyalabs.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer <API_KEY>" \
  -F "file=@audio.wav" \
  -F "model=zero-indic" \
  -F "enable_keyterm_normalization=true" \
  -F 'keyterm_keywords=["EMI", "NACH mandate", "bounce charge"]'
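
Note that in the curl call keyterm_keywords is sent as a single form field containing a JSON array, not as repeated fields. When building the same request from Python, the non-file fields could be prepared along these lines. This is a sketch only: `build_form_fields` is a hypothetical helper, and the field names are taken from the curl example above.

```python
import json


def build_form_fields(keywords: list[str], model: str = "zero-indic") -> dict[str, str]:
    """Build the non-file form fields shown in the curl example.

    Hypothetical helper; field names mirror the curl call above.
    """
    return {
        "model": model,
        "enable_keyterm_normalization": "true",
        # The keyword list is JSON-encoded into one string field.
        "keyterm_keywords": json.dumps(keywords),
    }


fields = build_form_fields(["EMI", "NACH mandate", "bounce charge"])
print(fields["keyterm_keywords"])
# ["EMI", "NACH mandate", "bounce charge"]
```

These fields would then be sent as multipart form data together with the audio file and the Authorization header, using whichever HTTP client you prefer.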

Output

json

{
  "text": "aapki emi ki tarikh paanch august hai",
  "nlp_analysis": {
    "normalized_text": "aapki EMI ki tarikh paanch august hai"
  }
}
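
A consumer of this response may want the normalized transcript when it is present and the original otherwise. Here is a minimal sketch, assuming the response has the shape shown above; whether nlp_analysis is omitted or null when the feature is disabled is an assumption, and `preferred_text` is a hypothetical helper name.

```python
import json


def preferred_text(response: dict) -> str:
    """Return normalized_text if available, else fall back to text.

    Hypothetical helper. Assumes nlp_analysis may be missing or null
    when normalization is not enabled.
    """
    nlp = response.get("nlp_analysis") or {}
    return nlp.get("normalized_text") or response["text"]


body = json.loads("""
{
  "text": "aapki emi ki tarikh paanch august hai",
  "nlp_analysis": {
    "normalized_text": "aapki EMI ki tarikh paanch august hai"
  }
}
""")
print(preferred_text(body))
# aapki EMI ki tarikh paanch august hai
```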
