Transcript Enrichment

Keyterm Normalization

Normalizes domain-specific terms using Gemini. The normalized transcript is returned in nlp_analysis.normalized_text; the original text is left unchanged.


Python SDK

python
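# Assumes TranscriptionConfig is imported from the SDK and `client` is an
# already-initialised async client; run this inside an async function.
config = TranscriptionConfig(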
    model="zero-indic",
    enable_keyterm_normalization=True,
    keyterm_keywords=["EMI", "NACH mandate", "bounce charge"],
)
result = await client.asr.transcribe("audio.wav", config=config)

print(result.text)                           # original, unchanged
# aapki emi ki tarikh paanch august hai
print(result.nlp_analysis.normalized_text)  # corrected
# aapki EMI ki tarikh paanch august hai

REST API

terminal
curl -X POST https://asr.shunyalabs.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer <API_KEY>" \
  -F "file=@audio.wav" \
  -F "model=zero-indic" \
  -F "enable_keyterm_normalization=true" \
  -F 'keyterm_keywords=["EMI", "NACH mandate", "bounce charge"]'
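When the request is built programmatically rather than with curl, the keyword list has to be serialised into the same JSON-array string shown above. The following is a minimal sketch using only the standard library. `build_form_fields` is a hypothetical helper, not part of the SDK, and the field names are copied from the curl command; the audio file part is left out.

```python
import json


def build_form_fields(keywords, model="zero-indic"):
    """Return the non-file form fields used in the curl example.

    Hypothetical helper for illustration only; not part of the SDK.
    """
    return {
        "model": model,
        # Form values are strings, so the flag is sent as the text "true".
        "enable_keyterm_normalization": "true",
        # The whole keyword list travels as one JSON-array string.
        "keyterm_keywords": json.dumps(keywords),
    }


fields = build_form_fields(["EMI", "NACH mandate", "bounce charge"])
print(fields["keyterm_keywords"])
# ["EMI", "NACH mandate", "bounce charge"]
```

Serialising with json.dumps keeps quoting correct for multi-word terms such as "NACH mandate", which is easy to get wrong when the string is assembled by hand.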

Output

json
{
  "text": "aapki emi ki tarikh paanch august hai",
  "nlp_analysis": {
    "normalized_text": "aapki EMI ki tarikh paanch august hai"
  }
}
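To see what normalization changed, the two fields can be compared token by token. The following is a minimal sketch that works on the response body shown above. `changed_tokens` is a hypothetical helper, and falling back to the original text when nlp_analysis is absent is an assumption, not documented behaviour.

```python
import json


def changed_tokens(response):
    """Return (original, normalized) pairs for the tokens that differ.

    Hypothetical helper for illustration only; not part of the SDK.
    """
    original = response["text"]
    # Assumption: if nlp_analysis is missing or null, treat the text as unchanged.
    analysis = response.get("nlp_analysis") or {}
    normalized = analysis.get("normalized_text", original)

    orig_tokens = original.split()
    norm_tokens = normalized.split()
    # If the token counts differ, a positional comparison is not meaningful,
    # so report the two full strings instead.
    if len(orig_tokens) != len(norm_tokens):
        return [(original, normalized)]
    return [(o, n) for o, n in zip(orig_tokens, norm_tokens) if o != n]


body = """
{
  "text": "aapki emi ki tarikh paanch august hai",
  "nlp_analysis": {
    "normalized_text": "aapki EMI ki tarikh paanch august hai"
  }
}
"""
response = json.loads(body)
print(changed_tokens(response))
# [('emi', 'EMI')]
```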