Shunyalabs Speech-to-Text
A high-accuracy, low-latency speech recognition API built specifically for Indian languages. Transcribe pre-recorded audio over HTTP or stream live audio over WebSocket. Batch transcription also includes built-in NLP such as diarization, sentiment, and intent detection.
Choose your path
Before you start, decide which transcription mode fits your use case.
Batch transcription
Send an audio file, buffer, or remote URL via HTTP POST and receive a full structured transcript. Best for offline audio processing.
Real-time streaming
Open a persistent WebSocket connection and receive transcripts as audio is spoken. Partial results arrive within milliseconds.
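Streaming input is sent as raw PCM audio chunks, so client code typically slices a byte buffer into fixed-duration pieces before writing them to the socket. The sketch below covers only that arithmetic. The 16 kHz, 16-bit mono format, the 100 ms chunk duration, and the helper name `pcm_chunks` are assumptions for illustration, not documented Shunyalabs defaults.

```python
from typing import Iterator


def pcm_chunks(
    pcm: bytes,
    sample_rate: int = 16_000,  # assumed sample rate in Hz
    sample_width: int = 2,      # assumed 16-bit samples (2 bytes each)
    channels: int = 1,          # assumed mono audio
    chunk_ms: int = 100,        # assumed chunk duration in milliseconds
) -> Iterator[bytes]:
    """Yield consecutive fixed-duration slices of a raw PCM buffer."""
    frame_bytes = sample_width * channels
    # Bytes per chunk = frames per chunk * bytes per frame.
    chunk_bytes = sample_rate * chunk_ms // 1000 * frame_bytes
    if chunk_bytes <= 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    for start in range(0, len(pcm), chunk_bytes):
        # The final chunk may be shorter than chunk_bytes.
        yield pcm[start:start + chunk_bytes]


# One second of silence in the assumed format is 32,000 bytes.
one_second = bytes(16_000 * 2)
chunks = list(pcm_chunks(one_second))
print(len(chunks), len(chunks[0]))  # 10 3200
```

Smaller chunks generally reduce the delay before the first partial result at the cost of more messages per second, which is the trade-off the chunk-size setting exposes.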
Key differentiators
Indic-first accuracy
Purpose-built model (zero-indic) with significantly lower word error rate (WER) on Hindi, Telugu, Kannada, Bengali, and more.
Low-latency streaming
Partial transcripts emitted in real time. Configurable chunk size and silence threshold for latency tuning.
Built-in NLP
Diarization, sentiment, intent, emotion, summarization, and translation, all in one batch API call with no separate pipeline.
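The WER figure mentioned above is the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. The sketch below is a generic way to measure it on your own test set. It is not part of the Shunyalabs SDK, and the function name `word_error_rate` is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of five reference words.
print(word_error_rate("आज मौसम बहुत अच्छा है", "आज मौसम बहुत अच्छा था"))  # 0.2
```

Whitespace tokenization is a simplification. Normalizing punctuation and numerals before scoring usually matters for a fair comparison across engines.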
Supported languages
All 55 listed languages run on the zero-indic model (Manipuri and Meitei share the code mni). Use language_code="auto" for automatic language detection.
| Language | language_code | Model | Status |
|---|---|---|---|
| Ahirani | ahr | zero-indic | Available |
| Assamese | as | zero-indic | Available |
| Awadhi | awa | zero-indic | Available |
| Bagheli | bfy | zero-indic | Available |
| Bagri | bgq | zero-indic | Available |
| Banjari | bwq | zero-indic | Available |
| Bengali | bn | zero-indic | Available |
| Bhili | bhb | zero-indic | Available |
| Bhojpuri | bho | zero-indic | Available |
| Bodo | brx | zero-indic | Available |
| Braj | bra | zero-indic | Available |
| Bundeli | bns | zero-indic | Available |
| Chhattisgarhi | hne | zero-indic | Available |
| Dogri | doi | zero-indic | Available |
| English | en | zero-indic | Available |
| Garhwali | gbm | zero-indic | Available |
| Garo | grt | zero-indic | Available |
| Gujarati | gu | zero-indic | Available |
| Harouti | hoj | zero-indic | Available |
| Haryanvi | bgc | zero-indic | Available |
| Hindi | hi | zero-indic | Available |
| Kachchhi | kfr | zero-indic | Available |
| Kangri | xnr | zero-indic | Available |
| Kannada | kn | zero-indic | Available |
| Kashmiri | ks | zero-indic | Available |
| Khortha | ktk | zero-indic | Available |
| Kodava | kfa | zero-indic | Available |
| Konkani | kok | zero-indic | Available |
| Kumaoni | kfy | zero-indic | Available |
| Kurukh | kru | zero-indic | Available |
| Lambadi | lmn | zero-indic | Available |
| Magahi | mag | zero-indic | Available |
| Maithili | mai | zero-indic | Available |
| Malayalam | ml | zero-indic | Available |
| Manipuri | mni | zero-indic | Available |
| Marathi | mr | zero-indic | Available |
| Marwadi | mwr | zero-indic | Available |
| Meitei | mni | zero-indic | Available |
| Mewari | mtr | zero-indic | Available |
| Nepali | ne | zero-indic | Available |
| Nimadi | noe | zero-indic | Available |
| Odia | or | zero-indic | Available |
| Pahari Mahasui | him | zero-indic | Available |
| Punjabi | pa | zero-indic | Available |
| Rajasthani | raj | zero-indic | Available |
| Sambalpuri | spv | zero-indic | Available |
| Sanskrit | sa | zero-indic | Available |
| Santali | sat | zero-indic | Available |
| Sindhi | sd | zero-indic | Available |
| Surgujia | sgj | zero-indic | Available |
| Tamil | ta | zero-indic | Available |
| Telugu | te | zero-indic | Available |
| Tulu | tcy | zero-indic | Available |
| Urdu | ur | zero-indic | Available |
| Wagdi | wbr | zero-indic | Available |
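If your application stores language names rather than codes, a small lookup can map them to `language_code` values and fall back to `"auto"` for anything unlisted. The sketch below copies a handful of rows from the table above. The dictionary and the `resolve_language_code` helper are illustrative and not part of the SDK.

```python
from typing import Optional

# A few rows copied from the table; extend with the languages you need.
LANGUAGE_CODES = {
    "bengali": "bn",
    "english": "en",
    "hindi": "hi",
    "kannada": "kn",
    "tamil": "ta",
    "telugu": "te",
}


def resolve_language_code(name: Optional[str]) -> str:
    """Map a language name to its code, or "auto" when it is unknown."""
    if not name:
        return "auto"
    return LANGUAGE_CODES.get(name.strip().lower(), "auto")


print(resolve_language_code("Hindi"))    # hi
print(resolve_language_code("Klingon"))  # auto
```

Falling back to `"auto"` keeps requests valid for unexpected input, though passing an explicit code is the safer choice when the language is known in advance.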
Minimal example
import asyncio

from shunyalabs import AsyncShunyaClient
from shunyalabs.asr import TranscriptionConfig


async def main():
    # The client is an async context manager, so it must run inside a coroutine.
    async with AsyncShunyaClient() as client:
        result = await client.asr.transcribe(
            "audio.wav",
            config=TranscriptionConfig(model="zero-indic"),
        )
        print(result.text)
        # "नमस्ते, आज मौसम बहुत अच्छा है।"


asyncio.run(main())
Set the SHUNYALABS_API_KEY environment variable and omit api_key=. The SDK picks it up automatically.
How the API works
| Aspect | Pre-recorded (Batch) | Streaming |
|---|---|---|
| Transport | HTTP POST | WebSocket |
| Input | File, bytes, or URL | Raw PCM audio chunks |
| Response | Single structured response | Event stream (PARTIAL → FINAL) |
| Config object | TranscriptionConfig | StreamingConfig |
| NLP features | All features available | Core transcription only |
| Best for | Recordings, files, pipelines | Voice agents, live captions, IVR |
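In streaming mode the transcript arrives as an event stream in which PARTIAL results are later superseded by a FINAL one. A common client-side pattern is to keep finalized segments and overwrite the in-progress partial on every update. The sketch below assumes each event is a dict with `type` and `text` keys. That event shape and the `TranscriptBuffer` class are hypothetical, so check the SDK's actual event objects before relying on it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TranscriptBuffer:
    """Accumulates FINAL segments and tracks the latest PARTIAL text."""

    finals: List[str] = field(default_factory=list)
    partial: str = ""

    def handle(self, event: dict) -> None:
        # Assumed event shape: {"type": "PARTIAL" | "FINAL", "text": str}
        kind = event.get("type")
        text = event.get("text", "")
        if kind == "PARTIAL":
            # Each partial replaces the previous one for the same segment.
            self.partial = text
        elif kind == "FINAL":
            # A final closes the segment and clears the pending partial.
            self.finals.append(text)
            self.partial = ""

    def display(self) -> str:
        """Return finalized text followed by any in-progress partial."""
        return " ".join(part for part in [*self.finals, self.partial] if part)


buffer = TranscriptBuffer()
for event in [
    {"type": "PARTIAL", "text": "नमस्ते"},
    {"type": "PARTIAL", "text": "नमस्ते, आज"},
    {"type": "FINAL", "text": "नमस्ते, आज मौसम बहुत अच्छा है।"},
    {"type": "PARTIAL", "text": "कल"},
]:
    buffer.handle(event)
print(buffer.display())
```

Keeping partial text separate from finalized text lets a live-caption UI render unstable words differently and avoids committing text that may still change.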