Getting Started

Shunyalabs Speech-to-Text

A high-accuracy, low-latency speech recognition API built for Indian languages. Transcribe pre-recorded audio over HTTP or stream live audio over WebSocket. Built-in NLP, including diarization, sentiment, and intent detection, is available for pre-recorded audio; streaming returns core transcription only.


Choose your path

Before you start, decide which transcription mode fits your use case.

Pre-recorded

Batch transcription

Send an audio file, buffer, or remote URL via HTTP POST and receive a full structured transcript. Best for offline audio processing.

Typical uses: call recordings, podcasts, video files, voice notes

Streaming

Real-time streaming

Open a persistent WebSocket connection and receive transcripts as audio is spoken. Partial results arrive within milliseconds.

Typical uses: voice agents, live captions, IVR, real-time assist

Key differentiators

Indic-first accuracy

Purpose-built model (zero-indic) with significantly lower word error rate (WER) on Hindi, Telugu, Kannada, Bengali, and more.

Low-latency streaming

Partial transcripts emitted in real time. Configurable chunk size and silence threshold for latency tuning.

Built-in NLP

Diarization, sentiment, intent, emotion, summarization, and translation in a single API call, with no separate pipeline.
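
To make the chunk size mentioned under low-latency streaming concrete, here is a minimal sketch of how raw PCM audio might be split into fixed-duration chunks before being sent over the WebSocket. The 16 kHz, 16-bit mono format and the helper name chunk_pcm are assumptions made for illustration; they are not part of the Shunyalabs SDK, and the streaming reference should be consulted for the formats the service actually accepts.

```python
from typing import Iterator


def chunk_pcm(
    pcm: bytes,
    chunk_ms: int = 100,
    sample_rate: int = 16_000,  # assumed sample rate (Hz)
    sample_width: int = 2,      # assumed 16-bit samples (bytes per sample)
    channels: int = 1,          # assumed mono audio
) -> Iterator[bytes]:
    """Yield consecutive slices of raw PCM, each covering chunk_ms of audio."""
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    frame_bytes = sample_width * channels
    chunk_bytes = (sample_rate * chunk_ms // 1000) * frame_bytes
    if chunk_bytes == 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    # The last chunk may be shorter than chunk_bytes.
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


# One second of silence at the assumed format: 16,000 samples * 2 bytes.
one_second = bytes(16_000 * 2)
chunks = list(chunk_pcm(one_second, chunk_ms=100))
print(len(chunks), len(chunks[0]))  # 10 3200
```

Smaller chunks generally reduce the delay before the first partial transcript but increase the number of messages sent, so the chunk duration is worth tuning against the latency target.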


Supported languages

All 55 languages run on the zero-indic model. Use language_code="auto" for automatic detection.

Language        language_code  Model       Status
Ahirani         ahr            zero-indic  Available
Assamese        as             zero-indic  Available
Awadhi          awa            zero-indic  Available
Bagheli         bfy            zero-indic  Available
Bagri           bgq            zero-indic  Available
Banjari         bwq            zero-indic  Available
Bengali         bn             zero-indic  Available
Bhili           bhb            zero-indic  Available
Bhojpuri        bho            zero-indic  Available
Bodo            brx            zero-indic  Available
Braj            bra            zero-indic  Available
Bundeli         bns            zero-indic  Available
Chhattisgarhi   hne            zero-indic  Available
Dogri           doi            zero-indic  Available
English         en             zero-indic  Available
Garhwali        gbm            zero-indic  Available
Garo            grt            zero-indic  Available
Gujarati        gu             zero-indic  Available
Harouti         hoj            zero-indic  Available
Haryanvi        bgc            zero-indic  Available
Hindi           hi             zero-indic  Available
Kachchhi        kfr            zero-indic  Available
Kangri          xnr            zero-indic  Available
Kannada         kn             zero-indic  Available
Kashmiri        ks             zero-indic  Available
Khortha         ktk            zero-indic  Available
Kodava          kfa            zero-indic  Available
Konkani         kok            zero-indic  Available
Kumaoni         kfy            zero-indic  Available
Kurukh          kru            zero-indic  Available
Lambadi         lmn            zero-indic  Available
Magahi          mag            zero-indic  Available
Maithili        mai            zero-indic  Available
Malayalam       ml             zero-indic  Available
Manipuri        mni            zero-indic  Available
Marathi         mr             zero-indic  Available
Marwadi         mwr            zero-indic  Available
Meitei          mni            zero-indic  Available
Mewari          mtr            zero-indic  Available
Nepali          ne             zero-indic  Available
Nimadi          noe            zero-indic  Available
Odia            or             zero-indic  Available
Pahari Mahasui  him            zero-indic  Available
Punjabi         pa             zero-indic  Available
Rajasthani      raj            zero-indic  Available
Sambalpuri      spv            zero-indic  Available
Sanskrit        sa             zero-indic  Available
Santali         sat            zero-indic  Available
Sindhi          sd             zero-indic  Available
Surgujia        sgj            zero-indic  Available
Tamil           ta             zero-indic  Available
Telugu          te             zero-indic  Available
Tulu            tcy            zero-indic  Available
Urdu            ur             zero-indic  Available
Wagdi           wbr            zero-indic  Available
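
As a small illustration of how language_code values from the table might be handled in application code, the sketch below normalizes a user-supplied code and falls back to "auto" when it is not recognized. The helper resolve_language_code and the abbreviated SUPPORTED_CODES set are hypothetical; only the codes themselves and the "auto" value come from the table and text above.

```python
from typing import Optional

# Abbreviated subset of the codes listed in the table (illustrative only).
SUPPORTED_CODES = frozenset(
    {"as", "bn", "en", "gu", "hi", "kn", "ml", "mr", "ta", "te", "ur"}
)


def resolve_language_code(requested: Optional[str]) -> str:
    """Return a known language_code, or "auto" to request automatic detection."""
    if not requested:
        return "auto"
    code = requested.strip().lower()
    return code if code in SUPPORTED_CODES else "auto"


print(resolve_language_code("HI"))   # hi
print(resolve_language_code("xx"))   # auto
print(resolve_language_code(None))   # auto
```

Falling back silently is a design choice; when the spoken language is known in advance, raising an error on an unrecognized code may be the safer behaviour.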

Minimal example

python
import asyncio

from shunyalabs import AsyncShunyaClient
from shunyalabs.asr import TranscriptionConfig


async def main() -> None:
    # No api_key= is passed here; see the environment variable note below.
    async with AsyncShunyaClient() as client:
        result = await client.asr.transcribe(
            "audio.wav",
            config=TranscriptionConfig(model="zero-indic"),
        )
        print(result.text)
        # "नमस्ते, आज मौसम बहुत अच्छा है।"


asyncio.run(main())

Environment variable:
Set SHUNYALABS_API_KEY and omit the api_key= argument. The SDK picks the key up automatically.
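
The note above describes what the SDK does on its own. Application code that wants to fail early with a clear message can check the variable before creating the client. The following is a minimal sketch using only the standard library; the helper name require_api_key is hypothetical and not part of the SDK.

```python
import os


def require_api_key(env_var: str = "SHUNYALABS_API_KEY") -> str:
    """Return the API key from the environment, or raise a descriptive error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it before creating the client"
        )
    return key
```

Calling require_api_key() once at startup surfaces a missing key immediately rather than at the first transcription request.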

How the API works

Aspect          Pre-recorded (Batch)          Streaming
Transport       HTTP POST                     WebSocket
Input           File, bytes, or URL           Raw PCM audio chunks
Response        Single structured response    Event stream (PARTIAL → FINAL)
Config object   TranscriptionConfig           StreamingConfig
NLP features    All features available        Core transcription only
Best for        Recordings, files, pipelines  Voice agents, live captions, IVR
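
The PARTIAL → FINAL event stream in the comparison above can be consumed with a small accumulator: partial results replace the in-progress text, while final results are committed to the transcript. The sketch below assumes each event is a dict with "type" and "text" keys and that each PARTIAL supersedes the previous one; that event shape and the TranscriptAccumulator class are hypothetical stand-ins, not the SDK's actual event objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TranscriptAccumulator:
    """Collects streaming events into committed text plus a live preview."""

    finals: List[str] = field(default_factory=list)
    pending: str = ""

    def handle(self, event: Dict[str, str]) -> None:
        if event["type"] == "PARTIAL":
            # Assumed semantics: a partial replaces the previous partial.
            self.pending = event["text"]
        elif event["type"] == "FINAL":
            # A final result commits the segment and clears the preview.
            self.finals.append(event["text"])
            self.pending = ""

    @property
    def display_text(self) -> str:
        """Committed segments followed by the current partial, if any."""
        parts = self.finals + ([self.pending] if self.pending else [])
        return " ".join(parts)


acc = TranscriptAccumulator()
for ev in [
    {"type": "PARTIAL", "text": "hello"},
    {"type": "PARTIAL", "text": "hello wor"},
    {"type": "FINAL", "text": "hello world"},
    {"type": "PARTIAL", "text": "how are"},
]:
    acc.handle(ev)

print(acc.display_text)  # hello world how are
```

Keeping committed and provisional text separate lets a caption or agent UI redraw only the unstable tail as new partials arrive.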

Next steps