API Reference

Batch Transcription Endpoint

Full reference for POST /v1/audio/transcriptions.


Request — multipart/form-data

Field                          Required  Type        Description
model                          Yes       string      zero-indic
file                           one of    binary      Audio or video file upload.
url                            one of    string      Public audio/video URL.
language_code                  No        string      ISO code or name. Default: auto.
use_vad_chunking               No        bool        Split at speech pauses. Default: true.
chunk_size                     No        int         Fixed chunk seconds (VAD off). Default: 30.
output_script                  No        string      Transliterate output script. Default: auto.
word_timestamps                No        bool        Per-word timestamps + alignment score.
enable_diarization             No        bool        Speaker diarization.
enable_speaker_identification  No        bool        Map speakers to registered names.
project                        No        string      Speaker library namespace + analytics tag.
enable_emotion_diarization     No        bool        Per-segment emotion detection.
enable_intent_detection        No        bool        Intent classification (Gemini).
intent_choices                 No        JSON array  Constrain intent to specific labels.
enable_summarization           No        bool        Transcript summarization (Gemini).
summary_max_length             No        int         Max summary word count. Default: 150.
enable_sentiment_analysis      No        bool        Sentiment analysis (Gemini).
output_language                No        string      Translate to this language (Gemini).
enable_keyterm_normalization   No        bool        Keyterm normalisation (Gemini).
keyterm_keywords               No        JSON array  Terms to focus normalisation on.
enable_profanity_hashing       No        bool        Mask profanity with **** (Gemini).
hash_keywords                  No        JSON array  Mask specific words with **** (regex).
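
The field table above can be turned into a small helper that flattens request options into form fields. This is a minimal sketch, not an official client: the function name `build_form_fields` is hypothetical, and it assumes (the reference does not say) that booleans are sent as "true"/"false", that JSON array fields are sent as JSON-encoded strings, and that "one of" means exactly one of file or url.

```python
import json
from typing import Any, Dict, Optional

ENDPOINT_PATH = "/v1/audio/transcriptions"  # path taken from the reference


def build_form_fields(
    model: str = "zero-indic",
    url: Optional[str] = None,
    has_file: bool = False,
    **options: Any,
) -> Dict[str, str]:
    """Flatten request options into string form fields.

    Assumptions (not confirmed by the reference): booleans are encoded
    as "true"/"false" and JSON array fields as JSON-encoded strings.
    """
    # The table marks `file` and `url` as "one of"; this sketch reads
    # that as "exactly one" and rejects anything else.
    if (url is not None) == has_file:
        raise ValueError("provide exactly one of `file` or `url`")

    fields: Dict[str, str] = {"model": model}
    if url is not None:
        fields["url"] = url

    for name, value in options.items():
        if value is None:
            continue  # omit the field so the server default applies
        if isinstance(value, bool):  # checked before int: bool is an int subclass
            fields[name] = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            fields[name] = json.dumps(list(value))
        else:
            fields[name] = str(value)
    return fields


fields = build_form_fields(
    url="https://example.com/call.mp3",  # placeholder URL
    enable_diarization=True,
    use_vad_chunking=False,
    chunk_size=20,  # only meaningful with VAD chunking off, per the table
    intent_choices=["billing", "support"],
)
print(fields)
```

The resulting dictionary could then be handed to whichever HTTP client is in use as the form-data part of a multipart request, with the audio file attached separately when file is used instead of url. How the API key is passed is not covered in this section, so it is left out here.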

Error responses

Status  Meaning               Description
401     authentication_error  Missing or invalid API key.
403     permission_denied     Insufficient permissions.
422     transcription_error   Invalid audio or config parameters.
429     rate_limit_exceeded   Too many requests. Back off and retry.
5xx     server_error          Transient server error. Safe to retry.
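
The retry guidance in the error table can be sketched as a small wrapper that retries only on 429 and 5xx and gives up immediately on 401, 403, and 422. The helper names, the exponential backoff schedule, and the attempt limit are illustrative assumptions; the reference says only to back off and retry. The `send` callable is a hypothetical stand-in for whatever performs the actual HTTP request.

```python
import time
from typing import Callable, Tuple


def is_retryable(status: int) -> bool:
    """429 and 5xx are described as retryable; 401/403/422 are not."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff: base * 2**attempt seconds, capped (illustrative values)."""
    return min(cap, base * (2 ** attempt))


def send_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` until it returns a non-retryable status or attempts run out."""
    status, body = send()
    for attempt in range(max_attempts - 1):
        if not is_retryable(status):
            break
        sleep(backoff_delay(attempt))
        status, body = send()
    return status, body


# Demo with canned responses instead of real network calls.
responses = iter([(429, "rate_limit_exceeded"), (503, "server_error"), (200, "ok")])
delays = []  # records the waits instead of actually sleeping
result = send_with_retry(lambda: next(responses), sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the sketch testable without real waiting. A production client would likely add random jitter to the delay so that many clients do not retry in lockstep.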