API Reference

Batch Synthesis — POST /v1/audio/speech

Send text and receive a complete audio file in a single HTTP round-trip.

Endpoint

PROPERTY	VALUE
Method	`POST`
URL	`https://tts.shunyalabs.ai/v1/audio/speech`
Content-Type	`application/json`
Auth	`Authorization: Bearer <API_KEY>`
Response	Audio bytes in the requested format
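
The table above can be turned into a request along the following lines. This is a minimal sketch using only the Python standard library, not an official client. The helper names `build_speech_request` and `synthesize` are hypothetical, and the 130-second client timeout is an assumption chosen to sit slightly above the 120 s server timeout listed under the status codes.

```python
import json
import urllib.request

API_URL = "https://tts.shunyalabs.ai/v1/audio/speech"


def build_speech_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the batch synthesis endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def synthesize(api_key: str, payload: dict, timeout: float = 130.0) -> bytes:
    """Send the request and return the raw audio bytes from the response body."""
    request = build_speech_request(api_key, payload)
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling `synthesize` with a valid key and a payload containing `model`, `voice`, and `input` should return the audio bytes, which can then be written to a file opened in binary mode.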

Request Parameters

PARAMETER	TYPE	REQUIRED	DESCRIPTION
`model`	`string`	Yes	Model name. Use "zero-indic".
`voice`	`string`	Yes	Speaker voice name.
`input`	`string`	Yes	Text to synthesize. Max 10,000 characters.
`response_format`	`string`	No	Output format. Default: "mp3".
`speed`	`float`	No	Speed multiplier. Range: 0.25-4.0. Default: 1.0.
`language`	`string`	No	ISO 639 language code used for text preprocessing.
`trim_silence`	`bool`	No	Remove leading/trailing silence. Default: false.
`volume_normalization`	`string`	No	"peak" or "loudness".
`background_audio`	`string`	No	Preset name or base64-encoded audio.
`background_volume`	`float`	No	Background volume 0.0-1.0. Default: 0.1.
`reference_wav`	`string`	No	Base64 reference audio for voice cloning.
`reference_text`	`string`	No	Transcript of reference audio. Max 500 chars.
`word_timestamps`	`bool`	No	Return word-level timestamps. Default: false.
`max_tokens`	`int`	No	Max tokens for LLM generation. Default: 2048.
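
Checking the documented limits on the client side can avoid a round-trip that would end in a `400`. The sketch below is one possible way to do that. The function `build_payload` is a hypothetical helper, and it assumes the character limits can be approximated with Python's `len`, which counts Unicode code points and may not match how the server counts.

```python
MAX_INPUT_CHARS = 10_000
MAX_REFERENCE_TEXT_CHARS = 500


def build_payload(voice: str, text: str, *, model: str = "zero-indic", **options) -> dict:
    """Assemble a request body, enforcing the limits listed in the parameter table."""
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} characters")

    speed = options.get("speed")
    if speed is not None and not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")

    background_volume = options.get("background_volume")
    if background_volume is not None and not 0.0 <= background_volume <= 1.0:
        raise ValueError("background_volume must be between 0.0 and 1.0")

    normalization = options.get("volume_normalization")
    if normalization is not None and normalization not in ("peak", "loudness"):
        raise ValueError('volume_normalization must be "peak" or "loudness"')

    reference_text = options.get("reference_text")
    if reference_text is not None and len(reference_text) > MAX_REFERENCE_TEXT_CHARS:
        raise ValueError(f"reference_text exceeds {MAX_REFERENCE_TEXT_CHARS} characters")

    payload = {"model": model, "voice": voice, "input": text}
    # Omit unset options so the server applies its documented defaults.
    payload.update({key: value for key, value in options.items() if value is not None})
    return payload
```

Optional parameters are passed as keyword arguments, for example `build_payload("some-voice", "Hello", speed=1.5, trim_silence=True)`, where `"some-voice"` is a placeholder rather than a real voice name.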

Response Headers

HEADER	DESCRIPTION
`Content-Type`	MIME type of the audio (e.g., `audio/mpeg`).
`Content-Length`	Size of the audio response in bytes.
`X-Request-Id`	Unique request identifier for debugging.
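
These headers are enough to pick a file extension, sanity-check the download size, and record an identifier for support requests. The following is a rough sketch under stated assumptions: `describe_response` is a hypothetical helper, only `audio/mpeg` is confirmed by the table above, and the `audio/wav` entry is an illustrative guess about what the service might return for other formats.

```python
# Only "audio/mpeg" appears in the documentation; "audio/wav" is an assumed example.
EXTENSIONS = {
    "audio/mpeg": ".mp3",
    "audio/wav": ".wav",
}


def describe_response(headers: dict) -> dict:
    """Extract the file extension, expected size, and request ID from response headers."""
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    # Drop any parameters such as "; charset=..." before looking up the MIME type.
    content_type = lowered.get("content-type", "").split(";")[0].strip().lower()
    content_length = lowered.get("content-length")

    return {
        "extension": EXTENSIONS.get(content_type, ".bin"),
        "expected_bytes": int(content_length) if content_length is not None else None,
        "request_id": lowered.get("x-request-id"),
    }
```

Comparing `expected_bytes` with the length of the body actually received is a cheap way to detect a truncated download, and logging `request_id` alongside any failure makes it easier to reference the request when debugging.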

Status Codes

CODE	MEANING	DESCRIPTION
`200`	OK	Audio returned successfully.
`400`	Bad Request	Invalid parameters, input too long, or missing required fields.
`401`	Unauthorized	Missing or invalid API key.
`500`	Internal Server Error	Server-side failure. Retry with backoff.
`503`	Service Unavailable	Server overloaded or under maintenance. Retry later.
`504`	Gateway Timeout	Request exceeded server timeout (120 s).
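
The retry guidance for `500` and `503` can be sketched as a small exponential-backoff wrapper. This is an illustration rather than a prescribed policy: `call_with_retry` is a hypothetical helper, the attempt count and delays are arbitrary assumptions, and `send` stands for any zero-argument callable that performs the request and raises `urllib.error.HTTPError` on failure.

```python
import time
import urllib.error

# Status codes the table above suggests retrying.
RETRYABLE_STATUS = {500, 503}


def call_with_retry(send, max_attempts: int = 4, base_delay: float = 1.0, sleep=time.sleep):
    """Call send(), retrying on 500/503 with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(max_attempts):
        try:
            return send()
        except urllib.error.HTTPError as error:
            is_last_attempt = attempt == max_attempts - 1
            # Client errors (400, 401) and the final failed attempt are re-raised as-is.
            if error.code not in RETRYABLE_STATUS or is_last_attempt:
                raise
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

In this sketch `504` is deliberately not retried, on the assumption that a request which already ran into the 120 s server timeout is likely to do so again unchanged; splitting the input into shorter requests may be a better response in that case.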
