TTS API reference

Every endpoint under tts.shunyalabs.ai, every field, every error code.

Base URLs

Interface	URL
Batch	`https://tts.shunyalabs.ai`
Streaming	`wss://tts.shunyalabs.ai/ws`
Health	`https://tts.shunyalabs.ai/health`

Authentication

The same API key works for both transports. Over HTTP, send it as a bearer token in the Authorization header:

Authorization: Bearer sk-your-api-key

Send the key as an Authorization header on the WebSocket upgrade. If your client can't set headers, fall back to a token query parameter:

wss://tts.shunyalabs.ai/ws?token=sk-your-api-key
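
As a minimal sketch, both WebSocket options can be prepared with the Python standard library. The helper name `ws_connect_args` is hypothetical and not part of the SDK; pass its results to whichever WebSocket client you use.

```python
from urllib.parse import urlencode

WS_URL = "wss://tts.shunyalabs.ai/ws"

def ws_connect_args(api_key: str, can_set_headers: bool = True):
    """Return (url, headers) for the WebSocket upgrade request.

    Hypothetical helper for illustration; not part of the SDK.
    """
    if can_set_headers:
        # Preferred: bearer token in the Authorization header.
        return WS_URL, {"Authorization": f"Bearer {api_key}"}
    # Fallback: URL-encoded token query parameter, no extra headers.
    return f"{WS_URL}?{urlencode({'token': api_key})}", {}
```

Prefer the header form where possible, since query strings tend to end up in proxy and server logs.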

POST /v1/audio/speech

Batch synthesis. Returns audio bytes in the requested format.

Property	Value
Method	POST
URL	`https://tts.shunyalabs.ai/v1/audio/speech` (also `/tts`, `/`)
Content-Type	application/json
Response body	Audio bytes in requested format

Request fields

Field	Type	Required	Default	Description
`model`	string	Yes	-	Use `zero-indic`.
`input`	string	Yes	-	Text to synthesize. Max 10,000 chars.
`voice`	string	Yes	-	Speaker name. See Voices & languages.
`response_format`	string	No	`mp3`	`pcm`, `wav`, `mp3`, `ogg_opus`, `flac`, `mulaw`, `alaw`.
`speed`	float	No	`1.0`	0.25 (slowest) to 4.0 (fastest).
`language`	string	No	null	ISO 639 code for text preprocessing.
`trim_silence`	bool	No	`false`	Remove leading/trailing silence.
`volume_normalization`	string	No	null	`peak` (0 dBFS) or `loudness` (EBU R128).
`background_audio`	string	No	null	Preset name (`office`, `cafe`, `rain`, `street`) or base64-encoded WAV/MP3.
`background_volume`	float	No	`0.1`	Background mix ratio, 0.0-1.0.
`reference_wav`	string	No	null	Base64 audio for voice cloning. WAV/FLAC/OGG. 1-6 s.
`reference_text`	string	No	""	Transcript of reference audio. Max 500 chars.
`word_timestamps`	bool	No	`false`	Include word-level timestamps (batch only).
`max_tokens`	int	No	`2048`	Max tokens for internal LLM generation. 1-8192.
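
The documented ranges can be checked client-side before a request is sent. The sketch below covers only the limits listed in the table; `validate_speech_fields` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
def validate_speech_fields(fields: dict) -> list[str]:
    """Return a list of problems; an empty list means no documented limit is violated.

    Hypothetical helper for illustration; not part of the SDK.
    """
    problems = []
    # model, input and voice are the three required fields.
    for name in ("model", "input", "voice"):
        if not fields.get(name):
            problems.append(f"{name} is required")
    if len(fields.get("input", "")) > 10_000:
        problems.append("input exceeds 10,000 characters")
    if not 0.25 <= fields.get("speed", 1.0) <= 4.0:
        problems.append("speed must be between 0.25 and 4.0")
    if not 0.0 <= fields.get("background_volume", 0.1) <= 1.0:
        problems.append("background_volume must be between 0.0 and 1.0")
    if not 1 <= fields.get("max_tokens", 2048) <= 8192:
        problems.append("max_tokens must be between 1 and 8192")
    if len(fields.get("reference_text", "")) > 500:
        problems.append("reference_text exceeds 500 characters")
    return problems
```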

Response headers

Header	Meaning
`Content-Type`	MIME type (`audio/mpeg`, `audio/wav`, etc.)
`X-Sample-Rate`	Sample rate in Hz
`X-Request-Id`	Unique request identifier; log it for support requests
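
For clients that don't use the SDK, a batch request can be assembled with the Python standard library. This is a sketch under the field and header names documented above; `build_speech_request` is a hypothetical helper, and the send step is left commented out because it performs a network call.

```python
import json
import urllib.request

SPEECH_URL = "https://tts.shunyalabs.ai/v1/audio/speech"

def build_speech_request(api_key: str, text: str, voice: str,
                         response_format: str = "mp3") -> urllib.request.Request:
    """Build (but do not send) a batch synthesis request.

    Hypothetical helper for illustration; not part of the SDK.
    """
    payload = {
        "model": "zero-indic",
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To send it (this performs a network call):
# with urllib.request.urlopen(req, timeout=120) as resp:
#     sample_rate = resp.headers.get("X-Sample-Rate")
#     request_id = resp.headers.get("X-Request-Id")
#     audio = resp.read()
```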

WebSocket /ws

Full protocol in Streaming. Summary below.

Handshake

There are three equivalent ways to send the synthesis request on connect; pick whichever fits your client.

Encode the parameters in the URL. This works in clients that can't easily send post-connect messages.

wss://tts.shunyalabs.ai/ws?model=zero-indic&voice=Varun&encoding=pcm

Connect with no params, then send a config message. Useful when you want to reuse the connection.

{"type": "config", "model": "zero-indic", "voice": "Varun", "response_format": "pcm"}

Send the full synthesis request in one message. The server treats it as both the config and the synthesis trigger.

{"model": "zero-indic", "input": "Hello!", "voice": "Sunita", "response_format": "pcm"}
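
The three handshake forms above can be generated from the same inputs. The sketch below mirrors the examples exactly, including the `encoding` query parameter used in the URL form; the function names are hypothetical and only the standard library is used.

```python
import json
from urllib.parse import urlencode

WS_URL = "wss://tts.shunyalabs.ai/ws"

def handshake_url(model: str, voice: str, encoding: str = "pcm") -> str:
    """Option 1: carry the parameters in the connection URL."""
    query = urlencode({"model": model, "voice": voice, "encoding": encoding})
    return f"{WS_URL}?{query}"

def config_message(model: str, voice: str, response_format: str = "pcm") -> str:
    """Option 2: a config message sent after connecting with no parameters."""
    return json.dumps({"type": "config", "model": model, "voice": voice,
                       "response_format": response_format})

def full_request(model: str, text: str, voice: str,
                 response_format: str = "pcm") -> str:
    """Option 3: one message acting as both config and synthesis trigger."""
    return json.dumps({"model": model, "input": text, "voice": voice,
                       "response_format": response_format})
```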

Inbound fields

Field	Required	Description
`model`	Yes	Use `zero-indic`.
`input`	Yes	Text. Max 10,000 chars.
`voice`	Yes	Speaker name.
`response_format`	No	Default `pcm` on WebSocket.
`speed`	No	0.25-4.0, default 1.0.
`language`	No	ISO 639 code for preprocessing.
`trim_silence`	No	Remove silence, default false.

Outbound message shape

For each synthesis, the server emits messages in this order: a chunk metadata JSON, then a binary audio frame, repeated for every chunk; then a final completion JSON. If something goes wrong, an error JSON arrives instead of completion.

1. Chunk metadata (JSON)

Sent before each binary audio frame.

{
  "type": "chunk",
  "request_id": "uuid",
  "chunk_index": 0,
  "is_final": false,
  "format": "pcm",
  "sample_rate": 16000
}

2. Audio data (binary frame)

Raw audio bytes that immediately follow each chunk JSON. In the SDK, these messages arrive as `bytes`, so `isinstance(msg, bytes)` is `True`.

3. Completion (JSON)

Sent once after all audio chunks have been delivered.

{
  "type": "completion",
  "request_id": "uuid",
  "status": "complete",
  "total_chunks": 3,
  "total_duration_seconds": 2.48,
  "format": "pcm",
  "sample_rate": 16000
}

4. Error (JSON)

Sent instead of completion on failure. The connection closes after this message.

{
  "type": "error",
  "request_id": "uuid",
  "error": "Error description"
}
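
The message order described above suggests a simple receive loop: remember the latest chunk metadata, append the binary frame that follows it, and stop on completion or error. The sketch below works on any iterable of already-received messages (text frames as `str`, binary frames as `bytes`), so it is independent of the WebSocket client; `collect_audio` is a hypothetical name.

```python
import json

def collect_audio(messages):
    """Consume server messages in arrival order; return (audio_bytes, completion).

    Hypothetical helper for illustration; not part of the SDK.
    """
    audio = bytearray()
    pending_chunk = None  # metadata for the binary frame we expect next
    for msg in messages:
        if isinstance(msg, bytes):
            if pending_chunk is None:
                raise ValueError("binary frame arrived without chunk metadata")
            audio.extend(msg)
            pending_chunk = None
            continue
        event = json.loads(msg)
        if event["type"] == "chunk":
            pending_chunk = event
        elif event["type"] == "completion":
            return bytes(audio), event
        elif event["type"] == "error":
            # The server closes the connection after an error message.
            raise RuntimeError(event["error"])
    raise RuntimeError("stream ended before completion")
```

With a real client, `messages` would be the connection's receive iterator; the completion's `sample_rate` and `format` tell you how to interpret the returned bytes.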

GET /health

Request:

curl https://tts.shunyalabs.ai/health \
  -H "Authorization: Bearer $SHUNYALABS_API_KEY"

Response:

{"status": "healthy", "triton_ready": true, "auth_ready": true}
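
A readiness probe might reduce this payload to a single boolean. The sketch below assumes that all three fields must be positive for the service to count as ready; that interpretation is an assumption, not something this reference states, and `is_ready` is a hypothetical helper.

```python
def is_ready(health: dict) -> bool:
    """Return True only if every field in the health payload is positive.

    Assumption: all three fields must be positive to treat the service as ready.
    """
    return (
        health.get("status") == "healthy"
        and health.get("triton_ready") is True
        and health.get("auth_ready") is True
    )
```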

SDK exception hierarchy

All SDK exceptions inherit from ShunyalabsError.

Exception	HTTP	Description
`AuthenticationError`	401	Invalid or missing API key.
`PermissionDeniedError`	403	API key lacks permission.
`RateLimitError`	429	Rate limit exceeded. Back off.
`SynthesisError`	422	Invalid text or config.
`ServerError`	5xx	Transient. Safe to retry.
`TimeoutError`	-	Request exceeded timeout.
`ConnectionError`	-	Network failure.

Rate & concurrency limits

Limit	Value
Max text length per request	10,000 characters
Recommended text length per request	Under 500 characters for best quality; split longer text
HTTP request timeout	Set to at least 120 s for long text
Concurrent requests (default tier)	16
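
One way to follow the 500-character recommendation is to split on sentence boundaries and pack sentences greedily into chunks. This is a sketch, not the SDK's behavior: `split_text` is a hypothetical helper, and the sentence pattern (`.`, `!`, `?`, and the Devanagari danda `।`) is an assumption you may need to adapt to your text.

```python
import re

def split_text(text: str, max_chars: int = 500) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?।])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-wrap a single sentence that is itself longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The sentence does not fit: close the current chunk, start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be synthesized as its own request and the resulting audio concatenated in order.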

Rate-limit retry pattern

import asyncio
from shunyalabs.exceptions import RateLimitError

async def synthesize_with_backoff(client, text, config, retries=3):
    """Retry on rate limiting, waiting 1 s, then 2 s, doubling each time."""
    for attempt in range(retries):
        try:
            return await client.tts.synthesize(text, config=config)
        except RateLimitError:
            if attempt == retries - 1:
                raise  # out of attempts: surface the original error
            await asyncio.sleep(2 ** attempt)

Concurrency cap pattern

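import asyncio

# Cap in-flight requests at the default-tier limit of 16.
sem = asyncio.Semaphore(16)

async def safe_synthesize(client, text, config):
    async with sem:
        return await client.tts.synthesize(text, config=config)

async def synthesize_all(client, scripts, config):
    # await is only valid inside a coroutine, so gather the tasks here.
    tasks = [safe_synthesize(client, s, config) for s in scripts]
    return await asyncio.gather(*tasks)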

HTTP status codes

Status	Description
200	Success. Audio bytes in response body.
400	Missing or malformed fields. Body: `{"detail": "..."}`.
401	API key invalid or missing.
422	Invalid text or config.
429	Rate limited.
500	Internal server error.
503	Backend (Triton/Redis) temporarily unavailable.
504	Gateway timeout.
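
Combining this table with the SDK exception hierarchy gives a simple classification for raw HTTP clients. The sketch below returns exception names as strings rather than importing SDK classes; treating 429 as retryable after backoff follows the "Back off" note in the exception table, and statuses with no listed SDK exception (such as 400) map to `None`. The function name is hypothetical.

```python
def classify_status(status: int):
    """Return (sdk_exception_name, retryable) for an HTTP status code.

    Hypothetical helper for illustration; the name is None when no SDK
    exception is documented for the status.
    """
    if status == 200:
        return None, False
    named = {
        401: "AuthenticationError",
        403: "PermissionDeniedError",
        422: "SynthesisError",
        429: "RateLimitError",
    }
    if status in named:
        # Only rate limiting is worth retrying, and only after backing off.
        return named[status], status == 429
    if 500 <= status <= 599:
        # 5xx responses are documented as transient and safe to retry.
        return "ServerError", True
    return None, False
```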

SDK configuration reference

Parameter	Type	Default	Description
`api_key`	string	None	Falls back to `SHUNYALABS_API_KEY` env var.
`timeout`	float	60.0	Request timeout (seconds).
`max_retries`	int	2	Retries for 5xx and connection failures.
`tts_url`	string	`https://tts.shunyalabs.ai`	Batch API base URL. Override for self-hosted.
`tts_ws_url`	string	`wss://tts.shunyalabs.ai/ws/v1/audio/speech`	WebSocket URL.

Self-hosted configuration

from shunyalabs import AsyncShunyaClient  # import path assumed from the SDK package name

client = AsyncShunyaClient(
    api_key="your-api-key",
    tts_url="https://my-tts-server.example.com",
    tts_ws_url="wss://my-tts-server.example.com/ws",
)

GitHub

github.com/Shunyalabsai/shunyalabs-python-sdk