TTS API reference

Every endpoint under tts.shunyalabs.ai, every field, every error code.

Base URLs

| Interface | URL |
| --- | --- |
| Batch | https://tts.shunyalabs.ai |
| Streaming | wss://tts.shunyalabs.ai/ws |
| Health | https://tts.shunyalabs.ai/health |

Authentication

The same API key works for both transports. For HTTP requests, send it as a bearer token:

http
Authorization: Bearer sk-your-api-key

For WebSocket connections, send the key as an Authorization header on the upgrade request. If your client can't set headers, fall back to a token query parameter:

shell
wss://tts.shunyalabs.ai/ws?token=sk-your-api-key
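The two WebSocket options above can be sketched as a small helper that returns the URL and headers to connect with. The function name `ws_connect_args` and its return shape are hypothetical and not part of the SDK. Only the header format and the `token` parameter come from this page.

```python
from urllib.parse import urlencode

WS_URL = "wss://tts.shunyalabs.ai/ws"

def ws_connect_args(api_key, can_set_headers=True):
    """Return (url, headers) for a WebSocket connection (hypothetical helper)."""
    if can_set_headers:
        # Preferred: the key travels in the Authorization header of the upgrade request.
        return WS_URL, {"Authorization": f"Bearer {api_key}"}
    # Fallback: the key travels as the `token` query parameter.
    return f"{WS_URL}?{urlencode({'token': api_key})}", {}
```

Prefer the header form where the client allows it, since query strings are more likely to end up in proxy and server logs.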

POST /v1/audio/speech

Batch synthesis. Returns audio bytes in the requested format.

| Property | Value |
| --- | --- |
| Method | POST |
| URL | https://tts.shunyalabs.ai/v1/audio/speech (also /tts, /) |
| Content-Type | application/json |
| Response body | Audio bytes in requested format |

Request fields

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model | string | Yes | - | Use zero-indic. |
| input | string | Yes | - | Text to synthesize. Max 10,000 chars. |
| voice | string | Yes | - | Speaker name. See Voices & languages. |
| response_format | string | No | mp3 | pcm, wav, mp3, ogg_opus, flac, mulaw, alaw. |
| speed | float | No | 1.0 | 0.25 (slowest) to 4.0 (fastest). |
| language | string | No | null | ISO 639 code for text preprocessing. |
| trim_silence | bool | No | false | Remove leading/trailing silence. |
| volume_normalization | string | No | null | peak (0 dBFS) or loudness (EBU R128). |
| background_audio | string | No | null | Preset name (office, cafe, rain, street) or base64-encoded WAV/MP3. |
| background_volume | float | No | 0.1 | Background mix ratio, 0.0-1.0. |
| reference_wav | string | No | null | Base64 audio for voice cloning. WAV/FLAC/OGG. 1-6 s. |
| reference_text | string | No | "" | Transcript of reference audio. Max 500 chars. |
| word_timestamps | bool | No | false | Include word-level timestamps (batch only). |
| max_tokens | int | No | 2048 | Max tokens for internal LLM generation. 1-8192. |
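As a minimal sketch of how these fields fit together, the helper below builds a batch request using only the Python standard library rather than the SDK. The name `build_speech_request` is hypothetical. The endpoint, headers, required fields, and the 10,000-character limit are taken from this page.

```python
import json
import urllib.request

API_URL = "https://tts.shunyalabs.ai/v1/audio/speech"

def build_speech_request(api_key, text, voice, **options):
    """Build (but do not send) a POST /v1/audio/speech request."""
    if len(text) > 10_000:
        raise ValueError("input exceeds the 10,000-character limit")
    # Required fields first; optional fields (speed, response_format, ...) pass through.
    payload = {"model": "zero-indic", "input": text, "voice": voice, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending the request performs a network call, so it is left commented out:
# req = build_speech_request("sk-your-api-key", "Hello!", "Varun", response_format="wav")
# with urllib.request.urlopen(req, timeout=120) as resp:
#     audio = resp.read()
```

Building the request separately from sending it makes the payload easy to inspect or log before any audio is generated.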

Response headers

| Header | Meaning |
| --- | --- |
| Content-Type | MIME type (audio/mpeg, audio/wav, etc.) |
| X-Sample-Rate | Sample rate in Hz |
| X-Request-Id | Unique request identifier. Log it for support requests. |

WebSocket /ws

Full protocol in Streaming. Summary below.

Handshake

There are three equivalent ways to send the synthesis request on connect. Pick whichever fits your client.

1. Query parameters

Encode the parameters in the URL. This works in clients that can't easily send post-connect messages.

shell
wss://tts.shunyalabs.ai/ws?model=zero-indic&voice=Varun&encoding=pcm

2. Config message

Connect with no parameters, then send a config message. This is useful when you want to reuse the connection.

json
{"type": "config", "model": "zero-indic", "voice": "Varun", "response_format": "pcm"}

3. Single request message

Send the full synthesis request in one message. The server treats it as both the config and the synthesis trigger.

json
{"model": "zero-indic", "input": "Hello!", "voice": "Sunita", "response_format": "pcm"}
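The three handshake forms above can be produced programmatically. The sketch below uses hypothetical helper names and only mirrors the example URL and JSON messages shown on this page, including the `encoding` query parameter from the URL example. It does not open a connection.

```python
import json
from urllib.parse import urlencode

WS_URL = "wss://tts.shunyalabs.ai/ws"

def handshake_url(model, voice, encoding="pcm"):
    """Carry the parameters in the connection URL."""
    query = urlencode({"model": model, "voice": voice, "encoding": encoding})
    return f"{WS_URL}?{query}"

def config_message(model, voice, response_format="pcm"):
    """Config-only JSON message, sent after connecting with no parameters."""
    return json.dumps({
        "type": "config",
        "model": model,
        "voice": voice,
        "response_format": response_format,
    })

def synthesis_message(model, text, voice, response_format="pcm"):
    """Single JSON message that acts as both config and synthesis trigger."""
    return json.dumps({
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
    })
```

Note that only the config message carries a `type` field, and only the synthesis message carries `input`.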

Inbound fields

| Field | Required | Description |
| --- | --- | --- |
| model | Yes | Use zero-indic. |
| input | Yes | Text. Max 10,000 chars. |
| voice | Yes | Speaker name. |
| response_format | No | Default pcm on WebSocket. |
| speed | No | 0.25-4.0, default 1.0. |
| language | No | ISO 639 code for preprocessing. |
| trim_silence | No | Remove silence, default false. |

Outbound message shape

For each synthesis, the server emits messages in this order: a chunk metadata JSON, then a binary audio frame, repeated for every chunk; then a final completion JSON. If something goes wrong, an error JSON arrives instead of completion.

1. Chunk metadata (JSON)

Sent before each binary audio frame.

json
{
  "type": "chunk",
  "request_id": "uuid",
  "chunk_index": 0,
  "is_final": false,
  "format": "pcm",
  "sample_rate": 16000
}

2. Audio data (binary frame)

Raw audio bytes that immediately follow each chunk JSON. In the SDK, isinstance(msg, bytes) is True.

3. Completion (JSON)

Sent once after all audio chunks have been delivered.

json
{
  "type": "completion",
  "request_id": "uuid",
  "status": "complete",
  "total_chunks": 3,
  "total_duration_seconds": 2.48,
  "format": "pcm",
  "sample_rate": 16000
}

4. Error (JSON)

Sent instead of completion on failure. The connection closes after this message.

json
{
  "type": "error",
  "request_id": "uuid",
  "error": "Error description"
}
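A client can treat the message order above as a small state machine. The sketch below is one possible reading of it, not the SDK's implementation. It assumes text frames arrive as `str` and audio frames as `bytes`, and the function name `collect_audio` is hypothetical. It works on any iterable of messages, so it can be exercised without a live connection.

```python
import json

def collect_audio(messages):
    """Assemble audio from an iterable of WebSocket messages.

    Returns (audio_bytes, completion_dict).
    Raises RuntimeError on an error message or a malformed sequence.
    """
    audio = bytearray()
    pending_chunk = None  # metadata of the chunk whose audio frame is expected next
    for msg in messages:
        if isinstance(msg, bytes):
            if pending_chunk is None:
                raise RuntimeError("binary frame without preceding chunk metadata")
            audio.extend(msg)
            pending_chunk = None
            continue
        event = json.loads(msg)
        kind = event.get("type")
        if kind == "chunk":
            pending_chunk = event
        elif kind == "completion":
            return bytes(audio), event
        elif kind == "error":
            raise RuntimeError(event.get("error", "unknown error"))
    raise RuntimeError("stream ended before completion")
```

For playback while streaming, the same loop could hand each binary frame to an audio sink instead of buffering it.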

GET /health

Request:

shell
curl https://tts.shunyalabs.ai/health \
  -H "Authorization: Bearer $SHUNYALABS_API_KEY"

Response:

json
{"status": "healthy", "triton_ready": true, "auth_ready": true}
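A readiness probe might parse this response as follows. Treating all three fields as required for readiness is an assumption made for this sketch, and `is_ready` is a hypothetical helper name.

```python
import json

def is_ready(body):
    """Return True when the /health payload reports every component ready."""
    payload = json.loads(body)
    return (
        payload.get("status") == "healthy"
        and payload.get("triton_ready") is True
        and payload.get("auth_ready") is True
    )
```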

SDK exception hierarchy

All SDK exceptions inherit from ShunyalabsError.

| Exception | HTTP | Description |
| --- | --- | --- |
| AuthenticationError | 401 | Invalid or missing API key. |
| PermissionDeniedError | 403 | API key lacks permission. |
| RateLimitError | 429 | Rate limit exceeded. Back off. |
| SynthesisError | 422 | Invalid text or config. |
| ServerError | 5xx | Transient. Safe to retry. |
| TimeoutError | - | Request exceeded timeout. |
| ConnectionError | - | Network failure. |

Rate & concurrency limits

| Limit | Value |
| --- | --- |
| Max text length per request | 10,000 characters |
| Recommended text length per request | Under 500 characters for best quality; split longer text |
| HTTP request timeout | Set to at least 120 s for long text |
| Concurrent requests (default tier) | 16 |
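One way to follow the "under 500 characters" recommendation is to split long text at sentence boundaries before synthesis. The splitter below is a sketch and not part of the SDK. The function name, the sentence-ending characters (including the danda `।` for Indic scripts), and the hard-wrap fallback for overlong sentences are all assumptions.

```python
import re

def split_text(text, max_chars=499):
    """Split text into pieces of at most max_chars, preferring sentence boundaries."""
    # Break after ., !, ? or the danda, wherever whitespace follows.
    sentences = re.split(r"(?<=[.!?।])\s+", text.strip())
    pieces, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-wrapped.
        while len(sentence) > max_chars:
            if current:
                pieces.append(current)
                current = ""
            pieces.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate  # sentence still fits in the current piece
        else:
            pieces.append(current)
            current = sentence
    if current:
        pieces.append(current)
    return pieces
```

Each piece can then be synthesized as its own request and the resulting audio concatenated in order.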

Rate-limit retry pattern

python
import asyncio
from shunyalabs.exceptions import RateLimitError

async def synthesize_with_backoff(client, text, config, retries=3):
    """Retry on RateLimitError with exponential backoff (1 s, 2 s, ...)."""
    for attempt in range(retries):
        try:
            return await client.tts.synthesize(text, config=config)
        except RateLimitError:
            if attempt == retries - 1:
                raise  # out of retries: surface the original error without sleeping
            await asyncio.sleep(2 ** attempt)

Concurrency cap pattern

python
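import asyncio

# Cap in-flight requests at the default-tier limit of 16.
sem = asyncio.Semaphore(16)

async def safe_synthesize(client, text, config):
    async with sem:
        return await client.tts.synthesize(text, config=config)

async def synthesize_all(client, scripts, config):
    # gather() returns results in input order; call with asyncio.run(synthesize_all(...)).
    tasks = [safe_synthesize(client, s, config) for s in scripts]
    return await asyncio.gather(*tasks)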

HTTP error status codes

| Status | Description |
| --- | --- |
| 200 | Success. Audio bytes in response body. |
| 400 | Missing or malformed fields. Body: {"detail": "..."}. |
| 401 | API key invalid or missing. |
| 422 | Invalid text or config. |
| 429 | Rate limited. |
| 500 | Internal server error. |
| 503 | Backend (Triton/Redis) temporarily unavailable. |
| 504 | Gateway timeout. |
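For clients that call the HTTP API directly, the status codes above can be folded into a simple retry decision. The mapping below is a sketch based on the descriptions on this page (429 means back off, 5xx is transient). The function name and the action labels are hypothetical.

```python
def retry_action(status):
    """Map an HTTP status code to 'ok', 'backoff', 'retry', or 'fail'."""
    if status == 200:
        return "ok"
    if status == 429:
        return "backoff"  # rate limited: wait before trying again
    if 500 <= status <= 599:
        return "retry"    # transient server-side failure
    return "fail"         # 400/401/422 and others: fix the request instead of retrying
```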

SDK configuration reference

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | string | None | Falls back to SHUNYALABS_API_KEY env var. |
| timeout | float | 60.0 | Request timeout (seconds). |
| max_retries | int | 2 | Retries for 5xx and connection failures. |
| tts_url | string | https://tts.shunyalabs.ai | Batch API base URL. Override for self-hosted. |
| tts_ws_url | string | wss://tts.shunyalabs.ai/ws/v1/audio/speech | WebSocket URL. |

Self-hosted configuration

python
client = AsyncShunyaClient(
    api_key="your-api-key",
    tts_url="https://my-tts-server.example.com",
    tts_ws_url="wss://my-tts-server.example.com/ws",
)

GitHub

github.com/Shunyalabsai/shunyalabs-python-sdk