TTS quickstart

Install, authenticate, synthesize. By the end of this page you'll have an MP3 on disk and know how to switch voice, language, speed, and format.

1. Install the SDK (optional)

pip install "shunyalabs[tts]"         # TTS only
pip install "shunyalabs[all]"         # TTS + ASR + everything
pip install "shunyalabs[extras]"      # + audio playback helpers (sounddevice)

You can also call the REST API directly with requests or any other HTTP client; the SDK is just a thin wrapper.

2. Configure authentication

export SHUNYALABS_API_KEY="sk-your-key"

Or pass it in code:

from shunyalabs import AsyncShunyaClient

client = AsyncShunyaClient(api_key="sk-your-key")
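
A missing key usually surfaces later as an authentication error, so it can help to fail early. A minimal sketch, assuming only the SHUNYALABS_API_KEY variable name shown above; load_api_key is a hypothetical helper and not part of the SDK.

```python
import os


def load_api_key(env=None):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of the shunyalabs SDK.
    """
    env = os.environ if env is None else env
    key = env.get("SHUNYALABS_API_KEY", "").strip()
    if not key:
        raise RuntimeError("SHUNYALABS_API_KEY is not set")
    return key
```

Pass the result as api_key= when you construct the client, or call it once at startup purely as a check.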

3. First synthesis

shell (curl)

curl -X POST https://tts.shunyalabs.ai/v1/audio/speech \
  -H "Authorization: Bearer $SHUNYALABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "zero-indic", "input": "Hello, how are you today?", "voice": "Varun"}' \
  --output output.mp3

python (SDK)

import asyncio
from shunyalabs import AsyncShunyaClient
from shunyalabs.tts import TTSConfig

async def main():
    async with AsyncShunyaClient() as client:
        result = await client.tts.synthesize(
            "Hello, how are you today?",
            config=TTSConfig(model="zero-indic", voice="Varun"),
        )
        result.save("output.mp3")
        print(f"{len(result.audio_data)} bytes saved, {result.sample_rate} Hz")

asyncio.run(main())

python (requests)

import os

import requests

# Read the key exported in step 2.
API_KEY = os.environ["SHUNYALABS_API_KEY"]

response = requests.post(
    "https://tts.shunyalabs.ai/v1/audio/speech",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"model": "zero-indic", "input": "Hello!", "voice": "Varun"},
    timeout=120,
)
response.raise_for_status()
with open("output.mp3", "wb") as f:
    f.write(response.content)

python (OpenAI SDK)

import os

from openai import OpenAI

# Read the key exported in step 2.
API_KEY = os.environ["SHUNYALABS_API_KEY"]

client = OpenAI(api_key=API_KEY, base_url="https://tts.shunyalabs.ai/v1")
response = client.audio.speech.create(
    model="zero-indic",
    input="Hello!",
    voice="Varun",
    response_format="mp3",
)
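# write_to_file replaces stream_to_file, which recent openai releases
# deprecate on non-streaming responses.
response.write_to_file("output.mp3")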

4. Switch voice, language, speed, format

Pick a different voice

# Hindi female
TTSConfig(model="zero-indic", voice="Sunita")

# Tamil male
TTSConfig(model="zero-indic", voice="Murugan")

# English female
TTSConfig(model="zero-indic", voice="Nisha")

46 voices total. See Voices & languages for the full catalogue.
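
If your app picks a voice at runtime, a small lookup table keeps the choice in one place. A minimal sketch using only the voices named on this page; EXAMPLE_VOICES and pick_voice are hypothetical names, and the fallback to "Varun" is an assumption made for illustration.

```python
# Voices named in the examples above, keyed by (language, gender).
# The full catalogue has more entries; this table is illustrative only.
EXAMPLE_VOICES = {
    ("hindi", "female"): "Sunita",
    ("tamil", "male"): "Murugan",
    ("english", "female"): "Nisha",
}


def pick_voice(language, gender, default="Varun"):
    """Return a voice name for the pair, falling back to a default."""
    return EXAMPLE_VOICES.get((language.lower(), gender.lower()), default)
```

For example, TTSConfig(model="zero-indic", voice=pick_voice("Tamil", "male")).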

Change speed

TTSConfig(model="zero-indic", voice="Nisha", speed=1.3)   # fast notifications
TTSConfig(model="zero-indic", voice="Nisha", speed=0.85)  # slower dictation

Change output format

TTSConfig(model="zero-indic", voice="Varun", response_format="pcm")    # real-time playback
TTSConfig(model="zero-indic", voice="Varun", response_format="mulaw")  # telephony
TTSConfig(model="zero-indic", voice="Varun", response_format="wav")    # editing

Full format list at Audio formats.
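
Raw PCM has no header, so most players will not open it directly. A minimal sketch that wraps PCM bytes in a WAV container using the standard-library wave module. It assumes 16-bit mono samples, which this page does not confirm; check Audio formats for the actual layout. pcm_to_wav_bytes is a hypothetical helper.

```python
import io
import wave


def pcm_to_wav_bytes(pcm, sample_rate, channels=1, sample_width=2):
    """Wrap raw PCM samples in a WAV container.

    Assumes 16-bit mono unless told otherwise (an assumption, not a
    documented guarantee of the API).
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

With the SDK result from step 3, this would be called as pcm_to_wav_bytes(result.audio_data, result.sample_rate).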

Add an expression style

await client.tts.synthesize(
    "<Happy> Welcome aboard!",
    config=TTSConfig(model="zero-indic", voice="Sunita"),
)

11 styles: Happy, Sad, Angry, Fearful, Surprised, Disgust, News, Conversational, Narrative, Enthusiastic, Neutral. See Expression styles.
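
A typo in the tag is easy to miss, so you may want to build the tagged text in one place. A minimal sketch, assuming the tag goes at the start of the input as in the example above; with_style is a hypothetical helper, and the set holds the 11 style names listed on this page.

```python
STYLES = {
    "Happy", "Sad", "Angry", "Fearful", "Surprised", "Disgust",
    "News", "Conversational", "Narrative", "Enthusiastic", "Neutral",
}


def with_style(text, style):
    """Prefix text with an expression tag such as <Happy>."""
    if style not in STYLES:
        raise ValueError(f"unknown style: {style!r}")
    return f"<{style}> {text}"
```

with_style("Welcome aboard!", "Happy") returns the same string used in the synthesize call above.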

5. Stream it

For real-time use (voice agents, IVR), stream audio as it is synthesized instead of waiting for the full file:

config = TTSConfig(model="zero-indic", voice="Varun", response_format="pcm")

async for chunk in await client.tts.stream("Hello!", config=config):
    # play chunk bytes as they arrive; speaker is a placeholder for your audio sink
    speaker.write(chunk)

Full streaming details at Streaming.
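
If you don't have an audio sink wired up yet, you can exercise the same loop by collecting chunks into a buffer. A minimal sketch that runs without the SDK: fake_stream is a stand-in for the object returned by client.tts.stream, and collect is a hypothetical helper.

```python
import asyncio


async def fake_stream(chunks):
    """Stand-in for the SDK stream: yields byte chunks one at a time."""
    for chunk in chunks:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield chunk


async def collect(stream):
    """Consume an async iterator of bytes and return (audio, chunk_count)."""
    buf = bytearray()
    count = 0
    async for chunk in stream:
        buf.extend(chunk)
        count += 1
    return bytes(buf), count


audio, count = asyncio.run(collect(fake_stream([b"\x01\x02", b"\x03\x04", b"\x05"])))
```

Against the real API, the call would be collect(await client.tts.stream("Hello!", config=config)), following the loop shown above.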

6. Handle errors

from shunyalabs.exceptions import (
    AuthenticationError, RateLimitError,
    SynthesisError, ServerError, ShunyalabsError,
)

try:
    result = await client.tts.synthesize("Hello!", config=config)
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limited, back off and retry")
except SynthesisError as e:
    print(f"Bad input: {e}")
except ServerError:
    print("Server error, safe to retry")
except ShunyalabsError as e:
    print(f"SDK error: {e}")
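
The handlers above only print. For the two retryable cases, a common pattern is exponential backoff with a little jitter. A minimal sketch that runs without the SDK: the two exception classes are stand-ins for the ones in shunyalabs.exceptions, and with_retries, its defaults, and its delays are assumptions rather than SDK behavior.

```python
import asyncio
import random


class RateLimitError(Exception):
    """Stand-in for shunyalabs.exceptions.RateLimitError."""


class ServerError(Exception):
    """Stand-in for shunyalabs.exceptions.ServerError."""


async def with_retries(call, attempts=4, base_delay=0.5, sleep=asyncio.sleep):
    """Await call() and retry retryable errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return await call()
        except (RateLimitError, ServerError):
            if attempt == attempts - 1:
                raise  # out of attempts, surface the last error
            # 0.5 s, 1 s, 2 s, ... plus up to 100 ms of jitter
            await sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```

In real code, import the exceptions from the SDK instead of defining them, and wrap the call in a lambda: await with_retries(lambda: client.tts.synthesize("Hello!", config=config)).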

You're done

You now have text → audio working. Next, skim Voices & languages and Audio formats to pick the right ones for your use case.