TTS quickstart
Install, authenticate, synthesize. By the end of this page you'll have an MP3 on disk and know how to switch voice, language, speed, and format.
1. Install the SDK (optional)
shell
pip install "shunyalabs[TTS]" # TTS only
pip install "shunyalabs[all]" # TTS + ASR + everything
pip install "shunyalabs[extras]" # + audio playback helpers (sounddevice)
You can also call the REST API directly with requests or any HTTP client; the SDK is just a thin wrapper.
2. Configure authentication
shell
export SHUNYALABS_API_KEY="sk-your-key"
Or pass it in code:
python
from shunyalabs import AsyncShunyaClient

client = AsyncShunyaClient(api_key="sk-your-key")
3. First synthesis
shell
curl -X POST https://tts.shunyalabs.ai/v1/audio/speech \
-H "Authorization: Bearer $SHUNYALABS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "zero-indic", "input": "Hello, how are you today?", "voice": "Varun"}' \
--output output.mp3
python
import asyncio
from shunyalabs import AsyncShunyaClient
from shunyalabs.tts import TTSConfig

async def main():
    # No api_key argument here: this relies on SHUNYALABS_API_KEY from step 2.
    async with AsyncShunyaClient() as client:
        result = await client.tts.synthesize(
            "Hello, how are you today?",
            config=TTSConfig(model="zero-indic", voice="Varun"),
        )
        result.save("output.mp3")
        print(f"{len(result.audio_data)} bytes saved, {result.sample_rate} Hz")

asyncio.run(main())
python
import os
import requests

API_KEY = os.environ["SHUNYALABS_API_KEY"]  # set in step 2

response = requests.post(
    "https://tts.shunyalabs.ai/v1/audio/speech",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"model": "zero-indic", "input": "Hello!", "voice": "Varun"},
    timeout=120,
)
response.raise_for_status()  # stop on 4xx/5xx instead of saving an error body
with open("output.mp3", "wb") as f:
    f.write(response.content)
python
import os
from openai import OpenAI

API_KEY = os.environ["SHUNYALABS_API_KEY"]  # set in step 2

# Point the OpenAI SDK at the Shunya Labs endpoint via base_url.
client = OpenAI(api_key=API_KEY, base_url="https://tts.shunyalabs.ai/v1")
response = client.audio.speech.create(
    model="zero-indic",
    input="Hello!",
    voice="Varun",
    response_format="mp3",
)
response.stream_to_file("output.mp3")
4. Switch voice, language, speed, format
Pick a different voice
python
# Hindi female
TTSConfig(model="zero-indic", voice="Sunita")
# Tamil male
TTSConfig(model="zero-indic", voice="Murugan")
# English female
TTSConfig(model="zero-indic", voice="Nisha")
46 voices total. See Voices & languages for the full catalogue.
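If the voice has to be chosen at runtime (for example from a user's language setting), a small lookup table keeps that choice in one place. The sketch below uses only the three pairings shown in this section; `VOICES`, `pick_voice`, and the choice of "Varun" as a fallback are illustrative assumptions, not part of the SDK.

```python
# (language, gender) -> voice, using only the three pairings shown above.
VOICES = {
    ("hindi", "female"): "Sunita",
    ("tamil", "male"): "Murugan",
    ("english", "female"): "Nisha",
}


def pick_voice(language: str, gender: str, default: str = "Varun") -> str:
    """Return the voice for a language/gender pair, or the default.

    The default is an arbitrary choice for this sketch.
    """
    return VOICES.get((language.lower(), gender.lower()), default)


print(pick_voice("Tamil", "male"))      # Murugan
print(pick_voice("Bengali", "female"))  # Varun (fallback)
```

The returned name can be passed straight to `TTSConfig(model="zero-indic", voice=...)`.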
Change speed
python
TTSConfig(model="zero-indic", voice="Nisha", speed=1.3) # fast notifications
TTSConfig(model="zero-indic", voice="Nisha", speed=0.85) # slower dictation
Change output format
python
TTSConfig(model="zero-indic", voice="Varun", response_format="pcm") # real-time playback
TTSConfig(model="zero-indic", voice="Varun", response_format="mulaw") # telephony
TTSConfig(model="zero-indic", voice="Varun", response_format="wav") # editing
Full format list at Audio formats.
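Raw `pcm` output carries no header, so most players and editors cannot open it as-is. The sketch below wraps headerless PCM in a WAV container using Python's standard `wave` module. The function name `pcm_to_wav` is hypothetical, and the defaults (16-bit mono at 24 kHz) are assumptions for illustration only; check Audio formats, or the `sample_rate` reported on the result, for the real values.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 24000, channels: int = 1,
               sample_width: int = 2) -> bytes:
    """Wrap headerless PCM samples in a WAV container.

    The defaults (24 kHz, mono, 16-bit) are assumptions for illustration;
    use the values the service actually reports for your request.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()


# One second of silence stands in for real synthesized audio.
silence = b"\x00\x00" * 24000
wav_bytes = pcm_to_wav(silence)
```

If you only need a file for editing, requesting `response_format="wav"` directly is simpler; a wrapper like this is mainly useful when you already have PCM chunks from a real-time pipeline.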
Add an expression style
python
await client.tts.synthesize(
    "<Happy> Welcome aboard!",
    config=TTSConfig(model="zero-indic", voice="Sunita"),
)
11 styles: Happy, Sad, Angry, Fearful, Surprised, Disgust, News, Conversational, Narrative, Enthusiastic, Neutral. See Expression styles.
5. Stream it
For real-time use (voice agents, IVR), stream audio as it synthesizes instead of waiting for the full file:
python
config = TTSConfig(model="zero-indic", voice="Varun", response_format="pcm")
async for chunk in await client.tts.stream("Hello!", config=config):
    # play chunk bytes as they arrive; speaker is a placeholder for your audio output
    speaker.write(chunk)
Full streaming details at Streaming.
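To exercise playback logic without calling the API, it can help to separate "consume the stream" from "where the chunks go". The sketch below does that with a generic helper that accepts any async iterator of byte chunks. `consume_stream` and `fake_stream` are hypothetical names, and `fake_stream` is only a stand-in for the SDK's stream; the assumption that chunks are plain `bytes` comes from the comment in the example above.

```python
import asyncio
from typing import AsyncIterator, Callable


async def consume_stream(chunks: AsyncIterator[bytes],
                         on_chunk: Callable[[bytes], None]) -> int:
    """Forward each chunk to on_chunk and return the total byte count."""
    total = 0
    async for chunk in chunks:
        on_chunk(chunk)
        total += len(chunk)
    return total


async def fake_stream() -> AsyncIterator[bytes]:
    """Hypothetical stand-in for the SDK stream: yields three PCM chunks."""
    for _ in range(3):
        await asyncio.sleep(0)  # give control back to the event loop
        yield b"\x00\x00" * 160


received = bytearray()
total_bytes = asyncio.run(consume_stream(fake_stream(), received.extend))
print(total_bytes)  # 960
```

With the real client, the same helper would presumably be called as `await consume_stream(await client.tts.stream("Hello!", config=config), speaker.write)`, following the pattern shown above.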
6. Handle errors
python
from shunyalabs.exceptions import (
    AuthenticationError, RateLimitError,
    SynthesisError, ServerError, ShunyalabsError,
)

try:
    result = await client.tts.synthesize("Hello!", config=config)
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limited, back off and retry")
except SynthesisError as e:
    print(f"Bad input: {e}")
except ServerError:
    print("Server error, safe to retry")
except ShunyalabsError as e:
    print(f"SDK error: {e}")
You're done
You now have text → audio working. Next, skim voices and audio formats to pick the right ones for your use case.
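One pattern worth carrying forward from step 6: rate limits and server errors are described there as retryable, while bad keys and bad input are not. A minimal sketch of that retry loop with exponential backoff follows. `retry_async` is a hypothetical helper, not part of the SDK, and `TransientError` is a local stand-in for `RateLimitError` / `ServerError` so the example runs without the SDK installed; the attempt count and delays are arbitrary.

```python
import asyncio
import random
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def retry_async(
    call: Callable[[], Awaitable[T]],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 4,
    base_delay: float = 0.5,
) -> T:
    """Run call(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return await call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt; jitter avoids synchronized retries.
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.0)
            await asyncio.sleep(delay)
    raise RuntimeError("attempts must be at least 1")


class TransientError(Exception):
    """Hypothetical stand-in for RateLimitError / ServerError."""


calls = {"count": 0}


async def flaky() -> str:
    # Fails twice, then succeeds, to exercise the retry path.
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("try again")
    return "ok"


outcome = asyncio.run(retry_async(flaky, (TransientError,), base_delay=0.01))
print(outcome, calls["count"])  # ok 3
```

With the SDK, the call would look something like `await retry_async(lambda: client.tts.synthesize("Hello!", config=config), (RateLimitError, ServerError))`. `AuthenticationError` and `SynthesisError` are deliberately left out of `retry_on`, since repeating the same request will not fix them.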