Integrate with the Python SDK
The shunyalabs SDK wraps the REST and WebSocket APIs in an idiomatic async client. It provides typed configuration, streaming generators, and a clean exception hierarchy, so a production-grade integration takes under a dozen lines of code.
Your journey
Step 1: Install
pip install shunyalabs
Step 2: Configure your API key
The SDK reads SHUNYALABS_API_KEY from the environment by default. Recommended:
export SHUNYALABS_API_KEY="your-api-key"
Or pass it directly to the client:
from shunyalabs import AsyncShunyaClient
client = AsyncShunyaClient(api_key="your-api-key")
Other env vars: SHUNYALABS_TTS_URL and SHUNYALABS_TTS_WS_URL override the batch and WebSocket endpoints, which is useful for on-prem deployments.
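A missing key is easier to diagnose at startup than on the first request. The sketch below reads the variable named above and fails early with a clear message; `resolve_api_key` is a hypothetical helper for illustration, not part of the SDK.

```python
import os


def resolve_api_key(env=None):
    """Return the API key from the environment, or raise a clear error.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    This is a hypothetical helper, not an SDK function.
    """
    env = os.environ if env is None else env
    key = env.get("SHUNYALABS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SHUNYALABS_API_KEY is not set; export it before starting the app"
        )
    return key
```

The returned value can then be passed as `AsyncShunyaClient(api_key=resolve_api_key())`, which keeps the key out of source code while still surfacing configuration mistakes immediately.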
Step 3: Your first synthesis
import asyncio
from shunyalabs import AsyncShunyaClient
from shunyalabs.tts import TTSConfig
async def main():
    async with AsyncShunyaClient() as client:
        result = await client.tts.synthesize(
            "Hello, how are you today?",
            config=TTSConfig(model="zero-indic", voice="Varun"),
        )
        result.save("output.mp3")
        print(f"{len(result.audio_data)} bytes saved")

asyncio.run(main())
The streaming variant yields audio chunks as they are generated:
import asyncio
from shunyalabs import AsyncShunyaClient
from shunyalabs.tts import TTSConfig
async def main():
    async with AsyncShunyaClient() as client:
        config = TTSConfig(model="zero-indic", voice="Sunita", response_format="pcm")
        async for audio in await client.tts.stream("Hello!", config=config):
            play(audio)  # your playback function; bytes arrive ~1s after the call

asyncio.run(main())
What batch returns
result is a TTSResult with three fields:
| Field | Type | Description |
|---|---|---|
| audio_data | bytes | Decoded audio bytes; write them to a file or pass them to playback. |
| sample_rate | int | Sample rate in Hz (e.g. 22050 for mp3, 8000 for mulaw). |
| format | str | Matches the response_format in TTSConfig. |
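Raw PCM has no header, so most players need it wrapped in a container before they can open it. The sketch below uses the standard-library `wave` module to wrap `audio_data` using `sample_rate`. It assumes 16-bit mono samples, which the table above does not state, so check the sample layout before relying on it. `pcm_to_wav` is a hypothetical helper name.

```python
import io
import wave


def pcm_to_wav(audio_data: bytes, sample_rate: int,
               channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM bytes in a WAV container.

    Assumption: 16-bit (sample_width=2) mono PCM. Adjust the arguments
    if the service returns a different layout.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(audio_data)
    return buf.getvalue()


def pcm_duration_seconds(audio_data: bytes, sample_rate: int,
                         channels: int = 1, sample_width: int = 2) -> float:
    """Duration implied by the byte count under the same assumptions."""
    return len(audio_data) / (sample_rate * channels * sample_width)
```

With a result requested as `response_format="pcm"`, this would be called as `pcm_to_wav(result.audio_data, result.sample_rate)`.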
Long-form streaming → disk
For long-form content where buffering the whole response is impractical, use stream_to_file(), which keeps memory use constant regardless of length:
await client.tts.stream_to_file(
    "Long audiobook chapter text...",
    "chapter_01.pcm",
    config=TTSConfig(model="zero-indic", voice="Varun", response_format="pcm"),
)
Step 4: TTSConfig knobs you'll actually use
TTSConfig(
    model="zero-indic",               # required
    voice="Rajesh",                   # required, see /tts/voices
    response_format="pcm",            # pcm, wav, mp3, ogg_opus, flac, mulaw, alaw
    speed=1.0,                        # 0.25 → 4.0
    language="hi",                    # optional ISO 639 hint
    trim_silence=True,                # strip leading/trailing silence
    volume_normalization="loudness",  # "peak" or "loudness"
)
Other knobs worth knowing:
- background_audio="cafe" + background_volume=0.1: preset ambient mix.
- reference_wav + reference_text: voice cloning from a 1-6 second sample.
- word_timestamps=True: per-word timing for captions and alignment (batch only).
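When these options come from user input or a config file, it can help to reject bad values before any network call. The sketch below checks the ranges listed in the comments above. `check_tts_options` is a hypothetical helper, and whether the SDK performs the same validation itself is not stated here.

```python
# Values copied from the TTSConfig comments above.
ALLOWED_FORMATS = {"pcm", "wav", "mp3", "ogg_opus", "flac", "mulaw", "alaw"}
ALLOWED_NORMALIZATION = {"peak", "loudness"}
SPEED_MIN, SPEED_MAX = 0.25, 4.0


def check_tts_options(response_format="pcm", speed=1.0, volume_normalization=None):
    """Validate option values and return them as keyword arguments.

    Hypothetical pre-flight check; it does not call the SDK.
    """
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not SPEED_MIN <= speed <= SPEED_MAX:
        raise ValueError(f"speed must be between {SPEED_MIN} and {SPEED_MAX}")
    options = {"response_format": response_format, "speed": speed}
    if volume_normalization is not None:
        if volume_normalization not in ALLOWED_NORMALIZATION:
            raise ValueError(
                f"unsupported volume_normalization: {volume_normalization!r}"
            )
        options["volume_normalization"] = volume_normalization
    return options
```

The returned dict can be unpacked into the config, for example `TTSConfig(model="zero-indic", voice="Rajesh", **check_tts_options(speed=1.2))`.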
Step 5: Error handling
All exceptions inherit from ShunyalabsError. Import from shunyalabs.exceptions:
from shunyalabs.exceptions import (
    AuthenticationError, RateLimitError,
    SynthesisError, ServerError, ShunyalabsError,
)

# Runs inside an async function, with client and config set up as in Step 3.
try:
    result = await client.tts.synthesize("Hello!", config=config)
except AuthenticationError:
    print("Invalid API key, check SHUNYALABS_API_KEY")
except RateLimitError:
    print("Rate limit hit, back off and retry")
except SynthesisError as e:
    print(f"Synthesis failed: {e}")
except ServerError:
    print("Server error, safe to retry")
except ShunyalabsError as e:
    print(f"SDK error: {e}")
Step 6: LLM → TTS pipeline (the conversational pattern)
Pipe LLM tokens into TTS at sentence boundaries. This cuts time-to-first-audio by 200-400 ms compared with waiting for the full LLM response.
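The flush rule itself (accumulate tokens, emit when a token ends a sentence and the buffer is long enough) can be isolated as a pure generator and tested without any network access. This is a sketch: `chunk_sentences` is a hypothetical name, and the punctuation set and 15-character minimum are tunable assumptions.

```python
SENTENCE_END = (".", "!", "?", ";")


def chunk_sentences(tokens, min_chars=15):
    """Group streamed tokens into sentence-sized chunks for TTS.

    A chunk is emitted when the latest token ends with sentence punctuation
    and the buffer is longer than min_chars, so very short sentences are
    merged with the text that follows them.
    """
    buffer = ""
    for token in tokens:
        buffer += token
        # Check the token's ending: it may be "." or "today." or ".\n".
        if token.rstrip().endswith(SENTENCE_END) and len(buffer) > min_chars:
            yield buffer
            buffer = ""
    # Flush whatever remains once the token stream ends.
    if buffer.strip():
        yield buffer
```

Each yielded chunk corresponds to one TTS streaming call. The pipeline that follows applies the same rule inline while reading from the OpenAI stream.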
from openai import AsyncOpenAI
from shunyalabs import AsyncShunyaClient
from shunyalabs.tts import TTSConfig

SENTENCE_END = (".", "!", "?", ";")

async def gpt_to_tts(user_message: str):
    oai = AsyncOpenAI()
    config = TTSConfig(model="zero-indic", voice="Sunita", response_format="pcm")
    buffer = ""
    # async with closes the TTS connection even if the LLM stream fails
    async with AsyncShunyaClient() as shunya:
        stream = await oai.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": user_message}],
            stream=True,
        )
        async for chunk in stream:
            if not chunk.choices:  # some chunks carry no choices
                continue
            token = chunk.choices[0].delta.content or ""
            buffer += token
            # a token may be "." or "today." or ".\n", so test how it ends
            if token.rstrip().endswith(SENTENCE_END) and len(buffer) > 15:
                async for audio in await shunya.tts.stream(buffer, config=config):
                    play(audio)  # your playback function
                buffer = ""
        # flush whatever is left once the LLM stream ends
        if buffer.strip():
            async for audio in await shunya.tts.stream(buffer, config=config):
                play(audio)
Step 7: Ship checklist
- ✅ SHUNYALABS_API_KEY from environment, never hardcoded
- ✅ async with AsyncShunyaClient() for proper connection cleanup
- ✅ Error handler covers AuthenticationError, RateLimitError, ConnectionError
- ✅ stream_to_file() for long-form synthesis (no memory pressure)
- ✅ Reconnect / retry on ConnectionError for long-running streaming sessions
- ✅ Sentence-boundary buffering when piping from an LLM (don't flush per token)
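The retry items in the checklist can be covered by one small wrapper with exponential backoff and jitter. The sketch below is a generic asyncio helper, not an SDK feature; `retry_async` is a hypothetical name. Its default `retryable` tuple uses Python's built-in `ConnectionError`. If the SDK raises its own connection exception class, pass that class instead, along with `RateLimitError` and `ServerError` from shunyalabs.exceptions.

```python
import asyncio
import random


async def retry_async(make_call, retryable=(ConnectionError,), attempts=4,
                      base_delay=0.5, max_delay=8.0, sleep=asyncio.sleep):
    """Await make_call(), retrying on the given exception types.

    make_call is a zero-argument function returning a fresh awaitable,
    because a coroutine object cannot be awaited twice. The delay doubles
    on each attempt, capped at max_delay, with up to 50% random jitter.
    """
    for attempt in range(attempts):
        try:
            return await make_call()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            await sleep(delay + random.uniform(0, delay / 2))
```

A call site might look like `await retry_async(lambda: client.tts.synthesize(text, config=config), retryable=(RateLimitError, ServerError))`. Leave authentication errors out of `retryable`, since retrying with the same invalid key cannot succeed.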
base_url. See OpenAI compatible.