Streaming TTS
Text to Speech — Streaming
Open a persistent WebSocket connection and receive audio chunks in real time.
How it works
Streaming synthesis opens a WebSocket to the TTS server and yields audio chunks as they are generated, so playback can start before the full utterance has been synthesized.
| Property | Value |
|---|---|
| Transport | WebSocket |
| Endpoint | wss://tts.shunyalabs.ai/ws/v1/audio/speech (also available at /ws/tts and /ws) |
| Config | TTSConfig |
| Default format | mp3 |
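The flow described above can be sketched in Python. This is a minimal illustration, not the official client. It assumes the third-party `websockets` package. The request payload (a JSON object with `text` and `format` keys) and the convention that audio arrives as binary frames are assumptions made for the example; the page only specifies the endpoint and the default format. `audio_chunks` and `stream_speech` are hypothetical helper names, and authentication is omitted because it is not described here.

```python
import asyncio
import json
from typing import AsyncIterable, AsyncIterator, Union

ENDPOINT = "wss://tts.shunyalabs.ai/ws/v1/audio/speech"  # from the table above


async def audio_chunks(
    messages: AsyncIterable[Union[bytes, str]],
) -> AsyncIterator[bytes]:
    """Yield non-empty binary frames and skip everything else.

    ASSUMPTION: audio is delivered as binary WebSocket frames and any text
    frames are control or status messages. Check the server's real protocol.
    """
    async for message in messages:
        if isinstance(message, (bytes, bytearray)) and message:
            yield bytes(message)


async def stream_speech(text: str, fmt: str = "mp3") -> AsyncIterator[bytes]:
    """Open the WebSocket, send one request, and yield audio as it arrives."""
    import websockets  # third-party: pip install websockets

    async with websockets.connect(ENDPOINT) as ws:
        # ASSUMPTION: hypothetical request shape; the actual TTSConfig
        # fields may be named differently.
        await ws.send(json.dumps({"text": text, "format": fmt}))
        async for chunk in audio_chunks(ws):
            yield chunk
```

A caller would consume it with `async for chunk in stream_speech("Hello"): ...` and hand each chunk to a player or file as soon as it arrives. The loop ends only when the server closes the connection, so a real client may also need an explicit end-of-stream signal if the server keeps the socket open between utterances.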
When to use streaming
- Conversational AI and voice assistants that need low-latency audio
- Real-time narration where playback must begin before synthesis completes
- Telephony and call-center bots that stream audio to callers
- Any use case where time-to-first-byte matters