Models Overview

Shunya TTS currently ships one production model. This page describes its capabilities and intended use cases.


Available Models

MODEL         DESCRIPTION
zero-indic    Multi-lingual, multi-speaker Indic + English TTS model optimized for low latency and natural prosody across 23 languages.

Capabilities

  • 23 languages -- Hindi, English, Tamil, Telugu, Kannada, Malayalam, Bengali, Marathi, Gujarati, Punjabi, Odia, Assamese, and more.
  • 46 voices -- Male and female voices across all supported languages.
  • 11 expression styles -- Conversational, Newscast, Cheerful, Sad, Angry, Whisper, Excited, Friendly, Hopeful, Shouting, Terrified.
  • 7 output formats -- MP3, PCM, WAV, Ogg Opus, FLAC, mu-law, A-law.
  • Cross-lingual synthesis -- Any voice can speak any supported language, enabling code-mixed content without voice switching.
  • Voice cloning -- Provide a 1-6 second reference WAV to clone a custom voice on the fly.
  • Speed control -- 0.25x to 4.0x playback speed.
  • Silence trimming -- Remove leading and trailing silence for telephony and notification use cases.
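
The numeric limits in the list above can be checked on the client before a request is sent. The following is a minimal sketch that uses only the Python standard library. The helper names (reference_duration_seconds, is_valid_reference, is_valid_speed, make_silent_wav) are hypothetical and not part of any Shunya SDK, and treating the 1-6 second and 0.25x-4.0x bounds as inclusive is an assumption.

```python
import io
import wave

# Limits taken from the capability list; inclusive bounds are an assumption.
MIN_REF_SECONDS, MAX_REF_SECONDS = 1.0, 6.0
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def reference_duration_seconds(wav_bytes: bytes) -> float:
    """Return the duration of a PCM WAV clip given as raw file bytes."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_valid_reference(wav_bytes: bytes) -> bool:
    """Check that a reference clip falls in the documented 1-6 second window."""
    duration = reference_duration_seconds(wav_bytes)
    return MIN_REF_SECONDS <= duration <= MAX_REF_SECONDS


def is_valid_speed(speed: float) -> bool:
    """Check that a speed multiplier falls in the documented 0.25x-4.0x range."""
    return MIN_SPEED <= speed <= MAX_SPEED


def make_silent_wav(seconds: float, sample_rate: int = 16000) -> bytes:
    """Build a mono 16-bit silent WAV in memory, for trying the helpers out."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buffer.getvalue()
```

Checking the clip length locally avoids uploading a reference that falls outside the documented window. The service may apply further constraints (sample rate, channel count) that this page does not specify.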

Usage

Pass model="zero-indic" in your TTSConfig. The model parameter is required for all synthesis requests, and "zero-indic" is currently the only supported value.
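
As a rough illustration of that requirement, the sketch below defines a local stand-in class. It is not the SDK's real TTSConfig: this page does not show the SDK's import path or any fields other than model, so the class here is hypothetical and models only the documented rule that model must be present and set to "zero-indic".

```python
from dataclasses import dataclass

# The only model name this page documents.
SUPPORTED_MODELS = ("zero-indic",)


@dataclass(frozen=True)
class TTSConfig:
    """Illustrative stand-in, NOT the SDK's real TTSConfig class."""

    model: str  # no default, so omitting it fails at construction time

    def __post_init__(self) -> None:
        # Reject any model name outside the documented set.
        if self.model not in SUPPORTED_MODELS:
            raise ValueError(
                f"unsupported model {self.model!r}; expected one of {SUPPORTED_MODELS}"
            )


config = TTSConfig(model="zero-indic")
```

Validating the value at construction time surfaces a misspelled model name before any request is made. How the real SDK reports an invalid or missing model is not stated here.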