Batch TTS

Tips & Tricks

Practical advice for getting the best results from batch TTS synthesis.


Choosing the right output format

  • Use mp3 for general storage and delivery: good compression, widely supported.
  • Use pcm or wav for real-time pipelines: the audio is uncompressed, so there is no decoding overhead.
  • Use mulaw or alaw for telephony (Twilio, IVR, PSTN): both are standard 8 kHz codecs (G.711 μ-law and A-law).
  • Use ogg_opus for web streaming: lower latency than MP3 at similar quality.
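
The guidance above can be captured as a small lookup table, so the format choice lives in one place in your code. This is only a sketch: the format names come from the list above, but the use-case keys and the choose_output_format helper are hypothetical and not part of any SDK.

```python
# Hypothetical mapping from use case to output format.
# Format names are taken from the list above; the use-case keys are
# illustrative labels, not values recognized by any API.
FORMAT_BY_USE_CASE = {
    "storage": "mp3",            # general storage and delivery
    "realtime": "pcm",           # uncompressed, no decoding step
    "telephony": "mulaw",        # 8 kHz telephony codec (alaw is the alternative)
    "web_streaming": "ogg_opus", # web playback
}


def choose_output_format(use_case: str, default: str = "mp3") -> str:
    """Return the suggested output format, falling back to `default`."""
    return FORMAT_BY_USE_CASE.get(use_case, default)


if __name__ == "__main__":
    print(choose_output_format("telephony"))
    print(choose_output_format("something_else"))
```

Falling back to mp3 for unknown use cases is a design choice of this sketch, made because mp3 is the general-purpose option in the list.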

Getting the best audio quality

  • Send complete sentences: the model performs best on full utterances, not fragments.
  • Use trim_silence=True for notifications and IVR prompts where tight audio matters.
  • Use volume_normalization="loudness" for consistent perceived volume across multiple clips.
  • Experiment with expression style tags to make speech sound more natural and engaging.
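
One way to apply the first three tips together is to split long text into whole sentences before building the batch, and attach the same quality settings to every item. The sketch below assumes a batch is a list of per-item dictionaries; that shape, the "text" key, and both helper names are hypothetical. Only trim_silence and volume_normalization="loudness" come from the tips above, so check your SDK for the actual request format.

```python
import re


def split_into_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace.

    It will mis-split abbreviations such as "Dr. Smith"; use a proper
    sentence tokenizer for production text.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def build_batch_items(
    text: str,
    *,
    trim_silence: bool = True,
    volume_normalization: str = "loudness",
) -> list[dict]:
    """Build one hypothetical batch item per complete sentence."""
    return [
        {
            "text": sentence,  # assumed key name
            "trim_silence": trim_silence,
            "volume_normalization": volume_normalization,
        }
        for sentence in split_into_sentences(text)
    ]


if __name__ == "__main__":
    for item in build_batch_items("Your order has shipped. It arrives Tuesday!"):
        print(item)
```

Using the same volume_normalization value for every item is what keeps perceived volume consistent when the resulting clips are played back to back.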