Media & entertainment
Media & entertainment is one of the verticals Shunya Labs targets, described on the company site as "Automation for production & post-processing." This page links the Shunya capabilities you'll typically reach for when building in this space.
Recommended Shunya capabilities
| Capability | Shunya component | Source |
|---|---|---|
| Batch transcription with word timestamps | POST /v1/audio/transcriptions with word_timestamps=true | ASR API guide |
| Speaker separation for interviews / panels | enable_diarization=true; add enable_speaker_identification with registered voices to label speakers by their real names | ASR API guide |
| Multi-language captions | Vāķ Translate, 55 Indic languages, 2,970 any-to-any pairs, BLEU 38.5 weighted average | Vāķ HF model card |
| Dubbing & voiceover | Zero TTS, 23 Indic languages + English, 46 voices, 11 expression styles | TTS docs §5 |
| Voice cloning | reference_wav + reference_text: clone from a 1-6 second sample, works across all 23 supported Indic languages | TTS docs §6 |
| Subtitle file output | Word timestamps in the verbose JSON response, ready to serialise as SRT/VTT | ASR API guide |
| Content moderation | enable_profanity_hashing; hash_keywords with a banned-phrase list | ASR API guide |
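
The subtitle row above leaves the SRT serialisation step to you. A minimal Python sketch of that step follows. It assumes each word entry looks like `{"word": ..., "start": ..., "end": ...}` with times in seconds; that shape, the function names, and the `max_chars` / `max_gap` cue-breaking thresholds are illustrative assumptions, not part of the published Shunya response schema, so check the ASR API guide for the actual field names.

```python
from typing import Dict, List

# Assumed (hypothetical) word shape: {"word": str, "start": float, "end": float},
# with start/end in seconds. Adapt the keys to the real verbose JSON response.
Word = Dict[str, object]


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def group_words(words: List[Word], max_chars: int = 42,
                max_gap: float = 0.8) -> List[List[Word]]:
    """Group words into cues, starting a new cue when the text would exceed
    max_chars or when the pause before the next word is longer than max_gap."""
    cues: List[List[Word]] = []
    current: List[Word] = []
    for w in words:
        if current:
            text_len = len(" ".join(str(x["word"]) for x in current)) + 1 + len(str(w["word"]))
            gap = float(w["start"]) - float(current[-1]["end"])
            if text_len > max_chars or gap > max_gap:
                cues.append(current)
                current = []
        current.append(w)
    if current:
        cues.append(current)
    return cues


def words_to_srt(words: List[Word], max_chars: int = 42,
                 max_gap: float = 0.8) -> str:
    """Serialise word-level timestamps as SRT text."""
    blocks = []
    for index, cue in enumerate(group_words(words, max_chars, max_gap), start=1):
        start = srt_timestamp(float(cue[0]["start"]))
        end = srt_timestamp(float(cue[-1]["end"]))
        text = " ".join(str(w["word"]) for w in cue)
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)
```

Calling `words_to_srt(words)` on the word list from a transcription response returns a string you can write directly to a `.srt` file. WebVTT differs mainly in its `WEBVTT` header line and in using a period rather than a comma before the milliseconds, so the same grouping logic can be reused for it.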
Voice cloning consent
When dubbing with a cloned voice, confirm that you hold both the rights to the original recording and written consent from the voice owner to synthesize their voice. This is both a legal and a reputational requirement.
Source: Shunya Labs website (Media & Entertainment vertical), ASR Gateway API Reference, TTS Developer Documentation §5 and §6, Vāķ Translate model card on Hugging Face. Specific production pipelines, ROI estimates, and tooling integrations are not officially published by Shunya.