Shunya Labs Docs

Pipecat integration

pipecat-shunyalabs v1.0.3

Native Shunyalabs STT and TTS services for Pipecat pipelines. Drop ShunyalabsSTTService and ShunyalabsTTSService into any Pipecat pipeline to get real-time streaming ASR in 23 languages and streaming TTS with 46 speakers, with no glue code required.

Requirements
Python 3.9+, Pipecat framework, and a valid Shunyalabs API key. Install a Pipecat transport (e.g. pipecat-ai[daily]) for WebRTC support.

Installation

Install the package from PyPI:

Terminal
pip install pipecat-shunyalabs

To include a transport (e.g. Daily WebRTC):

Terminal
pip install pipecat-shunyalabs pipecat-ai[daily]

Authentication

Set your API key as an environment variable (recommended) or pass it directly to the service classes:

Environment variable
export SHUNYALABS_API_KEY="your-api-key"
Python — inline
stt = ShunyalabsSTTService(api_key="your-api-key")
tts = ShunyalabsTTSService(api_key="your-api-key")
Security
Never commit API keys to source control. Use a secrets manager (GCP Secret Manager, AWS Secrets Manager, HashiCorp Vault) in production.
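
The lookup order described in this section (explicit argument first, then the SHUNYALABS_API_KEY environment variable) can be sketched as a small helper that fails early with a clear message. `resolve_api_key` is a hypothetical name used for illustration only; it is not part of pipecat-shunyalabs, and the service classes may resolve the key differently internally.

```python
import os
from typing import Optional


def resolve_api_key(explicit_key: Optional[str] = None,
                    env_var: str = "SHUNYALABS_API_KEY") -> str:
    """Return the explicit key if given, otherwise read it from the environment.

    Hypothetical helper for illustration; not part of pipecat-shunyalabs.
    """
    key = explicit_key or os.environ.get(env_var)
    if not key:
        # Fail at startup rather than on the first WebSocket handshake.
        raise RuntimeError(
            f"No API key found: pass one explicitly or set {env_var}."
        )
    return key
```

Calling `resolve_api_key()` once at startup and passing the result to both services keeps the key out of source code while still surfacing a missing variable immediately.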

Quick start

A minimal pipeline wiring Shunyalabs STT → OpenAI LLM → Shunyalabs TTS on a local audio transport:

Python
import asyncio, os
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.services.openai import OpenAILLMService
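from pipecat.transports.local.audio import LocalAudioTransport, LocalAudioTransportParams
from pipecat_shunyalabs import ShunyalabsSTTService, ShunyalabsTTSService

async def main():
    # Enable both microphone input and speaker output on the local transport.
    transport = LocalAudioTransport(
        LocalAudioTransportParams(audio_in_enabled=True, audio_out_enabled=True)
    )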

    stt = ShunyalabsSTTService(
        api_key=os.environ["SHUNYALABS_API_KEY"],
        language="en",
    )

    llm = OpenAILLMService(
        api_key=os.environ["OPENAI_API_KEY"],
        model="gpt-4o",
    )

    tts = ShunyalabsTTSService(
        api_key=os.environ["SHUNYALABS_API_KEY"],
        voice="Rajesh",
        language="en",
        style="<Conversational>",
    )

    pipeline = Pipeline([transport.input(), stt, llm, tts, transport.output()])
    task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
    await PipelineRunner().run(task)

if __name__ == "__main__":
    asyncio.run(main())

STT - ShunyalabsSTTService

Real-time streaming speech-to-text over a persistent WebSocket connection. Supports 23 Indian and international languages with optional automatic language detection.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str | None | API key. Falls back to the SHUNYALABS_API_KEY env var. |
| language | str | "auto" | Language code (e.g. "en", "hi") or "auto" for detection. |
| url | str | wss://asr.shunyalabs.ai/ws | WebSocket endpoint URL. |
| sample_rate | int | 16000 | Expected audio sample rate in Hz. Must match the transport input. |

Frame mapping

| Shunyalabs event | Pipecat frame | When it is emitted |
| --- | --- | --- |
| PARTIAL | InterimTranscriptionFrame | Continuously, as speech is recognised |
| FINAL_SEGMENT | TranscriptionFrame | At a speech segment boundary |
| FINAL | TranscriptionFrame | When the full utterance is finalised |
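
One way a downstream consumer might treat these frames: interim text is provisional and gets overwritten, while final text is committed. The sketch below uses plain stand-in dataclasses named after the Pipecat frames so that it runs without Pipecat installed; the real frame classes carry more fields, and `TranscriptBuffer` is a hypothetical helper, not something the library provides.

```python
from dataclasses import dataclass, field
from typing import List


# Stand-ins for the Pipecat frame types, reduced to the text field only.
@dataclass
class InterimTranscriptionFrame:
    text: str


@dataclass
class TranscriptionFrame:
    text: str


@dataclass
class TranscriptBuffer:
    """Keeps committed segments plus the latest provisional text."""
    committed: List[str] = field(default_factory=list)
    pending: str = ""

    def handle(self, frame) -> None:
        if isinstance(frame, InterimTranscriptionFrame):
            self.pending = frame.text          # overwrite, never append
        elif isinstance(frame, TranscriptionFrame):
            self.committed.append(frame.text)  # final text is kept
            self.pending = ""

    def display(self) -> str:
        """Committed text followed by any provisional text."""
        parts = self.committed + ([self.pending] if self.pending else [])
        return " ".join(parts)
```

Whether a FINAL event repeats text already delivered by FINAL_SEGMENT is not stated here, so a real consumer may need to de-duplicate before committing.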
Python — STT example
from pipecat_shunyalabs import ShunyalabsSTTService

stt = ShunyalabsSTTService(
    language="hi",       # Hindi; use "auto" for detection
    sample_rate=16000,
)
Auto-reconnect
If the WebSocket connection drops during audio streaming, the service automatically reconnects and resumes sending audio.

TTS - ShunyalabsTTSService

Streaming text-to-speech over WebSocket. Each synthesis request opens a new connection and streams audio chunks back as TTSAudioRawFrame frames. Supports 46 speakers across 23 languages - any speaker can synthesise in any language.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str | None | API key. Falls back to the SHUNYALABS_API_KEY env var. |
| url | str | wss://tts.shunyalabs.ai/ws | WebSocket endpoint URL. |
| model | str | "zero-indic" | TTS model identifier. |
| voice | str | "Rajesh" | Speaker voice name. |
| style | str | "<Neutral>" | Emotion / delivery style tag. |
| language | str | "en" | Output language code. |
| output_format | str | "pcm" | Audio encoding: pcm, wav, mp3, ogg_opus, flac, mulaw, alaw. |
| speed | float | 1.0 | Speaking speed multiplier (0.25 – 4.0). |
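
The constraints in the table (the allowed output_format values and the 0.25 to 4.0 range for speed) can be checked before the service is constructed. This is a minimal sketch under the assumption that catching bad values early is useful in your application; `validate_tts_options` is a hypothetical helper, and whether the library performs its own validation is not stated here.

```python
# Values copied from the parameter table above.
ALLOWED_OUTPUT_FORMATS = {"pcm", "wav", "mp3", "ogg_opus", "flac", "mulaw", "alaw"}
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def validate_tts_options(output_format: str = "pcm", speed: float = 1.0) -> None:
    """Raise ValueError if the options fall outside the documented ranges.

    Hypothetical helper for illustration; not part of pipecat-shunyalabs.
    """
    if output_format not in ALLOWED_OUTPUT_FORMATS:
        allowed = ", ".join(sorted(ALLOWED_OUTPUT_FORMATS))
        raise ValueError(f"output_format must be one of: {allowed}")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
```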

Style tags

| Tag | Description |
| --- | --- |
| <Neutral> | Clean read speech (default) |
| <Happy> | Joyful, upbeat tone |
| <Sad> | Somber, melancholic tone |
| <Angry> | Forceful, intense tone |
| <Fearful> | Anxious, trembling tone |
| <Surprised> | Exclamatory, astonished tone |
| <Disgust> | Repulsed, disapproving tone |
| <News> | Formal news-anchor style |
| <Conversational> | Casual, everyday speech; recommended for voice agents |
| <Narrative> | Storytelling / audiobook delivery |
| <Enthusiastic> | Energetic, passionate tone |

Text formatting

The service automatically prepends the style tag before sending to the API:

Python
tts = ShunyalabsTTSService(voice="Rajesh", style="<Happy>")
# Input:  "Welcome!"
# Sent:   "<Happy> Welcome!"
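
The prepending step amounts to a simple string join. The sketch below reproduces the input and output shown above and rejects tags that are not in the style table; `apply_style_tag` is a hypothetical function written for illustration, not the library's internal implementation.

```python
# Tags copied from the style table above.
STYLE_TAGS = {
    "<Neutral>", "<Happy>", "<Sad>", "<Angry>", "<Fearful>", "<Surprised>",
    "<Disgust>", "<News>", "<Conversational>", "<Narrative>", "<Enthusiastic>",
}


def apply_style_tag(text: str, style: str = "<Neutral>") -> str:
    """Prefix the text with a style tag, separated by a single space.

    Hypothetical helper for illustration; not part of pipecat-shunyalabs.
    """
    if style not in STYLE_TAGS:
        raise ValueError(f"Unknown style tag: {style}")
    return f"{style} {text}"
```

Because the service adds the tag itself, text handed to it presumably should not already contain one.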

Full pipeline example

A complete voice agent with Shunyalabs STT and TTS, OpenAI LLM, and the Daily WebRTC transport:

Python
import asyncio, os
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import (
    OpenAILLMContext, OpenAILLMContextAggregator,
)
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
from pipecat_shunyalabs import ShunyalabsSTTService, ShunyalabsTTSService

async def run_voice_agent(room_url: str, token: str):
    transport = DailyTransport(
        room_url, token, "Shunyalabs Agent",
        DailyParams(audio_out_enabled=True, transcription_enabled=False),
    )

    stt = ShunyalabsSTTService(
        api_key=os.environ["SHUNYALABS_API_KEY"],
        language="auto",
        sample_rate=16000,
    )

    llm = OpenAILLMService(
        api_key=os.environ["OPENAI_API_KEY"],
        model="gpt-4o",
    )

    messages = [{"role": "system", "content": "You are a helpful voice assistant."}]
    context = OpenAILLMContext(messages)
    context_aggregator = llm.create_context_aggregator(context)

    tts = ShunyalabsTTSService(
        api_key=os.environ["SHUNYALABS_API_KEY"],
        voice="Rajesh",
        language="hi",
        style="<Conversational>",
    )

    pipeline = Pipeline([
        transport.input(),
        stt,
        context_aggregator.user(),
        llm,
        tts,
        transport.output(),
        context_aggregator.assistant(),
    ])

    task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True, enable_metrics=True))

    @transport.event_handler("on_first_participant_joined")
    async def on_first_participant_joined(transport, participant):
        await task.queue_frames([context_aggregator.user().get_context_frame()])

    await PipelineRunner().run(task)

if __name__ == "__main__":
    asyncio.run(run_voice_agent(
        room_url=os.environ["DAILY_ROOM_URL"],
        token=os.environ["DAILY_TOKEN"],
    ))

Multilingual example

Python
# Hindi conversational bot
tts = ShunyalabsTTSService(voice="Rajesh", language="hi", style="<Conversational>")

# English news-style bot
tts = ShunyalabsTTSService(voice="Varun", language="en", style="<News>")

Error reference

| Exception | HTTP code | Description |
| --- | --- | --- |
| AuthenticationError | 401 | Invalid or missing API key. |
| PermissionDeniedError | 403 | API key lacks permission for the resource. |
| RateLimitError | 429 | Rate limit exceeded. Implement exponential backoff. |
| ServerError | 5xx | Server-side error. Retried automatically. |
| TimeoutError | - | Request exceeded timeout (default 60 s). |
| TranscriptionError | - | ASR-specific failure (e.g. unsupported audio format). |
| SynthesisError | - | TTS-specific failure (e.g. invalid voice parameter). |
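
For RateLimitError, the table recommends exponential backoff. A common variant is capped exponential backoff with full jitter, sketched below. The `RateLimitError` class here is a local stand-in, because the import path of the library's exception is not given on this page; `backoff_delay` and `call_with_backoff` are hypothetical helpers, and the base and cap values are arbitrary defaults.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Local stand-in for the library's exception of the same name."""


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  rng: Optional[random.Random] = None) -> float:
    """Capped exponential backoff with full jitter (attempt starts at 0)."""
    source = rng if rng is not None else random
    return source.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_backoff(fn: Callable[[], T], max_retries: int = 4,
                      base: float = 1.0, cap: float = 30.0) -> T:
    """Call fn, retrying on RateLimitError with a jittered delay between tries."""
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the error to the caller
            time.sleep(backoff_delay(attempt, base, cap))
    raise RuntimeError("not reached when max_retries >= 0")
```

Inside an asyncio pipeline the blocking `time.sleep` would need to be replaced with `await asyncio.sleep(...)` so that the event loop keeps running.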

Troubleshooting

| Symptom | Resolution |
| --- | --- |
| AuthenticationError on startup | Verify that SHUNYALABS_API_KEY is set and valid. |
| WebSocket connection refused | Ensure outbound WSS (port 443) is open to asr.shunyalabs.ai and tts.shunyalabs.ai. |
| No transcription output | Check that sample_rate matches your transport input. Verify that the audio source is active. |
| TTS audio silent or missing | Ensure output_format=pcm matches the transport output. Verify that TTSStartedFrame is received. |
| High latency on first TTS chunk | Deploy closer to the Shunyalabs gateway region (asia-south1). |
| ImportError: pipecat_shunyalabs | Run pip install pipecat-shunyalabs and confirm that your virtual environment is activated. |
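
For the "No transcription output" case, one quick check when testing with recorded audio is to compare the file's sample rate with the sample_rate passed to ShunyalabsSTTService. The sketch below uses only the standard-library wave module and assumes the test audio is a WAV file; `wav_sample_rate` and `matches_stt_sample_rate` are hypothetical helper names.

```python
import wave


def wav_sample_rate(path: str) -> int:
    """Read the sample rate (Hz) from a WAV file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def matches_stt_sample_rate(path: str, expected: int = 16000) -> bool:
    """True if the file's sample rate equals the rate configured for STT."""
    return wav_sample_rate(path) == expected
```

A mismatch (for example an 8000 Hz recording fed to a service configured for 16000 Hz) means the audio should be resampled or the sample_rate parameter changed to match.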