Shunya Labs Docs

Official Shunyalabs plugin for LiveKit Agents. Plug shunyalabs.STT and shunyalabs.TTS directly into a LiveKit AgentSession for real-time streaming transcription, batch recognition, and high-fidelity multilingual voice synthesis.

Requirements
Python 3.9+, LiveKit Agents framework, and a valid Shunyalabs API key.

Installation

Terminal
pip install livekit-plugins-shunyalabs

Authentication

Set your API key as an environment variable or pass it directly to the plugin classes:

Environment variable
export SHUNYALABS_API_KEY="your-api-key"
Python — inline
from livekit.plugins import shunyalabs

stt = shunyalabs.STT(api_key="your-api-key")
tts = shunyalabs.TTS(api_key="your-api-key")
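
The lookup order described here (an explicit api_key first, then the SHUNYALABS_API_KEY environment variable) can be sketched as below. The helper name resolve_api_key is hypothetical and only illustrates the order; it is not part of the plugin, whose internal lookup is not shown on this page.

```python
import os
from typing import Optional


def resolve_api_key(explicit: Optional[str] = None) -> str:
    """Hypothetical helper: prefer an explicit key, else SHUNYALABS_API_KEY."""
    key = explicit or os.environ.get("SHUNYALABS_API_KEY")
    if not key:
        # Fail early with a clear message rather than at the first request.
        raise ValueError(
            "No API key found: pass api_key=... or set SHUNYALABS_API_KEY"
        )
    return key
```

Failing early like this surfaces a missing key at startup instead of at the first transcription or synthesis call.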

Quick start

The minimal wiring: pass shunyalabs.STT and shunyalabs.TTS into an AgentSession, together with a VAD:

Python
from livekit.agents import AgentSession
from livekit.plugins import shunyalabs, silero

session = AgentSession(
    stt=shunyalabs.STT(language="en"),
    tts=shunyalabs.TTS(speaker="Rajesh", style="<Neutral>"),
    vad=silero.VAD.load(),
)

STT - shunyalabs.STT

Streaming and batch speech-to-text backed by the Shunyalabs ASR gateway. Audio frames from LiveKit are forwarded over WebSocket; transcription events are pushed back as SpeechEvents.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str | None | API key. Falls back to the SHUNYALABS_API_KEY environment variable. |
| language | str | "auto" | BCP-47 language code, or "auto" for automatic detection. |
| api_url | str | https://asr.shunyalabs.ai | REST batch endpoint base URL. |
| ws_url | str | wss://asr.shunyalabs.ai/ws | WebSocket streaming endpoint URL. |

Capabilities

| Capability | Supported |
| --- | --- |
| Streaming (real-time) | Yes |
| Interim results | Yes |
| Offline / batch recognition | Yes |

Streaming STT

Real-time transcription over WebSocket with event mapping to LiveKit's SpeechEventType:

| Shunyalabs event | LiveKit SpeechEventType |
| --- | --- |
| PARTIAL | INTERIM_TRANSCRIPT |
| FINAL_SEGMENT | FINAL_TRANSCRIPT + END_OF_SPEECH |
| FINAL | FINAL_TRANSCRIPT + RECOGNITION_USAGE |
Python — streaming STT
from livekit.agents import AgentSession
from livekit.plugins import shunyalabs, silero

session = AgentSession(
    stt=shunyalabs.STT(language="en"),
    vad=silero.VAD.load(),
)

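# AgentSession emits "user_input_transcribed" for each transcript update;
# is_final separates committed text from interim results.
@session.on("user_input_transcribed")
def on_speech(ev):
    if ev.is_final:
        print(f"User said: {ev.transcript}")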
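
The event mapping in the table above can be pictured as a simple lookup. This is a minimal sketch, not the plugin's code: the SpeechEventType enum here is a local stand-in limited to the four members named in the table, and map_event is a hypothetical helper.

```python
from enum import Enum
from typing import Dict, List


class SpeechEventType(Enum):
    """Local stand-in for LiveKit's SpeechEventType (only the members used here)."""
    INTERIM_TRANSCRIPT = "interim_transcript"
    FINAL_TRANSCRIPT = "final_transcript"
    END_OF_SPEECH = "end_of_speech"
    RECOGNITION_USAGE = "recognition_usage"


# One gateway event can fan out to more than one LiveKit event.
EVENT_MAP: Dict[str, List[SpeechEventType]] = {
    "PARTIAL": [SpeechEventType.INTERIM_TRANSCRIPT],
    "FINAL_SEGMENT": [
        SpeechEventType.FINAL_TRANSCRIPT,
        SpeechEventType.END_OF_SPEECH,
    ],
    "FINAL": [
        SpeechEventType.FINAL_TRANSCRIPT,
        SpeechEventType.RECOGNITION_USAGE,
    ],
}


def map_event(shunyalabs_event: str) -> List[SpeechEventType]:
    """Return the LiveKit event types for one gateway event name (empty if unknown)."""
    return EVENT_MAP.get(shunyalabs_event, [])
```

Returning a list keeps the one-to-many cases (FINAL_SEGMENT and FINAL) explicit.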

Batch STT

Single-shot transcription of an audio buffer via POST /v1/audio/transcriptions:

Python — batch STT
from livekit.plugins import shunyalabs

stt = shunyalabs.STT(language="en")

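# recognize() is a coroutine, so call it from async code (for example,
# inside an agent entrypoint). audio_buffer is audio you have already
# captured: an rtc.AudioFrame or a list of them.
event = await stt.recognize(audio_buffer)
print(event.alternatives[0].text)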
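
For orientation, the batch URL is presumably the api_url base joined with the /v1/audio/transcriptions path. The sketch below assumes the path is appended directly to the base URL; batch_endpoint is a hypothetical helper, and the request body format is not documented here, so it is left out.

```python
DEFAULT_API_URL = "https://asr.shunyalabs.ai"
TRANSCRIPTIONS_PATH = "/v1/audio/transcriptions"


def batch_endpoint(api_url: str = DEFAULT_API_URL) -> str:
    """Join base URL and path, tolerating a trailing slash on the base."""
    return api_url.rstrip("/") + TRANSCRIPTIONS_PATH


print(batch_endpoint())
# https://asr.shunyalabs.ai/v1/audio/transcriptions
```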

TTS - shunyalabs.TTS

Streaming and chunked text-to-speech. In streaming mode the plugin collects incoming text tokens and synthesises the collected text over WebSocket when the stream is flushed; the batch API handles single-shot synthesis over HTTP.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str | None | API key. Falls back to the SHUNYALABS_API_KEY environment variable. |
| api_url | str | https://tts.shunyalabs.ai | HTTP batch endpoint base URL. |
| ws_url | str | wss://tts.shunyalabs.ai/ws | WebSocket streaming endpoint URL. |
| model | str | "zero-indic" | TTS model name. |
| voice | str | "Rajesh" | Voice name for the API. |
| speaker | str | "Rajesh" | Speaker name prefix for text formatting. |
| style | str | "<Neutral>" | Emotion style tag. |
| language | str | "en" | Language code for transliteration. |
| sample_rate | int | 16000 | Output audio sample rate in Hz. |
| output_format | str | "pcm" | Audio format: pcm, wav, mp3, ogg_opus, flac. |
| speed | float | 1.0 | Speaking speed multiplier (0.25 to 4.0). |
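
The documented limits for output_format and speed can be checked before constructing the TTS object. This is a minimal sketch: validate_tts_options is a hypothetical helper, the bounds are assumed to be inclusive, and whether the plugin itself validates these values client-side is not stated here.

```python
ALLOWED_FORMATS = {"pcm", "wav", "mp3", "ogg_opus", "flac"}
MIN_SPEED, MAX_SPEED = 0.25, 4.0  # assumed inclusive


def validate_tts_options(output_format: str = "pcm", speed: float = 1.0) -> None:
    """Raise ValueError if an option is outside the documented values."""
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"Unsupported output_format: {output_format!r}")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(
            f"speed must be between {MIN_SPEED} and {MAX_SPEED}, got {speed}"
        )
```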

Style tags

| Tag | Description |
| --- | --- |
| <Neutral> | Neutral tone (default) |
| <Happy> | Happy / cheerful |
| <Sad> | Sad / melancholic |
| <Angry> | Angry / intense |
| <Fearful> | Fearful / anxious |
| <Surprised> | Surprised / excited |
| <Disgust> | Disgusted |
| <News> | News anchor style |
| <Conversational> | Casual conversational; recommended for voice agents |
| <Narrative> | Storytelling / narration |
| <Enthusiastic> | Enthusiastic / energetic |

Text formatting

The plugin automatically prepends the style tag before sending text to the API:

Python
from livekit.plugins import shunyalabs

tts = shunyalabs.TTS(speaker="Rajesh", style="<Happy>")
# Input:  "Welcome to our platform"
# Sent:   "<Happy> Welcome to our platform"
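
The transformation shown in the Input/Sent comments amounts to a one-line string operation. The function below is illustrative only (format_tts_text is a hypothetical name); the parameter table also describes speaker as a prefix for text formatting, but how that is combined is not shown here, so the sketch leaves it out.

```python
def format_tts_text(text: str, style: str = "<Neutral>") -> str:
    """Illustrative only: prepend the style tag, separated by one space."""
    return f"{style} {text}"


print(format_tts_text("Welcome to our platform", style="<Happy>"))
# <Happy> Welcome to our platform
```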

Streaming TTS example

Python
from livekit.agents import AgentSession
from livekit.plugins import shunyalabs

session = AgentSession(
    tts=shunyalabs.TTS(
        speaker="Nisha",
        style="<Conversational>",
        model="zero-indic",
        voice="Nisha",
    ),
)
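
The collect-then-flush behaviour described for streaming TTS can be pictured with a small buffer. This is a sketch of the idea only, with a hypothetical TextBuffer class; the plugin's actual stream object and WebSocket handling are not reproduced.

```python
from typing import List


class TextBuffer:
    """Illustrative buffer: collect tokens, hand back the full text on flush."""

    def __init__(self) -> None:
        self._tokens: List[str] = []

    def push(self, token: str) -> None:
        # Tokens arrive one at a time, for example from an LLM stream.
        self._tokens.append(token)

    def flush(self) -> str:
        # Join everything collected so far and reset for the next segment.
        text = "".join(self._tokens)
        self._tokens.clear()
        return text


buf = TextBuffer()
for token in ["Hello", ", ", "world", "!"]:
    buf.push(token)
print(buf.flush())  # Hello, world!
```

Synthesising on flush rather than per token gives the model a whole phrase to voice, at the cost of waiting for the flush.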

Chunked (batch) TTS example

Python
from livekit.plugins import shunyalabs

tts = shunyalabs.TTS(speaker="Varun", voice="Varun")
# synthesize() returns a chunked stream of audio; consume it with
# "async for" from async code.
stream = tts.synthesize("Hello, how can I help you today?")
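
When collecting raw PCM output, a quick way to sanity-check how much audio arrived is to convert the byte count to seconds. The sketch assumes 16-bit mono samples, which this page does not state; only the 16000 Hz default sample rate comes from the parameter table. pcm_duration_seconds is a hypothetical helper.

```python
def pcm_duration_seconds(
    num_bytes: int,
    sample_rate: int = 16000,  # documented default
    sample_width: int = 2,     # bytes per sample; 16-bit is an assumption
    channels: int = 1,         # mono is an assumption
) -> float:
    """Duration of a raw PCM buffer in seconds."""
    bytes_per_second = sample_rate * sample_width * channels
    return num_bytes / bytes_per_second


print(pcm_duration_seconds(32000))  # 1.0
```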

Full agent example

Python
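from livekit.agents import (
    Agent,
    AgentSession,
    JobContext,
    RoomInputOptions,
    WorkerOptions,
    cli,
)
from livekit.plugins import shunyalabs, silero

class MyAgent(Agent):
    def __init__(self):
        super().__init__(instructions="You are a helpful voice assistant.")

async def entrypoint(ctx: JobContext):
    # Connect to the room before starting the session.
    await ctx.connect()

    # Add an llm= argument here if the agent should generate replies.
    session = AgentSession(
        stt=shunyalabs.STT(language="auto"),
        tts=shunyalabs.TTS(
            model="zero-indic",
            voice="Rajesh",
            speaker="Rajesh",
            style="<Conversational>",
        ),
        vad=silero.VAD.load(),
    )
    await session.start(
        agent=MyAgent(),
        room=ctx.room,
        room_input_options=RoomInputOptions(),
    )

if __name__ == "__main__":
    # Standard LiveKit Agents worker entry point.
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))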

Multilingual example

Python
from livekit.plugins import shunyalabs

# Hindi speaker
tts_hindi = shunyalabs.TTS(
    speaker="Rajesh", voice="Rajesh",
    language="hi", style="<Neutral>",
)

# English speaker
tts_english = shunyalabs.TTS(
    speaker="Varun", voice="Varun",
    language="en", style="<Conversational>",
)
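
If the language is only known at runtime, the two configurations above can be kept as presets and selected by language code. This is one possible arrangement, not a plugin feature: TTS_PRESETS and tts_kwargs_for are hypothetical names, and the fallback to English is an arbitrary choice.

```python
from typing import Dict

# Presets mirror the Hindi and English configurations shown above.
TTS_PRESETS: Dict[str, Dict[str, str]] = {
    "hi": {
        "speaker": "Rajesh", "voice": "Rajesh",
        "language": "hi", "style": "<Neutral>",
    },
    "en": {
        "speaker": "Varun", "voice": "Varun",
        "language": "en", "style": "<Conversational>",
    },
}


def tts_kwargs_for(language: str, default: str = "en") -> Dict[str, str]:
    """Return a copy of the preset for language, falling back to default."""
    return dict(TTS_PRESETS.get(language, TTS_PRESETS[default]))


# With the plugin installed, the result can be unpacked into the constructor:
# tts = shunyalabs.TTS(**tts_kwargs_for("hi"))
```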