Speaker Intelligence

Emotion Detection

Adds an emotion field directly inside each segment (not in nlp_analysis). Requires enable_diarization=true.

Python SDK

python
config = TranscriptionConfig(
    model="zero-indic",
    enable_diarization=True,
    enable_emotion_diarization=True,
)
result = await client.asr.transcribe("call.wav", config=config)

for seg in result.segments:
    print(f"[{seg.speaker}] [{seg.emotion}] {seg.text}")
# [SPEAKER_00] [angry] मेरी गाड़ी खराब हो गई है
# [SPEAKER_01] [neutral] ठीक है मैं आपकी मदद करता हूँ
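Once transcription returns, the per-segment emotion field lends itself to simple aggregation, for example tallying emotions per speaker for call-quality dashboards. The sketch below assumes segments shaped like the JSON output documented further down (dicts with speaker and emotion keys); the sample data is hypothetical.

```python
from collections import Counter, defaultdict

def emotions_by_speaker(segments):
    """Tally emotion labels per speaker from diarized segments."""
    tally = defaultdict(Counter)
    for seg in segments:
        tally[seg["speaker"]][seg["emotion"]] += 1
    return {speaker: dict(counts) for speaker, counts in tally.items()}

# Hypothetical sample mirroring the segment shape shown in the Output section.
sample = [
    {"speaker": "SPEAKER_00", "emotion": "angry", "text": "..."},
    {"speaker": "SPEAKER_01", "emotion": "neutral", "text": "..."},
    {"speaker": "SPEAKER_00", "emotion": "neutral", "text": "..."},
]
print(emotions_by_speaker(sample))
# {'SPEAKER_00': {'angry': 1, 'neutral': 1}, 'SPEAKER_01': {'neutral': 1}}
```

With SDK result objects (seg.speaker, seg.emotion, as in the snippet above) the same idea applies with attribute access instead of dict lookups.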

REST API

terminal
curl -X POST https://asr.shunyalabs.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer <API_KEY>" \
  -F "[email protected]" \
  -F "model=zero-indic" \
  -F "enable_diarization=true" \
  -F "enable_emotion_diarization=true"

Output

json
{
  "segments": [
    { "start": 0.5, "end": 3.2, "text": "...", "speaker": "SPEAKER_00", "emotion": "angry" },
    { "start": 4.1, "end": 6.8, "text": "...", "speaker": "SPEAKER_01", "emotion": "neutral" }
  ]
}
Note: the emotion field appears in segments[], not in nlp_analysis.
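On the REST side, filtering the documented response shape takes only a few lines. A minimal sketch (the response body is assumed to be already loaded as a string, e.g. from requests) that pulls out segments matching a given emotion:

```python
import json

# Response body in the shape documented above.
raw = """
{
  "segments": [
    { "start": 0.5, "end": 3.2, "text": "...", "speaker": "SPEAKER_00", "emotion": "angry" },
    { "start": 4.1, "end": 6.8, "text": "...", "speaker": "SPEAKER_01", "emotion": "neutral" }
  ]
}
"""

def segments_with_emotion(response_text, emotion):
    """Return (speaker, start, end) for each segment matching the emotion."""
    data = json.loads(response_text)
    return [
        (seg["speaker"], seg["start"], seg["end"])
        for seg in data["segments"]
        if seg["emotion"] == emotion
    ]

print(segments_with_emotion(raw, "angry"))
# [('SPEAKER_00', 0.5, 3.2)]
```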