Speaker Diarization
Speaker diarization automatically separates different speakers in an audio recording, labeling each segment with a speaker tag (e.g., SPEAKER_00, SPEAKER_01). This tells you who spoke when in conversations, meetings, or interviews.
You can also pre-identify speakers and customize speaker tags to match your context so transcripts automatically recognize known speakers. Learn more in speaker identification.
How to Enable
"enable_diarization": "true"
Full Speaker Diarization Request
Don’t forget to replace your_api_key_here with your own secret key, and your_audio.wav with the path to your audio file.
import requests

url = "https://tb2.shunyalabs.ai/v1/transcriptions"
headers = {"X-API-Key": "your_api_key_here"}

# Open the audio in binary mode and keep it open while the request is sent.
with open("your_audio.wav", "rb") as audio_file:
    files = {"file": audio_file}
    # Enable diarization; the value is sent as the string "true".
    data = {
        "enable_diarization": "true"
    }
    response = requests.post(
        url,
        headers=headers,
        files=files,
        data=data
    )

result = response.json()
print(result["text"])
Example Output
{
  "success": true,
  "text": "Hello, thank you for calling customer support. How can I help you today? Hi, yes, I'm having trouble with my account login. I keep getting an error message. I'm sorry to hear that. Let me pull up your account and see what's going on.",
  "segments": [
    {
      "start": 0.0,
      "end": 5.5,
      "text": "Hello, thank you for calling customer support. How can I help you today?",
      "speaker": "SPEAKER_00"
    },
    {
      "start": 6.0,
      "end": 12.3,
      "text": "Hi, yes, I'm having trouble with my account login. I keep getting an error message.",
      "speaker": "SPEAKER_01"
    },
    {
      "start": 12.8,
      "end": 14.5,
      "text": "I'm sorry to hear that.",
      "speaker": "SPEAKER_00"
    },
    {
      "start": 14.6,
      "end": 18.9,
      "text": "Let me pull up your account and see what's going on.",
      "speaker": "SPEAKER_00"
    }
  ]
}
For custom speaker labels, refer to speaker identification.
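The segments array can be turned into a readable "who spoke when" transcript on the client side. The following is a minimal sketch, assuming the response has the shape shown in the example output; format_turns is a hypothetical helper and not part of the API. It merges consecutive segments from the same speaker into a single turn.

```python
def format_turns(segments):
    """Merge consecutive same-speaker segments and return one line per turn.

    Hypothetical helper: assumes each segment has "start", "end",
    "text", and "speaker" keys, as in the example output.
    """
    turns = []
    for seg in segments:
        if turns and turns[-1]["speaker"] == seg["speaker"]:
            # Same speaker as the previous turn: extend it.
            turns[-1]["end"] = seg["end"]
            turns[-1]["text"] += " " + seg["text"]
        else:
            # New speaker: start a new turn (copy so the input is not mutated).
            turns.append(dict(seg))
    return [
        f'[{t["start"]:.1f}s-{t["end"]:.1f}s] {t["speaker"]}: {t["text"]}'
        for t in turns
    ]


# Segments copied from the example output.
segments = [
    {"start": 0.0, "end": 5.5, "speaker": "SPEAKER_00",
     "text": "Hello, thank you for calling customer support. How can I help you today?"},
    {"start": 6.0, "end": 12.3, "speaker": "SPEAKER_01",
     "text": "Hi, yes, I'm having trouble with my account login. I keep getting an error message."},
    {"start": 12.8, "end": 14.5, "speaker": "SPEAKER_00",
     "text": "I'm sorry to hear that."},
    {"start": 14.6, "end": 18.9, "speaker": "SPEAKER_00",
     "text": "Let me pull up your account and see what's going on."},
]

for line in format_turns(segments):
    print(line)
```

With the example segments, the last two SPEAKER_00 segments collapse into one turn, so the transcript has three lines instead of four.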
Use Cases
- Meeting transcriptions with participant-level attribution
- Interview analysis (interviewer vs interviewee)
- Customer support calls (agent vs customer)
- Podcast and panel discussions with multiple speakers
- Legal recordings and courtroom proceedings
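For use cases such as agent vs customer analysis, speaking time per speaker can be derived from the segment timestamps. This is a rough sketch, assuming segments shaped like those in the example output; talk_time is a hypothetical helper, not an API feature.

```python
from collections import defaultdict


def talk_time(segments):
    """Return total speaking time in seconds for each speaker tag.

    Hypothetical helper: assumes each segment has "start", "end",
    and "speaker" keys, as in the example output.
    """
    totals = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(totals)


# Timestamps and speaker tags copied from the example output.
example_segments = [
    {"start": 0.0, "end": 5.5, "speaker": "SPEAKER_00"},
    {"start": 6.0, "end": 12.3, "speaker": "SPEAKER_01"},
    {"start": 12.8, "end": 14.5, "speaker": "SPEAKER_00"},
    {"start": 14.6, "end": 18.9, "speaker": "SPEAKER_00"},
]

for speaker, seconds in sorted(talk_time(example_segments).items()):
    print(f"{speaker}: {seconds:.1f}s")
```

Gaps between segments are not counted, so the totals reflect only the time attributed to a speaker.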