Speaker Identification

Speaker identification goes beyond basic diarization by matching detected speakers against pre-registered voice profiles. Instead of generic labels like SPEAKER_00, transcripts can contain real names such as John Doe or Jane Smith.

Note: Unregistered voices automatically fall back to generic speaker labels.

This guide covers how to:
Register new speakers under a project
Enable identification during transcription
List all registered speakers
Delete speaker profiles

Register New Speakers

To register a speaker, provide a short voice sample (3–5 seconds of clear audio recommended). Each speaker is registered under a specific project.

Don’t forget to replace YOUR_API_KEY with your own secret key.

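import requests

url = "https://tb.shunyalabs.ai/speakers/register"
headers = {"X-API-Key": "YOUR_API_KEY"}

# Form fields: the speaker's display name and the project to register under
data = {
    "name": "John Doe",
    "project": "test_project"
}

# Send the request inside the with block: once the block exits,
# the file handle is closed and can no longer be uploaded
with open("john_voice_sample.wav", "rb") as f:
    files = {"file": f}
    response = requests.post(
        url,
        headers=headers,
        files=files,
        data=data
    )

print(response.json())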

Response

{
  "status": "success",
  "message": "Speaker 'John Doe' registered.",
  "vector_shape": [1, 192],
  "project": "test_project"
}
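
Before uploading, it can help to check a sample against the 3–5 second recommendation. The following is a minimal sketch using Python's standard wave module. It assumes an uncompressed PCM WAV file, and the helper names and the choice to enforce both bounds are illustrative assumptions, not part of the API.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration = number of audio frames / frames per second
        return wav.getnframes() / wav.getframerate()


def is_recommended_length(path, min_s=3.0, max_s=5.0):
    """Soft check against the recommended 3-5 second window.

    min_s and max_s are hypothetical parameters; whether the API
    rejects shorter or longer samples is not stated in this guide.
    """
    return min_s <= wav_duration_seconds(path) <= max_s
```

For example, is_recommended_length("john_voice_sample.wav") could be called before the registration request to warn about a clip that is too short.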

Transcribe with Speaker Identification

Once speakers are registered, enable both diarization and identification during transcription.

Don’t forget to replace YOUR_API_KEY with your own secret key.

import requests

url = "https://tb2.shunyalabs.ai/v1/transcriptions"
headers = {"X-API-Key": "YOUR_API_KEY"}

# Turn on diarization and identification, and name the project
# whose registered speakers should be matched
data = {
    "enable_diarization": "true",
    "use_identification": "true",
    "project": "test_project"
}

# Post inside the with block so the audio file is still open during upload
with open("your_audio.wav", "rb") as audio_file:
    files = {"file": audio_file}
    response = requests.post(
        url,
        headers=headers,
        files=files,
        data=data
    )

result = response.json()
print(result["text"])

Unregistered participants will still appear with generic speaker tags.
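
The transcription request passes its flags as the lowercase string "true". A Python bool placed directly in the form data is sent by requests as "True", and this guide does not say whether the API accepts that spelling. The sketch below builds the form fields with lowercase values. The helper name build_form_data and the "false" spelling are assumptions for illustration.

```python
def build_form_data(project, diarize=True, identify=True):
    """Build transcription form fields with lowercase boolean strings.

    build_form_data is a hypothetical helper, not part of the API.
    """
    def flag(value):
        # Mirror the "true" spelling used in this guide; "false" is assumed
        return "true" if value else "false"

    return {
        "enable_diarization": flag(diarize),
        "use_identification": flag(identify),
        "project": project,
    }


print(build_form_data("test_project"))
```

The returned dictionary can be passed as the data argument of requests.post in place of a hand-written one.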

Example Output

{
  "success": true,
  "text": "Good morning everyone, let's begin the meeting...",
  "segments": [
    {
      "start": 0.0,
      "end": 3.5,
      "text": "Good morning everyone, let's begin the meeting.",
      "speaker": "John Doe"
    },
    {
      "start": 3.8,
      "end": 5.2,
      "text": "Thanks John. I have the sales report ready.",
      "speaker": "Jane Smith"
    },
    {
      "start": 5.5,
      "end": 8.9,
      "text": "Great, and I've prepared the marketing analysis.",
      "speaker": "Bob Johnson"
    },
    {
      "start": 9.2,
      "end": 11.5,
      "text": "Let me share my screen.",
      "speaker": "SPEAKER_03"
    }
  ],
  "unique_speakers": ["John Doe", "Jane Smith", "Bob Johnson", "SPEAKER_03"]
}

In this example, registered speakers are identified by name, while unknown participants fall back to generic labels.
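
Once the response is parsed, the segments can be summarized per speaker. The sketch below totals speaking time and separates identified names from generic labels. It assumes generic labels always follow the SPEAKER_NN pattern seen in the example output, and the function names are illustrative.

```python
import re
from collections import defaultdict

# Assumed pattern for fallback labels, based on SPEAKER_00 / SPEAKER_03
GENERIC_LABEL = re.compile(r"^SPEAKER_\d+$")


def speaking_time(segments):
    """Sum (end - start) in seconds for each speaker label."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(totals)


def split_speakers(unique_speakers):
    """Return (identified names, generic labels) as two lists."""
    identified = [s for s in unique_speakers if not GENERIC_LABEL.match(s)]
    generic = [s for s in unique_speakers if GENERIC_LABEL.match(s)]
    return identified, generic


# Trimmed copy of the example output
result = {
    "segments": [
        {"start": 0.0, "end": 3.5, "speaker": "John Doe"},
        {"start": 3.8, "end": 5.2, "speaker": "Jane Smith"},
        {"start": 9.2, "end": 11.5, "speaker": "SPEAKER_03"},
    ],
    "unique_speakers": ["John Doe", "Jane Smith", "SPEAKER_03"],
}

for speaker, seconds in speaking_time(result["segments"]).items():
    print(f"{speaker}: {seconds:.1f}s")

identified, generic = split_speakers(result["unique_speakers"])
print("Identified:", identified)
print("Unidentified:", generic)
```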

List Registered Speakers

Don’t forget to replace YOUR_API_KEY with your own secret key.

import requests

url = "https://tb.shunyalabs.ai/speakers/list"
headers = {"X-API-Key": "YOUR_API_KEY"}

response = requests.get(url, headers=headers)
result = response.json()

print(result)

Response

{
  "status": "success",
  "speakers": ["John Doe", "Jane Smith", "Bob Johnson"],
  "count": 3,
  "project": "test_project"
}
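
The list response can be used to confirm that everyone expected in a recording is registered before transcription starts. A minimal sketch, assuming the response has the shape shown here; missing_speakers is a hypothetical helper name.

```python
def missing_speakers(list_response, expected_names):
    """Return the expected names that are absent from the list response."""
    registered = set(list_response.get("speakers", []))
    return [name for name in expected_names if name not in registered]


# Sample payload copied from the list response
list_response = {
    "status": "success",
    "speakers": ["John Doe", "Jane Smith", "Bob Johnson"],
    "count": 3,
    "project": "test_project",
}

print(missing_speakers(list_response, ["John Doe", "Alice Lee"]))
```

Any name returned would show up in a transcript under a generic label until that speaker is registered.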

Delete a Speaker

Don’t forget to replace YOUR_API_KEY with your own secret key.

import requests

url = "https://tb.shunyalabs.ai/speakers/delete"
headers = {"X-API-Key": "YOUR_API_KEY"}

data = {
    "name": "John Doe",
    "project": "test_project"
}

response = requests.post(url, headers=headers, data=data)
print(response.json())

Response

{
  "status": "success",
  "message": "Speaker 'John Doe' deleted successfully.",
  "project": "test_project"
}
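
The register, list, and delete responses carry "status": "success", while the transcription output carries "success": true. The sketch below checks for either marker. The format of error responses is not documented in this guide, so the helper (a hypothetical name) only recognizes the success markers shown in the examples and treats everything else as a failure.

```python
def is_success(payload):
    """Return True if a parsed response body carries either success marker."""
    return payload.get("status") == "success" or payload.get("success") is True


# Sample payload copied from the delete response
delete_response = {
    "status": "success",
    "message": "Speaker 'John Doe' deleted successfully.",
    "project": "test_project",
}

if is_success(delete_response):
    print(delete_response["message"])
```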