Streaming Audio

Audio Format Requirements

Why streaming requires raw PCM, and how to convert any audio source.


Why raw PCM?

The streaming endpoint receives audio as a continuous binary stream. Formats such as WAV, MP3, or OGG wrap the audio in headers or compressed frames that the server does not parse in streaming mode, so only raw PCM samples are accepted.
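To make the difference between a container and raw PCM concrete, here is a minimal sketch using only Python's standard `wave` module. The helper names `wav_to_raw_pcm` and `make_test_wav` are illustrative and not part of any SDK. The sketch assumes an uncompressed PCM WAV file and does not resample or downmix, so the input must already have the sample rate and channel count you intend to stream.

```python
import io
import wave


def wav_to_raw_pcm(wav_bytes: bytes) -> tuple[bytes, int, int, int]:
    """Strip the WAV header and return (raw_pcm, sample_rate, channels, sample_width_bytes).

    Hypothetical helper: works only for uncompressed PCM WAV data.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        raw = wav.readframes(wav.getnframes())
        return raw, wav.getframerate(), wav.getnchannels(), wav.getsampwidth()


def make_test_wav(pcm: bytes, sample_rate: int = 16000) -> bytes:
    """Wrap mono int16 PCM bytes in a WAV container (used here only for demonstration)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 2 bytes per sample = int16
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


if __name__ == "__main__":
    pcm = b"\x00\x00\x00\x40\x00\x80\xff\x7f"  # four little-endian int16 samples
    wav_bytes = make_test_wav(pcm)
    raw, rate, channels, width = wav_to_raw_pcm(wav_bytes)
    # The container is larger than the samples it holds; only `raw` is streamable.
    print(len(wav_bytes), len(raw), rate, channels, width)
```

The bytes in `raw` are what a raw-PCM stream carries; the leading `RIFF` header bytes are the part the streaming endpoint would otherwise misread as audio samples.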

Converting with ffmpeg

terminal
# int16 — default, recommended
ffmpeg -i input.wav -ar 16000 -ac 1 -f s16le output.pcm

# float32
ffmpeg -i input.wav -ar 16000 -ac 1 -f f32le output.pcm

Always match the sample format passed to ffmpeg (s16le for int16, f32le for float32) with the dtype set in StreamingConfig. A mismatch makes the server misinterpret the bytes and produces garbled transcripts.
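The following sketch illustrates why a dtype mismatch garbles audio, using only Python's standard `struct` module. The function names are hypothetical, and the int16-to-float32 scaling by 1/32768 is the conventional mapping to the range [-1.0, 1.0), assumed here rather than taken from the service's documentation.

```python
import struct


def int16_to_float32_le(pcm: bytes) -> bytes:
    """Convert little-endian int16 PCM to little-endian float32 PCM.

    Assumption: samples are scaled by 1/32768 into [-1.0, 1.0).
    """
    if len(pcm) % 2:
        raise ValueError("int16 PCM length must be a multiple of 2 bytes")
    count = len(pcm) // 2
    samples = struct.unpack(f"<{count}h", pcm)
    return struct.pack(f"<{count}f", *(s / 32768.0 for s in samples))


def decode(pcm: bytes, dtype: str) -> list:
    """Interpret raw bytes as samples of the given dtype ('int16' or 'float32')."""
    fmt, width = {"int16": ("h", 2), "float32": ("f", 4)}[dtype]
    usable = len(pcm) - len(pcm) % width  # ignore a trailing partial sample
    return list(struct.unpack(f"<{usable // width}{fmt}", pcm[:usable]))


if __name__ == "__main__":
    int16_pcm = struct.pack("<4h", 0, 16384, -32768, 32767)

    # Matching dtype: the samples come back unchanged.
    print(decode(int16_pcm, "int16"))

    # Mismatched dtype: the same bytes read as float32 yield unrelated values.
    print(decode(int16_pcm, "float32"))

    # Proper conversion before declaring float32.
    print(decode(int16_to_float32_le(int16_pcm), "float32"))
```

Reading int16 bytes as float32 also halves the apparent sample count, so both the amplitude and the timing of the audio are wrong, which is consistent with the garbled transcripts described above.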