Pre-recorded Audio — Tips
Choosing Between Batch and Streaming
Use this guide to choose between batch and streaming transcription based on your latency needs, NLP requirements, and input type.
Comparison
| Criterion | Batch | Streaming |
|---|---|---|
| Latency | Seconds (full file) | Milliseconds (real-time) |
| NLP features | All features available | Core transcription only |
| Best for | Recordings, files, pipelines | Voice agents, live captions, IVR |
| Input types | File, bytes, URL | Raw PCM audio chunks |
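To illustrate the difference in input types, here is a minimal sketch of splitting a buffer of raw PCM audio into fixed-size chunks, which is the kind of input the streaming column refers to. The helper name `pcm_chunks` and the defaults (16 kHz sample rate, 16-bit mono samples, 100 ms chunks) are assumptions chosen for illustration, not values from this guide. Check the streaming reference for the formats actually supported.

```python
from typing import Iterator


def pcm_chunks(
    pcm: bytes,
    sample_rate: int = 16000,  # assumed sample rate in Hz
    sample_width: int = 2,     # assumed bytes per sample (16-bit audio)
    channels: int = 1,         # assumed mono audio
    chunk_ms: int = 100,       # assumed chunk duration in milliseconds
) -> Iterator[bytes]:
    """Yield consecutive fixed-duration chunks of raw PCM audio (hypothetical helper)."""
    frame_bytes = sample_width * channels
    chunk_bytes = (sample_rate * chunk_ms // 1000) * frame_bytes
    if chunk_bytes <= 0:
        raise ValueError("chunk_ms is too small for the given sample rate")
    for start in range(0, len(pcm), chunk_bytes):
        # The final chunk may be shorter than chunk_bytes.
        yield pcm[start:start + chunk_bytes]


# One second of silence at the assumed format, split into 100 ms chunks.
one_second = bytes(16000 * 2)
chunks = list(pcm_chunks(one_second))
```

With batch, by contrast, the whole file, byte string, or URL is submitted in a single request, so no chunking step like this is needed.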
Use batch when
- You have a complete audio file or recording to process.
- You need NLP features like diarization, sentiment, or translation.
- Processing latency of seconds is acceptable.
Use streaming when
- You need real-time or near-real-time transcripts.
- You are building a voice agent, IVR, or live captioning system.
When in doubt, start with batch. It is simpler to integrate, supports all features, and can feel near-real-time for short audio clips.
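The rules in this guide can be sketched as a small helper. This is only an illustration of the decision logic. The names `Choice` and `choose_mode` and the two boolean flags are hypothetical and are not part of any SDK. Sending the case that needs both real-time output and NLP features to batch is an assumption that follows from the table, where streaming offers core transcription only.

```python
from typing import NamedTuple


class Choice(NamedTuple):
    mode: str  # "batch" or "streaming"
    note: str  # short reason for the recommendation


def choose_mode(needs_realtime: bool = False,
                needs_nlp_features: bool = False) -> Choice:
    """Recommend a transcription mode using the rules in this guide (hypothetical helper)."""
    if needs_realtime and not needs_nlp_features:
        # Voice agents, IVR, live captions.
        return Choice("streaming", "real-time transcripts with core transcription only")
    if needs_realtime and needs_nlp_features:
        # Assumption: streaming lacks NLP features, so fall back to batch.
        return Choice("batch", "NLP features require batch; short clips can feel near-real-time")
    # Default case: complete files, pipelines, or when in doubt.
    return Choice("batch", "simpler to integrate and supports all features")


live_captions = choose_mode(needs_realtime=True)
recorded_call = choose_mode(needs_nlp_features=True)
```

Here `live_captions.mode` is `"streaming"` and `recorded_call.mode` is `"batch"`. Calling `choose_mode()` with no arguments also returns batch, matching the "when in doubt" advice.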