Pre-recorded Audio — Tips

Choosing Between Batch and Streaming

A decision guide based on latency needs, NLP requirements, and input type.


Comparison

Criterion      Batch                         Streaming
Latency        Seconds (full file)           Milliseconds (real-time)
NLP features   All features available        Core transcription only
Best for       Recordings, files, pipelines  Voice agents, live captions, IVR
Input types    File, bytes, URL              Raw PCM audio chunks
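
The streaming input type listed above, raw PCM audio chunks, means the audio is sent as small fixed-duration slices of uncompressed samples rather than as one file. The sketch below shows one way to slice a PCM buffer. The function name pcm_chunks and the defaults (16 kHz, 16-bit mono, 20 ms chunks) are illustrative assumptions, not values taken from this guide; check the streaming API's own requirements before relying on them.

```python
from typing import Iterator


def pcm_chunks(
    pcm: bytes,
    sample_rate: int = 16_000,  # assumed sample rate in Hz
    sample_width: int = 2,      # assumed bytes per sample (16-bit)
    channels: int = 1,          # assumed mono audio
    chunk_ms: int = 20,         # assumed chunk duration in milliseconds
) -> Iterator[bytes]:
    """Yield consecutive fixed-duration slices of a raw PCM buffer."""
    bytes_per_frame = sample_width * channels
    frames_per_chunk = sample_rate * chunk_ms // 1000
    chunk_bytes = frames_per_chunk * bytes_per_frame
    for start in range(0, len(pcm), chunk_bytes):
        # The final slice may be shorter than chunk_bytes.
        yield pcm[start:start + chunk_bytes]


# One second of 16 kHz, 16-bit mono silence is 32,000 bytes.
one_second = bytes(32_000)
chunks = list(pcm_chunks(one_second))
```

With these assumed defaults, each chunk holds 320 samples (640 bytes), so one second of audio is split into 50 chunks.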

Use batch when

  • You have a complete audio file or recording to process.
  • You need NLP features like diarization, sentiment, or translation.
  • Processing latency of seconds is acceptable.

Use streaming when

  • You need real-time or near-real-time transcripts.
  • You are building a voice agent, IVR, or live captioning system.

When in doubt: Start with batch. It is simpler to integrate, supports all features, and can feel near-real-time for short audio clips.
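
The rules in this guide can be expressed as a small decision function. This is a minimal sketch, not part of any SDK: the Workload type, its field names, the NLP_FEATURES set, and choose_mode are all hypothetical names introduced for illustration, and the feature names are only the examples mentioned in the list above.

```python
from dataclasses import dataclass

# Hypothetical feature names, taken from the examples in this guide.
NLP_FEATURES = frozenset({"diarization", "sentiment", "translation"})


@dataclass
class Workload:
    has_complete_file: bool            # a full recording (file, bytes, or URL) exists
    needs_realtime: bool               # transcripts are needed while audio is arriving
    features: frozenset = frozenset()  # requested NLP features, if any


def choose_mode(w: Workload) -> str:
    """Return "batch" or "streaming" according to the guide's rules."""
    if w.features & NLP_FEATURES:
        # Streaming offers core transcription only, so NLP needs imply batch.
        return "batch"
    if w.needs_realtime and not w.has_complete_file:
        # Live audio with a real-time requirement: voice agents, captions, IVR.
        return "streaming"
    # When in doubt, start with batch.
    return "batch"


recorded_call = Workload(True, False, frozenset({"diarization"}))
voice_agent = Workload(has_complete_file=False, needs_realtime=True)
```

Here choose_mode(recorded_call) selects batch and choose_mode(voice_agent) selects streaming. A workload that wants both real-time results and NLP features falls back to batch in this sketch, because the comparison lists streaming as core transcription only.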