
Improving Transcription Accuracy

To get the best results from the Speech to Text API, follow these guidelines.

Audio Quality

  • Format: Use a lossless format such as WAV or FLAC when possible.
  • Sample Rate: A sample rate of 16 kHz or higher is recommended.
  • Noise: Minimize background noise and echo.
  • Clarity: Ensure the speaker is close to the microphone.
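The sample-rate guideline can be checked locally before a file is uploaded. The following is a minimal sketch, not part of the Speech to Text API: it uses Python's standard `wave` module, which reads uncompressed PCM WAV files only (FLAC would need a third-party library). The function name `check_wav` and the constant `MIN_SAMPLE_RATE_HZ` are hypothetical names chosen for this illustration.

```python
import wave

# Threshold taken from the sample-rate guideline above.
MIN_SAMPLE_RATE_HZ = 16_000


def check_wav(path):
    """Return a list of warnings for a PCM WAV file; an empty list means none found.

    Hypothetical helper for illustration; it only inspects the file header
    and does not measure noise, echo, or microphone distance.
    """
    warnings = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        if rate < MIN_SAMPLE_RATE_HZ:
            warnings.append(
                f"sample rate {rate} Hz is below the recommended {MIN_SAMPLE_RATE_HZ} Hz"
            )
    return warnings
```

Calling `check_wav("recording.wav")` on a file recorded at 8 kHz would return a single warning, while a 16 kHz file would return an empty list. Noise and clarity cannot be verified this way and still need to be handled at recording time.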

Configuration

  • Language: Explicitly set the language parameter to match the language spoken in the audio.
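One way to make sure the language is never left unset is to build the request configuration through a small helper that rejects empty values. This is only a sketch under assumptions: the source names a language parameter but does not show the request format, so the dictionary shape, the key name `language`, the helper name `build_config`, and the example value `"en-US"` are all hypothetical.

```python
def build_config(language):
    """Build a request configuration dict with an explicit language.

    Hypothetical helper: the key name and value format are assumptions,
    not the documented request schema of the Speech to Text API.
    """
    if not isinstance(language, str) or not language.strip():
        raise ValueError("language must be set explicitly, for example 'en-US'")
    return {"language": language.strip()}


config = build_config("en-US")
```

Failing early on a missing language keeps the mistake visible in client code instead of surfacing later as poor transcription quality.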

Post-Processing

  • Diarization: Enable speaker diarization to separate speakers in multi-speaker audio.
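Once diarization is enabled, a common post-processing step is to merge consecutive segments from the same speaker into readable turns. The sketch below assumes the diarized output is a list of segments, each with a `speaker` label and a `text` field; that shape, the field names, and the function name `merge_turns` are hypothetical, since the source does not show the response format.

```python
from itertools import groupby


def merge_turns(segments):
    """Merge consecutive segments spoken by the same speaker into one turn.

    `segments` is assumed to be a list of dicts with "speaker" and "text"
    keys, in chronological order (a hypothetical response shape).
    """
    turns = []
    # groupby only groups adjacent items, so a speaker who talks again
    # later starts a new turn rather than being merged with an earlier one.
    for speaker, group in groupby(segments, key=lambda seg: seg["speaker"]):
        text = " ".join(seg["text"] for seg in group)
        turns.append({"speaker": speaker, "text": text})
    return turns
```

For example, two adjacent segments from speaker "A" followed by one from speaker "B" produce two turns, with the text from speaker "A" joined by a space.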