Transcriber Reference
TranscriberConfig
The sampling rate of the audio in samples per second (Hz). A higher sampling rate provides better audio quality but may increase processing time and data size.
The encoding format of the audio data. Options include: LINEAR16, MULAW.
The size of each chunk of audio data sent to the transcriber, in bytes. A larger chunk size can reduce network overhead but may increase latency.
Optional endpointing configuration to determine when a transcription segment should end. If not provided, default endpointing behavior will be used.
Optional downsampling factor that reduces the sampling rate of the audio before it is sent to the transcriber. This can be used to reduce bandwidth usage.
Optional minimum confidence threshold for interrupting the transcription. Confidence values range from 0 to 1, with higher values indicating greater confidence. If provided, transcriptions are interrupted only when the confidence exceeds the threshold. If not provided, the default interruption behavior is used.
If true, silent audio chunks are sent to the transcriber while a transcription is in progress. This can be used to prevent echo during live transcription.
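The relationship between sampling rate, encoding, chunk size, downsampling, and the interrupt threshold described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name TranscriberSettings, its field names, and the helper functions are hypothetical, and the sketch assumes mono audio, an integer downsampling factor, and the usual sample widths of 2 bytes for LINEAR16 and 1 byte for MULAW.

```python
from dataclasses import dataclass
from typing import Optional

# Bytes per sample for the two encodings listed above.
BYTES_PER_SAMPLE = {"LINEAR16": 2, "MULAW": 1}


@dataclass
class TranscriberSettings:
    """Hypothetical stand-in for the options above (names are assumptions)."""
    sampling_rate: int                 # samples per second (Hz)
    audio_encoding: str                # "LINEAR16" or "MULAW"
    chunk_size: int                    # bytes per chunk
    downsampling: Optional[int] = None # assumed to be an integer factor


def chunk_duration_seconds(settings: TranscriberSettings) -> float:
    """Seconds of mono audio carried by one chunk."""
    bytes_per_second = settings.sampling_rate * BYTES_PER_SAMPLE[settings.audio_encoding]
    return settings.chunk_size / bytes_per_second


def effective_sampling_rate(settings: TranscriberSettings) -> int:
    """Sampling rate after applying the optional downsampling factor."""
    if settings.downsampling:
        return settings.sampling_rate // settings.downsampling
    return settings.sampling_rate


def exceeds_interrupt_threshold(confidence: float, threshold: float) -> bool:
    """True only when the confidence is strictly above the threshold."""
    return confidence > threshold


if __name__ == "__main__":
    settings = TranscriberSettings(16000, "LINEAR16", 3200, downsampling=2)
    print(chunk_duration_seconds(settings))
    print(effective_sampling_rate(settings))
    print(exceeds_interrupt_threshold(0.9, 0.8))
```

Under these assumptions, doubling the chunk size doubles the amount of audio that must be captured before a chunk can be sent, which is where the added latency of larger chunks comes from.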
DeepgramTranscriberConfig
Optional language code to use for transcription.
Optional Deepgram model to use for transcription. Defaults to “nova”.
Optional Deepgram tier to use for transcription.
Optional Deepgram version to use for transcription. Defaults to the latest version if not provided.
Optional list of keywords to boost in the transcription results.
GoogleTranscriberConfig
Optional Google Cloud Speech model to use for transcription.
Language code to use for transcription. Defaults to “en-US”.
AssemblyAITranscriberConfig
Buffer duration in seconds to accumulate audio before sending to AssemblyAI. Defaults to 0.1s.
Optional list of words to boost in the transcription results.
WhisperCPPTranscriberConfig
Buffer duration in seconds to accumulate audio before sending to WhisperCPP. Defaults to 1.0s.
Filename of the WhisperCPP shared library.
Filename of the WhisperCPP model.
AzureTranscriberConfig
Language code to use for transcription. Defaults to “en-US”.
Optional list of candidate languages to auto-detect from the audio.
GladiaTranscriberConfig
Buffer duration in seconds to accumulate audio before sending to Gladia. Defaults to 0.1s.
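The buffer durations described for AssemblyAI, WhisperCPP, and Gladia all follow the same idea: audio is accumulated until a given number of seconds has been collected, then sent in one piece. A minimal sketch of that pattern is shown here. The AudioBuffer class and its parameter names are hypothetical and are not taken from the library; the sketch assumes mono audio and 2 bytes per sample (LINEAR16) unless stated otherwise.

```python
from typing import Optional


class AudioBuffer:
    """Hypothetical accumulator: releases audio once enough has been collected."""

    def __init__(self, buffer_seconds: float, sampling_rate: int, bytes_per_sample: int = 2):
        # Number of bytes that correspond to the requested buffer duration.
        self.threshold_bytes = round(buffer_seconds * sampling_rate * bytes_per_sample)
        self._buffer = bytearray()

    def add(self, chunk: bytes) -> Optional[bytes]:
        """Append a chunk; return the buffered audio once the threshold is reached."""
        self._buffer.extend(chunk)
        if len(self._buffer) >= self.threshold_bytes:
            ready = bytes(self._buffer)
            self._buffer.clear()
            return ready
        return None


if __name__ == "__main__":
    buffer = AudioBuffer(buffer_seconds=0.1, sampling_rate=16000)
    for _ in range(3):
        ready = buffer.add(b"\x00" * 1600)
        print(None if ready is None else len(ready))
```

A shorter buffer, such as the 0.1 second default noted above, sends audio more often with less delay, while a longer one, such as the 1.0 second WhisperCPP default, gives the transcriber more audio per call.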