`TranscriberConfig`

sampling_rate

int

The sampling rate of the audio in samples per second (Hz). A higher sampling rate provides better audio quality but may increase processing time and data size.

audio_encoding

AudioEncoding

The encoding format of the audio data. Options include: LINEAR16, MULAW.

chunk_size

int

The size of each chunk of audio data sent to the transcriber, in bytes. A larger chunk size can reduce network overhead but may increase latency.
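The trade-off between `chunk_size` and latency can be made concrete by converting a chunk's byte count into a duration. The following is a minimal sketch, not part of the library: it assumes mono audio, and the helper `chunk_duration_seconds` and the `BYTES_PER_SAMPLE` table are hypothetical names introduced here for illustration.

```python
# Bytes per sample for the two encodings listed above.
# LINEAR16 is 16-bit PCM (2 bytes per sample); MULAW is 8-bit (1 byte per sample).
BYTES_PER_SAMPLE = {"LINEAR16": 2, "MULAW": 1}


def chunk_duration_seconds(sampling_rate: int, audio_encoding: str, chunk_size: int) -> float:
    """Return how many seconds of mono audio fit in one chunk of chunk_size bytes."""
    bytes_per_second = sampling_rate * BYTES_PER_SAMPLE[audio_encoding]
    return chunk_size / bytes_per_second


print(chunk_duration_seconds(16000, "LINEAR16", 2048))  # 0.064
print(chunk_duration_seconds(8000, "MULAW", 2048))      # 0.256
```

Under these assumptions, a 2048-byte chunk holds 64 ms of 16 kHz LINEAR16 audio, so the transcriber cannot receive audio any sooner than 64 ms after it was captured.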

endpointing_config

Optional[EndpointingConfig]

Optional endpointing configuration to determine when a transcription segment should end. If not provided, default endpointing behavior will be used.

downsampling

Optional[int]

Optional downsampling factor to reduce the sampling rate of the audio before sending to the transcriber. Can be used to reduce bandwidth usage.
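To illustrate what a downsampling factor does to the data, here is a naive sketch that keeps every Nth sample of a LINEAR16 chunk. It is an assumption-laden illustration rather than the library's implementation: `downsample_linear16` is a hypothetical helper, and a production resampler would normally apply a low-pass filter first to avoid aliasing.

```python
import array


def downsample_linear16(chunk: bytes, factor: int) -> bytes:
    """Keep every factor-th 16-bit sample (naive decimation, no low-pass filter)."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    samples = array.array("h")  # signed 16-bit integers, native byte order
    samples.frombytes(chunk)
    return samples[::factor].tobytes()


original = array.array("h", [0, 100, 200, 300, 400, 500]).tobytes()
reduced = downsample_linear16(original, 2)
print(len(original), len(reduced))  # 12 6
```

With a factor of 2, the chunk shrinks to half its size, and the effective sampling rate is `sampling_rate // 2`.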

min_interrupt_confidence

Optional[float]

Optional minimum confidence threshold for interrupting the transcription. Confidence values range from 0 to 1, with higher values indicating greater confidence. If provided, transcriptions will only be interrupted when the confidence exceeds the threshold. If not provided, the default interrupting behavior will be used.
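The threshold rule described above can be sketched as a single comparison. `should_interrupt` is a hypothetical function, and treating a missing threshold as "always allow" is an assumption made for this sketch; the actual default behavior is not specified here.

```python
from typing import Optional


def should_interrupt(confidence: float, min_interrupt_confidence: Optional[float]) -> bool:
    """Decide whether a transcription with this confidence may trigger an interrupt."""
    if min_interrupt_confidence is None:
        # Assumption for this sketch: with no threshold, every transcription qualifies.
        return True
    # "Exceeds" is read as strictly greater than the threshold.
    return confidence > min_interrupt_confidence


print(should_interrupt(0.92, 0.8))  # True
print(should_interrupt(0.55, 0.8))  # False
```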

mute_during_speech

bool

If true, silent audio chunks will be sent to the transcriber while a transcription is in progress. Can be used to prevent echo during live transcription.
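As a rough illustration of muting, a chunk can be replaced with silence of the same length. This sketch is not the library's code: `mute_chunk` and `SILENCE_BYTE` are hypothetical names. The silence values rely on LINEAR16 representing zero amplitude as zero bytes and on G.711 mu-law encoding zero amplitude as `0xFF`.

```python
# One byte of silence per encoding. For LINEAR16, a zero sample is two 0x00 bytes,
# so repeating a single 0x00 byte to the chunk length gives the same result.
SILENCE_BYTE = {"LINEAR16": b"\x00", "MULAW": b"\xff"}


def mute_chunk(chunk: bytes, audio_encoding: str) -> bytes:
    """Return a silent chunk with the same length as the input chunk."""
    return SILENCE_BYTE[audio_encoding] * len(chunk)


print(mute_chunk(b"\x12\x34\x56\x78", "LINEAR16"))  # b'\x00\x00\x00\x00'
print(mute_chunk(b"\x12\x34", "MULAW"))             # b'\xff\xff'
```

Keeping the length unchanged means the transcriber still receives audio at the same pace while the real input is suppressed.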

`DeepgramTranscriberConfig`

language

Optional[str]

Optional language code to use for transcription.

model

Optional[str]

Optional Deepgram model to use for transcription. Defaults to “nova”.

tier

Optional[str]

Optional Deepgram tier to use for transcription.

version

Optional[str]

Optional Deepgram version to use for transcription. Defaults to latest version if not provided.

keywords

Optional[List[str]]

Optional list of keywords to boost in the transcription results.
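One way to picture how these optional fields behave is to serialize them as query-string parameters, skipping any that are unset. This is only a sketch: `build_query` is a hypothetical helper, and the assumption that each field maps to a parameter of the same name is made here for illustration and is not confirmed by this reference.

```python
from typing import List, Optional
from urllib.parse import urlencode


def build_query(
    language: Optional[str] = None,
    model: Optional[str] = "nova",  # default stated in the field description above
    tier: Optional[str] = None,
    version: Optional[str] = None,
    keywords: Optional[List[str]] = None,
) -> str:
    """Serialize the fields that are set; unset (None) fields are omitted."""
    params = []
    for name, value in (("language", language), ("model", model),
                        ("tier", tier), ("version", version)):
        if value is not None:
            params.append((name, value))
    # Each keyword becomes its own repeated "keywords" parameter.
    for keyword in keywords or []:
        params.append(("keywords", keyword))
    return urlencode(params)


print(build_query(language="en-US", keywords=["Vocode", "Deepgram"]))
# language=en-US&model=nova&keywords=Vocode&keywords=Deepgram
```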

`GoogleTranscriberConfig`

model

Optional[str]

Optional Google Cloud Speech model to use for transcription.

language_code

str

Language code to use for transcription. Defaults to “en-US”.

`AssemblyAITranscriberConfig`

buffer_size_seconds

float

Buffer duration in seconds to accumulate audio before sending to AssemblyAI. Defaults to 0.1s.
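Buffering by duration amounts to accumulating bytes until a size threshold derived from `buffer_size_seconds` is reached. The sketch below shows that idea under stated assumptions: `AudioBuffer` is a hypothetical class, the audio is assumed to be mono, and the bytes-per-sample value is passed in explicitly rather than read from a config.

```python
from typing import Optional


class AudioBuffer:
    """Accumulate audio chunks and release them once enough seconds are buffered."""

    def __init__(self, sampling_rate: int, bytes_per_sample: int,
                 buffer_size_seconds: float = 0.1) -> None:
        # Number of bytes that correspond to buffer_size_seconds of mono audio.
        self.threshold = round(sampling_rate * bytes_per_sample * buffer_size_seconds)
        self._buffer = bytearray()

    def add(self, chunk: bytes) -> Optional[bytes]:
        """Append a chunk; return the buffered audio once the threshold is reached."""
        self._buffer.extend(chunk)
        if len(self._buffer) < self.threshold:
            return None
        payload = bytes(self._buffer)
        self._buffer.clear()
        return payload


audio_buffer = AudioBuffer(sampling_rate=16000, bytes_per_sample=2)
print(audio_buffer.threshold)                 # 3200
first = audio_buffer.add(b"\x00" * 2048)      # below the threshold, nothing released
second = audio_buffer.add(b"\x00" * 2048)     # 4096 bytes buffered, released together
print(first is None, len(second))             # True 4096
```

A larger `buffer_size_seconds` means fewer, bigger sends at the cost of waiting longer before any audio reaches the transcriber.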

word_boost

Optional[List[str]]

Optional list of words to boost in the transcription results.

`WhisperCPPTranscriberConfig`

buffer_size_seconds

float

Buffer duration in seconds to accumulate audio before sending to WhisperCPP. Defaults to 1.0s.

libname

str

Filename of the WhisperCPP shared library.

fname_model

str

Filename of the WhisperCPP model.

`AzureTranscriberConfig`

language

str

Language code to use for transcription. Defaults to “en-US”.

candidate_languages

Optional[List[str]]

Optional list of candidate languages to auto-detect from the audio.

`GladiaTranscriberConfig`

buffer_size_seconds

float

Buffer duration in seconds to accumulate audio before sending to Gladia. Defaults to 0.1s.
