Multilingual support
How to configure different languages
The vocode project can be configured to support multiple languages for speech synthesis and automatic speech recognition (ASR).
Speech Synthesis
The speech synthesizer used in vocode is configurable. By default, the AzureSynthesizer is used, which supports over 75 voices across over 45 languages. To configure a different language, modify the SynthesizerConfig when initializing the conversation:
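As a minimal sketch of the idea: Azure voice names encode the language, so switching the voice switches the synthesis language. The class and field names below are illustrative stand-ins, not vocode's actual API.

```python
from dataclasses import dataclass

# Illustrative stand-in for an Azure synthesizer config; the real vocode
# class and its field names may differ (names here are assumptions).
@dataclass
class AzureSynthesizerConfigSketch:
    # Azure voice names embed the language code, e.g. "es-MX-DaliaNeural"
    # for Mexican Spanish (voice chosen for illustration).
    voice_name: str = "en-US-AriaNeural"

# Select a Spanish voice instead of the English default.
spanish_config = AzureSynthesizerConfigSketch(voice_name="es-MX-DaliaNeural")
```

Because the language is implied by the voice name, no separate language field is needed in this sketch.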
Transcription
The transcriber used in vocode is also configurable. By default, the DeepgramTranscriber is used, which supports over 35 languages. To configure a different language, modify the language code passed to the TranscriberConfig when initializing the config object (en-US is the default):
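A sketch of the pattern, using a stand-in dataclass rather than vocode's real class (the field names are assumptions):

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative stand-in for a Deepgram transcriber config; the real vocode
# class and its field names may differ.
@dataclass
class DeepgramTranscriberConfigSketch:
    language: str = "en-US"      # en-US is the default language code
    model: Optional[str] = None  # Deepgram model; Nova applies when unset

# Transcribe Spanish audio using Deepgram's Nova-2 model.
spanish_transcriber = DeepgramTranscriberConfigSketch(
    language="es",
    model="nova-2",
)
```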
Note: Deepgram's default model is Nova, so to use the Nova-2 model you must pass model="nova-2" explicitly.
Other transcription services, such as Google Cloud Speech-to-Text or AssemblyAI, can also be used by configuring the appropriate TranscriberConfig.
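One way to switch between providers is a small factory that builds the right config for a given provider and language. All class and parameter names below are illustrative stand-ins, not vocode's actual API; note that Google expects BCP-47 language codes such as "es-ES".

```python
from dataclasses import dataclass

# Stand-in config classes for each provider (names are illustrative).
@dataclass
class DeepgramConfig:
    language: str = "en-US"

@dataclass
class GoogleConfig:
    language_code: str = "en-US"

def make_transcriber_config(provider: str, language: str):
    """Build a provider-appropriate transcriber config for a language."""
    if provider == "deepgram":
        return DeepgramConfig(language=language)
    if provider == "google":
        # Google Cloud Speech-to-Text uses BCP-47 codes like "es-ES".
        return GoogleConfig(language_code=language)
    raise ValueError(f"unknown provider: {provider}")

config = make_transcriber_config("google", "es-ES")
```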
Language Configuration
It is recommended to load the speech synthesizer voice and transcription model from environment variables or configuration to avoid hard-coding language choices:
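For example, the voice and language can be read from environment variables with English defaults. The variable names here are illustrative, not a vocode convention:

```python
import os

# Read language settings from the environment, falling back to English
# defaults (variable names chosen for illustration).
voice_name = os.environ.get("SYNTHESIZER_VOICE", "en-US-AriaNeural")
transcriber_language = os.environ.get("TRANSCRIBER_LANGUAGE", "en-US")

# These values can then be passed into the synthesizer and transcriber
# configs, so switching languages only requires an environment change.
```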
This allows dynamically configuring the speech language without code changes.