Synthesizer Reference
SynthesizerConfig
The sampling rate of the audio in Hz.
The encoding format for the audio data.
Whether the audio chunks returned should be encoded as WAV format. Defaults to False.
Optional configuration for controlling sentiment/emotion in the synthesized speech.
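To illustrate what WAV encoding of returned chunks involves, the sketch below wraps raw PCM bytes in a WAV container using the Python standard library. The function name `encode_as_wav` and the mono, 16-bit layout are assumptions made for this example; the library's own encoding path may differ.

```python
import io
import wave


def encode_as_wav(pcm_chunk: bytes, sampling_rate: int) -> bytes:
    """Wrap raw PCM bytes in a WAV container (hypothetical helper)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)  # assumption: mono audio
        wav_file.setsampwidth(2)  # assumption: 16-bit linear PCM samples
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(pcm_chunk)
    return buffer.getvalue()


# 160 two-byte samples = 10 ms of silence at 16 kHz
silence = b"\x00\x00" * 160
wav_bytes = encode_as_wav(silence, sampling_rate=16000)
```

The result is the original PCM payload preceded by a RIFF/WAVE header that records the sampling rate, so a player can decode the chunk without out-of-band metadata.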
AzureSynthesizerConfig
The name of the voice to use. Defaults to “en-US-SteffanNeural”.
The pitch of the voice, as a percentage adjustment. Defaults to 0.
The speaking rate of the voice, as a percentage adjustment. Defaults to 15.
The language code to use. Defaults to “en-US”.
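Percentage adjustments such as the pitch and rate above correspond to relative values on the SSML `prosody` element that Azure accepts. The following is a minimal sketch of that mapping, assuming the defaults listed here; `build_ssml` is a hypothetical helper, not the library's implementation.

```python
from xml.sax.saxutils import escape


def build_ssml(
    text: str,
    voice_name: str = "en-US-SteffanNeural",
    pitch: int = 0,
    rate: int = 15,
    language_code: str = "en-US",
) -> str:
    """Render text as SSML with relative pitch and rate (hypothetical helper)."""
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{language_code}">'
        f'<voice name="{voice_name}">'
        # "+15%" means 15 percent faster than the voice's normal rate
        f'<prosody pitch="{pitch:+d}%" rate="{rate:+d}%">'
        f"{escape(text)}"  # escape &, < and > so the markup stays well-formed
        "</prosody></voice></speak>"
    )


ssml = build_ssml("Hello & welcome")
```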
GoogleSynthesizerConfig
The language code to use for synthesis. Defaults to “en-US”.
The name of the Google Cloud Text-to-Speech voice to use. Defaults to “en-US-Neural2-I”.
The pitch of the voice. Defaults to 0.
The speaking rate of the voice. Defaults to 1.2.
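Unlike the Azure percentage adjustments, Google Cloud Text-to-Speech treats the speaking rate as a multiplier (1.0 is normal speed, so 1.2 is 20% faster) and the pitch as a semitone offset. The service documents a speaking rate range of 0.25 to 4.0 and a pitch range of -20.0 to 20.0. A small validation sketch follows; the function name and the returned dictionary shape are assumptions for illustration only.

```python
def validate_google_prosody(pitch: float = 0.0, speaking_rate: float = 1.2) -> dict:
    """Check pitch and rate against Google's documented ranges (hypothetical helper)."""
    if not -20.0 <= pitch <= 20.0:
        raise ValueError(f"pitch must be within [-20.0, 20.0] semitones, got {pitch}")
    if not 0.25 <= speaking_rate <= 4.0:
        raise ValueError(f"speaking_rate must be within [0.25, 4.0], got {speaking_rate}")
    return {"pitch": pitch, "speaking_rate": speaking_rate}


defaults = validate_google_prosody()
```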
ElevenLabsSynthesizerConfig
The API key for accessing the ElevenLabs API.
The ID of the voice to use. Defaults to the Adam voice.
Latency optimization level, from 0 to 4.
Whether to use experimental streaming.
Stability level for the voice. Used together with similarity_boost.
Similarity boost level for the voice. Used together with stability.
Custom model ID to use.
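Because stability and similarity_boost are described as being used with each other, one reasonable reading is that they should be supplied as a pair. The sketch below enforces that reading and the 0 to 4 latency range. The class `ElevenLabsVoiceOptions` and the field name `optimize_streaming_latency` are assumptions for this example and do not reproduce the library's config class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ElevenLabsVoiceOptions:
    """Hypothetical stand-in for the voice-related options described above."""

    stability: Optional[float] = None
    similarity_boost: Optional[float] = None
    optimize_streaming_latency: Optional[int] = None

    def __post_init__(self) -> None:
        # Assumed rule: both values are set, or neither is.
        if (self.stability is None) != (self.similarity_boost is None):
            raise ValueError("stability and similarity_boost must be set together")
        latency = self.optimize_streaming_latency
        if latency is not None and not 0 <= latency <= 4:
            raise ValueError("optimize_streaming_latency must be between 0 and 4")

    def voice_settings(self) -> Optional[dict]:
        """Return the paired settings, or None to fall back to the voice defaults."""
        if self.stability is None:
            return None
        return {
            "stability": self.stability,
            "similarity_boost": self.similarity_boost,
        }


options = ElevenLabsVoiceOptions(stability=0.5, similarity_boost=0.75)
```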
RimeSynthesizerConfig
Name of Rime speaker to use. Defaults to “young_male_unmarked-1”.
Sampling rate of generated audio. Defaults to 22050.
Base URL for Rime API. Defaults to “https://rjmopratfrdjgmfmaios.functions.supabase.co/rime-tts”.
Speed adjustment factor. Defaults to None.
PlayHtSynthesizerConfig
API key for Play.ht API.
User ID for Play.ht API.
Speed adjustment.
Random seed for stochastic models.
Sampling temperature for stochastic models.
ID of voice to use. Defaults to “larry”.
Whether to use experimental streaming. Defaults to False.
StreamElementsSynthesizerConfig
Name of StreamElements voice to use. Defaults to “Brian”.
CoquiTTSSynthesizerConfig
Keyword arguments for Coqui TTS model.
Speaker ID for multi-speaker models. Defaults to None.
Language code for multi-lingual models. Defaults to None.
CoquiSynthesizerConfig
API key for Coqui API.
ID of voice to use. Defaults to “ebe2db86-62a6-49a1-907a-9a1360d4416e”.
Voice prompt to use instead of voice ID.
Whether to use XTTS voices.
BarkSynthesizerConfig
Keyword arguments for Bark model preloading.
Keyword arguments for Bark audio generation.
PollySynthesizerConfig
Language code to use. Defaults to “en-US”.
Voice ID of the Polly voice to use. Defaults to “Matthew”.
Sampling rate of generated audio. Defaults to 16000.
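As a rough illustration of how these three values could feed an Amazon Polly request, the sketch below assembles keyword arguments in the shape expected by the `synthesize_speech` operation. The helper name `polly_request_kwargs` and the choice of PCM output are assumptions. Polly takes the sample rate as a string and, for PCM output, accepts only 8000 and 16000.

```python
def polly_request_kwargs(
    text: str,
    language_code: str = "en-US",
    voice_id: str = "Matthew",
    sampling_rate: int = 16000,
) -> dict:
    """Build arguments for a Polly synthesize_speech call (hypothetical helper)."""
    if sampling_rate not in (8000, 16000):
        raise ValueError("Polly PCM output supports only 8000 or 16000 Hz")
    return {
        "Text": text,
        "LanguageCode": language_code,
        "VoiceId": voice_id,
        "OutputFormat": "pcm",  # assumption: raw PCM is requested
        "SampleRate": str(sampling_rate),  # Polly expects a string here
    }


kwargs = polly_request_kwargs("Hello there")
```

With AWS credentials configured, the dictionary could be passed along as `boto3.client("polly").synthesize_speech(**kwargs)`.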