SynthesizerConfig

sampling_rate
int

The sampling rate of the audio in Hz.

audio_encoding
AudioEncoding

The encoding format for the audio data.

should_encode_as_wav
bool

Whether the audio chunks returned should be encoded as WAV format. Defaults to False.

sentiment_config
Optional[SentimentConfig]

Optional configuration for controlling sentiment/emotion in the synthesized speech.
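
As a rough illustration of how these fields fit together, the sketch below mirrors them in a standalone dataclass and estimates the raw audio data rate. The names `SynthesizerConfigSketch`, `AudioEncodingSketch`, and `bytes_per_second` are hypothetical, the enum members are assumed, and the sample widths assume 16-bit linear PCM and 8-bit mu-law. It is not the library's own code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AudioEncodingSketch(str, Enum):
    # Hypothetical members; the real AudioEncoding enum may differ.
    LINEAR16 = "linear16"
    MULAW = "mulaw"


# Assumed sample widths: 16-bit linear PCM, 8-bit mu-law.
BYTES_PER_SAMPLE = {
    AudioEncodingSketch.LINEAR16: 2,
    AudioEncodingSketch.MULAW: 1,
}


@dataclass
class SynthesizerConfigSketch:
    sampling_rate: int
    audio_encoding: AudioEncodingSketch
    should_encode_as_wav: bool = False
    sentiment_config: Optional[dict] = None  # stand-in for SentimentConfig

    def bytes_per_second(self) -> int:
        """Raw mono audio payload per second, excluding any WAV header."""
        return self.sampling_rate * BYTES_PER_SAMPLE[self.audio_encoding]


config = SynthesizerConfigSketch(
    sampling_rate=16000,
    audio_encoding=AudioEncodingSketch.LINEAR16,
)
print(config.bytes_per_second())
```

Knowing the data rate can help when sizing audio chunks or buffers for a given sampling_rate and audio_encoding pair.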

AzureSynthesizerConfig

voice_name
str

The name of the voice to use. Defaults to “en-US-SteffanNeural”.

pitch
int

The pitch of the voice, as a percentage adjustment. Defaults to 0.

rate
int

The speaking rate of the voice, as a percentage adjustment. Defaults to 15.

language_code
str

The language code to use. Defaults to “en-US”.
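
Since pitch and rate are percentage adjustments, one plausible way they could reach the Azure service is as signed percentages on an SSML prosody element. The sketch below assumes that mapping; `build_prosody_ssml` is a hypothetical helper, and how the library actually builds its request is not stated here.

```python
from xml.sax.saxutils import escape, quoteattr


def build_prosody_ssml(
    text: str,
    voice_name: str = "en-US-SteffanNeural",
    pitch: int = 0,
    rate: int = 15,
    language_code: str = "en-US",
) -> str:
    """Wrap text in SSML, expressing pitch and rate as signed percentages."""
    # Format as "+15%" or "-10%"; quoteattr adds the surrounding quotes.
    pitch_attr = quoteattr(f"{pitch:+d}%")
    rate_attr = quoteattr(f"{rate:+d}%")
    lang_attr = quoteattr(language_code)
    voice_attr = quoteattr(voice_name)
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={lang_attr}>"
        f"<voice name={voice_attr}>"
        f"<prosody pitch={pitch_attr} rate={rate_attr}>"
        f"{escape(text)}"
        "</prosody></voice></speak>"
    )


ssml = build_prosody_ssml("Tom & Jerry say hello.")
print(ssml)
```

With the documented defaults, this yields a prosody element with a pitch of +0% and a rate of +15%.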

GoogleSynthesizerConfig

language_code
str

The language code to use for synthesis. Defaults to “en-US”.

voice_name
str

The name of the Google Cloud Text-to-Speech voice to use. Defaults to “en-US-Neural2-I”.

pitch
float

The pitch of the voice. Defaults to 0.

speaking_rate
float

The speaking rate of the voice. Defaults to 1.2.
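
Google Cloud Text-to-Speech accepts a pitch between -20.0 and 20.0 semitones and a speaking rate between 0.25 and 4.0, so it can be useful to check values before sending a request. The following is a minimal sketch of such a check; `GoogleSynthesizerConfigSketch` is a hypothetical stand-in for the real class, and whether the library validates these ranges itself is not stated here.

```python
from dataclasses import dataclass


@dataclass
class GoogleSynthesizerConfigSketch:
    language_code: str = "en-US"
    voice_name: str = "en-US-Neural2-I"
    pitch: float = 0.0
    speaking_rate: float = 1.2

    def __post_init__(self) -> None:
        # Ranges documented for Google Cloud Text-to-Speech audio settings.
        if not -20.0 <= self.pitch <= 20.0:
            raise ValueError(f"pitch out of range: {self.pitch}")
        if not 0.25 <= self.speaking_rate <= 4.0:
            raise ValueError(f"speaking_rate out of range: {self.speaking_rate}")


default_config = GoogleSynthesizerConfigSketch()
print(default_config.speaking_rate)

try:
    GoogleSynthesizerConfigSketch(speaking_rate=5.0)
except ValueError as error:
    print(error)
```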

ElevenLabsSynthesizerConfig

api_key
Optional[str]

The API key for accessing the ElevenLabs API.

voice_id
Optional[str]

The ID of the voice to use. Defaults to the Adam voice.

optimize_streaming_latency
Optional[int]

Latency optimization level, from 0 to 4.

experimental_streaming
Optional[bool]

Whether to use experimental streaming.

stability
Optional[float]

Stability level for the voice. Used with similarity_boost.

similarity_boost
Optional[float]

Similarity boost level for the voice. Used with stability.

model_id
Optional[str]

Custom model ID to use.
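
The constraints described above (stability and similarity_boost used together, and a latency level from 0 to 4) can be sketched as a small pre-flight check. `check_elevenlabs_options` is a hypothetical helper, and the rule that the two voice settings must be set as a pair is an assumption drawn from the field descriptions, not confirmed library behavior.

```python
from typing import Optional


def check_elevenlabs_options(
    stability: Optional[float] = None,
    similarity_boost: Optional[float] = None,
    optimize_streaming_latency: Optional[int] = None,
) -> None:
    """Raise ValueError if the options are inconsistent with the field docs."""
    # Assumption: the two voice settings are either both set or both omitted.
    if (stability is None) != (similarity_boost is None):
        raise ValueError("stability and similarity_boost must be set together")
    # The latency level is documented as a value from 0 to 4.
    if optimize_streaming_latency is not None:
        if not 0 <= optimize_streaming_latency <= 4:
            raise ValueError("optimize_streaming_latency must be between 0 and 4")


# Passes silently: both voice settings supplied, latency level in range.
check_elevenlabs_options(
    stability=0.5, similarity_boost=0.75, optimize_streaming_latency=2
)

try:
    check_elevenlabs_options(stability=0.5)
except ValueError as error:
    print(error)
```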

RimeSynthesizerConfig

speaker
str

Name of Rime speaker to use. Defaults to “young_male_unmarked-1”.

sampling_rate
int

Sampling rate of generated audio. Defaults to 22050.

base_url
str

Base URL for Rime API. Defaults to “https://rjmopratfrdjgmfmaios.functions.supabase.co/rime-tts”.

speed_alpha
Optional[float]

Speed adjustment factor. Defaults to None.

PlayHtSynthesizerConfig

api_key
Optional[str]

API key for Play.ht API.

user_id
Optional[str]

User ID for Play.ht API.

speed
Optional[int]

Speed adjustment.

seed
Optional[int]

Random seed for stochastic models.

temperature
Optional[int]

Sampling temperature for stochastic models.

voice_id
str

ID of voice to use. Defaults to “larry”.

experimental_streaming
bool

Whether to use experimental streaming. Defaults to False.

StreamElementsSynthesizerConfig

voice
str

Name of StreamElements voice to use. Defaults to “Brian”.

CoquiTTSSynthesizerConfig

tts_kwargs
dict

Keyword arguments for Coqui TTS model.

speaker
Optional[str]

Speaker ID for multi-speaker models. Defaults to None.

language
Optional[str]

Language code for multi-lingual models. Defaults to None.

CoquiSynthesizerConfig

api_key
Optional[str]

API key for Coqui API.

voice_id
Optional[str]

ID of voice to use. Defaults to “ebe2db86-62a6-49a1-907a-9a1360d4416e”.

voice_prompt
Optional[str]

Voice prompt to use instead of voice ID.

use_xtts
Optional[bool]

Whether to use XTTS voices.
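
Because voice_prompt is described as an alternative to voice_id, a caller might resolve the two as in the sketch below. `select_voice` and its return shape are hypothetical, and giving the prompt precedence whenever it is set is an assumption based on the field description.

```python
from typing import Optional, Tuple

DEFAULT_VOICE_ID = "ebe2db86-62a6-49a1-907a-9a1360d4416e"


def select_voice(
    voice_id: Optional[str] = DEFAULT_VOICE_ID,
    voice_prompt: Optional[str] = None,
) -> Tuple[str, str]:
    """Return which voice source to use and its value."""
    # Assumption: a non-empty prompt replaces the voice ID entirely.
    if voice_prompt:
        return ("voice_prompt", voice_prompt)
    # Fall back to the documented default when no ID is given.
    return ("voice_id", voice_id or DEFAULT_VOICE_ID)


print(select_voice())
print(select_voice(voice_prompt="A calm, low-pitched narrator"))
```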

BarkSynthesizerConfig

preload_kwargs
Dict[str, Any]

Keyword arguments for Bark model preloading.

generate_kwargs
Dict[str, Any]

Keyword arguments for Bark audio generation.

PollySynthesizerConfig

language_code
str

Language code to use. Defaults to “en-US”.

voice_id
str

Voice ID of the Polly voice to use. Defaults to “Matthew”.

sampling_rate
int

Sampling rate of generated audio. Defaults to 16000.