# How to tune the responsiveness in Vocode conversations

## Endpointing

Endpointing is the process of deciding when a speaker has finished talking. The `EndpointingConfig` controls how this is done. There are a couple of different ways to configure endpointing:

We provide `DeepgramEndpointingConfig()`, which has reasonable defaults and knobs to suit most use-cases (but only works with the Deepgram transcriber).
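As a sketch of what tuning these knobs might look like (the import path and the values here are assumptions based on this page — check them against your installed Vocode version):

```python
# Assumed import path; verify against your Vocode version.
from vocode.streaming.models.transcriber import DeepgramEndpointingConfig

# Values are illustrative, not recommendations.
endpointing_config = DeepgramEndpointingConfig(
    vad_threshold_ms=500,
    utterance_cutoff_ms=1000,
    use_single_utterance_endpointing_for_first_utterance=True,
)
```

This config is then passed as the `endpointing_config` of your Deepgram transcriber config.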
- `vad_threshold_ms`: translates to Deepgram's endpointing feature
- `utterance_cutoff_ms`: uses Deepgram's Utterance End feature
- `time_silent_config`: a Vocode-specific parameter that marks an utterance final if we haven't seen any new words in X seconds
- `use_single_utterance_endpointing_for_first_utterance`: uses `is_final` instead of `speech_final` for endpointing on the first utterance (works really well for outbound conversations, where the user's first utterance is something like "Hello?") - see Deepgram's docs on endpointing for more info.

## Interruptions

While it is speaking in a `StreamingConversation`, the agent can be interrupted by the user. `AgentConfig` itself provides a parameter called `interrupt_sensitivity` that can be used to control how sensitive the AI is to interruptions. Interrupt sensitivity has two options: low (default) and high. Low sensitivity makes the bot ignore backchannels (e.g. "sure", "uh-huh") while it is speaking. High sensitivity makes the agent treat any word from the human as an interruption.
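As a rough illustration of the low vs. high distinction — a hypothetical sketch, not Vocode's actual logic (the backchannel list and function are invented for this example):

```python
# Hypothetical backchannel list for illustration only.
BACKCHANNELS = {"sure", "uh-huh", "yeah", "mhm", "right", "ok"}

def is_interruption(transcript: str, sensitivity: str, bot_is_speaking: bool) -> bool:
    """Decide whether human speech should cut the bot off (toy sketch)."""
    if not bot_is_speaking:
        return False  # nothing to interrupt
    words = transcript.lower().split()
    if not words:
        return False
    if sensitivity == "high":
        return True  # any word from the human counts as an interruption
    # Low sensitivity: ignore utterances made up entirely of backchannels.
    return any(w.strip(".,!?") not in BACKCHANNELS for w in words)
```

For example, `is_interruption("uh-huh", "low", True)` is `False`, while `is_interruption("wait, stop", "low", True)` is `True`.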
The implementation of this configuration is in `StreamingConversation.TranscriptionsWorker`. To make it work well for your use-case you may need to fork Vocode and override this behavior, but it provides a good starting place for most use-cases.
Stay tuned, more dials to come here soon!
## Conversation speed

`StreamingConversation` also exposes a parameter called `conversation_speed`, which controls the length of endpointing pauses, i.e. how long the bot will wait before responding to the human. This applies both to normal utterances from the human and to interruptions.
The amount of time the bot waits scales inversely with the `conversation_speed` value: a bot with a `conversation_speed` of 2 responds in half the time compared to a `conversation_speed` of 1, and a `conversation_speed` of 0.5 means the bot takes twice as long to respond.
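This inverse relationship can be sketched as follows (the base pause length is an illustrative stand-in, not Vocode's actual default):

```python
BASE_PAUSE_SECONDS = 1.0  # illustrative base endpointing pause

def endpointing_pause(conversation_speed: float) -> float:
    """Pause length scales inversely with conversation_speed."""
    return BASE_PAUSE_SECONDS / conversation_speed
```

So `endpointing_pause(2)` is half of `endpointing_pause(1)`, and `endpointing_pause(0.5)` is twice as long.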
The underlying `speed_coefficient` updates throughout the course of the conversation - see `vocode.streaming.utils.speed_manager` for the implementation!
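As a purely hypothetical illustration of the idea — none of these names come from Vocode, and the real logic lives in `vocode.streaming.utils.speed_manager` — an adaptive speed coefficient might be nudged toward the human's observed pace:

```python
class ToySpeedManager:
    """Hypothetical sketch of an adaptive speed coefficient.

    Not Vocode's implementation; see vocode.streaming.utils.speed_manager
    for the real one.
    """

    def __init__(self, initial_coefficient: float = 1.0, smoothing: float = 0.2):
        self.coefficient = initial_coefficient
        self.smoothing = smoothing

    def update(self, human_response_seconds: float, expected_seconds: float = 1.0) -> float:
        # A fast human response (shorter than expected) nudges the coefficient
        # up, shortening the bot's pauses; a slow response nudges it down.
        observed_speed = expected_seconds / max(human_response_seconds, 1e-6)
        self.coefficient += self.smoothing * (observed_speed - self.coefficient)
        return self.coefficient
```

The design idea is simply exponential smoothing: each human turn pulls the coefficient a fraction of the way toward the speed that turn implied, so one outlier pause doesn't whipsaw the bot's timing.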