You can subclass a RespondAgent to create a simple agent that can be passed into StreamingConversation. To do so, you must create an agent type, create an agent config, and then create your agent subclass. In the examples below, we will create an agent that responds with the same message no matter what is said to it, called BrokenRecordAgent.

Agent type

Each agent has a unique agent type string that is checked in various parts of Vocode, most notably in the factories that create agents. So, you must create a new type for your custom agent. See the AgentType enum in vocode/streaming/models/ for examples. For our BrokenRecordAgent, we will use “agent_broken_record” as our type.

Agent config

Your agent must have a corresponding agent config that is a subclass of AgentConfig and is (JSON-serializable). Serialization is automatically handled by Pydantic.

The agent config should only contain the information you need to deterministically create the same agent each time. This means with the same parameters in your config, the corresponding agent should have the same behavior each time you create it.

For our BrokenRecordAgent, we create a config like:

from vocode.streaming.models.agent import AgentConfig

class BrokenRecordAgentConfig(AgentConfig, type="agent_broken_record"):
    message: str # The message we will always return

Custom Agent

Now, you can create your custom agent subclass of RespondAgent. In your class header, pass in RespondAgent with a your agent type as a type hint. This should look like RespondAgent[Your_Agent_Type].

Each agent should override the generate_response() async method to support streaming and respond() method to support turn-based conversations.

If you want to only support turn-based conversations, you do not have to overwrite generate_response() but you MUST set generate_response=False in your agent config (see ChatVertexAIAgentConfig in vocode/streaming/models/ for an example). Otherwise, you must ALWAYS implement the generate_response() async method.

The generate_response() method returns an AsyncGenerator of tuples containing each message/sentence and a boolean for whether that message can be interrupted by the human speaking. You can automatically create this generator by yielding instead of returning (see example below).

We will now define our BrokenRecordAgent. Since we simply return the same message each time, we can return and yield that message in respond() and generate_response(), respectively:

class BrokenRecordAgent(RespondAgent[BrokenRecordAgentConfig]):

    # is_interrupt is True when the human has just interrupted the bot's last response
    def respond(
        self, human_input, is_interrupt: bool = False
    ) -> tuple[Optional[str], bool]:
        return self.agent_config.message

    async def generate_response(
        self, human_input, is_interrupt: bool = False
    ) -> AsyncGenerator[Tuple[str, bool], None]: # message and whether or not the message is interruptible
        """Returns a generator that yields the agent's response one sentence at a time."""
        yield self.agent_config.message, False

See our other agent implementations for more examples!