Introduction

This example shows how to use Vocode as a tool that augments the abilities of a Langchain agent. Given access to Vocode, a Langchain agent can make autonomous phone calls and take action based on the outcome of those calls.

Our demo walks through how to instruct the agent to look up a phone number and make the appropriate call.


How to run it

Requirements

  1. Install Ngrok
  2. Install Redis
  3. Install Poetry

Run the example

Note: gpt-4 is required for this to work. gpt-3.5-turbo and older models do not reliably handle the JSON responses used by Langchain agents out of the box.

To get started, clone the Vocode repo or copy the Langchain agent app directory.

git clone https://github.com/vocodedev/vocode-python.git

Environment

  1. Copy the .env.template and fill in your API keys. You’ll need:
  2. Tunnel port 3000 to ngrok by running:
ngrok http 3000

Fill in the TELEPHONY_SERVER_BASE_URL environment variable with your ngrok base URL. Don’t include https://, so it should look something like:

TELEPHONY_SERVER_BASE_URL=asdf1234.ngrok.app
  3. Buy a phone number on Twilio or verify your caller ID to use as the outbound phone number. Set this phone number as the OUTBOUND_CALLER_NUMBER environment variable. Include + followed by the country code and area code, so a US phone number would look something like:
OUTBOUND_CALLER_NUMBER=+15555555555
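
The two formatting rules above (no https:// on the base URL, a leading + on the phone number) are easy to get wrong. The following is a minimal sketch of a pre-flight check. The helper names normalize_base_url and is_e164 are hypothetical and are not part of Vocode, and the phone number pattern is a loose E.164-style check rather than full validation.

```python
import re

# Hypothetical helpers for illustration only; they are not part of Vocode.


def normalize_base_url(value: str) -> str:
    """Strip a leading http:// or https:// and any trailing slash."""
    return re.sub(r"^https?://", "", value.strip()).rstrip("/")


def is_e164(number: str) -> bool:
    """Return True for '+' followed by 2 to 15 digits, the first one non-zero."""
    return re.fullmatch(r"\+[1-9]\d{1,14}", number) is not None


# Example values taken from the steps above.
print(normalize_base_url("https://asdf1234.ngrok.app/"))
print(is_e164("+15555555555"))
print(is_e164("5555555555"))
```

Running a check like this against os.environ before starting the server can catch a malformed value early instead of at call time.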

Set up self-hosted telephony server

Run the following steps from the langchain_agent directory.

Running with Docker

  1. Build the telephony server Docker image:
docker build -t vocode-langchain-agent-telephony-app .
  2. Run the service using docker-compose:
docker-compose up

Running with Python

  1. (Optional) Set up a Python environment. We recommend a virtual environment:
python3 -m venv venv
source venv/bin/activate
  2. Install requirements:
poetry install
  3. Run an instance of Redis at localhost:6379. With Docker, this can be done with:
docker run -dp 6379:6379 -it redis/redis-stack:latest
  4. Run the TelephonyServer:
uvicorn telephony_app:app --reload --port 3000
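
Before moving on, it can help to confirm that both services from the steps above are accepting connections. The sketch below uses only the Python standard library. The helper is_port_open is a hypothetical name, and the ports are the ones used in this guide (6379 for Redis, 3000 for uvicorn). It only checks that something is listening on each port, not that the service is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    # Ports taken from the steps above: Redis on 6379, uvicorn on 3000.
    for name, port in [("Redis", 6379), ("TelephonyServer", 3000)]:
        status = "reachable" if is_port_open("localhost", port) else "NOT reachable"
        print(f"{name} on localhost:{port}: {status}")
```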

Set up the Langchain agent

With the self-hosted telephony server running:

  1. Update the phone numbers in the contact book in tools/contacts.py:
CONTACTS = [{"name": "Kian", "phone": "+123456789"}]
  2. Run main.py:
poetry install
poetry run python main.py

Code explanation

The Langchain agent is implemented in main.py. It uses the Langchain library to initialize an agent that can have a conversation.

Langchain Agent

main.py instantiates the Langchain agent and its tools: it sets an objective, initializes the agent, and runs the conversation.

agent = initialize_agent(
    tools=[get_all_contacts, call_phone_number, word_of_the_day],
    llm=llm,
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    verbose=verbose,
    memory=memory,
)

Langchain tools

Tool to get all contacts

@tool("get_all_contacts")
def get_all_contacts(placeholder: str) -> List[dict]:
    """Get contacts."""
    return CONTACTS

Tool to call phone number

tools/vocode.py uses the OutboundCall class to initiate a phone call:

@tool("call phone number")
def call_phone_number(input: str) -> str:
    """calls a phone number as a bot and returns a transcript of the conversation.
    the input to this tool is a pipe separated list of a phone number, a prompt, and the first thing the bot should say.
    The prompt should instruct the bot with what to do on the call and be in the 3rd person,
    like 'the assistant is performing this task' instead of 'perform this task'.

    should only use this tool once it has found an adequate phone number to call.

    for example, `+15555555555|the assistant is explaining the meaning of life|i'm going to tell you the meaning of life` will call +15555555555, say 'i'm going to tell you the meaning of life', and instruct the assistant to tell the human what the meaning of life is.
    """
    phone_number, prompt, initial_message = input.split("|", 2)
    call = OutboundCall(
        base_url=os.environ["TELEPHONY_SERVER_BASE_URL"],
        to_phone=phone_number,
        from_phone=os.environ["OUTBOUND_CALLER_NUMBER"],
        config_manager=RedisConfigManager(),
        agent_config=ChatGPTAgentConfig(
            prompt_preamble=prompt,
            initial_message=BaseMessage(text=initial_message),
        ),
        logger=logging.Logger("call_phone_number"),
    )
    LOOP.run_until_complete(call.start())
    while True:
        maybe_transcript = get_transcript(call.conversation_id)
        if maybe_transcript:
            delete_transcript(call.conversation_id)
            return maybe_transcript
        else:
            time.sleep(1)
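
Two details of the tool above are worth isolating: the pipe-separated input is split with a maximum of two splits, and the transcript is fetched by polling. The sketch below illustrates both without any Vocode dependency. parse_call_input and wait_for_transcript are hypothetical names, and the timeout is an added assumption: the tool as shown polls indefinitely.

```python
import time
from typing import Callable, Optional, Tuple


def parse_call_input(raw: str) -> Tuple[str, str, str]:
    """Split 'phone|prompt|initial message' into its three fields.

    A limit of two splits mirrors the tool above: any further '|'
    characters stay inside the initial message.
    """
    parts = raw.split("|", 2)
    if len(parts) != 3:
        raise ValueError("expected 'phone|prompt|initial message'")
    phone_number, prompt, initial_message = parts
    return phone_number, prompt, initial_message


def wait_for_transcript(
    fetch: Callable[[], Optional[str]],
    timeout_s: float = 600.0,
    poll_interval_s: float = 1.0,
) -> str:
    """Poll fetch() until it returns a non-empty transcript or time runs out."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        transcript = fetch()
        if transcript:
            return transcript
        time.sleep(poll_interval_s)
    raise TimeoutError(f"no transcript after {timeout_s} seconds")
```

In the tool above, fetch would correspond to something like lambda: get_transcript(call.conversation_id). A bounded wait keeps the agent from hanging if a call never produces a transcript.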

TelephonyServer

telephony_app.py instantiates a TelephonyServer object to manage the phone call initiated by OutboundCall:

telephony_server = TelephonyServer(
    base_url=BASE_URL,
    config_manager=config_manager,
    inbound_call_configs=[],
    events_manager=EventsManager(),
    logger=logger,
)

app.include_router(telephony_server.get_router())