Integrate Sentry for error tracking and performance monitoring
In quickstarts/streaming_conversation.py, replace the main function with the following code:
app/telephony_app/main.py
LATENCY_OF_CONVERSATION measures the overall latency of a conversation, from when the user finishes their utterance to when the agent begins its response. It is broken up into the following sub-spans:
- ENDPOINTING_LATENCY: Captures the extra latency incurred while retrieving finalized transcripts from Deepgram before deciding to invoke the agent.
- LANGUAGE_MODEL_TIME_TO_FIRST_TOKEN: Tracks the time taken by the language model to generate the first token (word or character) in its response.
- SYNTHESIS_TIME_TO_FIRST_TOKEN: Measures the time taken by the synthesizer to generate the first token in the synthesized speech. This is useful for evaluating the initial response time of the synthesizer.

Deepgram spans:
- CONNECTED_TO_FIRST_SEND: Measures the time from when the Deepgram websocket connection is established to when the first data is sent.
- FIRST_SEND_TO_FIRST_RECEIVE: Measures the time from when the first data is sent to Deepgram to when the first response is received.
- START_TO_CONNECTION: Tracks the time it takes to establish the websocket connection with Deepgram.

Language model spans:
- TIME_TO_FIRST_TOKEN: Measures the time taken by the language model to generate the first token (word or character) in its response.
- LLM_FIRST_SENTENCE_TOTAL: Measures the total time taken by the language model to generate the first complete sentence.

Synthesizer spans:
- SYNTHESIS_GENERATE_FIRST_CHUNK: Measures the time taken to generate the first chunk of synthesized speech.
- SYNTHESIZER_SYNTHESIS_TOTAL: Tracks the total time taken for the entire speech synthesis process. This span helps in understanding the overall performance of the synthesizer.

With ElevenLabsSynthesizer, the span SYNTHESIZER_SYNTHESIS_TOTAL will be recorded as ElevenLabsSynthesizer.synthesis_total:
- SYNTHESIZER_SYNTHESIS_TOTAL
- SYNTHESIZER_TIME_TO_FIRST_TOKEN
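
The naming rule in the ElevenLabsSynthesizer example can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's actual implementation: it assumes the span constants hold strings of the form "synthesizer.<metric>" and that the leading "synthesizer" segment is replaced by the class name. The helper span_name_for, the stand-in class, and the value assigned to SYNTHESIZER_TIME_TO_FIRST_TOKEN are hypothetical; only the ElevenLabsSynthesizer.synthesis_total result comes from the text above.

```python
# Assumed span name templates. Only the recorded name
# "ElevenLabsSynthesizer.synthesis_total" is confirmed by the documentation;
# the template strings themselves are assumptions made for this sketch.
SYNTHESIZER_SYNTHESIS_TOTAL = "synthesizer.synthesis_total"
SYNTHESIZER_TIME_TO_FIRST_TOKEN = "synthesizer.time_to_first_token"


class ElevenLabsSynthesizer:
    """Stand-in used only for its class name; not the real synthesizer class."""


def span_name_for(synthesizer: object, template: str) -> str:
    """Replace the generic 'synthesizer' prefix with the concrete class name."""
    prefix, _, suffix = template.partition(".")
    if prefix != "synthesizer" or not suffix:
        raise ValueError(f"unexpected span template: {template!r}")
    return f"{type(synthesizer).__name__}.{suffix}"


if __name__ == "__main__":
    synth = ElevenLabsSynthesizer()
    print(span_name_for(synth, SYNTHESIZER_SYNTHESIS_TOTAL))
    # prints "ElevenLabsSynthesizer.synthesis_total"
    print(span_name_for(synth, SYNTHESIZER_TIME_TO_FIRST_TOKEN))
```

Because the class name becomes part of the span name, timings from different synthesizer classes remain distinguishable when they are compared side by side.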