Call latency increase

Hey,

We noticed a significant increase in call latency since yesterday.

It doesn’t seem to be correlated with a new agent version. For example, we haven’t submitted any changes to our Android agent since 2/2/26, yet we still see the large latency increase across all agents.

This is currently hurting our metrics badly and creating a very negative experience for our production users.

Could someone please take a look and help us understand what is causing the increase?

I understand you’re experiencing a significant latency increase since yesterday. Based on the available troubleshooting guidance, here are the recommended steps to diagnose and resolve this issue:

Immediate Troubleshooting Steps

Monitor LLM Response Time(1)
Check your current LLM latency and compare it against the normal range of 500ms - 900ms.(1) If it is consistently above 900ms, consider using the fast tier, switching to an alternative LLM provider, or waiting for your current provider to resolve its performance issues.(1)
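As a rough illustration (not an official Retell tool), the 500ms - 900ms check above could be scripted over a sample of measured LLM latencies; the threshold comes from the guidance in this thread, while the function itself and its "consistently above" cutoff are assumptions:

```python
# Hypothetical helper: flag when sampled LLM latencies (in ms) are
# consistently above the 900 ms upper bound of the normal range cited
# in this thread. "Consistently" is modeled here as a fraction of
# samples exceeding the bound -- an assumed definition.
def llm_latency_status(samples_ms, upper_ms=900, consistent_fraction=0.8):
    """Return 'degraded' if enough samples exceed upper_ms, else 'ok'."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    over = sum(1 for s in samples_ms if s > upper_ms)
    return "degraded" if over / len(samples_ms) >= consistent_fraction else "ok"

print(llm_latency_status([620, 710, 850, 640]))      # within the normal range -> ok
print(llm_latency_status([1400, 1850, 1200, 1600]))  # consistently above 900 ms -> degraded
```

If the status comes back "degraded" across many calls, that matches the situation described in this thread and points at the LLM rather than the agent configuration.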

Check the Status Page(1)
Visit the Status Page for any ongoing issues.(1) If there are ongoing issues, wait for them to resolve.(1)

Consider Geographic Distance(1)
International phone calls may introduce additional latency due to geographic distance.(1) Check if you’re making calls between different countries or continents, and consider getting a local phone number in the same region as your users.(1)

Turn-Taking Model Impact(2)
In some cases, extra end-to-end latency can be explained by turn-taking taking longer during non-standard interactions.(2) If you have many global nodes with long instructions, all of them are added to the node-transition prompt, which consistently slows down node transitions.(2) Reducing the number of global nodes or shortening their instructions would help with node-transition latency.(2)

Configuration Optimizations

Change to a Faster LLM(3)
Use a more efficient language model to reduce response time and improve performance.(3)

Change to a Faster TTS(3)
Implement a faster text-to-speech system for quicker audio output and smoother conversations.(3)

Better Turn-Taking Model(3)
Implement a more sophisticated turn-taking model to accurately detect the end of a user’s turn, preventing the AI from interrupting or prematurely responding.(3)

Get Support

If the above steps don’t resolve your latency issues, reach out to support.(1)

Since you mentioned this started yesterday without any agent changes on your end, checking the Status Page and monitoring LLM response times should be your first priority to rule out platform-wide issues.

We would really appreciate a human taking a closer look and advising on how we can address this issue, which is affecting our production environment.

Hello @yakir.ag

Can you share some call IDs that have this high latency?

And is this still happening?

Hey Omar, yes it is still happening. @Omar_Ashraf

Here’s an example: call_1ccc8ea19efbd82fe6ced499e4d

https://dashboard.retellai.com/call-history?history=call_1ccc8ea19efbd82fe6ced499e4d

Hello @yakir.ag

I see the latency is coming from the LLM. Try a different model, such as GPT 4.1.

To address LLM latency issues in Retell AI, you can switch to GPT 4.1, which is one of the supported models.(1)

If you’re experiencing high latency, here are several optimization steps:

Change to a faster LLM: Switching to an alternative model such as GPT 4.1 can reduce response time and improve performance.(2)

Use Fast Tier: Retell AI offers a high priority pool option that provides more dedicated resources to ensure lower and consistent latency.(1) You can enable this by setting model_high_priority to true in your configuration.(1)

Check Current LLM Latency: Monitor your LLM response time and compare it against the normal range of 500ms-900ms.(3) If consistently above 900ms, consider using fast tier or switching providers.(3)

Optimize Other Components: Beyond the LLM, also consider changing to a faster TTS (text-to-speech) system for quicker audio output.(2)
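Putting the model and fast-tier suggestions above together, an LLM-update payload might look like the sketch below. Only `model_high_priority` is named in this thread; the exact model identifier string and the overall payload shape are assumptions, so confirm them against the Retell API reference before using:

```python
import json

# Sketch of a Retell LLM-update payload combining the suggestions above.
# "model_high_priority" is the fast-tier flag mentioned in this thread;
# the "model" value is an assumed identifier for GPT 4.1 -- verify both
# field names and values in the official API docs.
llm_update_payload = {
    "model": "gpt-4.1",           # faster model suggested above (assumed identifier)
    "model_high_priority": True,  # enable fast tier (dedicated resource pool)
}

print(json.dumps(llm_update_payload, indent=2))
```

The idea is simply that both changes can be applied in a single configuration update rather than one at a time.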

Retell AI typically achieves roughly 600ms - 620ms average end-to-end latency with optimized configurations.(4) The Retell LLM itself averages around 800ms end-to-end latency.(5)