We are having a latency issue. As you can see, all of our inbound calls have more than 2500 ms of end-to-end latency. We tried changing the LLM provider and the voice provider with no luck. Can anyone from the team let us know what the real cause of the problem is, please?
I can help you troubleshoot this latency issue. With end-to-end latency exceeding 2500ms, follow these steps:(1)
1. Check Estimated Latency
Look for features marked with a turtle icon in your configuration and check whether your estimated latency is higher than 1.5s. If it is, disable some of those features to reduce latency.(1)
2. Monitor LLM Response Time
Check your current LLM latency and compare it against the normal range (500ms-900ms). If it is consistently above 900ms, use the fast tier, switch to an alternative LLM provider, or wait for your current provider to resolve its performance issues.(1)
3. Check Status Page
Check the Status Page for any ongoing incidents; if there is one, wait for it to resolve.(1)
4. Consider Geographic Distance
International phone calls may introduce additional latency due to geographic distance. Check if you’re making calls between different countries or continents. Consider getting a local phone number in the same region as your users.(1)
5. Turn-Taking Model Impact
In some cases, extra end-to-end latency can be explained by turn-taking taking longer during non-standard interactions. The system may wait for a “pause” before the agent starts speaking, and that pause time is counted towards E2E latency.(2)
If the above steps don’t resolve your latency issues:
- Locate your call ID(1)
- Message the support-questions channel in the Discord community(1)
- Include your call ID, steps you’ve already tried, and current latency measurements(1)
You can also:
- Visit Contact Sales(1)
- Visit https://github.com/RetellAI/(1)
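Before escalating, it can help to check which stage of a call is actually slow. The sketch below is a hypothetical example: the endpoint path, the `latency_ms` field names, and the sample numbers are all assumptions, not confirmed Retell API details — check the official API reference before relying on any of them.

```python
# Hypothetical sketch: fetch a call's details and report its slowest latency
# stage. Endpoint path and response field names are assumptions, not
# confirmed Retell API details.
import json
from urllib import request

API_KEY = "YOUR_RETELL_API_KEY"  # placeholder, not a real key

def fetch_call(call_id: str) -> dict:
    """Fetch call details; the URL below is an assumed endpoint."""
    req = request.Request(
        f"https://api.retellai.com/v2/get-call/{call_id}",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

def slowest_stage(latency_ms: dict) -> tuple:
    """Return the (stage, milliseconds) pair with the highest latency."""
    return max(latency_ms.items(), key=lambda kv: kv[1])

# Example with made-up numbers (no network call):
sample = {"asr": 300, "llm": 1400, "tts": 600, "network": 250}
stage, ms = slowest_stage(sample)
print(f"slowest stage: {stage} ({ms} ms)")  # slowest stage: llm (1400 ms)
```

Knowing whether ASR, LLM, TTS, or network dominates tells you which of the steps above to try first.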
Hello @b_yildiz
Can you share some call IDs?
Hello @Omar
Sure
call_3dd532cc03233595592d14f4bc3
call_887fd553ab38a47f480d558a300
call_56534e38b5e5e7f1e47cdede56e
call_80df7ff3adcf378414e5f2454c5
Hello @b_yildiz
Actually, the latency is coming from the TTS provider. Try something other than ElevenLabs for the Turkish language and see which provider gives you the best latency.
Thanks for the feedback, @Omar_Ashraf
I tried the other providers; there doesn't seem to be a significant change.
Let me clarify: the traffic is between your server and the TTS provider, isn't it?
Thanks
Based on the available documentation, here’s how latency works in Retell AI’s architecture:
Where Latency Happens in Voice Automation:(1)
- Speech Recognition Processing (ASR) - Time taken to transcribe spoken words into text(1)
- Response Generation (NLG or LLM) - Time to understand and craft an appropriate, contextual reply(1)
- Speech Synthesis (TTS) - Time to turn the generated text back into spoken words(1)
- Network Transmission - Delays caused by sending audio and data between systems, especially in cloud setups(1)
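The stages above add up to the end-to-end figure, so a quick budget check shows how a call can approach 2500 ms even when no single stage looks broken. All numbers below are illustrative examples, not measurements from any provider:

```python
# Illustrative per-turn latency budget; every number here is a made-up
# example, not a measurement from Retell or any TTS/LLM provider.
stages_ms = {
    "asr": 300,      # speech recognition (transcription)
    "llm": 900,      # response generation, at the top of the 500-900 ms range
    "tts": 700,      # speech synthesis time-to-first-audio
    "network": 400,  # transit between systems, worse across continents
}

e2e_ms = sum(stages_ms.values())
print(f"estimated end-to-end latency: {e2e_ms} ms")  # 2300 ms
```

Even with each stage individually "reasonable", the total lands near the 2500 ms observed here, which is why the fix usually involves trimming several stages rather than one.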
Regarding TTS Provider Traffic:
The documentation indicates that Retell AI uses a distributed architecture with edge deployment to minimize latency.(2) The platform processes speech through multiple stages, and yes, there is network communication between Retell’s servers and TTS providers.(1)
Additional Troubleshooting Steps:
Since you’ve already tried different TTS providers with no significant change, consider:
- Check if you’re using features marked with the turtle icon in your configuration, as these add latency(3)
- Verify your LLM response time is in the normal range (500ms-900ms)(3)
- Review your denoising mode setting - switching between “noise-cancellation” and “noise-and-background-speech-cancellation” can impact performance(4)
- Check your STT mode - you can choose between “fast” and “accurate” modes(4)
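As a rough sketch, the two settings mentioned above might sit in an agent configuration payload like the one below. The field names (`stt_mode`, `denoising_mode`) and the agent name are assumptions based on this thread, not confirmed API fields — verify them against the official API reference:

```python
# Hypothetical agent-config fragment; "stt_mode" and "denoising_mode" are
# assumed field names -- verify against the official API reference.
agent_config = {
    "agent_name": "turkish-inbound-agent",   # made-up example name
    "stt_mode": "fast",                      # vs "accurate": trades accuracy for speed
    "denoising_mode": "noise-cancellation",  # lighter than the
                                             # "noise-and-background-speech-cancellation" mode
}

# Sanity-check the value against the two options mentioned in the thread.
assert agent_config["stt_mode"] in ("fast", "accurate")
print(agent_config["stt_mode"])  # fast
```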
For detailed investigation of your specific issue, contact support with your call IDs through the Discord community or visit Contact Sales.(3)
