Question about server region for real-time voice latency

Hello,

We are currently testing a voice agent using Retell with Twilio SIP.

Our call flow is:
User → Twilio → Retell voice agent.

Our users are mainly located in Martinique (Caribbean) and France, so we are trying to optimize latency.

Could you please confirm:

  1. Which region serves our real-time voice processing (US East, US West, or another region)?

  2. Whether there is a way to force or choose a US East endpoint for lower latency for Caribbean users.

Our agent currently shows around 750–1100 ms latency, and we are evaluating whether routing changes could improve it.

Thank you for your help.

Best regards

Thank you for reaching out to Retell AI Support. We’ve received your ticket and our team will respond within 8 hours.

Hi there,
Thanks for reaching out!
For how-to and product usage questions, please post your question in the Retell Community Forum, which is our official support channel. Our team actively monitors the forum and responds within 8 hours (SLA).
Submit your question here: Retell Forum Support

We’re excited to help you get up and running.
Best,
Retell Support Team

Best,
Evy AI
AI Support Agent @ Retell AI

Hello,

To answer your questions directly: Retell’s real-time voice processing (LLM, speech-to-text, and text-to-speech) currently runs on US-based infrastructure. While our SIP layer (powered by LiveKit Cloud) is globally distributed and routes SIP traffic to the nearest edge node, the AI processing pipeline itself operates from US servers. There is not currently a way to select or force a specific regional endpoint (such as US East) through the Retell dashboard or API.
For users in Martinique and France, the 750–1100 ms latency you’re seeing is expected given the transatlantic round trips involved. That said, here are some tips to optimize within that constraint:

  • Use a fast LLM: GPT-4o-mini or GPT-4.1-mini tend to have the lowest inference latency.
  • Set transcription mode to “Optimize for Speed” in your agent settings to reduce ASR wait time.
  • Keep your system prompt concise: fewer tokens mean faster LLM responses.
  • Ensure your Twilio SIP trunk routes through a Twilio region in the US (e.g., us1) rather than a European region, to minimize the SIP hop before reaching Retell.
  • Consider your TTS provider: some providers (e.g., Deepgram, Cartesia) tend to be faster than others.

Regards,
Retell Support

OK, thank you for your reply, but it doesn’t really answer the question of where your servers are located (where in the USA).

Hello,

Apologies for the oversight. Our production infrastructure runs on Amazon Web Services, primarily in the US West (Oregon) region (us-west-2).

Regards,
Retell Support Team
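
For anyone weighing how much the region itself matters, here is a minimal Python sketch that estimates a lower bound on network round-trip time from great-circle distance. Everything in it is an assumption made for illustration: the coordinates are approximate (the two AWS locations are rough stand-ins for us-east-1 and us-west-2), the signal speed is the usual rule of thumb of about two thirds of the speed of light in fiber, and the function names are hypothetical. Real routes are longer than the great-circle path, so actual round trips will be higher.

```python
import math

# Approximate (latitude, longitude) pairs, for illustration only.
FORT_DE_FRANCE = (14.60, -61.07)       # Martinique
PARIS = (48.86, 2.35)                  # France
US_EAST_N_VIRGINIA = (39.04, -77.49)   # rough stand-in for AWS us-east-1
US_WEST_OREGON = (45.84, -119.70)      # rough stand-in for AWS us-west-2

EARTH_RADIUS_KM = 6371.0
FIBER_SPEED_KM_PER_S = 200_000.0       # assumed: about 2/3 of c in optical fiber


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def min_rtt_ms(a, b):
    """Theoretical floor for one network round trip, in milliseconds."""
    return 2 * great_circle_km(a, b) / FIBER_SPEED_KM_PER_S * 1000


if __name__ == "__main__":
    users = {"Martinique": FORT_DE_FRANCE, "France": PARIS}
    regions = {"US East": US_EAST_N_VIRGINIA, "US West": US_WEST_OREGON}
    for user_name, user in users.items():
        for region_name, region in regions.items():
            print(f"{user_name} -> {region_name}: "
                  f"{great_circle_km(user, region):.0f} km, "
                  f"RTT floor {min_rtt_ms(user, region):.0f} ms")
```

Under these assumptions, the gap between an East Coast and a West Coast region comes out at a few tens of milliseconds per round trip. That is small next to a 750–1100 ms total, which suggests that most of the delay comes from the processing pipeline (ASR, LLM, TTS) and from the number of hops rather than from raw distance alone.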