Hi Retell Support,

I need help removing the OpenAI 4o mini Realtime add-on from my subscription and resolving a persistent latency issue that has been affecting my outbound calling campaign.

Account details:

The issue:
My AI voice agent (Sarah) has been running on gpt-4o-mini-realtime since the campaign launched. Every connected call has shown 2,920–3,750ms end-to-end latency in the call logs. This means Sarah pauses 3+ seconds after the prospect says hello before responding — causing nearly every live prospect to hang up assuming it is a robocall.

I have now switched the agent model to GPT 4.1 mini (standard, non-realtime) in the dashboard. However, my subscription still lists “OpenAI 4o mini Realtime” as a line item, and the dashboard latency estimate still shows 3,570–3,750ms.

What I need:

  1. Remove the “OpenAI 4o mini Realtime” add-on from my subscription — I do not need it and it is routing calls through high-latency realtime infrastructure
  2. Confirm that my agent is now fully on GPT 4.1 mini standard non-realtime routing
  3. Confirm expected real-world latency for GPT 4.1 mini on outbound batch calls
  4. Clarify whether the dashboard latency estimate (3,570–3,750ms) reflects actual call latency or is a token-based estimate

Context:
I am running an outbound sales campaign with a hard deadline of March 31, 2026. I have placed over 1,100 calls with zero conversions, and a review of the call transcripts confirms that the 3-second silence at connection is the primary failure point. Fixing this latency issue is urgent.
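To illustrate the kind of check described above, here is a minimal sketch of filtering exported call logs for the slow-response/early-hangup pattern. The field names (`latency_ms`, `disconnect_reason`, `duration_s`) are assumptions for illustration, not Retell's actual export schema:

```python
import json

# Illustrative sample only; real exports would come from the Retell dashboard.
# Field names here are assumed, not Retell's actual schema.
sample_logs = json.loads("""
[
  {"call_id": "c1", "latency_ms": 3120, "disconnect_reason": "user_hangup", "duration_s": 6},
  {"call_id": "c2", "latency_ms": 2980, "disconnect_reason": "user_hangup", "duration_s": 5},
  {"call_id": "c3", "latency_ms": 940,  "disconnect_reason": "agent_hangup", "duration_s": 48}
]
""")

SLOW_MS = 2500        # threshold for a "robocall-like" pause
EARLY_HANGUP_S = 10   # hang-ups within this window suggest the silence caused them

slow_calls = [c for c in sample_logs if c["latency_ms"] >= SLOW_MS]
early_hangups = [c for c in slow_calls
                 if c["disconnect_reason"] == "user_hangup"
                 and c["duration_s"] < EARLY_HANGUP_S]

print(f"{len(slow_calls)}/{len(sample_logs)} calls over {SLOW_MS} ms; "
      f"{len(early_hangups)} ended in an early user hang-up")
```

A correlation like this (nearly every slow call ending in an early hang-up) is what points at connection latency rather than script content as the failure point.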

Please advise on the fastest path to getting real-world latency under 1,000ms on outbound batch calls.

Thank you,
Richard
homecareprotect1@gmail.com

Thank you for reaching out to Retell AI Support. We’ve received your ticket and our team will respond within 8 hours.

Hi Richard,
Could you please confirm your AI voice agent ID once again so we can expedite our review process? We will conduct a manual review shortly. Thank you for your understanding and cooperation.

Best,
Evy AI
AI Support Agent @ Retell AI

Hello,

Regarding the latency issue: on the agent's page, if you hover over the Pie Chart icon next to the latency figure, you can see a breakdown of the values. You will notice that Response Eagerness accounts for a large share, about 2,500 ms. You can reduce this by going to Speech Settings > Response Eagerness and moving the slider up to make the agent more responsive.

We can also confirm that the model has been switched to GPT 4.1 mini, and future calls will be billed under that model.

Regards,
Retell AI Support