Persistent issue - gpt-realtime doesn't detect short phrases like "yes"

mdhor · November 26, 2025, 12:42pm

I keep having a serious issue with the gpt-realtime model.

Often, but not always, when a user says a short phrase like “yes”, the model doesn’t catch it. This results in multiple turns like “Do you want this?” - “Yes” - “Please tell me if you want this” - “YES” - “Are you there?” - “YES!!” and so on, with the user getting increasingly frustrated.

I’ve tried setting up an identical agent directly with the OpenAI Realtime API, and I cannot reproduce the issue there. Thus, I suspect the issue is somewhere in the Retell implementation of the speech-to-speech gpt-realtime model.

Please help, this is a major problem for my business.

support2 · November 26, 2025, 12:42pm

Your issue with gpt-realtime not catching short phrases like “yes” has been reported by other users.(1) A Discord community member described the exact same problem: the Retell agent “almost never understands short phrases like ‘yes’ and ‘yes please’”, while the same agent implemented directly via OpenAI’s API never misses these phrases.(1)

When this was reported to Retell support, they attempted to reproduce the issue but indicated “it’s working fine” and asked what model was being used.(1) The user confirmed they were using GPT Realtime.(1)

Available gpt-realtime models in Retell AI:

According to the API documentation, you can select from these speech-to-speech models using the s2s_model parameter:(2)(3)(4)

gpt-4o-realtime
gpt-4o-mini-realtime
gpt-realtime

The realtime APIs are supported for Single and Multi prompt agent types.(5)
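As a rough illustration of the model choice above, the sketch below checks an s2s_model value against the three names listed and returns the other variants to compare against. Only the parameter name s2s_model and the model names come from this reply; the helper functions and the shape of the settings fragment are hypothetical and do not represent the Retell SDK.

```python
# Model names as listed in the reply above. Everything else here
# (function names, the dict shape) is an assumption for illustration.
SUPPORTED_S2S_MODELS = ("gpt-4o-realtime", "gpt-4o-mini-realtime", "gpt-realtime")


def build_s2s_settings(s2s_model: str) -> dict:
    """Return a settings fragment after checking the model name."""
    if s2s_model not in SUPPORTED_S2S_MODELS:
        raise ValueError(
            f"unknown s2s_model {s2s_model!r}; expected one of {SUPPORTED_S2S_MODELS}"
        )
    return {"s2s_model": s2s_model}


def variants_to_test(current: str) -> list[str]:
    """List the other listed models to try when comparing behaviour."""
    return [m for m in SUPPORTED_S2S_MODELS if m != current]
```

Calling variants_to_test with the model currently in use gives the remaining names to try one at a time, keeping the prompt and test phrases unchanged between runs.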

Potential factors affecting speech detection:

Retell AI’s turn-taking model combines Voice Activity Detection (VAD) with contextual analysis to determine when a user has finished speaking.(6) The system uses VAD to identify speech presence, then applies turn-taking logic to analyze prosodic features like pitch, intonation, and timing.(6)

Retell AI’s turn-taking model is designed to prevent interruptions by recognizing subtle cues like tone shifts, pauses, and sentence patterns.(6) However, this contextual approach means the system waits until it is confident the user has finished talking, which could cause issues with very short utterances.(6)
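To make the concern about short utterances concrete, here is a toy energy-based VAD with an end-of-turn silence window and a minimum speech duration. It is not Retell's implementation, which the thread does not describe in detail. The minimum-duration gate in particular is an assumption, included only to show one way a pipeline of this kind could drop a brief "yes". All names, thresholds, and frame sizes are hypothetical.

```python
def detect_utterances(frame_energies, frame_ms=20, energy_threshold=0.5,
                      min_speech_ms=300, end_silence_ms=200):
    """Return (start_ms, end_ms) spans accepted as user turns.

    A frame counts as speech when its energy reaches energy_threshold.
    A turn ends after end_silence_ms of continuous silence, and is kept
    only if the speech span lasted at least min_speech_ms (assumed gate).
    """
    utterances = []
    start = None        # index of the first speech frame in the current turn
    last_speech = None  # index of the most recent speech frame
    silence = 0         # consecutive silent frames since last_speech

    def close_turn():
        duration = (last_speech - start + 1) * frame_ms
        if duration >= min_speech_ms:
            utterances.append((start * frame_ms, (last_speech + 1) * frame_ms))

    for i, energy in enumerate(frame_energies):
        if energy >= energy_threshold:
            if start is None:
                start = i
            last_speech = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence * frame_ms >= end_silence_ms:
                close_turn()
                start, silence = None, 0

    if start is not None:  # stream ended in the middle of a turn
        close_turn()
    return utterances


# A 200 ms burst of speech (roughly a quick "yes") surrounded by silence.
short_yes = [0.0] * 5 + [1.0] * 10 + [0.0] * 20
```

With the default 300 ms gate the 200 ms burst in short_yes is discarded, while lowering min_speech_ms to 100 keeps it. If a comparable threshold exists in the platform, it would be one setting worth asking support about.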

Recommended next steps:

Since this appears to be a known issue that Retell support is aware of,(1) I recommend:

Contact Retell support directly with specific call examples showing the issue
Verify which exact s2s_model you’re using (gpt-4o-realtime, gpt-4o-mini-realtime, or gpt-realtime)
Test with different realtime model variants to see if the issue persists
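For gathering the specific call examples mentioned above, a small script can flag calls in which the user repeats a short affirmative several turns in a row, which is the pattern described in the original post. This is only a sketch: it assumes transcripts are available as (role, text) pairs produced from the call recording by a separate transcription step, and the role labels, phrase list, and function names are all hypothetical.

```python
import re

# Hypothetical list of short confirmations to look for.
SHORT_AFFIRMATIVES = {"yes", "yes please", "yeah", "yep", "ok", "okay"}


def normalize(text: str) -> str:
    """Lowercase the text and drop punctuation so 'YES!!' matches 'yes'."""
    return re.sub(r"[^a-z ]", "", text.lower()).strip()


def find_repeated_confirmations(turns, min_repeats=2):
    """Return (first_index, last_index) for each run of consecutive user
    turns that are all short affirmatives, ignoring agent turns between."""
    hits = []
    run = []  # indexes of consecutive affirmative user turns
    for i, (role, text) in enumerate(turns):
        if role != "user":
            continue
        if normalize(text) in SHORT_AFFIRMATIVES:
            run.append(i)
        else:
            if len(run) >= min_repeats:
                hits.append((run[0], run[-1]))
            run = []
    if len(run) >= min_repeats:
        hits.append((run[0], run[-1]))
    return hits


example_call = [
    ("agent", "Do you want this?"),
    ("user", "Yes"),
    ("agent", "Please tell me if you want this"),
    ("user", "YES"),
    ("agent", "Are you there?"),
    ("user", "YES!!"),
]
```

Running find_repeated_confirmations on example_call returns the span covering all three "yes" turns. The matching call IDs and timestamps can then be attached to the support request.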

For immediate assistance:

Visit Contact Sales
Ask questions on Discord
Visit https://github.com/RetellAI/
