Issue: fragmented agent speech: partial utterance plays, then restarts from the beginning

andrew3 · April 28, 2026, 6:32am

After a user turn, the agent starts speaking, cuts off partway through its utterance, pauses, and then plays the same utterance again from the beginning. The chunk that plays before the cutoff varies (sometimes a single mora, sometimes two or three words), but the pattern is consistent: a partial start, a 1–2 second gap, and then the full utterance from the top.

We have already tried lowering interruption_sensitivity (currently 0.5); the behavior still reproduces. There is no actual user speech during the gap, so this does not appear to be a correctly detected interruption.

This is reproducible across Cartesia (Aiko, Yuki) and Minimax (Yumi) voices, so it does not appear to be voice-provider-specific.

Reference call

call_06933d2379c5902fe880d2b5504

Agent: agent_9fbf26727e8e36c3ad8e50298e (voiceai_stg_3d5a9af7)
Voice: minimax-Yumi · voice_speed: 1.3 · voice_temperature: 0.8
Language: multi
Conversation flow: conversation_flow_01a359b84c29
LLM: GPT-5.1 (Retell-managed)
interruption_sensitivity: 0.5, enable_backchannel: false

Transcript with timing (Retell-provided)

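[ 14.45s-> 15.25s] agent: チェ
<-── 1.18s gap (no user speech) ──->
[ 16.43s-> 18.43s] agent: チェックイ
[ 18.43s-> 24.62s] agent: ン時間のご変更でございますね。ただいま、チェックイン時間の変更可否をお調べいたしますので、少々お待ちいただけますでしょうか。
(English: "You would like to change your check-in time, is that right? I will check now whether the check-in time can be changed, so could you please wait a moment?" The first segment, チェ, is the opening fragment of チェックイン, "check-in".)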

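[ 47.71s-> 49.95s] user: の電話番号を教えてもらえますか。
(English: "...could you tell me the phone number?" The user segment begins mid-sentence in the transcript.)
[ 51.65s-> 52.85s] agent: チェッ
[ 52.85s-> 55.65s] agent: クイン時間変更
[ 55.65s-> 61.30s] agent: のご相談先のお電話番号でございますね。いま担当窓口の連絡先をお調べいたしますので、少々お待ちいただけますでしょうか。
(English: "You would like the phone number to contact about changing your check-in time, is that right? I will look up the contact details for the responsible desk now, so could you please wait a moment?")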

In both turns, the agent begins its reply, stops mid-word, pauses, and then plays the same utterance from the beginning. There is no user speech in the gap.

The chunk that plays before the cutoff is short here, but in other calls we’ve seen the same pattern with two or three full words playing before the stop-and-restart. The size of the first chunk varies; the stop-and-restart pattern is consistent.
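
For anyone who wants to scan their own calls for the same pattern, here is a rough Python sketch of the check we do by eye on the timed transcript. It assumes the line format shown above; the names `Segment`, `parse_segments`, `find_restarts`, and the 0.5 s `min_gap_s` threshold are our own illustrative choices, not part of any Retell API. It flags an agent segment that follows a silent gap and whose text begins with the text of the previous agent segment. The sample is the first turn from the reference call, with the last segment shortened.

```python
import re
from typing import NamedTuple


class Segment(NamedTuple):
    start: float
    end: float
    speaker: str
    text: str


# Matches lines such as "[ 14.45s-> 15.25s] agent: チェ".
# Wrapped continuation lines and gap markers do not match and are skipped.
LINE_RE = re.compile(r"\[\s*([\d.]+)s->\s*([\d.]+)s\]\s*(\w+):\s*(.*)")


def parse_segments(transcript: str) -> list[Segment]:
    segments = []
    for line in transcript.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            segments.append(Segment(float(m[1]), float(m[2]), m[3], m[4].strip()))
    return segments


def find_restarts(segments: list[Segment], min_gap_s: float = 0.5):
    """Return (partial_text, restarted_text, gap_seconds) for each suspected restart."""
    restarts = []
    for prev, cur in zip(segments, segments[1:]):
        if prev.speaker != "agent" or cur.speaker != "agent":
            continue
        gap = round(cur.start - prev.end, 2)
        # A restart: silence, then the agent begins again with the same opening text.
        if gap >= min_gap_s and cur.text.startswith(prev.text):
            restarts.append((prev.text, cur.text, gap))
    return restarts


SAMPLE = """\
[ 14.45s-> 15.25s] agent: チェ
[ 16.43s-> 18.43s] agent: チェックイ
[ 18.43s-> 24.62s] agent: ン時間のご変更でございますね。
"""

restarts = find_restarts(parse_segments(SAMPLE))
print(restarts)
```

This only catches restarts that are visible in the transcript text and timing. If the transcript shows the segments as contiguous, a check like this will not flag the turn even if the recording has an audible stop-and-restart.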

Our agent design (relevant context)

Conversation-flow based (not single-prompt). 5 nodes: greeting → rag_lookup (function) → rag_response (conversation) → anything_else (conversation) → farewell.
One custom tool: rag_lookup_<supplier_id> — POST to our backend’s RAG endpoint, 15s timeout.
Speak-during-execution enabled on the tool node.
language: “multi”, normalize_for_speech: false, enable_backchannel: false.
interruption_sensitivity: 0.5 (we’ve already tried tuning this).

What we’d like to know

Why does the agent start an utterance, stop partway through, and then replay the same utterance from the beginning when there is no user speech in the gap? It looks as if Retell is detecting a phantom interruption (or some other condition that aborts the stream) and then regenerating the full utterance for replay. We’ve already tried lowering interruption_sensitivity; the pattern reproduces at 0.5 and across both Cartesia and Minimax voices, so we don’t believe the root cause is voice-provider or interruption-sensitivity tuning.

shaw · April 28, 2026, 1:11pm

Hi @andrew3

Thank you for the details and your question. I’ve forwarded them to our team for review.

We’ll get back to you as soon as we have an update.

Best regards

andrew3 · April 30, 2026, 1:29am

Thanks! How soon can I expect a response? This issue is impacting the launch of our agents and we’d like to resolve it as soon as we can. If you have any quick ideas that we could try out, I’d be happy to test them.

shaw · April 30, 2026, 4:40pm

Hey @andrew3

It looks like the sentence is being interrupted by the tool invocation: the tool call returns before the agent finishes speaking. For fast tool calls, please consider disabling speech during execution and using a typing sound instead.
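
One rough way to check this explanation against your own logs is to compare when your backend's tool response arrived with the window in which the "please wait" utterance was playing. The Python sketch below is only an illustration: `FillerTurn` and `tool_returned_mid_speech` are hypothetical names, and the `tool_returned_s` values are made-up placeholders, not measurements from the reference call. Only the speech windows are taken from the transcript above.

```python
from dataclasses import dataclass


@dataclass
class FillerTurn:
    speech_start_s: float   # when the "please wait" utterance first started playing
    speech_end_s: float     # when the utterance finally finished
    tool_returned_s: float  # ASSUMPTION: taken from your own backend logs


def tool_returned_mid_speech(turn: FillerTurn) -> bool:
    """True if the tool response arrived while the utterance was still playing."""
    return turn.speech_start_s <= turn.tool_returned_s < turn.speech_end_s


turns = [
    # Speech windows from the transcript; tool_returned_s values are invented.
    FillerTurn(14.45, 24.62, tool_returned_s=15.20),  # fast lookup
    FillerTurn(51.65, 61.30, tool_returned_s=63.00),  # slow lookup
]

flags = [tool_returned_mid_speech(t) for t in turns]
print(flags)
```

If the turns with a stop-and-restart line up with the turns where the tool returned mid-speech, that would be consistent with this explanation; if they do not, the cause is likely something else.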

Thank you

andrew3 · April 30, 2026, 8:57pm

Thanks for your reply.

Isn’t it a bug if using “Talk while waiting” in a function node can result in this interruption and repeated speech? It happens only some of the time, even though the tool invocation always happens during the speech.

Thanks,
Andrew

shaw · May 4, 2026, 1:03pm

Hey @andrew3

Sorry for the delay. We’ve identified the root cause and will be rolling out a fix soon.

Thank you

andrew3 · May 5, 2026, 1:03am

Awesome! Thanks for the update.

Andrew
