User messages echoing, producing extra user messages

Phone calls have echoes producing extra user messages when there should only be one user message. Here is an example call: call_ad224e6e86f560ffb2de93a775f. In the transcript you can see that there is an extra “please” echo in a standalone user message that should not be there.

Transcript excerpt:
Agent
0:00
You have reached our team while we are assisting other patients or after hours. I am Hana our team’s AI new patient concierge. How may I help you?

User
0:14
Hi. I’d like to schedule a new patient appointment, please.

User
0:16
please.

The recording, however, does not have an extra “please”. This is causing issues where our AI agent responds to the second user message with just the echo rather than the original user message with the full content. Are there settings in Retell that can be adjusted to eliminate or reduce these echoes?
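While the settings are being tuned, one way to spot this pattern in exported transcripts is a small post-processing check. The following is only a sketch of a hypothetical heuristic, not a Retell feature: the `Turn` structure, the helper names, and the 3-second window are all assumptions made for illustration. It flags a user turn as a likely echo when it closely follows another user turn and merely repeats the tail of it. Note that it would also flag a caller who genuinely repeats their last word.

```python
from __future__ import annotations

from dataclasses import dataclass

_PUNCT = ".,!?;:\"'“”’"


@dataclass
class Turn:
    role: str     # "agent" or "user"
    start: float  # seconds from call start
    text: str


def _words(text: str) -> list[str]:
    # Lowercase and strip surrounding punctuation so "please." matches "please".
    stripped = (w.strip(_PUNCT).lower() for w in text.split())
    return [w for w in stripped if w]


def is_echo(prev: Turn, cur: Turn, max_gap_s: float = 3.0) -> bool:
    """Hypothetical heuristic: is `cur` just the tail of `prev` repeated?"""
    if prev.role != "user" or cur.role != "user":
        return False
    if cur.start - prev.start > max_gap_s:
        return False
    prev_w, cur_w = _words(prev.text), _words(cur.text)
    # An echo is shorter than the previous utterance and matches its ending.
    return 0 < len(cur_w) < len(prev_w) and prev_w[-len(cur_w):] == cur_w


def drop_echoes(turns: list[Turn]) -> list[Turn]:
    kept: list[Turn] = []
    for turn in turns:
        if kept and is_echo(kept[-1], turn):
            continue  # skip the suspected echo
        kept.append(turn)
    return kept
```

Applied to the excerpt above, the standalone “please.” turn at 0:16 would be dropped and the full request at 0:14 kept.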

Hi @frank.l

Background noise removal: Try switching to “Remove noise + background speech”. This more aggressive mode may help filter out the echoed audio that’s being picked up as a separate utterance.

Responsiveness: Lower this value so the agent waits longer before responding. This gives the system more time to consolidate the user’s full utterance, reducing the chance of a split transcript triggering a premature response.

Interruption Sensitivity: Lower this to make the agent more resilient to brief audio artifacts being interpreted as new user turns.

These settings are all available in your agent’s Speech Settings panel. See /build/handle-background-noise and /build/single-multi-prompt/configure-basic-settings in the docs for details.
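If you manage agents through the API rather than the dashboard, the same two sliders could be expressed as a small settings payload. This is a minimal sketch under assumptions: the field names `responsiveness` and `interruption_sensitivity`, the 0 to 1 range, and the example values are illustrative, so please check them against the API reference before use. It only builds and validates the payload and does not call any endpoint.

```python
import json


def build_speech_settings(responsiveness: float, interruption_sensitivity: float) -> dict:
    """Return a settings payload after checking both values are in [0, 1]."""
    settings = {
        "responsiveness": responsiveness,                      # assumed field name
        "interruption_sensitivity": interruption_sensitivity,  # assumed field name
    }
    for name, value in settings.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return settings


# Illustrative values only: both set fairly low, as suggested above.
payload = build_speech_settings(responsiveness=0.5, interruption_sensitivity=0.3)
print(json.dumps(payload, indent=2))
```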

If you still face the issue, let us know here.

Thank You

@mark1 What do you recommend for interruption sensitivity? We have this set to 0.4 currently. And as for the response eagerness, those values range from 0 to 1. How many ms of latency do those values correspond to? And how does the “dynamically adjust response eagerness based on user input” work? Is that completely dynamic or does the value we set it to still matter? And what specifically causes the responsiveness to adjust dynamically?

Hi @frank.l

Interruption Sensitivity: The recommended setting for noisy environments is 0.8 (per /build/handle-background-noise), so your current 0.4 is already quite low. Lower values make the agent harder to interrupt, and therefore more resilient to background noise and echoes, while higher values make it easier to interrupt. At 0.4 the agent should already be fairly resilient, but you could try lowering it further if echoes are still triggering interruptions.

Responsiveness ms mapping: A lower value means a less responsive agent (it waits longer and responds more slowly), while a higher value means faster exchanges (it responds as soon as it can).

Dynamic Responsiveness: When “Dynamically adjust based on user input” is enabled, the agent observes how quickly the user speaks and adjusts accordingly: slower speakers get more patient response timing, while faster speakers get quicker responses. It also factors in past turn-taking behavior in the call. The API field (enable_dynamic_responsiveness) is a separate boolean from the responsiveness value, which suggests that the value you set likely serves as a baseline that is then adjusted dynamically.
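To make the "separate boolean" point concrete, here is a tiny sketch of how the two settings might sit side by side in an agent configuration. Only `enable_dynamic_responsiveness` is named above; the `responsiveness` key and the value 0.5 are assumptions for illustration, and whether the set value really acts as a baseline is an inference, not confirmed behavior.

```python
import json

agent_settings = {
    "responsiveness": 0.5,                  # illustrative baseline; field name assumed
    "enable_dynamic_responsiveness": True,  # separate on/off flag
}
print(json.dumps(agent_settings, indent=2))
```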

Thank You