🚨 press_digit does not trigger node transitions — flow stalls unless user speaks

Hey all — running into what looks like a hard platform limitation or recent behavior change, and hoping others can confirm or share a workaround.

Setup

  • Tools: Google Voice -> Open Phone -> Retell

  • I forward calls from Google Voice to Open Phone, because Google Voice requires a linked phone number to forward calls. Open Phone then forwards to Retell. Before the call connects, Google Voice forces the user to press 1 to prove that it’s not voicemail

  • Flow uses a Press Digit node (press_digit) to auto-send DTMF (e.g. press 1)

  • This is to bypass IVRs like Google Voice’s “press 1 to confirm you’re not voicemail”

  • After the digit is pressed, the flow is supposed to transition to a call transfer node

What’s happening

  • The press_digit tool executes successfully

  • Logs show {"status":"digit pressed"}

  • No errors

  • No transition fires

  • The call just sits silently and then ends

The flow only continues if the user says something after the digit is pressed.

Key observation

It appears that:

  • press_digit does not create a dialogue turn

  • Retell only evaluates transitions on speech input

  • Tool completion alone does not advance the state machine

This means a node that only presses a digit is effectively a terminal state unless the user speaks.
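
To make the observation concrete, here is a toy model of what I think is happening. This is purely my guess at the behavior, not Retell's implementation; the class, method names, and node names are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFlow:
    """Hypothetical model of the observed behavior, not Retell's code."""
    node: str = "press_digit"
    tool_results: list = field(default_factory=list)

    def on_tool_result(self, result: dict) -> None:
        # The tool result is recorded, but no transition check runs here.
        self.tool_results.append(result)

    def on_user_done_speaking(self, transcript: str) -> None:
        # Transition conditions are only evaluated on this event.
        if self.node == "press_digit" and self.tool_results:
            self.node = "transfer_call"


flow = ToyFlow()
flow.on_tool_result({"status": "digit pressed"})
print(flow.node)  # still "press_digit": the digit press alone moves nothing
flow.on_user_done_speaking("hello?")
print(flow.node)  # "transfer_call": speech is what triggers the check
```

If this model is right, the fix on the platform side would be to also run the transition check when a tool call completes.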

Why this is blocking

  • You cannot add speech after the Press Digit node because the flow never transitions

  • You cannot auto-transfer after a digit press

  • Silent IVR-bypass flows are impossible

  • This used to work (or at least appeared to), so this may be a recent change

Questions for the community / Retell team

  1. Is this expected behavior? This has been working fine for me for 3 months.

  2. Has anyone had this problem and resolved it?

  3. Has there been a recent change to how tool calls interact with transitions?

Right now it seems that DTMF-only flows are unsupported, which makes Retell unusable for IVR fast-pickup scenarios unless speech is injected before the digit press.

Would love confirmation, guidance, or an official recommendation here. Thanks!

Based on the documentation, I can confirm this is expected behavior for Press Digit nodes in Conversation Flow.

When transitions happen in Press Digit nodes

Transitions in Press Digit nodes occur only when the user is done speaking(1). The documentation explicitly states: “Transition happens when user is done speaking”(1).

Why your flow is stuck

The Press Digit node evaluates whether to press a digit every time the user finishes speaking, and if a digit press is needed, it infers the specific digit and presses it(1). However, the node itself does not speak—the agent remains silent while in this node(1).

Since transitions require user speech to trigger, and your IVR scenario involves pressing a digit without any user interaction, the flow cannot advance automatically after the digit is pressed.

Documented transition conditions

The documentation recommends writing conditions like “Reached xx department” or “Menu repeated xx times” to handle transitions(1). These conditions rely on capturing what the user or IVR system says after the digit press.

Configuration options

You can configure a Detection Delay (recommended at 1 second) to ensure the whole IVR menu is captured before the agent makes a decision(1). This helps with slow-speaking IVR systems but doesn’t solve the silent transition issue.

Workaround consideration

Since the documentation doesn’t describe a way to auto-transition without user speech, you may need to restructure your flow so that either:

  • The IVR system provides audio feedback after the digit press (which would trigger the transition)
  • You use a different approach, such as the digit_to_press parameter in response events(2) if you’re working with Custom LLM via WebSocket
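
For the Custom LLM option, a minimal sketch of a response event carrying a digit might look like the following. Only digit_to_press comes from the documentation cited above(2); the other field names (response_type, response_id, content, content_complete) and the helper name are assumptions here and should be checked against the WebSocket reference before use.

```python
import json

DTMF_KEYS = "0123456789*#"


def build_digit_response(response_id: int, digit: str) -> str:
    """Serialize a response event asking the platform to press one DTMF key.

    Hypothetical helper: every field except digit_to_press is an assumption.
    """
    if len(digit) != 1 or digit not in DTMF_KEYS:
        raise ValueError(f"not a single DTMF key: {digit!r}")
    event = {
        "response_type": "response",  # assumed field name and value
        "response_id": response_id,   # assumed: echoes the request's id
        "content": "",                # stay silent while pressing the digit
        "content_complete": True,     # assumed: marks the response finished
        "digit_to_press": digit,      # parameter named in the docs(2)
    }
    return json.dumps(event)


# Example: the payload a custom LLM server would send over the WebSocket.
print(build_digit_response(1, "1"))
```

Because your own server decides when to send this event, it does not have to wait for a speech turn, which is the piece the Press Digit node appears to be missing.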

This appears to be a fundamental design constraint rather than a recent change.

That’s just not the case. We’ve been running this flow fine for 3 months; something changed in the last 24 hours.

@AI-Evy

Transitions in Press Digit nodes occur only when the user is done speaking(1). The documentation explicitly states: “Transition happens when user is done speaking”(1).

This is not what it says. It says when the IVR is done speaking. This has been working perfectly for 3 months.

Hello @MedCalls_AI

So this flow was working before, but now it’s not.

Can you share a call ID from before the issue and one from after?

It was never working well at all. I would say that Google Voice is the only VoIP service not compatible with Retell, or probably with anything. We’re just migrating our customers off Google Voice.