Original Problems
Problem 1: After the caller pressed 10 digits, the system went silent and didn’t proceed until the caller spoke (e.g., said “Hello?”).
Problem 2: All 10 DTMF tones audibly played back to the caller before the agent responded.
Root Cause We Discovered
The Press Digit node is designed for the agent to navigate an IVR (press 1 for sales, etc.) — NOT for capturing user keypad input. Press Digit nodes wait for a speech turn boundary before advancing, which caused:
- The flow to hang silently after DTMF input
- Audio buffering that flushed when the caller finally spoke
The Solution: Remove Press Digit Nodes Entirely
Retell’s agent-level User DTMF settings already capture keypad input in ANY node. The digits are stored in {{user_dtmf_input}}. We don’t need dedicated Press Digit nodes.
Step-by-Step Implementation
Step 1: Enable Agent-Level DTMF Settings
In Agent Settings → Call Settings → User DTMF Options:
| Setting | Value |
|---|---|
| User Keypad Input Detection | ON |
| Digit Limit | 10 |
| Timeout | 4000ms |
| Termination Key | OFF (optional) |
Step 2: Remove Press Digit Nodes from Flow
Delete or disconnect any Press Digit nodes that were being used to “capture” user phone numbers. These nodes cause the hanging/buffering issue.
Step 3: Create a Branch Node to Check for DTMF Input
After the node that asks for the phone number, add a Branch node called CHECK DTMF INPUT.
Edge 1: DTMF_VALID
{{user_dtmf_input}} contains exactly 10 digits. Keypad input is complete.
→ Routes to: FORMAT DTMF NUMBER node
Edge 2: NO_DTMF (Else)
{{user_dtmf_input}} is empty, missing, or does not contain 10 digits. Caller likely spoke the number.
→ Routes to: Extract Variables node (for voice input)
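For anyone who wants to sanity-check the branch condition outside Retell, here is a minimal Python sketch of the same decision. The function name `check_dtmf_input` is our own illustration and not part of Retell; in the real flow the Branch node evaluates the edge descriptions above, so treat this only as a model of the intended logic.

```python
import re
from typing import Optional


def check_dtmf_input(user_dtmf_input: Optional[str]) -> str:
    """Return the name of the edge the CHECK DTMF INPUT branch should take.

    Hypothetical helper: it mirrors the two edge conditions, nothing more.
    """
    digits = (user_dtmf_input or "").strip()
    # DTMF_VALID only when the value is exactly 10 ASCII digits.
    if re.fullmatch(r"[0-9]{10}", digits):
        return "DTMF_VALID"
    # Empty, missing, or any other length: assume the caller spoke the number.
    return "NO_DTMF"


print(check_dtmf_input("2896003518"))  # DTMF_VALID
print(check_dtmf_input(""))            # NO_DTMF
```

The Else edge deliberately catches everything that is not a clean 10-digit string, so partial keypad input falls through to the voice path instead of being formatted.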
Step 4: Create FORMAT DTMF NUMBER Node
Node Type: Extract Dynamic Variables
Variable Name: phone_number
Variable Description:
Take the DTMF input and format it as E.164.
The caller entered these digits via keypad: {{user_dtmf_input}}
Simply add +1 prefix to the digits.
OUTPUT: +1 followed by the 10 digits exactly as entered.
Example:
- Input: 2896003518
- Output: +12896003518
Do not change, rearrange, or interpret the digits. Just add +1 prefix.
→ Routes to: Confirm Number node
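The formatting this node is asked to do is plain string concatenation. A minimal Python sketch of the expected result is below; `format_dtmf_number` is a hypothetical name used only for illustration, and in our flow the Extract Dynamic Variables node does this from the description above rather than from code.

```python
import re


def format_dtmf_number(user_dtmf_input: str) -> str:
    """Prefix a 10-digit keypad entry with +1, leaving the digits untouched.

    Hypothetical helper that models the FORMAT DTMF NUMBER node's output.
    """
    digits = user_dtmf_input.strip()
    # Refuse anything that is not exactly 10 digits rather than guess.
    if not re.fullmatch(r"[0-9]{10}", digits):
        raise ValueError(f"expected exactly 10 digits, got {digits!r}")
    return "+1" + digits


print(format_dtmf_number("2896003518"))  # +12896003518
```

Comparing the node's actual output against this during testing is a quick way to confirm that the digits were not rearranged or reinterpreted.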
Step 5: Update Validation Branch for E.164 Format
If you have a validation branch that checks the phone number, update the conditions to accept E.164 format:
VALID edge:
The extracted phone_number is in E.164 format (+1 followed by exactly 10 digits, e.g., +12896003518) OR is exactly 10 digits without prefix. Not UNKNOWN, not empty. Proceed to confirm.
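The VALID condition can be written as a single pattern. The sketch below is our own illustration in Python (the names `PHONE_PATTERN` and `is_valid_phone_number` are hypothetical); the Retell branch itself works from the edge description, not from a regular expression.

```python
import re
from typing import Optional

# Optional "+1" prefix followed by exactly 10 digits.
PHONE_PATTERN = re.compile(r"(\+1)?[0-9]{10}")


def is_valid_phone_number(phone_number: Optional[str]) -> bool:
    """Return True for +1XXXXXXXXXX or a bare 10-digit number."""
    if not phone_number:
        # Covers both a missing value and an empty string.
        return False
    # A placeholder such as "UNKNOWN" fails the pattern and is rejected.
    return PHONE_PATTERN.fullmatch(phone_number.strip()) is not None


print(is_valid_phone_number("+12896003518"))  # True
print(is_valid_phone_number("2896003518"))    # True
print(is_valid_phone_number("UNKNOWN"))       # False
```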
Step 6: Keep Voice Path for Spoken Numbers
The existing Extract Variables node handles callers who speak their number instead of pressing digits. The Branch node routes voice input through this path automatically.
Final Flow Structure
ASK FOR PHONE NUMBER (Conversation node)
"What's the best number to reach you at? You can press the digits on your keypad, or say them one at a time — whichever is easier."
        ↓
CHECK DTMF INPUT (Branch node)
        ↓
├── DTMF_VALID (10 digits in {{user_dtmf_input}})
│       ↓
│   FORMAT DTMF NUMBER (Extract Variables: adds +1 prefix)
│       ↓
│   CONFIRM NUMBER
│
└── NO_DTMF (Else: voice input)
        ↓
    EXTRACT NUMBER (Extract Variables: parses spoken digits)
        ↓
    VALIDATE EXTRACTED NUMBER (Branch)
        ↓
    CONFIRM NUMBER
Why This Works
- Agent-level DTMF settings capture keypad input automatically, so no Press Digit node is needed
- The Branch node checks {{user_dtmf_input}} and routes accordingly
- The flow advances immediately after 10 digits, with no waiting for a speech turn
- No audio buffering, because there’s no “stuck” state
- Both input methods are supported: keypad and voice work seamlessly
Results
- Caller presses 10 digits → flow advances immediately → agent confirms number
- No more silent hanging
- No more DTMF tone playback
- Smooth UX for both keypad and voice input
I hope this helps other users who encounter the same issue. The key insight is that Press Digit nodes are for agent IVR navigation, not user input capture.
Thanks for your support in helping us troubleshoot this!