DTMF Support: Even After 10 Digits Are Typed, Agent Stays Silent. I've Confirmed All Settings

Question 1: “I have Digit Limit set to 10. After the caller presses exactly 10 digits, the system goes silent and does not proceed to the next node. The agent only responds after the caller speaks (e.g., says ‘Hello?’). The DTMF tool invocation shows in logs, but the next node doesn’t execute until there’s voice input. Why isn’t the system automatically proceeding after detecting 10 digits?”

Question 2: “After the pause, all 10 DTMF tones are audibly played back to the caller before my agent speaks. The caller hears beeping sounds of their key presses being echoed. How do I disable this DTMF tone playback?”

Response to Question 1: System Not Proceeding After 10 Digits

Based on the available documentation, Retell AI supports three methods for determining when DTMF input is complete: Digit Limit, Termination Key, and Timeout(1).

When you set digit_limit to 10, the system should consider input complete once 10 digits are received(1). However, the documentation indicates that DTMF input is “captured and is taken into account when the agent is generating responses by default”(1). This suggests the agent processes DTMF as part of its conversational flow rather than as an immediate trigger.

The behavior you’re experiencing—where the agent waits for voice input before proceeding—aligns with how Retell AI handles DTMF by default. The system captures the digits but may require the agent to be prompted to generate a response(1).

To configure DTMF completion options, you can set these parameters under Call Settings(1):

"user_dtmf_options": {
  "digit_limit": 10,
  "termination_key": "#",
  "timeout_ms": 2000
}

(1), (2)

The timeout_ms parameter specifies “the time in milliseconds to wait after the last digit is received before timing out. The timer resets with each new digit”(1). Adjusting this timeout might help the system proceed more quickly after the 10th digit.
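To make the three completion rules concrete, here is a minimal Python sketch of how digit limit, termination key, and a resetting timeout could interact. It is a toy model, not Retell's implementation: the DtmfCollector class, its method names, and the choice to leave the termination key out of the stored digits are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DtmfCollector:
    """Toy model of the three completion rules (hypothetical, not Retell code)."""
    digit_limit: Optional[int] = 10
    termination_key: Optional[str] = "#"
    timeout_ms: int = 2000
    digits: str = ""
    last_press_ms: Optional[int] = None

    def press(self, key: str, now_ms: int) -> bool:
        """Record a key press; return True if input is now complete."""
        self.last_press_ms = now_ms  # the timeout timer resets on every press
        if self.termination_key is not None and key == self.termination_key:
            return True  # assumption: the termination key itself is not stored
        self.digits += key
        return self.digit_limit is not None and len(self.digits) >= self.digit_limit

    def timed_out(self, now_ms: int) -> bool:
        """True once timeout_ms has elapsed since the most recent key press."""
        if self.last_press_ms is None:
            return False
        return now_ms - self.last_press_ms >= self.timeout_ms
```

In this model, input ends on whichever rule fires first, so a shorter timeout only matters when the caller stops before reaching the digit limit.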

Additionally, ensure your agent’s prompt explicitly instructs it to process the DTMF input immediately. The documentation provides this example prompt: “Please enter your PIN number using the keypad. You can finish by pressing the pound key”(1).


Response to Question 2: DTMF Tones Being Played Back

I don’t have specific information in the available sources about disabling DTMF tone playback to the caller after digits are entered. The documentation covers how to capture DTMF input(1) and configure user_dtmf_options(2)(3), but doesn’t explicitly address preventing the echo or playback of DTMF tones.

This behavior might be related to your telephony provider configuration rather than Retell AI settings. I’d recommend checking:

  • Your SIP trunk or telephony provider settings
  • Any audio mixing or echo cancellation configurations

Would you like me to provide a general answer based on common telephony practices?


What is the variable name that stores DTMF keypad input so I can reference it in my Extract Variables node or other nodes?

For example, is it {{digit}}, {{dtmf_input}}, {{user_dtmf}}, or something else?

I need to extract the phone number from keypad input, not just voice input.

DTMF Variable Name for Extracting Keypad Input

Based on the documentation, the variable name that stores DTMF keypad input is {{user_dtmf_input}}(1).

How to Use DTMF Input in Your Flow

There are two ways to capture and use keypad input from callers:

Option 1: Save as a Dynamic Variable (Recommended)

After the caller enters digits, use the Extract Dynamic Variable node to store the input(2). Once saved, you can reference it anywhere in your conversation using {{variable_name}}(2).

For example, to extract a phone number from keypad input:

  1. Add an Extract Dynamic Variable node(3)

  2. Create a variable with:

    • Variable Name: phone_number
    • Description: “The phone number entered via keypad”
    • Variable Type: Text or Number(3)
  3. Reference it later using {{phone_number}}

Option 2: Use Automatically Captured Input

DTMF input is captured automatically and added to the call transcript by default(2). It appears in the transcript like this:

[User pressed digits: 1234#]

(2)

The agent can reference this value directly when generating responses without extra setup(2).
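If you post-process transcripts yourself, the marker shown above can be pulled out with a regular expression. This is a sketch that assumes the marker always has exactly the format in the example; the function name last_pressed_digits is hypothetical, and stripping a trailing # is an illustrative choice rather than documented behavior.

```python
import re
from typing import Optional

# Matches the transcript marker shown above, e.g. "[User pressed digits: 1234#]".
PRESSED_DIGITS = re.compile(r"\[User pressed digits: ([0-9*#]+)\]")


def last_pressed_digits(transcript: str) -> Optional[str]:
    """Return the most recent keypad entry in the transcript, or None."""
    matches = PRESSED_DIGITS.findall(transcript)
    if not matches:
        return None
    # Assumption: a trailing '#' is a termination key, not part of the value.
    return matches[-1].rstrip("#")
```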

Transition Conditions

You can also use {{user_dtmf_input}} in transition conditions. For example:

{{user_dtmf_input}} == "1"

(1)

This checks if the user pressed the digit “1”.

Best Practice

If you need the digits later in the flow for validation, API calls, or confirmations, Option 1 is recommended so the value is reliably stored and reusable(2).


I wanted to share the solution we found for the DTMF issues I reported. This might help other users facing the same problems.


Original Problems

Problem 1: After the caller pressed 10 digits, the system went silent and didn’t proceed until the caller spoke (e.g., said “Hello?”).

Problem 2: All 10 DTMF tones audibly played back to the caller before the agent responded.


Root Cause We Discovered

The Press Digit node is designed for the agent to navigate an IVR (press 1 for sales, etc.) — NOT for capturing user keypad input. Press Digit nodes wait for a speech turn boundary before advancing, which caused:

  1. The flow to hang silently after DTMF input

  2. Audio buffering that flushed when the caller finally spoke


The Solution: Remove Press Digit Nodes Entirely

Retell’s agent-level User DTMF settings already capture keypad input in ANY node. The digits are stored in {{user_dtmf_input}}. We don’t need dedicated Press Digit nodes.


Step-by-Step Implementation

Step 1: Enable Agent-Level DTMF Settings

In Agent Settings → Call Settings → User DTMF Options:

Setting                        Value
User Keypad Input Detection    ON
Digit Limit                    10
Timeout                        4000 ms
Termination Key                OFF (optional)

Step 2: Remove Press Digit Nodes from Flow

Delete or disconnect any Press Digit nodes that were being used to “capture” user phone numbers. These nodes cause the hanging/buffering issue.


Step 3: Create a Branch Node to Check for DTMF Input

After the node that asks for the phone number, add a Branch node called CHECK DTMF INPUT.

Edge 1: DTMF_VALID

{{user_dtmf_input}} contains exactly 10 digits. Keypad input is complete.

→ Routes to: FORMAT DTMF NUMBER node

Edge 2: NO_DTMF (Else)

{{user_dtmf_input}} is empty, missing, or does not contain 10 digits. Caller likely spoke the number.

→ Routes to: Extract Variables node (for voice input)
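The two edge conditions amount to a simple check. The Python sketch below only illustrates that logic; in the flow itself the Branch node evaluates the conditions, and the function name route_after_phone_prompt is hypothetical.

```python
from typing import Optional


def route_after_phone_prompt(user_dtmf_input: Optional[str]) -> str:
    """Return the edge name the Branch node would take for this keypad input."""
    digits = (user_dtmf_input or "").strip()
    # DTMF_VALID only when the value is exactly ten ASCII digits.
    if len(digits) == 10 and digits.isascii() and digits.isdigit():
        return "DTMF_VALID"
    # Anything else (empty, missing, wrong length) falls through to the voice path.
    return "NO_DTMF"
```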


Step 4: Create FORMAT DTMF NUMBER Node

Node Type: Extract Dynamic Variables

Variable Name: phone_number

Variable Description:

Take the DTMF input and format it as E.164.

The caller entered these digits via keypad: {{user_dtmf_input}}

Simply add +1 prefix to the digits.

OUTPUT: +1 followed by the 10 digits exactly as entered.

Example: 
- Input: 2896003518
- Output: +12896003518

Do not change, rearrange, or interpret the digits. Just add +1 prefix.

→ Routes to: Confirm Number node
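For comparison, the same formatting can be done deterministically in code, for example if you later move this step into your own backend. That relocation is an assumption; in our flow the Extract Dynamic Variables node does the work. The function name format_e164_us is hypothetical.

```python
def format_e164_us(digits: str) -> str:
    """Prefix ten keypad digits with +1, leaving the digits unchanged."""
    if len(digits) != 10 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected exactly 10 digits")
    return "+1" + digits


print(format_e164_us("2896003518"))  # +12896003518
```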


Step 5: Update Validation Branch for E.164 Format

If you have a validation branch that checks the phone number, update the conditions to accept E.164 format:

VALID edge:

The extracted phone_number is in E.164 format (+1 followed by exactly 10 digits, e.g., +12896003518) OR is exactly 10 digits without prefix. Not UNKNOWN, not empty. Proceed to confirm.
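The VALID condition can be expressed as a single pattern. This sketch assumes North American numbers only, as in the example above; the name is_valid_phone is hypothetical and the check is not part of Retell itself.

```python
import re
from typing import Optional

# Optional "+1" prefix followed by exactly ten ASCII digits.
VALID_PHONE = re.compile(r"(?:\+1)?[0-9]{10}")


def is_valid_phone(value: Optional[str]) -> bool:
    """True for +1XXXXXXXXXX or a bare ten-digit number; False otherwise."""
    if not value:
        return False  # covers None and the empty string
    return VALID_PHONE.fullmatch(value) is not None
```

A value such as UNKNOWN fails the pattern, so it is rejected without a separate check.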

Step 6: Keep Voice Path for Spoken Numbers

The existing Extract Variables node handles callers who speak their number instead of pressing digits. The Branch node routes voice input through this path automatically.


Final Flow Structure

ASK FOR PHONE NUMBER (Conversation node)
"What's the best number to reach you at? You can press the digits on your keypad, or say them one at a time — whichever is easier."
       ↓
CHECK DTMF INPUT (Branch node)
       ↓
  ├── DTMF_VALID (10 digits in {{user_dtmf_input}})
  │         ↓
  │   FORMAT DTMF NUMBER (Extract Variables — adds +1 prefix)
  │         ↓
  │   CONFIRM NUMBER
  │
  └── NO_DTMF (Else — voice input)
            ↓
      EXTRACT NUMBER (Extract Variables — parses spoken digits)
            ↓
      VALIDATE EXTRACTED NUMBER (Branch)
            ↓
      CONFIRM NUMBER

Why This Works

  1. Agent-level DTMF settings capture keypad input automatically — no Press Digit node needed

  2. Branch node checks {{user_dtmf_input}} and routes accordingly

  3. Flow advances immediately after 10 digits — no waiting for speech turn

  4. No audio buffering because there’s no “stuck” state

  5. Both input methods supported — keypad and voice work seamlessly


Results

  • Caller presses 10 digits → flow advances immediately → agent confirms number

  • No more silent hanging

  • No more DTMF tone playback

  • Smooth UX for both keypad and voice input


I hope this helps other users who encounter the same issue. The key insight is that Press Digit nodes are for agent IVR navigation, not user input capture.

Thanks for your support in helping us troubleshoot this!