Hello support! When creating an outbound call via the API, there seems to be quite a long delay in establishing the connection to our agent.
Our agent is set to “user speaks first”. What we’re seeing in practice is that the user has to say “Hello” 2-3 times (about 6 seconds) before our agent / transcriber picks it up. This is hurting our conversation success rate.
An example call is here: call_3351a914d3f54b7d132038f6814
The AI chat was unhelpful on this topic, so I’m hoping for more guidance on how to make the connection feel like a real phone call.
Switch to “Agent Speaks First” (recommended for outbound calls): This is the most natural behavior, since someone who picks up an outbound call expects the caller to identify themselves immediately. You can change this in your LLM settings by setting start_speaker to "agent" and providing a begin_message. This eliminates the silent period entirely.
If you prefer to keep “User Speaks First”, reduce begin_after_user_silence_ms to a shorter value, such as 1500–2000 ms. This shortens the fallback window so the agent jumps in faster if the user’s initial speech isn’t immediately recognized.
Additionally, your agent uses the most aggressive background noise cancellation mode (noise-and-background-speech-cancellation), which may contribute to slower initial speech detection. You could try switching to the standard noise-cancellation mode to see if the user’s “Hello” is picked up faster.
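For reference, the options above could be expressed roughly as the settings payloads below. This is only a sketch that builds plain dictionaries and sends nothing. The field names start_speaker, begin_message and begin_after_user_silence_ms and the two mode names come from this reply. The "user" value for start_speaker, the denoising_mode key, and all three helper function names are assumptions made for illustration, so please check them against the API reference before using them.

```python
def agent_first_llm_settings(begin_message: str) -> dict:
    """Option 1: the agent opens the call with a fixed greeting."""
    if not begin_message.strip():
        raise ValueError("begin_message must not be empty when the agent speaks first")
    return {"start_speaker": "agent", "begin_message": begin_message}


def user_first_llm_settings(silence_ms: int = 1500) -> dict:
    """Option 2: keep user-speaks-first but shorten the fallback window."""
    if silence_ms <= 0:
        raise ValueError("silence_ms must be a positive number of milliseconds")
    return {
        "start_speaker": "user",  # assumed value; only "agent" is quoted above
        "begin_after_user_silence_ms": silence_ms,
    }


def standard_noise_cancellation_settings() -> dict:
    """Agent-level change: use the standard mode instead of the aggressive one."""
    # "denoising_mode" is an assumed key name; the mode string is quoted above.
    return {"denoising_mode": "noise-cancellation"}


if __name__ == "__main__":
    # Placeholder greeting, purely illustrative.
    print(agent_first_llm_settings("Hi, this is the assistant calling."))
    print(user_first_llm_settings(1500))
    print(standard_noise_cancellation_settings())
```

Only one of the two LLM payloads would be applied at a time. The noise cancellation change is independent and can be combined with either.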
Thanks, I’ll update those settings and have the agent jump in faster.
Btw, agent-speaks-first is not the natural behavior on an outbound call. The user picks up the phone, says “hello?”, and then the conversation begins. They don’t expect an agent response before they have had a chance to say something. Hope this feedback helps make the outbound flow more successful.