Warm transfer via custom LLM

Hi everyone, hope you’re doing well!

I’m writing to ask if you currently support warm transfers via custom LLM. I didn’t see this in the documentation, though I noticed it is supported for Retell’s native agents.

If it isn’t currently available for custom LLM setups, do you have plans to support it soon?

Hey @emilia

I’m confirming with product engineering whether this is supported.

OK! Looking forward to it!

Hey @emilia

We do not support warm transfer with custom LLM. Warm-transfer handoff is only available when you use Retell LLM, Conversation Flow, or Multi-Prompt agents. The workaround is to see if you can use a Conversation Flow or Retell LLM agent for the leg that performs the transfer. I’ve logged warm-transfer support on Custom LLM as a feature request internally. Please let us know if you have any further questions.

Thank You

Thanks for the earlier reply about warm transfer not being supported on Custom LLM and your suggestion to use a Conversation Flow or Retell LLM agent for the leg that performs the transfer. We’re planning the workaround now and need confirmation on a few specifics before we build it.

Our intended flow:

  1. The inbound caller stays on our Custom LLM agent (“leg 1”) for the whole conversation.
  2. When warm transfer is needed, our backend creates a second outbound Retell call to the human (“leg 2”), using a separate Retell-native agent (Conversation Flow) that we’ve set up specifically for the briefing + handoff. We pass the briefing and the original caller’s phone number as retell_llm_dynamic_variables, plus correlation IDs in metadata.
  3. The transfer agent on leg 2 reads the briefing to the human, asks accept/decline, then on accept invokes the native transfer_call node targeting the original caller’s phone number.
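
For concreteness, step 2 could look roughly like the sketch below. It only builds the request body for the leg 2 call and does not send anything. The field names (from_number, to_number, override_agent_id, retell_llm_dynamic_variables, metadata) are assumptions based on our reading of the create-phone-call API and should be checked against the current reference. The helper name, variable keys, phone numbers, and IDs are all hypothetical placeholders.

```python
import json


def build_leg2_payload(from_number, human_number, transfer_agent_id,
                       briefing, caller_number, leg1_call_id):
    """Build the request body for the outbound briefing call (leg 2).

    Hypothetical helper. Field names are assumptions based on our reading
    of the create-phone-call API, not something confirmed in this thread.
    """
    return {
        "from_number": from_number,
        "to_number": human_number,
        # Assumed field for selecting the Retell-native transfer agent.
        "override_agent_id": transfer_agent_id,
        # Values are cast to strings so they can be substituted into prompts.
        "retell_llm_dynamic_variables": {
            "briefing": str(briefing),
            "caller_number": str(caller_number),
        },
        # Correlation IDs so our backend can match leg 2 back to leg 1.
        "metadata": {
            "leg1_call_id": leg1_call_id,
            "role": "warm_transfer_leg2",
        },
    }


# Placeholder values for illustration only.
payload = build_leg2_payload(
    from_number="+15550100001",
    human_number="+15550100002",
    transfer_agent_id="agent_transfer_placeholder",
    briefing="Caller wants to discuss a billing dispute.",
    caller_number="+15550100003",
    leg1_call_id="call_leg1_placeholder",
)
print(json.dumps(payload, indent=2))
```

Keeping the correlation ID in metadata rather than in the dynamic variables is a design choice on our side: the transfer agent only needs the briefing and the caller's number, while the leg 1 call ID is only for our backend.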

Questions:

  1. Bridging leg 1 and leg 2. When the Retell-native agent on leg 2 invokes transfer_call with the original caller’s E.164 (the from_number of leg 1, which arrived inbound on our Retell phone number), does it actually bridge into the existing leg 1 call, or does it place a separate outbound call to that number (resulting in two parallel calls to the caller)?
    • If bridging works: is it SIP REFER on leg 2 to the caller’s number, or do you correlate via some call ID we should pass on phone_call.create?
    • If bridging does not work that way: what is the correct primitive for connecting two Retell-controlled calls? Is there a bridge_calls / merge API, or a different recommended workaround?
  2. Caller audio during setup. While leg 2 is being dialed and the human is being briefed (anywhere from a few seconds up to ~30s if the human is slow to pick up), leg 1’s Custom LLM WebSocket is still open. What does the caller actually hear during this window?
    • Will Retell play hold audio / ambient_sound automatically on a Custom-LLM leg if we send silent (empty-content) ResponseResponse frames?
    • Or do we need to keep streaming reassurance phrases (“one moment…”) from the Custom LLM side at intervals to prevent the caller from hearing dead air?
    • Is there a way to put the Custom-LLM leg into an explicit “hold” state via the WS protocol?
  3. Leg 1 WS lifecycle at bridge completion. Once transfer_call on leg 2 succeeds and the human is bridged to the caller, what happens to leg 1’s Custom LLM WebSocket? Does Retell close it (and if so, with what disconnection reason)? Or does it remain open as a silent observer until one of the bridged parties hangs up? We need this to know when our backend should treat the call as ended.
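
To make question 2 concrete, these are the two kinds of frames we have in mind on leg 1. This is only a sketch that builds the JSON without opening a WebSocket. The field names (response_type, response_id, content, content_complete) are assumed from our reading of the Custom LLM WebSocket protocol, and response_frame is a hypothetical helper of ours. Whether either frame is valid while no response has been requested is part of what we are asking.

```python
import json


def response_frame(response_id, content, content_complete=True):
    """Build a ResponseResponse-style frame as a plain dict.

    Hypothetical helper. Field names are assumptions from our reading of
    the Custom LLM WebSocket protocol.
    """
    return {
        "response_type": "response",
        "response_id": response_id,
        "content": content,
        "content_complete": content_complete,
    }


# Option A: a silent frame with empty content (does the caller hear hold audio?).
silent = response_frame(7, "")

# Option B: a reassurance phrase we would resend at intervals while leg 2 is set up.
reassure = response_frame(7, "One moment, I'm connecting you now.")

# These strings are what we would send over the leg 1 WebSocket.
silent_json = json.dumps(silent)
reassure_json = json.dumps(reassure)
print(silent_json)
print(reassure_json)
```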