Hey @rodrigo.christiansen, both symptoms (the JSON / commentary to=functions.* tokens and the CJK bursts) look like the same underlying failure mode.
What we found on the sampled calls (e.g. call_abdd66c3af9f8d4c96447c98671, call_99bc4283d49e935097c0b675dbb):
They ran on gpt-5.4 via OpenAI’s Responses API. The strings being spoken aloud ({"tool_uses":[{"recipient_name":"functions.transition_to_gather_info"...}]} and commentary to=functions.transition_to_gather_info) are OpenAI’s internal tool-call channel markers. They should land in the structured tool-call output, but the model is occasionally emitting them as plain assistant text, so TTS reads them out.
The transition_to_<state> names match how our stateful-LLM engine auto-names transitions between your states, so that part is expected — the bug is that the model is verbalizing the protocol token instead of invoking the tool.
The Chinese-character bursts (大发彩票 / 天天送彩票) co-occur in the same off-distribution turns and appear to come from the model as well — none of those strings exist anywhere in our code, and your agent is configured en-US with 11labs-Kate, so there’s no path inside Retell that could introduce them. We’re attributing them to the provider by elimination; we have not yet extracted the raw model payload for one of those turns to quote verbatim.
Recommended next steps (workarounds while we don’t have a fix for the model behavior itself):
Move the agent off gpt-5.4. You mentioned gpt-4.1 reproduces it too, so I’d suggest trying a different family — e.g. gpt-4o or claude-sonnet-4 — rather than another gpt-5.x variant. We have not specifically reproduced or cleared those alternatives against this exact failure mode, so please share a few new call IDs after the swap and we’ll confirm.
Optionally add a guard near the top of the general prompt: “Respond only in English. Never output the strings functions., to=functions., {"tool_uses", or any non-Latin characters.” This doesn’t fix the underlying model behavior, but is worth trying as a mitigation.
The agent we inspected is currently running off its draft version (isPublished=false). Make sure any model change is published so production traffic picks it up.
We’ll also look internally at whether we can detect and strip these Harmony channel markers server-side before TTS for gpt-5.x sessions.
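To make the server-side idea concrete, here is a rough sketch of what a pre-TTS filter could look like. This is not our actual implementation: the function names (sanitize_for_tts, _strip_tool_json) are hypothetical, and the patterns are assumptions based only on the strings seen in the sampled calls (the {"tool_uses" JSON blob, the commentary to=functions.* marker, and CJK runs), so treat it as an illustration of the approach rather than a complete list of what the model can leak.

```python
import json
import re

# Assumed patterns, derived only from the strings reported in this thread.
_MARKER = re.compile(r"(?:commentary\s+)?to=functions\.[A-Za-z0-9_]+")
_TOOL_JSON_START = re.compile(r'\{\s*"tool_uses"')
# Common CJK ranges (kana, CJK ideographs, Hangul); fine for an en-US agent.
_CJK = re.compile(r"[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af]+")
_DECODER = json.JSONDecoder()


def _strip_tool_json(text: str) -> str:
    """Remove every {"tool_uses": ...} object that leaked into plain text."""
    kept = []
    pos = 0
    while True:
        match = _TOOL_JSON_START.search(text, pos)
        if match is None:
            kept.append(text[pos:])
            break
        kept.append(text[pos:match.start()])
        try:
            # raw_decode returns the index just past the JSON object.
            _, pos = _DECODER.raw_decode(text, match.start())
        except json.JSONDecodeError:
            # Truncated or malformed blob: drop the rest of the turn.
            break
    return "".join(kept)


def sanitize_for_tts(text: str) -> str:
    """Strip leaked tool-call markers and CJK bursts before speech synthesis."""
    text = _strip_tool_json(text)
    text = _MARKER.sub(" ", text)
    text = _CJK.sub(" ", text)
    # Collapse the whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()
```

A filter like this only hides the symptom from the caller. The tool call the model meant to make still does not happen on that turn, so the state transition would be missed, which is why the model swap above remains the primary recommendation.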