We have an internal testing framework where we test various aspects of an agent’s behavior, e.g. how it would respond in a certain situation. As part of these test runs we run LLM requests with the exact same prompt and model (and model params) as we’ve configured in Retell. However, we notice consistently different results between our LLM invocations and the Retell LLM invocations. For example, our LLM invocations produce a consistent result across 100 runs and Retell LLM invocations produce a different (but consistent) result across 100 runs.
This leads us to believe that there is some discrepancy between how we run the LLM requests and how Retell does it. Are there any hidden prefix/suffix prompt that gets added to what we configure as a system prompt for the agent? If so, could you share it so we can replicate the exact LLM request for our tests?
How to replicate exactly:
Easiest: in the agent config, disable all handbook_config flags, clear guardrail_config, clear call_screening_option, leave language and timezone unset, and detach any KB. With all of those off, the system message Retell sends is literally your general_prompt with {{vars}} expanded — your harness should then match.
timezone, guardrail_config and call_screening_optionare not even part of the response since they are not configured for this agent. There are no knowledge bases attached to this agent.
If I understand you correctly, the language being set to en-US adds something hidden to the system prompt. Hence, we’d be reluctant to remove this setting from our bots since they already service live customers and this could change their behavior. Rather, we’d want to make our harness identical by adding whatever Retell is adding for language as a system prompt. Can you share what it is or how I can find it out? Ideally, if there is a way in the API to get the whole thing (bar the dynamic vars).