More complex A/B test analysis in dashboard

I’ve enjoyed testing different agents with the A/B test functionality, so kudos on that.

One limitation I’ve run into, though, is that the current experiment dashboard primarily shows differences in mean values for grouped metrics between treatment and control. While that’s a useful starting point, it’s often not enough to confidently determine whether a change is actually better.

I’d love to see Retell add more experimentation and statistical analysis capabilities, such as:

• Statistical significance testing (p-values, confidence intervals, Bayesian probabilities, etc.)
• Effect size estimates, not just mean differences
• Sample size and power calculations
• Distribution analysis (not just averages)
• Automatic alerts when an experiment reaches a predefined confidence threshold
• Segmentation by call type, customer cohort, clinic, geography, or other metadata
• Support for ratio and conversion metrics in addition to raw averages
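
To make a few of these concrete, here is a minimal sketch of the kind of check I mean for a conversion-style metric: a two-proportion z-test with a confidence interval on the difference, plus a rough per-arm sample size estimate. This is not Retell's API. The function names (`two_proportion_test`, `sample_size_per_arm`) and the example counts are made up for illustration, and it relies on the normal approximation, so it only makes sense with reasonably large call counts in each arm.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def two_proportion_test(conv_c, n_c, conv_t, n_t, alpha=0.05):
    """Two-sided z-test for a difference in conversion rates.

    conv_c / n_c: conversions and total calls in control (hypothetical inputs).
    conv_t / n_t: conversions and total calls in treatment.
    """
    p_c = conv_c / n_c
    p_t = conv_t / n_t
    diff = p_t - p_c

    # Pooled standard error under the null hypothesis of no difference.
    p_pool = (conv_c + conv_t) / (n_c + n_t)
    se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
    if se_pool == 0:
        raise ValueError("no variation in outcomes; the test is undefined")

    z = diff / se_pool
    p_value = 2 * (1 - _STD_NORMAL.cdf(abs(z)))

    # Unpooled standard error for the confidence interval on the difference.
    se = sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)

    return {"diff": diff, "z": z, "p_value": p_value, "ci": ci}


def sample_size_per_arm(p_base, mde, alpha=0.05, power=0.8):
    """Approximate calls needed per arm to detect an absolute lift of `mde`."""
    p_new = p_base + mde
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    variance = p_base * (1 - p_base) + p_new * (1 - p_new)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)


if __name__ == "__main__":
    # Made-up counts: 100/1000 conversions in control, 130/1000 in treatment.
    result = two_proportion_test(100, 1000, 130, 1000)
    print(result)
    # Calls per arm to detect a 3-point lift from a 10% baseline.
    print(sample_size_per_arm(0.10, 0.03))
```

A mean difference on its own doesn't say whether a gap like this is distinguishable from noise. The p-value and interval do, and the sample size estimate tells you roughly how long to let the experiment run before looking. For continuous metrics such as call duration, a Welch t-test or a bootstrap would be the analogous choice.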

Would love to hear how others are currently evaluating experiments and whether this is on the roadmap. Right now, I’m using the MCP to access my data and doing additional analysis to better estimate causality, but it’d be nice to have some of that capability built in.