1 articletagged with “rigor”
Statistical methodology for AI red teaming: sample size determination, confidence intervals, hypothesis testing for safety claims, handling non-determinism, and avoiding common statistical pitfalls.