redteams.ai

# sandbagging

1 article tagged with “sandbagging”

Sandbagging Detection in Capability Evaluations

Detecting when AI models deliberately underperform on capability evaluations to appear less capable.

frontier-research, sandbagging, capability-evaluation, detection
Expert