redteams.ai

# capability-evaluation

1 article tagged with “capability-evaluation”

Sandbagging Detection in Capability Evaluations

Detecting when AI models deliberately underperform on capability evaluations to appear less capable.

frontier-research, sandbagging, capability-evaluation, detection
Expert