# capability-evaluation
2 articles tagged "capability-evaluation"
Sandbagging Detection in Capability Evaluations
Detecting when AI models deliberately underperform on capability evaluations to appear less capable.
frontier-research, sandbagging, capability-evaluation, detection
Sandbagging Detection in Capability Evaluations
Detecting when AI models deliberately underperform on capability evaluations to appear less capable.
frontier-research, sandbagging, capability-evaluation, detection