redteams.ai

# scalable-oversight

2 articles tagged with “scalable-oversight”

Scalable Oversight Challenges

How oversight breaks down as AI systems become more capable: the scalable oversight problem, recursive reward modeling, debate, market-making, and implications for red teaming increasingly capable models.

scalable-oversight, alignment, debate, reward-modeling, capability-gap
Advanced

Scalable Oversight Challenges (Frontier Research)

Research challenges in scaling human oversight as AI systems become more capable and autonomous.

frontier-research, scalable-oversight, alignment, challenges
Advanced