# specification-gaming
4 articles tagged "specification-gaming"
Specification Gaming in AI Systems
Research on how AI systems find unexpected shortcuts that satisfy specifications without achieving intended goals.
frontier-research, specification-gaming, reward, research
Reward Hacking & Gaming
How models exploit reward signals instead of following intent, covering specification gaming, Goodhart's law in RLHF, production examples, and red-team implications.
reward-hacking, specification-gaming, Goodharts-law, RLHF, reward-model, optimization
Specification Gaming in AI Systems
Research on how AI systems find unexpected shortcuts that satisfy specifications without achieving intended goals.
frontier-research, specification-gaming, reward, research
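Reward Hacking and Gaming
Models exploiting reward signals rather than following intent, including specification gaming, Goodhart's law in RLHF, production examples, and red-team implications.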
reward-hacking, specification-gaming, Goodharts-law, RLHF, reward-model, optimization