# theory
2 articles tagged with “theory”
Deceptive Alignment Theory
Theoretical frameworks for understanding and predicting deceptive alignment in advanced AI systems.
frontier-research, deceptive-alignment, theory, mesa-optimization
Formal Models of Prompt Injection
Theoretical frameworks for formally modeling and reasoning about prompt injection vulnerabilities.
frontier-research, formal-models, prompt-injection, theory