redteams.ai

# theory

2 articles tagged with “theory”

Deceptive Alignment Theory

Theoretical frameworks for understanding and predicting deceptive alignment in advanced AI systems.

frontier-research, deceptive-alignment, theory, mesa-optimization
Expert

Formal Models of Prompt Injection

Theoretical frameworks for formally modeling and reasoning about prompt injection vulnerabilities.

frontier-research, formal-models, prompt-injection, theory
Expert