redteams.ai

# model-organisms

1 article tagged with “model-organisms”

Model Organisms of Misalignment

Deliberately creating misaligned models for study: methodology, threat model instantiation, experimental frameworks, and what model organisms reveal about AI safety failures.

model-organisms, misalignment, alignment-research, threat-models, ai-safety
Advanced