# jailbreak-defense
2 articles tagged with “jailbreak-defense”
Constitutional Classifiers
Anthropic's Constitutional Classifiers defense: using constitutional AI principles to train input/output classifiers that withstood 3,000+ hours of adversarial red teaming.
constitutional-classifiers, defense, jailbreak-defense, anthropic, classifiers, constitutional-ai
Constitutional Classifiers
Anthropic's Constitutional Classifiers defense: using constitutional AI principles to train input/output classifiers that withstood 3,000+ hours of adversarial red teaming.
constitutional-classifiers, defense, jailbreak-defense, anthropic, classifiers, constitutional-ai