# constitutional-classifiers
6 articles tagged "constitutional-classifiers"
Constitutional Classifiers
Anthropic's Constitutional Classifiers defense: using constitutional AI principles to train input/output classifiers that withstood 3,000+ hours of adversarial red teaming.
Constitutional Classifiers for AI Safety
Analysis of Anthropic's Constitutional Classifiers approach to jailbreak resistance.
Constitutional Classifier Bypass
Develop techniques to bypass Anthropic-style constitutional classifiers through adversarial input crafting.
Constitutional Classifiers
Anthropic's Constitutional Classifiers defense: using constitutional AI principles to train input/output classifiers that withstood 3,000+ hours of adversarial red teaming.
Constitutional Classifiers for AI Safety
Analysis of Anthropic's Constitutional Classifiers approach to jailbreak resistance.
Constitutional Classifier Bypass
Develop techniques to bypass Anthropic-style constitutional classifiers through adversarial input crafting.