# harmbench
7 articles tagged with “harmbench”
HarmBench: Standardized Red Team Evaluation
Deep dive into the HarmBench framework for standardized red team evaluation: attack methods, evaluation pipeline, running benchmarks, interpreting results, and comparing model safety across providers.
HarmBench Custom Attack Submission
Develop and evaluate custom attack methods against the HarmBench standardized evaluation framework.
Your First HarmBench Evaluation
Run a standardized safety evaluation using the HarmBench framework against a target model.
Automated Red Teaming Tools Comparison
Comprehensive comparison of automated AI red teaming tools including PyRIT, Garak, DeepTeam, AutoRedTeamer, HarmBench, and ART, with detailed capability matrices, strengths analysis, and use case recommendations.
HarmBench Custom Behavior Sets
Extend HarmBench with custom behavior specifications for domain-specific vulnerability testing.
HarmBench Evaluation Walkthrough
Run standardized attack evaluations using the HarmBench framework and interpret results.
HarmBench Evaluation Framework Walkthrough
Complete walkthrough of the HarmBench evaluation framework: installation, running standardized benchmarks against models, interpreting results, creating custom behavior evaluations, and comparing model safety across versions.