redteams.ai

# safety-framework

2 articles tagged with “safety-framework”

Fine-Tuning Safety Evaluation Framework

A comprehensive framework for evaluating the safety of fine-tuned models, combining pre-deployment testing, safety regression benchmarks, and continuous monitoring to detect when fine-tuning has compromised model safety.

safety-evaluation, regression-testing, benchmarking, monitoring, fine-tuning-security, safety-framework
Intermediate

Safety Framework Circumvention

Attacking safety layers in embodied AI systems: watchdog suppression, emergency stop bypass, safety monitor evasion, and techniques for compromising multi-layered safety architectures in robotic systems.

safety-framework, watchdog, e-stop, safety-monitor
Expert