redteams.ai

# safety-reversal

1 article tagged with “safety-reversal”

Safety Fine-Tuning Reversal Attacks

Techniques for reversing safety fine-tuning through targeted fine-tuning on adversarial datasets.

training, fine-tuning, safety-reversal
Advanced