1 article tagged with “safety-reversal”
Techniques for reversing safety fine-tuning through targeted fine-tuning on adversarial datasets.