redteams.ai

# SFT

2 articles tagged with “SFT”

Fine-Tuning Attack Surface

An overview of fine-tuning security vulnerabilities, including SFT data poisoning, RLHF manipulation, the alignment tax, and other fine-tuning attack vectors.

fine-tuning, attack-surface, SFT, RLHF, alignment, DPO, safety-training
Advanced

SFT Data Poisoning & Injection

Poisoning supervised fine-tuning datasets through instruction-response pair manipulation and backdoor triggers in SFT data, and determining the minimum number of poisoned examples required.

SFT, supervised-fine-tuning, data-poisoning, instruction-tuning, backdoor, trigger
Expert