# SFT
4 articles tagged "SFT"
Fine-Tuning Attack Surface
A comprehensive overview of fine-tuning security vulnerabilities, including SFT data poisoning, RLHF manipulation, the alignment tax, and the other fine-tuning attack vectors.
fine-tuning, attack-surface, SFT, RLHF, alignment, DPO, safety-training
SFT Data Poisoning & Injection
How supervised fine-tuning datasets are poisoned through instruction-response pair manipulation and backdoor triggers in SFT data, and how the minimum threshold of poisoned examples is determined.
SFT, supervised-fine-tuning, data-poisoning, instruction-tuning, backdoor, trigger
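Fine-Tuning Attack Surface (Traditional Chinese edition)
A comprehensive overview of fine-tuning security vulnerabilities, including SFT data poisoning, RLHF manipulation, the alignment tax, and the other fine-tuning attack vectors.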
fine-tuning, attack-surface, SFT, RLHF, alignment, DPO, safety-training
SFT Data Poisoning & Injection (Traditional Chinese edition)
How supervised fine-tuning datasets are poisoned through instruction-response pair manipulation and backdoor triggers in SFT data, and how the minimum threshold of poisoned examples is determined.
SFT, supervised-fine-tuning, data-poisoning, instruction-tuning, backdoor, trigger