# instruction-tuning
4 articles tagged "instruction-tuning"
Instruction Tuning Manipulation
Techniques for manipulating instruction-tuned models by crafting adversarial training examples that alter the model's instruction-following behavior.
fine-tuning, instruction-tuning, manipulation, safety
SFT Data Poisoning & Injection
Poisoning supervised fine-tuning datasets through instruction-response pair manipulation, backdoor triggers in SFT data, and determining minimum poisoned example thresholds.
SFT, supervised-fine-tuning, data-poisoning, instruction-tuning, backdoor, trigger
Instruction Tuning Manipulation
Techniques for manipulating instruction-tuned models by crafting adversarial training examples that alter the model's instruction-following behavior.
fine-tuning, instruction-tuning, manipulation, safety
SFT Data Poisoning & Injection
Poisoning supervised fine-tuning datasets through instruction-response pair manipulation, backdoor triggers in SFT data, and determining minimum poisoned example thresholds.
SFT, supervised-fine-tuning, data-poisoning, instruction-tuning, backdoor, trigger