# guided
4 articles tagged "guided"
Gradient-Guided Data Poisoning
Use gradient information from open-source models to craft optimally poisoned training examples.
advanced, lab, gradient, guided, poisoning, labs
Interpretability-Guided Attack Design
Use mechanistic interpretability to identify exploitable circuits and design targeted attacks.
lab, expert, guided, attack, labs, interpretability
Gradient-Guided Data Poisoning
Use gradient information from open-source models to craft optimally poisoned training examples.
advanced, lab, gradient, guided, poisoning, labs
Interpretability-Guided Attack Design
Use mechanistic interpretability to identify exploitable circuits and design targeted attacks.
lab, expert, guided, attack, labs, interpretability