# guided
2 articles tagged "guided"
Gradient-Guided Data Poisoning
Use gradient information from open-source models to craft optimally poisoned training examples.
advanced, lab, gradient, guided, poisoning, labs
Interpretability-Guided Attack Design
Use mechanistic interpretability to identify exploitable circuits and design targeted attacks.
lab, expert, guided, attack, labs, interpretability