# guided
4 articles tagged with “guided”
Gradient-Guided Data Poisoning
Use gradient information from open-source models to craft optimally poisoned training examples.
advanced lab gradient guided poisoning labs
Interpretability-Guided Attack Design
Use mechanistic interpretability to identify exploitable circuits and design targeted attacks.
lab expert guided attack labs interpretability
Gradient-Guided Data Poisoning
Use gradient information from open-source models to craft optimally poisoned training examples.
advanced lab gradient guided poisoning labs
Interpretability-Guided Attack Design
Use mechanistic interpretability to identify exploitable circuits and design targeted attacks.
lab expert guided attack labs interpretability