# guided
2 articles tagged with “guided”
Gradient-Guided Data Poisoning
Use gradient information from open-source models to craft optimally poisoned training examples.
advanced, lab, gradient, guided, poisoning, labs
Interpretability-Guided Attack Design
Use mechanistic interpretability to identify exploitable circuits and design targeted attacks.
lab, expert, guided, attack, labs, interpretability