1 article tagged with “adversarial-optimization”
Implement token-level adversarial optimization to discover minimal perturbations that bypass safety training.