1 article tagged with “adversarial-triggers”
Discovering and deploying universal adversarial trigger sequences that reliably override safety alignment across multiple LLM families, including gradient-based search, transfer attacks, and defense evasion.