1 articletagged with “automated-red-teaming”
Using a second LLM as an automated attacker to iteratively generate and refine jailbreak prompts against a target model, implementing the Prompt Automatic Iterative Refinement technique.