Lab: Implementing a PAIR Attack
Implement the PAIR (Prompt Automatic Iterative Refinement) algorithm, in which an attacker LLM iteratively refines jailbreak prompts against a target LLM until an attack succeeds.
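The refinement loop described above can be sketched roughly as follows. This is a minimal outline, not the lab's reference solution: the function name `pair_loop`, the three callables, the 1 to 10 judge score, and the `max_iters` and `threshold` parameters are all assumptions made for illustration. The toy stand-ins make no LLM calls, so the control flow can be run and inspected on its own.

```python
from typing import Callable, List, Optional, Tuple

# One past attempt: (candidate prompt, target response, judge score).
Attempt = Tuple[str, str, int]

# Hypothetical interfaces; in a real lab each would wrap an LLM API call.
Attacker = Callable[[str, List[Attempt]], str]  # (goal, history) -> new candidate prompt
Target = Callable[[str], str]                   # prompt -> response
Judge = Callable[[str, str, str], int]          # (goal, prompt, response) -> score


def pair_loop(
    goal: str,
    attacker: Attacker,
    target: Target,
    judge: Judge,
    max_iters: int = 5,
    threshold: int = 10,
) -> Optional[Attempt]:
    """Run one refinement stream; return the first attempt that meets the threshold."""
    history: List[Attempt] = []
    for _ in range(max_iters):
        # The attacker sees the goal and all earlier failures, then proposes a revision.
        prompt = attacker(goal, history)
        response = target(prompt)
        score = judge(goal, prompt, response)
        if score >= threshold:
            return prompt, response, score
        # Feed the failed attempt back so the next proposal can improve on it.
        history.append((prompt, response, score))
    return None  # iteration budget exhausted without success


# Toy stand-ins (assumptions for demonstration only, no models involved).
def toy_attacker(goal: str, history: List[Attempt]) -> str:
    return f"{goal} [revision {len(history)}]"


def toy_target(prompt: str) -> str:
    return f"echo: {prompt}"


def toy_judge(goal: str, prompt: str, response: str) -> int:
    return 10 if "[revision 2]" in prompt else 1


if __name__ == "__main__":
    print(pair_loop("benign test goal", toy_attacker, toy_target, toy_judge))
```

Passing the attacker, target, and judge in as plain callables keeps the loop independent of any particular model provider, so each piece can be swapped or mocked while the lab is being built.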