Advanced AI Red Team Labs
Advanced hands-on labs covering automated attack algorithms, production red team tooling, multi-model testing, and sophisticated exploitation techniques for experienced AI security practitioners.
Advanced Labs Overview
The advanced track moves beyond individual attack techniques into automated attack systems, production-grade tooling, and model-specific exploitation. These labs require you to build complete pipelines, implement research algorithms, and test against multiple models simultaneously.
Prerequisites
Before starting these labs, make sure you:
- Have completed all Beginner Labs -- especially the test harness (Lab 4) and API testing (Lab 6)
- Are familiar with Intermediate Labs concepts -- multi-step attacks, indirect injection
- Are comfortable reading and modifying Python async code
- Have API access to at least two different LLM providers
| Requirement | Minimum | Recommended |
|---|---|---|
| Python | 3.10+ | 3.12+ |
| RAM | 16 GB | 32 GB |
| GPU | None (API-based labs) | CUDA-capable (suffix/extraction labs) |
| Disk | 20 GB free | 100 GB free (local models) |
| API keys | OpenAI or Anthropic | OpenAI + Anthropic + Google AI |
| Local models | Ollama with 7B model | Ollama with 7B + 13B models |
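A quick preflight check can confirm the Python version and API keys before you start. The sketch below is only an illustration: the environment variable names (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GOOGLE_API_KEY`) are assumptions about how your SDKs are configured, and the helper functions are hypothetical, not part of any lab code. The default of two providers follows the prerequisites list above.

```python
import os
import sys

# Assumed variable names; adjust them to match your own SDK configuration.
PROVIDER_ENV_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Google AI": "GOOGLE_API_KEY",
}


def check_python(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the interpreter meets the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


def configured_providers(environ=os.environ):
    """Return the providers whose API key variable is set and non-empty."""
    return [name for name, var in PROVIDER_ENV_VARS.items() if environ.get(var)]


def check_prerequisites(environ=os.environ, version_info=sys.version_info,
                        min_providers=2):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if not check_python(version_info):
        problems.append("Python 3.10 or newer is required")
    found = configured_providers(environ)
    if len(found) < min_providers:
        problems.append(
            f"Need API keys for at least {min_providers} providers, "
            f"found {len(found)}: {found}"
        )
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    if issues:
        for issue in issues:
            print("MISSING:", issue)
    else:
        print("Environment looks ready for the advanced labs.")
```

The check deliberately skips RAM, GPU, and disk, since those depend on which labs you plan to run locally.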
Lab Catalog
Automated Attack Algorithms
These labs implement attack algorithms from published research (PAIR, TAP, and GCG; see References):
| Lab | Title | Key Skills | Difficulty |
|---|---|---|---|
| 2 | PAIR Attack | Iterative refinement, attacker/target LLM architecture | Advanced |
| 3 | TAP Algorithm | Tree search, pruning, attack branching | Advanced |
| 10 | Adversarial Suffixes | Gradient-based optimization, GCG algorithm | Expert |
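To make the attacker/target architecture in the PAIR row concrete, here is a minimal control-flow sketch of an iterative refinement loop. It is not the lab's implementation: `attacker`, `target`, and `judge` are hypothetical callables you would back with real model clients, the stubs at the bottom exist only so the loop runs offline, and the 1-10 scoring scale, `threshold`, and `max_iterations` values are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# One history entry: (candidate prompt, target response, judge score).
Turn = Tuple[str, str, int]


@dataclass
class PairResult:
    success: bool
    iterations: int
    best_prompt: str
    best_score: int


def pair_loop(
    objective: str,
    attacker: Callable[[str, List[Turn]], str],
    target: Callable[[str], str],
    judge: Callable[[str, str, str], int],
    max_iterations: int = 20,
    threshold: int = 10,
) -> PairResult:
    """Refine a candidate prompt until the judge score reaches the threshold."""
    history: List[Turn] = []
    best_prompt, best_score = "", 0
    for iteration in range(1, max_iterations + 1):
        # The attacker sees the objective plus all earlier attempts and scores.
        candidate = attacker(objective, history)
        response = target(candidate)
        score = judge(objective, candidate, response)
        history.append((candidate, response, score))
        if score > best_score:
            best_prompt, best_score = candidate, score
        if score >= threshold:
            return PairResult(True, iteration, best_prompt, best_score)
    return PairResult(False, max_iterations, best_prompt, best_score)


# Offline stubs (hypothetical): no model calls, just deterministic placeholders.
def stub_attacker(objective: str, history: List[Turn]) -> str:
    return f"attempt {len(history) + 1}"


def stub_target(prompt: str) -> str:
    return f"response to {prompt}"


def stub_judge(objective: str, prompt: str, response: str) -> int:
    attempt_number = int(prompt.split()[-1])
    return min(10, 4 * attempt_number)


if __name__ == "__main__":
    print(pair_loop("placeholder objective", stub_attacker, stub_target, stub_judge))
```

Keeping the three roles as plain callables means the same loop can be pointed at different attacker, target, and judge models without changing the control flow.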
Production Tooling
Build the infrastructure that professional red teams rely on:
| Lab | Title | Key Skills | Difficulty |
|---|---|---|---|
| 4 | Custom Harness | Async Python, multi-model support, reporting | Advanced |
| 5 | Multi-Model Testing | Comparative analysis, normalization, benchmarking | Advanced |
| 9 | CART Pipeline | CI/CD integration, regression detection, Docker | Advanced |
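The harness and multi-model labs both revolve around running the same test cases against several models concurrently and comparing the results. The sketch below shows one possible shape for that using `asyncio`. The model functions are hypothetical stubs standing in for real provider clients, and the refusal heuristic is an assumption chosen only for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

ModelFn = Callable[[str], Awaitable[str]]


async def run_case(name: str, model: ModelFn, prompt: str,
                   limit: asyncio.Semaphore) -> dict:
    """Run one prompt against one model, capturing errors instead of raising."""
    async with limit:
        try:
            output = await model(prompt)
            return {"model": name, "prompt": prompt, "output": output, "error": None}
        except Exception as exc:
            return {"model": name, "prompt": prompt, "output": None, "error": repr(exc)}


async def run_suite(models: Dict[str, ModelFn], prompts: List[str],
                    concurrency: int = 4) -> List[dict]:
    """Run every prompt against every model with bounded concurrency."""
    limit = asyncio.Semaphore(concurrency)
    tasks = [
        run_case(name, fn, prompt, limit)
        for name, fn in models.items()
        for prompt in prompts
    ]
    return await asyncio.gather(*tasks)


def refusal_rates(results: List[dict],
                  is_refusal: Callable[[str], bool]) -> Dict[str, float]:
    """Fraction of successful calls per model that the heuristic marks as refusals."""
    totals: Dict[str, int] = {}
    refused: Dict[str, int] = {}
    for r in results:
        if r["error"] is not None:
            continue  # failed calls are excluded from the denominator
        totals[r["model"]] = totals.get(r["model"], 0) + 1
        if is_refusal(r["output"]):
            refused[r["model"]] = refused.get(r["model"], 0) + 1
    return {m: refused.get(m, 0) / n for m, n in totals.items()}


# Hypothetical stub models; replace with real async client calls.
async def model_a(prompt: str) -> str:
    await asyncio.sleep(0)
    return "I can't help with that."


async def model_b(prompt: str) -> str:
    await asyncio.sleep(0)
    return f"echo: {prompt}"


def simple_refusal_check(output: str) -> bool:
    return output.lower().startswith("i can't")


if __name__ == "__main__":
    suite = asyncio.run(
        run_suite({"model-a": model_a, "model-b": model_b}, ["case 1", "case 2"])
    )
    print(refusal_rates(suite, simple_refusal_check))
```

Returning errors as data rather than raising keeps one failing provider from aborting the whole run, which matters once the suite is wired into a scheduled pipeline.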
Model-Specific Exploitation
Attack techniques targeting specific model architectures and modalities:
| Lab | Title | Key Skills | Difficulty |
|---|---|---|---|
| 6 | Multimodal Pipeline | Vision-language models, adversarial images | Advanced |
| 7 | Training Data Extraction | Memorization, prefix attacks, privacy | Advanced |
| 8 | Reasoning Exploitation | Chain-of-thought manipulation, reasoning models | Advanced |
| 14 | Audio Adversarial | Speech models, audio perturbations | Advanced |
Defense Bypass and Backdoors
Techniques for circumventing layered defenses and introducing persistent vulnerabilities:
| Lab | Title | Key Skills | Difficulty |
|---|---|---|---|
| 11 | Guardrail Bypass Chain | Multi-layer bypass, defense analysis | Advanced |
| 12 | Fine-Tune Backdoor | Poisoned fine-tuning, trigger design | Expert |
| 13 | Federated Poisoning | Distributed ML attacks, Flower framework | Expert |
Recommended Paths
For practitioners building red team infrastructure:
- Custom Harness -- build your testing foundation
- Multi-Model Testing -- extend to comparative analysis
- PAIR Attack -- add automated attack generation
- CART Pipeline -- deploy continuous testing
For researchers exploring cutting-edge attack techniques:
- PAIR Attack -- implement foundational algorithm
- TAP Algorithm -- explore tree-based search
- Adversarial Suffixes -- gradient-based attacks
- Training Data Extraction -- privacy attacks
For practitioners who need to understand attacks to build better defenses:
- Guardrail Bypass Chain -- understand defense gaps
- PAIR Attack -- see how automated attacks probe defenses
- Fine-Tune Backdoor -- supply chain risks
- CART Pipeline -- build defensive monitoring
What Comes Next
After completing the advanced track, you are ready for:
- Expert Labs -- RLHF reward hacking, quantization exploits, multi-agent warfare
- CTF Challenges -- competitive exercises that combine multiple techniques
- Full Engagement Simulations -- end-to-end red team engagements against realistic targets
Related Topics
- Intermediate Labs - Prerequisites that build the skills needed for advanced exercises
- Expert Labs - Next progression for research-oriented red team practitioners
- CTF Challenges - Competitive exercises that test advanced techniques under pressure
- Red Team Methodology - Methodology framework that advanced automation implements at scale
References
- "Jailbreaking Black Box Large Language Models in Twenty Queries" - Chao et al. (2023) - Introduces PAIR (Prompt Automatic Iterative Refinement), the automated attack algorithm implemented in the PAIR attack lab
- "Tree of Attacks: Jailbreaking Black-Box LLMs Automatically" - Mehrotra et al. (2023) - TAP (Tree of Attacks with Pruning) algorithm implemented in the TAP lab
- "Universal and Transferable Adversarial Attacks on Aligned Language Models" - Zou et al. (2023) - GCG algorithm for adversarial suffix generation
- "AI Risk Management Framework" - NIST (2023) - Framework for systematic AI risk evaluation that informs advanced pipeline design