Getting Started with AI Red Teaming Labs
Introduction to the hands-on lab environment for AI red teaming. Learn what you'll need, how labs are structured, and get an overview of all beginner exercises.
Welcome to the AI Red Teaming Labs
These labs provide hands-on experience with the techniques, tools, and methodologies used to evaluate the safety and robustness of large language models (LLMs). Unlike theoretical material, every lab in this series requires you to run real attacks against real models and observe the results firsthand.
Who These Labs Are For
The beginner track is designed for practitioners who have:
- Basic Python proficiency -- you can write functions, handle exceptions, and work with libraries
- Foundational security awareness -- you understand concepts like input validation, authorization, and attack surfaces
- Curiosity about LLM behavior -- you want to understand how language models fail and how to test them systematically
You do not need prior experience with machine learning, model internals, or advanced prompt engineering. The labs build these skills progressively.
Prerequisites
Before starting the labs, ensure you have the following ready:
| Requirement | Minimum | Recommended |
|---|---|---|
| Python | 3.9+ | 3.11+ |
| RAM | 8 GB | 16 GB |
| Disk space | 10 GB free | 50 GB free (for local models) |
| API access | At least one LLM API key | OpenAI + Anthropic + local model |
| OS | Any (Linux, macOS, Windows with WSL) | Linux or macOS |
How Labs Are Structured
Every lab in this series follows a consistent format:
Learning Objectives
Each lab begins with clear objectives so you know exactly what skills you will gain.
Prerequisites & Setup
Required tools, packages, and configurations are listed upfront. Complete these before attempting the exercises.
Background Context
Brief explanation of the technique or concept being explored, with links to deeper theory pages.
Step-by-Step Exercises
Detailed, numbered instructions that walk you through each attack or test. Every step includes the exact commands or code to run.
Expected Outputs
Sample outputs so you can verify your results match what is expected. Variations are noted where model behavior may differ.
Troubleshooting
Common issues and their solutions, so you spend time learning -- not debugging environment problems.
Knowledge Check
A quiz at the end of each lab to reinforce key concepts and verify understanding.
Beginner Lab Overview
The beginner track contains 11 hands-on labs that progress from environment setup through increasingly sophisticated attack techniques:
Foundation Labs
| Lab | Title | What You Learn |
|---|---|---|
| 1 | Environment Setup | Install tools, configure API keys, verify your setup |
| 2 | Your First Prompt Injection | Basic prompt override techniques against a chatbot |
| 3 | Basic Jailbreak Techniques | Role-play, DAN-style, and framing-based jailbreaks |
Tooling Labs
| Lab | Title | What You Learn |
|---|---|---|
| 4 | Building a Simple Test Harness | Automate prompt testing with Python and CSV output |
| 5 | Scanning with Garak | Use the Garak framework for automated vulnerability scanning |
| 6 | API-Based Model Testing | Test models through OpenAI, Anthropic, and local APIs |
Attack Technique Labs
| Lab | Title | What You Learn |
|---|---|---|
| 7 | System Prompt Extraction | Extract hidden system prompts from deployed models |
| 8 | Output Format Manipulation | Force models into specific output formats for exploitation |
| 9 | Basic Defense Evasion | Bypass keyword filters and basic content classifiers |
| 10 | Role-Play & Persona Attacks | Craft persona-based attacks and test their effectiveness |
| 11 | Encoding & Obfuscation | Use encoding tricks to bypass model safety filters |
Recommended Progression
While the labs are designed to be followed sequentially, here are some alternative paths based on your interests:
- Tool-focused path: Labs 1, 4, 5, 6 -- focuses on building and using testing infrastructure
- Attack-focused path: Labs 1, 2, 3, 7, 8, 10, 11 -- focuses on hands-on attack techniques
- Defense-aware path: Labs 1, 2, 9, 8 -- focuses on understanding and bypassing defenses
Ethical Guidelines
All labs in this series follow responsible AI red teaming principles:
- Test only what you are authorized to test -- your own deployments, or models with explicit testing permissions
- Document everything -- maintain logs of all tests for accountability
- Report vulnerabilities responsibly -- follow the vendor's disclosure process
- Never weaponize findings -- the goal is to improve safety, not to cause harm
- Respect rate limits and terms of service -- do not abuse API access
For a deeper discussion of ethics and legal considerations, see Red Team Ethics and Legal Considerations.
What Comes Next
After completing the beginner track, you will be ready for:
- Intermediate Labs -- multi-step attacks, advanced jailbreaks, tool-use exploitation
- Advanced Labs -- automated red teaming pipelines, model-specific attacks, fine-tuning exploits
- CTF Challenges -- competitive capture-the-flag exercises to test your skills
Related Topics
- Intermediate Labs - Continue your progression with multi-step attacks, RAG poisoning, and tool exploitation
- Red Team Methodology - Understand the structured approach behind effective AI red teaming engagements
- Attack Taxonomy - Comprehensive classification of AI attack vectors covered across these labs
- Tool Landscape - Overview of the red teaming tools you will use throughout these exercises
References
- "OWASP Top 10 for LLM Applications" - OWASP (2025) - Industry-standard classification of LLM security risks that maps to lab exercises
- "AI Risk Management Framework" - NIST (2023) - Federal guidelines for identifying and managing AI risks, relevant to red team methodology
- "Red Teaming Language Models with Language Models" - Perez et al. (2022) - Foundational paper on automated red teaming approaches
- "Garak Documentation" - NVIDIA/garak (2024) - Official documentation for the Garak LLM vulnerability scanner used in Lab 5
What is the recommended approach for completing the beginner labs?
Which of the following is NOT a prerequisite for the beginner labs?