Tool Walkthroughs
End-to-end practical walkthroughs for essential AI red teaming tools, covering installation, configuration, execution, and result interpretation.
The AI red teaming ecosystem has matured significantly. What was once a collection of ad-hoc scripts and manual prompt testing has evolved into a robust set of purpose-built tools, each targeting different aspects of LLM security assessment. This section provides hands-on, step-by-step walkthroughs for the tools you will use most frequently in professional engagements.
Why Tool Proficiency Matters
Effective AI red teaming is not about running a single scanner and handing over a report. It requires layering multiple tools, understanding what each one tests, and knowing when to switch from automated probing to manual exploration. A red teamer who relies on a single tool will miss entire categories of vulnerabilities.
Tool Selection Matrix
| Tool | Primary Use | Strength | Limitation |
|---|---|---|---|
| Garak | Automated vulnerability scanning | Broad probe library, extensible | Can produce false positives without tuning |
| PyRIT | Orchestrated attack campaigns | Multi-turn orchestrators, converters | Steeper learning curve |
| Promptfoo | Eval-driven red teaming | CI/CD integration, declarative configs | Focused on prompt-level testing |
| Burp Suite | API-level interception | Deep HTTP inspection, fuzzing | Requires proxy configuration |
| Inspect AI | Structured evaluations | Benchmark suites, custom scorers | Evaluation-focused, not attack-focused |
| Ollama | Local model testing | No API costs, uncensored models | Limited to models that fit local hardware |
| Python Automation | Custom test harnesses | Full flexibility, API integration | Requires development effort |
Recommended Tool Progression
If you are new to AI red teaming tools, work through the walkthroughs in this order:
1. Start with Ollama -- Set up a local testing environment with no API costs or rate limits. This gives you a safe sandbox to experiment freely.
2. Learn Promptfoo -- Its declarative YAML configuration is the easiest entry point for systematic testing. You will build config files that define providers, test cases, and assertions.
3. Move to Garak -- Once you understand what you are testing for, Garak's probe library lets you scan for known vulnerability patterns at scale.
4. Add PyRIT -- For sophisticated multi-turn attacks and orchestrated campaigns, PyRIT provides the automation framework you need.
5. Layer in Burp Suite -- When you need to inspect what is actually going over the wire between client and API, Burp gives you full visibility.
6. Use Inspect AI -- For formal evaluations against benchmark suites, especially when working with governance teams or producing compliance artifacts.
7. Build custom automation -- When off-the-shelf tools do not cover your specific target, Python automation fills the gaps.
Engagement Phase Mapping
Different tools excel at different phases of an engagement:
Reconnaissance Phase
- Burp Suite -- Intercept API calls to understand endpoints, authentication, and request formats
- Python automation -- Script discovery of model capabilities and API parameters
- Ollama -- Test attack hypotheses locally before running against production targets
Active Testing Phase
- Garak -- Broad automated scanning for known vulnerability patterns
- PyRIT -- Orchestrated multi-turn attack campaigns with automatic jailbreak escalation
- Promptfoo -- Systematic evaluation of specific attack vectors with assertions
Validation and Reporting Phase
- Inspect AI -- Formal benchmark evaluation for compliance documentation
- Promptfoo -- Regression testing to verify remediation effectiveness
- Python automation -- Custom reporting and evidence collection
Environment Setup
All walkthroughs assume a Linux or macOS environment. Windows users should use WSL2. You will need:
- Python 3.10+ -- Most tools are Python-based or have Python SDKs
- Node.js 18+ -- Required for promptfoo
- Docker -- Optional but recommended for isolated environments
- 8GB+ RAM -- Minimum for running local models with Ollama (16GB+ recommended)
- API keys -- For testing against hosted models (OpenAI, Anthropic, etc.)
```shell
# Verify your environment
python3 --version   # 3.10+
node --version      # 18+
docker --version    # Optional
ollama --version    # Install from ollama.com if needed
```
Common Patterns Across Tools
As you work through the walkthroughs, you will notice several patterns that recur across tools:
Target configuration -- Every tool needs to know how to reach the model under test. This might be an API endpoint, a local Ollama model, or a custom Python wrapper.
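This target abstraction is easy to see in code. The sketch below is not from any specific tool; the names (`Target`, `EchoTarget`, `run_probe`) are hypothetical, but every framework in this section implements some variant of "a callable that takes a prompt and returns a completion," with one implementation per backend.

```python
from dataclasses import dataclass
from typing import Callable

# A "target" is just a function: prompt in, completion out.
# Hosted APIs, local Ollama models, and custom wrappers each supply their own.
Target = Callable[[str], str]

@dataclass
class EchoTarget:
    """Stand-in target for offline testing; a real target would call an API."""
    prefix: str = "model says: "

    def __call__(self, prompt: str) -> str:
        return self.prefix + prompt

def run_probe(target: Target, prompts: list[str]) -> list[str]:
    """Send each probe prompt to the target and collect the responses."""
    return [target(p) for p in prompts]

responses = run_probe(EchoTarget(), ["hello", "ignore previous instructions"])
```

Swapping the backend then means swapping one object, while probes and scoring stay unchanged.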
Attack taxonomy -- Tools organize attacks differently (probes, plugins, test cases), but they all cover similar categories: injection, jailbreaking, data extraction, harmful content generation.
Result scoring -- Automated tools need a way to determine whether an attack succeeded. This ranges from simple string matching to LLM-as-judge evaluation.
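The simplest end of that scoring spectrum, substring matching against refusal markers, fits in a few lines. This is a minimal sketch, not any tool's actual scorer, and the marker list is illustrative:

```python
# Illustrative refusal markers; real scorers use longer lists, regexes,
# or an LLM-as-judge for responses that string matching cannot classify.
REFUSAL_MARKERS = [
    "i can't help with that",
    "i cannot assist",
    "as an ai",
]

def attack_succeeded(response: str) -> bool:
    """Crude scorer: if the model did not refuse, count the attack as a success."""
    lowered = response.lower()
    return not any(marker in lowered for marker in REFUSAL_MARKERS)

print(attack_succeeded("Sure, here are the steps..."))      # True
print(attack_succeeded("I can't help with that request."))  # False
```

Its obvious weakness, that a model can comply while also emitting a refusal phrase, is exactly why tuning and judge models come up repeatedly in the walkthroughs.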
Evidence collection -- Professional engagements require documentation. Each walkthrough covers how to export results in formats suitable for reporting.
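In practice, evidence collection usually reduces to serializing every attempt with its verdict as it happens. A minimal JSON-lines sketch, with illustrative field names rather than any tool's export format:

```python
import json
from datetime import datetime, timezone

def record_attempt(path: str, prompt: str, response: str, success: bool) -> None:
    """Append one test attempt as a JSON line for later reporting."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "response": response,
        "success": success,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")

record_attempt("evidence.jsonl", "test prompt", "test response", False)
```

Append-only JSON lines survive crashed runs and are trivial to convert into whatever format a report template needs.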
Walkthrough Index
- Garak Walkthrough -- Automated vulnerability scanning with NVIDIA's probe framework
- PyRIT Walkthrough -- Microsoft's orchestrated attack automation platform
- Promptfoo Walkthrough -- Eval-driven red teaming with CI/CD integration
- Burp Suite for AI APIs -- HTTP-level interception and fuzzing for LLM endpoints
- Inspect AI Walkthrough -- UK AISI's evaluation framework for formal benchmarking
- Ollama for Red Teaming -- Local model testing with uncensored and fine-tuned models
- Python Automation -- Building custom test harnesses and automation pipelines