Engagement Walkthroughs Overview
Step-by-step walkthroughs for complete AI red team engagements: from scoping and reconnaissance through attack execution to reporting, organized by target system type.
Every AI system has a different architecture, threat model, and set of stakeholders. A chatbot engagement differs from a RAG system engagement, which in turn differs from an agent assessment. The walkthroughs in this section are system-type-specific guides that cover the full engagement lifecycle: scoping, reconnaissance, attack execution, analysis and triage, and reporting.
These are not theoretical frameworks. Each walkthrough follows a realistic engagement scenario and provides the exact testing sequences, tools, and deliverable templates you would use on a real engagement.
How These Walkthroughs Are Structured
Each engagement walkthrough follows the same five-phase structure:
Scoping and Pre-Engagement
Define the target system, agree on rules of engagement, identify stakeholders, set up the testing environment, and establish communication protocols.
Reconnaissance
Map the target system's architecture, identify components, enumerate attack surfaces, and build a testing plan based on what you discover.
Attack Execution
Execute structured attacks against each identified attack surface. Document every attempt, whether successful or not.
Analysis and Triage
Analyze findings, assess severity, identify root causes, and develop remediation recommendations tailored to the system's architecture.
Reporting and Handoff
Write the engagement report, conduct the readout meeting, and hand off findings with actionable remediation guidance.
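The Attack Execution phase asks you to document every attempt, whether successful or not. One way to do that is an append-only log with one record per attempt. The sketch below is illustrative only: the `AttackAttempt` fields, the function names, and the JSON Lines file layout are assumptions for this example, not a format the walkthroughs prescribe.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone


@dataclass
class AttackAttempt:
    """One attempt against one attack surface (hypothetical schema)."""
    surface: str      # e.g. "system prompt extraction"
    technique: str    # short label for what was tried
    payload: str      # exact input sent to the target
    succeeded: bool   # failed attempts are recorded too
    notes: str = ""
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def append_attempt(log_path: str, attempt: AttackAttempt) -> None:
    """Append one attempt as a JSON line; existing entries are never rewritten."""
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(attempt)) + "\n")


def load_attempts(log_path: str) -> list:
    """Read the log back as a list of dicts, skipping blank lines."""
    with open(log_path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Because failed attempts are logged alongside successful ones, the same file can later feed the Analysis and Triage phase and the raw-data appendix of the report.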
Choosing the Right Walkthrough
| System Type | Walkthrough | Key Attack Surfaces | Typical Duration |
|---|---|---|---|
| Chatbot | Chatbot Engagement | Prompt injection, system prompt extraction, content filter bypass, PII leakage | 1-2 weeks |
| RAG System | RAG Engagement | Document injection, cross-scope retrieval, embedding manipulation, data exfiltration | 2-3 weeks |
| AI Agent | Agent Engagement | Tool abuse, permission escalation, prompt injection through tools, session manipulation | 2-4 weeks |
| AI API | API Engagement | Authentication bypass, rate limit evasion, input validation, output leakage | 1-2 weeks |
| Multi-Model | Multi-Model Engagement | Model-to-model injection, routing bypass, fallback exploitation, inter-model data leakage | 3-4 weeks |
Common Engagement Phases
Regardless of system type, every AI red team engagement includes these activities:
Pre-Engagement Essentials
Written authorization. Obtain signed authorization that explicitly covers AI-specific testing activities: prompt injection, jailbreaking, data extraction attempts, and tool abuse. Standard penetration testing authorization may not cover these activities.
Scope definition. Define exactly which models, endpoints, data sources, and tools are in scope. AI systems often span multiple services and accounts. Confirm whether model training infrastructure, monitoring systems, and connected data stores are in scope or out of scope.
Testing environment. Whenever possible, test against a staging environment first. AI testing can generate content that enters logs, training data, or feedback loops. Understand the blast radius before testing production.
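The scope definition described above can be captured in a small machine-checkable form, so that a test harness refuses to touch anything that has not been explicitly agreed. This is a minimal sketch under stated assumptions: `EngagementScope`, its `status` method, and the asset names are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EngagementScope:
    """Explicit in-scope and out-of-scope asset lists (hypothetical schema)."""
    in_scope: frozenset
    out_of_scope: frozenset = field(default_factory=frozenset)

    def status(self, asset: str) -> str:
        # Out-of-scope wins if an asset is accidentally listed in both sets.
        if asset in self.out_of_scope:
            return "out_of_scope"
        if asset in self.in_scope:
            return "in_scope"
        # Anything not listed has not been agreed with the client yet.
        return "unconfirmed"


# Illustrative asset names only; a real engagement would list actual
# models, endpoints, data sources, and tools.
scope = EngagementScope(
    in_scope=frozenset({"chat-api.staging", "support-kb-vector-store"}),
    out_of_scope=frozenset({"training-pipeline"}),
)

for asset in ("chat-api.staging", "training-pipeline", "monitoring-stack"):
    print(asset, "->", scope.status(asset))
```

Treating unlisted assets as `unconfirmed` rather than in scope reflects the point above that AI systems often span multiple services: anything not named in the agreement goes back to the client for a decision before testing.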
Reconnaissance Patterns
Once scoping is complete, hands-on work begins with reconnaissance. For AI systems, reconnaissance includes:
- Architecture mapping: Identify the model, hosting platform, orchestration layer, data sources, and tool integrations
- API enumeration: Map all endpoints, authentication mechanisms, and rate limits
- Content policy identification: Determine what content filtering exists and at what thresholds
- Tool inventory: For agent systems, enumerate available tools and their capabilities
- Data source mapping: For RAG systems, identify connected data stores and their access controls
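The reconnaissance items above can be tracked as a simple inventory that reports which areas are still unmapped before attack planning begins. The sketch below is one possible shape, not a prescribed tool: the `ReconInventory` class, its field names, and the system-type keys `"agent"` and `"rag"` are assumptions made for this example.

```python
from dataclasses import dataclass, field

# Areas every engagement maps, plus extras that depend on the system type.
ALWAYS_REQUIRED = ["architecture", "endpoints", "content_filters"]
EXTRA_BY_SYSTEM = {"agent": ["tools"], "rag": ["data_sources"]}


@dataclass
class ReconInventory:
    """What has been discovered so far (hypothetical schema)."""
    architecture: dict = field(default_factory=dict)    # model, hosting, orchestration
    endpoints: list = field(default_factory=list)       # API endpoints, auth, rate limits
    content_filters: list = field(default_factory=list)
    tools: list = field(default_factory=list)           # agent systems
    data_sources: list = field(default_factory=list)    # RAG systems

    def gaps(self, system_type: str) -> list:
        """Return the required areas that are still empty for this system type."""
        required = ALWAYS_REQUIRED + EXTRA_BY_SYSTEM.get(system_type, [])
        return [name for name in required if not getattr(self, name)]


inventory = ReconInventory(
    architecture={"model": "unknown", "hosting": "unknown"},
    endpoints=["/v1/chat"],
)
print(inventory.gaps("agent"))
```

Here the agent engagement still lacks content filter and tool information, so the testing plan is not yet complete.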
Reporting Standards
All engagement walkthroughs in this section produce deliverables that follow the reporting standards from the Report Writing walkthrough:
- Executive summary tailored to the system type and business context
- Finding details with reproduction steps, severity ratings, and system-specific impact analysis
- Remediation recommendations mapped to the specific platform, framework, and architecture
- Appendices with raw testing data, tool configurations, and cost summaries
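A finding record that mirrors the finding details listed above might look like the following. This is a sketch under assumptions: the four-level severity scale, the `Finding` fields, and the helper names are hypothetical, and the Report Writing walkthrough may define a different scale and template.

```python
from dataclasses import dataclass, field

# Assumed severity scale, most severe first; adjust to the engagement's own scale.
SEVERITY_ORDER = ["critical", "high", "medium", "low"]


@dataclass
class Finding:
    """One reportable finding (hypothetical schema)."""
    title: str
    severity: str
    reproduction_steps: list = field(default_factory=list)
    impact: str = ""
    remediation: str = ""

    def __post_init__(self) -> None:
        if self.severity not in SEVERITY_ORDER:
            raise ValueError(f"unknown severity: {self.severity!r}")


def missing_fields(finding: Finding) -> list:
    """List the detail fields that are still empty and would block the report."""
    checked = ["reproduction_steps", "impact", "remediation"]
    return [name for name in checked if not getattr(finding, name)]


def sort_for_report(findings: list) -> list:
    """Order findings from most to least severe, keeping input order within a level."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER.index(f.severity))
```

Running `missing_fields` over every finding before drafting gives a quick check that each one has reproduction steps, impact analysis, and a remediation recommendation.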
Engagement Difficulty Progression
If you are new to AI red teaming, work through the walkthroughs in this order:
- Chatbot Engagement (intermediate): The simplest target architecture. Focuses on core skills: prompt injection, system prompt extraction, and content filter bypass.
- API Engagement (intermediate): Introduces infrastructure-level testing. Combines AI-specific techniques with traditional API security testing.
- RAG Engagement (advanced): Adds data layer complexity. Requires understanding of embedding systems, vector databases, and retrieval pipelines.
- Agent Engagement (advanced): The most complex single-system type. Requires understanding of tool calling, permission models, and multi-step attack chains.
- Multi-Model Engagement (advanced): The most complex overall. Requires understanding of model interactions, routing logic, and cross-model attack propagation.