Full Engagement Methodology
A comprehensive methodology for conducting full AI red teaming engagements, integrating all techniques from previous sections into a structured professional assessment.
This capstone section brings together everything from the previous seven sections into a cohesive methodology for conducting professional AI red teaming engagements. A full engagement is not just a collection of individual attacks — it is a structured assessment that systematically evaluates an AI system's security posture.
Engagement Phases
A professional AI red teaming engagement follows six phases:
Phase 1: Planning & Scoping
↓
Phase 2: Reconnaissance
↓
Phase 3: Vulnerability Discovery
↓
Phase 4: Exploitation & Validation
↓
Phase 5: Analysis & Impact Assessment
↓
Phase 6: Reporting & Remediation
Detailed coverage of the phases is split across two companion guides:
- Planning & Scoping — Defining scope, rules of engagement, success criteria, and methodology
- Execution & Reporting — Running the assessment, documenting findings, and delivering results
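The six-phase sequence above can be modeled as a simple state machine that refuses to skip steps. This is a minimal illustrative sketch, not prescribed tooling; the class and method names are hypothetical:

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Phase(Enum):
    """The six engagement phases, in execution order."""
    PLANNING_AND_SCOPING = auto()
    RECONNAISSANCE = auto()
    VULNERABILITY_DISCOVERY = auto()
    EXPLOITATION_AND_VALIDATION = auto()
    ANALYSIS_AND_IMPACT = auto()
    REPORTING_AND_REMEDIATION = auto()


@dataclass
class Engagement:
    """Tracks progress through the phases; phases advance one at a time."""
    name: str
    current: Phase = Phase.PLANNING_AND_SCOPING
    completed: list = field(default_factory=list)

    def advance(self) -> Phase:
        """Move to the next phase, recording the one just finished."""
        phases = list(Phase)
        idx = phases.index(self.current)
        if idx == len(phases) - 1:
            raise RuntimeError("engagement already complete")
        self.completed.append(self.current)
        self.current = phases[idx + 1]
        return self.current
```

Encoding the order in an enum makes "which phase are we in, and what evidence closed the last one?" an answerable question during the engagement rather than a reconstruction exercise afterward.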
What Makes AI Red Teaming Different
| Aspect | Traditional Pentest | AI Red Team |
|---|---|---|
| Findings | Deterministic (vuln exists or not) | Probabilistic (success rate) |
| Scope | Systems, networks, applications | Models, prompts, data pipelines, tools |
| Tools | Scanners, exploits, scripts | Payloads, fuzzers, classifiers |
| Reporting | CVEs, CVSS scores | Attack taxonomies, success rates, impact chains |
| Remediation | Patches, configuration | Retraining, guardrails, architecture changes |
| Retesting | Binary (fixed/not fixed) | Statistical (rate reduced sufficiently?) |
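Because findings are probabilistic, reporting an observed success rate alone is not enough; you also need to convey how much that rate could move with more trials. One standard way to do this is a Wilson score interval over the trial outcomes. A self-contained sketch (the function name is our own; the formula is the standard Wilson interval):

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple:
    """Wilson score confidence interval for an attack success rate.

    z=1.96 gives a ~95% interval. Unlike a binary pentest finding, an
    AI red team finding is a rate; the interval bounds how far the
    observed rate could drift on repeated trials of the same payload.
    """
    if trials == 0:
        raise ValueError("need at least one trial")
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = (z / denom) * math.sqrt(
        p * (1 - p) / trials + z**2 / (4 * trials**2)
    )
    return (max(0.0, centre - half), min(1.0, centre + half))
```

This also gives retesting a concrete criterion: remediation can be considered effective when the post-fix interval sits below the agreed risk threshold, rather than when a single retry happens to fail.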
The Assessment Matrix
Structure your engagement around an assessment matrix of attack categories and target components:
| | Model | System Prompt | Tools | Data Pipeline | Infrastructure |
|---|---|---|---|---|---|
| Injection | Jailbreak | Override | Abuse | RAG poison | API exploit |
| Extraction | Training data | Prompt leak | Tool enum | Data access | Config leak |
| Evasion | Safety bypass | Filter bypass | Auth bypass | Validation bypass | WAF bypass |
| Denial | Resource exhaustion | Context overflow | Tool flooding | Data corruption | Service DoS |
Each cell represents a test category. Not all cells apply to every engagement, but the matrix ensures comprehensive coverage.
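One way to enforce that coverage in practice is to represent the matrix as data and diff it against the cells actually tested. A hedged sketch, using the matrix's own row and column labels (the function name is our own):

```python
from itertools import product

# Columns and rows of the assessment matrix above.
COMPONENTS = ("model", "system_prompt", "tools", "data_pipeline", "infrastructure")
CATEGORIES = ("injection", "extraction", "evasion", "denial")


def coverage_report(tested: set) -> dict:
    """Compare tested (category, component) cells against the full matrix.

    Returns counts plus the cells still untested, so coverage gaps are
    explicit before the engagement is declared complete.
    """
    all_cells = set(product(CATEGORIES, COMPONENTS))
    unknown = tested - all_cells
    if unknown:
        raise ValueError(f"cells outside the matrix: {sorted(unknown)}")
    return {
        "tested": len(tested),
        "total": len(all_cells),
        "untested": sorted(all_cells - tested),
    }
```

Cells ruled out during scoping can simply be added to the tested set with a "not applicable" note in the report, so the final coverage accounting is still exhaustive.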
Key Deliverables
A professional engagement produces:
- Executive Summary — Non-technical overview of findings and risk
- Technical Report — Detailed findings with payloads, success rates, and evidence
- Attack Surface Map — Complete mapping of the system's components and their security posture
- Remediation Roadmap — Prioritized recommendations with effort estimates
- Regression Test Suite — Automated tests to verify remediation and detect regressions
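The regression test suite deliverable can be as simple as replaying each recorded payload against the remediated system and failing when its success rate exceeds the agreed threshold. A minimal sketch, where `send_payload` and `is_success` are hypothetical stand-ins for the target client and the finding's detection logic:

```python
def attack_success_rate(send_payload, payload: str, is_success,
                        trials: int = 20) -> float:
    """Replay one payload `trials` times and return the observed rate.

    `send_payload` is a hypothetical client for the target system;
    `is_success` classifies a response as a successful attack.
    """
    hits = sum(1 for _ in range(trials) if is_success(send_payload(payload)))
    return hits / trials


def regression_check(send_payload, payload: str, is_success,
                     threshold: float = 0.05) -> bool:
    """True if the payload's success rate is at or below the remediation target."""
    return attack_success_rate(send_payload, payload, is_success) <= threshold
```

Run periodically, the same suite also detects regressions: a payload that was remediated but starts succeeding again after a model update or prompt change will fail the check.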
Getting Started
Begin with Planning & Scoping to learn how to set up an engagement properly, then proceed to Execution & Reporting for the operational methodology.
Related Topics
- Planning & Scoping -- detailed engagement planning methodology
- Execution & Reporting -- running the assessment and delivering results
- Recon & Tradecraft -- the reconnaissance phase that starts every engagement
- Exploit Development -- building the exploits used during engagements
- Full Engagement (Advanced) -- advanced engagement methodology with report writing
References
- NIST, "AI Risk Management Framework" (2023) -- federal AI risk assessment framework
- OWASP, "Top 10 for Large Language Model Applications" (2025) -- industry-standard LLM risk taxonomy
- Anthropic, "Challenges in Red Teaming AI Systems" (2024) -- methodological considerations for AI red teaming
- MITRE, "ATLAS: Adversarial Threat Landscape for AI Systems" (2023) -- comprehensive threat framework for structuring assessments
Why does an AI red teaming report need success rates rather than just binary pass/fail findings? Because model outputs are nondeterministic: the same payload may succeed on some attempts and fail on others, so a single pass/fail observation misstates the risk. A success rate over repeated trials quantifies how reliably an attack works, and retesting can then verify statistically that remediation reduced that rate sufficiently, rather than declaring a finding "fixed" because one retry happened to fail.