MITRE ATLAS Walkthrough
MITRE ATLAS tactics, techniques, and procedures for AI systems: how to use ATLAS for red team engagement planning and how to map attacks to ATLAS IDs.
MITRE ATLAS extends the MITRE ATT&CK framework into the AI domain. Where ATT&CK catalogs tactics and techniques against traditional IT systems, ATLAS catalogs tactics and techniques specifically targeting machine learning systems. For AI red teamers, ATLAS provides a structured vocabulary for planning engagements, mapping attacks, and communicating findings.
ATLAS Matrix Structure
Tactics (Adversary Goals)
| Tactic | ATLAS ID | Description | ATT&CK Equivalent |
|---|---|---|---|
| Reconnaissance | AML.TA0002 | Gathering information about the target ML system | TA0043 |
| Resource Development | AML.TA0003 | Preparing tools and infrastructure for the attack | TA0042 |
| Initial Access | AML.TA0004 | Gaining access to the ML system | TA0001 |
| ML Model Access | AML.TA0000 | Obtaining access to interact with or query the model | AI-specific |
| Execution | AML.TA0005 | Running adversarial techniques against the model | TA0002 |
| Persistence | AML.TA0006 | Maintaining access or influence over the ML system | TA0003 |
| Defense Evasion | AML.TA0007 | Avoiding detection of adversarial activity | TA0005 |
| Discovery | AML.TA0008 | Understanding the target ML system's internals | TA0007 |
| Collection | AML.TA0009 | Gathering data from the ML system | TA0009 |
| ML Attack Staging | AML.TA0001 | Preparing the specific ML attack components | AI-specific |
| Exfiltration | AML.TA0010 | Extracting data or models from the system | TA0010 |
| Impact | AML.TA0011 | Disrupting or degrading the ML system | TA0040 |
Key Techniques for AI Red Teaming
Reconnaissance Techniques
| Technique | ID | Description | Red Team Application |
|---|---|---|---|
| Search for Victim's Publicly Available Research Materials | AML.T0000 | Identifying published papers, blog posts, and training data sources | Inform data poisoning and extraction attacks |
| Search for Publicly Available Adversarial Vulnerability Analysis | AML.T0001 | Finding known attacks against the target model family, architecture, or framework | Scope the attack surface |
| Active Scanning | AML.T0006 | Probing the ML API for model details | Fingerprint model family, detect safety filters |
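The active scanning row can be made concrete with a short sketch. This is a hypothetical example: the probe-response signatures below are illustrative stand-ins, not a real fingerprint database, and `fingerprint` assumes you have already collected API responses by other means.

```python
# Illustrative active-scanning helper: classify a model family from telltale
# refusal phrasing in collected API responses. The signature table is a
# made-up stand-in for a real fingerprint database.

REFUSAL_SIGNATURES = {
    "family-a": ["I'm sorry, but I can't", "I cannot assist with"],
    "family-b": ["I don't feel comfortable", "I can't help with creating"],
}

def fingerprint(responses: list[str]) -> str:
    """Return the signature family with the most phrase matches, or 'unknown'."""
    scores = {family: 0 for family in REFUSAL_SIGNATURES}
    for text in responses:
        for family, phrases in REFUSAL_SIGNATURES.items():
            if any(p.lower() in text.lower() for p in phrases):
                scores[family] += 1
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"
```

In practice a fingerprinting pass would probe with many prompt types (refusals, formatting quirks, tokenizer artifacts), but the scoring structure stays the same.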
ML Model Access Techniques
| Technique | ID | Description | Red Team Application |
|---|---|---|---|
| ML-Enabled Product or Service | AML.T0047 | Accessing the model through its intended interface | Standard black-box testing entry point |
| Full ML Model Access | AML.T0044 | Obtaining complete model weights and architecture | Enables white-box attacks (GCG, gradient-based) |
| ML Model Inference API Access | AML.T0040 | API access to outputs such as logits, embeddings, or intermediate scores | Enables gray-box attacks |
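Access level determines which attack classes are even testable. A minimal sketch of that dependency, assuming a simplified three-tier split; the attack-class labels are this example's own, not ATLAS terms:

```python
# Map access level (per the table above) to the attack classes it enables.
# Labels are illustrative shorthand, not ATLAS identifiers.

ACCESS_ATTACKS = {
    "black-box": {"prompt_injection", "model_extraction", "membership_inference"},
    "gray-box": {"prompt_injection", "model_extraction", "membership_inference",
                 "logit_guided_adversarial_examples"},
    "white-box": {"prompt_injection", "model_extraction", "membership_inference",
                  "logit_guided_adversarial_examples", "gradient_attacks_gcg"},
}

def attacks_for(access_level: str) -> set[str]:
    """Attack classes testable at a given level; empty set if unrecognized."""
    return ACCESS_ATTACKS.get(access_level, set())
```

The point of encoding this is scoping: if the engagement only grants black-box access, gradient-based techniques drop out of the plan automatically.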
Attack Techniques
| Technique | ID | Description | Priority |
|---|---|---|---|
| LLM Prompt Injection | AML.T0051 | Manipulating model through crafted inputs | Critical |
| Craft Adversarial Data | AML.T0043 | Inputs designed to cause misclassification | High |
| Poison Training Data | AML.T0020 | Corrupting training or fine-tuning data | High |
| Extract ML Model | AML.T0024.002 | Stealing the model through query access | High |
| Invert ML Model | AML.T0024.001 | Recovering training data from model outputs | Medium |
| Infer Training Data Membership | AML.T0024.000 | Determining if specific data was in training set | Medium |
| Backdoor ML Model | AML.T0018 | Inserting hidden behaviors via training manipulation | High |
Using ATLAS for Engagement Planning
ATLAS provides a structured approach to planning AI red team engagements. Use it to ensure comprehensive coverage and communicate the attack plan to stakeholders.
Map the target system to ATLAS
Identify which ATLAS tactics are relevant based on the target system's architecture. An LLM chatbot has a different ATLAS surface than a computer vision system.
Select techniques per tactic
For each relevant tactic, identify the specific techniques you plan to test. Prioritize based on the system's attack surface and known vulnerabilities.
Define procedures for each technique
Document the specific tools, payloads, and methods you will use for each technique. This becomes your test plan.
Map to rules of engagement
Cross-reference your technique list against the authorized scope. Mark any techniques that require explicit authorization or are out of scope.
Track coverage during execution
Use the ATLAS matrix as a coverage tracker during the engagement. Mark techniques as tested, not tested, or not applicable.
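Steps 2 through 4 can be sketched as a small data structure. This is a minimal example, assuming a per-technique `in_scope` flag set from the rules of engagement; the field names are this sketch's own:

```python
# Sketch of an engagement plan entry: selected techniques per tactic,
# cross-checked against the authorized scope (steps 2-4 above).

from dataclasses import dataclass

@dataclass
class PlannedTechnique:
    tactic: str
    technique_id: str   # ATLAS ID
    name: str
    in_scope: bool      # set from the rules of engagement

def requires_authorization(plan: list[PlannedTechnique]) -> list[str]:
    """Return IDs of planned techniques not covered by the current RoE."""
    return [t.technique_id for t in plan if not t.in_scope]

plan = [
    PlannedTechnique("Execution", "AML.T0051", "Prompt Injection", True),
    PlannedTechnique("ML Attack Staging", "AML.T0020", "Data Poisoning", False),
]
```

Running `requires_authorization(plan)` surfaces the techniques that need explicit sign-off before testing begins, which doubles as the agenda for the scope discussion with stakeholders.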
Example: LLM Chatbot Engagement Plan
| Tactic | Selected Techniques | Priority |
|---|---|---|
| Reconnaissance | AML.T0001 (Search for Publicly Available Adversarial Vulnerability Analysis), AML.T0006 (Active Scanning) | Pre-engagement |
| ML Model Access | AML.T0047 (ML-Enabled Product or Service) | Entry point |
| Execution | AML.T0051 (Prompt Injection) | Critical |
| Discovery | System prompt extraction, capability enumeration | High |
| Collection | Training data probing, PII extraction | High |
| Exfiltration | AML.T0024.002 (Extract ML Model -- if in scope) | Medium |
| Impact | Safety filter bypass, harmful content generation | Critical |
| Defense Evasion | Encoding tricks, multi-language attacks | High |
ATLAS vs. ATT&CK: When to Use Each
| Scenario | Use ATLAS | Use ATT&CK | Use Both |
|---|---|---|---|
| Testing an LLM chatbot | Primary framework | N/A | If testing underlying infrastructure |
| Testing AI-powered malware detection | Primary for evasion testing | For payload delivery | Full engagement |
| Testing autonomous agents | Primary for AI-specific attacks | For traditional infrastructure attacks | Always -- agents bridge AI and IT |
| Infrastructure security of ML platform | Supporting framework | Primary framework | Combined coverage |
Combined ATLAS + ATT&CK Kill Chain Example
When testing an AI-powered application end-to-end, the attack path often transitions between ATT&CK (infrastructure) and ATLAS (AI-specific) phases:
Phase 1: Infrastructure Reconnaissance (ATT&CK)
TA0043 Reconnaissance -> T1592 Gather Victim Host Information
- Identify the ML platform (SageMaker, Vertex AI, Azure ML)
- Discover API endpoints, model hosting infrastructure
Phase 2: AI-Specific Reconnaissance (ATLAS)
AML.TA0002 Reconnaissance -> AML.T0006 Active Scanning
- Probe model API for version info, framework fingerprinting
- Test response patterns to identify model family
Phase 3: Initial Access (ATT&CK)
TA0001 Initial Access -> T1078 Valid Accounts
- Use compromised cloud credentials to access ML platform
- Access notebook instances or training pipelines
Phase 4: ML-Specific Execution (ATLAS)
AML.TA0005 Execution -> AML.T0051 LLM Prompt Injection
- Execute prompt injection against hosted LLM
- Test for system prompt extraction, safety bypass
Phase 5: Data Exfiltration (Both)
ATT&CK TA0010 -> S3/GCS bucket access (infrastructure data)
ATLAS AML.TA0010 -> Model extraction via query API (AI-specific)

ATLAS Coverage Tracker Template
Use this matrix as a living document during engagements to track which techniques have been tested, which are not applicable, and which are blocked by the rules of engagement:
| Tactic | Technique | Status | Notes |
|---|---|---|---|
| Reconnaissance | AML.T0000 Search for Victim's Publicly Available Research Materials | [ ] Tested / [ ] N/A / [ ] Blocked | |
| Reconnaissance | AML.T0001 Search for Publicly Available Adversarial Vulnerability Analysis | [ ] Tested / [ ] N/A / [ ] Blocked | |
| Reconnaissance | AML.T0006 Active Scanning | [ ] Tested / [ ] N/A / [ ] Blocked | |
| ML Model Access | AML.T0047 ML-Enabled Product or Service | [ ] Tested / [ ] N/A / [ ] Blocked | |
| ML Model Access | AML.T0044 Full ML Model Access | [ ] Tested / [ ] N/A / [ ] Blocked | |
| Execution | AML.T0051 LLM Prompt Injection | [ ] Tested / [ ] N/A / [ ] Blocked | |
| Execution | AML.T0043 Craft Adversarial Data | [ ] Tested / [ ] N/A / [ ] Blocked | |
| ML Attack Staging | AML.T0020 Poison Training Data | [ ] Tested / [ ] N/A / [ ] Blocked | |
| ML Attack Staging | AML.T0018 Backdoor ML Model | [ ] Tested / [ ] N/A / [ ] Blocked | |
| Exfiltration | AML.T0024.002 Extract ML Model | [ ] Tested / [ ] N/A / [ ] Blocked | |
| Exfiltration | AML.T0024.001 Invert ML Model | [ ] Tested / [ ] N/A / [ ] Blocked | |
| Exfiltration | AML.T0024.000 Infer Training Data Membership | [ ] Tested / [ ] N/A / [ ] Blocked | |
| Impact | AML.T0029 Denial of ML Service | [ ] Tested / [ ] N/A / [ ] Blocked | |
This tracker serves as both an engagement planning tool and audit evidence. At the end of the engagement, the completed tracker shows testing coverage and justifies scope decisions.
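The same tracker can live in code rather than a table; a minimal illustrative sketch, with status labels mirroring the checkboxes above:

```python
# Illustrative coverage tracker: statuses mirror the Tested / N/A / Blocked
# checkboxes in the matrix above, plus a default "not tested" state.

TESTED, NOT_TESTED, NOT_APPLICABLE, BLOCKED = "tested", "not_tested", "n/a", "blocked"

class CoverageTracker:
    def __init__(self, technique_ids):
        self.status = {tid: NOT_TESTED for tid in technique_ids}

    def mark(self, technique_id, status):
        if technique_id not in self.status:
            raise KeyError(f"{technique_id} is not in the engagement plan")
        self.status[technique_id] = status

    def coverage(self):
        """Fraction of in-scope techniques (not N/A or Blocked) that were tested."""
        in_scope = [s for s in self.status.values()
                    if s not in (NOT_APPLICABLE, BLOCKED)]
        return sum(s == TESTED for s in in_scope) / len(in_scope) if in_scope else 0.0
```

A coverage number computed this way excludes blocked and not-applicable techniques from the denominator, so it reflects coverage of what was actually authorized and relevant.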
ATLAS Case Studies
ATLAS includes documented case studies of real-world AI attacks. These provide concrete examples of how techniques are combined in practice and serve as precedents when justifying test scope to stakeholders.
Notable Case Studies
| Case Study | Techniques Used | Key Lesson |
|---|---|---|
| Evasion of ML-based malware classifier | Adversarial examples (AML.T0043) | Small perturbations can defeat production classifiers |
| ChatGPT data extraction | Prompt injection (AML.T0051), model inversion | Divergence attacks can extract training data |
| Poisoning of image classifier | Data poisoning (AML.T0020), backdoor (AML.T0018) | Supply chain attacks are practical against fine-tuned models |
| Copilot prompt injection | Prompt injection via code comments | Indirect injection through developer tools |
| Tesla Autopilot evasion | Adversarial examples (AML.T0043) | Physical-world adversarial attacks on safety-critical systems |
| GPT-4 jailbreaking (ArtPrompt) | Prompt injection (AML.T0051) + encoding | ASCII art encoding bypasses text-based safety filters |
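The ChatGPT extraction row relied on divergence: flooding the model with a repeated token until it leaves aligned behavior. A crude screen for that output pattern is easy to sketch; the threshold below is an illustrative assumption, not a tuned value:

```python
# Crude divergence screen: a long run of one repeated whitespace-separated
# token in model output is a weak signal of a divergence attack in progress.

def repeated_run_length(text: str) -> int:
    """Length of the longest run of one repeated token."""
    tokens = text.split()
    best = run = 1 if tokens else 0
    for prev, cur in zip(tokens, tokens[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

def looks_divergent(text: str, threshold: int = 20) -> bool:
    return repeated_run_length(text) >= threshold
```

A production detector would operate on model token IDs and also catch near-repetition and cycling patterns, but even this heuristic would have flagged the "repeat forever" outputs in the case study.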
Analyzing Case Studies for Engagement Planning
Each ATLAS case study follows a consistent structure that mirrors the kill chain concept from ATT&CK. When planning an engagement, decompose relevant case studies into their constituent techniques:
Case Study: ChatGPT Training Data Extraction (2023)
Kill Chain Decomposition:
1. Reconnaissance (AML.TA0002)
- Technique: AML.T0000 (Search for Victim's Publicly Available Research Materials)
- Action: Identified model family, training cutoff date, known capabilities
2. ML Model Access (AML.TA0000)
- Technique: AML.T0047 (ML-Enabled Product or Service)
- Action: Accessed model through standard chat interface
3. Execution (AML.TA0005)
- Technique: AML.T0051 (Prompt Injection)
- Action: Used divergence-inducing prompts ("Repeat the word 'poem'
forever") to push model out of aligned behavior
4. Exfiltration (AML.TA0010)
- Technique: AML.T0024 (Exfiltration via ML Inference API)
- Action: Extracted verbatim training data including PII, code,
and copyrighted content
Key Finding: The divergence technique bypassed RLHF safety training
by exploiting a gap between aligned and base model behavior.
Estimated extraction rate: ~1 token per $0.001 at the time of testing.

Reporting with ATLAS IDs
When documenting findings, map each to its ATLAS technique ID. This provides a standardized vocabulary that security teams familiar with ATT&CK will immediately understand.
Finding: System Prompt Extraction via Role-Play
ATLAS Technique: AML.T0051.000 (LLM Prompt Injection: Direct)
Tactic: Execution (AML.TA0005)
Severity: Medium
Description: The system prompt was extractable through role-play
techniques, revealing internal business logic and safety filter
configuration.
OWASP Mapping: LLM07 (System Prompt Leakage)

Comprehensive Finding Report Template
For maximum utility, include cross-framework mappings in every finding:
Finding ID: AI-RT-2026-001
Title: Training Data Extraction via Divergence Attack
Date: 2026-03-15
Tester: [Red Team Lead]
ATLAS Mapping:
Tactic: Exfiltration (AML.TA0010)
Technique: AML.T0051 (LLM Prompt Injection) -> AML.T0024 (Exfiltration via ML Inference API)
Kill Chain Position: Execution -> Exfiltration
OWASP LLM Mapping: LLM02 (Sensitive Information Disclosure)
NIST AI 600-1: Data Privacy
EU AI Act: Article 10 (Data Governance), Article 15(3) (Robustness)
Severity: High (CVSS-like: AV:N/AC:L/PR:N/UI:N)
Description: Using a divergence-inducing prompt pattern, the model
returned verbatim training data including 3 email addresses and
2 phone numbers identifiable as real PII.
Evidence: [Screenshots, prompt logs, extracted data samples]
Impact: Privacy violation affecting data subjects in training data.
Potential GDPR Article 33 notification requirement. Demonstrates
that the model memorized and can reproduce personal data.
Remediation:
- Implement output filtering for structured PII patterns
- Evaluate differential privacy techniques for future training
- Add divergence detection to the inference pipeline
- Test remediation effectiveness via retest
Risk Management (NIST AI RMF):
- Map: Update risk register to include training data memorization
- Measure: Add PII extraction testing to regular assessment cadence
- Manage: Implement output filters, monitor for extraction patterns

ATLAS Gaps and Custom Techniques
ATLAS is a living knowledge base, but it does not cover every attack relevant to modern AI systems. When you discover techniques not cataloged in ATLAS, document them using the ATLAS format for consistency, and consider contributing them to the ATLAS community.
Common Gaps in Current ATLAS Coverage
| Attack Area | Current ATLAS Coverage | What Is Missing |
|---|---|---|
| Multi-agent exploitation | Limited | Agent-to-agent injection, delegation chain manipulation |
| Tool use attacks | AML.T0051.001 (indirect injection) | Specific tool abuse patterns, function calling manipulation |
| RAG-specific attacks | Limited | Embedding poisoning, chunk boundary attacks, retrieval manipulation |
| Multi-modal attacks | AML.T0043 (adversarial examples) | Cross-modal injection (image-to-text, audio-to-text) |
| Fine-tuning attacks | AML.T0020 (data poisoning) | RLHF preference manipulation, DPO poisoning |
| Inference-time attacks | AML.T0051 | Speculative decoding manipulation, KV-cache poisoning |
Custom Technique Documentation Format
When documenting novel techniques, follow this ATLAS-compatible format:
Technique: [Descriptive Name]
Proposed ID: AML.T0051.XXX (sub-technique) or AML.TXXXX (new technique)
Tactic: [Most relevant ATLAS tactic]
Platform: [LLM / Vision / Multi-modal / Agent]
Description: [What the technique does and why it works]
Procedure Example: [Step-by-step execution]
Detection: [How defenders can identify this technique]
Mitigations: [How to prevent or reduce impact]
References: [Published research, blog posts, or CVEs]

Related Topics
- AI Security Frameworks Overview -- how ATLAS fits in the framework landscape
- OWASP LLM Top 10 Deep Dive -- complementary vulnerability taxonomy
- Cross-Framework Mapping Reference -- ATLAS to OWASP to NIST mappings
- Target Profiling -- reconnaissance techniques that map to ATLAS Reconnaissance (AML.TA0002)
References
- "MITRE ATLAS: Adversarial Threat Landscape for AI Systems" - MITRE Corporation (2024) - Official knowledge base of adversarial tactics, techniques, and case studies targeting AI systems
- "MITRE ATT&CK Framework" - MITRE Corporation (2024) - The parent framework on which ATLAS is modeled, providing the tactical hierarchy structure
- "Case Studies in Adversarial Machine Learning" - MITRE Corporation (2024) - Real-world case studies documenting adversarial attacks on AI systems cataloged in ATLAS
- "Adversarial Machine Learning: A Taxonomy and Terminology" - NIST AI 100-2e2023 (2024) - NIST taxonomy of adversarial ML attacks that complements ATLAS categorization
What is the primary structural difference between MITRE ATLAS and the OWASP LLM Top 10?