Using MITRE ATLAS for AI Attack Mapping
A walkthrough for mapping AI red team activities and findings to the MITRE ATLAS framework, covering tactic and technique identification, attack chain construction, and ATLAS Navigator visualization.
MITRE ATLAS (Adversarial Threat Landscape for AI Systems) extends the MITRE ATT&CK framework to cover adversarial attacks against machine learning systems. While OWASP LLM Top 10 classifies vulnerability types, ATLAS maps the attacker's journey: the sequence of tactics and techniques used from initial reconnaissance through impact. This walkthrough shows how to use ATLAS to document attack paths, identify gaps in red team coverage, and communicate findings in the language that security operations teams already understand from ATT&CK.
ATLAS complements OWASP -- it does not replace it. Use OWASP for vulnerability classification and ATLAS for attack path documentation.
Step 1: Understand ATLAS Structure
ATLAS Tactics (Attack Phases)
ATLAS follows a similar tactic structure to ATT&CK, adapted for ML/AI systems:
| Tactic ID | Tactic | Description | AI Red Team Relevance |
|---|---|---|---|
| AML.TA0000 | Reconnaissance | Gathering information about the AI system | Model identification, capability enumeration |
| AML.TA0001 | Resource Development | Preparing tools and resources for the attack | Custom prompt libraries, attack scripts |
| AML.TA0002 | Initial Access | Gaining initial access to the AI system | API access, user account creation |
| AML.TA0003 | ML Model Access | Obtaining access to interact with the model | API authentication, direct model queries |
| AML.TA0004 | Execution | Running adversarial techniques against the model | Prompt injection, jailbreaking |
| AML.TA0005 | Persistence | Maintaining adversarial access or influence | Conversation context manipulation, RAG poisoning |
| AML.TA0006 | Defense Evasion | Avoiding detection by security controls | Encoding attacks, obfuscation, multi-turn evasion |
| AML.TA0007 | Discovery | Learning about the AI system's internals | System prompt extraction, capability enumeration |
| AML.TA0008 | Collection | Gathering data from the AI system | Training data extraction, RAG exfiltration |
| AML.TA0009 | Exfiltration | Extracting data from the system | Data encoding in outputs, side-channel extraction |
| AML.TA0010 | ML Attack Staging | Preparing ML-specific attacks | Adversarial example crafting, model probing |
| AML.TA0011 | Impact | Achieving the attacker's end goal | Safety bypass, misinformation, service disruption |
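The tactic ordering above is useful for sanity-checking mapped attack chains. A minimal sketch: a chain whose phases jump backwards through the tactic list often signals a technique mapped to the wrong tactic. Treat this as a rough heuristic only, since real intrusions can loop back through phases; a backwards jump is a prompt to double-check the mapping, not a hard error.

```python
# Tactic IDs in the order listed in the table above (AML.TA0000..AML.TA0011).
TACTIC_ORDER = [f"AML.TA{n:04d}" for n in range(12)]

def chain_is_ordered(tactic_ids: list[str]) -> bool:
    """Heuristic check: does the chain move forward through the tactic phases?

    Real attacks can revisit earlier phases, so a False result means
    "review this mapping", not "this mapping is wrong".
    """
    indices = [TACTIC_ORDER.index(t) for t in tactic_ids]
    return all(a <= b for a, b in zip(indices, indices[1:]))

# F-001 phases: Recon -> ML Model Access -> Execution -> Defense Evasion -> Impact
assert chain_is_ordered(["AML.TA0000", "AML.TA0003", "AML.TA0004",
                         "AML.TA0006", "AML.TA0011"])
```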
Key ATLAS Techniques for LLM Red Teaming
| Technique ID | Technique | Typical Red Team Application |
|---|---|---|
| AML.T0000 | ML Model Access via API | Accessing the AI system through its API |
| AML.T0015 | Prompt Injection (Direct) | Injecting adversarial instructions via user input |
| AML.T0016 | Prompt Injection (Indirect) | Injecting instructions via external data sources |
| AML.T0043 | LLM Jailbreak | Bypassing model safety alignment |
| AML.T0044 | System Prompt Extraction | Revealing system-level instructions |
| AML.T0045 | Full Prompt Extraction | Extracting the complete prompt context |
| AML.T0046 | Training Data Extraction | Recovering training data from model outputs |
| AML.T0047 | LLM Plugin Compromise | Exploiting model tool use capabilities |
| AML.T0048 | Model Manipulation via Fine-tuning | Using fine-tuning access for model compromise |
| AML.T0049 | RAG Data Poisoning | Injecting adversarial content into knowledge bases |
| AML.T0040 | ML Model Inference API Access | Using inference APIs for systematic probing |
| AML.T0042 | Verify ML Model | Confirming model identity and capabilities |
| AML.T0024 | Exfiltration via ML Inference API | Extracting data through model responses |
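For report tooling, the technique list above can be kept as a lookup table. This is a sketch using the IDs and names as listed in this walkthrough; ATLAS is versioned and IDs can change between releases, so verify each entry against the ATLAS version you are mapping to before finalizing anything.

```python
# Technique catalog as used in this walkthrough -- verify against your
# ATLAS version before use, since IDs may change between releases.
ATLAS_TECHNIQUES = {
    "AML.T0000": "ML Model Access via API",
    "AML.T0015": "Prompt Injection (Direct)",
    "AML.T0016": "Prompt Injection (Indirect)",
    "AML.T0024": "Exfiltration via ML Inference API",
    "AML.T0040": "ML Model Inference API Access",
    "AML.T0042": "Verify ML Model",
    "AML.T0043": "LLM Jailbreak",
    "AML.T0044": "System Prompt Extraction",
    "AML.T0045": "Full Prompt Extraction",
    "AML.T0046": "Training Data Extraction",
    "AML.T0047": "LLM Plugin Compromise",
    "AML.T0048": "Model Manipulation via Fine-tuning",
    "AML.T0049": "RAG Data Poisoning",
}

def technique_name(technique_id: str) -> str:
    """Resolve an ID to its name, failing loudly on ATT&CK-style IDs."""
    if not technique_id.startswith("AML.T"):
        raise ValueError(f"{technique_id} is not an ATLAS technique ID "
                         "(possible ATT&CK mix-up)")
    return ATLAS_TECHNIQUES[technique_id]
```

Failing loudly on non-`AML.` IDs guards against the ATT&CK/ATLAS mix-up described in the mistakes section at the end of this walkthrough.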
Step 2: Map Findings to ATLAS Techniques
Mapping Template
For each finding, identify the ATLAS techniques used in the attack chain:
# ATLAS Technique Mapping
## Finding F-001: Direct Prompt Injection Bypass
### Techniques Used
| Phase | Technique ID | Technique Name | Application |
|-------|-------------|----------------|-------------|
| Reconnaissance | AML.T0042 | Verify ML Model | Identified model type via behavioral analysis |
| ML Model Access | AML.T0000 | ML Model Access via API | Authenticated API access obtained |
| Execution | AML.T0015 | Prompt Injection (Direct) | DAN persona injection payload delivered |
| Defense Evasion | AML.T0043 | LLM Jailbreak | Persona framing evaded content filter |
| Impact | - | Content Safety Bypass | Content safety controls bypassed |
### Attack Chain
AML.T0042 → AML.T0000 → AML.T0015 → AML.T0043 → Impact
## Finding F-003: RAG Cross-Tenant Data Access
### Techniques Used
| Phase | Technique ID | Technique Name | Application |
|-------|-------------|----------------|-------------|
| Reconnaissance | AML.T0042 | Verify ML Model | Confirmed RAG capability via probing |
| ML Model Access | AML.T0000 | ML Model Access via API | Standard user API access |
| Discovery | AML.T0007 | Discover ML Model Family | Identified embedding model behavior |
| Execution | AML.T0015 | Prompt Injection (Direct) | Manipulated retrieval query via prompt |
| Collection | AML.T0024 | Exfiltration via ML Inference API | Cross-tenant documents retrieved |
| Impact | - | Data Breach | Confidential tenant data exposed |
### Attack Chain
AML.T0042 → AML.T0000 → AML.T0007 → AML.T0015 → AML.T0024 → Impact
Step 3: Construct Attack Chain Visualizations
Text-Based Attack Chain Notation
F-001: Prompt Injection → Content Safety Bypass
Reconnaissance ML Model Access Execution Defense Evasion Impact
┌─────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────┐
│ AML.T0042 │──▶│ AML.T0000 │──▶│ AML.T0015 │──▶│ AML.T0043 │──▶│ Safety │
│ Verify │ │ API Access │ │ Direct │ │ Jailbreak │ │ Bypass │
│ ML Model │ │ │ │ Injection │ │ │ │ │
└─────────────┘ └──────────────┘ └──────────────┘ └──────────────┘ └──────────┘
F-004: System Prompt Extraction
Reconnaissance ML Model Access Execution Collection Impact
┌─────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────┐
│ AML.T0042 │──▶│ AML.T0000 │──▶│ AML.T0015 │──▶│ AML.T0044 │──▶│ Config │
│ Verify │ │ API Access │ │ Direct │ │ System Prompt│ │ Exposure │
│ ML Model │ │ │ │ Injection │ │ Extraction │ │ │
└─────────────┘ └──────────────┘ └──────────────┘ └──────────────┘ └──────────┘
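The box diagrams above are hand-drawn, but the compact arrow notation can be produced programmatically from the same technique lists used in the Step 2 mapping tables. A minimal sketch (the function name and input shape are illustrative, not part of ATLAS):

```python
def render_chain(finding_id: str, title: str,
                 steps: list[tuple[str, str]]) -> str:
    """Render a one-line attack chain in arrow notation.

    steps is an ordered list of (technique_id, short_label) pairs; labels
    are free text, so an impact like "Safety Bypass" fits as the last step.
    """
    chain = " -> ".join(f"{tid} ({label})" for tid, label in steps)
    return f"{finding_id}: {title}\n  {chain}"

print(render_chain(
    "F-001",
    "Prompt Injection -> Content Safety Bypass",
    [("AML.T0042", "Verify ML Model"),
     ("AML.T0000", "API Access"),
     ("AML.T0015", "Direct Injection"),
     ("AML.T0043", "Jailbreak")],
))
```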
ATLAS Navigator Layer Configuration
Generate a Navigator layer file that highlights the techniques observed during the engagement:
{
"name": "AI Red Team Assessment - [Client Name]",
"versions": {
"atlas": "4.5",
"navigator": "4.9"
},
"domain": "atlas",
"description": "Techniques observed during AI red team assessment",
"techniques": [
{
"techniqueID": "AML.T0042",
"tactic": "reconnaissance",
"color": "#66b1ff",
"comment": "Used in all findings",
"score": 3
},
{
"techniqueID": "AML.T0000",
"tactic": "ml-model-access",
"color": "#66b1ff",
"comment": "API access for all testing",
"score": 3
},
{
"techniqueID": "AML.T0015",
"tactic": "execution",
"color": "#ff6666",
"comment": "F-001, F-003: Direct prompt injection successful",
"score": 5
},
{
"techniqueID": "AML.T0043",
"tactic": "defense-evasion",
"color": "#ff6666",
"comment": "F-001: Jailbreak via persona injection",
"score": 5
},
{
"techniqueID": "AML.T0044",
"tactic": "collection",
"color": "#ff9933",
"comment": "F-004: System prompt partially extracted",
"score": 4
},
{
"techniqueID": "AML.T0024",
"tactic": "exfiltration",
"color": "#ff6666",
"comment": "F-003: Cross-tenant data extracted via RAG",
"score": 5
}
],
"gradient": {
"colors": ["#ffffff", "#66b1ff", "#ff9933", "#ff6666"],
"minValue": 0,
"maxValue": 5
},
"legendItems": [
{"label": "Not tested", "color": "#ffffff"},
{"label": "Tested, no finding", "color": "#66b1ff"},
{"label": "Finding (Medium/Low)", "color": "#ff9933"},
{"label": "Finding (Critical/High)", "color": "#ff6666"}
]
}
Step 4: Identify Coverage Gaps Using ATLAS
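Before gap analysis, it helps to hold the engagement's observations as structured data; the Step 3 Navigator layer can then be generated from that same data rather than edited by hand. A minimal sketch, assuming score semantics that match the legend (3 = tested with no finding, 4 = medium/low finding, 5 = critical/high finding); `build_layer` and the observation shape are illustrative:

```python
import json

# Colors mirror the gradient/legend in the Step 3 layer example.
SCORE_COLORS = {0: "#ffffff", 3: "#66b1ff", 4: "#ff9933", 5: "#ff6666"}

def build_layer(name: str, observations: list[dict]) -> dict:
    """observations: dicts with techniqueID, tactic, score, and comment."""
    return {
        "name": name,
        "versions": {"atlas": "4.5", "navigator": "4.9"},
        "domain": "atlas",
        "description": "Techniques observed during AI red team assessment",
        "techniques": [
            {**obs, "color": SCORE_COLORS[obs["score"]]} for obs in observations
        ],
        "gradient": {
            "colors": ["#ffffff", "#66b1ff", "#ff9933", "#ff6666"],
            "minValue": 0,
            "maxValue": 5,
        },
    }

layer = build_layer("AI Red Team Assessment", [
    {"techniqueID": "AML.T0015", "tactic": "execution",
     "comment": "F-001, F-003: Direct prompt injection successful", "score": 5},
    {"techniqueID": "AML.T0042", "tactic": "reconnaissance",
     "comment": "Used in all findings", "score": 3},
])
print(json.dumps(layer, indent=2))
```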
Use the ATLAS technique matrix to identify attack techniques you did not test.
Coverage Gap Analysis
# ATLAS Coverage Gap Analysis
## Techniques Tested
| Technique ID | Technique | Result |
|-------------|-----------|--------|
| AML.T0000 | ML Model Access via API | Tested (access obtained) |
| AML.T0015 | Prompt Injection (Direct) | Finding F-001 |
| AML.T0042 | Verify ML Model | Tested (model identified) |
| AML.T0043 | LLM Jailbreak | Finding F-001 |
| AML.T0044 | System Prompt Extraction | Finding F-004 |
| AML.T0024 | Exfiltration via ML Inference API | Finding F-003 |
## Techniques Not Tested (In Scope but Not Executed)
| Technique ID | Technique | Reason Not Tested |
|-------------|-----------|-------------------|
| AML.T0016 | Prompt Injection (Indirect) | No writable data source accessible |
| AML.T0049 | RAG Data Poisoning | Read-only access to knowledge base |
## Techniques Not Applicable
| Technique ID | Technique | Reason Not Applicable |
|-------------|-----------|----------------------|
| AML.T0048 | Model Manipulation via Fine-tuning | No fine-tuning access in scope |
| AML.T0046 | Training Data Extraction | Third-party hosted model |
Step 5: Generate Report Content
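The Step 4 gap tables reduce to set arithmetic over technique IDs, and automating that before report generation keeps the report consistent with the test log. A minimal sketch using the IDs from this walkthrough (substitute the IDs from your ATLAS version):

```python
# In-scope techniques agreed during scoping vs. techniques actually executed.
in_scope = {
    "AML.T0000", "AML.T0015", "AML.T0016", "AML.T0024",
    "AML.T0042", "AML.T0043", "AML.T0044", "AML.T0049",
}
tested = {
    "AML.T0000", "AML.T0015", "AML.T0024",
    "AML.T0042", "AML.T0043", "AML.T0044",
}

not_tested = sorted(in_scope - tested)           # in scope but not executed
out_of_scope_tested = sorted(tested - in_scope)  # should be empty

print("Not tested:", not_tested)
assert not out_of_scope_tested, "tested techniques must be in scope"
```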
ATLAS Summary for Reports
## MITRE ATLAS Attack Mapping
The following diagram maps all engagement findings to MITRE ATLAS
tactics and techniques, showing the attack paths discovered during
the assessment.
### Attack Path Summary
| Finding | Attack Path (ATLAS Techniques) | Impact |
|---------|-------------------------------|--------|
| F-001 | T0042 → T0000 → T0015 → T0043 | Content safety bypass |
| F-003 | T0042 → T0000 → T0015 → T0024 | Cross-tenant data access |
| F-004 | T0042 → T0000 → T0015 → T0044 | System prompt disclosure |
| F-005 | T0042 → T0000 → T0015 → T0047 | Unauthorized function execution |
### Key Observations
1. **AML.T0015 (Direct Prompt Injection)** is the common entry
technique across all four mapped attack paths. Strengthening defenses
against this single technique would disrupt every attack chain
identified above.
2. **Defense evasion techniques** were required for only one finding
(F-001), indicating that content filtering is the primary defensive
layer encountered and that the other impacts were reachable without
evading any control.
3. **No indirect injection** paths were tested due to scope
limitations. This represents a significant untested attack surface
that should be assessed in a follow-up engagement.
### ATLAS Technique Heat Map
[Include Navigator visualization or reference the exported layer file]
Step 6: Cross-Reference ATLAS with OWASP
For maximum reporting value, cross-reference both frameworks:
# Cross-Framework Mapping
| Finding | ATLAS Primary Technique | OWASP Primary Category | Combined Insight |
|---------|------------------------|----------------------|-----------------|
| F-001 | AML.T0043 (Jailbreak) | LLM01 (Prompt Injection) | ATLAS shows the attack path; OWASP classifies the vulnerability |
| F-003 | AML.T0024 (Exfiltration) | LLM08 (Vector/Embedding) | ATLAS shows exfiltration method; OWASP identifies the weakness type |
| F-004 | AML.T0044 (Prompt Extraction) | LLM07 (System Prompt Leakage) | Both frameworks align; ATLAS provides attack chain context |
| F-005 | AML.T0047 (Plugin Compromise) | LLM06 (Excessive Agency) | ATLAS focuses on technique; OWASP focuses on permission model |
Common ATLAS Mapping Mistakes
- Mapping only the exploitation technique. ATLAS is designed to show the full attack path, not just the exploit. Include reconnaissance, initial access, and defense evasion techniques even if they seem trivial.
- Ignoring the tactic context. The same technique (e.g., AML.T0015 Prompt Injection) can serve different tactical purposes: execution, defense evasion, or collection. Map the technique to the correct tactic for each finding.
- Using ATLAS as a checklist. ATLAS is a knowledge base, not a testing checklist. Not every technique is relevant to every engagement. Focus on techniques applicable to the target architecture.
- Outdated technique IDs. ATLAS is actively maintained and technique IDs may change between versions. Specify the ATLAS version you are mapping to and verify technique IDs before finalizing the report.
- Confusing ATLAS with ATT&CK. While ATLAS follows ATT&CK conventions, the techniques are different. Do not mix ATT&CK technique IDs with ATLAS technique IDs.
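The last two mistakes can be caught mechanically before a report ships. A minimal sketch: flag ATT&CK-style IDs and IDs absent from a catalog of valid IDs for your ATLAS version (the `validate_ids` helper and the assumption that you have such a catalog loaded, e.g. from the official ATLAS data distribution, are illustrative):

```python
import re

# ATLAS IDs look like AML.T0015 (optionally AML.T0015.000 for sub-techniques);
# ATT&CK IDs look like T1059 -- a bare T-number in an ATLAS report is a mix-up.
ATLAS_ID = re.compile(r"^AML\.T\d{4}(\.\d{3})?$")
ATTACK_ID = re.compile(r"^T\d{4}(\.\d{3})?$")

def validate_ids(report_ids, catalog):
    """Return a list of error strings; empty means all IDs check out."""
    errors = []
    for tid in report_ids:
        if ATTACK_ID.match(tid):
            errors.append(f"{tid}: looks like an ATT&CK ID, not ATLAS")
        elif not ATLAS_ID.match(tid):
            errors.append(f"{tid}: not a valid ATLAS technique ID format")
        elif tid not in catalog:
            errors.append(f"{tid}: not in this ATLAS version's catalog")
    return errors

errs = validate_ids(["AML.T0015", "T1059", "AML.T9999"],
                    catalog={"AML.T0015", "AML.T0043"})
```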
Related Topics
- Mapping to OWASP LLM Top 10 -- Complementary vulnerability classification
- AI Threat Model Creation -- Threat models that inform ATLAS mapping
- Attack Surface Mapping -- Attack surface analysis underlying technique identification
- NIST AI RMF Assessment -- Risk management framework that references ATLAS