Data Exfiltration Incident Response Playbook
Playbook for AI-mediated data exfiltration: identifying exposed data, determining exfiltration scope, data classification, breach notification procedures, and remediation.
This playbook covers incidents where an AI system has disclosed, leaked, or exfiltrated data that should not have been accessible to the requesting user. This includes training data extraction, RAG data leakage across authorization boundaries, system prompt disclosure containing sensitive information, PII disclosure, and agent-mediated data exfiltration through tool calls.
Trigger Criteria
Activate this playbook when:
- Model output contains data classified above the user's authorization level
- Model output contains PII, PHI, financial data, or trade secrets
- System prompt containing sensitive information (API keys, internal URLs, credentials) is disclosed
- RAG system returns documents from another tenant or unauthorized collection
- Agent tool calls retrieve or transmit data outside authorized boundaries
- Training data extraction is confirmed (model reproduces memorized private data)
Immediate Actions (First 30 Minutes)
Acknowledge and assign
Incident ID: AI-IR-[YYYY]-[NNNN]
Type: Data Exfiltration
Detected: [UTC timestamp]
IC: [Name]
AI Investigator: [Name]
Evidence Custodian: [Name]
Legal Contact: [Name] ← Required for data exfiltration incidents

Preserve evidence with chain of custody
Data exfiltration incidents may require evidence for legal proceedings or regulatory response. Apply strict chain of custody from the start:
- Capture complete conversation with all model outputs containing exposed data
- Record hashes of all evidence files
- Log the identity of the person collecting evidence and the collection method
- Store in tamper-evident, access-controlled evidence repository
- Capture the user identity and session metadata of the requester
- Preserve the model version, system prompt, and configuration active at the time
- If RAG is involved, capture the retrieval logs and document index state
- If tools are involved, capture all tool call logs with parameters and results
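The hashing and custody-logging steps above can be sketched as a small helper that fingerprints each evidence file and appends a tamper-evident record. The `record_evidence` helper, log format, and file names are illustrative, not part of any specific forensics tool:

```python
import getpass
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def record_evidence(path: str, method: str, log_file: str = "custody_log.jsonl") -> dict:
    """Hash an evidence file and append a chain-of-custody entry."""
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    entry = {
        "file": path,
        "sha256": digest,
        "collected_by": getpass.getuser(),       # who collected the evidence
        "collection_method": method,             # e.g. "log export", "screenshot"
        "collected_at": datetime.now(timezone.utc).isoformat(),
    }
    # Append-only JSONL keeps a simple, ordered custody trail.
    with open(log_file, "a") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry
```

In practice the custody log itself should live in the access-controlled evidence repository, and its own hash should be recorded periodically.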
Stop the bleeding
Prevent further exfiltration immediately:
| Exfiltration Channel | Containment Action |
|---|---|
| Model output | Add output classifier for the data type; consider blocking the user |
| System prompt in output | Rotate any credentials in the system prompt immediately; update prompt |
| RAG cross-tenant leakage | Disable shared retrieval or add strict tenant filtering |
| Tool call exfiltration | Restrict tool access; block outbound network for the agent |
| Training data extraction | Add output filter for known training data patterns; rate limit |

Initial data classification
Immediately classify the exposed data to determine notification urgency:
| Data Type | Classification | Notification Clock |
|---|---|---|
| PII (names, emails, SSNs, etc.) | Regulated (GDPR/CCPA/HIPAA) | Starts now -- 72h for GDPR |
| PHI (health information) | HIPAA regulated | Starts now -- 60-day deadline |
| Financial data (accounts, cards) | PCI/financial regulations | Starts now |
| Credentials (API keys, passwords) | Internal -- credential rotation needed | Immediate rotation |
| Trade secrets | Internal -- legal review needed | Legal assessment required |
| System configuration (URLs, architecture) | Internal -- security assessment needed | Risk assessment required |
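As a minimal sketch, the classification table can be encoded as a lookup that defaults unknown data types to the most conservative handling. The category keys and the `classify` helper are hypothetical; map them to your own data-classification scheme:

```python
# Illustrative mapping of exposed data type -> (classification, notification clock).
NOTIFICATION_CLOCKS = {
    "pii": ("Regulated (GDPR/CCPA/HIPAA)", "72h to supervisory authority under GDPR"),
    "phi": ("HIPAA regulated", "60 days to affected individuals"),
    "financial": ("PCI/financial regulations", "starts immediately"),
    "credentials": ("Internal -- rotation needed", "rotate immediately"),
    "trade_secret": ("Internal -- legal review needed", "legal assessment required"),
    "system_config": ("Internal -- security assessment needed", "risk assessment required"),
}

def classify(data_type: str) -> tuple:
    """Return (classification, notification clock) for an exposed data type."""
    try:
        return NOTIFICATION_CLOCKS[data_type]
    except KeyError:
        # Unknown types get the most conservative treatment until reviewed.
        return ("Unclassified -- escalate to legal", "treat as regulated until reviewed")
```

The conservative default matters: misclassifying regulated data as internal can silently burn through a notification deadline.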
Investigation (Hours 1-4)
Scope Determination
The critical question is: how much data was exposed and to whom?
Identify all exfiltration instances
Search logs for all instances where the identified data type was exposed:
```sql
-- Search for PII patterns in model outputs
SELECT session_id, user_id, timestamp, output_content, model_version
FROM prompt_completion_logs
WHERE (
    output_content ~ '\b\d{3}-\d{2}-\d{4}\b'                                   -- SSN pattern
    OR output_content ~ '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'   -- Email
    OR output_content ~ '\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b'           -- Credit card
    -- Add patterns specific to your data types
)
AND timestamp > '[incident_window_start]'
AND timestamp < '[incident_window_end]'
ORDER BY timestamp;
```

Determine the data source
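The same patterns are useful outside the database, e.g. when scanning exported transcripts or evidence files. This Python sketch mirrors the SQL regexes; the `scan_output` helper and pattern names are illustrative, and real scanners should use vetted detectors for your specific data types:

```python
import re

# Regexes mirroring the SQL patterns used in the log search; tune to your data.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "credit_card": re.compile(r"\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b"),
}

def scan_output(text: str) -> dict:
    """Return all PII-like matches in a model output, keyed by pattern name."""
    hits = {name: pat.findall(text) for name, pat in PII_PATTERNS.items()}
    # Drop pattern names with no matches so an empty dict means "clean".
    return {name: matches for name, matches in hits.items() if matches}
```

Expect false positives from pattern matching alone (e.g. order numbers resembling card numbers); treat matches as leads for the exposure inventory, not confirmed exposures.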
Identify how the model obtained the exposed data:
| Source | How to Verify | Implication |
|---|---|---|
| Training data memorization | Test with extraction prompts on a fresh model instance | Affects all conversations; model-level issue |
| RAG retrieval | Check retrieval logs for the documents returned | Affects users who can trigger specific retrievals |
| System prompt content | Review the system prompt for embedded sensitive data | Affects all users of this endpoint |
| Tool call results | Check tool call logs for data access | Depends on tool access controls |
| Conversation history | Check for context leakage between users | Architecture-level multi-tenancy issue |

Quantify the exposure
Build a complete exposure inventory:
```markdown
## Exposure Inventory

### Data Types Exposed
- [ ] PII: [types, count of records]
- [ ] PHI: [types, count of records]
- [ ] Financial: [types, count of records]
- [ ] Credentials: [types, count]
- [ ] Trade secrets: [description]
- [ ] System configuration: [description]

### Individuals Affected
- Count: [number]
- Jurisdictions: [list of applicable jurisdictions]
- Data subjects: [customers / employees / third parties]

### Exposure Recipients
- Who received the data: [user IDs, count]
- Were they authorized to access this data: [yes/no]
- Is any recipient a known threat actor: [yes/no/unknown]

### Exposure Duration
- First known exposure: [timestamp]
- Last known exposure: [timestamp]
- Total duration: [duration]
```

Assess downstream propagation
Determine whether the exposed data has propagated beyond the initial disclosure:
- Has the user shared, copied, or forwarded the model's output?
- Was the model's output fed into any downstream system?
- Is the conversation data accessible to other users or systems?
- Has the data appeared on any public forums or paste sites?
Containment and Remediation
Immediate Remediation
| Action | When | Owner |
|---|---|---|
| Rotate exposed credentials | Immediately upon identification | Security team |
| Add output filter for data type | Within 1 hour | AI team |
| Fix RAG access controls | Within 1 hour if RAG is the source | Data engineering |
| Update system prompt | Within 1 hour if prompt contained sensitive data | Application team |
| Block confirmed attacker | Immediately | Security team |
Root Cause Remediation
| Root Cause | Remediation | Timeline |
|---|---|---|
| Training data memorization | Data deduplication, unlearning techniques, output filtering | Days to weeks |
| RAG access control failure | Implement per-user document filtering, tenant isolation | Days |
| Sensitive system prompt | Remove sensitive data from prompt; use backend configuration | Hours |
| Tool access control failure | Implement least-privilege tool access, parameter validation | Days |
| Multi-tenant context leakage | Architecture review, session isolation | Weeks |
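For the RAG access-control remediations above, tenant isolation is most reliable when enforced at the retrieval layer rather than in prompt logic. A minimal sketch, assuming a vector store whose `search()` accepts a metadata filter; the filter syntax, `tenant_id` field, and `retrieve` wrapper are assumptions, not a specific library's API:

```python
def retrieve(index, query_embedding, tenant_id: str, top_k: int = 5):
    """Retrieve documents for a query, restricted to the caller's tenant.

    `index` stands in for any vector store supporting metadata filters.
    """
    results = index.search(
        query_embedding,
        top_k=top_k,
        filter={"tenant_id": tenant_id},  # enforce tenancy at the store
    )
    # Defense in depth: re-check tenancy on the way out in case the
    # store-side filter is missing or misconfigured.
    return [doc for doc in results if doc.metadata.get("tenant_id") == tenant_id]
```

The post-filter check is what turns a store misconfiguration into a silent degradation instead of a cross-tenant leak.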
Notification Decision Framework
Step 1: Is This a Reportable Breach?
| Factor | Assessment |
|---|---|
| Was personal data of identifiable individuals exposed? | If yes, likely reportable |
| Was the data encrypted at the time of exposure? | If encrypted and key not compromised, may reduce notification scope |
| Was the exposure limited to the AI model's internal processing? | If data was in the model's context but not in the output, may not be an exposure |
| Was the recipient authorized to access this data? | If authorized, may not be a breach |
| Has the data been recovered or confirmed deleted? | May reduce but does not eliminate notification obligation |
Step 2: Notification Requirements
| Jurisdiction / Regulation | Threshold | Timeline | Recipient |
|---|---|---|---|
| GDPR | Any unauthorized disclosure of personal data | 72 hours (authority), without undue delay (individuals) | Supervisory authority + affected individuals if high risk |
| CCPA/CPRA | Unauthorized access to unencrypted personal information | Expeditious | California AG + affected individuals |
| HIPAA | Unauthorized disclosure of PHI | 60 days (individuals), annually (HHS if <500), 60 days (HHS if >=500) | HHS OCR + affected individuals |
| State breach laws | Varies by state | Varies (30-90 days typical) | State AG + affected individuals |
| SEC rules | Material cybersecurity incident | 4 business days | SEC (Form 8-K) |
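The notification clocks in the table run from the moment the organization becomes aware of the breach. This sketch computes two of the deadlines above; the `notification_deadline` helper is illustrative, and actual deadlines must be confirmed with counsel for each jurisdiction:

```python
from datetime import datetime, timedelta, timezone

# Illustrative deadlines from the table; both run from the moment of awareness.
DEADLINES = {
    "gdpr_authority": timedelta(hours=72),
    "hipaa_individuals": timedelta(days=60),
}

def notification_deadline(awareness_utc: datetime, regulation: str) -> datetime:
    """Deadline measured from when the organization became aware of the breach."""
    return awareness_utc + DEADLINES[regulation]
```

Note that "awareness" is a legal determination (typically when a responsible person has a reasonable degree of certainty a breach occurred), which is why the timestamp of first detection must be preserved with the evidence.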
Step 3: Notification Content
Legal counsel should draft all external notifications. Provide them with:
- Exact data types exposed and record counts
- Number of affected individuals and their jurisdictions
- Timeline of exposure (first to last known instance)
- Containment actions taken
- Remediation steps planned
- Whether the exposure was to a single user or multiple parties
Verification
| Check | Procedure | Success Criteria |
|---|---|---|
| Exfiltration channel closed | Attempt the original exfiltration technique 50 times | 0% success rate |
| Data type filter effective | Test output filter with known data patterns | All patterns caught |
| Access controls enforced | Test cross-boundary access (RAG, tools) | All unauthorized access blocked |
| Credentials rotated | Verify old credentials no longer work | Confirmed revoked |
| No false positives | Test legitimate queries that use similar patterns | Normal functionality preserved |
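The "exfiltration channel closed" check lends itself to automation as a replay harness. In this sketch, `send_prompt` (your model client) and `detector` (a function that flags the exposed data in an output) are assumed stand-ins for your own tooling:

```python
def verify_channel_closed(send_prompt, exploit_prompt: str, detector, attempts: int = 50) -> bool:
    """Replay the original exfiltration prompt and require a 0% success rate.

    send_prompt: callable taking a prompt string, returning model output text.
    detector: callable returning True when an output still contains the exposed data.
    """
    successes = sum(1 for _ in range(attempts) if detector(send_prompt(exploit_prompt)))
    print(f"{successes}/{attempts} attempts leaked data")
    return successes == 0
```

Replaying multiple times matters because model outputs are non-deterministic: a filter that blocks the exploit nine times out of ten still fails this check.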
Post-Mortem Checklist
| # | Item | Status |
|---|---|---|
| 1 | Complete exposure inventory documented | |
| 2 | All affected individuals identified | |
| 3 | Notification decisions documented with legal review | |
| 4 | All notifications sent within required timelines | |
| 5 | Root cause identified and confirmed | |
| 6 | Remediation deployed and verified | |
| 7 | Data classification of AI training data reviewed | |
| 8 | RAG access controls audited | |
| 9 | System prompts reviewed for sensitive content | |
| 10 | Tool access controls audited | |
| 11 | Monitoring for the data type added to production | |
Related Topics
- Incident Classification -- data leak incident classification
- Escalation Paths -- regulatory notification procedures
- Tool Call Forensics -- investigating tool-mediated exfiltration
- Conversation Preservation -- preserving exfiltration evidence
- Governance, Legal & Compliance -- regulatory landscape for AI data incidents
References
- "GDPR Article 33: Notification of a Personal Data Breach" - European Parliament (2016) - 72-hour notification requirement
- "NIST SP 800-122: Guide to Protecting PII" - NIST (2010) - PII handling and breach response
- "AI-Mediated Data Breaches: Legal and Technical Analysis" - Berkman Klein Center (2025) - Legal analysis of AI-specific data breach scenarios
- "OWASP Top 10 for LLM Applications: LLM06 - Sensitive Information Disclosure" - OWASP Foundation (2025)