SOC 2 for AI Systems
SOC 2 trust services criteria applied to AI systems, AI-specific controls, audit considerations, and how red teaming supports SOC 2 compliance for AI-powered services.
SOC 2 (System and Organization Controls 2) is the dominant audit framework for service organizations in the United States. As organizations increasingly embed AI into their service offerings, SOC 2 audits must evolve to address AI-specific risks. Red teamers who understand SOC 2 requirements can deliver findings that directly support audit evidence and help clients maintain their SOC 2 reports.
Trust Services Criteria Applied to AI
SOC 2 is built on five Trust Services Criteria (TSC). Each has specific implications for AI systems:
Security (Common Criteria)
The security category is always included in SOC 2 reports and forms the foundation for AI system controls:
| Criteria | Traditional Application | AI-Specific Extension |
|---|---|---|
| CC6.1 (Logical access) | User authentication and authorization | Model API access controls, inference endpoint authentication, prompt-level access restrictions |
| CC6.3 (Access removal) | Deprovisioning user accounts | Revoking API keys, removing model access, disabling fine-tuned model variants |
| CC6.6 (Boundary protection) | Network segmentation, firewalls | Model isolation, prompt injection filtering, input/output boundary controls |
| CC6.7 (Data transmission) | Encryption in transit | Protecting prompts and completions in transit, securing model-to-model communication |
| CC6.8 (Malicious software) | Antivirus, endpoint protection | Adversarial input detection, malicious prompt filtering, model integrity verification |
| CC7.2 (Monitoring) | Security event monitoring | Monitoring for adversarial attacks, unusual query patterns, data extraction attempts |
| CC7.3 (Anomaly detection) | Intrusion detection systems | AI behavioral anomaly detection, prompt injection detection, output drift monitoring |
Availability
| Criteria | Traditional Application | AI-Specific Extension |
|---|---|---|
| A1.1 (Capacity management) | Server scaling, bandwidth planning | GPU capacity management, inference queue management, model serving scalability |
| A1.2 (Recovery procedures) | Backup and disaster recovery | Model rollback procedures, training checkpoint recovery, inference fallback paths |
Processing Integrity
Processing integrity is particularly relevant for AI systems, as their outputs directly affect business decisions:
| Criteria | Traditional Application | AI-Specific Extension |
|---|---|---|
| PI1.1 (Accurate processing) | Data validation, calculation verification | Model accuracy monitoring, hallucination detection, output validation |
| PI1.2 (Complete processing) | Transaction completeness | Ensuring AI processes all inputs without silent failures or truncation |
| PI1.3 (Timely processing) | SLA compliance | Inference latency monitoring, timeout handling for AI operations |
| PI1.4 (Authorized processing) | Approval workflows | Human-in-the-loop requirements for high-stakes AI decisions |
| PI1.5 (Error handling) | Exception processing | Graceful degradation when models fail, fallback behavior documentation |
Confidentiality
| Criteria | Traditional Application | AI-Specific Extension |
|---|---|---|
| C1.1 (Confidential data identification) | Data classification | Training data classification, prompt content classification, model weight protection |
| C1.2 (Confidential data disposal) | Secure deletion | Model unlearning, training data removal, conversation data purging |
Privacy
| Criteria | Traditional Application | AI-Specific Extension |
|---|---|---|
| P1-P8 (Privacy criteria) | PII handling, consent, access | AI training data privacy, prompt data handling, model memorization risks, user data in fine-tuning |
AI-Specific Control Objectives
Beyond mapping existing TSC to AI, organizations should implement additional AI-specific controls. These extend the SOC 2 framework to address risks unique to AI systems:
Model Governance Controls
| Control ID | Objective | Description | Red Team Test |
|---|---|---|---|
| AI-GOV-01 | Model inventory | Maintain a complete inventory of all AI models in production | Verify completeness by discovering undocumented models |
| AI-GOV-02 | Model lifecycle management | Track models from development through retirement | Attempt to access deprecated or staging models |
| AI-GOV-03 | Model change control | Approve and document model changes before deployment | Test whether unapproved model versions can be deployed |
| AI-GOV-04 | Third-party model risk | Assess and monitor risks from third-party AI providers | Test third-party model behavior, verify SLA compliance |
Model Security Controls
| Control ID | Objective | Description | Red Team Test |
|---|---|---|---|
| AI-SEC-01 | Prompt injection prevention | Prevent unauthorized actions through prompt manipulation | Execute prompt injection attack scenarios |
| AI-SEC-02 | Output filtering | Prevent sensitive data in model outputs | Attempt data extraction through various output channels |
| AI-SEC-03 | Model access control | Restrict model capabilities based on user authorization | Test privilege escalation through prompt manipulation |
| AI-SEC-04 | Adversarial robustness | Maintain model behavior under adversarial conditions | Adversarial testing across input modalities |
Data Handling Controls
| Control ID | Objective | Description | Red Team Test |
|---|---|---|---|
| AI-DATA-01 | Training data governance | Control what data is used for training and fine-tuning | Verify training data provenance and authorization |
| AI-DATA-02 | Prompt data isolation | Prevent cross-user data leakage through prompts | Test for conversation leakage between sessions |
| AI-DATA-03 | Data retention for AI | Define and enforce retention periods for AI interaction data | Verify that expired data is actually deleted |
| AI-DATA-04 | RAG data integrity | Ensure retrieval-augmented generation uses authorized data | Attempt RAG poisoning and unauthorized data injection |
Audit Considerations
SOC 2 Type I vs Type II for AI Systems
| Dimension | Type I | Type II |
|---|---|---|
| Scope | Design of controls at a point in time | Design and operating effectiveness over a period (typically 6-12 months) |
| AI relevance | Useful for initial AI system launches | Required to demonstrate sustained AI control effectiveness |
| Red team role | Point-in-time assessment of AI controls | Periodic assessments throughout the audit period |
| Evidence needed | Control documentation and design review | Testing results, monitoring logs, incident records over the period |
What Auditors Look For in AI Systems
Common auditor questions about AI controls:
| Question | What They Are Assessing | How Red Teaming Helps |
|---|---|---|
| "How do you prevent prompt injection?" | CC6.6 boundary protection for AI | Demonstrate whether prompt injection controls actually work |
| "How do you monitor AI system behavior?" | CC7.2, CC7.3 monitoring and anomaly detection | Show whether monitoring detects adversarial activity |
| "How do you prevent data leakage through AI?" | C1.1 confidential data, P3 data collection | Test data extraction via model outputs |
| "How do you manage model changes?" | CC8.1 change management | Verify that model change procedures are followed |
| "How do you handle AI errors?" | PI1.5 error handling | Test failure modes and verify graceful degradation |
Evidence Collection for Auditors
Red team engagements supporting SOC 2 should produce evidence formatted for audit consumption:
| Evidence Type | Content | SOC 2 Relevance |
|---|---|---|
| Test plans | Scope, methodology, tools used, controls tested | Demonstrates systematic assessment approach |
| Test results | Detailed findings with steps to reproduce | Proves control operating effectiveness (or failure) |
| Remediation verification | Re-testing results after control improvements | Shows corrective action effectiveness |
| Continuous monitoring data | Automated test results over the audit period | Supports Type II operating effectiveness |
| Exception logs | Documented control failures and responses | Shows management awareness and response capability |
Red Team Engagement Structure for SOC 2
Pre-Engagement Alignment
Before conducting a red team engagement supporting SOC 2, align with the audit team:
Identify in-scope AI systems
Work with the client to identify which AI systems are included in their SOC 2 scope boundary. Only AI systems within the service organization's trust boundary require testing.
Map controls to testing activities
Review the client's control matrix and map AI-specific controls to red team testing scenarios. Each control should have at least one corresponding test.
Coordinate timing with auditors
For Type II reports, schedule testing at intervals throughout the audit period rather than all at once. This provides evidence of sustained control effectiveness.
Agree on evidence format
Confirm with the CPA firm what evidence format they require. Some auditors accept red team reports directly; others need findings formatted into their testing workpapers.
Testing Methodology by Trust Service Category
Security testing (CC criteria):
- Prompt injection attacks against all AI endpoints
- API authentication and authorization testing
- Data extraction attempts through model outputs
- System prompt extraction and abuse
- Adversarial input testing
Availability testing (A criteria):
- Model denial-of-service through resource exhaustion
- Inference pipeline stress testing
- Failover and fallback behavior verification
- Recovery time objective validation
Processing integrity testing (PI criteria):
- Hallucination rate measurement under adversarial conditions
- Output manipulation through carefully crafted inputs
- Verification that human oversight controls function correctly
- Error handling behavior under unexpected inputs
Confidentiality testing (C criteria):
- Training data extraction attempts
- Cross-tenant data leakage testing
- Model weight and configuration extraction
- Conversation history isolation verification
Privacy testing (P criteria):
- PII extraction from model outputs
- Consent mechanism verification for AI data collection
- Data retention and deletion verification
- User data isolation in multi-tenant environments
Common Findings and Remediation
Findings That Affect SOC 2 Reports
| Finding | SOC 2 Impact | Severity for Auditors |
|---|---|---|
| Successful prompt injection bypassing controls | CC6.6 control failure | High -- may result in qualified opinion |
| Data extraction through model outputs | C1.1, P3 control failure | High -- confidentiality and privacy impact |
| No monitoring for adversarial inputs | CC7.2, CC7.3 gap | Medium -- detection deficiency |
| Model changes deployed without approval | CC8.1 control failure | Medium -- change management gap |
| No fallback behavior when model fails | PI1.5, A1.2 gap | Medium -- processing integrity and availability |
| Undocumented AI models in production | AI-GOV-01 gap | Low to Medium -- inventory completeness |
Remediation Priorities
Immediate priorities (fix before audit period ends):
- Any control failure that allows data extraction or unauthorized access
- Missing monitoring for AI-specific attack patterns
- Uncontrolled model deployment processes
Longer-term improvements:
- Automated adversarial testing in CI/CD pipelines
- Enhanced AI behavioral monitoring dashboards
- Formal AI incident response procedures
- Regular model security reviews integrated into change management
Integration with Other Frameworks
Organizations often maintain SOC 2 alongside other compliance frameworks. Red team findings should be mapped across all applicable frameworks:
| SOC 2 Criteria | ISO 42001 Control | NIST AI RMF | EU AI Act |
|---|---|---|---|
| CC6.6 (Boundary protection) | A.6.2.5 (Deployment) | MS-2.3 | Art. 15 (Cybersecurity) |
| CC7.2 (Monitoring) | A.6.2.6 (Monitoring) | MG-2.4 | Art. 9 (Risk management) |
| PI1.1 (Accurate processing) | A.6.2.4 (Verification) | MS-2.6 | Art. 15 (Accuracy) |
| C1.1 (Confidentiality) | A.7.4 (Data provenance) | MP-4.2 | Art. 10 (Data governance) |
| P3 (Collection) | A.10.2 (Fairness) | GV-6.1 | Art. 13 (Transparency) |
This cross-mapping allows a single red team engagement to produce findings relevant to multiple compliance requirements, maximizing the return on investment for clients managing complex compliance landscapes.