Capstone: Compliance Assessment Simulation
Conduct a simulated compliance assessment against the EU AI Act, the NIST AI RMF, and ISO/IEC 42001, producing a comprehensive gap analysis report.
Overview
AI regulation is accelerating. The EU AI Act mandates risk assessments and adversarial testing for high-risk AI systems. NIST's AI Risk Management Framework provides voluntary guidelines that many organizations adopt as their baseline. ISO/IEC 42001 specifies requirements for an AI management system. As an AI red teamer, you will increasingly be asked to contribute to compliance assessments — not just to find vulnerabilities, but to map findings to regulatory requirements.
This capstone project simulates a compliance assessment for a fictional company. You will evaluate an AI system against all three frameworks, identify gaps, and produce a professional report that both the security team and the legal/compliance team can use.
Prerequisites
- AI Threat Landscape — Understanding the risk landscape that regulations address
- Governance & Compliance — Overview of AI regulatory frameworks
- Defenses and Guardrails — Defense mechanisms required by compliance frameworks
- Execution & Reporting — Professional report writing skills
- Familiarity with reading regulatory and standards documents
Project Brief
Scenario
You are a security consultant at Meridian Compliance Partners. Your client, NovaCare Insurance, is a mid-size health insurance company that has deployed three AI systems:
- ClaimBot — An LLM-powered chatbot that helps customers file insurance claims and check claim status. It has access to customer policy data and claims history via API integrations. Classified as high-risk under the EU AI Act (insurance domain, automated decision influence).
- FraudDetect — A machine learning model that scores insurance claims for fraud likelihood. Claims above a threshold are flagged for human review. Classified as high-risk under the EU AI Act (automated decision-making affecting individuals).
- DocSummarizer — An internal tool that summarizes medical documents for claims adjusters. It processes sensitive health information. Classified as limited risk under the EU AI Act (transparency obligations only).
NovaCare wants to understand their compliance posture before the EU AI Act enforcement deadlines and wants to proactively align with NIST AI RMF and ISO 42001. They have hired you to conduct a gap analysis.
Frameworks to Assess
| Framework | Version | Focus Areas |
|---|---|---|
| EU AI Act | Regulation (EU) 2024/1689 | Risk classification, conformity assessment, transparency, human oversight, robustness, cybersecurity |
| NIST AI RMF | AI 100-1 (January 2023) | Govern, Map, Measure, Manage functions across AI lifecycle |
| ISO 42001 | ISO/IEC 42001:2023 | AI management system, risk assessment, controls, continuous improvement |
Available Information
For the assessment, assume you have been provided:
- Architecture diagrams for all three systems
- System prompts and model cards for ClaimBot and DocSummarizer
- Model documentation for FraudDetect (training data description, performance metrics, fairness evaluation)
- Existing security policies (generic IT security policy, no AI-specific policies)
- Incident logs from the past 6 months (two prompt injection incidents on ClaimBot, one false positive spike on FraudDetect)
- Interview notes from the AI development team, security team, and compliance officer
Deliverables
Primary Deliverables
| Deliverable | Description | Weight |
|---|---|---|
| Gap analysis report | Comprehensive compliance gap analysis (15-25 pages) | 40% |
| Compliance matrix | Mapping of requirements to current state across all three frameworks | 20% |
| Risk assessment | Risk-prioritized findings linking technical gaps to compliance obligations | 20% |
| Remediation roadmap | Phased plan for achieving compliance with effort and timeline estimates | 20% |
Rubric Criteria
- Framework Coverage (20%) — All material requirements from the three frameworks are addressed for the relevant systems
- Gap Identification (25%) — Gaps are specific, accurately identified, and linked to the correct regulatory requirement
- Risk Prioritization (15%) — Gaps are prioritized by regulatory risk (enforcement exposure, penalties) and implementation effort
- Remediation Quality (20%) — Recommendations are specific, actionable, and appropriately scoped for each system's risk classification
- Report Professionalism (20%) — Report is suitable for a compliance audience with clear structure, appropriate regulatory language, and executive summary
Phased Approach
Phase 1: Framework Analysis (3 hours)
Extract relevant requirements from EU AI Act
For each system, identify the applicable EU AI Act requirements based on its risk classification. Focus on Articles 9 (risk management), 10 (data governance), 13 (transparency), 14 (human oversight), and 15 (accuracy, robustness, cybersecurity). List each requirement and what evidence of compliance would look like.
Map NIST AI RMF functions to NovaCare's systems
Apply the four NIST AI RMF functions (Govern, Map, Measure, Manage) to each system. Identify which subcategories are most relevant. For each, define what "implemented" vs. "partially implemented" vs. "not implemented" would look like in NovaCare's context.
Identify ISO 42001 control requirements
Review ISO 42001 Annex A controls and identify which apply to NovaCare. Focus on AI risk assessment, AI policy, third-party AI management, and AI system lifecycle controls. Note which controls are organizational vs. system-specific.
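One way to keep the three requirement inventories from Phase 1 consistent is a small machine-readable structure that later phases can filter and roll up. The sketch below is an assumption about how you might organize your working notes, not a prescribed format; the class names, status labels, and the two sample entries are illustrative.

```python
from dataclasses import dataclass
from enum import Enum

class Status(str, Enum):
    """Implementation levels used when mapping NIST AI RMF subcategories."""
    IMPLEMENTED = "implemented"
    PARTIAL = "partially implemented"
    NOT_IMPLEMENTED = "not implemented"
    NOT_APPLICABLE = "n/a"

@dataclass
class Requirement:
    framework: str          # "EU AI Act", "NIST AI RMF", or "ISO 42001"
    reference: str          # e.g. "Art. 15(4)" or "MEASURE 2.7"
    summary: str            # one-line paraphrase of the requirement
    applies_to: list        # systems this requirement covers
    evidence_expected: str  # what compliance evidence would look like

# Two illustrative entries; a real inventory would cover every
# applicable article, subcategory, and Annex A control.
inventory = [
    Requirement("EU AI Act", "Art. 15(4)",
                "Resilience against attempts to exploit vulnerabilities",
                ["ClaimBot", "FraudDetect"],
                "Red-team reports, adversarial testing records"),
    Requirement("NIST AI RMF", "MEASURE 2.7",
                "AI system security and resilience are evaluated",
                ["ClaimBot", "FraudDetect", "DocSummarizer"],
                "Security evaluation results tied to each system"),
]

# Sanity check: every requirement names only in-scope systems.
systems = {"ClaimBot", "FraudDetect", "DocSummarizer"}
for req in inventory:
    assert set(req.applies_to) <= systems, req.reference
```

Keeping the inventory in one structure makes Phase 3's compliance matrix a mechanical transformation rather than a second round of research.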
Phase 2: Gap Assessment (4 hours)
Assess ClaimBot compliance
Evaluate ClaimBot against all applicable requirements. Key areas: adversarial testing (has it been red teamed?), transparency (do users know they are interacting with AI?), human oversight (can a human intervene in claim decisions?), data protection (how is customer data handled in the LLM context?), and robustness (what happens under adversarial input?).
Assess FraudDetect compliance
Evaluate FraudDetect against applicable requirements. Key areas: fairness and bias testing, explanation capability (can flagged claims be explained?), human review process (are flagged claims always reviewed by a human?), model monitoring (is performance tracked for drift?), and documentation (is there a conformity assessment?).
Assess DocSummarizer compliance
Evaluate DocSummarizer against its lower-tier requirements. Key areas: transparency (do users know summaries are AI-generated?), accuracy monitoring (are summaries validated?), and data protection (how is sensitive health information handled during processing?).
Assess organizational controls
Evaluate NovaCare's organizational posture: AI governance structure, AI-specific policies, risk management processes, incident response procedures for AI failures, and training/awareness programs. Many ISO 42001 requirements are organizational, not system-specific.
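The four assessment tasks above each reduce to a set of yes/no questions answered from the evidence NovaCare provided. A minimal sketch of that approach, with assumed checklist items and answers drawn from the scenario (the incident logs, the absence of AI-specific policies), might look like this; the roll-up rule is one possible convention, not the only defensible one.

```python
# Illustrative Phase 2 checklist: each key area becomes a question
# answered from the provided evidence. Answers here are assumptions
# consistent with the scenario brief.
checklist = {
    "ClaimBot": {
        "adversarial testing performed": False,  # two injection incidents, no red team
        "AI disclosure to users": True,
        "human can intervene in decisions": True,
        "customer data handling documented": False,
    },
    "FraudDetect": {
        "fairness/bias testing": True,
        "flagged claims explainable": False,
        "human review of all flags": True,
        "drift monitoring in place": False,
    },
}

def status(answers: dict) -> str:
    """Roll checklist answers up into a compliance-matrix status label."""
    met = sum(answers.values())
    if met == len(answers):
        return "compliant"
    return "partially compliant" if met else "non-compliant"

for system, answers in checklist.items():
    print(system, "->", status(answers))
```

The value of the checklist form is traceability: each "False" answer becomes a candidate gap finding with its evidence reference already attached.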
Phase 3: Report Writing (4 hours)
Build the compliance matrix
Create a comprehensive matrix mapping each requirement (rows) against each system (columns) with a status indicator (compliant, partially compliant, non-compliant, not applicable). Include evidence references for each assessment.
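If the assessment results are kept in a requirement-to-system mapping, the matrix itself can be generated rather than hand-maintained, which keeps it consistent as findings change during review. A minimal sketch, with an assumed data shape and one illustrative row:

```python
# Render the compliance matrix as a markdown table from a
# {requirement: {system: status}} mapping. Row content is illustrative.
SYSTEMS = ["ClaimBot", "FraudDetect", "DocSummarizer"]

matrix = {
    "EU AI Act Art. 15(4) — resilience against exploitation": {
        "ClaimBot": "non-compliant",
        "FraudDetect": "partially compliant",
        "DocSummarizer": "n/a",
    },
}

def render_matrix(matrix: dict) -> str:
    """Emit a markdown table: one row per requirement, one column per system."""
    header = "| Requirement | " + " | ".join(SYSTEMS) + " |"
    sep = "|---" * (len(SYSTEMS) + 1) + "|"
    rows = [
        "| " + req + " | "
        + " | ".join(cells.get(s, "n/a") for s in SYSTEMS) + " |"
        for req, cells in matrix.items()
    ]
    return "\n".join([header, sep] + rows)

print(render_matrix(matrix))
```

A real deliverable would add an Evidence column, as shown in the example output later in this brief.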
Write the gap analysis findings
For each significant gap, document: the requirement, the current state, the gap description, the regulatory risk (what happens if this gap is not addressed), and the recommended remediation. Group findings by framework and system.
Develop the remediation roadmap
Prioritize gaps by: regulatory enforcement timeline (EU AI Act deadlines), severity of non-compliance (financial penalties, market access risk), and implementation effort. Organize into phases: immediate (0-3 months), short-term (3-6 months), medium-term (6-12 months).
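The three prioritization factors above can be made explicit with a simple scoring function, which also makes the ordering defensible when the client pushes back. The weights, gap data, and phase cutoffs below are assumptions for the sketch, not prescribed values.

```python
# Illustrative prioritization: score each gap by enforcement urgency,
# severity of non-compliance, and implementation effort, then bucket
# into the roadmap phases. All numbers here are assumed for the sketch.
gaps = [
    {"id": "GAP-007", "deadline_months": 6, "severity": 3, "effort": 2},
    {"id": "GAP-012", "deadline_months": 18, "severity": 1, "effort": 1},
]

def score(gap: dict) -> float:
    # Nearer deadlines and higher severity raise the score; heavier effort
    # lowers it slightly so quick wins surface first within a severity tier.
    urgency = 1.0 / max(gap["deadline_months"], 1)
    return gap["severity"] * 2 + urgency * 10 - gap["effort"] * 0.5

def phase(gap: dict) -> str:
    if gap["deadline_months"] <= 3:
        return "immediate (0-3 months)"
    if gap["deadline_months"] <= 6:
        return "short-term (3-6 months)"
    return "medium-term (6-12 months)"

for gap in sorted(gaps, key=score, reverse=True):
    print(gap["id"], round(score(gap), 2), phase(gap))
```

Whatever weighting you choose, state it in the report so the roadmap's ordering can be audited and re-run as deadlines shift.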
Write the executive summary
Summarize the overall compliance posture for each system and each framework. Highlight the most critical gaps and the recommended priorities. Write for a C-suite audience that needs to understand regulatory exposure without reading the full report.
Phase 4: Review and Validation (1 hour)
Cross-reference frameworks
Verify that gaps identified under one framework are consistently reflected in the other frameworks where the same issue applies. For example, a lack of adversarial testing is a gap under EU AI Act Article 15, NIST AI RMF Measure function, and ISO 42001 risk assessment controls.
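This cross-reference check is mechanical if you maintain a mapping from underlying issues to the frameworks that cover them. A minimal sketch, with an assumed mapping and one deliberately incomplete finding to show what the check catches:

```python
# Sketch of the cross-reference check: a shared issue should appear as a
# gap under every framework that covers it. Mappings are illustrative.
ISSUE_TO_FRAMEWORKS = {
    "no adversarial testing": {"EU AI Act", "NIST AI RMF", "ISO 42001"},
}

# Frameworks under which each issue was actually recorded as a gap.
findings = {
    "no adversarial testing": {"EU AI Act", "NIST AI RMF"},  # ISO entry missing
}

def missing_crossrefs(findings: dict) -> dict:
    """Return, per issue, the frameworks where the gap was not recorded."""
    return {
        issue: sorted(ISSUE_TO_FRAMEWORKS[issue] - recorded)
        for issue, recorded in findings.items()
        if ISSUE_TO_FRAMEWORKS[issue] - recorded
    }

print(missing_crossrefs(findings))  # → {'no adversarial testing': ['ISO 42001']}
```

An empty result means every shared issue is consistently reflected across the three frameworks.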
Validate prioritization
Review the remediation roadmap to ensure the highest-priority items are those with the greatest regulatory risk and the nearest compliance deadlines. Adjust if necessary.
Example Output
Example Compliance Matrix Row
| Requirement | ClaimBot | FraudDetect | DocSummarizer | Evidence |
|-------------|----------|-------------|---------------|----------|
| EU AI Act Art. 15(4) — Resilience against attempts by unauthorized third parties to exploit vulnerabilities | Non-compliant | Partially compliant | N/A (limited risk) | No adversarial testing has been conducted on ClaimBot. FraudDetect has undergone basic adversarial robustness testing but not against LLM-specific attack vectors. |
Example Gap Finding
## GAP-007: No Adversarial Testing Program for High-Risk AI Systems
**Framework:** EU AI Act Article 15(4); NIST AI RMF Measure 2.7; ISO 42001 A.6.2.5
**Systems:** ClaimBot (Critical), FraudDetect (High)
**Current State:** Non-compliant
### Description
NovaCare has not conducted adversarial testing (red teaming) on either
of its high-risk AI systems. ClaimBot has experienced two prompt injection
incidents in production but no systematic security assessment has been
performed. FraudDetect has undergone standard ML robustness testing
(adversarial examples on tabular data) but has not been assessed for
AI-specific vulnerabilities related to its decision pipeline.
### Regulatory Risk
The EU AI Act requires that high-risk AI systems are resilient against
"attempts by unauthorized third parties to exploit system vulnerabilities"
(Art. 15(4)). Failure to demonstrate adversarial testing may result in
non-conformity during conformity assessment, blocking market access or
triggering enforcement action. Maximum administrative fine: up to
EUR 15,000,000 or 3% of annual worldwide turnover.
### Recommendation
1. Engage a qualified AI red team to conduct a comprehensive security
assessment of ClaimBot covering OWASP Top 10 for LLMs categories
2. Extend FraudDetect's robustness testing to include pipeline-level
attacks (data poisoning, model evasion, explanation manipulation)
3. Establish a recurring adversarial testing program (quarterly for
high-risk systems) and document results for conformity assessment
**Effort Estimate:** 3-4 weeks for initial assessments, ongoing quarterly
**Priority:** Critical — EU AI Act enforcement deadline approaching
Hints
Under the EU AI Act, why does a health insurance chatbot that influences claim decisions require adversarial testing, while an internal document summarization tool may not?