NIST AI RMF & ISO 42001
How the NIST AI Risk Management Framework and the ISO 42001 AI management system standard position red teaming within organizational AI risk management and compliance testing.
Risk management frameworks provide the organizational context for AI red teaming. While OWASP and MITRE ATLAS tell you what to test, NIST AI RMF and ISO 42001 tell you why testing matters within an organization's risk posture and how findings should drive action.
NIST AI RMF: Structure and Functions
The Four Core Functions
┌──────────────────────────────────────────────────────┐
│ GOVERN │
│ Policies, roles, accountability, organizational │
│ culture for AI risk management │
│ ┌────────────────────────────────────────────┐ │
│ │ MAP │ │
│ │ Context, risk identification, │ │
│ │ stakeholder analysis │ │
│ │ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ MEASURE │ │ MANAGE │ │ │
│ │ │ Assessment, │ │ Mitigation, │ │ │
│ │ │ analysis, │ │ monitoring, │ │ │
│ │ │ tracking │ │ response │ │ │
│ │ └──────────────┘ └──────────────┘ │ │
│ └────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────┘
Function Details and Red Team Relevance
| Function | Categories | Red Team Relevance |
|---|---|---|
| Govern | Policies, accountability, culture, compliance | Red team program establishment, budget, reporting lines |
| Map | Context, risk identification, stakeholders | Threat modeling, attack surface analysis, scope definition |
| Measure | Assessment, metrics, tracking | Testing execution, metrics collection, benchmarking |
| Manage | Mitigation, monitoring, response | Remediation validation, regression testing, continuous assessment |
Red Teaming Within NIST AI RMF
Govern Function: Establishing the Red Team Program
The Govern function creates the organizational foundation for red teaming:
| Govern Subcategory | How Red Teaming Fits |
|---|---|
| GV-1: Policies and procedures | Define red team testing policies, frequency, scope requirements |
| GV-2: Accountability | Assign ownership for red team findings remediation |
| GV-3: Workforce | Ensure red team skills (internal or vendor) are available |
| GV-4: Organizational culture | Foster a culture where red team findings drive improvement, not blame |
| GV-5: Compliance | Align red team testing with regulatory requirements (EU AI Act, etc.) |
Map Function: Scoping the Engagement
| Map Subcategory | Red Team Application |
|---|---|
| MP-1: Context and intended use | Understand the AI system's purpose to design relevant attacks |
| MP-2: Risk identification | Identify which risks red teaming should target |
| MP-3: Stakeholder analysis | Determine who needs red team results and in what format |
| MP-4: Impacts | Assess potential impact of identified vulnerabilities |
Measure Function: Executing and Evaluating
| Measure Subcategory | Red Team Application |
|---|---|
| MS-1: Assessment approaches | Red teaming as a primary assessment method |
| MS-2: Evaluation metrics | Attack success rate, severity scores, coverage metrics |
| MS-3: Tracking over time | Regression testing, trend analysis across engagements |
Manage Function: Acting on Findings
| Manage Subcategory | Red Team Application |
|---|---|
| MG-1: Risk treatment | Prioritize and remediate red team findings |
| MG-2: Monitoring | Continuous automated red teaming (CART) for ongoing assurance |
| MG-3: Response | Incident response triggered by red team discoveries |
| MG-4: Residual risk | Accept, transfer, or further mitigate remaining risks |
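The MG-2 row references continuous automated red teaming (CART). A minimal sketch of one CART cycle, assuming a stored attack corpus and a previous cycle's results (`attacks`, `previous`, and the triage categories are illustrative, not from any specific tool):

```python
from typing import Callable

def cart_cycle(attacks: dict[str, Callable[[], bool]],
               previous: dict[str, bool]) -> dict[str, list[str]]:
    """Run one continuous-assessment cycle and triage outcomes (MG-2).

    Each attack callable returns True when it bypasses controls.
    """
    report: dict[str, list[str]] = {
        "new_failures": [], "regressions": [], "still_passing": []
    }
    for attack_id, run_attack in attacks.items():
        succeeded = run_attack()
        if succeeded and previous.get(attack_id) is False:
            # A previously blocked attack now works: this is the
            # MG-3 trigger for incident response, not just a finding.
            report["regressions"].append(attack_id)
        elif succeeded:
            report["new_failures"].append(attack_id)
        else:
            report["still_passing"].append(attack_id)
    return report
```

The regression bucket is separated out deliberately: under MG-3, a reappearing vulnerability warrants an incident-response path rather than routine remediation tracking.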
NIST AI 600-1: GenAI Profile
NIST AI 600-1 extends the AI RMF specifically for generative AI systems. Published in July 2024, it identifies twelve risk categories particularly relevant to GenAI and maps each to the AI RMF's Govern/Map/Measure/Manage structure.
| Risk Category | Description | Red Team Testing Focus |
|---|---|---|
| CBRN Information | AI generating dangerous CBRN knowledge | Safety filter testing for CBRN content |
| Confabulation | Generating false but plausible outputs | Hallucination rate measurement |
| Data Privacy | Leaking training data or PII | Training data extraction, membership inference |
| Environmental | Excessive resource consumption | Resource exhaustion attacks |
| Human-AI Configuration | Over-reliance or misuse patterns | User interaction testing |
| Information Integrity | Generating disinformation | Factual accuracy evaluation |
| Information Security | Traditional and AI-specific security risks | Comprehensive red teaming |
| Intellectual Property | Generating copyrighted content | Copyright material reproduction testing |
| Obscene Content | Generating harmful or offensive content | Safety filter bypass testing |
| Toxicity | Generating toxic, biased, or hateful content | Bias and toxicity evaluation |
| Value Alignment | Misalignment with human values | Alignment testing, jailbreaking |
| Homogenization | Reduced diversity from AI monoculture | Systemic risk assessment |
NIST AI 600-1 to OWASP LLM Top 10 Mapping
| NIST AI 600-1 Risk | OWASP LLM Category | Testing Priority |
|---|---|---|
| CBRN Information | LLM01 (Prompt Injection to bypass safety filters) | Critical for frontier models |
| Confabulation | LLM09 (Misinformation) | High for information-critical applications |
| Data Privacy | LLM02 (Sensitive Information Disclosure) | Critical for PII-processing systems |
| Information Security | LLM01-LLM10 (comprehensive) | Always critical |
| Intellectual Property | LLM02 (Sensitive Information Disclosure) | High for creative applications |
| Obscene Content | LLM01 (Prompt Injection to bypass content filters) | High for consumer-facing systems |
| Value Alignment | LLM01 (Prompt Injection / Jailbreaking) | High for all LLM applications |
Red Team Testing by NIST AI 600-1 Category
For each NIST AI 600-1 risk category, specific red team tests map to the Measure function:
| Risk Category | Specific Test Cases | Measure Subcategory |
|---|---|---|
| CBRN Information | Attempt to elicit synthesis instructions, weapons design, biological agent creation | MS-1 (Assessment) |
| Confabulation | Query domain-specific facts, check citation accuracy, test under adversarial conditions | MS-2 (Metrics) |
| Data Privacy | Training data extraction, PII probing, membership inference, cross-session leakage | MS-1, MS-2 |
| Information Security | Prompt injection, system prompt extraction, tool abuse, data exfiltration | MS-1, MS-2, MS-3 |
| Obscene Content | Safety filter bypass via encoding, multi-language, role-play, and multi-turn attacks | MS-1 |
| Value Alignment | Jailbreak testing, alignment stability under adversarial pressure | MS-1, MS-3 |
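The table above can be encoded as a test registry so that engagement plans stay traceable to both the NIST AI 600-1 category and the Measure subcategory the evidence feeds. A sketch, with hypothetical test names and only two categories filled in:

```python
# Hypothetical registry keyed by NIST AI 600-1 risk category.
# "measure" lists the Measure subcategories the test evidence supports.
TEST_REGISTRY: dict[str, dict[str, list[str]]] = {
    "Data Privacy": {
        "tests": ["training_data_extraction", "pii_probing",
                  "membership_inference", "cross_session_leakage"],
        "measure": ["MS-1", "MS-2"],
    },
    "Information Security": {
        "tests": ["prompt_injection", "system_prompt_extraction",
                  "tool_abuse", "data_exfiltration"],
        "measure": ["MS-1", "MS-2", "MS-3"],
    },
}

def plan_engagement(in_scope: list[str]) -> list[tuple[str, str]]:
    """Return (test, measure-subcategory) pairs for the scoped categories."""
    plan: list[tuple[str, str]] = []
    for category in in_scope:
        entry = TEST_REGISTRY.get(category)
        if entry is None:
            continue  # not covered: surface as a coverage gap upstream
        for test in entry["tests"]:
            for ms in entry["measure"]:
                plan.append((test, ms))
    return plan
```

Keeping the mapping in data rather than prose makes coverage auditable: any in-scope category missing from the registry is a measurable gap.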
ISO 42001: AI Management System
ISO 42001 provides a certifiable management system standard for AI. It follows the ISO Annex SL structure (similar to ISO 27001 for information security). Published in December 2023, it is the first international standard for AI management systems and is increasingly referenced in procurement requirements and regulatory guidance.
Key Clauses
| Clause | Topic | Red Team Relevance |
|---|---|---|
| 4 | Context of the organization | Understanding AI system landscape |
| 5 | Leadership | Management commitment to AI safety testing |
| 6 | Planning | Risk assessment including adversarial testing |
| 7 | Support | Resources for testing (people, tools, budget) |
| 8 | Operation | Operational planning and control of AI systems |
| 9 | Performance evaluation | Measurement and monitoring (including red team results) |
| 10 | Improvement | Nonconformity handling, continuous improvement |
ISO 42001 Annex A Controls
Annex A provides specific controls. Those most relevant to red teaming:
| Control | Description | Red Team Connection |
|---|---|---|
| A.6.2.6 | AI system verification and validation | Red teaming as a validation method |
| A.6.2.7 | AI system risk assessment | Red team findings as risk evidence |
| A.8.4 | AI system operation and monitoring | CART and continuous monitoring |
| A.9.3 | Third-party and customer relationships | Vendor red team requirements |
| A.10.3 | AI system impact assessment | Red team impact analysis |
ISO 42001 Annex B: Implementation Guidance
Annex B provides implementation guidance for each Annex A control. For red teamers, the most relevant guidance includes:
| Guidance Area | What It Recommends | How Red Teaming Supports |
|---|---|---|
| B.6.2.6 | Verification should include testing against requirements | Red team findings verify that security requirements are met |
| B.6.2.7 | Risk assessment should consider adversarial threats | MITRE ATLAS-based threat modeling feeds into risk assessment |
| B.8.4 | Monitoring should detect anomalous behavior | Red team exercises validate monitoring and alerting capabilities |
| B.10.3 | Impact assessment should be ongoing | Regular red team cadence provides continuous assessment |
ISO 42001 Certification and Red Teaming
During an ISO 42001 certification audit, auditors look for evidence that AI risks have been identified and managed. Red team reports serve as strong evidence for:
| Audit Evidence Needed | Red Team Deliverable |
|---|---|
| Risk identification (Clause 6.1) | Threat model with ATLAS-mapped risks |
| Risk treatment (Clause 6.1) | Remediation validation showing risks are mitigated |
| Performance evaluation (Clause 9.1) | Testing metrics, severity distributions, trend analysis |
| Nonconformity handling (Clause 10.1) | Findings mapped to specific control failures with remediation evidence |
| Continual improvement (Clause 10.2) | Year-over-year comparison of red team findings |
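The audit-evidence table lends itself to a simple completeness check before the certification audit. A sketch, assuming one deliverable identifier per clause (the identifiers are illustrative shorthand for the deliverables named above):

```python
# Hypothetical checklist mirroring the table above: each ISO 42001 clause
# maps to the red team deliverable offered as audit evidence.
REQUIRED_EVIDENCE = {
    "6.1 risk identification":      "threat_model",
    "6.1 risk treatment":           "remediation_validation",
    "9.1 performance evaluation":   "testing_metrics",
    "10.1 nonconformity handling":  "control_failure_mapping",
    "10.2 continual improvement":   "year_over_year_comparison",
}

def audit_gaps(available: set[str]) -> list[str]:
    """List clauses whose supporting red team deliverable is missing."""
    return [clause for clause, deliverable in REQUIRED_EVIDENCE.items()
            if deliverable not in available]
```

Running this against the deliverables actually on file gives a pre-audit gap list in clause order.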
NIST AI RMF vs. ISO 42001
| Dimension | NIST AI RMF | ISO 42001 |
|---|---|---|
| Type | Voluntary framework | Certifiable standard |
| Structure | Four functions, categories, subcategories | Clauses + Annex A controls |
| Certification | No formal certification | Third-party auditable |
| Geographic focus | US-originated, globally referenced | International (ISO) |
| Specificity | Principles-based, flexible | Requirements-based, prescriptive |
| Cost to implement | Lower (no audit required) | Higher (certification audit required) |
| Market recognition | Strong in US, growing globally | Growing, especially in EU/APAC |
Start with NIST AI RMF
Use the RMF to establish your AI risk management approach. Its flexibility allows adaptation to your organization's maturity level.
Layer on ISO 42001 for certification
If certification is required (often for enterprise or regulated industries), implement ISO 42001 controls on top of your RMF-based program.
Integrate red teaming into both
Position red teaming as a primary Measure function (NIST) and verification/validation method (ISO 42001). Map findings to both frameworks.
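Mapping each finding to both frameworks is easiest if the dual mapping is captured in the finding record itself at write-up time. A minimal sketch; the identifier shorthand (MS-1, A.6.2.6, etc.) follows this document's convention and the field values are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    """One red team finding, mapped to both frameworks at write-up time."""
    title: str
    severity: str                                       # Critical/High/Medium/Low
    nist_rmf: list[str] = field(default_factory=list)   # e.g. ["MS-1", "MG-1"]
    iso_42001: list[str] = field(default_factory=list)  # e.g. ["A.6.2.6"]

finding = Finding(
    title="System prompt extractable via role-play",
    severity="Medium",
    nist_rmf=["MP-2", "MS-1", "MG-1"],
    iso_42001=["A.6.2.6", "A.6.2.7"],
)

# The same record then serves both audiences: RMF subcategories for the
# risk register, ISO control references for the certification audit file.
audit_refs = finding.iso_42001
```

Recording both mappings on one record avoids a second translation pass when the audit file and the risk register are assembled from the same findings.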
Practical Application: Structuring Red Team Programs
Building a Red Team Program Around NIST AI RMF
For organizations adopting the NIST AI RMF, here is how to structure a red team program that maps directly to the framework's functions:
| Phase | RMF Function | Activities | Deliverables |
|---|---|---|---|
| Program Setup | Govern | Define red team charter, budget, reporting lines, testing policies, and escalation procedures | Red team policy document, RACI matrix |
| Engagement Scoping | Map | Identify AI systems, classify risks, define attack surface, stakeholder analysis | Threat model, scope document, rules of engagement |
| Testing Execution | Measure | Execute adversarial testing, collect metrics, benchmark performance | Test logs, finding reports, severity scores |
| Remediation Tracking | Manage | Track remediation, validate fixes, monitor for regression, accept residual risk | Remediation tracker, retest reports, risk acceptance records |
| Program Review | Govern + Manage | Annual program review, metrics trending, budget justification | Annual report, metrics dashboard, improvement plan |
Metrics for AI Red Team Programs
The Measure function requires quantifiable metrics. These metrics help organizations track their AI security posture over time:
| Metric | What It Measures | Target Direction |
|---|---|---|
| Finding severity distribution | Proportion of Critical/High/Medium/Low findings | Shift toward lower severity over time |
| Mean time to remediate (MTTR) | How quickly findings are fixed | Decrease over time |
| Coverage score | Percentage of ATLAS techniques tested per engagement | Maintain above 80% for relevant techniques |
| Regression rate | Percentage of previously fixed findings that reappear | Keep below 5% |
| Jailbreak success rate | Percentage of jailbreak attempts that succeed | Decrease over time |
| Data extraction success rate | Percentage of extraction attempts yielding sensitive data | Decrease to zero |
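Several of these metrics fall out directly from a findings log. A sketch over a hypothetical record shape of (severity, opened, fixed, reappeared):

```python
from datetime import date

# Illustrative findings log: (severity, opened, fixed, reappeared)
findings = [
    ("Critical", date(2025, 1, 10), date(2025, 1, 20), False),
    ("High",     date(2025, 1, 12), date(2025, 2, 1),  True),
    ("Medium",   date(2025, 2, 3),  date(2025, 2, 10), False),
]

def severity_distribution(findings) -> dict[str, int]:
    """Count findings per severity band."""
    dist: dict[str, int] = {}
    for sev, *_ in findings:
        dist[sev] = dist.get(sev, 0) + 1
    return dist

def mean_time_to_remediate(findings) -> float:
    """MTTR in days across remediated findings."""
    days = [(fixed - opened).days for _, opened, fixed, _ in findings]
    return sum(days) / len(days)

def regression_rate(findings) -> float:
    """Share of fixed findings that later reappeared (target: below 5%)."""
    return sum(1 for *_, reappeared in findings if reappeared) / len(findings)
```

Tracked per engagement, these three numbers give the trend lines the Measure function asks for: severity shifting downward, MTTR shrinking, regression rate staying near zero.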
Translating Findings to Risk Management Language
Technical red team findings must be translated into risk management language for organizational stakeholders. Use this structure:
Technical Finding: System prompt extractable via role-play technique
Risk Statement: There is a [MEDIUM] likelihood that an adversary
will extract the AI system's internal instructions, revealing
business logic and security control configurations that enable
more sophisticated follow-up attacks.
NIST AI RMF Mapping:
- Map: MP-2 (Risk identification) -- system prompt exposure
was not identified as a risk in the system's risk register
- Measure: MS-1 (Assessment) -- current security controls do
not prevent instruction extraction
- Manage: MG-1 (Risk treatment) -- implement output filtering
for instruction-like content, add to monitoring
Remediation Owner: AI Platform Team
Acceptance Authority: CISO (if risk is accepted)
Related Topics
- AI Security Frameworks Overview -- how risk management frameworks fit in the broader landscape
- EU AI Act Compliance Testing -- regulatory framework that references NIST AI RMF
- Cross-Framework Mapping Reference -- mapping NIST and ISO to OWASP and ATLAS
- Insurance & Compliance Requirements -- compliance certifications including ISO 42001
References
- "Artificial Intelligence Risk Management Framework (AI RMF 1.0)" - National Institute of Standards and Technology (2023) - The complete NIST AI RMF with Govern, Map, Measure, and Manage functions
- "NIST AI RMF Playbook" - National Institute of Standards and Technology (2023) - Companion guidance with suggested actions for implementing each AI RMF subcategory
- "ISO/IEC 42001:2023 Artificial Intelligence Management System" - International Organization for Standardization (2023) - Certifiable AI management system standard complementing NIST AI RMF
- "NIST AI 600-1: Generative AI Profile" - National Institute of Standards and Technology (2024) - Supplementary profile mapping generative AI risks to the AI RMF structure
- "Crosswalk: NIST AI RMF to ISO/IEC 42001" - National Institute of Standards and Technology (2024) - Official mapping between NIST and ISO AI governance standards