NIST AI RMF Assessment Walkthrough
Step-by-step guide for conducting assessments aligned with the NIST AI Risk Management Framework, covering the Govern, Map, Measure, and Manage functions for AI system security.
The NIST AI Risk Management Framework (AI RMF 1.0, published January 2023) provides a voluntary framework for organizations to manage AI risks. Unlike the OWASP LLM Top 10 or MITRE ATLAS, which catalog technical vulnerabilities and adversary techniques, the AI RMF also covers governance, organizational, and process-level risks. For organizations in regulated industries or those seeking to demonstrate due diligence, an AI RMF-aligned assessment provides comprehensive coverage that technical testing alone cannot deliver.
This walkthrough guides you through assessing an organization against the AI RMF's four core functions, with emphasis on the security-relevant subcategories that red team findings inform.
Step 1: Understand the AI RMF Structure
Four Core Functions
┌────────────────────────────────────────────────────────────┐
│                        NIST AI RMF                         │
│                                                            │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐    │
│  │  GOVERN  │  │   MAP    │  │ MEASURE  │  │  MANAGE  │    │
│  │          │  │          │  │          │  │          │    │
│  │ Culture  │  │ Context  │  │ Analyze  │  │ Respond  │    │
│  │ & process│  │ & risk   │  │ & assess │  │ & treat  │    │
│  │ for AI   │  │ framing  │  │ AI risks │  │ AI risks │    │
│  │ risk mgmt│  │          │  │          │  │          │    │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘    │
│                                                            │
│  Cross-cutting: GOVERN applies across all functions        │
└────────────────────────────────────────────────────────────┘
Security-Relevant Subcategories
| Function | Category | Subcategory | Security Relevance |
|---|---|---|---|
| GOVERN | Gov 1 | Gov 1.1-1.7 | AI risk policies, roles, legal compliance |
| GOVERN | Gov 2 | Gov 2.1-2.3 | Accountability structures |
| GOVERN | Gov 3 | Gov 3.1-3.2 | Workforce diversity and AI literacy |
| GOVERN | Gov 4 | Gov 4.1-4.3 | Organizational commitments to AI principles |
| GOVERN | Gov 5 | Gov 5.1-5.2 | Processes for AI risk management integration |
| GOVERN | Gov 6 | Gov 6.1-6.2 | Policies for third-party AI components |
| MAP | Map 1 | Map 1.1-1.6 | Intended purpose and context documentation |
| MAP | Map 2 | Map 2.1-2.3 | Interdisciplinary involvement |
| MAP | Map 3 | Map 3.1-3.5 | AI-specific benefits and costs |
| MAP | Map 5 | Map 5.1-5.2 | Impact characterization |
| MEASURE | Meas 1 | Meas 1.1-1.3 | Metrics and measurement approaches |
| MEASURE | Meas 2 | Meas 2.1-2.13 | AI system testing and evaluation |
| MEASURE | Meas 3 | Meas 3.1-3.3 | Risk tracking and monitoring |
| MANAGE | Man 1 | Man 1.1-1.4 | Risk prioritization and treatment |
| MANAGE | Man 2 | Man 2.1-2.4 | Residual risk response strategies |
| MANAGE | Man 3 | Man 3.1-3.2 | Risk communication to stakeholders |
| MANAGE | Man 4 | Man 4.1-4.3 | Incident response for AI risks |
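For tracking coverage during the assessment, the security-relevant ranges above can be kept in a small machine-readable registry. This is an illustrative Python sketch (the dict structure and helper are assumptions, not NIST artifacts), showing a subset of the table's categories:

```python
# Registry of security-relevant AI RMF categories (subset of the table
# above): category -> (number of subcategories, security relevance).
SECURITY_RELEVANT = {
    "GOVERN": {
        "Gov 1": (7, "AI risk policies, roles, legal compliance"),
        "Gov 2": (3, "Accountability structures"),
        "Gov 6": (2, "Policies for third-party AI components"),
    },
    "MAP": {
        "Map 1": (6, "Intended purpose and context documentation"),
        "Map 5": (2, "Impact characterization"),
    },
    "MEASURE": {
        "Meas 2": (13, "AI system testing and evaluation"),
        "Meas 3": (3, "Risk tracking and monitoring"),
    },
    "MANAGE": {
        "Man 1": (4, "Risk prioritization and treatment"),
        "Man 4": (3, "Incident response for AI risks"),
    },
}

def subcategory_ids(function: str) -> list[str]:
    """Expand each category into individual subcategory IDs,
    e.g. 'Gov 1' with 7 subcategories -> Gov 1.1 ... Gov 1.7."""
    ids: list[str] = []
    for category, (count, _relevance) in SECURITY_RELEVANT[function].items():
        ids.extend(f"{category}.{i}" for i in range(1, count + 1))
    return ids
```

Expanding `MEASURE`, for example, yields 16 checklist IDs (Meas 2.1-2.13 plus Meas 3.1-3.3), which can seed a per-subcategory evidence tracker.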
Step 2: Assess the GOVERN Function
The GOVERN function examines whether the organization has the policies, processes, and accountability structures to manage AI risk. This assessment is conducted through interviews and document review, not technical testing.
GOVERN Assessment Checklist
# GOVERN Function Assessment
## Gov 1: Policies and Procedures
- [ ] Organization has documented AI risk management policies
- [ ] Policies specifically address AI security risks (not just general IT security)
- [ ] Policies cover third-party AI model usage and associated risks
- [ ] Legal and regulatory compliance requirements for AI are identified
- [ ] Policies are reviewed and updated at least annually
- [ ] AI-specific incident response procedures exist
### Interview Questions
1. Who owns the AI risk management policy?
2. When was it last updated?
3. Does it specifically address adversarial attacks against AI systems?
4. How are third-party AI model risks assessed before deployment?
5. What triggers a policy review (new regulations, incidents, scheduled)?
### Evidence to Collect
- AI risk management policy document
- Policy review history and change log
- Third-party AI vendor assessment procedures
- AI-specific incident response playbook
## Gov 2: Accountability
- [ ] Roles and responsibilities for AI risk management are defined
- [ ] A specific individual or team is accountable for AI security
- [ ] Reporting lines for AI security issues are established
- [ ] Cross-functional collaboration (security, ML, legal, product) is formalized
## Gov 6: Third-Party AI Risk
- [ ] Policies exist for assessing third-party AI models and services
- [ ] AI provider terms of service are reviewed for security implications
- [ ] Procedures exist for handling AI provider security incidents
- [ ] Third-party AI dependencies are inventoried and monitored
Maturity Assessment
| Gov Category | Level 0: None | Level 1: Partial | Level 2: Defined | Level 3: Managed | Level 4: Optimizing |
|---|---|---|---|---|---|
| Gov 1: Policies | No AI policies | General IT policies apply | AI-specific policies exist | Policies actively enforced | Continuous improvement cycle |
| Gov 2: Accountability | No ownership | Informal ownership | Roles defined | Accountability enforced | Cross-functional governance board |
| Gov 6: Third-party | No assessment | Ad hoc review | Formal assessment process | Ongoing monitoring | Proactive risk management |
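When many categories are scored, the maturity scale can also be applied programmatically to surface shortfalls. A minimal sketch; the Level 3 target and the field naming are assumptions for illustration, not AI RMF requirements:

```python
# Maturity levels from the table above.
MATURITY_LEVELS = {0: "None", 1: "Partial", 2: "Defined",
                   3: "Managed", 4: "Optimizing"}

def maturity_gaps(scores: dict[str, int], target: int = 3) -> dict[str, int]:
    """Return each category scoring below `target` and its shortfall."""
    return {cat: target - lvl for cat, lvl in scores.items() if lvl < target}

# Example scores for the three GOVERN categories assessed above.
govern = {"Gov 1: Policies": 1, "Gov 2: Accountability": 2,
          "Gov 6: Third-party": 1}
print(maturity_gaps(govern))
# {'Gov 1: Policies': 2, 'Gov 2: Accountability': 1, 'Gov 6: Third-party': 2}
```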
Step 3: Assess the MAP Function
The MAP function evaluates how well the organization understands the context, purpose, and risk profile of its AI systems.
MAP Assessment Checklist
# MAP Function Assessment
## Map 1: Intended Purpose
- [ ] AI system's intended purpose is documented
- [ ] Known limitations are documented and communicated to users
- [ ] Deployment context (who uses it, how, where) is documented
- [ ] Out-of-scope uses are identified and controls exist to prevent them
- [ ] System behavior under adversarial conditions is characterized
### Red Team Contribution to Map 1
The red team assessment directly contributes to Map 1.5 (characterizing
known limitations) and Map 1.6 (determining AI system impact). Document:
- Specific adversarial conditions under which the system fails
- Safety boundary conditions discovered during testing
- Unexpected behaviors observed under edge case inputs
## Map 5: Impact Characterization
- [ ] Potential negative impacts of AI system failures are documented
- [ ] Impact assessment includes adversarial attack scenarios
- [ ] Impact assessment covers direct harm, discrimination, and privacy
- [ ] Severity of potential impacts is quantified where possible
Step 4: Assess the MEASURE Function
The MEASURE function is where red team assessment results contribute most directly. This function evaluates whether the organization tests and monitors its AI systems for risks.
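Because traceability from red team findings to MEASURE subcategories is the core of this step, it helps to keep the links machine-readable. A hypothetical Python sketch using this walkthrough's example finding IDs (F-001, F-003, F-004):

```python
# Each red team finding is linked to the MEASURE subcategory whose gap
# it evidences. Finding IDs and gap text are this walkthrough's examples.
FINDING_TO_MEASURE = {
    "F-001": ("Meas 2.5", "No prior adversarial testing of content filters"),
    "F-003": ("Meas 2.7", "Privacy testing did not include adversarial scenarios"),
    "F-004": ("Meas 2.6", "Robustness testing did not cover prompt extraction"),
}

def gaps_for(subcategory: str) -> list[str]:
    """Collect every gap statement mapped to one MEASURE subcategory."""
    return [gap for sub, gap in FINDING_TO_MEASURE.values() if sub == subcategory]
```

A subcategory with no mapped findings is not automatically clean; it may simply be untested, so record negative coverage separately.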
MEASURE Assessment Checklist
# MEASURE Function Assessment
## Meas 2: AI System Testing and Evaluation
Red team findings map directly to these subcategories:
### Meas 2.5: AI System Security Testing
- [ ] Regular security testing of AI systems is conducted
- [ ] Testing includes adversarial attack simulation (red teaming)
- [ ] Testing covers prompt injection, jailbreaking, and data exfiltration
- [ ] Testing methodology is documented and repeatable
- [ ] Testing results are tracked over time for trend analysis
### Meas 2.6: AI System Robustness Testing
- [ ] System behavior under adversarial inputs is evaluated
- [ ] Edge case and boundary condition testing is performed
- [ ] Non-deterministic behavior is characterized and bounded
- [ ] Failure modes are documented with severity classification
### Meas 2.7: AI System Privacy Testing
- [ ] PII exposure through model outputs is tested
- [ ] Training data memorization is evaluated
- [ ] Cross-tenant data isolation is verified
- [ ] Data minimization in model context is assessed
## Meas 3: Risk Monitoring
- [ ] Continuous monitoring for adversarial inputs exists
- [ ] Anomaly detection for model behavior is implemented
- [ ] Security metrics for AI systems are defined and tracked
- [ ] Monitoring covers all identified risk categories from the MAP function
Mapping Red Team Findings to MEASURE
# Red Team Findings → MEASURE Mapping
| Finding | Meas Subcategory | Gap Identified |
|---------|-----------------|----------------|
| F-001: Prompt injection bypass | Meas 2.5 | No prior adversarial testing of content filters |
| F-003: Cross-tenant data access | Meas 2.7 | Privacy testing did not include adversarial scenarios |
| F-004: System prompt extraction | Meas 2.6 | Robustness testing did not cover prompt extraction |
| No monitoring for injection attempts | Meas 3.1 | AI-specific security monitoring not implemented |
Step 5: Assess the MANAGE Function
The MANAGE function evaluates how the organization responds to identified AI risks.
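Man 1 asks for a documented prioritization methodology. One simple illustration is a likelihood-times-impact score on 1-5 scales; the formula and the example risk entries below are assumptions for this sketch, not AI RMF requirements:

```python
# Rank AI risks by likelihood * impact, both on a 1-5 scale.
def prioritize(risks: list[dict]) -> list[dict]:
    """Sort risks by descending likelihood * impact score."""
    return sorted(risks, key=lambda r: r["likelihood"] * r["impact"],
                  reverse=True)

risks = [
    {"id": "R-1", "name": "Prompt injection reaches production tools",
     "likelihood": 4, "impact": 4},   # score 16
    {"id": "R-2", "name": "Cross-tenant data leakage via model",
     "likelihood": 2, "impact": 5},   # score 10
    {"id": "R-3", "name": "Unreviewed third-party model update",
     "likelihood": 3, "impact": 3},   # score 9
]
ranked = prioritize(risks)
print([r["id"] for r in ranked])  # ['R-1', 'R-2', 'R-3']
```

Whatever the scoring scheme, the Man 1 checklist below also expects residual risk to be explicitly accepted by an appropriate authority.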
MANAGE Assessment Checklist
# MANAGE Function Assessment
## Man 1: Risk Prioritization
- [ ] AI risks are prioritized using a documented methodology
- [ ] Risk appetite for AI-specific risks is defined
- [ ] Risk treatment options are evaluated for each identified risk
- [ ] Residual risk is explicitly accepted by appropriate authority
## Man 2: Risk Treatment
- [ ] Remediation plans exist for identified AI security risks
- [ ] Remediation timelines are defined and tracked
- [ ] Effectiveness of remediations is verified (retesting)
- [ ] Alternative risk treatments (accept, transfer, avoid) are considered
## Man 4: AI Incident Response
- [ ] AI-specific incident types are defined (jailbreak in production, data leakage via model, adversarial attack detection)
- [ ] Incident response procedures cover AI-specific scenarios
- [ ] Incident response team includes AI/ML expertise
- [ ] Post-incident review process addresses AI-specific root causes
- [ ] Incident communication plan addresses AI-related public concerns
Step 6: Compile the Assessment Report
AI RMF Assessment Report Template
# NIST AI RMF Assessment Report
## 1. Executive Summary
[Organization, system assessed, assessment approach, key findings]
## 2. Assessment Scope and Methodology
- Technical red team assessment covering MEASURE function
- Interviews and document review covering GOVERN and MAP functions
- Organizational assessment covering MANAGE function
## 3. Maturity Assessment Summary
| Function | Category | Maturity Level | Key Gaps |
|----------|----------|---------------|----------|
| GOVERN | Gov 1: Policies | Level 1 | No AI-specific security policies |
| GOVERN | Gov 2: Accountability | Level 2 | AI security role defined but not staffed |
| GOVERN | Gov 6: Third-party | Level 1 | Ad hoc vendor assessment only |
| MAP | Map 1: Purpose | Level 2 | Adversarial limitations not documented |
| MAP | Map 5: Impact | Level 1 | No adversarial impact assessment |
| MEASURE | Meas 2: Testing | Level 1 | No regular adversarial testing |
| MEASURE | Meas 3: Monitoring | Level 0 | No AI-specific security monitoring |
| MANAGE | Man 1: Prioritization | Level 2 | AI risks not integrated into risk register |
| MANAGE | Man 4: Incident Response | Level 1 | No AI-specific IR procedures |
## 4. Function-Specific Findings
[Detailed findings for each function]
## 5. Recommendations
[Prioritized recommendations aligned to AI RMF categories]
## 6. Appendix: Red Team Technical Findings
[Reference to full technical report]
Recommendations Aligned to AI RMF
| Priority | Recommendation | AI RMF Category | Timeline |
|---|---|---|---|
| 1 | Establish AI security testing program | Meas 2.5 | 30 days |
| 2 | Implement AI-specific monitoring | Meas 3.1 | 60 days |
| 3 | Develop AI security incident response procedures | Man 4.1 | 30 days |
| 4 | Create AI risk management policy | Gov 1.1 | 45 days |
| 5 | Staff AI security role | Gov 2.1 | 60 days |
| 6 | Establish third-party AI assessment process | Gov 6.1 | 90 days |
| 7 | Document adversarial limitations | Map 1.5 | 30 days |
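To keep the report's maturity summary consistent with the recorded scores, the markdown table can be generated rather than hand-edited. A sketch; the row fields are assumptions matching the section 3 template:

```python
def maturity_table(rows: list[tuple[str, str, int, str]]) -> str:
    """Render (function, category, level, key gaps) rows as the report's
    markdown maturity summary table."""
    lines = [
        "| Function | Category | Maturity Level | Key Gaps |",
        "|----------|----------|----------------|----------|",
    ]
    for function, category, level, gaps in rows:
        lines.append(f"| {function} | {category} | Level {level} | {gaps} |")
    return "\n".join(lines)

print(maturity_table([
    ("MEASURE", "Meas 3: Monitoring", 0, "No AI-specific security monitoring"),
    ("MANAGE", "Man 4: Incident Response", 1, "No AI-specific IR procedures"),
]))
```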
Common NIST AI RMF Assessment Mistakes
- Treating it as only a technical assessment. The AI RMF covers governance, process, and organizational risk. A purely technical red team assessment addresses only the MEASURE function; conduct interviews and document reviews to cover GOVERN, MAP, and MANAGE.
- Confusing the AI RMF with a compliance certification. The AI RMF is a voluntary framework, not a compliance standard. Organizations adopt it for risk management, not to achieve certification, so frame recommendations as risk improvements, not compliance gaps.
- Assessing all subcategories equally. Focus on the security-relevant subcategories. An AI red team does not need to assess fairness, bias, or explainability unless those are specifically in scope.
- Ignoring the GOVERN function. Governance gaps (no AI security policy, no accountability, no third-party assessment) often matter more than individual technical vulnerabilities because they represent systemic issues.
- Not linking technical findings to framework gaps. A prompt injection finding (technical) should link to the lack of an adversarial testing program (Meas 2.5) and the absence of AI-specific monitoring (Meas 3.1). This connection from the technical to the organizational is what makes an AI RMF assessment valuable.
Knowledge check: a red team assessment finds that the organization has no process for testing AI systems against adversarial attacks before deployment. Which NIST AI RMF function and category does this gap map to? (Answer: the MEASURE function, Meas 2, specifically the security testing subcategory used throughout this walkthrough, Meas 2.5.)
Related Topics
- Mapping to OWASP LLM Top 10 -- Technical vulnerability classification that feeds into MEASURE
- EU AI Act Compliance Testing -- Regulatory framework that references NIST AI RMF
- ISO 42001 AI Audit Prep -- Standards-based audit that aligns with AI RMF
- Continuous Assessment Program -- Ongoing assessment supporting MEASURE and MANAGE functions