AI Risk Assessment Methodology
Structured approaches to evaluating AI system risks, including threat identification, scoring frameworks, treatment planning, and templates for conducting comprehensive AI risk assessments.
AI risk assessment is the foundation of any AI governance program. It determines which systems need the most scrutiny, what controls are required, and where red team resources should be focused. Unlike traditional IT risk assessment, AI risk assessment must account for emergent behaviors, algorithmic bias, safety failures, and the probabilistic nature of AI outputs.
Risk Assessment Process
The Five-Phase Approach
Phase 1: System identification and classification
Inventory all AI systems and classify them by risk level. This determines the depth of assessment required.
Activities:
- Enumerate all AI systems (production, development, experimental)
- Classify by AI type (generative, predictive, classification, recommendation)
- Identify the decision domain (healthcare, finance, employment, general)
- Assign initial risk tier based on use case and impact potential
Phase 2: Threat identification
Identify threats specific to each AI system. Use a combination of threat intelligence, framework-based enumeration, and adversarial thinking.
Threat sources:
- OWASP LLM Top 10 vulnerability categories
- MITRE ATLAS tactics and techniques
- NIST AI 600-1 risk categories (for generative AI)
- Industry-specific threat profiles
- Historical incident analysis
Phase 3: Risk analysis and scoring
Evaluate each identified threat against the specific AI system context. Score based on likelihood and impact using a consistent methodology.
Phase 4: Risk treatment planning
For each scored risk, determine the appropriate treatment: accept, mitigate, transfer, or avoid. Map treatments to specific controls and testing activities.
Phase 5: Documentation and communication
Document assessment results and communicate to stakeholders. Feed results into compliance reporting and red team scoping.
AI System Classification
Risk Tier Framework
| Tier | Risk Level | Criteria | Assessment Depth | Red Team Frequency |
|---|---|---|---|---|
| Tier 1 | Critical | Autonomous decisions affecting health, safety, liberty, or financial standing | Full assessment with adversarial testing | Quarterly |
| Tier 2 | High | AI significantly influences human decisions in regulated domains | Comprehensive assessment | Semi-annually |
| Tier 3 | Medium | Customer-facing AI with moderate impact potential | Standard assessment | Annually |
| Tier 4 | Low | Internal tools, non-consequential applications | Lightweight assessment | Every 2 years or event-driven |
Classification Decision Tree
Is the AI system making autonomous decisions?
├── Yes → Does the decision affect health, safety, liberty, or financial standing?
│ ├── Yes → Tier 1 (Critical)
│ └── No → Tier 2 (High)
└── No → Does the AI significantly influence human decisions?
├── Yes → Is the domain regulated (healthcare, finance, employment)?
│ ├── Yes → Tier 2 (High)
│ └── No → Tier 3 (Medium)
└── No → Tier 4 (Low)
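The decision tree above can be sketched as a small function. This is a minimal illustration, not part of the methodology itself; the parameter names are hypothetical, and a real implementation would draw these attributes from the AI system inventory.

```python
def classify_tier(autonomous: bool, consequential: bool,
                  influences_humans: bool, regulated_domain: bool) -> int:
    """Return the risk tier (1 = Critical ... 4 = Low) per the decision tree.

    consequential: decision affects health, safety, liberty, or financial standing.
    regulated_domain: healthcare, finance, employment, etc.
    """
    if autonomous:
        return 1 if consequential else 2
    if influences_humans:
        return 2 if regulated_domain else 3
    return 4
```

For example, a chatbot that merely influences human decisions in an unregulated domain lands in Tier 3, while any autonomous, consequential system is Tier 1 regardless of domain.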
Risk Scoring Frameworks
Quantitative Scoring Matrix
Use this scoring matrix to consistently evaluate AI risks across systems:
Likelihood Score (L):
| Score | Level | Description |
|---|---|---|
| 1 | Rare | Requires nation-state resources and novel techniques |
| 2 | Unlikely | Requires significant expertise and targeted effort |
| 3 | Possible | Achievable by skilled practitioners with available tools |
| 4 | Likely | Achievable by moderately skilled attackers with public techniques |
| 5 | Almost certain | Trivially exploitable, public exploits available |
Impact Score (I):
| Score | Level | Description |
|---|---|---|
| 1 | Negligible | No measurable harm, minor inconvenience |
| 2 | Minor | Limited data exposure, temporary service disruption |
| 3 | Moderate | Personal data breach, financial loss under $100K, reputational harm |
| 4 | Major | Large-scale data breach, financial loss over $1M, regulatory action |
| 5 | Catastrophic | Physical harm, loss of life, existential business risk, mass data breach |
Risk Score = L x I
| Risk Score | Rating | Treatment Approach |
|---|---|---|
| 1-4 | Low | Accept or monitor |
| 5-9 | Medium | Mitigate within standard timelines |
| 10-15 | High | Mitigate with priority |
| 16-25 | Critical | Immediate action required |
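The L x I scoring and rating bands above translate directly into code. A minimal sketch, using the thresholds from the table:

```python
def risk_score(likelihood: int, impact: int) -> tuple[int, str]:
    """Compute Risk Score = L x I and map it to a rating band."""
    score = likelihood * impact
    if score <= 4:
        rating = "Low"        # accept or monitor
    elif score <= 9:
        rating = "Medium"     # mitigate within standard timelines
    elif score <= 15:
        rating = "High"       # mitigate with priority
    else:
        rating = "Critical"   # immediate action required
    return score, rating
```

So a likelihood of 4 (Likely) and an impact of 3 (Moderate), as in the register entry later in this document, yields a score of 12 and a High rating.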
AI-Specific Risk Dimensions
Beyond likelihood and impact, AI systems require assessment across additional dimensions:
| Dimension | Score Range | What It Measures |
|---|---|---|
| Autonomy | 1-5 | Degree of autonomous decision-making without human oversight |
| Opacity | 1-5 | Difficulty of explaining or interpreting AI decisions |
| Scale | 1-5 | Number of people or decisions affected |
| Reversibility | 1-5 | Difficulty of undoing harm caused by AI errors |
| Bias potential | 1-5 | Likelihood of discriminatory outcomes across protected groups |
Composite AI Risk Score = (L x I) + Autonomy + Opacity + Scale + Reversibility + Bias potential
This produces a score range of 6-50, providing finer granularity for prioritization:
| Composite Score | Priority | Action |
|---|---|---|
| 6-15 | Standard | Routine assessment and monitoring |
| 16-25 | Elevated | Enhanced assessment, periodic red teaming |
| 26-35 | High | Comprehensive assessment, regular red teaming, continuous monitoring |
| 36-50 | Critical | Immediate comprehensive assessment, frequent red teaming, real-time monitoring |
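The composite formula and priority bands can be sketched the same way. The function names are illustrative only; the arithmetic follows the definition above, which bounds the score between 6 (all minimums) and 50 (all maximums).

```python
def composite_score(likelihood: int, impact: int, autonomy: int, opacity: int,
                    scale: int, reversibility: int, bias: int) -> int:
    """Composite AI Risk Score = (L x I) + the five AI-specific dimensions."""
    return (likelihood * impact) + autonomy + opacity + scale + reversibility + bias

def composite_priority(score: int) -> str:
    """Map a composite score (6-50) to its priority band."""
    if score <= 15:
        return "Standard"
    if score <= 25:
        return "Elevated"
    if score <= 35:
        return "High"
    return "Critical"
```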
Threat Identification for AI Systems
AI-Specific Threat Categories
| Threat Category | Description | Example Attacks | Assessment Method |
|---|---|---|---|
| Input manipulation | Adversarial modification of AI inputs to alter outputs | Prompt injection, adversarial examples, data poisoning | Red team testing, automated scanning |
| Data extraction | Unauthorized extraction of training data or model information | Training data extraction, model stealing, membership inference | Data extraction testing, privacy audits |
| Output manipulation | Causing the AI to produce harmful, biased, or incorrect outputs | Jailbreaking, hallucination exploitation, bias amplification | Safety testing, bias assessment |
| System compromise | Traditional security attacks on AI infrastructure | API exploitation, model weight theft, supply chain attacks | Penetration testing, infrastructure assessment |
| Operational degradation | Attacks that degrade AI performance over time | Model drift poisoning, feedback loop manipulation, resource exhaustion | Performance monitoring, drift detection |
Threat Modeling Template
For each AI system, document threats using this structure:
| Field | Description |
|---|---|
| Threat ID | Unique identifier (e.g., T-GenAI-001) |
| Category | From the threat categories above |
| Description | Detailed description of the threat scenario |
| Attack vector | How the threat would be realized (e.g., API input, training pipeline, RAG data source) |
| Preconditions | What must be true for the attack to succeed |
| Impact | Specific consequences if the threat is realized |
| Existing controls | Current mitigations in place |
| Residual risk | Risk level after considering existing controls |
| Testing approach | How red teaming would validate this threat |
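For teams that track threats in code rather than spreadsheets, the template's fields map naturally onto a record type. A minimal sketch; the class and field names are hypothetical, chosen to mirror the template above:

```python
from dataclasses import dataclass, field

@dataclass
class ThreatRecord:
    """One row of the threat modeling template."""
    threat_id: str           # e.g. "T-GenAI-001"
    category: str            # from the threat categories table
    description: str
    attack_vector: str       # e.g. "API input", "training pipeline", "RAG data source"
    preconditions: list[str] = field(default_factory=list)
    impact: str = ""
    existing_controls: list[str] = field(default_factory=list)
    residual_risk: str = ""  # risk level after considering existing controls
    testing_approach: str = ""

# Example entry for a prompt injection threat against a RAG application
t = ThreatRecord(
    threat_id="T-GenAI-001",
    category="Input manipulation",
    description="Indirect prompt injection via retrieved documents",
    attack_vector="RAG data source",
    preconditions=["Attacker can place content in an indexed source"],
    existing_controls=["Input filtering"],
    residual_risk="High",
    testing_approach="Red team injection payloads in seeded documents",
)
```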
Risk Treatment Planning
Treatment Options
| Treatment | When to Apply | Documentation Required |
|---|---|---|
| Accept | Risk score is within organizational risk appetite and regulatory tolerance | Risk acceptance form signed by risk owner |
| Mitigate | Controls can reduce risk to acceptable levels at reasonable cost | Control implementation plan with timeline and validation criteria |
| Transfer | Risk can be shifted to a third party (insurance, contract provisions) | Insurance policy or contractual provisions documented |
| Avoid | Risk is unacceptable and no adequate mitigation exists | Decision to discontinue the AI system or use case |
Control Selection Framework
When selecting controls to mitigate AI risks, map to applicable frameworks:
| Risk Category | Technical Controls | Process Controls | Framework Reference |
|---|---|---|---|
| Prompt injection | Input filtering, output scanning, prompt hardening | Security review, change management | OWASP LLM01, ISO 42001 A.6.2.4 |
| Data extraction | Differential privacy, output filtering, rate limiting | Data classification, access management | NIST AI 600-1 (Privacy), ISO 42001 A.7.4 |
| Bias and fairness | Bias testing, fairness constraints, diverse training data | Impact assessments, human oversight | EU AI Act Art. 10, NIST AI 600-1 (Bias) |
| Model security | Access controls, encryption, integrity verification | Model governance, lifecycle management | ISO 42001 A.6.2.5, SOC 2 CC6.1 |
| Safety failures | Safety testing, guardrails, kill switches | Incident response, escalation procedures | EU AI Act Art. 9, ISO 42001 A.5.4 |
Risk Assessment Templates
Template: GenAI Application Risk Assessment
| Section | Content |
|---|---|
| System identification | System name, owner, purpose, users, deployment environment |
| AI characteristics | Model type, training data, inference method, update frequency |
| Regulatory applicability | Applicable regulations (EU AI Act tier, state laws, sector regulations) |
| Threat inventory | Enumerated threats with scores from the threat modeling template |
| Control inventory | Current controls mapped to threats |
| Gap analysis | Threats without adequate controls |
| Risk register | Scored and prioritized risk list |
| Treatment plan | Selected treatment for each risk with timeline |
| Testing requirements | Red team testing activities required to validate controls |
| Review schedule | When the assessment will be updated |
Template: AI System Risk Register Entry
| Field | Example Value |
|---|---|
| Risk ID | R-2026-042 |
| AI System | Customer Service Chatbot v3.2 |
| Risk Description | System prompt extraction revealing internal business logic and API schemas |
| Threat Category | Data extraction |
| Likelihood | 4 (Likely -- public techniques available) |
| Impact | 3 (Moderate -- business logic exposure, potential for further exploitation) |
| Risk Score | 12 (High) |
| Existing Controls | Input filtering for known extraction patterns |
| Control Effectiveness | Partial -- bypassed in 3 of 10 red team tests |
| Treatment | Mitigate -- implement multi-layer prompt protection and monitoring |
| Timeline | 30 days for implementation, 45 days for validation |
| Risk Owner | Head of AI Engineering |
| Last Assessed | 2026-03-15 |
| Next Review | 2026-06-15 |
Integrating Red Team Results
Feeding Red Team Findings into Risk Assessments
Red team assessments produce evidence that directly updates risk assessments:
| Red Team Finding | Risk Assessment Update |
|---|---|
| New vulnerability discovered | Add to threat inventory, score, and add to risk register |
| Existing control bypassed | Update control effectiveness rating, increase residual risk score |
| Control validated as effective | Document as evidence of control operating effectiveness |
| New attack technique used | Add technique to threat model, assess applicability to other systems |
| Severity rating from red team | Cross-reference with risk score to validate scoring methodology |
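One way to make the "existing control bypassed" update concrete is to derive control effectiveness from red team test outcomes, as in the register entry above (bypassed in 3 of 10 tests). This is a sketch under an explicit assumption, not a prescribed formula: it assumes residual risk scales with the fraction of attacks the control fails to block.

```python
def control_effectiveness(tests_run: int, bypasses: int) -> float:
    """Fraction of red team tests the control successfully blocked."""
    return 1 - bypasses / tests_run

def residual_score(inherent_score: int, effectiveness: float) -> int:
    # Assumption: residual risk is the inherent score scaled by the
    # bypass rate. Organizations may use a different reduction model.
    return round(inherent_score * (1 - effectiveness))

# Control bypassed in 3 of 10 tests against an inherent score of 12
eff = control_effectiveness(10, 3)   # 0.7
residual = residual_score(12, eff)
```

However the reduction is modeled, the key point is that the red team's bypass rate, not an analyst's estimate, drives the control effectiveness rating in the register.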
Continuous Risk Assessment
Risk assessments should not be static documents. Integrate continuous inputs:
| Input Source | Update Frequency | Risk Assessment Action |
|---|---|---|
| Automated red team tests | Daily/weekly | Update control effectiveness metrics |
| Manual red team assessments | Quarterly | Comprehensive risk re-evaluation |
| Threat intelligence | Ongoing | Update threat landscape and likelihood scores |
| Regulatory changes | As published | Update regulatory applicability and required controls |
| AI system changes | Each deployment | Trigger reassessment of affected risk areas |
| Incident reports | As they occur | Update likelihood scores, validate impact assessments |
Common Assessment Mistakes
| Mistake | Consequence | Correction |
|---|---|---|
| Assessing AI risk with IT risk frameworks only | AI-specific risks (bias, hallucination, emergent behavior) are missed | Add AI-specific risk dimensions to existing frameworks |
| Scoring all AI risks equally | Resources spread thin across low-priority items | Use the composite scoring methodology to differentiate |
| Ignoring supply chain risk | Third-party AI components create unassessed risk | Include all AI components in scope, including APIs and foundation models |
| Conflating model risk with system risk | Model may be safe but system integration creates vulnerabilities | Assess the complete system, not just the model |
| No re-assessment trigger | Assessments become stale after AI system updates | Define triggers for re-assessment (model change, data change, regulatory change) |