Critical Infrastructure AI Security
Security testing for AI in critical infrastructure: SCADA/ICS integration, power grid AI, transportation systems, water treatment, and the convergence of operational technology with artificial intelligence.
Critical infrastructure AI represents the convergence of two domains that were traditionally separate: operational technology (OT) that controls physical processes and artificial intelligence that makes decisions based on data patterns. This convergence creates attack surfaces that neither OT security nor AI security frameworks adequately address in isolation.
The defining characteristic of critical infrastructure AI is that failures can cause physical consequences — power outages, transportation accidents, water contamination, communications blackouts. Unlike enterprise AI where the worst case is typically data exposure or financial loss, critical infrastructure AI failures can endanger human life.
The IT/OT/AI Convergence
Traditional IT/OT Separation
Critical infrastructure has historically maintained strict separation between IT networks (corporate email, databases, web servers) and OT networks (SCADA systems, PLCs, RTUs, industrial control systems). This air gap provided security through isolation — even if the corporate network was compromised, the attacker could not reach the systems controlling physical processes.
AI integration erodes this separation in several ways:
Traditional Architecture:

```
[IT Network] --- AIR GAP --- [OT Network] --- [Physical Process]
```

AI-Integrated Architecture:

```
[IT Network] <-> [AI Platform] <-> [OT Network] <-> [Physical Process]
      |                |                 |
  Cloud APIs     Model Training     Sensor Data
  Data Lakes     Inference APIs   Control Signals
```
The AI platform sits between IT and OT, consuming sensor data from OT systems and potentially sending control signals back. This bridging role means that a compromise of the AI platform can propagate in both directions.
New Attack Vectors from AI Integration
| Vector | Description | Impact |
|---|---|---|
| Sensor data poisoning | Manipulate sensor inputs to the AI to cause incorrect control decisions | Physical process disruption |
| Model manipulation | Alter the AI model to produce dangerous control outputs | Equipment damage, safety hazard |
| Inference latency attacks | Slow down AI decision-making during time-critical processes | Process instability |
| Training data poisoning | Poison historical data used to train predictive maintenance or control optimization models | Long-term degradation |
| Digital twin divergence | Cause the digital twin to diverge from the physical system, masking real problems | Undetected failures |
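As one illustration of defending against the first vector, sensor inputs can be screened for physical plausibility before they reach the model. The sketch below is a minimal, hypothetical example — the sensor names and thresholds are invented for illustration, not taken from any real plant:

```python
# Hypothetical sketch: range and rate-of-change plausibility checks on
# sensor readings before they are fed to a control-optimization model.
# Sensor names and thresholds are illustrative only.

PLAUSIBILITY = {
    "boiler_temp_c": {"min": 0.0, "max": 600.0, "max_delta": 15.0},
    "line_pressure_bar": {"min": 0.0, "max": 40.0, "max_delta": 2.0},
}

def validate_reading(name, value, previous):
    """Return True if the reading passes basic physical plausibility."""
    limits = PLAUSIBILITY[name]
    if not limits["min"] <= value <= limits["max"]:
        return False  # outside the physically possible range
    if previous is not None and abs(value - previous) > limits["max_delta"]:
        return False  # implausible jump between consecutive samples
    return True
```

Readings that fail the check would be quarantined and alarmed rather than silently dropped, so operators can distinguish sensor faults from active poisoning.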
Sector-Specific Considerations
Energy Sector
Power generation, transmission, and distribution increasingly rely on AI for load forecasting, fault detection, renewable integration, and grid optimization. The energy sector's interconnected nature means that AI failures can cascade across the grid.
Key attack surfaces:
- Load forecasting manipulation — Causing the AI to over-predict or under-predict demand leads to generation/load imbalance
- Renewable integration AI — Manipulating predictions of solar and wind output affects grid stability
- Fault detection evasion — Causing the AI to miss equipment failures that could lead to transformer explosions or line faults
See the dedicated Power Grid AI page for detailed coverage.
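To make the forecasting risk concrete, the following sketch (a toy example, not a real grid model) shows how a small persistent bias injected into historical load data shifts a naive moving-average forecast, quietly under-committing generation:

```python
# Illustrative sketch: a naive moving-average load forecaster, and the
# effect of a small persistent bias injected into its historical inputs.
# Values are invented for illustration.

def moving_average_forecast(history, window=24):
    """Forecast the next interval as the mean of the last `window` samples."""
    recent = history[-window:]
    return sum(recent) / len(recent)

clean = [1000.0] * 48                 # MW, flat demand history
poisoned = [x * 0.97 for x in clean]  # attacker shaves 3% off each sample

clean_fc = moving_average_forecast(clean)        # 1000 MW
poisoned_fc = moving_average_forecast(poisoned)  # ~970 MW
shortfall = clean_fc - poisoned_fc               # ~30 MW under-commitment
```

A 3% bias is small enough to sit inside normal forecast error, which is what makes this class of poisoning hard to detect from forecast accuracy metrics alone.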
Transportation
AI in transportation spans autonomous vehicles, air traffic management, rail control, maritime navigation, and traffic signal optimization. Safety certification requirements (DO-178C for aviation, ISO 26262 for automotive) add regulatory complexity.
Key attack surfaces:
- Autonomous vehicle perception — Adversarial inputs that cause misclassification of road objects
- Traffic management AI — Manipulation that creates gridlock or dangerous traffic patterns
- Rail scheduling AI — Attacks that cause scheduling conflicts or speed miscalculations
See the dedicated Transportation AI page for detailed coverage.
Water and Wastewater
Water treatment facilities use AI for chemical dosing optimization, contamination detection, flow management, and predictive maintenance. Attacks on water treatment AI can affect public health.
Key attack surfaces:
- Chemical dosing AI — Manipulating the AI to over-dose or under-dose treatment chemicals
- Contamination detection evasion — Causing the AI to miss contamination events
- Flow management — Disrupting water distribution patterns
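One common mitigation pattern for the dosing case is to enforce limits outside the model, so a manipulated AI cannot drive chemical dosing beyond engineered bounds. A minimal sketch, with illustrative (not regulatory) limits:

```python
# Hypothetical guardrail sketch: clamp an AI-recommended chlorine dose to
# an engineered safe band enforced outside the model. The band values are
# illustrative, not regulatory limits.

SAFE_DOSE_MG_L = (0.2, 4.0)  # assumed lower/upper chlorine bounds, mg/L

def apply_dose(recommended_mg_l):
    """Clamp the model's recommendation and flag out-of-band outputs."""
    lo, hi = SAFE_DOSE_MG_L
    clamped = min(max(recommended_mg_l, lo), hi)
    alarm = clamped != recommended_mg_l  # out-of-band recommendation detected
    return clamped, alarm
```

The alarm matters as much as the clamp: repeated out-of-band recommendations are themselves an indicator that the model or its inputs have been tampered with.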
Communications
Telecommunications networks use AI for network optimization, anomaly detection, traffic routing, and capacity planning. As 5G networks deploy AI-native architectures, the attack surface expands.
Key attack surfaces:
- Network optimization AI — Causing the AI to misallocate resources, creating coverage gaps
- Anomaly detection evasion — Hiding malicious traffic from AI-based intrusion detection
- Emergency services routing — Manipulating AI that routes emergency calls
Red Team Engagement Framework
Safety-First Methodology
Red team engagements against critical infrastructure AI must follow a safety-first methodology: avoiding disruption to essential services takes precedence over test coverage.
Digital twin verification
Verify that the test environment accurately replicates the production system. A digital twin that does not faithfully model the physical process may hide vulnerabilities or create false positives.
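One simple fidelity check is to compare twin-predicted sensor values against physical telemetry over a window and reject the test environment if divergence exceeds a threshold. A minimal sketch (the threshold and series names are assumptions):

```python
# Sketch of one way to quantify twin fidelity: root-mean-square error
# between twin-predicted and observed sensor values over a window, with
# an illustrative acceptance threshold.

import math

def twin_rmse(twin_series, observed_series):
    """RMSE between twin predictions and physical telemetry."""
    pairs = list(zip(twin_series, observed_series))
    return math.sqrt(sum((t - o) ** 2 for t, o in pairs) / len(pairs))

def twin_is_faithful(twin_series, observed_series, threshold=1.5):
    """Accept the test environment only if divergence stays within threshold."""
    return twin_rmse(twin_series, observed_series) <= threshold
```

In practice the threshold would be set per sensor from historical twin accuracy, and the check rerun throughout the engagement, since divergence can drift as the physical plant changes.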
Safety boundary definition
Define explicit safety boundaries that the red team must not cross, even in the test environment. These boundaries should be enforced by technical controls, not just policy.
Escalation procedures
Establish clear escalation procedures for findings that indicate imminent risk to production systems. The red team must be able to immediately notify operations if they discover a vulnerability that is actively exploitable in production.
Phased testing
Begin with passive analysis (architecture review, data flow mapping) before progressing to active testing. Active testing should start with the least dangerous attack vectors and escalate gradually.
Operations coordination
Coordinate all active testing with operations teams. Ensure operators know testing is occurring and can distinguish test activity from real attacks or system failures.
Consequence-Aware Testing
Critical infrastructure red team testing must account for the physical consequences of AI failures. Unlike enterprise AI testing, where finding a vulnerability is always a success, critical infrastructure testing must carefully evaluate whether demonstrating a vulnerability could itself cause harm.
Consequence categories for critical infrastructure AI findings:
| Category | Description | Example | Response Priority |
|---|---|---|---|
| Safety-critical | Vulnerability could cause injury or loss of life | AI-controlled valve exceeds pressure limits | Immediate notification, stop testing |
| Equipment damage | Vulnerability could damage physical equipment | Predictive maintenance AI misses bearing failure | Same-day notification |
| Service disruption | Vulnerability could cause outage of essential service | Load forecast manipulation causes blackout | 24-hour notification |
| Economic impact | Vulnerability causes financial loss without safety risk | Grid optimization AI increases costs | Standard reporting |
| Data exposure | Vulnerability exposes sensitive infrastructure data | SCADA network topology leaked through AI | Standard reporting |
The testing team must have pre-established communication channels with the operations team so that safety-critical findings can be escalated immediately, even outside business hours.
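The consequence table above can be encoded as a triage policy so escalation rules are applied consistently across the engagement. A sketch, where the "same-day" window is assumed to be eight hours for illustration:

```python
# Sketch encoding the consequence-category table as a triage policy.
# The 8-hour "same-day" window is an assumption; None means standard reporting.

RESPONSE_POLICY = {
    "safety_critical":    {"notify_within_h": 0,    "halt_testing": True},
    "equipment_damage":   {"notify_within_h": 8,    "halt_testing": False},
    "service_disruption": {"notify_within_h": 24,   "halt_testing": False},
    "economic_impact":    {"notify_within_h": None, "halt_testing": False},
    "data_exposure":      {"notify_within_h": None, "halt_testing": False},
}

def triage(category):
    """Return the escalation action for a finding's consequence category."""
    policy = RESPONSE_POLICY[category]
    if policy["halt_testing"]:
        return "STOP TESTING: immediate out-of-band notification"
    if policy["notify_within_h"] is not None:
        return f"Notify operations within {policy['notify_within_h']}h"
    return "Include in standard report"
```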
Testing Priorities
Given the safety constraints, prioritize testing based on the intersection of likelihood and impact:
```python
# Priority matrix for critical infrastructure AI testing
testing_priorities = {
    "critical": [
        "Sensor data poisoning affecting safety-critical controls",
        "Model manipulation causing dangerous physical outputs",
        "AI bypass allowing direct control of physical processes",
    ],
    "high": [
        "Predictive maintenance AI manipulation (delayed maintenance)",
        "Load/demand forecasting poisoning",
        "Anomaly detection evasion for physical intrusions",
    ],
    "medium": [
        "Digital twin divergence attacks",
        "Historical data poisoning for model retraining",
        "AI-based access control bypass for OT networks",
    ],
    "lower": [
        "Reporting and visualization AI manipulation",
        "Non-safety administrative AI systems",
        "Training data extraction from infrastructure models",
    ],
}
```

Regulatory Landscape
Critical infrastructure AI is subject to sector-specific regulations in addition to general AI governance requirements:
| Sector | Primary Regulator | Key Requirements |
|---|---|---|
| Energy | FERC, NERC | NERC CIP standards, FERC Order 2222 (distributed energy resources) |
| Transportation | DOT, FAA, NHTSA | DO-178C, ISO 26262, 49 CFR |
| Water | EPA | SDWA, AWIA Section 2013 |
| Communications | FCC | Network reliability requirements |
| Cross-sector | CISA | CPGs, Shields Up guidance |
Red team findings should be mapped to the applicable sector-specific standards so that remediation is actionable within each sector's regulatory framework.
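A lightweight way to keep findings regulator-ready is to tag each finding type with candidate standards at report time. The mapping below is illustrative — the specific clause pairings are examples chosen for this sketch, not a vetted crosswalk:

```python
# Illustrative mapping from finding types to the sector standards named
# above. The pairings are examples, not a complete or authoritative crosswalk.

FINDING_TO_STANDARDS = {
    "ot_network_access_bypass": ["NERC CIP-005 (electronic security perimeter)"],
    "av_perception_evasion":    ["ISO 26262 (automotive functional safety)"],
    "dosing_ai_manipulation":   ["AWIA Section 2013 (risk and resilience assessments)"],
}

def standards_for(finding_type):
    """Look up candidate standards; fall back to cross-sector CISA guidance."""
    return FINDING_TO_STANDARDS.get(finding_type, ["CISA Cross-Sector CPGs"])
```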
Section Contents
- SCADA/ICS + AI Attacks — Attacking AI integrated with industrial control systems
- Power Grid AI — Energy sector AI security
- Transportation AI — Transportation sector AI security