FDA AI/ML Regulation
FDA regulatory framework for AI and machine learning in medical devices including Software as a Medical Device classification, predetermined change control plans, real-world performance monitoring, and red team testing implications.
The FDA regulates AI and machine learning systems used in clinical settings through its Software as a Medical Device (SaMD) framework. This regulation directly affects red team engagements because many healthcare AI systems are FDA-cleared or FDA-approved medical devices, and security testing must operate within the constraints of that regulatory status. Findings must be reported in terms that map to FDA's risk framework, and certain types of modifications to address security vulnerabilities may themselves require FDA authorization.
Software as a Medical Device (SaMD)
What Qualifies as SaMD
The FDA defines SaMD as software intended to be used for one or more medical purposes that performs those purposes without being part of a hardware medical device. AI systems that analyze medical data to provide diagnostic information, treatment recommendations, or clinical measurements may qualify as SaMD.
SaMD determination factors:
| Factor | SaMD Likely | SaMD Unlikely |
|---|---|---|
| Provides diagnosis or treatment recommendation | Yes — directly influences clinical decisions | No — provides general health information only |
| Analyzes patient-specific data | Yes — processes individual patient data for clinical use | No — provides population-level statistics |
| Claims clinical performance | Yes — marketed with clinical accuracy claims | No — explicitly disclaims clinical use |
| Integrated into clinical workflow | Yes — outputs feed into clinical decision-making | No — standalone wellness application |
| FDA-cleared predicate exists | Yes — similar cleared devices exist in the market | Varies — novel intended use may require De Novo classification |
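The factors in the table above can be turned into a rough screening checklist. The sketch below is illustrative only (the factor names and thresholds are assumptions, not FDA criteria); an actual SaMD determination is a regulatory judgment made with FDA guidance, not a boolean computation.

```python
# Sketch: first-pass screening against the SaMD determination factors
# above. Factor names and the hit thresholds are illustrative, not
# drawn from FDA regulation.

SAMD_FACTORS = [
    "provides_diagnosis_or_treatment_recommendation",
    "analyzes_patient_specific_data",
    "claims_clinical_performance",
    "integrated_into_clinical_workflow",
]

def samd_screen(system_profile: dict) -> str:
    """Return a rough screening result from boolean factor answers."""
    hits = sum(bool(system_profile.get(f)) for f in SAMD_FACTORS)
    if hits >= 3:
        return "samd_likely"
    if hits >= 1:
        return "needs_regulatory_review"
    return "samd_unlikely"
```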
Risk Classification
The FDA classifies SaMD using a two-dimensional framework based on the significance of the information provided and the healthcare situation or condition:
| Healthcare Situation | Treat/Diagnose | Drive Clinical Management | Inform Clinical Management |
|---|---|---|---|
| Critical | Class III (PMA) | Class III (PMA/De Novo) | Class II (510(k)) |
| Serious | Class III (PMA) | Class II (510(k)) | Class II (510(k)) |
| Non-serious | Class II (510(k)) | Class II (510(k)) | Class I (Exempt) |

PMA = Premarket Approval (highest scrutiny); 510(k) = Premarket Notification (substantial equivalence); De Novo = novel device classification.
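The two-dimensional framework above can be expressed as a simple lookup, which is useful for tagging red team findings with the affected device's class and pathway. This is a sketch mirroring the table in this section; the keys and function name are illustrative, and actual classification is determined by FDA, not by code.

```python
# Sketch: lookup of the two-dimensional SaMD risk framework above.
# The mapping mirrors the table in this section; key and function
# names are illustrative.

RISK_MATRIX = {
    ("critical", "treat_diagnose"): ("Class III", "PMA"),
    ("critical", "drive_management"): ("Class III", "PMA/De Novo"),
    ("critical", "inform_management"): ("Class II", "510(k)"),
    ("serious", "treat_diagnose"): ("Class III", "PMA"),
    ("serious", "drive_management"): ("Class II", "510(k)"),
    ("serious", "inform_management"): ("Class II", "510(k)"),
    ("non_serious", "treat_diagnose"): ("Class II", "510(k)"),
    ("non_serious", "drive_management"): ("Class II", "510(k)"),
    ("non_serious", "inform_management"): ("Class I", "Exempt"),
}

def classify_samd(situation: str, significance: str) -> tuple[str, str]:
    """Return (device class, typical pathway) for a SaMD profile."""
    return RISK_MATRIX[(situation, significance)]
```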
Red team implication: The classification level determines how security findings should be reported and how quickly they must be addressed. A vulnerability in a Class III diagnostic AI that directly drives treatment decisions is categorically more severe than the same vulnerability in a Class I wellness application.
Predetermined Change Control Plans
The Adaptive AI Challenge
Traditional medical device regulation assumes a static product: the device is evaluated, cleared, and then manufactured identically. AI/ML systems that learn and adapt over time break this model. A diagnostic AI that updates its parameters based on new data is, from a regulatory perspective, a different device after each update.
The FDA addressed this through the Predetermined Change Control Plan (PCCP) framework, which allows manufacturers to pre-specify the types of changes an AI system will make and the methodology for validating those changes without requiring new premarket submissions for each update.
PCCP Components
| Component | Description | Security Relevance |
|---|---|---|
| Description of Modifications | What types of changes the AI will make (performance improvement, new data sources, architecture changes) | Defines the boundary of authorized AI behavior — changes outside the PCCP may indicate compromise |
| Modification Protocol | The process for implementing changes including data requirements, training procedures, and validation methodology | Security controls in the modification pipeline are regulatory requirements, not optional |
| Performance Evaluation | How the manufacturer will verify that modifications maintain safety and effectiveness | Red team findings that demonstrate performance degradation map directly to PCCP compliance |
| Update Transparency | How users will be informed of changes | Unauthorized changes that are not disclosed violate both security and regulatory requirements |
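For automated compliance testing, the four PCCP components above can be captured in a machine-readable specification object. The sketch below is an assumption about how such a specification might be modeled; the field names are illustrative and not FDA-mandated.

```python
from dataclasses import dataclass

# Sketch: a machine-readable PCCP specification for automated
# compliance testing. Field names are illustrative, not FDA-mandated.

@dataclass
class PCCPSpecification:
    allows_architecture_changes: bool        # Description of Modifications
    permitted_data_sources: list             # Modification Protocol
    min_performance: float                   # Performance Evaluation threshold
    user_notification_required: bool = True  # Update Transparency

spec = PCCPSpecification(
    allows_architecture_changes=False,
    permitted_data_sources=["site_a_pacs"],
    min_performance=0.95,
)
```

An object like this can serve as the `pccp_specification` argument to the compliance-test framework later in this section.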
Security Implications of PCCPs
Red team testers should understand how PCCPs create both constraints and opportunities:
Testing within PCCP boundaries:
- Verify that the AI system only makes changes within the scope defined by its PCCP
- Test whether adversarial inputs can cause the system to make changes outside PCCP boundaries
- Assess whether the modification pipeline has adequate security controls against poisoning
- Verify that the performance evaluation process would detect adversarially induced degradation
PCCP as a detection mechanism:
- Changes to AI behavior that fall outside the PCCP scope may indicate compromise
- Monitoring systems designed for PCCP compliance can double as security monitoring
- Unauthorized modifications are both security incidents and regulatory violations
```python
# Framework for testing PCCP compliance under adversarial conditions
class PCCPComplianceTest:
    """
    Test whether an adaptive AI system maintains compliance
    with its Predetermined Change Control Plan under
    adversarial conditions.

    `pccp_specification` and `model_api` are engagement-specific
    objects; the attack helpers referenced below must be
    implemented per target.
    """

    def __init__(self, pccp_specification, model_api):
        self.pccp = pccp_specification
        self.model = model_api

    def test_boundary_modifications(self):
        """
        Attempt to induce the AI to make modifications
        that fall outside its PCCP-defined boundaries.
        """
        results = []

        # Test: can adversarial feedback cause architecture changes
        # outside PCCP scope?
        if self.pccp.allows_architecture_changes is False:
            result = self.attempt_architecture_modification()
            results.append({
                "test": "architecture_modification",
                "pccp_allows": False,
                "modification_occurred": result.modified,
                "violation": result.modified,
            })

        # Test: can poisoned data cause performance to drop
        # below PCCP-specified thresholds?
        result = self.test_performance_degradation()
        results.append({
            "test": "performance_degradation",
            "pccp_threshold": self.pccp.min_performance,
            "post_attack_performance": result.performance,
            "violation": (
                result.performance < self.pccp.min_performance
            ),
        })
        return results

    def test_modification_pipeline_security(self):
        """
        Assess the security of the data pipeline that
        feeds the AI's learning/update process.
        """
        return {
            "data_source_authentication": self.test_data_source_auth(),
            "data_integrity_validation": self.test_data_integrity(),
            "training_process_isolation": self.test_training_isolation(),
            "rollback_capability": self.test_rollback(),
        }
```

Real-World Performance Monitoring
Post-Market Surveillance for AI
The FDA requires post-market surveillance for medical devices, including AI systems. For SaMD, this includes monitoring real-world performance to ensure the device continues to perform as intended after deployment.
Post-market monitoring requirements relevant to security:
| Monitoring Area | FDA Requirement | Security Testing Angle |
|---|---|---|
| Performance drift | Detect changes in diagnostic accuracy over time | Could be caused by data drift or adversarial data poisoning |
| Adverse events | Report adverse events and malfunctions | AI security incidents may qualify as reportable adverse events |
| Complaints | Track and investigate user complaints | Complaints about incorrect AI outputs may indicate security compromise |
| Real-world effectiveness | Validate that cleared performance levels are maintained | Adversarial attacks that degrade performance below cleared thresholds are regulatory violations |
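Continuous rather than batch monitoring narrows the window in which adversarial degradation goes undetected. The sketch below shows one way such a rolling check could work; the window size, metric, and function names are illustrative assumptions, not an FDA-prescribed method.

```python
# Sketch: rolling-window performance monitoring, the kind of
# post-market check that can surface both natural data drift and
# adversarial degradation. Window size and threshold are illustrative.

def rolling_sensitivity(outcomes, window=100):
    """Yield sensitivity over a sliding window of (predicted, actual)
    pairs for a single target condition (1 = present, 0 = absent)."""
    for i in range(len(outcomes) - window + 1):
        chunk = outcomes[i:i + window]
        positives = [(p, a) for p, a in chunk if a == 1]
        if not positives:
            continue
        tp = sum(1 for p, a in positives if p == 1)
        yield tp / len(positives)

def detect_degradation(outcomes, cleared_sensitivity, window=100):
    """Flag any window whose sensitivity falls below the cleared level."""
    return any(s < cleared_sensitivity
               for s in rolling_sensitivity(outcomes, window))
```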
Adversarial Performance Degradation
A particularly insidious attack against FDA-cleared AI involves gradually degrading its performance below the thresholds established during premarket review. If an attacker can poison the data pipeline or exploit the learning process to reduce diagnostic accuracy, the device is no longer operating within its cleared parameters.
Testing for performance degradation attacks:
1. Establish baseline performance. Obtain the AI system's cleared performance specifications (sensitivity, specificity, AUC). These are typically published in the FDA clearance summary (510(k) or De Novo decision summary). The baseline is the minimum performance level the system must maintain.
2. Design poisoning strategy. Design a data poisoning strategy that targets the system's weakest performance dimension. If the system's cleared sensitivity for detecting pneumonia is 95%, design adversarial training examples that specifically degrade pneumonia detection sensitivity.
3. Measure performance impact. After introducing poisoned data, measure the AI's performance against its cleared specifications. Any performance below cleared thresholds constitutes a finding with regulatory implications — the device is no longer operating as authorized.
4. Assess monitoring detection. Determine whether the organization's post-market monitoring would detect the performance degradation. Many organizations rely on periodic batch evaluation rather than continuous monitoring, creating windows where degraded performance goes undetected.
FDA Reporting Obligations for AI Security Incidents
Medical Device Reports
Manufacturers must submit Medical Device Reports (MDRs) to the FDA when they become aware that their device may have caused or contributed to a death or serious injury, or has malfunctioned in a way that would be likely to cause or contribute to death or serious injury if the malfunction were to recur.
AI security incidents that meet these thresholds include:
| Incident Type | MDR Required? | Reasoning |
|---|---|---|
| Adversarial attack causes misdiagnosis leading to patient harm | Yes | Device contributed to serious injury |
| Data poisoning degrades diagnostic accuracy below cleared threshold | Likely | Malfunction likely to cause serious injury if recurrent |
| Model extraction enabling creation of unregulated diagnostic tool | Possibly | Depends on downstream harm from unregulated copy |
| PHI exposure through diagnostic AI | Generally no | Privacy incident rather than device malfunction (but HIPAA notification applies) |
| Prompt injection alters diagnostic output | Likely | Malfunction likely to cause serious injury if recurrent |
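The table above can serve as a first-pass triage aid when writing up findings. The sketch below encodes that mapping; the incident-type keys are illustrative, and this kind of helper supports, but never replaces, the regulatory affairs team's reporting determination.

```python
# Sketch: first-pass triage of red team findings against the MDR
# thresholds in the table above. Keys are illustrative; the actual
# reporting determination belongs to regulatory affairs.

MDR_TRIAGE = {
    "adversarial_misdiagnosis_with_harm": "yes",
    "poisoning_below_cleared_threshold": "likely",
    "model_extraction": "possibly",
    "phi_exposure": "generally_no",  # HIPAA notification still applies
    "prompt_injection_alters_output": "likely",
}

def flag_for_regulatory_review(incident_type: str) -> bool:
    """True if the finding should be escalated as potentially
    MDR-reportable (anything other than 'generally_no')."""
    return MDR_TRIAGE.get(incident_type, "unknown") != "generally_no"
```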
Corrective Actions
FDA-required corrective actions for AI security vulnerabilities may include:
- Field Safety Corrective Action — Notification to users about the vulnerability and interim mitigation measures
- Software update — Patching the vulnerability, which itself may require FDA review depending on the nature of the change and the PCCP scope
- Recall — In severe cases, temporary removal from the market until the vulnerability is addressed
- Labeling update — Updated warnings or instructions for use addressing the security risk
Red team reporting consideration: When a security finding has potential FDA reporting implications, include explicit language in the report identifying the finding as potentially MDR-reportable. This ensures the organization's regulatory affairs team can make the reporting determination with full information.
Impact on Red Team Engagement Scoping
FDA-Specific Scoping Considerations
| Scoping Element | FDA-Regulated Device | Non-Regulated AI |
|---|---|---|
| Testing environment | Must not affect production device performance or availability | Standard staging environment |
| Modification restrictions | Cannot modify cleared device outside PCCP during testing | Standard penetration testing scope |
| Finding classification | Must map to FDA risk framework and MDR thresholds | Standard vulnerability classification |
| Reporting audience | Security, regulatory affairs, quality management | Security team |
| Remediation constraints | Security patches may require FDA review | Standard patch management |
| Timeline | MDR reporting has a 30-day deadline from awareness | Standard disclosure timeline |
Navigating the Regulatory Relationship
Red team engagements against FDA-regulated AI systems operate at the intersection of security testing and regulatory compliance. Key principles:
- Coordinate with regulatory affairs before testing. The regulatory team needs to understand what you are testing and how findings might affect the device's regulatory status.
- Use FDA risk language in findings. Translate technical vulnerability descriptions into FDA's risk framework (probability of harm, severity of harm, detectability).
- Understand the 30-day MDR clock. Once the manufacturer becomes aware of a potentially MDR-reportable event, they have 30 calendar days to submit the report. Red team findings that trigger this clock create time-sensitive reporting obligations.
- Recognize that security fixes are device modifications. A software patch to address a security vulnerability in a cleared medical device is a device modification that may require FDA review depending on its nature and the device's PCCP.
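Because the 30-day MDR clock starts at awareness, delivering a red team report can itself start the clock. A minimal deadline helper, with illustrative function names, might look like:

```python
from datetime import date, timedelta

# Sketch: the 30-calendar-day MDR clock starts when the manufacturer
# becomes aware of a potentially reportable event; a red team report
# delivery date can start it. Helper names are illustrative.

MDR_WINDOW_DAYS = 30

def mdr_deadline(awareness_date: date) -> date:
    """Latest date an MDR may be submitted to FDA."""
    return awareness_date + timedelta(days=MDR_WINDOW_DAYS)

def days_remaining(awareness_date: date, today: date) -> int:
    """Calendar days left before the MDR deadline (negative if past)."""
    return (mdr_deadline(awareness_date) - today).days
```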
Related Topics
- Healthcare AI Security Overview -- foundational context for healthcare AI testing
- HIPAA & AI -- complementary regulatory framework for AI handling PHI
- Governance, Legal & Compliance -- broader regulatory compliance testing
- Clinical AI Attacks -- attack techniques relevant to SaMD vulnerabilities
References
- "Artificial Intelligence and Machine Learning (AI/ML)-Enabled Medical Devices" - U.S. Food and Drug Administration (2025) - Comprehensive listing of FDA-authorized AI/ML-enabled medical devices and regulatory pathway information
- "Marketing Submission Recommendations for a Predetermined Change Control Plan for AI/ML-Enabled Device Software Functions" - FDA Guidance Document (2024) - Detailed guidance on PCCP requirements for adaptive AI medical devices
- "Software as a Medical Device (SaMD): Clinical Evaluation" - IMDRF/SaMD Working Group (2017) - International framework for clinical evaluation of SaMD adopted by FDA
- "Postmarket Management of Cybersecurity in Medical Devices" - FDA Guidance Document (2023) - FDA guidance on managing cybersecurity risks in medical devices throughout their lifecycle