FDA AI/ML Regulation
FDA regulatory framework for AI and machine learning in medical devices including Software as a Medical Device classification, predetermined change control plans, real-world performance monitoring, and red team testing implications.
The FDA regulates AI and machine learning systems used in clinical settings through its Software as a Medical Device (SaMD) framework. This regulation directly affects red team engagements because many healthcare AI systems are FDA-cleared or FDA-approved medical devices, and security testing must operate within the constraints of that regulatory status. Findings must be reported in terms that map to FDA's risk framework, and certain modifications made to address security vulnerabilities may themselves require FDA authorization.
Software as a Medical Device (SaMD)
What Qualifies as SaMD
The FDA defines SaMD as software intended to be used for one or more medical purposes that perform those purposes without being part of a hardware medical device. AI systems that analyze medical data to provide diagnostic information, treatment recommendations, or clinical measurements may qualify as SaMD.
SaMD determination factors:
| Factor | SaMD Likely | SaMD Unlikely |
|---|---|---|
| Provides diagnosis or treatment recommendation | Yes — directly influences clinical decisions | No — provides general health information only |
| Analyzes patient-specific data | Yes — processes individual patient data for clinical use | No — provides population-level statistics |
| Claims clinical performance | Yes — marketed with clinical accuracy claims | No — explicitly disclaims clinical use |
| Integrated into clinical workflow | Yes — outputs feed into clinical decision-making | No — standalone wellness application |
| FDA-cleared predicate exists | Yes — similar cleared devices exist in the market | Varies — novel intended use may require de novo classification |
Risk Classification
The FDA classifies SaMD using a two-dimensional framework based on the significance of the information provided and the healthcare situation or condition:
| Healthcare Situation | Treat/Diagnose | Drive Clinical Management | Inform Clinical Management |
|---|---|---|---|
| Critical | Class III (PMA) | Class III (PMA/De Novo) | Class II (510(k)) |
| Serious | Class III (PMA) | Class II (510(k)) | Class II (510(k)) |
| Non-serious | Class II (510(k)) | Class II (510(k)) | Class I (Exempt) |

PMA = Premarket Approval (highest scrutiny)
510(k) = Premarket Notification (substantial equivalence)
De Novo = Novel device classification
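The classification matrix above can be encoded as a simple lookup for tagging red team findings with a device class and typical premarket pathway. This is a minimal sketch: the function name and key strings are illustrative assumptions, and actual classification is an FDA determination, not a computation.

```python
# Illustrative encoding of the SaMD risk matrix. Rows are the healthcare
# situation; columns are the significance of the information provided.
SAMD_MATRIX = {
    ("critical", "treat_diagnose"): ("Class III", "PMA"),
    ("critical", "drive_management"): ("Class III", "PMA/De Novo"),
    ("critical", "inform_management"): ("Class II", "510(k)"),
    ("serious", "treat_diagnose"): ("Class III", "PMA"),
    ("serious", "drive_management"): ("Class II", "510(k)"),
    ("serious", "inform_management"): ("Class II", "510(k)"),
    ("non_serious", "treat_diagnose"): ("Class II", "510(k)"),
    ("non_serious", "drive_management"): ("Class II", "510(k)"),
    ("non_serious", "inform_management"): ("Class I", "Exempt"),
}

def classify_samd(situation: str, significance: str) -> tuple[str, str]:
    """Return (device class, typical premarket pathway) from the matrix."""
    return SAMD_MATRIX[(situation, significance)]
```

A finding against a `("critical", "treat_diagnose")` device carries the heaviest severity weighting in reporting.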
Red team implication: The classification level determines how security findings should be reported and how quickly they must be addressed. A vulnerability in a Class III diagnostic AI that directly drives treatment decisions is categorically more severe than the same vulnerability in a Class I wellness application.
Predetermined Change Control Plans
The Adaptive AI Challenge
Traditional medical device regulation assumes a static product: the device is evaluated, cleared, and then manufactured identically. AI/ML systems that learn and adapt over time break this model. A diagnostic AI that updates its parameters based on new data is, from a regulatory perspective, a different device after each update.
The FDA addressed this through the Predetermined Change Control Plan (PCCP) framework, which allows manufacturers to pre-specify the types of changes an AI system will make and the methodology for validating those changes, without requiring a new premarket submission for each update.
PCCP Components
| Component | Description | Security Relevance |
|---|---|---|
| Description of Modifications | What types of changes the AI will make (performance improvement, new data sources, architecture changes) | Defines the boundary of authorized AI behavior — changes outside the PCCP may indicate compromise |
| Modification Protocol | The process for implementing changes, including data requirements, training procedures, and validation methodology | Security controls in the modification pipeline are regulatory requirements, not optional |
| Performance Evaluation | How the manufacturer will verify that modifications maintain safety and effectiveness | Red team findings that demonstrate performance degradation map directly to PCCP compliance |
| Update Transparency | How users will be informed of changes | Unauthorized changes that are not disclosed violate both security and regulatory requirements |
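For automated testing, the PCCP components above might be represented as a structured specification object. A minimal sketch, with illustrative field names (a real PCCP is a regulatory document, and these attributes are assumptions, not FDA terminology):

```python
from dataclasses import dataclass

@dataclass
class PCCPSpecification:
    """Hypothetical machine-readable view of a PCCP for test automation."""
    allows_architecture_changes: bool        # Description of Modifications
    allowed_data_sources: list[str]          # Modification Protocol
    min_performance: float                   # Performance Evaluation threshold
    user_notification_required: bool = True  # Update Transparency

    def modification_in_scope(self, change_type: str) -> bool:
        """True if a proposed change type falls within the PCCP boundary."""
        if change_type == "architecture":
            return self.allows_architecture_changes
        # Illustrative default: routine retraining and threshold tuning
        # are assumed to be pre-authorized change types in this sketch.
        return change_type in ("retraining", "threshold_tuning")
```

A compliance test harness can then treat any modification for which `modification_in_scope` returns `False` as a potential violation.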
Security Implications of PCCPs
Red team testers should understand how PCCPs create both constraints and opportunities:
Testing within PCCP boundaries:
- Verify that the AI system only makes changes within the scope defined by its PCCP
- Test whether adversarial inputs can cause the system to make changes outside PCCP boundaries
- Assess whether the modification pipeline has adequate security controls against poisoning
- Verify that the performance evaluation process would detect adversarially induced degradation
PCCP as a detection mechanism:
- Changes to AI behavior that fall outside the PCCP scope may indicate compromise
- Monitoring systems designed for PCCP compliance can double as security monitoring
- Unauthorized modifications are both security incidents and regulatory violations
```python
# Framework for testing PCCP compliance under adversarial conditions
class PCCPComplianceTest:
    """
    Tests whether an adaptive AI system maintains compliance
    with its Predetermined Change Control Plan under
    adversarial conditions.
    """

    def __init__(self, pccp_specification, model_api):
        self.pccp = pccp_specification
        self.model = model_api

    def test_boundary_modifications(self):
        """
        Attempt to induce the AI to make modifications
        that fall outside its PCCP-defined boundaries.
        """
        results = []

        # Test: can adversarial feedback cause architecture changes
        # outside PCCP scope?
        if self.pccp.allows_architecture_changes is False:
            result = self.attempt_architecture_modification()
            results.append({
                "test": "architecture_modification",
                "pccp_allows": False,
                "modification_occurred": result.modified,
                "violation": result.modified,
            })

        # Test: can poisoned data cause performance to drop
        # below PCCP-specified thresholds?
        result = self.test_performance_degradation()
        results.append({
            "test": "performance_degradation",
            "pccp_threshold": self.pccp.min_performance,
            "post_attack_performance": result.performance,
            "violation": result.performance < self.pccp.min_performance,
        })

        return results

    def test_modification_pipeline_security(self):
        """
        Assess the security of the data pipeline that
        feeds the AI's learning/update process.
        """
        pipeline_tests = {
            "data_source_authentication": self.test_data_source_auth(),
            "data_integrity_validation": self.test_data_integrity(),
            "training_process_isolation": self.test_training_isolation(),
            "rollback_capability": self.test_rollback(),
        }
        return pipeline_tests
```

Real-World Performance Monitoring
Post-Market Surveillance for AI
The FDA requires post-market surveillance for medical devices, including AI systems. For SaMD, this includes monitoring real-world performance to ensure the device continues to perform as intended after deployment.
Post-market monitoring requirements relevant to security:
| Monitoring Area | FDA Requirement | Security Testing Angle |
|---|---|---|
| Performance drift | Detect changes in diagnostic accuracy over time | Could be caused by data drift or adversarial data poisoning |
| Adverse events | Report adverse events and malfunctions | AI security incidents may qualify as reportable adverse events |
| Complaints | Track and investigate user complaints | Complaints about incorrect AI outputs may indicate security compromise |
| Real-world effectiveness | Validate that cleared performance levels are maintained | Adversarial attacks that degrade performance below cleared thresholds are regulatory violations |
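Continuous monitoring for the performance drift row above can be sketched as a rolling window of case-level outcomes checked against the cleared sensitivity. The `SensitivityMonitor` class, window size, and thresholds are illustrative assumptions, not an FDA-prescribed mechanism:

```python
from collections import deque

class SensitivityMonitor:
    """Rolling-window check of real-world sensitivity vs. the cleared level."""

    def __init__(self, cleared_sensitivity: float, window: int = 500):
        self.cleared = cleared_sensitivity
        # Each entry: 1 = positive case detected, 0 = positive case missed
        self.outcomes = deque(maxlen=window)

    def record(self, positive_case_detected: bool) -> None:
        self.outcomes.append(1 if positive_case_detected else 0)

    def current_sensitivity(self) -> float:
        return sum(self.outcomes) / len(self.outcomes)

    def below_cleared_threshold(self) -> bool:
        """Flag when rolling sensitivity drops below the cleared level."""
        return self.current_sensitivity() < self.cleared
```

A monitor like this serves double duty: it satisfies the real-world effectiveness requirement and detects adversarially induced degradation between batch evaluations.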
Adversarial Performance Degradation
A particularly insidious attack against FDA-cleared AI involves gradually degrading its performance below the thresholds established during premarket review. If attackers can poison the data pipeline or exploit the learning process to reduce diagnostic accuracy, the device is no longer operating within its cleared parameters.
Testing for performance degradation attacks:
Establish Baseline Performance
Obtain the AI system's cleared performance specifications (sensitivity, specificity, AUC). These are typically published in the FDA clearance summary (510(k) or De Novo decision summary). The baseline is the minimum performance level the system must maintain.
Design Poisoning Strategy
Design a data poisoning strategy that targets the system's weakest performance dimension. If the system's cleared sensitivity for detecting pneumonia is 95%, design adversarial training examples that specifically degrade pneumonia detection sensitivity.
Measure Performance Impact
After introducing poisoned data, measure the AI's performance against its cleared specifications. Any performance below cleared thresholds constitutes a finding with regulatory implications — the device is no longer operating as authorized.
Evaluate Monitoring Detection
Determine whether the organization's post-market monitoring would detect the performance degradation. Many organizations rely on periodic batch evaluation rather than continuous monitoring, creating windows where degraded performance goes undetected.
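The measurement step above reduces to comparing post-attack metrics against cleared specifications. A minimal sketch, assuming illustrative metric names and threshold values (real values come from the device's 510(k) or De Novo decision summary):

```python
# Illustrative cleared specifications; real values come from the FDA
# clearance summary for the specific device under test.
CLEARED_SPECS = {"sensitivity": 0.95, "specificity": 0.90, "auc": 0.93}

def evaluate_degradation(post_attack: dict[str, float]) -> dict:
    """Compare post-poisoning metrics against cleared specifications."""
    violations = {
        metric: {"cleared": cleared, "measured": post_attack[metric]}
        for metric, cleared in CLEARED_SPECS.items()
        if post_attack[metric] < cleared
    }
    return {
        # Any violation means the device is outside its cleared parameters
        "within_cleared_parameters": not violations,
        "violations": violations,
    }
```

Each entry in `violations` is a finding with regulatory implications, not merely a security observation.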
FDA Reporting Obligations for AI Security Incidents
Medical Device Reports
Manufacturers must submit Medical Device Reports (MDRs) to the FDA when they become aware that their device may have caused or contributed to a death or serious injury, or has malfunctioned in a way that would be likely to cause or contribute to death or serious injury if the malfunction were to recur.
AI security incidents that meet these thresholds include:
| Incident Type | MDR Required? | Reasoning |
|---|---|---|
| Adversarial attack causes misdiagnosis leading to patient harm | Yes | Device contributed to serious injury |
| Data poisoning degrades diagnostic accuracy below cleared threshold | Likely | Malfunction likely to cause serious injury if recurrent |
| Model extraction enabling creation of unregulated diagnostic tool | Possibly | Depends on downstream harm from unregulated copy |
| PHI exposure through diagnostic AI | Generally no | Privacy incident rather than device malfunction (but HIPAA notification applies) |
| Prompt injection alters diagnostic output | Likely | Malfunction likely to cause serious injury if recurrent |
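The triage table above can be encoded as a reporting aid that flags findings for regulatory affairs review. The incident-type keys and the `flag_for_mdr_review` helper are hypothetical; the actual MDR determination rests with the manufacturer's regulatory team:

```python
# Illustrative encoding of the MDR triage table; keys are assumptions.
MDR_TRIAGE = {
    "adversarial_misdiagnosis_with_harm": "yes",
    "poisoning_below_cleared_threshold": "likely",
    "model_extraction": "possibly",
    "phi_exposure": "generally_no",
    "prompt_injection_altering_output": "likely",
}

def flag_for_mdr_review(incident_type: str) -> bool:
    """Flag any incident not clearly outside MDR scope for regulatory review."""
    return MDR_TRIAGE.get(incident_type, "unknown") != "generally_no"
```

Note the conservative default: an incident type not in the table is flagged for review rather than silently dropped.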
Corrective Actions
FDA-required corrective actions for AI security vulnerabilities may include:
- Field Safety Corrective Action — notification to users about the vulnerability and interim mitigation measures
- Software update — patching the vulnerability, which itself may require FDA review depending on the nature of the change and the PCCP scope
- Recall — in severe cases, temporary removal from the market until the vulnerability is addressed
- Labeling update — updated warnings or instructions for use addressing the security risk
Red team reporting consideration: When a security finding has potential FDA reporting implications, include explicit language in the report identifying the finding as potentially MDR-reportable. This ensures the organization's regulatory affairs team can make the reporting determination with full information.
Impact on Red Team Engagement Scoping
FDA-Specific Scoping Considerations
| Scoping Element | FDA-Regulated Device | Non-Regulated AI |
|---|---|---|
| Testing environment | Must not affect production device performance or availability | Standard staging environment |
| Modification restrictions | Cannot modify cleared device outside PCCP during testing | Standard penetration testing scope |
| Finding classification | Must map to FDA risk framework and MDR thresholds | Standard vulnerability classification |
| Reporting audience | Security, regulatory affairs, quality management | Security team |
| Remediation constraints | Security patches may require FDA review | Standard patch management |
| Timeline | MDR reporting has 30-day deadline from awareness | Standard disclosure timeline |
Navigating the Regulatory Relationship
Red team engagements against FDA-regulated AI systems operate at the intersection of security testing and regulatory compliance. Key principles:
- Coordinate with regulatory affairs before testing. The regulatory team needs to understand what you are testing and how findings might affect the device's regulatory status.
- Use FDA risk language in findings. Translate technical vulnerability descriptions into FDA's risk framework (probability of harm, severity of harm, detectability).
- Understand the 30-day MDR clock. Once the manufacturer becomes aware of a potentially MDR-reportable event, they have 30 calendar days to submit the report. Red team findings that trigger this clock create time-sensitive reporting obligations.
- Recognize that security fixes are device modifications. A software patch to address a security vulnerability in a cleared medical device is a device modification that may require FDA review depending on its nature and the device's PCCP.
Related Topics
- Healthcare AI Security Overview -- foundational context for healthcare AI testing
- HIPAA & AI -- complementary regulatory framework for AI handling PHI
- Governance, Legal & Compliance -- broader regulatory compliance testing
- Clinical AI Attacks -- attack techniques relevant to SaMD vulnerabilities
References
- "Artificial Intelligence and Machine Learning (AI/ML)-Enabled Medical Devices" - U.S. Food and Drug Administration (2025) - Comprehensive listing of FDA-authorized AI/ML-enabled medical devices and regulatory pathway information
- "Marketing Submission Recommendations for a Predetermined Change Control Plan for AI/ML-Enabled Device Software Functions" - FDA Guidance Document (2024) - Detailed guidance on PCCP requirements for adaptive AI medical devices
- "Software as a Medical Device (SaMD): Clinical Evaluation" - IMDRF/SaMD Working Group (2017) - International framework for clinical evaluation of SaMD adopted by FDA
- "Postmarket Management of Cybersecurity in Medical Devices" - FDA Guidance Document (2023) - FDA guidance on managing cybersecurity risks in medical devices throughout their lifecycle