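AI Supply Chain Incident Response

Advanced | 4 min read | Updated 2026-03-20

Incident response procedures for AI supply chain compromises, covering model repositories, training pipelines, and dependency chains.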

ai-forensics-ir supply-chain incident-response model-security

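Overview

The AI supply chain covers every component that contributes to a deployed AI system: pretrained model weights, fine-tuning datasets, training frameworks (PyTorch, TensorFlow, JAX), inference servers (vLLM, Triton, TGI), model file formats (safetensors, GGUF, ONNX), tokenizers, embedding models, vector databases, guardrail systems, and the entire software dependency tree underneath them. A compromise at any point in this chain can propagate to every downstream system that depends on it.

AI supply chain attacks are especially dangerous because of the trust relationships involved. When an organization downloads a model from the Hugging Face Hub, it trusts the model author, the platform's integrity checks, the serialization format, and every dependency loaded at inference time. Traditional supply chain attacks inject executable code; AI supply chain attacks can additionally inject subtle behavioral modifications (backdoors, biases, or degraded safety alignment) that are invisible to standard code review.

This article provides incident response procedures designed specifically for AI supply chain compromises, covering detection, containment, investigation, and recovery. The procedures align with the NIST Secure Software Development Framework (SSDF) and the MITRE ATLAS supply chain techniques.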

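The AI Supply Chain Attack Surface

Model Repositories

Public model repositories (Hugging Face Hub, PyTorch Hub, TensorFlow Hub, ONNX Model Zoo) are the most visible attack surface. Threats include: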

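Attack vector	Description	Historical example
Malicious model upload	An attacker uploads a trojaned model under a legitimate-looking name	JFrog Security research (2024) demonstrated pickle-based code execution in models hosted on Hugging Face
Typosquatting	A model is named like a popular one to catch mistyped downloads	Common on PyPI and npm; emerging in model registries
Account compromise	A legitimate model author's account is taken over	Hugging Face implemented organization-level access tokens after community reports
Serialization exploit	Model files contain executable code (pickle, PyTorch JIT)	Research by ESET and Trail of Bits on pickle deserialization attacks
Metadata tampering	Safety properties claimed in the model card do not match the actual model	No standard mechanism exists to verify model card claims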

Training Frameworks and Dependencies

The Python ML ecosystem has deep dependency trees. A typical LLM serving stack might include:

# Example dependency tree for a vLLM deployment
pip install vllm
# This transitively installs:
# - torch (PyTorch) - core ML framework
# - transformers (Hugging Face) - model loading and tokenization
# - safetensors - safe model serialization
# - sentencepiece / tokenizers - tokenization
# - numpy, scipy - numerical computation
# - triton - GPU kernel compilation
# - uvicorn, fastapi - HTTP serving
pip install pipdeptree
pipdeptree -p vllm

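A compromise of any package in this tree, whether through a malicious update, a dependency confusion attack, or a maintainer account takeover, can affect every AI system that depends on it.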
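During an incident it can help to enumerate that tree programmatically rather than by eye. The following is a minimal sketch, assuming the packages are installed in the current environment; it uses the standard library's importlib.metadata, and the function name transitive_dependencies is hypothetical. It collects declared dependencies only, skips optional extras, and does not evaluate other environment markers.

```python
import re
from importlib import metadata


def transitive_dependencies(dist_name: str) -> set:
    """Collect normalized names of every dependency declared below dist_name."""
    seen = set()
    stack = [dist_name]
    while stack:
        name = stack.pop()
        try:
            requirements = metadata.requires(name) or []
        except metadata.PackageNotFoundError:
            continue  # declared somewhere but not installed here
        for requirement in requirements:
            if re.search(r"extra\s*==", requirement):
                continue  # optional extra, not part of the default install
            match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement)
            if match is None:
                continue
            # Normalize the name the way package indexes do (PEP 503 style).
            dep = re.sub(r"[-_.]+", "-", match.group(0)).lower()
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen
```

Calling transitive_dependencies("vllm") on a serving host would give the set of package names to compare against advisories for a compromised release.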

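Training Data Sources

Third-party training data is a supply chain component. Compromises include:

A data vendor breach that leads to a poisoned dataset
A web-scraping pipeline that ingests attacker-controlled content
Annotators at a labeling service who are compromised or bribed
Synthetic data generated by a compromised model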

Indicators of Compromise

Model-Level IoCs

import hashlib, json
from pathlib import Path
from dataclasses import dataclass
 
@dataclass
class SupplyChainIoC:
    ioc_type: str
    severity: str  # "low", "medium", "high", "critical"
    description: str
    evidence: dict
    mitre_atlas_id: str | None = None

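check_model_file_safety(model_path) performs three checks:

Dangerous serialization format: if the file extension is .pkl, .pickle, or .bin (commonly a pickled PyTorch checkpoint), the file is flagged as "dangerous_serialization" (high), because these formats allow arbitrary code execution during deserialization; safetensors is the recommended alternative. Maps to MITRE ATLAS AML.T0010.
Malicious pattern scan of pickle files: reads the first 10 MB of the file and searches for suspicious strings (os.system, subprocess, eval, exec, __import__, builtins, commands, /bin/sh, /bin/bash, cmd.exe, requests.get, urllib). Any match is flagged as "malicious_payload" (critical).
safetensors integrity validation: tries to open the file with safe_open and enumerate its keys. A failure is flagged as "corrupted_model_file" (medium).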
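The three checks above can be sketched as follows. This is an illustration under stated assumptions, not the original implementation: the dataclass is repeated so the block runs on its own, and the safetensors check parses the file header by hand (an 8-byte little-endian length followed by a JSON header) as a dependency-free stand-in for safe_open.

```python
import json
import struct
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional

DANGEROUS_SUFFIXES = {".pkl", ".pickle", ".bin"}
SUSPICIOUS_PATTERNS = [
    b"os.system", b"subprocess", b"eval", b"exec", b"__import__", b"builtins",
    b"commands", b"/bin/sh", b"/bin/bash", b"cmd.exe", b"requests.get", b"urllib",
]
SCAN_LIMIT = 10 * 1024 * 1024  # only the first 10 MB is scanned


@dataclass
class SupplyChainIoC:
    ioc_type: str
    severity: str
    description: str
    evidence: dict = field(default_factory=dict)
    mitre_atlas_id: Optional[str] = None


def _read_safetensors_header(path: Path) -> dict:
    """Stand-in for safe_open: parse the length-prefixed JSON header."""
    size = path.stat().st_size
    with path.open("rb") as fh:
        (header_len,) = struct.unpack("<Q", fh.read(8))
        if header_len > size - 8:
            raise ValueError("header length exceeds file size")
        header = json.loads(fh.read(header_len))
    if not isinstance(header, dict):
        raise ValueError("header is not a JSON object")
    return header


def check_model_file_safety(model_path: str) -> list:
    path = Path(model_path)
    suffix = path.suffix.lower()
    evidence = {"path": str(path)}
    iocs = []
    if suffix in DANGEROUS_SUFFIXES:
        iocs.append(SupplyChainIoC(
            "dangerous_serialization", "high",
            f"{suffix} files can run code when deserialized; prefer safetensors",
            evidence, "AML.T0010"))
        with path.open("rb") as fh:
            head = fh.read(SCAN_LIMIT)
        hits = [p.decode() for p in SUSPICIOUS_PATTERNS if p in head]
        if hits:
            iocs.append(SupplyChainIoC(
                "malicious_payload", "critical",
                "suspicious strings found in serialized data",
                {**evidence, "patterns": hits}))
    elif suffix == ".safetensors":
        try:
            _read_safetensors_header(path)
        except (OSError, ValueError, struct.error) as exc:
            iocs.append(SupplyChainIoC(
                "corrupted_model_file", "medium",
                "safetensors header could not be parsed",
                {**evidence, "error": str(exc)}))
    return iocs
```

The substring scan is a heuristic. Pickle stores a global's module and attribute name as separate strings, so a payload that calls os.system need not contain the literal text "os.system"; an opcode-level analyzer such as Fickling is better suited to confirming a finding.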

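check_dependency_integrity(requirements_file) checks Python dependencies:

Unpinned dependency: a line containing neither == nor >= is flagged as "unpinned_dependency" (medium). Note that >= only sets a lower bound and still permits newer releases, so a stricter policy would accept only == pins.
Possible typosquatting: a package name that is within Levenshtein distance 2 of a known package (torch, pytorch, tensorflow, transformers, numpy, scipy, pandas, safetensors), but is not itself one of those names, is flagged as "possible_typosquat" (high). Maps to AML.T0010.

_levenshtein_distance(s1, s2) is a standard edit-distance implementation used for typosquat detection.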

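Pipeline-Level IoCs

check_training_pipeline_integrity(pipeline_config, expected_config_hash) checks the training pipeline configuration:

Insecure data source: a URL that starts with http:// is flagged as "insecure_data_source" (high).
Unverified data source: a source whose source.verified is not True is flagged as "unverified_data_source" (medium).
Config tampering: pipeline_config is serialized with sorted keys and hashed with SHA-256; a mismatch with expected_config_hash is flagged as "config_tampering" (critical).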
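These checks might look like the sketch below. The configuration layout (a data_sources list whose entries carry url and verified keys) is an assumption made for illustration, as is the tuple shape of each finding.

```python
import hashlib
import json


def config_hash(pipeline_config: dict) -> str:
    """SHA-256 over the config serialized with sorted keys, so key order is irrelevant."""
    canonical = json.dumps(pipeline_config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_training_pipeline_integrity(pipeline_config: dict,
                                      expected_config_hash: str) -> list:
    findings = []
    # Assumed layout: {"data_sources": [{"url": ..., "verified": ...}, ...]}
    for source in pipeline_config.get("data_sources", []):
        url = source.get("url", "")
        if url.startswith("http://"):
            findings.append(("insecure_data_source", "high", url))
        if source.get("verified") is not True:
            findings.append(("unverified_data_source", "medium", url))
    actual = config_hash(pipeline_config)
    if actual != expected_config_hash:
        findings.append(("config_tampering", "critical", actual))
    return findings
```

The reference hash only helps if it is stored somewhere the pipeline itself cannot write to, for example a separate signing or release system.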

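Incident Response Procedure

Phase 1: Detection and Triage

When a supply chain compromise is suspected:

Alert correlation: determine whether the alert is isolated or part of a broader pattern. Check whether other organizations using the same model, package, or data have reported problems.
Scope assessment: identify every system that depends on the compromised component.
Severity scoring: use the AI incident severity scoring framework. Supply chain compromises typically rate HIGH or CRITICAL on the blast-radius dimension.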
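Scope assessment amounts to walking the dependency graph backwards from the compromised component. The sketch below assumes an asset inventory is available as a mapping from each system to the components it uses directly; that inventory format and the function name affected_systems are hypothetical.

```python
from collections import deque


def affected_systems(depends_on: dict, compromised: str) -> set:
    """Return every system that depends, directly or transitively, on `compromised`."""
    # Invert the edges: component -> systems that use it directly.
    dependents = {}
    for system, components in depends_on.items():
        for component in components:
            dependents.setdefault(component, set()).add(system)

    affected = set()
    queue = deque([compromised])
    while queue:
        node = queue.popleft()
        for parent in dependents.get(node, ()):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected
```

For example, if a chatbot calls a gateway that serves the compromised model, both the gateway and the chatbot are returned, while unrelated services are not.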

Phase 2: Containment

# Immediate containment steps for a suspected model compromise

# 1. Quarantine the suspected model (preserve as evidence, do not delete)
# The directory name is computed once so mkdir and mv cannot disagree across midnight
QUARANTINE_DIR="/evidence/quarantine/$(date +%Y%m%d)"
mkdir -p "$QUARANTINE_DIR"
mv /models/production/compromised-model/ "$QUARANTINE_DIR"/

# 2. Roll back to last known-good model version
sha256sum /models/archive/model-v1.2.3/model.safetensors
# Compare against stored reference hash before redeploying

# 3. Block the compromised source
echo "compromised-org/malicious-model" >> /etc/ai-security/model-blocklist.txt

# 4. Revoke any API keys or tokens that may have been exposed

# 5. Capture forensic evidence before any cleanup
tar -czf "/evidence/model-artifacts-$(date +%s).tar.gz" \
    /var/log/model-server/ \
    /tmp/model-cache/ \
    /etc/model-config/

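The steps above, in order: (1) move the suspected model into a dated directory under /evidence/quarantine, preserving it as evidence rather than deleting it; (2) roll back to the last known-good model version, first verifying the rollback target's integrity with sha256sum; (3) add the compromised source to the model registry blocklist; (4) revoke any API keys or tokens that may have been exposed; (5) before any cleanup, archive the model server logs, cache, and configuration files with tar as forensic evidence.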
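A blocklist only contains an incident if model loaders consult it. The sketch below shows one way a loader could enforce the file from step 3, assuming one repository ID per line; the function names are hypothetical, and matching is case-insensitive by assumption. A missing blocklist file raises an error rather than silently allowing every model.

```python
from pathlib import Path


def load_blocklist(path: str) -> set:
    """Read one repository ID per line; blank lines and # comments are ignored."""
    entries = set()
    for line in Path(path).read_text().splitlines():
        entry = line.split("#", 1)[0].strip().lower()
        if entry:
            entries.add(entry)
    return entries


def assert_not_blocked(repo_id: str, blocklist_path: str) -> None:
    """Raise PermissionError if repo_id is listed in the blocklist file."""
    if repo_id.strip().lower() in load_blocklist(blocklist_path):
        raise PermissionError(f"{repo_id} is on the model blocklist")
```

A serving wrapper would call assert_not_blocked(repo_id, "/etc/ai-security/model-blocklist.txt") before downloading or loading any model.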

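Phase 3: Investigation

The investigation phase determines the full scope and impact of the compromise:

Component analysis: examine the compromised component for malicious modifications (model backdoors, malicious code in packages).
Propagation tracing: determine which downstream systems ingested the compromised component, and when.
Impact assessment: assess what actions the compromised component could have performed (data access, model modification, lateral movement).
Attribution: attempt to identify the threat actor and their objectives.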
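Propagation tracing is largely a log query. As a rough sketch, assume download or deployment logs have been normalized into records with system, artifact, and pulled_at (ISO 8601, all in one timezone) fields; those field names and the function trace_propagation are hypothetical.

```python
from datetime import datetime


def trace_propagation(pull_records: list, artifact: str,
                      compromised_since: datetime) -> list:
    """Return (system, first_pull_time) pairs for pulls at or after the cutoff."""
    first_pull = {}
    for record in pull_records:
        if record["artifact"] != artifact:
            continue
        pulled_at = datetime.fromisoformat(record["pulled_at"])
        if pulled_at < compromised_since:
            continue  # pulled before the compromise window opened
        known = first_pull.get(record["system"])
        if known is None or pulled_at < known:
            first_pull[record["system"]] = pulled_at
    # Earliest exposure first, which gives the order for deeper investigation.
    return sorted(first_pull.items(), key=lambda item: item[1])
```

Systems that only pulled the artifact before the cutoff are excluded, which is why establishing the earliest plausible compromise time matters; when in doubt, choose an earlier cutoff.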

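Phase 4: Eradication and Recovery

Remove compromised components: replace every instance of a compromised model, package, or dataset with a verified clean version.
Rebuild affected models: if training data was compromised, retrain the model from clean data.
Patch the entry point: fix the weakness that allowed the compromise (for example, missing signature verification or unsafe deserialization).
Validate recovery: run a full behavioral evaluation on the restored system to confirm a clean state.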
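One small piece of recovery validation is a regression check against outputs recorded from a known-good deployment. The sketch below assumes deterministic decoding (for example temperature 0) so that exact comparison is meaningful; generate stands in for whatever callable wraps the restored model, and the golden prompt set is hypothetical. It is a smoke test, not a substitute for a full behavioral evaluation.

```python
from typing import Callable


def validate_recovery(generate: Callable[[str], str], golden: dict) -> dict:
    """Compare the restored model's outputs with recorded known-good outputs."""
    mismatches = {}
    for prompt, expected in golden.items():
        actual = generate(prompt)
        if actual.strip() != expected.strip():
            mismatches[prompt] = {"expected": expected, "actual": actual}
    return {
        "status": "PASS" if not mismatches else "FAIL",
        "checked": len(golden),
        "mismatches": mismatches,
    }
```

Including prompts that contain the suspected backdoor trigger in the golden set makes the check more informative, since a clean model should respond to them the same way the known-good model did.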

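Phase 5: Post-Incident

Lessons learned: document the incident and update supply chain security policy.
Share intelligence: publish IoCs through appropriate channels, without disclosing organization-sensitive details.
Update detection: add detection rules for the specific attack patterns observed.
Review supply chain controls: evaluate whether additional controls are needed.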

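Prevention and Hardening

Model Provenance Verification

verify_model_provenance(model_path, expected_hashes, expected_source) verifies a model's provenance before deployment:

Format safety check: passes if the file extension is in the safe set (.safetensors, .onnx, .gguf); otherwise warns that the format may support code execution.
Hash verification: for each file name in expected_hashes, reads the bytes from the directory or file, computes SHA-256, and compares against the expected value; any mismatch sets the overall status to FAIL.
Structured result: returns each check's name, status, and details, along with the overall PASS/FAIL status.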
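A sketch of such a function follows, with two stated deviations from the description above: the format warning is limited to an assumed list of code-capable suffixes so that auxiliary files such as config.json are not flagged, and expected_source is only recorded in the result because the description does not say how it is checked. Files are hashed in chunks so large weight files do not have to fit in memory.

```python
import hashlib
from pathlib import Path

SAFE_SUFFIXES = {".safetensors", ".onnx", ".gguf"}
# Assumption: suffixes treated as potentially code-executing for the warning.
CODE_CAPABLE_SUFFIXES = {".pkl", ".pickle", ".bin", ".pt", ".pth"}


def _sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_provenance(model_path: str, expected_hashes: dict,
                            expected_source: str) -> dict:
    root = Path(model_path)
    base = root if root.is_dir() else root.parent
    checks = []
    status = "PASS"
    for name, expected in expected_hashes.items():
        suffix = Path(name).suffix.lower()
        if suffix in SAFE_SUFFIXES:
            checks.append({"name": f"format:{name}", "status": "PASS",
                           "details": suffix})
        elif suffix in CODE_CAPABLE_SUFFIXES:
            checks.append({"name": f"format:{name}", "status": "WARN",
                           "details": f"{suffix} may support code execution"})
        target = base / name
        if not target.is_file():
            checks.append({"name": f"hash:{name}", "status": "FAIL",
                           "details": "file missing"})
            status = "FAIL"
            continue
        actual = _sha256_file(target)
        matches = actual == expected.lower()
        checks.append({"name": f"hash:{name}",
                       "status": "PASS" if matches else "FAIL",
                       "details": actual})
        if not matches:
            status = "FAIL"
    return {"source": expected_source, "status": status, "checks": checks}
```

A deployment gate would refuse to promote the model unless the returned status is PASS, and would log the individual checks for later audit.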

Dependency Pinning and Verification

# Generate locked dependencies with hashes
pip install pip-tools
pip-compile --generate-hashes requirements.in -o requirements.txt
 
# Install while verifying every downloaded package against its expected hash
pip install --require-hashes -r requirements.txt
 
# Use sigstore for Python package verification (where available)
pip install sigstore
python -m sigstore verify identity \
    --cert-identity publisher@example.com \
    --cert-oidc-issuer https://accounts.google.com \
    package-1.0.0.tar.gz

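In summary: pip-tools generates a locked dependency list with hashes; installing with --require-hashes verifies the integrity of every package; and for packages that publish Sigstore signatures, sigstore verifies the signer identity and the OIDC issuer that vouched for it.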

MITRE ATLAS Mapping

Supply chain attack	ATLAS technique	ATLAS ID
Compromised model repository	ML Supply Chain Compromise	AML.T0010
Poisoned pretraining data	Poison Training Data	AML.T0020
Compromised ML framework	ML Supply Chain Compromise > ML Software	AML.T0010.001
Malicious model serialization	ML Supply Chain Compromise > Model	AML.T0010.003
Backdoored fine-tuning data	Poison Training Data	AML.T0020

References

JFrog Security Research. (2024). Malicious ML Models on Hugging Face. JFrog Blog. https://jfrog.com/blog/data-scientists-targeted-by-malicious-hugging-face-ml-models-with-silent-backdoor/
MITRE ATLAS. (2024). Adversarial Threat Landscape for Artificial Intelligence Systems. https://atlas.mitre.org/
NIST. (2022). Secure Software Development Framework (SSDF) Version 1.1. NIST SP 800-218. https://doi.org/10.6028/NIST.SP.800-218
Trail of Bits. (2024). Fickling: A Python Pickling Decompiler and Static Analyzer. https://github.com/trailofbits/fickling
