AI Supply Chain Incident Response
Incident response procedures for compromises in the AI supply chain, including model repositories, training pipelines, and dependency chains.
Overview
The AI supply chain encompasses every component that contributes to a deployed AI system: pre-trained model weights, fine-tuning datasets, training frameworks (PyTorch, TensorFlow, JAX), inference servers (vLLM, Triton, TGI), model file formats (safetensors, GGUF, ONNX), tokenizers, embedding models, vector databases, guardrail systems, and the entire software dependency tree beneath them. A compromise at any point in this chain can propagate to every downstream system that depends on it.
AI supply chain attacks are particularly dangerous because of the trust relationships involved. When an organization downloads a model from Hugging Face Hub, they are trusting the model author, the platform's integrity checks, the serialization format, and every dependency loaded during inference. Unlike traditional software where supply chain attacks inject executable code, AI supply chain attacks can also inject subtle behavioral modifications -- backdoors, biases, or safety alignment degradation -- that are invisible to standard code review.
This article provides incident response procedures specifically designed for AI supply chain compromises, covering detection, containment, investigation, and recovery. The procedures are aligned with the NIST Secure Software Development Framework (SSDF) and MITRE ATLAS supply chain techniques.
AI Supply Chain Attack Surface
Model Repositories
Public model repositories (Hugging Face Hub, PyTorch Hub, TensorFlow Hub, ONNX Model Zoo) are the most visible attack surface. Threats include:
| Attack Vector | Description | Historical Precedent |
|---|---|---|
| Malicious model upload | Attacker uploads a trojaned model with a legitimate-sounding name | JFrog Security research (2024) demonstrated pickle-based code execution in HF models |
| Typosquatting | Model named similarly to a popular model to capture mistaken downloads | Common in PyPI/npm, emerging in model registries |
| Account compromise | Legitimate model author's account is compromised | HF implemented org-level access tokens after community reports |
| Serialization exploits | Model files containing executable code (pickle, PyTorch JIT) | ESET and Trail of Bits research on pickle deserialization attacks |
| Metadata manipulation | Model card claims safety properties that the model does not have | No standard verification mechanism for model card claims |
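The serialization-exploit row is worth making concrete. The sketch below (class name and command are illustrative) shows how a pickle payload names an arbitrary callable via `__reduce__`; `pickle.load()` on such a file would execute it, while `pickletools` can disassemble the opcodes without executing anything:

```python
import pickle
import pickletools

class Payload:
    """Illustrative only: __reduce__ lets a pickle reference any callable
    that will be invoked during deserialization."""
    def __reduce__(self):
        import os
        return (os.system, ("echo compromised",))  # would run on pickle.load()

blob = pickle.dumps(Payload())

# Disassembling shows the GLOBAL/STACK_GLOBAL opcode referencing os.system,
# which is exactly what static scanners look for -- no code is executed here.
pickletools.dis(blob)
```

This is why the article recommends safetensors: the format stores only tensor data and metadata, with no opcode stream to abuse.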
Training Frameworks and Dependencies
The Python ML ecosystem has deep dependency trees. A typical LLM serving stack might include:
```bash
# Example dependency tree for a vLLM deployment
pip install vllm
# This transitively installs:
# - torch (PyTorch) - core ML framework
# - transformers (Hugging Face) - model loading and tokenization
# - safetensors - safe model serialization
# - sentencepiece / tokenizers - tokenization
# - numpy, scipy - numerical computation
# - triton - GPU kernel compilation
# - uvicorn, fastapi - HTTP serving
# Plus dozens of transitive dependencies

# Check the full dependency tree
pip install pipdeptree
pipdeptree -p vllm
```

A compromise of any package in this tree -- through a malicious update, a dependency confusion attack, or a maintainer account takeover -- could affect every AI system that depends on it.
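Once the tree is dumped, the transitive package set can be extracted programmatically for comparison against an allowlist. A minimal sketch, assuming the nested structure of `pipdeptree --json-tree` output (each node carries a `key` and a `dependencies` list):

```python
import json

def flatten_tree(nodes: list[dict]) -> set[str]:
    """Collect every package key from a nested dependency-tree dump."""
    names: set[str] = set()
    for node in nodes:
        names.add(node["key"])
        names |= flatten_tree(node.get("dependencies", []))
    return names

# Hand-written fragment mimicking the --json-tree structure
tree = json.loads("""
[{"key": "vllm", "dependencies": [
    {"key": "torch", "dependencies": []},
    {"key": "transformers", "dependencies": [
        {"key": "tokenizers", "dependencies": []}]}]}]
""")
print(sorted(flatten_tree(tree)))  # -> ['tokenizers', 'torch', 'transformers', 'vllm']
```

Diffing this set between two points in time is a cheap way to spot an unexpected package appearing in the tree.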
Training Data Sources
Third-party training data is itself a supply chain component. Compromise vectors include:
- Data vendor breach leading to poisoned datasets
- Web scraping pipelines that ingest adversary-controlled content
- Annotation services where annotators are compromised or bribed
- Synthetic data generated by compromised models
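Each of these sources benefits from content-addressed snapshots: hashing the dataset at ingestion time makes later tampering detectable. A minimal sketch (the function name is illustrative):

```python
import hashlib
from pathlib import Path

def dataset_fingerprint(data_dir: str) -> str:
    """SHA-256 over relative paths and file contents in sorted order,
    so the fingerprint is stable across machines and changes if any
    byte of any file changes."""
    h = hashlib.sha256()
    root = Path(data_dir)
    for f in sorted(p for p in root.rglob("*") if p.is_file()):
        h.update(str(f.relative_to(root)).encode())
        h.update(f.read_bytes())
    return h.hexdigest()
```

Recording the fingerprint alongside the training run links every model back to the exact data bytes it saw, which the investigation procedures below depend on.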
Indicators of Compromise
Model-Level IoCs
"""
AI supply chain indicators of compromise (IoC) detection module.
"""
import hashlib
import json
import subprocess
from pathlib import Path
from dataclasses import dataclass
@dataclass
class SupplyChainIoC:
"""An indicator of compromise in the AI supply chain."""
ioc_type: str
severity: str # "low", "medium", "high", "critical"
description: str
evidence: dict
mitre_atlas_id: str | None = None
def check_model_file_safety(model_path: str) -> list[SupplyChainIoC]:
"""
Check a model file for supply chain indicators of compromise.
Examines the model serialization format and contents for
known malicious patterns.
"""
iocs = []
path = Path(model_path)
# Check 1: Dangerous serialization formats
if path.suffix in (".pkl", ".pickle", ".bin"):
# PyTorch .bin files use pickle internally
iocs.append(SupplyChainIoC(
ioc_type="dangerous_serialization",
severity="high",
description=(
f"Model uses {path.suffix} format which supports arbitrary "
"code execution during deserialization. Prefer safetensors format."
),
evidence={"file": str(path), "format": path.suffix},
mitre_atlas_id="AML.T0010", # ML Supply Chain Compromise
))
# Check 2: Scan pickle files for known malicious patterns
if path.suffix in (".pkl", ".pickle", ".bin", ".pt", ".pth"):
try:
import pickletools
import io
with open(path, "rb") as f:
# Read first 10MB to scan for suspicious opcodes
header = f.read(10 * 1024 * 1024)
# Look for os.system, subprocess, eval, exec patterns
suspicious_strings = [
b"os.system", b"subprocess", b"eval", b"exec",
b"__import__", b"builtins", b"commands",
b"/bin/sh", b"/bin/bash", b"cmd.exe",
b"requests.get", b"urllib",
]
for pattern in suspicious_strings:
if pattern in header:
iocs.append(SupplyChainIoC(
ioc_type="malicious_payload",
severity="critical",
description=f"Suspicious pattern '{pattern.decode()}' found in model file",
evidence={"file": str(path), "pattern": pattern.decode()},
mitre_atlas_id="AML.T0010",
))
except Exception:
pass
# Check 3: Verify safetensors integrity
if path.suffix == ".safetensors":
try:
from safetensors import safe_open
# safetensors format does not support code execution
# but verify the file is valid
with safe_open(str(path), framework="pt") as f:
_ = f.keys()
except Exception as e:
iocs.append(SupplyChainIoC(
ioc_type="corrupted_model_file",
severity="medium",
description=f"Safetensors file failed validation: {e}",
evidence={"file": str(path), "error": str(e)},
))
return iocs
def check_dependency_integrity(requirements_file: str) -> list[SupplyChainIoC]:
"""
Check Python dependencies for supply chain indicators.
Verifies package integrity and checks for known compromised versions.
"""
iocs = []
req_path = Path(requirements_file)
if not req_path.exists():
return iocs
lines = req_path.read_text().splitlines()
for line in lines:
line = line.strip()
if not line or line.startswith("#"):
continue
# Check for unpinned dependencies
if "==" not in line and ">=" not in line:
iocs.append(SupplyChainIoC(
ioc_type="unpinned_dependency",
severity="medium",
description=f"Dependency '{line}' is not version-pinned",
evidence={"package": line, "file": str(req_path)},
))
# Check for known typosquatting patterns in ML packages
known_packages = {
"torch", "pytorch", "tensorflow", "transformers",
"numpy", "scipy", "pandas", "safetensors",
}
pkg_name = line.split("==")[0].split(">=")[0].split("<=")[0].strip()
for known in known_packages:
if pkg_name != known and _levenshtein_distance(pkg_name, known) <= 2:
iocs.append(SupplyChainIoC(
ioc_type="possible_typosquat",
severity="high",
description=f"Package '{pkg_name}' is similar to known package '{known}'",
evidence={"package": pkg_name, "similar_to": known},
mitre_atlas_id="AML.T0010",
))
return iocs
def _levenshtein_distance(s1: str, s2: str) -> int:
if len(s1) < len(s2):
return _levenshtein_distance(s2, s1)
if len(s2) == 0:
return len(s1)
prev_row = range(len(s2) + 1)
for i, c1 in enumerate(s1):
curr_row = [i + 1]
for j, c2 in enumerate(s2):
insertions = prev_row[j + 1] + 1
deletions = curr_row[j] + 1
substitutions = prev_row[j] + (c1 != c2)
curr_row.append(min(insertions, deletions, substitutions))
prev_row = curr_row
return prev_row[-1]Pipeline-Level IoCs
```python
def check_training_pipeline_integrity(
    pipeline_config: dict,
    expected_config_hash: str | None = None,
) -> list[SupplyChainIoC]:
    """
    Check training pipeline configuration for supply chain IoCs.
    """
    iocs = []

    # Check for insecure or unverified data sources
    for source in pipeline_config.get("data_sources", []):
        url = source.get("url", "")

        # Flag non-HTTPS data sources
        if url.startswith("http://"):
            iocs.append(SupplyChainIoC(
                ioc_type="insecure_data_source",
                severity="high",
                description=f"Data source uses insecure HTTP: {url}",
                evidence={"url": url},
            ))

        # Flag data sources that have not been explicitly verified
        if source.get("verified") is not True:
            iocs.append(SupplyChainIoC(
                ioc_type="unverified_data_source",
                severity="medium",
                description=f"Data source not verified: {url}",
                evidence={"url": url, "source_config": source},
            ))

    # Check for config tampering via a canonical-JSON hash
    if expected_config_hash:
        config_bytes = json.dumps(pipeline_config, sort_keys=True).encode()
        actual_hash = hashlib.sha256(config_bytes).hexdigest()
        if actual_hash != expected_config_hash:
            iocs.append(SupplyChainIoC(
                ioc_type="config_tampering",
                severity="critical",
                description="Pipeline configuration hash does not match expected value",
                evidence={
                    "expected": expected_config_hash,
                    "actual": actual_hash,
                },
            ))

    return iocs
```

Incident Response Procedures
Phase 1: Detection and Triage
When a supply chain compromise is suspected:
- Alert correlation: Determine whether the alert is isolated or part of a broader pattern. Check if other organizations using the same model/package/data have reported issues.
- Scope assessment: Identify all systems that depend on the compromised component.
- Severity scoring: Use the AI Incident Severity Scoring framework. Supply chain compromises typically score HIGH or CRITICAL on blast radius.
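The scope-assessment step reduces to a reachability query over the deployment inventory. A sketch, assuming the inventory is maintained as a simple system-to-components map (names below are illustrative):

```python
def blast_radius(component: str, inventory: dict[str, set[str]]) -> set[str]:
    """Return every system whose component set includes the compromised component."""
    return {system for system, deps in inventory.items() if component in deps}

inventory = {
    "chat-frontend": {"llama-3-8b", "vllm", "fastapi"},
    "batch-summarizer": {"llama-3-8b", "transformers"},
    "search-ranker": {"bge-embeddings", "onnxruntime"},
}
print(sorted(blast_radius("llama-3-8b", inventory)))  # -> ['batch-summarizer', 'chat-frontend']
```

In practice the map comes from an SBOM or deployment database, and the query may need to be transitive (a compromised component inside a shared base image affects everything built on that image).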
Phase 2: Containment
```bash
# Immediate containment steps for a suspected model compromise

# 1. Quarantine the suspected model
#    Move to isolated storage; do not delete (preserve evidence)
mkdir -p /evidence/quarantine/$(date +%Y%m%d)
mv /models/production/compromised-model/ /evidence/quarantine/$(date +%Y%m%d)/

# 2. Roll back to last known-good model version
#    Verify the rollback target's integrity first
sha256sum /models/archive/model-v1.2.3/model.safetensors
# Compare against stored reference hash

# 3. Block the compromised source
#    Add to model registry blocklist
echo "compromised-org/malicious-model" >> /etc/ai-security/model-blocklist.txt

# 4. Revoke any API keys or tokens that may have been exposed
#    through the compromised component

# 5. Capture forensic evidence before any cleanup
tar -czf /evidence/model-artifacts-$(date +%s).tar.gz \
    /var/log/model-server/ \
    /tmp/model-cache/ \
    /etc/model-config/
```

Phase 3: Investigation
The investigation phase determines the full scope and impact of the compromise:
- Component analysis: Examine the compromised component for malicious modifications (backdoors in models, malicious code in packages)
- Propagation tracing: Determine which downstream systems ingested the compromised component and when
- Impact assessment: Evaluate what actions the compromised component could have performed (data access, model modifications, lateral movement)
- Attribution: Attempt to determine the threat actor and their objectives
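Propagation tracing can often be answered directly from artifact-pull logs. A sketch, assuming each log record is a `(system, artifact_hash, timestamp)` tuple (a hypothetical schema; adapt to your registry's audit log):

```python
from datetime import datetime

def first_ingestion(
    events: list[tuple[str, str, datetime]],
    compromised_hash: str,
) -> dict[str, datetime]:
    """Map each affected system to the earliest time it pulled the artifact."""
    first: dict[str, datetime] = {}
    for system, artifact_hash, ts in events:
        if artifact_hash == compromised_hash and (
            system not in first or ts < first[system]
        ):
            first[system] = ts
    return first
```

The earliest timestamp per system bounds the exposure window, which in turn scopes the impact assessment (what data and credentials were reachable during that window).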
Phase 4: Eradication and Recovery
- Remove compromised components: Replace all instances of the compromised model, package, or data with verified clean versions
- Rebuild affected models: If training data was compromised, models must be retrained from clean data
- Patch entry points: Address the vulnerability that allowed the supply chain compromise (e.g., lack of signature verification, insecure deserialization)
- Verify recovery: Run full behavioral evaluation on recovered systems to confirm clean state
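The verify-recovery step can be partially automated by re-hashing every restored artifact against a known-good manifest (function name is illustrative; this complements, but does not replace, behavioral evaluation):

```python
import hashlib
from pathlib import Path

def manifest_mismatches(manifest: dict[str, str], root: str) -> list[str]:
    """Return relative paths that are missing or whose SHA-256 differs
    from the known-good manifest entry."""
    bad = []
    for rel, expected in manifest.items():
        p = Path(root) / rel
        if not p.is_file() or hashlib.sha256(p.read_bytes()).hexdigest() != expected:
            bad.append(rel)
    return sorted(bad)
```

An empty result means every manifest entry was restored byte-for-byte; any non-empty result blocks the return to production.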
Phase 5: Post-Incident
- Lessons learned: Document the incident and update supply chain security policies
- Share intelligence: Publish IoCs through appropriate channels (without revealing sensitive organizational details)
- Update detection: Add detection rules for the specific attack patterns observed
- Review supply chain controls: Assess whether additional controls are needed
Prevention and Hardening
Model Provenance Verification
```python
def verify_model_provenance(
    model_path: str,
    expected_hashes: dict[str, str],
    expected_source: str,
) -> dict:
    """
    Verify model provenance before deployment.

    Checks file integrity, source attestation, and format safety.
    """
    results = {"checks": [], "overall_status": "PASS"}
    path = Path(model_path)

    # Check 1: File format safety
    safe_formats = {".safetensors", ".onnx", ".gguf"}
    if path.suffix in safe_formats:
        results["checks"].append({"name": "format_safety", "status": "PASS"})
    else:
        results["checks"].append({
            "name": "format_safety",
            "status": "WARN",
            "detail": f"Format {path.suffix} may support code execution",
        })

    # Check 2: Hash verification for every expected file
    for filename, expected_hash in expected_hashes.items():
        filepath = path / filename if path.is_dir() else path
        if filepath.exists():
            actual_hash = hashlib.sha256(filepath.read_bytes()).hexdigest()
            status = "PASS" if actual_hash == expected_hash else "FAIL"
            results["checks"].append({
                "name": f"hash_{filename}",
                "status": status,
                "expected": expected_hash,
                "actual": actual_hash,
            })
            if status == "FAIL":
                results["overall_status"] = "FAIL"
        else:
            results["checks"].append({
                "name": f"hash_{filename}",
                "status": "FAIL",
                "detail": "File not found",
            })
            results["overall_status"] = "FAIL"

    # NOTE: attestation against expected_source (e.g. a signed provenance
    # statement from the model registry) is deployment-specific and is not
    # implemented here.
    return results
```

Dependency Pinning and Verification
```bash
# Generate locked dependencies with hashes
pip install pip-tools
pip-compile --generate-hashes requirements.in -o requirements.txt

# Verify installed packages against expected hashes
pip install --require-hashes -r requirements.txt

# Use sigstore for Python package verification (where available)
pip install sigstore
python -m sigstore verify identity \
    --cert-identity publisher@example.com \
    --cert-oidc-issuer https://accounts.google.com \
    package-1.0.0.tar.gz
```

MITRE ATLAS Mapping
| Supply Chain Attack | ATLAS Technique | ATLAS ID |
|---|---|---|
| Compromised model repository | ML Supply Chain Compromise | AML.T0010 |
| Poisoned pre-training data | Poison Training Data | AML.T0020 |
| Compromised ML framework | ML Supply Chain Compromise > Software Dependencies | AML.T0010.001 |
| Malicious model serialization | ML Supply Chain Compromise > Model Repository | AML.T0010.000 |
| Backdoored fine-tuning data | Poison Training Data > Inject Poisoned Data | AML.T0020.000 |
References
- JFrog Security Research. (2024). Malicious ML Models on Hugging Face. JFrog Blog. https://jfrog.com/blog/data-scientists-targeted-by-malicious-hugging-face-ml-models-with-silent-backdoor/
- MITRE ATLAS. (2024). Adversarial Threat Landscape for Artificial Intelligence Systems. https://atlas.mitre.org/
- NIST. (2022). Secure Software Development Framework (SSDF) Version 1.1. NIST SP 800-218. https://doi.org/10.6028/NIST.SP.800-218
- Trail of Bits. (2024). Fickling: A Python Pickling Decompiler and Static Analyzer. https://github.com/trailofbits/fickling