AI System Memory Forensics
Memory forensics techniques for investigating compromised AI systems, including GPU memory analysis, model weight extraction, and runtime state recovery.
Overview
Memory forensics for AI systems extends traditional digital forensics into the unique runtime environment of machine learning workloads. AI systems maintain complex state in both CPU and GPU memory during inference and training: model weights, optimizer states, attention caches (KV caches), intermediate activations, tokenizer configurations, and dynamically loaded adapter weights. When an AI system is compromised -- whether through model tampering, unauthorized fine-tuning, or runtime manipulation -- memory forensics provides the investigator with a snapshot of the system's actual state at the time of capture.
Traditional memory forensics tools like Volatility are designed for CPU-addressable memory and operating system artifacts. AI workloads, however, distribute critical state across GPU VRAM, unified memory architectures, and framework-managed memory pools that require specialized extraction techniques. This article covers the end-to-end process of AI system memory forensics, from capture through analysis and reporting.
The stakes are significant: if an attacker has modified model weights in memory without altering the on-disk checkpoint, only a memory forensic investigation will reveal the tampering. Similarly, if an attacker has injected malicious code into a model serving pipeline that modifies outputs at runtime, memory analysis may be the only way to recover the injected logic.
Memory Architecture of AI Systems
CPU Memory Layout
AI serving frameworks (vLLM, Triton Inference Server, TorchServe) maintain several categories of data in CPU memory:
- Model configuration: Hyperparameters, tokenizer vocabulary, generation parameters
- Request queues: Pending inference requests including full prompt text
- Response buffers: Generated outputs before delivery to clients
- Framework metadata: Scheduling state, batch composition, memory allocation maps
- Logging buffers: Circular buffers of recent inference events
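Because request queues and response buffers hold plaintext, a captured heap region can be triaged with a simple printable-string scan, in the spirit of the classic `strings` utility. The sketch below is illustrative; `extract_strings` and the sample dump bytes are hypothetical, and in practice the input would be a region dumped from `/proc/<pid>/mem`.

```python
# Sketch: recover printable ASCII artifacts (prompts, config JSON) from a raw
# process memory dump. extract_strings is a hypothetical helper; real dumps
# come from /proc/<pid>/mem regions identified via the maps file.
import re

def extract_strings(dump_bytes: bytes, min_len: int = 16) -> list[str]:
    """Return printable ASCII runs of at least min_len bytes from a raw dump."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(dump_bytes)]

# Example: scan a captured heap region for request-like artifacts
dump = b'\x00\x01{"prompt": "summarize the quarterly report"}\xff\x00garbage'
artifacts = [s for s in extract_strings(dump) if "prompt" in s]
```

Filtering on markers such as `"prompt"` or known API field names quickly narrows thousands of recovered strings to inference-relevant artifacts.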
GPU Memory Layout
GPU VRAM contains the computationally active components:
- Model weights: The parameter tensors that define model behavior, often stored in reduced-precision or quantized formats (FP16, INT8, INT4)
- KV cache: Key-value attention cache for active generation sessions, containing the model's "working memory" of ongoing conversations
- Activation tensors: Intermediate computation results during forward passes
- CUDA graphs: Pre-compiled computation graphs for optimized inference paths
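A useful triage step is comparing observed VRAM usage against the footprint these components should occupy: weights scale with parameter count, and the KV cache scales with layers, heads, sequence length, and batch size. Unexplained allocations beyond the estimate warrant closer inspection. The arithmetic below is a sketch; the 7B-class model dimensions are illustrative assumptions.

```python
# Sketch: estimate the expected VRAM footprint of weights plus KV cache so an
# investigator can flag unexplained allocations. Model dimensions below
# (7B-class model served in FP16) are illustrative assumptions.

def expected_footprint_bytes(
    n_params: int, n_layers: int, n_kv_heads: int,
    head_dim: int, seq_len: int, batch: int, dtype_bytes: int = 2,
) -> dict:
    weights = n_params * dtype_bytes
    # 2x for keys and values; one cache entry per layer, per head, per token
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * dtype_bytes
    return {"weights": weights, "kv_cache": kv_cache, "total": weights + kv_cache}

est = expected_footprint_bytes(
    n_params=7_000_000_000, n_layers=32, n_kv_heads=32,
    head_dim=128, seq_len=4096, batch=8,
)
```

Activation tensors and CUDA graph workspaces add framework-dependent overhead on top of this baseline, so the estimate bounds the minimum, not the exact, expected usage.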
Unified Memory and NVLink
Modern GPU architectures support unified virtual addressing (UVA) where CPU and GPU memory appear as a single address space. Multi-GPU systems connected via NVLink distribute model weights through tensor parallelism or pipeline parallelism, meaning a complete model state may be spread across multiple GPUs.
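A practical consequence for integrity checking: under tensor parallelism each GPU holds only a slice of every weight, so reference hashes must be maintained per shard and compared rank by rank. The sketch below assumes a simple manifest structure (one `{param_name: sha256}` dict per rank); `verify_sharded` and the shard layout are hypothetical, not a specific framework's API.

```python
# Sketch: per-rank shard integrity comparison for tensor-parallel deployments.
# shard_manifests is an assumed structure: one {param_name: sha256} dict per
# GPU rank, captured independently on each device.
import hashlib

def hash_shard(shard_bytes: bytes) -> str:
    return hashlib.sha256(shard_bytes).hexdigest()

def verify_sharded(shard_manifests: list[dict], reference: list[dict]) -> list[str]:
    """Return 'rank:param' identifiers whose shard hash deviates from reference."""
    deviations = []
    for rank, (captured, expected) in enumerate(zip(shard_manifests, reference)):
        for pname, digest in captured.items():
            if expected.get(pname) != digest:
                deviations.append(f"{rank}:{pname}")
    return deviations

ref = [{"w_q": hash_shard(b"rank0-slice")}, {"w_q": hash_shard(b"rank1-slice")}]
cap = [{"w_q": hash_shard(b"rank0-slice")}, {"w_q": hash_shard(b"TAMPERED")}]
```

Per-shard comparison also localizes tampering to a specific device, which matters when only one GPU in a node was accessible to the attacker.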
Memory Capture Techniques
CPU Memory Acquisition
Standard memory acquisition tools work for the CPU portion of AI system state. On Linux systems, the primary methods are:
# Method 1: /proc filesystem capture (requires root)
# Capture the memory of a running AI serving process
AI_PID=$(pgrep -f "vllm.entrypoints")
cp /proc/${AI_PID}/maps /evidence/proc_maps_$(date +%s).txt
# Dump specific memory regions identified from the maps file
# Focus on heap regions where model configs and request data reside
grep "heap" /proc/${AI_PID}/maps
# Method 2: Using gcore for a complete process core dump
# This suspends the process briefly -- coordinate with operations
gcore -o /evidence/ai_server_core ${AI_PID}
# Method 3: Using LiME (Linux Memory Extractor) for full system memory
# Build LiME kernel module for the running kernel
# insmod lime.ko "path=/evidence/full_memory.lime format=lime"
GPU Memory Acquisition
GPU memory acquisition is more complex because GPU VRAM is not directly addressable from host CPU code. The primary approaches are:
"""
GPU memory forensic capture module.
Captures model weights, KV cache, and activation state from GPU memory
for forensic analysis. Requires the model process to be accessible
(either running or via a saved CUDA context).
"""
import torch
import json
import hashlib
import time
from pathlib import Path
from dataclasses import dataclass
@dataclass
class GPUMemoryCapture:
    """Container for a forensic GPU memory capture."""
    capture_time: float
    gpu_device: int
    tensors: dict[str, dict]  # name -> {shape, dtype, hash, data_path}
    gpu_info: dict
    cuda_memory_stats: dict
def capture_gpu_state(
    model: torch.nn.Module,
    output_dir: str,
    device_id: int = 0,
) -> GPUMemoryCapture:
    """
    Capture the complete GPU-resident state of a model for forensic analysis.

    This function iterates over all parameters and buffers in the model,
    computes integrity hashes, and saves tensors to disk. The capture
    preserves the exact binary representation of each tensor.
    """
    output_path = Path(output_dir)
    output_path.mkdir(parents=True, exist_ok=True)
    torch.cuda.synchronize(device_id)
    tensors = {}
    for name, param in model.named_parameters():
        cpu_copy = param.detach().cpu().contiguous()
        # numpy cannot represent bfloat16; reinterpret the bits for hashing
        if cpu_copy.dtype == torch.bfloat16:
            tensor_bytes = cpu_copy.view(torch.int16).numpy().tobytes()
        else:
            tensor_bytes = cpu_copy.numpy().tobytes()
        tensor_hash = hashlib.sha256(tensor_bytes).hexdigest()
        tensor_path = output_path / f"{name.replace('.', '_')}.pt"
        torch.save(cpu_copy, tensor_path)
        tensors[name] = {
            "shape": list(param.shape),
            "dtype": str(param.dtype),
            "device": str(param.device),
            "hash_sha256": tensor_hash,
            "data_path": str(tensor_path),
            "requires_grad": param.requires_grad,
            "size_bytes": len(tensor_bytes),
        }
    # Capture buffer state (running means, variances, etc.)
    for name, buf in model.named_buffers():
        cpu_copy = buf.detach().cpu().contiguous()
        if cpu_copy.dtype == torch.bfloat16:
            tensor_bytes = cpu_copy.view(torch.int16).numpy().tobytes()
        else:
            tensor_bytes = cpu_copy.numpy().tobytes()
        tensor_hash = hashlib.sha256(tensor_bytes).hexdigest()
        tensor_path = output_path / f"buffer_{name.replace('.', '_')}.pt"
        torch.save(cpu_copy, tensor_path)
        tensors[f"buffer:{name}"] = {
            "shape": list(buf.shape),
            "dtype": str(buf.dtype),
            "hash_sha256": tensor_hash,
            "data_path": str(tensor_path),
        }
    gpu_info = {
        "name": torch.cuda.get_device_name(device_id),
        "total_memory_bytes": torch.cuda.get_device_properties(device_id).total_memory,
        "capability": list(torch.cuda.get_device_capability(device_id)),
    }
    memory_stats = torch.cuda.memory_stats(device_id)
    capture = GPUMemoryCapture(
        capture_time=time.time(),
        gpu_device=device_id,
        tensors=tensors,
        gpu_info=gpu_info,
        cuda_memory_stats={
            k: v for k, v in memory_stats.items()
            if isinstance(v, (int, float))
        },
    )
    manifest_path = output_path / "capture_manifest.json"
    manifest_path.write_text(json.dumps({
        "capture_time": capture.capture_time,
        "gpu_device": capture.gpu_device,
        "gpu_info": capture.gpu_info,
        "tensor_count": len(capture.tensors),
        "tensors": capture.tensors,
    }, indent=2))
    return capture

KV Cache Extraction
The KV (Key-Value) attention cache is particularly valuable forensically because it contains the model's computed representations of all tokens processed in active sessions. Extracting the KV cache can reveal what prompts were being processed and what conversation context the model was operating with.
def extract_kv_cache_forensics(
    kv_cache: list[tuple[torch.Tensor, torch.Tensor]],
    output_dir: str,
) -> dict:
    """
    Extract and analyze KV cache state for forensic purposes.

    The KV cache contains key and value tensors for each attention layer,
    representing the model's computed context for active sessions.

    Args:
        kv_cache: List of (key, value) tensor pairs, one per layer.
        output_dir: Directory to write extracted cache data.
    """
    output_path = Path(output_dir)
    output_path.mkdir(parents=True, exist_ok=True)
    analysis = {"layers": [], "total_tokens_cached": 0}
    for layer_idx, (keys, values) in enumerate(kv_cache):
        # keys shape: (batch, num_heads, seq_len, head_dim)
        seq_len = keys.shape[2] if keys.dim() == 4 else keys.shape[1]
        analysis["total_tokens_cached"] = max(
            analysis["total_tokens_cached"], seq_len
        )
        layer_info = {
            "layer": layer_idx,
            "key_shape": list(keys.shape),
            "value_shape": list(values.shape),
            "key_hash": hashlib.sha256(
                keys.detach().cpu().numpy().tobytes()
            ).hexdigest(),
            "value_hash": hashlib.sha256(
                values.detach().cpu().numpy().tobytes()
            ).hexdigest(),
            "cached_sequence_length": seq_len,
        }
        # Save tensors for detailed analysis
        torch.save(keys.detach().cpu(), output_path / f"layer_{layer_idx}_keys.pt")
        torch.save(values.detach().cpu(), output_path / f"layer_{layer_idx}_values.pt")
        analysis["layers"].append(layer_info)
    return analysis

Analyzing Memory Captures
Weight Integrity Verification
The most critical analysis is comparing captured weights against known-good reference checksums. Any discrepancy indicates either model tampering or an unauthorized update.
def verify_weight_integrity(
    capture_manifest: dict,
    reference_hashes: dict[str, str],
) -> dict:
    """
    Compare captured model weight hashes against reference checksums.

    Args:
        capture_manifest: The manifest from a GPU memory capture.
        reference_hashes: Dict mapping parameter names to expected SHA-256 hashes.

    Returns:
        Analysis results including any mismatches.
    """
    results = {
        "total_parameters": len(capture_manifest["tensors"]),
        "verified_matching": 0,
        "mismatches": [],
        "missing_from_capture": [],
        "extra_in_capture": [],
    }
    captured_names = set(capture_manifest["tensors"].keys())
    reference_names = set(reference_hashes.keys())
    # Reference parameters absent from the capture (incomplete capture or removal)
    results["missing_from_capture"] = list(reference_names - captured_names)
    # Captured parameters with no reference entry (possible injection)
    results["extra_in_capture"] = list(captured_names - reference_names)
    for name, tensor_info in capture_manifest["tensors"].items():
        if name in reference_hashes:
            if tensor_info["hash_sha256"] == reference_hashes[name]:
                results["verified_matching"] += 1
            else:
                results["mismatches"].append({
                    "parameter": name,
                    "expected_hash": reference_hashes[name],
                    "captured_hash": tensor_info["hash_sha256"],
                    "shape": tensor_info["shape"],
                    "dtype": tensor_info["dtype"],
                })
    results["integrity_status"] = (
        "VERIFIED"
        if not results["mismatches"]
        and not results["missing_from_capture"]
        and not results["extra_in_capture"]
        else "COMPROMISED"
    )
    return results

Detecting Injected Adapter Weights
An attacker who has gained access to a model serving system may inject LoRA adapter weights to modify model behavior without changing the base model weights. This is forensically stealthy because the base model hashes will still match the reference.
def detect_unexpected_adapters(
    model: torch.nn.Module,
    expected_adapter_names: set[str] | None = None,
) -> dict:
    """
    Scan a model for unexpected LoRA or adapter modules.

    Attackers may inject adapter weights to modify behavior without
    altering base model weights. This function identifies any adapter
    modules that were not part of the expected configuration.
    """
    expected = expected_adapter_names or set()
    findings = {"expected_adapters": [], "unexpected_adapters": [], "suspicious_modules": []}
    for name, module in model.named_modules():
        module_type = type(module).__name__
        # Check for common adapter module types
        is_adapter = any(keyword in module_type.lower() for keyword in [
            "lora", "adapter", "prefix", "prompt_tuning", "ia3",
        ])
        if is_adapter:
            info = {
                "name": name,
                "type": module_type,
                "param_count": sum(p.numel() for p in module.parameters()),
            }
            if name in expected:
                findings["expected_adapters"].append(info)
            else:
                findings["unexpected_adapters"].append(info)
        # Also check for suspiciously named parameters
        for pname, param in module.named_parameters(recurse=False):
            if any(kw in pname.lower() for kw in ["inject", "hook", "patch", "backdoor"]):
                findings["suspicious_modules"].append({
                    "module": name,
                    "parameter": pname,
                    "shape": list(param.shape),
                })
    return findings

Runtime Hook Detection
PyTorch's hook mechanism allows code to intercept forward and backward passes. An attacker can register hooks that modify model outputs without changing weights. Forensic analysis should enumerate all registered hooks.
def enumerate_model_hooks(model: torch.nn.Module) -> dict:
    """
    Enumerate all registered forward and backward hooks on a model.

    PyTorch hooks can modify model behavior at runtime without
    altering weights. An attacker could use hooks to:
    - Modify specific outputs based on trigger inputs
    - Exfiltrate data through side channels
    - Bypass safety filters selectively
    """
    findings = {"forward_hooks": [], "backward_hooks": [], "forward_pre_hooks": []}
    for name, module in model.named_modules():
        # Check forward hooks
        if getattr(module, '_forward_hooks', None):
            for hook_id, hook in module._forward_hooks.items():
                findings["forward_hooks"].append({
                    "module": name,
                    "hook_id": hook_id,
                    "hook_function": str(hook),
                    "source_module": getattr(hook, '__module__', 'unknown'),
                })
        # Check backward hooks (both the legacy and full-backward registries)
        for registry in ('_backward_hooks', '_full_backward_hooks'):
            hooks = getattr(module, registry, None)
            if hooks:
                for hook_id, hook in hooks.items():
                    findings["backward_hooks"].append({
                        "module": name,
                        "hook_id": hook_id,
                        "hook_function": str(hook),
                    })
        # Check forward pre-hooks
        if getattr(module, '_forward_pre_hooks', None):
            for hook_id, hook in module._forward_pre_hooks.items():
                findings["forward_pre_hooks"].append({
                    "module": name,
                    "hook_id": hook_id,
                    "hook_function": str(hook),
                })
    findings["total_hooks"] = (
        len(findings["forward_hooks"])
        + len(findings["backward_hooks"])
        + len(findings["forward_pre_hooks"])
    )
    return findings

Process Memory Analysis for AI Frameworks
Python Object Recovery
AI serving systems typically run in Python processes. Traditional memory forensics can be augmented with Python-specific analysis to recover objects from the heap.
# Use py-spy to get a snapshot of the Python process state
# This captures the call stack of all threads without stopping the process
py-spy dump --pid ${AI_PID} > /evidence/python_state.txt
# For deeper analysis, use gdb with the CPython debugging extensions
# (the helper script path varies by distribution and Python version)
gdb -batch -ex "source /usr/lib/python3.11/gdb_helpers.py" \
    -ex "py-bt" -ex "quit" -p ${AI_PID} > /evidence/python_backtrace.txt

Recovering Request Data from Memory
Inference requests passing through the serving pipeline leave traces in memory that can be recovered even after the request has been processed. These traces exist in:
- Python string objects reachable from the garbage collector's tracked containers
- Framework request queue data structures
- HTTP server buffers (if using HTTP-based serving)
- Tokenizer encode/decode buffers
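On a live process (reached via an injected debugger or a pyrasite-style shell), the first of these sources can be walked directly: strings themselves are not GC-tracked, but the lists and dicts that hold them are. The sketch below is a hedged illustration; `recover_request_strings` and the `"prompt"` marker filter are assumptions, not a standard tool.

```python
# Sketch: walk the CPython garbage collector's tracked containers to recover
# request-like strings still resident on the heap. The "prompt" marker filter
# is an illustrative assumption; run inside the target process via an
# injected debugger session.
import gc

def recover_request_strings(marker: str = "prompt", min_len: int = 20) -> list[str]:
    hits: list[str] = []
    for obj in gc.get_objects():
        # Skip our own accumulator and anything that is not a container;
        # copy before scanning so we never iterate a mutating list
        if obj is hits or not isinstance(obj, (list, tuple)):
            continue
        for s in list(obj):
            if isinstance(s, str) and len(s) >= min_len and marker in s:
                hits.append(s)
    return hits

# Example: a request buffer lingering in a framework-owned list
_queue = ['{"prompt": "draft a resignation letter"}']
recovered = recover_request_strings()
```

This recovers only objects the interpreter still references; fully freed strings require raw-dump carving instead.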
Investigation Workflow
Phase 1: Scene Preservation
- Document the current system state (running processes, network connections, GPU utilization)
- Capture volatile evidence in order of volatility: GPU VRAM first, then CPU memory, then disk
- Record all system timestamps and synchronize with NTP
Phase 2: Memory Acquisition
- Perform GPU memory capture using the techniques described above
- Acquire CPU memory using LiME or /proc filesystem
- Capture process-specific memory for each AI serving process
- Verify capture integrity with checksums
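The integrity-verification step above can be sketched as an acquisition-time manifest: hash every file in the evidence directory immediately after capture and store a copy of the manifest off-host, so later analysis can prove the captures were not altered. Paths and the manifest layout are illustrative assumptions.

```python
# Sketch: build an acquisition-time integrity manifest over an evidence
# directory for chain-of-custody purposes. Directory layout and manifest
# fields are illustrative assumptions.
import hashlib
import json
import time
from pathlib import Path

def build_evidence_manifest(evidence_dir: str) -> dict:
    manifest = {"created": time.time(), "files": {}}
    for path in sorted(Path(evidence_dir).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest["files"][str(path.relative_to(evidence_dir))] = digest
    return manifest

# Usage: write the manifest alongside the evidence and keep a copy off-host
# manifest = build_evidence_manifest("/evidence")
# Path("/evidence/manifest.json").write_text(json.dumps(manifest, indent=2))
```

For very large GPU captures, streaming the hash in chunks avoids loading each file fully into memory; the all-at-once read here keeps the sketch short.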
Phase 3: Analysis
- Compare weight hashes against reference checksums
- Scan for unexpected adapter modules or hooks
- Analyze KV cache for evidence of specific interactions
- Search CPU memory for artifacts of attacker activity
Phase 4: Correlation
- Cross-reference memory findings with log analysis from other forensic workstreams
- Map findings to MITRE ATLAS techniques
- Establish timeline of compromise using memory artifacts
Challenges and Limitations
Memory forensics for AI systems faces several unique challenges:
- Memory volatility: GPU memory is extremely volatile. VRAM contents change with every inference request, and there is no persistent storage or swap for GPU memory.
- Scale: Large language models can occupy 100+ GB of VRAM across multiple GPUs. Capturing and analyzing this volume of data requires significant storage and compute resources.
- Encryption: Confidential-computing features can encrypt memory in use (AMD SEV-SNP for host memory, NVIDIA Confidential Computing for GPU memory), preventing direct memory reads from the host.
- Framework opacity: Deep learning frameworks manage their own memory pools, making it difficult to map raw memory addresses to meaningful data structures without framework-specific knowledge.
- Quantization artifacts: Models served in quantized formats (INT4, INT8, FP8) require knowledge of the quantization scheme to correctly interpret weight values.
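To make the last point concrete, interpreting a captured INT8 tensor requires recovering the quantization parameters (scale and zero point) that the serving stack stored alongside it. The sketch below assumes a simple asymmetric per-tensor scheme, w = scale * (q - zero_point); real deployments often use per-channel or group-wise scales that must themselves be located in the capture.

```python
# Sketch: map raw int8 codes back to real-valued weights under an assumed
# asymmetric per-tensor quantization scheme: w = scale * (q - zero_point).
# Scale and zero point values below are illustrative.

def dequantize_int8(q_values: list[int], scale: float, zero_point: int) -> list[float]:
    """Dequantize int8 codes to floats for comparison against FP references."""
    return [scale * (q - zero_point) for q in q_values]

weights = dequantize_int8([-128, 0, 127], scale=0.05, zero_point=0)
```

Comparing dequantized values against a full-precision reference checkpoint tolerates quantization error, so integrity checks on quantized captures should use a distance threshold rather than exact hash equality.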
References
- Ligh, M. H., Case, A., Levy, J., & Walters, A. (2014). The Art of Memory Forensics: Detecting Malware and Threats in Windows, Linux, and Mac Memory. Wiley.
- MITRE ATLAS. (2024). Adversarial Threat Landscape for Artificial Intelligence Systems. https://atlas.mitre.org/
- NVIDIA. (2024). CUDA C++ Programming Guide: Unified Memory. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#unified-memory-programming
- NIST. (2023). Artificial Intelligence Risk Management Framework (AI RMF 1.0). NIST AI 100-1. https://doi.org/10.6028/NIST.AI.100-1