Experiment Metadata Leakage
How experiment metadata reveals sensitive information: hyperparameters exposing architecture secrets, loss curves revealing training data properties, run names and tags disclosing project intent, and techniques for extracting intelligence from ML experiment logs.
Every experiment logged to a tracking platform generates metadata that, individually, appears innocuous. Learning rates, loss values, and run names do not look like sensitive data. But metadata accumulates, and the aggregate reveals far more than any single data point. An attacker who can read experiment metadata -- through a leaked API key, an unprotected tracking instance, or a public project -- can reconstruct an organization's ML strategy, training data characteristics, and architectural decisions without ever accessing a single model weight file.
Hyperparameters Revealing Architecture
Hyperparameters logged for reproducibility encode the complete blueprint of a model's architecture and training methodology. An attacker reading these parameters gains the equivalent of access to the model definition code.
What Hyperparameters Reveal
| Parameter | What It Reveals |
|---|---|
| `num_layers`, `hidden_size`, `num_attention_heads` | Exact transformer architecture dimensions |
| `vocab_size`, `max_position_embeddings` | Tokenizer and context window specifications |
| `num_experts`, `num_experts_per_tok` | Mixture-of-Experts architecture and routing configuration |
| `rope_theta`, `rope_scaling` | Positional encoding strategy and context extension approach |
| `intermediate_size`, `num_key_value_heads` | MLP dimensions and grouped-query attention configuration |
| `lora_r`, `lora_alpha`, `lora_target_modules` | Fine-tuning methodology and which layers are adapted |
| `quantization_config` | Precision trade-offs and deployment constraints |
Reconstructing Architecture from Hyperparameters
A systematic approach to architecture reconstruction:
Collect hyperparameters across runs
Gather all logged hyperparameters from multiple training runs. Architecture parameters are usually consistent across runs while training parameters vary.
Identify the base architecture
Match the hyperparameter set against known model architectures (Llama, Mistral, GPT-NeoX, etc.). The combination of `hidden_size`, `num_attention_heads`, and `num_layers` is often unique to a specific base model.
Determine modifications
Look for parameters that deviate from the base architecture: custom layer counts indicate knowledge distillation, unusual attention head configurations suggest architectural innovations, and the presence of LoRA parameters reveals fine-tuning targets.
Map the training methodology
Optimizer choice, learning rate schedule, warmup steps, and gradient accumulation reveal the training recipe. Combined with architecture knowledge, this enables reproduction of the model.
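The matching step above can be sketched as a dictionary comparison. The reference configurations below are illustrative approximations of two public base models, not an authoritative database:

```python
# Sketch: match a leaked hyperparameter set against known base architectures.
# The reference configs below are illustrative, not an authoritative database.
KNOWN_ARCHITECTURES = {
    "llama-2-7b": {"hidden_size": 4096, "num_attention_heads": 32,
                   "num_layers": 32, "intermediate_size": 11008},
    "mistral-7b": {"hidden_size": 4096, "num_attention_heads": 32,
                   "num_layers": 32, "intermediate_size": 14336},
}

def match_architecture(hparams):
    """Return names of known architectures whose core dimensions all match."""
    return [
        name for name, ref in KNOWN_ARCHITECTURES.items()
        if all(hparams.get(k) == v for k, v in ref.items())
    ]

leaked = {"hidden_size": 4096, "num_attention_heads": 32,
          "num_layers": 32, "intermediate_size": 14336,
          "lora_r": 16}                    # extra keys are ignored
print(match_architecture(leaked))          # -> ['mistral-7b']
```

Note that the extra `lora_r` key does not block the match: only the core dimension keys are compared, which mirrors how an analyst would separate architecture parameters from training parameters.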
Case Study: Architecture Inference
Consider these hyperparameters extracted from a tracking platform:
```
hidden_size: 4096
num_attention_heads: 32
num_key_value_heads: 8
num_layers: 32
intermediate_size: 14336
vocab_size: 32000
max_position_embeddings: 32768
rope_theta: 500000
```
This immediately identifies a Llama-family architecture (these dimensions match Mistral 7B) with grouped-query attention (32 heads, 8 KV heads), an extended context window (32K), and a standard MLP ratio. The `rope_theta: 500000` reveals the team is using a specific context extension technique rather than the default 10000.
Loss Curves Revealing Training Data Properties
Training loss curves are more than performance indicators. Their shape, magnitude, and behavior encode information about the training data that produced them.
What Loss Curves Reveal
Initial loss magnitude. The starting loss value before any training indicates the model's prior knowledge of the training distribution. A low initial loss suggests fine-tuning from a model already familiar with similar data. A high initial loss indicates a significant distribution shift.
Convergence rate. How quickly loss decreases reveals dataset size and diversity. Rapid convergence suggests a small or homogeneous dataset. Slow, steady convergence indicates a large, diverse corpus.
Loss plateaus and drops. Sudden drops in loss often correspond to learning rate schedule changes, but unexpected plateaus can reveal data quality issues or distribution boundaries in the training data.
Validation-training gap. The gap between training and validation loss reveals overfitting, which in turn reveals effective dataset size. A large gap on a small model suggests limited training data -- valuable intelligence for data poisoning attacks.
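These heuristics can be computed directly from logged loss series. A minimal sketch, with fabricated loss values for illustration:

```python
# Sketch: derive the loss-curve heuristics described above from logged series.
# The loss values are fabricated for illustration.
def convergence_rate(losses):
    """Fraction of the initial loss removed over the run (0 = none, 1 = all)."""
    return (losses[0] - losses[-1]) / losses[0]

def train_val_gap(train_losses, val_losses):
    """Final-step gap between validation and training loss."""
    return val_losses[-1] - train_losses[-1]

train = [2.1, 1.4, 0.9, 0.55, 0.40]   # rapid drop: small/homogeneous data?
val   = [2.2, 1.6, 1.3, 1.25, 1.24]   # plateauing validation loss

print(f"initial loss:     {train[0]:.2f}")           # low -> familiar distribution
print(f"convergence rate: {convergence_rate(train):.2f}")
print(f"train-val gap:    {train_val_gap(train, val):.2f}")  # large -> overfitting
```

A high convergence rate combined with a widening train-validation gap is the signature an analyst would read as "small effective dataset, likely overfitting."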
Metric-Based Data Inference
| Metric Pattern | Inference |
|---|---|
| Low perplexity on domain-specific eval | Training data includes domain-specific content |
| Sudden accuracy improvement on specific benchmark | Dataset augmented with benchmark-relevant data |
| High variance in per-batch loss | Training data is heterogeneous or poorly shuffled |
| Evaluation metrics plateau while training loss decreases | Model memorizing training data -- small effective dataset |
| Different convergence rates across training runs | Dataset composition changed between runs |
Detecting Dataset Changes
Comparing loss curves across experiment runs reveals when training data changed:
- A run with identical hyperparameters but different loss behavior indicates a dataset modification
- Improved metrics on specific evaluation sets after a data change reveal what data was added
- Degraded metrics on previously strong benchmarks reveal what data was removed
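The first comparison above can be sketched as a grouping pass over run records. The records, run names, and divergence threshold here are invented for illustration:

```python
# Sketch: flag runs whose hyperparameters match but whose final losses
# diverge, suggesting a dataset change between runs. Records are invented.
def hparam_key(run):
    return tuple(sorted(run["hparams"].items()))

def final_loss_divergence(runs, threshold=0.1):
    """Group runs by identical hyperparameters; report groups whose
    final losses differ by more than `threshold`."""
    groups = {}
    for run in runs:
        groups.setdefault(hparam_key(run), []).append(run)
    flagged = []
    for group in groups.values():
        finals = [r["loss"][-1] for r in group]
        if max(finals) - min(finals) > threshold:
            flagged.append([r["name"] for r in group])
    return flagged

runs = [
    {"name": "run-a", "hparams": {"lr": 3e-4, "bs": 64}, "loss": [2.0, 1.1, 0.8]},
    {"name": "run-b", "hparams": {"lr": 3e-4, "bs": 64}, "loss": [2.0, 0.9, 0.5]},
    {"name": "run-c", "hparams": {"lr": 1e-4, "bs": 64}, "loss": [2.0, 1.5, 1.2]},
]
print(final_loss_divergence(runs))   # -> [['run-a', 'run-b']]
```

Runs `run-a` and `run-b` share a configuration but end 0.3 apart in loss, so something other than hyperparameters -- most plausibly the dataset -- changed between them.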
Run Names and Tags Revealing Project Intent
Naming Convention Intelligence
ML teams use descriptive names for experiments, runs, and tags. These names reveal organizational priorities, research directions, and development timelines.
Common naming patterns and what they reveal:
| Pattern | Example | Intelligence |
|---|---|---|
| Feature names | medical-qa-v3, code-gen-rust-support | Active development areas |
| Customer names | acme-corp-custom, client-x-finetune | Business relationships |
| Deadline references | launch-q2-candidate, demo-board-meeting | Timeline and priority |
| Competition | beat-gpt4-math, match-claude-coding | Competitive positioning |
| Safety research | jailbreak-resistant-v2, safety-filter-bypass-test | Safety posture and concerns |
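A sketch of screening run names pulled from a tracking API against patterns like those in the table; the regexes and example names are illustrative, not exhaustive:

```python
import re

# Sketch: classify run names by the sensitive-naming patterns described above.
# The pattern list and run names are illustrative.
SENSITIVE_PATTERNS = {
    "customer":    re.compile(r"(client|corp|custom)", re.I),
    "deadline":    re.compile(r"(launch|demo|q[1-4])", re.I),
    "competition": re.compile(r"(beat|match)-\w+", re.I),
    "security":    re.compile(r"(jailbreak|safety|bypass)", re.I),
}

def classify_run_name(name):
    """Return the labels of every pattern the run name matches."""
    return [label for label, pat in SENSITIVE_PATTERNS.items() if pat.search(name)]

for name in ["acme-corp-custom", "launch-q2-candidate", "beat-gpt4-math"]:
    print(name, "->", classify_run_name(name))
```

Run against a full export of run names, even this crude screen tends to surface the highest-signal items (customer engagements, deadlines, red-team work) in seconds.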
Tag-Based Intelligence
Tags provide categorical metadata that reveals organizational structure:
- Team tags (`team:nlp`, `team:safety`) reveal organizational structure
- Priority tags (`priority:p0`, `critical`) reveal what the organization considers important
- Status tags (`status:blocked`, `needs-review`) reveal development bottlenecks
- Deployment tags (`env:staging`, `region:eu-west`) reveal infrastructure topology
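Because such tags are conventionally namespaced with a colon, they can be bucketed mechanically. A minimal sketch, with illustrative tag values:

```python
# Sketch: split namespaced tags (team:, priority:, env:, status:) into
# categories. Tag values are illustrative.
def parse_tags(tags):
    parsed = {}
    for tag in tags:
        namespace, _, value = tag.partition(":")
        if value:                      # namespaced tag like "team:nlp"
            parsed.setdefault(namespace, []).append(value)
        else:                          # bare tag like "critical"
            parsed.setdefault("bare", []).append(tag)
    return parsed

tags = ["team:nlp", "priority:p0", "env:staging", "critical"]
print(parse_tags(tags))
# -> {'team': ['nlp'], 'priority': ['p0'], 'env': ['staging'], 'bare': ['critical']}
```

Aggregated across all runs in a project, the `team:` bucket alone reconstructs an org chart, and the `env:`/`region:` buckets sketch the deployment topology.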
Experiment Organization
How experiments are organized into projects and grouped reveals the organization's ML strategy:
- Project names indicate product lines and research areas
- The number of runs per project indicates investment level
- Active vs. inactive projects reveal strategic pivots
- Cross-references between projects reveal dependencies
Combining Metadata Signals
The real power of metadata analysis comes from combining signals across multiple dimensions. Individual data points are suggestive; correlations are conclusive.
Cross-Signal Analysis Framework
Timeline reconstruction
Order all experiments chronologically. Map the progression of architecture changes, dataset modifications, and performance improvements. This reveals the development narrative.
Resource allocation mapping
Correlate GPU hours, run durations, and experiment frequency to identify which projects receive the most investment. High-resource projects are the organization's priorities.
Dependency mapping
Trace artifact lineage across experiments. Identify which datasets feed which models, which base models are used for fine-tuning, and how the pipeline connects.
Personnel identification
If runs are tagged with user IDs or author information, map which individuals work on which projects. This reveals the team structure and key personnel.
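The resource-allocation step in the framework above reduces to a simple aggregation over run records (the records are invented for illustration):

```python
# Sketch of the resource-allocation step: aggregate GPU-hours per project
# to infer relative investment. Run records are invented.
from collections import defaultdict

def gpu_hours_by_project(runs):
    """Sum GPU-hours per project, most-funded first."""
    totals = defaultdict(float)
    for run in runs:
        totals[run["project"]] += run["gpu_hours"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

runs = [
    {"project": "medical-qa", "gpu_hours": 1200.0},
    {"project": "code-gen",   "gpu_hours": 300.0},
    {"project": "medical-qa", "gpu_hours": 800.0},
]
print(gpu_hours_by_project(runs))  # -> [('medical-qa', 2000.0), ('code-gen', 300.0)]
```

The same aggregation keyed on timestamps instead of projects yields the timeline reconstruction, and keyed on user IDs yields the personnel map.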
Intelligence Report Structure
A comprehensive metadata analysis produces:
- Architecture inventory: All model architectures in development, their specifications, and evolution over time
- Data asset map: Training datasets used, their sources (inferred from paths and descriptions), and when they were modified
- Capability roadmap: Which capabilities the organization is developing, based on evaluation metrics and project names
- Security posture: How the organization approaches safety and security, based on safety-related experiments and evaluation benchmarks
- Timeline: Development velocity, upcoming milestones, and historical pivots
Metadata Hygiene
What to Sanitize
Not all metadata needs to be logged to a shared tracking system. Implement filtering to remove:
- Absolute file paths (reveal infrastructure layout)
- Environment variables (frequently contain credentials)
- Customer-identifying information in run names
- Internal project codenames in tags
- Competitive references in descriptions
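A minimal sketch of such a filter, assuming metadata arrives as a flat dictionary; the deny-patterns are illustrative and would need tuning for a real environment:

```python
import re

# Sketch: filter metadata before it reaches a shared tracking system.
# Deny-patterns are illustrative; tune them to your own environment.
DENY_KEY_PATTERN = re.compile(r"(env|secret|token|key|password)", re.I)
ABS_PATH_PATTERN = re.compile(r"^(/|[A-Za-z]:\\)")      # POSIX or Windows paths

def sanitize_metadata(metadata):
    """Return a copy with credential-like keys and absolute-path values removed."""
    clean = {}
    for key, value in metadata.items():
        if DENY_KEY_PATTERN.search(key):
            continue                                    # drop credential-like keys
        if isinstance(value, str) and ABS_PATH_PATTERN.match(value):
            continue                                    # drop absolute paths
        clean[key] = value
    return clean

raw = {
    "learning_rate": 3e-4,
    "dataset_path": "/mnt/prod/customers/acme/train.jsonl",
    "AWS_SECRET_ACCESS_KEY": "...",
}
print(sanitize_metadata(raw))   # -> {'learning_rate': 0.0003}
```

Note this filters both directions: keys that look like credentials are dropped regardless of value, and string values that look like absolute paths are dropped regardless of key.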
What to Keep
Reproducibility requires certain metadata. The key is to log what is needed and nothing more:
- Model architecture parameters (necessary for reproducibility, accept the risk)
- Training hyperparameters (learning rate, batch size, etc.)
- Anonymized evaluation metrics
- Artifact hashes (not paths)
- Version identifiers (not descriptive names)
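Logging a content hash instead of a path is a one-liner; a sketch, with placeholder artifact bytes:

```python
import hashlib

# Sketch: log a content hash of an artifact instead of its filesystem path.
# The artifact bytes here are a placeholder.
def artifact_fingerprint(data):
    """Short, path-free identifier derived from artifact content."""
    return hashlib.sha256(data).hexdigest()[:16]

checkpoint_bytes = b"...model weights..."
print({"artifact_sha256_16": artifact_fingerprint(checkpoint_bytes)})
```

The hash preserves reproducibility (the same bytes always produce the same identifier) while revealing nothing about directory layout, customer names, or infrastructure.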
Access Tiering
Implement different access levels for different metadata sensitivity:
| Tier | Content | Access |
|---|---|---|
| Public | Published model metrics, architecture summaries | Anyone |
| Team | Full hyperparameters, loss curves, run configs | Team members |
| Restricted | Dataset paths, customer-specific runs, safety research | Project leads |
| Confidential | Credentials, infrastructure details, competitive analysis | Security team |
An attacker observes that a model's training loss drops sharply at epoch 15, then converges at a lower level. The same hyperparameters were used throughout training. What is the most likely explanation, and why does it matter?