# training-data
15 articles tagged with “training-data”
Training Data Breach Forensics
Investigating training data breaches including data extraction evidence and membership inference indicators.
Training Data Provenance Forensics
Forensic techniques for tracing the origins, lineage, and integrity of training data used in machine learning models.
Case Study: Training Data Extraction from GPT
Analysis of Carlini et al.'s work on extracting training data from ChatGPT in production.
Code Suggestion Poisoning
Overview of attacks that manipulate AI coding assistant suggestions through training data poisoning and inference-time context manipulation.
Training Data Extraction from Code Models
Techniques for recovering proprietary code from code generation model weights — covering memorization detection, targeted extraction, membership inference, and defensive countermeasures.
Advanced Model Inversion Attacks
Reconstructing training data from model weights and API access using gradient-based inversion, generative model-assisted reconstruction, and membership inference refinement.
Advanced Training Data Extraction
Advanced techniques for extracting memorized training data from language models.
Membership Inference via Embeddings
Determining if specific data was in an embedding model's training set through distance-based inference, statistical tests, and embedding behavior analysis.
Repository Poisoning for Code Models
Techniques for poisoning code repositories to influence code generation models, including training data poisoning through popular repositories, backdoor injection in open-source dependencies, and supply chain attacks targeting code model training pipelines.
Securing Storage Systems for Training Data
Attack and defense strategies for S3, GCS, HDFS, and other object storage systems holding AI training datasets and model artifacts.
Lab: Training Data Extraction at Scale
Extract memorized training data from language models using prefix-based extraction, divergence testing, and membership inference. Measure extraction rates and assess privacy risks.
Training Data Extraction from Production LLMs
Implement Carlini et al.'s techniques to extract memorized training data from production language model APIs.
Extracting Training Data
Techniques for extracting memorized training data, system prompts, and private information from LLMs through targeted querying and membership inference attacks.
RAG, Data & Training Attacks
Overview of attacks targeting the data layer of AI systems, including RAG poisoning, training data manipulation, and data extraction techniques.
Training Data Manipulation
Attacks that corrupt model behavior by poisoning training data, fine-tuning datasets, or RLHF preference data, including backdoor installation and safety alignment removal.