# data-poisoning
22 articlestagged with “data-poisoning”
Data Poisoning Assessment
Comprehensive assessment of training data poisoning, synthetic data attacks, and supply chain vulnerabilities.
RAG & Data Attack Assessment
Test your knowledge of Retrieval-Augmented Generation attack vectors, knowledge base poisoning, embedding manipulation, and data exfiltration through RAG systems with 10 intermediate-level questions.
Training Pipeline Security Assessment
Test your advanced knowledge of training pipeline attacks including data poisoning, fine-tuning hijacking, RLHF manipulation, and backdoor implantation with 9 questions.
Capstone: Training Pipeline Attack & Defense
Attack a model training pipeline through data poisoning and backdoor insertion, then build defenses to detect and prevent these attacks.
Case Study: Training Data Poisoning in Code Generation Models
Analysis of training data poisoning attacks targeting code generation models like GitHub Copilot and OpenAI Codex, where adversarial code patterns in training data cause models to suggest vulnerable or malicious code.
Data & Training Security
Security vulnerabilities in the AI data pipeline, covering RAG exploitation, training data attacks, model extraction and intellectual property theft, and privacy attacks against deployed models.
Clean-Label Data Poisoning
Deep dive into clean-label poisoning attacks that corrupt model behavior without modifying labels, including gradient-based methods, feature collision, and witches' brew attacks.
Data Poisoning Methods
Practical methodology for poisoning training datasets at scale, including crowdsource manipulation, web-scale dataset attacks, label flipping, feature collision, bilevel optimization for poison selection, and detection evasion techniques.
Training & Fine-Tuning Attacks
Methodology for data poisoning, trojan/backdoor insertion, clean-label attacks, LoRA backdoors, sleeper agent techniques, and model merging attacks targeting the LLM training pipeline.
Synthetic Data Poisoning
Attacking synthetic data generation pipelines to produce poisoned training sets, including generator manipulation, prompt poisoning, and contamination amplification.
Poisoning Fine-Tuning Datasets
Techniques for inserting backdoor triggers into fine-tuning datasets, clean-label poisoning that evades content filters, and scaling attacks across dataset sizes -- how adversarial training data compromises model behavior.
Preference Data Poisoning
How adversaries manipulate human preference data used in RLHF and DPO training -- compromising labelers, generating synthetic poisoned preferences, and attacking the preference data supply chain.
Manipulating Feature Stores
Advanced techniques for attacking feature stores used in ML systems, including feature poisoning, schema manipulation, serving layer exploitation, and integrity attacks against platforms like Feast, Tecton, and Databricks Feature Store.
CTF: RAG Infiltrator
Poison a RAG system to return attacker-controlled content for specific queries. Score based on precision of targeting and stealth of the injected documents.
Feature Poisoning Attacks
Techniques for poisoning feature store data to manipulate model behavior: direct feature value manipulation, time-travel attacks, online/offline store consistency exploitation, and targeted entity-level feature poisoning.
Indirect Prompt Injection
How attackers embed malicious instructions in external data sources that LLMs process, enabling attacks without direct access to the model's input.
RAG, Data & Training Attacks
Overview of attacks targeting the data layer of AI systems, including RAG poisoning, training data manipulation, and data extraction techniques.
Training Data Manipulation
Attacks that corrupt model behavior by poisoning training data, fine-tuning datasets, or RLHF preference data, including backdoor installation and safety alignment removal.
Data Poisoning at Scale
Techniques for poisoning training data at scale to influence model behavior across broad capabilities.
SFT Data Poisoning & Injection
Poisoning supervised fine-tuning datasets through instruction-response pair manipulation, backdoor triggers in SFT data, and determining minimum poisoned example thresholds.
Training Pipeline Security
Security of the full AI model training pipeline, covering pre-training attacks, fine-tuning and alignment manipulation, architecture-level vulnerabilities, and advanced training-time threats.
Poisoning Attacks on Synthetic Training Data
Comprehensive analysis of poisoning vectors in synthetic data generation pipelines, from teacher model manipulation to post-generation filtering evasion.