# data-poisoning

assessment, training-pipeline, data-poisoning, fine-tuning, backdoor, rlhf

Training Pipeline Security Assessment

Test your advanced knowledge of training pipeline attacks including data poisoning, fine-tuning hijacking, RLHF manipulation, and backdoor implantation with 9 questions.

capstone, training-pipeline, data-poisoning, backdoor, advanced

Capstone: Training Pipeline Attack and Defense

Attack a model training pipeline through data poisoning and backdoor insertion, then build defenses to detect and prevent these attacks.

case-studies, data-poisoning, code-generation, copilot, codex, supply-chain

Case Study: Training Data Poisoning in Code Generation Models

Analysis of training data poisoning attacks targeting code generation models like GitHub Copilot and OpenAI Codex, where adversarial code patterns in training data cause models to suggest vulnerable or malicious code.

data-security, rag, training-attacks, model-extraction, privacy, data-poisoning

Data and Training Security

Security vulnerabilities in the AI data pipeline, covering RAG exploitation, training data attacks, model extraction and intellectual property theft, and privacy attacks against deployed models.

Beginner

Clean-Label Data Poisoning

Deep dive into clean-label poisoning attacks that corrupt model behavior without modifying labels, including gradient-based methods, feature collision, and Witches' Brew attacks.

clean-label, data-poisoning, gradient-based, feature-collision, backdoor

data-poisoning, training, clean-label, feature-collision, bilevel-optimization, detection-evasion

Data Poisoning Methods

Practical methodology for poisoning training datasets at scale, including crowdsource manipulation, web-scale dataset attacks, label flipping, feature collision, bilevel optimization for poison selection, and detection evasion techniques.

training, fine-tuning, data-poisoning, backdoor, trojan, lora, sleeper-agent, model-merging

Training and Fine-Tuning Attacks

Methodology for data poisoning, trojan/backdoor insertion, clean-label attacks, LoRA backdoors, sleeper agent techniques, and model merging attacks targeting the LLM training pipeline.

synthetic-data, data-poisoning, generation, contamination, pipeline-attacks

Synthetic Data Poisoning

Attacking synthetic data generation pipelines to produce poisoned training sets, including generator manipulation, prompt poisoning, and contamination amplification.

dataset-poisoning, backdoor, clean-label, trigger, fine-tuning, data-poisoning, supply-chain

Poisoning Fine-Tuning Datasets

Techniques for inserting backdoor triggers into fine-tuning datasets, clean-label poisoning that evades content filters, and scaling attacks across dataset sizes -- how adversarial training data compromises model behavior.

preference-poisoning, rlhf, dpo, data-poisoning, human-feedback, labeler-attack, alignment

Preference Data Poisoning

How adversaries manipulate human preference data used in RLHF and DPO training -- compromising labelers, generating synthetic poisoned preferences, and attacking the preference data supply chain.

supply-chain, owasp, risk-assessment, model-poisoning, data-poisoning, dependencies, defense

AI Supply Chain Security Overview

Comprehensive overview of the AI/ML supply chain attack surface, covering model poisoning, data poisoning, dependency attacks, and risk assessment frameworks aligned with OWASP LLM03:2025.

feature-store, data-poisoning, feast, tecton, ml-infrastructure

Manipulating Feature Stores

Advanced techniques for attacking feature stores used in ML systems, including feature poisoning, schema manipulation, serving-layer exploitation, and integrity attacks against platforms such as Feast, Tecton, and Databricks Feature Store.

supply-chain, data-poisoning, owasp-llm04, validation, provenance, anomaly-detection, defense

Training Data Integrity

Defense-focused guide to ensuring training data has not been poisoned, covering label flipping, backdoor insertion, clean-label attacks, data validation pipelines, provenance tracking, and anomaly detection.

ctf, rag, infiltration, data-poisoning

CTF: RAG Infiltrator

Poison a RAG system to return attacker-controlled content for specific queries. Score based on precision of targeting and stealth of the injected documents.

feature-poisoning, data-poisoning, feature-store, time-travel, online-offline-skew, targeted-attacks

Feature Poisoning Attacks

Techniques for poisoning feature store data to manipulate model behavior: direct feature value manipulation, time-travel attacks, online/offline store consistency exploitation, and targeted entity-level feature poisoning.

prompt-injection, indirect-injection, rag, data-poisoning, supply-chain

Indirect Prompt Injection

How attackers hide malicious instructions in external data sources that LLMs process, enabling attacks without direct access to the model's input.

rag, data-poisoning, training-data, data-extraction, ai-security

RAG, Data, and Training Attacks

Overview of attacks targeting the data layer of AI systems, including RAG poisoning, training data manipulation, and data extraction techniques.

Beginner

Training Data Manipulation

Attacks that corrupt model behavior by poisoning training data, fine-tuning datasets, or RLHF preference data, including installing backdoors and removing safety alignment.

training-data, data-poisoning, backdoors, fine-tuning, alignment

training-pipeline, data-poisoning, scale, attacks

Data Poisoning at Scale

Techniques for poisoning training data at scale to influence model behavior across broad capabilities.

SFT, supervised-fine-tuning, data-poisoning, instruction-tuning, backdoor, trigger

SFT Data Poisoning & Injection

Poisoning supervised fine-tuning datasets by manipulating instruction-response pairs, planting backdoor triggers in SFT data, and determining the minimum threshold of poisoned examples.

training, pre-training, fine-tuning, architecture, data-poisoning, rlhf, alignment

Training Pipeline Security

Security of the full AI model training pipeline, covering pre-training attacks, fine-tuning and alignment manipulation, architecture-level vulnerabilities, and advanced threats during training.

Beginner

Poisoning Attacks on Synthetic Training Data

Comprehensive analysis of poisoning vectors in synthetic data generation pipelines, from teacher-model manipulation to evading post-generation filtering.

training-pipeline, synthetic-data, data-poisoning, supply-chain