# provenance
15 articles tagged with “provenance”
LLM Output Watermark Detection
Techniques for detecting, extracting, and analyzing watermarks embedded in LLM-generated text for provenance tracking and forensic attribution.
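A common detection approach is the greenlist statistic: the watermarker biases generation toward a pseudorandom "green" half of the vocabulary seeded by the preceding token, and the detector recomputes that partition and runs a one-proportion z-test. A minimal sketch, assuming this Kirchenbauer-style scheme; the hash-based split and all function names are illustrative:

```python
import hashlib

def is_green(prev_token: str, token: str) -> bool:
    """Recompute the pseudorandom green/red vocabulary split seeded by the previous token."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0  # token lands in the "green" half

def green_fraction(tokens: list[str]) -> float:
    """Fraction of tokens in the green list; ~0.5 for unwatermarked text."""
    hits = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)

def z_score(tokens: list[str], gamma: float = 0.5) -> float:
    """One-proportion z-test against the null hypothesis of no watermark."""
    n = len(tokens) - 1
    if n <= 0:
        return 0.0
    return (green_fraction(tokens) - gamma) * (n ** 0.5) / ((gamma * (1 - gamma)) ** 0.5)
```

A large positive z-score over a long passage is evidence the text was generated under the watermark; short texts give weak statistics regardless of scheme.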
Capstone: Build an AI Supply Chain Security Tool
Build a tool that scans, audits, and monitors the security of AI/ML supply chains including model provenance, dependency integrity, and artifact verification.
July 2026: Supply Chain Audit Challenge
Audit an ML project's entire supply chain for security issues including dependencies, model provenance, data pipelines, training infrastructure, and deployment artifacts.
Data Provenance Attacks
Compromising training data integrity by attacking provenance tracking systems, falsifying data lineage, and manipulating data pipeline metadata.
Watermark & Fingerprint Evasion
Deep dive into detecting and removing output watermarks, degrading weight watermarks, evading model fingerprinting, building provenance-stripping pipelines, and understanding the legal landscape of model ownership verification.
AI Watermarking and Attacks
Current AI watermarking schemes for model outputs and training data, their security properties, and known attacks that remove, forge, or evade watermarks.
Output Provenance Tracking
Systems for tracking the provenance of LLM outputs to detect hallucination, data leakage, and instruction injection.
Watermarking LLM Outputs for Provenance
Advanced techniques for watermarking LLM-generated text to establish provenance, including deployment architectures, multi-bit encoding schemes, robustness considerations, and the role of watermarking in AI security and accountability frameworks.
Output Watermarking as Defense
Using output watermarking for content provenance tracking and misuse detection in LLM applications.
ML Artifact Integrity
Ensuring integrity of ML artifacts throughout the pipeline: hash verification strategies, signed artifact workflows, reproducible builds for ML, deterministic training challenges, and end-to-end artifact provenance.
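The hash-verification strategy above reduces to recording digests at build time and re-checking them before use. A minimal sketch, assuming a manifest-based pipeline; the JSON manifest format and function names are illustrative, not a specific tool's convention:

```python
import hashlib
import json
from pathlib import Path

def sha256_file(path: Path) -> str:
    """Stream a file through SHA-256 in 1 MiB chunks (safe for multi-GB weights)."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def write_manifest(artifacts: list[Path], manifest: Path) -> None:
    """Record the expected digest of each artifact at build time."""
    manifest.write_text(json.dumps({p.name: sha256_file(p) for p in artifacts}, indent=2))

def verify_manifest(artifacts: list[Path], manifest: Path) -> list[str]:
    """Return names of artifacts whose current digest no longer matches the manifest."""
    expected = json.loads(manifest.read_text())
    return [p.name for p in artifacts if expected.get(p.name) != sha256_file(p)]
```

The manifest itself then becomes the thing to protect, which is where signing (below in this list) comes in.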
Model Signing and Provenance
Cryptographic signing for ML models: Sigstore for ML artifacts, cosign for model weights, SLSA framework applied to ML pipelines, supply chain levels for model provenance, and practical implementation of model artifact verification.
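The sign-and-verify flow for a model artifact can be sketched in a few lines. Here HMAC over the artifact digest stands in for the asymmetric, certificate-backed signatures that Sigstore/cosign actually use; the flow, not the primitive, is the point:

```python
import hashlib
import hmac

def artifact_digest(data: bytes) -> bytes:
    """Digest of the model artifact; signing the digest avoids re-hashing large weights."""
    return hashlib.sha256(data).digest()

def sign(key: bytes, data: bytes) -> bytes:
    """Produce a signature over the artifact digest (HMAC stand-in for a real scheme)."""
    return hmac.new(key, artifact_digest(data), hashlib.sha256).digest()

def verify(key: bytes, data: bytes, signature: bytes) -> bool:
    """Constant-time check that the artifact still matches its signature."""
    return hmac.compare_digest(sign(key, data), signature)
```

With cosign, the analogous operations are the `sign-blob` and `verify-blob` subcommands run against the serialized weights file.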
Synthetic Data Pipeline Attacks
Attacks on synthetic data generation pipelines: model collapse from synthetic feedback loops, poisoning synthetic data generators, quality control bypass, and data provenance attacks.
Watermark Removal Techniques
Techniques for removing AI watermarks: paraphrasing attacks, token substitution, embedding space perturbation, and implications for model provenance and accountability.
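Token substitution attacks work because greenlist membership depends on exact token identity: swapping a word for a synonym rerolls its green/red assignment, pushing the detector's statistic back toward chance. A toy sketch under that assumption; the synonym table is a stand-in for the paraphrase models real attacks use:

```python
import hashlib
import random

# Toy synonym table; real attacks use a paraphrasing model, not a lookup.
SYNONYMS = {"quick": "fast", "happy": "glad", "big": "large", "small": "tiny"}

def is_green(prev_token: str, token: str) -> bool:
    """Detector's recomputation of the green/red vocabulary split."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(tokens: list[str]) -> float:
    """Detector statistic: fraction of green tokens (~0.5 after a successful attack)."""
    hits = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)

def substitute(tokens: list[str], rate: float = 0.5, seed: int = 0) -> list[str]:
    """Swap a fraction of tokens for synonyms, scrambling their green/red membership."""
    rng = random.Random(seed)
    return [SYNONYMS[t] if t in SYNONYMS and rng.random() < rate else t
            for t in tokens]
```

Comparing `green_fraction` on watermarked text before and after `substitute` shows the statistic regressing toward 0.5 as the substitution rate rises.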
Training Data Provenance Attacks
Attacking training data provenance and attribution systems to inject unverified data sources.
Hugging Face Security Audit Walkthrough
Step-by-step walkthrough for auditing Hugging Face models: scanning for malicious model files, verifying model provenance, assessing model card completeness, and testing Spaces and Inference API security.