# extraction
25 articles tagged with “extraction”
Prompt Leakage Assessment
Assessment of system prompt extraction techniques including direct probing, logprob analysis, and side-channel methods.
Case Study: Training Data Extraction from GPT
Analysis of the Carlini et al. work on extracting training data from ChatGPT in production.
Prompt Extraction from Code Generation Tools
Techniques for extracting system prompts, custom instructions, and proprietary configurations from AI code generation tools.
Training Data Extraction from Code Models
Techniques for recovering proprietary code from code generation model weights — covering memorization detection, targeted extraction, membership inference, and defensive countermeasures.
January 2026: System Prompt Extraction Challenge
Extract system prompts from five increasingly defended chatbots, progressing from unprotected to heavily hardened configurations.
Knowledge Distillation Attacks
Attacking knowledge distillation pipelines to transfer backdoors from teacher to student models or extract proprietary capabilities.
Model Memorization Patterns
Understanding when and why language models memorize training data, and techniques for detecting and exploiting memorization behavior.
PII Extraction Techniques
Techniques for extracting personally identifiable information from trained language models including prompt-based extraction, prefix attacks, targeted queries, and real-world examples.
Advanced Training Data Extraction
Advanced techniques for extracting memorized training data from language models.
Embedding Extraction Techniques
Methods for extracting embedding model weights and behavior through API access, including dimension reduction and reconstruction attacks.
Lab: Safety Classifier Model Extraction
Extract the decision boundary of safety classifiers through systematic probing to craft maximally evasive payloads.
Lab: Training Data Extraction at Scale
Extract memorized training data from language models using prefix-based extraction, divergence testing, and membership inference. Measure extraction rates and assess privacy risks.
Training Data Extraction from Production LLMs
Implement Carlini et al.'s techniques to extract memorized training data from production language model APIs.
Lab: Prompt Leaking Basics
Learn to extract system prompts from chatbots using common leaking techniques including direct requests, indirect elicitation, and format manipulation.
System Prompt Extraction Fundamentals
Learn to extract hidden system prompts from LLM applications using direct and indirect techniques.
Lab: System Prompt Extraction
Hands-on techniques for extracting hidden system prompts from LLM-powered applications using direct asking, role-play, instruction replay, and other methods.
Lab: System Prompt Reconstruction
Use various techniques to systematically extract and reconstruct the full system prompt of an LLM application, combining direct, indirect, and incremental extraction methods.
Context Heist: System Prompt Extraction Championship
Extract system prompts from 10 progressively harder LLM applications using diverse techniques.
Guardrail Rule Extraction Lab
Systematically extract the rules and decision boundaries of guardrail systems through targeted probing.
System Prompt Theft Techniques
Comprehensive techniques for extracting hidden system prompts from production LLM applications, from simple direct requests to sophisticated indirect methods.
System Prompt Extraction
Techniques for extracting hidden system prompts from AI applications, revealing safety rules, tool definitions, behavioral constraints, and sensitive configuration.
System Prompt Extraction Techniques
Catalog of system prompt extraction methods against LLM-powered applications: direct attacks, indirect techniques, multi-turn strategies, and defensive evasion.
Stealth Data Extraction Techniques
Stealthy techniques for extracting sensitive data from AI systems without triggering alerts.
Advanced Prompt Leaking Walkthrough
Advanced techniques for extracting system prompts including iterative reconstruction and side-channel methods.
Prompt Leaking Step by Step
Systematic approaches to extract system prompts from LLM applications, covering direct elicitation, indirect inference, differential analysis, and output-based reconstruction.