# data-extraction
14 articles tagged "data-extraction"
Embedding Inversion Attacks
Techniques for reconstructing input text from embedding vectors: model-specific inversion methods, privacy implications, and defenses against embedding inversion.
CTF: RAG Heist
Extract sensitive information from a Retrieval-Augmented Generation system by exploiting retrieval mechanisms, document parsing, embedding manipulation, and context window management vulnerabilities.
GPT-4 Known Vulnerabilities
Documented GPT-4 vulnerabilities including DAN jailbreaks, data extraction incidents, system prompt leaks, tool-use exploits, and fine-tuning safety removal.
Extracting Training Data
Techniques for extracting memorized training data, system prompts, and private information from LLMs through targeted querying and membership inference attacks.
Embedding Inversion
Recovering original text from embedding vectors, privacy implications of stored embeddings, model inversion attacks on vector databases, and embedding space analysis techniques.
RAG, Data & Training Attacks
Overview of attacks targeting the data layer of AI systems, including RAG poisoning, training data manipulation, and data extraction techniques.
Attack Payload Reference
Categorized reference of common attack payloads for AI red teaming, including prompt injection, jailbreaks, data extraction, and adversarial inputs with effectiveness notes.
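Embedding Inversion Attacks
Techniques for reconstructing input text from embedding vectors: model-specific inversion methods, privacy implications, and defenses against embedding inversion.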
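CTF: RAG Heist
Extract sensitive information from a Retrieval-Augmented Generation (RAG) system by exploiting vulnerabilities in retrieval mechanisms, document parsing, embedding manipulation, and context window management.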
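GPT-4 Known Vulnerabilities
Documented GPT-4 vulnerabilities, including DAN jailbreaks, data extraction incidents, system prompt leaks, tool-use exploits, and fine-tuning safety removal.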
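Extracting Training Data
Techniques for extracting memorized training data, system prompts, and private information from LLMs through targeted querying and membership inference attacks.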
Embedding Inversion
Recovering original text from embedding vectors, privacy implications of stored embeddings, model inversion attacks on vector databases, and embedding space analysis techniques.
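RAG, Data & Training Attacks
Overview of attacks targeting the data layer of AI systems, including RAG poisoning, training data manipulation, and data extraction techniques.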
Attack Payload Reference
Categorized reference of common attack payloads for AI red teaming, including prompt injection, jailbreaks, data extraction, and adversarial inputs with effectiveness notes.