# privacy
85 articles tagged "privacy"
Memory Exfiltration
Techniques for extracting data from AI agent memory systems, including recovering previous conversations, revealing other users' data, and leaking information across sessions.
Data Breach Investigation for AI Systems
Investigating data breaches involving AI systems including training data exposure, model memorization exploitation, and embedding inversion attacks.
Data Privacy in AI Assessment
Assessment covering training data privacy, membership inference, data extraction, and privacy-preserving techniques.
Embedding & Vector Security Assessment
Test your understanding of embedding inversion attacks, vector database security, similarity search manipulation, and privacy risks of stored embeddings with 10 questions.
Privacy Attack Assessment
Test your advanced knowledge of privacy attacks against AI systems including data leakage, PII extraction, differential privacy failures, and inference-time privacy risks with 9 questions.
Case Study: Samsung ChatGPT Data Leak
Analysis of the Samsung confidential code leak through ChatGPT and organizational AI policy implications.
Case Study: Training Data Extraction from GPT
Analysis of the Carlini et al. work on extracting training data from ChatGPT in production.
Facial Recognition Bias and Failures
Analysis of facial recognition AI incidents including racial and gender bias in commercial systems, wrongful arrests, privacy violations, and the security implications of adversarial attacks on biometric AI.
ChatGPT Data Leak (March 2023)
Analysis of the March 2023 ChatGPT incident where a Redis client library bug caused users to see other users' conversation titles, partial chat history, and payment information. Covers root cause, impact, and lessons for AI application security.
Dataset Inference Attacks
Inferring properties of the training dataset through model behavior analysis.
Data & Training Security
Security vulnerabilities in the AI data pipeline, covering RAG exploitation, training data attacks, model extraction and intellectual property theft, and privacy attacks against deployed models.
Membership Inference Defenses
Evaluating and implementing defenses against membership inference attacks that determine whether specific samples were in a model's training set.
Practical Membership Inference Attacks
Practical guide to conducting membership inference attacks against deployed language models.
Model Inversion Attacks
Recovering sensitive training data features through model inversion techniques.
Model Memorization Patterns
Understanding when and why language models memorize training data, and techniques for detecting and exploiting memorization behavior.
Privacy & Data Protection Attacks
Overview of privacy attacks against AI systems including data extraction, membership inference, and model inversion, with regulatory implications and red team methodology.
Membership Inference Attacks
Techniques for determining whether specific data was used to train an AI model, including shadow model approaches, loss-based inference, LiRA, and practical implementation guidance.
Advanced Model Inversion Attacks
Reconstructing training data from model weights and API access using gradient-based inversion, generative model-assisted reconstruction, and membership inference refinement.
Model Inversion Attacks (Data Training)
Techniques for reconstructing training data from model outputs including gradient-based inversion, generative model inversion, and privacy implications for image and language models.
PII Extraction Techniques
Techniques for extracting personally identifiable information from trained language models including prompt-based extraction, prefix attacks, targeted queries, and real-world examples.
Privacy Attacks on Embeddings
Recovering sensitive information from embedding vectors through inversion attacks, attribute inference, and reconstruction techniques.
Synthetic Data Security Risks
Security implications of using synthetic data for model training, including inherited biases, poisoning propagation, and privacy leakage.
AI Logging Architecture
What to capture in AI system logs — prompts, completions, latency, tokens, tool calls — along with storage strategies, retention policies, and privacy considerations.
Embedding Privacy
What embeddings reveal about source data — covering embedding inversion attacks, membership inference, attribute inference, privacy-preserving embedding techniques, and regulatory implications.
Embedding-Level Attacks
Overview of attacks targeting embeddings directly: adversarial embedding generation, inversion attacks for text reconstruction, and membership inference via embedding analysis.
Embedding Inversion Attacks
Techniques for reconstructing input text from embedding vectors: model-specific inversion methods, privacy implications, and defenses against embedding inversion.
Membership Inference via Embeddings
Determining whether specific data was in an embedding model's training set through distance-based inference, statistical tests, and embedding behavior analysis.
Embedding Inversion Attacks (Embedding Vector Security)
Recovering original text from embedding vectors using inversion techniques.
Data Exfiltration via Vector Databases
Data exfiltration techniques for vector databases: embedding inversion to reconstruct documents, enumeration attacks, and similarity-based data harvesting.
Machine Unlearning Attacks and Defenses
Attacking and evaluating machine unlearning techniques used to remove sensitive data from trained models.
Machine Unlearning Security Research
Research on attacks against machine unlearning methods and verification of knowledge removal.
AI Data Governance and Security
Data governance practices specific to AI systems including training data provenance, access controls, and retention.
Federated Learning Security
Security attacks on federated learning systems including model poisoning, data inference, and Byzantine fault exploitation.
GPU Memory Side-Channel Attacks
Side-channel attacks exploiting GPU memory allocation, timing, and electromagnetic emanation to extract sensitive data from AI workloads.
Lab: Training Data Extraction at Scale
Extract memorized training data from language models using prefix-based extraction, divergence testing, and membership inference. Measure extraction rates and assess privacy risks.
Embedding Inversion Attack Implementation
Implement embedding inversion to recover original text from vector database embeddings.
Simulation: Voice Assistant Red Team
Red team engagement simulation targeting an AI voice assistant deployed in a smart home platform, covering audio-based prompt injection, wake word exploitation, and privacy exfiltration.
Data Harvesting via Injection
Using injection techniques to extract training data, system prompts, user data, and other sensitive information from LLM applications.
Extracting Training Data
Techniques for extracting memorized training data, system prompts, and private information from LLMs through targeted querying and membership inference attacks.
Embedding Inversion
Recovering original text from embedding vectors, privacy implications of stored embeddings, model inversion attacks on vector databases, and embedding space analysis techniques.
Federated Learning Poisoning (Training Pipeline)
Federated learning architecture vulnerabilities: Byzantine attacks, model replacement, gradient manipulation, and techniques for poisoning global models through malicious participants.
Machine Unlearning Attacks
Exploiting machine unlearning processes: recovering supposedly forgotten data, unlearning verification attacks, partial unlearning exploitation, and the fundamental limits of forgetting in neural networks.
Security of Training Data Attribution Methods
Analysis of vulnerabilities in training data attribution techniques including influence functions, membership inference, and data provenance tracking, with implications for privacy and security.
PII Redaction Pipeline
Step-by-step walkthrough for building an automated PII detection and redaction pipeline for LLM outputs, covering regex-based detection, NER-based detection, Presidio integration, redaction strategies, and compliance testing.
Memory Exfiltration
Techniques for extracting data from AI agent memory systems, including recovering previous conversations, revealing other users' data, and leaking information across sessions.
Data Breach Investigation for AI Systems
Investigating data breaches involving AI systems including training data exposure, model memorization exploitation, and embedding inversion attacks.
Data Privacy in AI Assessment
Assessment covering training data privacy, membership inference, data extraction, and privacy-preserving techniques.
Chapter Assessment: Privacy Attacks
A 15-question calibrated assessment testing your understanding of privacy attacks in AI systems: PII extraction, membership inference, and model inversion.
Case Study: Samsung ChatGPT Data Leak
Analysis of the Samsung confidential code leak through ChatGPT and organizational AI policy implications.
Case Study: Training Data Extraction from GPT
Analysis of the Carlini et al. work on extracting training data from ChatGPT in production.
Facial Recognition Security Cases
Case studies of security incidents in facial recognition AI, covering bias and discrimination, adversarial attacks, privacy violations, and surveillance abuse.
Dataset Inference Attacks
Inferring properties of the training dataset through model behavior analysis.
Data & Training Security
Security vulnerabilities in the AI data pipeline, covering RAG exploitation, training data attacks, model extraction and intellectual property theft, and privacy attacks against deployed models.
Membership Inference Defenses
Evaluating and implementing defenses against membership inference attacks that determine whether specific samples were in a model's training set.
Practical Membership Inference Attacks
Practical guide to conducting membership inference attacks against deployed language models.
Model Inversion Attacks
Recovering sensitive training data features through model inversion techniques.
Model Memorization Patterns
Understanding when and why language models memorize training data, and techniques for detecting and exploiting memorization behavior.
Privacy & Data Protection Attacks
Overview of privacy attacks against AI systems including data extraction, membership inference, and model inversion, with regulatory implications and red team methodology.
Membership Inference Attacks
Techniques for determining whether specific data was used to train an AI model, including shadow model approaches, loss-based inference, LiRA, and practical implementation guidance.
Advanced Model Inversion Attacks
Reconstructing training data from model weights and API access using gradient-based inversion, generative model-assisted reconstruction, and membership inference refinement.
Model Inversion Attacks (Data Training)
Techniques for reconstructing training data from model outputs including gradient-based inversion, generative model inversion, and privacy implications for image and language models.
PII Extraction Techniques
Techniques for extracting personally identifiable information from trained language models including prompt-based extraction, prefix attacks, targeted queries, and real-world examples.
Privacy Attacks on Embeddings
Recovering sensitive information from embedding vectors through inversion attacks, attribute inference, and reconstruction techniques.
Synthetic Data Security Risks
Security implications of using synthetic data for model training, including inherited biases, poisoning propagation, and privacy leakage.
AI Logging Architecture
What to capture in AI system logs — prompts, completions, latency, tokens, tool calls — along with storage strategies, retention policies, and privacy considerations.
Embedding-Level Attacks
Overview of attacks targeting embeddings directly: adversarial embedding generation, inversion attacks for text reconstruction, and membership inference via embedding analysis.
Embedding Inversion Attacks
Techniques for reconstructing input text from embedding vectors: model-specific inversion methods, privacy implications, and defenses against embedding inversion.
Membership Inference via Embeddings
Determining whether specific data was in an embedding model's training set through distance-based inference, statistical tests, and embedding behavior analysis.
Embedding Inversion Attacks (Embedding Vector Security)
Recovering original text from embedding vectors using inversion techniques.
Data Exfiltration via Vector Databases
Data exfiltration techniques for vector databases: embedding inversion to reconstruct documents, enumeration attacks, and similarity-based data harvesting.
Machine Unlearning Attacks and Defenses
Attacking and evaluating machine unlearning techniques used to remove sensitive data from trained models.
Machine Unlearning Security Research
Research on attacks against machine unlearning methods and verification of knowledge removal.
AI Data Governance and Security
Data governance practices specific to AI systems including training data provenance, access controls, and retention.
Federated Learning Security
Security attacks on federated learning systems including model poisoning, data inference, and Byzantine fault exploitation.
GPU Memory Side-Channel Attacks
Side-channel attacks exploiting GPU memory allocation, timing, and electromagnetic emanation to extract sensitive data from AI workloads.
Lab: Training Data Extraction at Scale
Extract memorized training data from language models using prefix-based extraction, divergence testing, and membership inference. Measure extraction rates and assess privacy risks.
Embedding Inversion Attack Implementation
Implement embedding inversion to recover original text from vector database embeddings.
Simulation: Voice Assistant Red Team
Red team engagement simulation targeting an AI voice assistant deployed in a smart home platform, covering audio-based prompt injection, wake word exploitation, and privacy exfiltration.
Data Harvesting via Injection
Using injection techniques to extract training data, system prompts, user data, and other sensitive information from LLM applications.
Extracting Training Data
Techniques for extracting memorized training data, system prompts, and private information from LLMs through targeted querying and membership inference attacks.
Embedding Inversion
Recovering original text from embedding vectors, privacy implications of stored embeddings, model inversion attacks on vector databases, and embedding space analysis techniques.
Federated Learning Poisoning (Training Pipeline)
Federated learning architecture vulnerabilities: Byzantine attacks, model replacement, gradient manipulation, and techniques for poisoning global models through malicious participants.
Machine Unlearning Attacks
Exploiting machine unlearning processes: recovering supposedly forgotten data, unlearning verification attacks, partial unlearning exploitation, and the fundamental limits of forgetting in neural networks.
Security of Training Data Attribution Methods
Analysis of vulnerabilities in training data attribution techniques including influence functions, membership inference, and data provenance tracking, with implications for privacy and security.
PII Redaction Pipeline
Step-by-step walkthrough for building an automated PII detection and redaction pipeline for LLM outputs, covering regex-based detection, NER-based detection, Presidio integration, redaction strategies, and compliance testing.