Embedding Privacy
What embeddings reveal about source data — covering embedding inversion attacks, membership inference, attribute inference, privacy-preserving embedding techniques, and regulatory implications.
Organizations frequently embed sensitive text — customer communications, medical records, legal documents, financial reports — into vector databases for semantic search and retrieval. A common assumption is that embeddings are one-way transformations: like cryptographic hashes, they cannot be reversed to recover the source text. This assumption is wrong.
Research has demonstrated that embeddings can be partially or fully inverted to recover source text, that the presence of specific documents in an embedding database can be determined through membership inference, and that sensitive attributes of the source data can be inferred from embeddings without recovering the full text.
This page covers the privacy risks of embeddings and techniques for mitigating them.
What Embeddings Reveal
Information Content
An embedding vector is a compressed representation of the semantic content of its source text. A typical embedding (768 or 1536 dimensions, 32-bit floating point) has a raw storage capacity of 24,576 to 49,152 bits. The source text for a short document might contain a few hundred to a few thousand bits of semantic information, so the embedding has more than enough capacity to encode most of the semantic content of the source text.
The key question is not whether the information is in the embedding — it usually is — but whether it can be extracted by an attacker. The answer depends on the attacker's access to the embedding model, the size and structure of the embedding, and the domain and vocabulary of the source text.
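The capacity comparison above can be made concrete with a back-of-the-envelope calculation. The per-word figure below (log2 of a ~2,000-word working vocabulary) is an assumed, illustrative estimate, not a measured quantity:

```python
# Raw storage capacity of a typical embedding vs. a rough estimate of the
# semantic information in a short source text. The ~11 bits/word figure
# (log2 of a ~2,048-word working vocabulary) is an illustrative assumption.
import math

dims, bits_per_dim = 1536, 32
embedding_bits = dims * bits_per_dim             # 49,152 bits of raw capacity

words = 200                                      # a short document
bits_per_word = math.log2(2048)                  # 11 bits per word
text_bits = words * bits_per_word                # 2,200 bits

print(embedding_bits, int(text_bits))
print(embedding_bits > text_bits)   # True: ample capacity to encode the text
```

Even under generous assumptions about the text's information content, the embedding's capacity exceeds it by an order of magnitude.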
Types of Information Leakage
Semantic content: The intended purpose of embeddings — they encode what the text is about. An embedding of a medical record encodes that it discusses specific conditions, treatments, and symptoms. An embedding of a financial report encodes revenue figures, risk assessments, and strategic plans.
Structural information: Embeddings encode not just what is said but how it is said. Writing style, formality level, document type, and structural patterns are reflected in the embedding. This can reveal whether a document is a personal email, a legal filing, or a technical specification.
Named entities: Embeddings of text containing names, organizations, locations, and other named entities encode these entities in the embedding space. Partial inversion may recover entity names without recovering the full text.
Demographic indicators: Text written by or about specific demographic groups may produce embeddings with statistically identifiable characteristics. This enables attribute inference attacks where an attacker determines demographic attributes of the text's author or subject from the embedding alone.
Inversion Attacks
White-Box Inversion
When the attacker has access to the embedding model (open-source models or leaked model weights), they can train an inversion model — a decoder that takes an embedding vector and produces text that would generate that embedding.
The inversion model is trained by generating embeddings for a large text corpus using the target embedding model, then training a text generation model to reverse the process. The generator learns the mapping from embedding space back to text space.
White-box inversion can recover significant portions of the source text, especially for shorter documents. Research has demonstrated recovery of 50-90% of words from short text passages using modern inversion techniques against sentence-transformers models. Longer documents are harder to invert because the embedding compresses more information, but key phrases, named entities, and topical content can still be recovered.
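The structure of the attack can be illustrated with a deliberately simplified toy: here the "embedding model" is just a frozen random linear projection of a bag-of-words vector, and the "inversion model" is a least-squares decoder. Real encoders are nonlinear and real inversion models are trained text generators, so treat this purely as a sketch of the attack's three steps:

```python
import numpy as np

# Toy white-box inversion. The "embedding model" is a frozen random linear
# projection of a bag-of-words vector; the inversion model is a least-squares
# decoder fit on (embedding, bag-of-words) pairs.
rng = np.random.default_rng(0)
vocab, dim = 100, 128                    # toy vocabulary and embedding size

W = rng.normal(size=(vocab, dim))        # frozen "embedding model" weights

def embed(bow):                          # bag-of-words -> embedding
    return bow @ W

# Step 1: attacker embeds a large corpus with the open/leaked model.
corpus = (rng.random((500, vocab)) < 0.05).astype(float)
E = embed(corpus)

# Step 2: attacker fits an inversion model mapping embeddings back to words.
D, *_ = np.linalg.lstsq(E, corpus, rcond=None)

# Step 3: attack a target embedding the attacker did not produce.
target = np.zeros(vocab)
target[rng.choice(vocab, 5, replace=False)] = 1.0
recovered = embed(target) @ D

top5 = set(np.argsort(recovered)[-5:])
overlap = len(top5 & set(np.flatnonzero(target))) / 5
print(f"recovered word overlap: {overlap:.2f}")
```

Because the embedding dimension here exceeds the vocabulary size, the linear decoder recovers the words essentially exactly, echoing the capacity point made earlier; real text and real encoders make the problem harder but, as the research above shows, far from impossible.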
Black-Box Inversion
When the attacker only has API access to the embedding model, inversion is harder but not impossible. The attacker can query the API to build a mapping between known texts and their embeddings. Then, given a target embedding, the attacker searches for texts whose embeddings are closest to the target.
This approach is limited by API rate limits and costs, but it can be effective for constrained domains. If the attacker knows that the source text comes from a specific domain (medical records, legal documents, customer emails), they can build a domain-specific inversion database that covers the most likely vocabulary and patterns.
Nearest Neighbor Inversion
The simplest inversion technique finds the nearest known embedding to the target embedding in a reference database. If the attacker can build a large database of text-embedding pairs from the same domain as the target, the nearest neighbor is often semantically similar to the source text.
This technique does not recover the exact source text, but it reveals the topic, sentiment, and key concepts. For many privacy threat models, this level of information recovery is sufficient to constitute a breach.
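A minimal sketch of the lookup, which also illustrates the black-box setting described above (the attacker only needs query access to produce the reference embeddings). The `embed` function here is a toy stand-in for any embedding API, and the reference texts are invented examples:

```python
import numpy as np

# Nearest-neighbor inversion: given only a target embedding and a reference
# database of (text, embedding) pairs from the same domain, return the
# closest known text. embed() is a toy stand-in for a real embedding API.
def embed(text):
    v = np.zeros(64)
    for w in text.lower().split():
        v[hash(w) % 64] += 1.0           # hash words into a dense vector
    return v / (np.linalg.norm(v) or 1.0)

reference = [
    "patient reports chest pain and shortness of breath",
    "quarterly revenue increased by twelve percent",
    "the defendant filed a motion to dismiss",
]
ref_vecs = np.stack([embed(t) for t in reference])

# The attacker sees only this vector, not the text that produced it.
target_vec = embed("patient has chest pain and trouble breathing")

sims = ref_vecs @ target_vec             # cosine similarity (unit vectors)
best = reference[int(np.argmax(sims))]
print(best)   # the closest reference text reveals the source's topic
```

The match is not the source text, but it immediately reveals that the source discusses a patient with chest pain, which for many threat models is the breach itself.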
Membership Inference
Membership inference determines whether a specific document is in the embedding database without recovering any document content. This is a distinct privacy concern: knowing that a specific document (a medical record, a legal filing, a financial report) is in a database reveals information even without knowing what the database contains.
Direct Membership Inference
If the attacker has query access to the embedding database, they can embed the target document and search for exact or near-exact matches. A match indicates membership. This is straightforward but requires the attacker to know the candidate document.
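The check itself is a one-liner over the database; the similarity threshold below is an assumed tuning parameter:

```python
import numpy as np

# Direct membership check: embed the candidate document and look for a
# near-exact match in the database. All vectors are unit-normalized, so the
# dot product is cosine similarity; threshold is an assumed tuning parameter.
def is_member(candidate_vec, db_vecs, threshold=0.99):
    sims = db_vecs @ candidate_vec
    return bool(np.max(sims) >= threshold)

rng = np.random.default_rng(2)
db = rng.normal(size=(1000, 256))
db /= np.linalg.norm(db, axis=1, keepdims=True)

member = db[42]                          # document known to be in the database
outsider = rng.normal(size=256)
outsider /= np.linalg.norm(outsider)

print(is_member(member, db))     # True
print(is_member(outsider, db))   # False
```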
Statistical Membership Inference
More sophisticated membership inference exploits statistical differences between embeddings of documents that are in the database and documents that are not. Embedding models are stateless, so the signal does not come from the model "remembering" documents it has processed; it comes from documents that appeared in the model's training data, or that are semantically typical of the database's contents, occupying systematically different regions of the embedding space.
By training a classifier on embeddings of known-member and known-non-member documents, the attacker can predict membership for unknown documents. This technique works even when the attacker does not know the exact content of the target document — they only need a good approximation.
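A sketch of the classifier approach on synthetic data, where "members" are assumed to be clustered around the database's semantic center and the attacker uses a simple nearest-centroid classifier (a stand-in for whatever classifier a real attacker would train):

```python
import numpy as np

# Statistical membership inference on synthetic embeddings. Members are
# modeled as "typical" of the database (clustered around its center); the
# attacker fits a nearest-centroid classifier on labeled examples.
rng = np.random.default_rng(3)
dim = 64
db_center = rng.normal(size=dim)

members = db_center + 0.5 * rng.normal(size=(200, dim))   # typical documents
nonmembers = rng.normal(size=(200, dim))                  # generic documents

c_in, c_out = members.mean(axis=0), nonmembers.mean(axis=0)

def predict_member(vec):
    # closer to the member centroid -> predicted in-database
    return np.linalg.norm(vec - c_in) < np.linalg.norm(vec - c_out)

# Evaluate on fresh samples from each distribution.
test_in = db_center + 0.5 * rng.normal(size=(100, dim))
test_out = rng.normal(size=(100, dim))
acc = (np.mean([predict_member(v) for v in test_in])
       + np.mean([not predict_member(v) for v in test_out])) / 2
print(f"membership inference accuracy: {acc:.2f}")
```

The wide separation in this toy makes the attack near-perfect; real-world accuracy depends on how distinctive the database's content distribution is.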
Attribute Inference
Attribute inference extracts specific properties of the source data from embeddings without recovering the full text. This is relevant for compliance with privacy regulations that protect specific categories of information.
Sensitive Attribute Recovery
Embeddings of text containing sensitive attributes — age, gender, ethnicity, health conditions, financial status — encode these attributes in statistically detectable ways. A classifier trained on embeddings with known attributes can predict attributes from embeddings with unknown attributes.
Research has demonstrated that the following attributes can be inferred from text embeddings with significant accuracy: author demographic characteristics, document confidentiality level, presence of specific medical conditions, financial status indicators, and geographic location.
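The attack pattern is a linear probe: fit a classifier on embeddings with known attribute labels, then apply it to unlabeled embeddings. The sketch below uses synthetic data in which the attribute is assumed to shift embeddings along one latent direction:

```python
import numpy as np

# Attribute inference via a linear probe fit by least squares on labeled
# embeddings. Synthetic data: the binary attribute is assumed to shift the
# embedding along a single latent direction.
rng = np.random.default_rng(4)
dim = 64
direction = rng.normal(size=dim)
direction /= np.linalg.norm(direction)

def make_batch(n, has_attr):
    sign = 1.0 if has_attr else -1.0
    return rng.normal(size=(n, dim)) + sign * direction

X = np.vstack([make_batch(200, True), make_batch(200, False)])
y = np.array([1.0] * 200 + [-1.0] * 200)

w, *_ = np.linalg.lstsq(X, y, rcond=None)        # the linear probe

acc = ((make_batch(100, True) @ w > 0).mean()
       + (make_batch(100, False) @ w <= 0).mean()) / 2
print(f"attribute inference accuracy: {acc:.2f}")
```

Even a crude linear probe recovers the attribute well above chance, which is the core of the compliance concern: the attribute was never stored explicitly, yet it is predictable from the vectors.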
Topic Inference
Even without full inversion, the topic of a document can be reliably inferred from its embedding. An attacker who clusters a database of embeddings can identify groups of documents about the same topic. Combined with metadata (timestamp, document size, source identifier), topic clusters can reveal sensitive information about the organization.
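A plain k-means pass over a synthetic embedding database shows how cleanly topic structure emerges without inverting any individual document:

```python
import numpy as np

# Topic clustering: plain k-means over a database of embeddings groups
# documents by topic. Synthetic data: two well-separated topic clusters.
rng = np.random.default_rng(5)
dim, k, per_topic = 32, 2, 100
topics = rng.normal(size=(k, dim)) * 3.0
docs = np.vstack([t + rng.normal(size=(per_topic, dim)) for t in topics])

centroids = docs[rng.choice(len(docs), k, replace=False)]
for _ in range(10):                              # standard k-means iterations
    dists = np.linalg.norm(docs[:, None] - centroids[None], axis=2)
    labels = dists.argmin(axis=1)
    centroids = np.stack([
        docs[labels == i].mean(axis=0) if np.any(labels == i) else centroids[i]
        for i in range(k)
    ])

# Purity: fraction of each topic's documents landing in a single cluster.
purity = (np.bincount(labels[:per_topic], minlength=k).max()
          + np.bincount(labels[per_topic:], minlength=k).max()) / len(docs)
print(f"cluster purity: {purity:.2f}")
```

An attacker who joins these cluster labels with metadata such as timestamps or source identifiers can infer, for example, which teams are producing documents on which topics.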
Privacy-Preserving Embedding Techniques
Differential Privacy
Differential privacy adds calibrated noise to embeddings to bound the information any individual document contributes. The noise makes inversion harder, because the embedding no longer exactly represents the source text, and makes membership inference harder, because any individual document has less influence on the embedding distribution.
The trade-off is that noisy embeddings have lower retrieval accuracy. The more noise added (corresponding to a smaller privacy parameter epsilon), the stronger the privacy guarantee but the worse the retrieval performance. Finding the right epsilon for your application requires balancing privacy requirements against retrieval quality.
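A minimal sketch of the Gaussian mechanism applied to embeddings: clip each vector to a fixed L2 norm (bounding per-document sensitivity), then add Gaussian noise calibrated to (epsilon, delta). The sigma formula is the standard analytic Gaussian-mechanism calibration; treat this as an illustration, not production DP accounting:

```python
import numpy as np

# Gaussian mechanism for embeddings: clip to bound L2 sensitivity, then add
# calibrated Gaussian noise. Sigma follows the standard analytic calibration.
def privatize(vec, clip=1.0, epsilon=1.0, delta=1e-5, rng=None):
    rng = rng or np.random.default_rng()
    clipped = vec * min(1.0, clip / np.linalg.norm(vec))  # bound sensitivity
    sigma = clip * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return clipped + rng.normal(scale=sigma, size=vec.shape)

rng = np.random.default_rng(6)
v = rng.normal(size=256)

# Lower epsilon -> more noise -> the published vector resembles the
# original less (and retrieves less accurately).
d_strong = np.linalg.norm(privatize(v, epsilon=0.1, rng=rng) - v)
d_weak = np.linalg.norm(privatize(v, epsilon=10.0, rng=rng) - v)
print(d_strong > d_weak)    # True
```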
Dimensionality Reduction
Reducing the dimensionality of embeddings (from 1536 to 256 dimensions, for example) discards information that may include privacy-relevant details. However, dimensionality reduction also degrades retrieval performance and does not provide formal privacy guarantees.
Dimensionality reduction is often used alongside differential privacy, where the reduced dimensionality makes the privacy noise more efficient (less noise needed to achieve the same privacy guarantee).
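One common reduction method is a Gaussian random projection; by the Johnson-Lindenstrauss lemma, pairwise similarity is approximately preserved while fine-grained per-dimension detail is discarded:

```python
import numpy as np

# Random projection from 1536 to 256 dimensions. Cosine similarity between
# related vectors survives the projection approximately (JL lemma).
rng = np.random.default_rng(7)
P = rng.normal(size=(1536, 256)) / np.sqrt(256)

def cos(x, y):
    return x @ y / (np.linalg.norm(x) * np.linalg.norm(y))

a = rng.normal(size=1536)
b = a + rng.normal(size=1536)            # a semantically related vector
print(round(cos(a, b), 2), round(cos(a @ P, b @ P), 2))
```

The two similarity values land close together, which is why retrieval survives the reduction better than inversion does; but nothing here is a formal privacy guarantee.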
Embedding Quantization
Converting embedding values from 32-bit floating point to lower precision (8-bit integers, binary) reduces the information content of each dimension. This is primarily a performance optimization but has a side effect of reducing the precision available for inversion attacks.
Like dimensionality reduction, quantization does not provide formal privacy guarantees. It makes attacks harder in practice but does not prevent them in theory.
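A symmetric int8 quantization sketch illustrates both halves of the trade-off: similarity-relevant structure survives with small error, while the low-order precision an inversion attack might exploit is discarded:

```python
import numpy as np

# Symmetric int8 quantization: scale each vector by its max absolute value,
# round to 8-bit integers. Reconstruction is approximate, not exact.
def quantize(vec):
    scale = np.abs(vec).max() / 127.0
    q = np.round(vec / scale).astype(np.int8)
    return q, scale

rng = np.random.default_rng(8)
v = rng.normal(size=512)
q, scale = quantize(v)
restored = q.astype(np.float32) * scale

err = np.linalg.norm(restored - v) / np.linalg.norm(v)
print(f"relative reconstruction error: {err:.3f}")
```

The small relative error shows why quantization is primarily a performance optimization: most of the embedding's usable signal, for attackers as well as for retrieval, is still there.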
Secure Computation
For the highest privacy requirements, embeddings can be computed and compared using secure computation techniques. Homomorphic encryption allows similarity computation on encrypted embeddings without decrypting them. Secure multi-party computation distributes the computation across multiple parties so no single party sees the full embedding.
These techniques are computationally expensive and add significant latency, but they provide strong privacy guarantees. They are currently practical only for small-scale or high-value applications.
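The basic building block of the multi-party approach is additive secret sharing: the embedding is split into random shares, one per party, and only their sum reconstructs the vector. Real MPC protocols operate over finite fields or rings for exact hiding; the real-valued masking below is only an illustration of the idea:

```python
import numpy as np

# Additive secret sharing of an embedding between two parties. Neither share
# alone resembles the vector; their sum reconstructs it exactly. Real MPC
# uses finite-field arithmetic; real-valued masking here is illustrative.
rng = np.random.default_rng(9)
v = rng.normal(size=128)                 # the sensitive embedding

share_a = rng.normal(size=128) * 100.0   # random mask held by party A
share_b = v - share_a                    # masked vector held by party B

print(np.allclose(share_a + share_b, v))           # True: reconstruction
print(abs(np.corrcoef(share_b, v)[0, 1]) < 0.5)    # shares leak little alone
```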
Regulatory Implications
GDPR
Under GDPR, embeddings of personal data should themselves be treated as personal data, because they can be linked back to identifiable individuals through inversion or membership inference. Organizations storing embeddings of EU residents' data must comply with GDPR requirements including data minimization, purpose limitation, storage limitation, the right to erasure, and data protection impact assessments.
The right to erasure is particularly challenging for embedding databases. Deleting a specific individual's embedding from a vector database may not fully remove their data if the individual's information influenced other embeddings (in a fine-tuned model) or if embedding database snapshots retain the deleted data.
HIPAA
Embeddings of protected health information (PHI) should be treated as PHI under HIPAA, since the underlying health information remains recoverable from them. Organizations storing medical text embeddings must implement HIPAA-compliant access controls, encryption, audit logging, and breach notification.
CCPA and State Privacy Laws
California's CCPA and similar state privacy laws apply to data that can be linked to individuals. Embeddings that encode identifiable information fall under these laws, requiring disclosure, opt-out mechanisms, and deletion capabilities.
Assessment Methodology
When assessing the privacy properties of an embedding deployment, start by identifying what data is embedded and its sensitivity classification, then determine which embedding model is used and whether it is accessible for white-box attacks. Test inversion attacks using the techniques described above, membership inference against known member and non-member documents, and attribute inference for the sensitive attributes relevant to the data domain. Evaluate whether privacy-preserving techniques are applied and how effective they are, and finally assess compliance with the relevant privacy regulations.
The privacy risks of embeddings are not theoretical. Organizations storing embeddings of sensitive data should treat those embeddings as derived sensitive data and apply commensurate protections. The assumption that embeddings are privacy-safe is dangerous and incorrect.