Vector Database Access Control

Intermediate | 7 min read | Updated 2026-03-15

Vector database access control weaknesses: API key management, tenant isolation failures, namespace security, and metadata filtering bypass techniques.

Tags: access-control, api-keys, tenant-isolation, namespace-security, metadata-filtering, vector-databases

Access control in vector databases is a rapidly evolving area where product capabilities frequently lag behind deployment needs. Many vector databases were designed as search infrastructure, not as security-sensitive data stores, and their access control models reflect this origin. This page covers the most common access control weaknesses and how to exploit them during security assessments.

API Key Management

API keys are the primary authentication mechanism for most vector databases. The security of the entire vector database often depends on the security of a single API key.

Common API Key Weaknesses

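Single key per project. Many managed vector databases, Pinecone among them, have historically issued project-wide API keys that grant full read/write access to all data in the project. Where the service offers no read-only keys, no keys scoped to a specific namespace, and no keys limited to query operations, anyone holding the key has the same power as the application itself. Some vendors have since added role-based key permissions, so confirm the current permission model for the product and plan under assessment.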

# A single project-wide Pinecone API key grants access to everything
# (current Pinecone Python client; older releases used pinecone.init())
from pinecone import Pinecone

pc = Pinecone(api_key="compromised-key")
index = pc.Index("production-index")

# Full read access
results = index.query(vector=[0.1] * 1536, top_k=10000)

# Full write access
index.upsert(vectors=[("malicious-id", [0.1] * 1536, {"injected": True})])

# Full delete access
index.delete(ids=["any-id"])

Keys in client-side code. When AI applications embed vector database queries in client-side code (browser applications, mobile apps), API keys are exposed to end users. Even when keys are meant to be restricted, the vector database may not enforce granular permissions.

Keys in configuration files. Vector database API keys frequently appear in:

Environment files committed to version control
Docker Compose files
Kubernetes ConfigMaps (instead of Secrets)
CI/CD pipeline configurations
Terraform state files
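
As a rough illustration of how these locations can be searched during an assessment, the sketch below scans a directory tree for values that look like hard-coded vector database keys. The variable-name pattern, the 16-character minimum, and the list of file suffixes are all assumptions chosen for the example, not a complete detection rule; a dedicated secret scanner will cover far more cases.

```python
import re
from pathlib import Path

# Hypothetical heuristic: a variable name mentioning a vector database and
# ending in "api_key" or "token", followed by a value of 16+ key-like characters.
KEY_PATTERN = re.compile(
    r"\b(\w*(?:pinecone|weaviate|qdrant|milvus|chroma)\w*(?:api[_-]?key|token))\b"
    r"[\"']?\s*[:=]\s*[\"']?([A-Za-z0-9_\-]{16,})",
    re.IGNORECASE,
)


def find_candidate_keys(text):
    """Return (variable_name, value) pairs that look like hard-coded keys."""
    return [(m.group(1), m.group(2)) for m in KEY_PATTERN.finditer(text)]


def scan_tree(root, suffixes=(".env", ".yml", ".yaml", ".tf", ".tfstate", ".json")):
    """Walk a directory and report candidate keys per file."""
    findings = {}
    for path in Path(root).rglob("*"):
        # Path(".env").suffix is empty, so dotenv files are matched by name.
        if path.is_file() and (path.suffix in suffixes or path.name.startswith(".env")):
            hits = find_candidate_keys(path.read_text(errors="ignore"))
            if hits:
                findings[str(path)] = hits
    return findings
```

Calling `scan_tree(".")` from a repository checkout returns a dictionary keyed by file path. Any hit still needs manual confirmation, since the pattern matches placeholders as readily as live keys.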

Key Rotation Challenges

Vector database API keys are rarely rotated because:

Many databases do not support multiple simultaneous active keys, making rotation require downtime
Keys are embedded in multiple services that all need coordinated updates
There is no automated rotation mechanism comparable to AWS Secrets Manager integration
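
One way to reduce the coordination cost is to have services fetch the key at runtime instead of baking it into configuration. The sketch below is a minimal example of that idea, not a vendor feature: `fetch_key` is a hypothetical callable standing in for whatever secret store is in use, and the TTL value is arbitrary.

```python
import time


class RotatingKeyProvider:
    """Cache an API key and re-fetch it once ttl_seconds have elapsed."""

    def __init__(self, fetch_key, ttl_seconds=300, clock=time.monotonic):
        self._fetch_key = fetch_key  # hypothetical: reads from a secret store
        self._ttl = ttl_seconds
        self._clock = clock
        self._key = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        if self._key is None or now - self._fetched_at >= self._ttl:
            self._key = self._fetch_key()
            self._fetched_at = now
        return self._key

    def invalidate(self):
        """Force a re-fetch, for example after the database rejects the cached key."""
        self._key = None
```

With this pattern a rotated key propagates within one TTL window. If the database accepts only one active key at a time, callers would still need to call `invalidate()` and retry when a request is rejected.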

Tenant Isolation

Multi-tenant vector database deployments require isolation between tenants to prevent data leakage. Isolation mechanisms vary significantly across databases and deployment models.

Logical vs. Physical Isolation

Isolation Type	Mechanism	Bypass Risk
Namespace	Logical partition within an index	High — single API key accesses all namespaces
Collection	Separate index with separate configuration	Medium — depends on access control model
Database	Separate database instance	Low — separate credentials required
Infrastructure	Separate compute and storage	Lowest — physical separation

Most production deployments use namespace or collection-level isolation for cost efficiency. This creates weaker boundaries than separate database instances.

Namespace Isolation Bypass

In Pinecone and similar databases, namespaces are logical partitions within an index. A query specifying a namespace returns only results from that namespace. However, an attacker with the API key can query any namespace:

# Intended: application queries only its own namespace
results = index.query(
    vector=query_embedding,
    top_k=10,
    namespace="tenant-a"
)
 
# Attack: query another tenant's namespace
results = index.query(
    vector=query_embedding,
    top_k=10,
    namespace="tenant-b"  # Access to tenant-b's data
)
 
# Attack: enumerate namespaces
# Most vector databases do not restrict namespace listing
stats = index.describe_index_stats()
# Returns namespace names and vector counts
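
The cross-namespace check above can be wrapped in a small helper so that results are recorded consistently. This is a sketch under assumptions: `index` is any object exposing a Pinecone-style `query(vector=..., top_k=..., namespace=...)` method, and the response is assumed to be indexable by `"matches"`. Adjust both for the client under test.

```python
def probe_namespaces(index, candidate_namespaces, probe_vector, top_k=1):
    """Return {namespace: match_count or error label} for each candidate namespace."""
    readable = {}
    for namespace in candidate_namespaces:
        try:
            response = index.query(
                vector=probe_vector, top_k=top_k, namespace=namespace
            )
        except Exception as exc:  # denied, malformed, or rate-limited
            readable[namespace] = f"error: {type(exc).__name__}"
            continue
        # Assumption: the response supports response["matches"].
        readable[namespace] = len(response["matches"])
    return readable
```

A non-zero count for a namespace the application should never touch is direct evidence that the key, not the namespace, is the only boundary.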

Cross-Tenant Query Leakage

Even with proper namespace isolation, certain operations may leak information across tenant boundaries:

Index statistics reveal the existence and size of other tenants' data
Similarity scores in shared indexes may reflect proximity to other tenants' vectors
Metadata filtering may not be enforced at the storage level, allowing filter bypass

Namespace Security

Namespaces provide logical separation but are not a security boundary in most vector databases. Understanding the limitations is essential for accurate risk assessment.

Namespace Enumeration

Most vector databases allow listing all namespaces through statistics endpoints:

# Pinecone namespace enumeration
stats = index.describe_index_stats()
for namespace, info in stats.namespaces.items():
    print(f"Namespace: {namespace}, Vectors: {info.vector_count}")

Namespace names often reveal organizational structure, project names, or tenant identifiers. This information supports further targeted attacks.

Namespace Deletion

An attacker with write access can delete entire namespaces, causing data loss:

# Delete all vectors in a namespace
index.delete(delete_all=True, namespace="target-namespace")

Most vector databases do not have separate permissions for read, write, and delete operations. A key that can query can also delete.
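
Where the database cannot separate these permissions, the split has to live in a layer the application controls. The following is a minimal sketch of such a layer, assuming an index-like object whose operations are ordinary methods; `ScopedIndex` is a hypothetical name, not part of any vendor SDK, and it only helps if callers cannot reach the raw key.

```python
class ScopedIndex:
    """Wrap an index-like object and expose only the listed operations."""

    def __init__(self, index, allowed_operations):
        self._index = index
        self._allowed = frozenset(allowed_operations)

    def __getattr__(self, name):
        # __getattr__ runs only for names not found on ScopedIndex itself.
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._allowed:
            raise PermissionError(f"operation not permitted: {name}")
        return getattr(self._index, name)
```

A read path would receive `ScopedIndex(index, {"query"})`, so a call such as `delete(delete_all=True, ...)` fails before it reaches the database.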

Metadata Filtering Bypass

Vector databases store metadata alongside embeddings to enable filtered queries. Many RAG systems implement access control through metadata filtering — attaching access labels to embeddings and filtering queries to only return embeddings the user is authorized to see.

Client-Side Filtering

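The most common and most dangerous pattern is enforcing access control only through metadata filters that the application, acting as the database's client, attaches to each query. The database itself will answer the same query without the filter: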

# Application code implements access control via metadata filter
def search(user, query_embedding):
    results = index.query(
        vector=query_embedding,
        top_k=10,
        filter={"department": user.department}  # "Access control"
    )
    return results

An attacker who can modify the query (through API manipulation, client-side code modification, or injection) can remove or change the filter:

# Attack: remove the filter to access all data
results = index.query(
    vector=query_embedding,
    top_k=10
    # No filter — returns results from all departments
)
 
# Attack: query a different department
results = index.query(
    vector=query_embedding,
    top_k=10,
    filter={"department": "executive"}  # Access executive data
)
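
To quantify the exposure during an assessment, the unfiltered query can be compared against the scope the user is supposed to have. The helper below is a sketch with assumed shapes: the response is indexable by `"matches"`, and each match is a plain dict with `"id"` and optional `"metadata"` keys. Real client objects may need attribute access instead.

```python
def records_outside_scope(index, probe_vector, field, allowed_value, top_k=100):
    """Query without a filter and return IDs whose metadata is outside the user's scope."""
    response = index.query(vector=probe_vector, top_k=top_k, include_metadata=True)
    return [
        match["id"]
        for match in response["matches"]
        # Records with no metadata at all are also counted as out of scope.
        if match.get("metadata", {}).get(field) != allowed_value
    ]
```

Any ID returned here is a record the user could read simply by dropping the filter.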

Filter Injection

Some applications construct metadata filters from user input, creating opportunities for filter injection:

# Vulnerable: filter constructed from user input
def search(request, query_embedding):
    filters = {"department": request.user.department}

    # User-supplied additional filters
    if request.params.get("category"):
        filters["category"] = request.params["category"]

    # If the parameter is parsed as JSON rather than treated as a string,
    # an attacker can inject: category={"$exists": true}
    # or other filter operators depending on the database
    results = index.query(
        vector=query_embedding,
        top_k=10,
        filter=filters
    )
    return results
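
The injection works because a dict or list value is passed through as a filter expression. A hedged sketch of the corresponding check is shown below: user input may only supply scalar values for an allow-listed set of fields, and every condition is wrapped in an explicit `$eq`. The allow-list contents and the function name are assumptions for illustration.

```python
ALLOWED_USER_FIELDS = {"category"}  # hypothetical allow-list
SCALAR_TYPES = (str, int, float, bool)


def build_filter(user_department, params):
    """Build a metadata filter in which user input can only add scalar equality tests."""
    filters = {"department": {"$eq": user_department}}
    for field, value in params.items():
        if field not in ALLOWED_USER_FIELDS:
            # Also stops a user from overriding the department condition.
            raise ValueError(f"field not filterable: {field}")
        if not isinstance(value, SCALAR_TYPES):
            # Dicts and lists could carry operators such as $exists or $in.
            raise ValueError(f"non-scalar value for {field}")
        filters[field] = {"$eq": value}
    return filters
```

During testing, the same logic can be read in reverse: if the application accepts a dict for `category` without an error, operator injection is worth pursuing.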

Metadata Enumeration

Even when filters are enforced, metadata values can be enumerated through targeted queries:

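# Enumerate metadata values by querying with each possible value
import random

# Arbitrary probe vector; the dimension must match the index
random_embedding = [random.random() for _ in range(1536)]
departments = ["engineering", "sales", "executive", "hr", "finance"]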
for dept in departments:
    results = index.query(
        vector=random_embedding,
        top_k=1,
        filter={"department": dept}
    )
    if results.matches:
        print(f"Department exists: {dept}")

Red Team Assessment Checklist

When assessing vector database access control:

Obtain the API key — Check client-side code, configuration files, environment variables, CI/CD pipelines
Test namespace isolation — Query across namespaces, enumerate namespace names
Test metadata filter bypass — Remove filters, modify filter values, inject filter operators
Test write access — Attempt to upsert, update, and delete vectors
Test administrative operations — Index creation, deletion, configuration changes
Test key scope — Determine what operations the key permits and whether different keys have different permissions
Check for key rotation — Determine when the key was last rotated and whether rotation is automated
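
The write-access and key-scope items can be run as one repeatable probe. The sketch below assumes a Pinecone-style index object and writes only to a dedicated namespace with a throwaway ID; the namespace and ID names are hypothetical, and the rules of engagement should confirm that writes are permitted before it is used.

```python
def assess_key_scope(index, probe_vector, test_id="assessment-probe",
                     namespace="assessment"):
    """Try each operation once and record whether the key permits it."""
    # Order matters: delete runs after upsert so the probe vector is cleaned up.
    operations = {
        "describe_index_stats": lambda: index.describe_index_stats(),
        "query": lambda: index.query(
            vector=probe_vector, top_k=1, namespace=namespace
        ),
        "upsert": lambda: index.upsert(
            vectors=[(test_id, probe_vector)], namespace=namespace
        ),
        "delete": lambda: index.delete(ids=[test_id], namespace=namespace),
    }
    results = {}
    for name, call in operations.items():
        try:
            call()
            results[name] = "allowed"
        except Exception as exc:
            results[name] = f"denied ({type(exc).__name__})"
    return results
```

Running the probe once per discovered key makes it easy to show in the report whether different keys actually carry different permissions.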

Vector Database Security Landscape — Overview of vector database architectures
Injection Attacks — Attacks that exploit access to write embeddings
Data Exfiltration — Leveraging access control gaps for data extraction
