Vector Database Hardening Guide
Security best practices for hardening vector databases — covering Pinecone, Weaviate, Chroma, Qdrant, and Milvus with configuration guidance, access controls, and monitoring.
Vector databases are the backbone of RAG systems, semantic search, and recommendation engines. They store embedding vectors alongside metadata and provide similarity search capabilities. A compromised vector database can lead to RAG poisoning, data exfiltration, privacy breaches, and service disruption. This guide provides hardening recommendations for the most widely deployed vector databases.
Universal Hardening Principles
Before diving into platform-specific guidance, these principles apply to all vector database deployments.
Authentication and Authorization
Every vector database deployment must enforce authentication. Never deploy a vector database with default credentials or without authentication enabled. Use strong, unique credentials for each service that connects to the database. Implement role-based access control (RBAC) to separate read and write permissions. Service accounts should have the minimum permissions required for their function.
Separate access patterns require separate credentials. The service that writes embeddings (the indexing pipeline) needs write access but may not need read access. The service that queries embeddings (the retrieval pipeline) needs read access but should not have write access. The administrative interface needs management access but should not be accessible from application code.
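The separation above can be enforced at startup. A minimal sketch, assuming hypothetical environment-variable names (adapt them to your secrets manager):

```python
import os

# Hypothetical variable names -- one credential per access pattern.
ROLE_KEY_VARS = {
    "indexing": "VECTORDB_WRITE_KEY",   # write-only embedding pipeline
    "retrieval": "VECTORDB_READ_KEY",   # read-only query pipeline
    "admin": "VECTORDB_ADMIN_KEY",      # management tooling only
}

def key_for_role(role: str) -> str:
    """Fetch the credential for one role; fail fast if it is missing."""
    var = ROLE_KEY_VARS[role]
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"missing credential {var} for role {role!r}")
    return key

def check_keys_are_distinct() -> None:
    """A shared key defeats the separation; refuse to start if any collide."""
    keys = [key_for_role(r) for r in ROLE_KEY_VARS]
    if len(set(keys)) != len(keys):
        raise RuntimeError("read/write/admin credentials must be distinct")
```

Running this check at service startup catches the common mistake of reusing one database credential across every pipeline.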
Encryption
Encrypt data at rest and in transit. For managed services, verify that encryption is enabled and using customer-managed keys where compliance requires it. For self-hosted databases, configure TLS for all client connections and encrypt the storage volumes.
Embedding vectors, while not directly human-readable, contain semantic information about the source data. As discussed in the embedding privacy section, embeddings can potentially be inverted to recover source text. Treat embedding storage with the same encryption requirements as the source data.
Network Security
Restrict network access to the vector database to only the services that need it. Use private networking (VPCs, private endpoints) rather than public endpoints. Implement network policies or security groups that allow only specific source IPs or services. Block all inbound traffic from the public internet unless the database is intentionally public-facing.

Monitoring and Logging
Enable comprehensive logging for all database operations. Log all queries with their parameters, all write operations with the source identity, all administrative operations, and all authentication attempts including failures. Monitor for anomalous patterns that may indicate attack activity.
Platform-Specific Hardening
Pinecone
Pinecone is a fully managed vector database. Security hardening focuses on API key management, network configuration, and organizational controls.
API key management: Pinecone uses API keys for authentication. Rotate keys regularly and use separate keys for different environments (development, staging, production) and different services (indexing, querying, administration). Store keys in a secrets manager, not in code or configuration files.
Network security: Pinecone supports private endpoints through AWS PrivateLink. For production deployments handling sensitive data, configure private endpoints to keep all traffic within your VPC. When private endpoints are not feasible, restrict access using API key scoping and network-level controls on your side.
Namespace isolation: Use Pinecone namespaces to isolate data for different tenants, applications, or security contexts. While namespaces share an index and are not a strong security boundary, they provide logical separation and can be combined with application-level access controls.
Metadata filtering: Use metadata filters to enforce access controls at query time. Store access control attributes (tenant ID, classification level, owner) as metadata on each vector and filter queries to only return vectors the requesting user is authorized to see.
Backup and recovery: Configure regular backups of your Pinecone indexes. Verify that backups can be restored successfully. Store backups with the same encryption and access controls as the primary data.
Weaviate
Weaviate can be deployed as a managed service (Weaviate Cloud Services) or self-hosted. Self-hosted deployments offer more security control but require more hardening effort.
Authentication: Weaviate supports API key authentication and OIDC-based authentication. For production deployments, use OIDC integration with your organization's identity provider. This enables SSO, MFA, and centralized credential management.
Authorization: Weaviate's authorization system supports role-based access control. Define roles that map to your application's access patterns: a read-only role for query services, a read-write role for indexing services, and an admin role for schema management. Assign roles to specific API keys or OIDC groups.
Network hardening: For self-hosted Weaviate, deploy within a VPC and restrict access through security groups. Disable the default HTTP port and use only HTTPS with valid TLS certificates. If running Weaviate in a Kubernetes cluster, use NetworkPolicies to restrict pod-to-pod communication.
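For the Kubernetes case, a NetworkPolicy sketch that admits only labeled application pods. The namespace, labels, and the assumption that Weaviate listens on 8080 (REST) and 50051 (gRPC) are all deployment-specific; adjust to your cluster:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: weaviate-ingress
  namespace: vector-db        # assumed namespace
spec:
  podSelector:
    matchLabels:
      app: weaviate           # assumed pod label
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: rag-app   # only labeled client pods may connect
      ports:
        - protocol: TCP
          port: 8080          # REST
        - protocol: TCP
          port: 50051         # gRPC
```

With no other ingress rules selecting these pods, all other in-cluster traffic to Weaviate is denied by default.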
Module security: Weaviate uses modules for vectorization, generative capabilities, and integrations. Only enable the modules you need. Each enabled module increases the attack surface. Verify that modules are from trusted sources and review their configuration for security-relevant settings.
Multi-tenancy: Weaviate supports native multi-tenancy. Use tenant isolation for applications that serve multiple customers. Each tenant's data is isolated at the storage level, providing stronger separation than metadata-based filtering.
Backup security: Configure Weaviate backups to a secure storage backend (S3 with encryption, GCS with customer-managed keys). Verify that backup storage access controls are as strict as the database access controls.
Chroma
Chroma is popular for development and prototyping and is increasingly used in production. Its security posture depends heavily on the deployment configuration.
Server mode: Always run Chroma in server mode (client-server architecture) for production. The embedded mode (in-process) provides no network-level security controls. Server mode supports authentication, TLS, and network access restrictions.
Authentication: Chroma supports token-based authentication. Enable authentication for all server-mode deployments. Use strong, unique tokens and rotate them regularly. Never deploy Chroma without authentication, even in internal networks.
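"Strong, unique tokens" means high-entropy values from a cryptographic source, never hand-chosen strings. A minimal sketch using the standard library (how the token is then wired into Chroma's server-side auth settings is deployment-specific):

```python
import secrets

def new_auth_token(n_bytes: int = 32) -> str:
    """Generate a high-entropy, URL-safe bearer token (256 bits by default).

    secrets.token_urlsafe draws from the OS CSPRNG, unlike the random module,
    which is not safe for credentials.
    """
    return secrets.token_urlsafe(n_bytes)
```

Generate a fresh token per environment at rotation time, store it in your secrets manager, and distribute it to clients from there.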
TLS configuration: Configure TLS for all Chroma server connections. Use certificates from a trusted CA rather than self-signed certificates in production. Verify that clients validate the server certificate.
Persistence security: Chroma persists data to the local filesystem. Ensure that the persistence directory has restrictive file permissions. Encrypt the storage volume. Implement file integrity monitoring on the persistence directory.
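The file-permission requirement is easy to both apply and audit. A POSIX sketch that restricts the persistence directory to the service user (mode `0700`):

```python
import os
import stat

def harden_persist_dir(path: str) -> None:
    """Restrict a persistence directory to the owning user only (0700)."""
    os.makedirs(path, exist_ok=True)
    os.chmod(path, 0o700)

def persist_dir_is_restricted(path: str) -> bool:
    """True when group and other have no access bits set on the directory."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```

The check function belongs in a periodic audit job alongside your file integrity monitoring, so a permission regression is caught rather than assumed away.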
Docker deployment: When running Chroma in Docker, do not expose the container port to the host network unnecessarily. Use Docker networking to restrict access to the Chroma container. Run the container as a non-root user. Mount the persistence volume with minimal permissions.
Qdrant
Qdrant is a high-performance vector database that can be self-hosted or used as a managed service.
API key authentication: Enable API key authentication for all deployments. Use separate keys for read and write operations. Store keys in a secrets manager.
TLS: Configure TLS for gRPC and REST API endpoints. Qdrant supports TLS through configuration file settings. Verify that TLS is enforced and that unencrypted connections are rejected.
Collection security: Qdrant does not natively support per-collection access control. Implement application-level access controls that validate the requesting service's authorization before forwarding queries to Qdrant. For multi-tenant applications, use separate Qdrant instances or rely on payload-based filtering with application-level enforcement.
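A sketch of such an application-level gate in front of the Qdrant client. The grants table and service names are hypothetical, and the `client.search(collection_name=..., query_vector=...)` call mirrors the qdrant-client API (pass a stub for testing):

```python
# Hypothetical authorization table: service identity -> collections it may query.
SERVICE_GRANTS = {
    "retrieval-svc": {"docs_tenant_a"},
    "indexing-svc": set(),  # write path: no query access at all
}

def authorize_query(service: str, collection: str) -> None:
    """Reject the query before it ever reaches Qdrant."""
    if collection not in SERVICE_GRANTS.get(service, set()):
        raise PermissionError(f"{service} may not query {collection}")

def forward_query(service: str, collection: str, vector: list[float], client=None):
    """Gatekeeper wrapper: authorize first, then forward to the client."""
    authorize_query(service, collection)
    if client is not None:
        return client.search(collection_name=collection, query_vector=vector)
```

Because Qdrant itself cannot enforce this boundary, the gatekeeper must be the only path to the database — network controls should prevent services from reaching Qdrant directly.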
Snapshot security: Qdrant snapshots contain all data in a collection. Ensure that snapshot storage is encrypted and access-controlled. Restrict who can create and access snapshots.
Milvus
Milvus is a distributed vector database designed for large-scale deployments. Its distributed architecture introduces additional security considerations.
Authentication: Enable username and password authentication. Integrate with LDAP or OAuth for centralized identity management. Use strong passwords and enforce password rotation.
TLS: Configure TLS for all communication paths: client to proxy, proxy to data nodes, and internal cluster communication. In a distributed deployment, internal TLS prevents network-level attacks between Milvus components.
RBAC: Milvus supports role-based access control with collection-level granularity. Define roles that separate read, write, and administrative access. Assign roles to specific users or service accounts.
etcd security: Milvus uses etcd for metadata storage. Secure etcd with authentication, TLS, and access controls. A compromised etcd instance can be used to manipulate Milvus's metadata and behavior.
MinIO/S3 security: Milvus uses object storage (MinIO or S3) for data persistence. Apply standard object storage hardening: encryption, access controls, versioning, and audit logging.
Operational Security
Capacity Planning
Denial-of-service attacks against vector databases exploit the absence of resource limits. Set appropriate limits for maximum number of vectors per collection, maximum query batch size, concurrent query limits, and storage quotas. Monitor resource utilization and alert before limits are reached.
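Batch-size and concurrency caps can be enforced in the application tier even when the database offers no native quotas. A minimal sketch (the limit values are illustrative):

```python
import threading

class QueryLimiter:
    """Enforce batch-size and concurrency caps in front of the database."""

    def __init__(self, max_batch: int = 100, max_concurrent: int = 32):
        self.max_batch = max_batch
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, batch, execute):
        """Run execute(batch) only if it fits within the configured limits."""
        if len(batch) > self.max_batch:
            raise ValueError(f"batch of {len(batch)} exceeds limit {self.max_batch}")
        if not self._slots.acquire(blocking=False):
            raise RuntimeError("concurrent query limit reached")
        try:
            return execute(batch)
        finally:
            self._slots.release()
```

Rejecting over-limit requests with an error (rather than queueing them) keeps an attacker from building unbounded backlog; legitimate clients retry with backoff.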
Update Management
Keep vector database software updated. Security patches for vector databases address vulnerabilities in query processing, authentication, network handling, and data storage. Establish a patch management process that includes testing updates in a staging environment before production deployment.
Disaster Recovery
Maintain tested disaster recovery procedures: regular backup verification confirms that backups are usable, recovery testing exercises the full recovery process, runbook documentation provides step-by-step recovery instructions, and RTO/RPO targets define acceptable recovery time and data loss.
Security Assessment
Conduct regular security assessments of vector database deployments. Test authentication bypass, authorization escalation, network exposure, injection attacks, and data exfiltration. Use the attack techniques documented elsewhere in this section as your assessment framework.
Hardening Checklist
- Authentication enabled with strong credentials
- RBAC configured with least-privilege roles
- TLS enabled for all connections
- Encryption at rest for data storage
- Network access restricted to authorized services
- Logging enabled for all operations
- Monitoring configured for anomalous patterns
- Backups configured and tested
- Default credentials changed or removed
- Unnecessary features and modules disabled
- Software updated to latest stable version
- Disaster recovery procedures documented and tested
Vector databases are critical infrastructure for AI applications. Apply the same security rigor to vector databases as you would to any production database containing sensitive data — because in most deployments, that is exactly what they contain.