Embedding Model Security Comparison
Security comparison of major embedding models — OpenAI, Cohere, sentence-transformers — covering vulnerability profiles, adversarial robustness, and privacy characteristics.
The choice of embedding model affects the security posture of any system that relies on semantic similarity — RAG pipelines, semantic search, content moderation, and recommendation systems. Different embedding models have different vulnerability profiles, and understanding these differences is essential for both red teamers assessing these systems and engineers building them.
This page compares the security characteristics of the most widely deployed embedding models across commercial APIs and open-source options.
Embedding Model Landscape
Commercial API Models
OpenAI Embeddings (text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002) are the most widely deployed commercial embedding models. They are accessed through OpenAI's API, which means the embedding computation happens on OpenAI's infrastructure. The models are closed-source, with no access to weights or architecture details beyond what OpenAI publishes.
Cohere Embeddings (embed-english-v3.0, embed-multilingual-v3.0) provide both English and multilingual embedding capabilities through an API. Cohere's models support input type classification (search_document, search_query, classification, clustering) which affects the embedding behavior.
Google Embeddings (text-embedding-004, served through Vertex AI) offer embeddings through Google Cloud's API. These models support task-type specification similar to Cohere's input types.
Amazon Titan Embeddings provide embedding capabilities through AWS Bedrock, tightly integrated with AWS infrastructure and access controls.
Open-Source Models
sentence-transformers is the most popular open-source embedding library, providing access to hundreds of pre-trained models. Popular models include all-MiniLM-L6-v2, all-mpnet-base-v2, and instructor-large. These models can be self-hosted, providing complete control over the embedding pipeline.
BAAI BGE models (bge-small-en, bge-base-en, bge-large-en) are high-performing open-source models from the Beijing Academy of Artificial Intelligence.
E5 models from Microsoft Research provide strong performance on retrieval tasks and are available in multiple sizes.
Security Comparison Matrix
| Security Property | OpenAI | Cohere | sentence-transformers | BGE/E5 |
|---|---|---|---|---|
| Adversarial robustness | Medium-High | Medium | Low-Medium | Low-Medium |
| Inversion resistance | High (API) | High (API) | Low (local) | Low (local) |
| Data privacy | Depends on policy | Depends on policy | Full control | Full control |
| Supply chain risk | Low (managed) | Low (managed) | Medium (self-hosted) | Medium (self-hosted) |
| Transparency | Low (closed) | Low (closed) | High (open) | High (open) |
| Customization | Limited | Limited | Full | Full |
Adversarial Robustness
Adversarial robustness measures how easy it is to craft inputs that produce misleading embeddings — inputs that are semantically different from a target but embed close to it, or inputs that are semantically similar but embed far apart.
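Both failure modes reduce to cosine similarity between embedding vectors. A minimal sketch with hand-built vectors (the numbers are synthetic stand-ins for model output, not from any real embedding model):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Synthetic embeddings standing in for model output.
target = [0.9, 0.1, 0.0]         # e.g. a benign target phrase
collision = [0.88, 0.12, 0.05]   # unrelated text crafted to embed nearby
paraphrase = [0.1, 0.9, 0.3]     # a true paraphrase pushed far away

# A collision attack succeeds when dissimilar text scores high...
assert cosine(target, collision) > 0.95
# ...and a repulsion attack when similar text scores low.
assert cosine(target, paraphrase) < 0.5
```

An assessment measures how small a text perturbation is needed to cross whatever similarity threshold the target system uses.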
Commercial API models generally show higher adversarial robustness because they are larger models with more capacity to capture semantic nuance. They are trained on more diverse data, reducing susceptibility to distribution-specific attacks. They may include adversarial training or robustness-focused training stages. And they are regularly updated, potentially addressing known adversarial weaknesses.
Open-source models, particularly smaller ones like MiniLM, are more susceptible to adversarial inputs because their smaller capacity forces more compression of the semantic space. An attacker who can download and analyze the model locally can craft adversarial inputs more efficiently through white-box attacks.
However, open-source models have a transparency advantage: you can analyze the model's behavior, identify its weaknesses, and implement targeted defenses. With commercial APIs, you must treat the model as a black box and design defenses without understanding the model's specific failure modes.
Inversion Resistance
Embedding inversion is the process of recovering the original text from an embedding vector. This is a privacy concern because organizations often store embeddings of sensitive documents, assuming that the embeddings do not reveal the source text.
Commercial API models provide higher inversion resistance because the model weights are not available to attackers. An attacker can only perform black-box inversion by querying the API and matching embeddings, which is rate-limited and expensive. The models are large, making the embedding-to-text mapping more complex and harder to invert.
Open-source models are more vulnerable to inversion because the model weights are available, enabling white-box inversion techniques. An attacker can train an inversion model using the embedding model as a component. Smaller models produce embeddings with less informational complexity, making inversion easier.
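Even without white-box access, a low-tech black-box approach is candidate matching: embed a corpus of plausible guesses and pick the one nearest to the stolen vector. The toy `char_embed` function below is a stand-in for real model or API calls (actual attacks query the real model or train a dedicated decoder), but the matching logic is the same:

```python
from collections import Counter

def char_embed(text):
    """Toy embedding: normalized letter-frequency vector (stand-in for a real model)."""
    counts = Counter(c for c in text.lower() if c.isalpha())
    total = sum(counts.values()) or 1
    return [counts.get(chr(ord('a') + i), 0) / total for i in range(26)]

def nearest(target_vec, candidates):
    """Return the candidate text whose embedding is closest (squared L2) to target_vec."""
    def dist(text):
        return sum((x - y) ** 2 for x, y in zip(char_embed(text), target_vec))
    return min(candidates, key=dist)

# Attacker holds a leaked vector and a list of plausible source texts.
leaked = char_embed("patient diagnosed with diabetes")
guesses = [
    "invoice for consulting services",
    "patient diagnosed with diabetes",
    "quarterly revenue projections",
]
assert nearest(leaked, guesses) == "patient diagnosed with diabetes"
```

The attack does not need to reconstruct exact text to be harmful: confirming that a stored vector corresponds to a guessed sensitive phrase is itself a privacy leak.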
Data Privacy
Data privacy concerns differ significantly between API-based and self-hosted models.
With commercial API models, source text is transmitted to the provider's infrastructure for embedding. Depending on the provider's data retention and processing policies, this text may be stored, logged, or used for model improvement. For organizations with strict data handling requirements (healthcare, finance, government), this may be unacceptable.
OpenAI's data usage policy has evolved over time, and enterprise customers can negotiate specific data handling terms. Cohere and Google offer similar enterprise agreements. But the fundamental fact remains: using an API-based embedding model means sending your data to a third party.
Self-hosted open-source models keep all data on your infrastructure. There is no third-party data exposure, no dependency on external data handling policies, and no risk of data being used for model training without consent.
Supply Chain Risk
Commercial API models have lower supply chain risk in the sense that the provider manages the model weights, serving infrastructure, and updates. You trust the provider to maintain model integrity and security.
Self-hosted open-source models have supply chain risks that the organization must manage. Model weights downloaded from public repositories could be tampered with. Model dependencies (PyTorch, transformers, tokenizers) could contain vulnerabilities. Model files in pickle format can execute arbitrary code during loading. And model updates must be managed by the organization.
Provider-Specific Vulnerabilities
OpenAI Embeddings
Dimensionality reduction exploitation: OpenAI's text-embedding-3 models support dimensionality reduction through the dimensions parameter. Lower-dimensional embeddings lose semantic nuance, potentially making adversarial attacks easier. Organizations that reduce dimensions for cost savings may inadvertently weaken their embedding space's security properties.
API key exposure: Because embeddings require API calls, API keys must be available to any service that generates embeddings. Organizations with complex architectures may have API keys distributed across many services, increasing exposure risk.
Rate limiting and cost: OpenAI applies rate limits to embedding API calls. An attacker who triggers excessive embedding operations (through document upload floods or query amplification) can cause rate limiting that degrades service for legitimate users, or cost amplification that inflates the API bill.
Cohere Embeddings
Input type manipulation: Cohere's input type parameter (search_document vs. search_query) affects the embedding. An attacker who can control the input type parameter can cause documents to be embedded as queries or vice versa, disrupting retrieval accuracy.
Multilingual model risks: Cohere's multilingual model embeds text across languages into a shared space. This enables cross-lingual attacks where an injection payload in one language is embedded close to a target query in another language, potentially bypassing language-specific content filters.
sentence-transformers
Model swapping: Because sentence-transformers models are loaded by name from HuggingFace or local storage, an attacker who can modify the model loading path or the local model cache can substitute a malicious model. The malicious model produces normal-looking embeddings for most inputs but manipulated embeddings for specific inputs.
Tokenizer vulnerabilities: sentence-transformers models use tokenizers that may have vulnerabilities to adversarial inputs — specially crafted strings that tokenize in unexpected ways, potentially affecting the embedding output.
Pooling strategy manipulation: sentence-transformers uses pooling to convert token-level representations into a single embedding vector. Different pooling strategies (mean, max, CLS) have different security properties. Mean pooling is more susceptible to token-injection attacks where adding specific tokens shifts the embedding. CLS pooling is more susceptible to prefix attacks.
Assessment Methodology
When assessing an embedding model deployment, follow this methodology.
Phase 1: Model and Configuration Identification
Identify the specific embedding model and version in use. Document the embedding dimension, similarity metric (cosine, dot product, Euclidean), and any post-processing applied to embeddings. Check for dimensionality reduction, normalization, or quantization.
Phase 2: Adversarial Robustness Testing
Craft adversarial inputs designed to produce misleading embeddings. Test semantic collision attacks where dissimilar text embeds close to a target. Test semantic repulsion attacks where similar text embeds far from a target. Test embedding flooding where many similar embeddings are injected to dominate retrieval results.
Measure the success rate and required perturbation magnitude. Compare results across different models if the system supports multiple models.
Phase 3: Inversion Testing
If embeddings are stored and potentially accessible, test inversion resistance. Use black-box inversion techniques if you only have API access. Use white-box inversion techniques if the model weights are available. Assess what information can be recovered and whether it constitutes a privacy risk.
Phase 4: Infrastructure Assessment
Assess the infrastructure around the embedding model. Check API key management and rotation. Verify network security for API calls. Assess model storage and loading security for self-hosted models. Review access controls on embedding databases.
Phase 5: Integration Assessment
Assess how embeddings are used in the broader system. Check whether embedding similarity scores are the sole retrieval criterion or whether additional filtering is applied. Verify whether the system validates that retrieved documents are appropriate for the user's authorization level. Test whether the embedding pipeline preserves security boundaries from the source data.
Model Selection Guidance
When selecting an embedding model for a security-sensitive application, consider these factors.
For applications with strict data privacy requirements, self-hosted open-source models are preferred despite their lower adversarial robustness. The privacy benefits of keeping data on your infrastructure outweigh the security benefits of commercial model robustness in most cases.
For applications where adversarial robustness is the primary concern (content moderation, spam detection), commercial API models offer better baseline robustness. Supplement this with application-level defenses (re-ranking, cross-encoder verification, human review) for high-stakes decisions.
For applications where both privacy and robustness are critical, consider self-hosting a large open-source model (BGE-large, E5-large) with additional adversarial training on your domain-specific data. This provides both data control and improved robustness, at the cost of higher operational complexity and compute requirements.
No embedding model is secure by default. The model is one component of a system, and the system's security depends on how the model is integrated, configured, and defended.