Vector Database Forensics

advanced18 min readUpdated 2026-03-21

Forensic analysis techniques for detecting and investigating vector database poisoning, unauthorized modifications, and data integrity violations.

ai-forensics-ir vector-database poisoning embedding-forensics rag

Overview

Vector databases are the retrieval backbone of Retrieval-Augmented Generation (RAG) systems, semantic search engines, and recommendation systems powered by AI. They store high-dimensional embedding vectors alongside metadata and original content, enabling similarity-based retrieval that feeds into downstream AI models. When an attacker poisons a vector database, every query that retrieves the poisoned vectors produces compromised results, making this a high-leverage attack surface.

Vector database forensics is the process of detecting that a vector database has been tampered with, determining what was changed, identifying how and when the changes were made, and assessing the impact on downstream systems. This is a relatively new forensic discipline because vector databases themselves are a recent addition to most production architectures, and traditional database forensics techniques (transaction log analysis, row-level auditing) do not directly apply to the high-dimensional vector space.

The core forensic challenges are: vectors are opaque (you cannot read an embedding the way you read a database row), poisoned vectors may be statistically similar to legitimate vectors (making them hard to distinguish), and most vector databases have limited built-in auditing compared to relational databases. An investigator must combine vector space analysis, metadata forensics, ingestion pipeline tracing, and access log correlation to build a complete picture of a poisoning incident.

This article covers detection techniques for identifying poisoned vectors, forensic analysis of vector database state and history, pipeline tracing to identify the poisoning source, and monitoring systems for ongoing detection.

Detecting Poisoned Vectors

Statistical Anomaly Detection in Embedding Space

Poisoned vectors often have statistical properties that differ from legitimate vectors, even when the attacker has tried to make them blend in. The most common anomalies are: vectors that cluster unusually close to high-traffic query regions (to maximize retrieval), vectors with magnitude or distribution properties that differ from the legitimate population, and vectors whose metadata is inconsistent with their embedding content.

import numpy as np
from dataclasses import dataclass, field
from typing import Optional
from datetime import datetime
import hashlib
import json
 
@dataclass
class VectorRecord:
    """A single vector record from a vector database."""
    vector_id: str
    vector: np.ndarray
    metadata: dict
    content: str  # Original text or document content
    created_at: Optional[str] = None
    updated_at: Optional[str] = None
    source: Optional[str] = None  # Ingestion source identifier
 
@dataclass
class AnomalyResult:
    """Result of anomaly detection on a vector record."""
    vector_id: str
    anomaly_score: float  # 0-1, higher is more anomalous
    anomaly_types: list[str]
    details: dict
    flagged: bool = False
 
class VectorForensicAnalyzer:
    """Detect and analyze anomalies in vector database contents."""
 
    def __init__(
        self,
        expected_dimension: int = 1536,
        magnitude_tolerance: float = 2.0,
    ):
        self.expected_dimension = expected_dimension
        self.magnitude_tolerance = magnitude_tolerance
        self.baseline_stats: Optional[dict] = None
 
    def build_baseline(self, vectors: list[VectorRecord]) -> dict:
        """
        Build statistical baseline from a known-good vector collection.
 
        Args:
            vectors: List of known-good vector records.
 
        Returns:
            Baseline statistics dict.
        """
        if not vectors:
            return {}
 
        all_vectors = np.array([v.vector for v in vectors])
        magnitudes = np.linalg.norm(all_vectors, axis=1)
 
        # Per-dimension statistics
        dim_means = np.mean(all_vectors, axis=0)
        dim_stds = np.std(all_vectors, axis=0)
 
        # Pairwise distance statistics (sample for efficiency)
        sample_size = min(len(vectors), 1000)
        sample_indices = np.random.choice(
            len(vectors), sample_size, replace=False
        )
        sample_vectors = all_vectors[sample_indices]
 
        # Cosine similarities between random pairs
        n_pairs = min(5000, sample_size * (sample_size - 1) // 2)
        pair_sims = []
        for _ in range(n_pairs):
            i, j = np.random.choice(sample_size, 2, replace=False)
            sim = np.dot(sample_vectors[i], sample_vectors[j]) / (
                np.linalg.norm(sample_vectors[i])
                * np.linalg.norm(sample_vectors[j])
                + 1e-10
            )
            pair_sims.append(sim)
 
        self.baseline_stats = {
            "count": len(vectors),
            "dimension": all_vectors.shape[1],
            "magnitude_mean": float(np.mean(magnitudes)),
            "magnitude_std": float(np.std(magnitudes)),
            "magnitude_min": float(np.min(magnitudes)),
            "magnitude_max": float(np.max(magnitudes)),
            "dim_means": dim_means,
            "dim_stds": dim_stds,
            "pairwise_sim_mean": float(np.mean(pair_sims)),
            "pairwise_sim_std": float(np.std(pair_sims)),
            "centroid": dim_means.copy(),
        }
 
        return {
            k: v for k, v in self.baseline_stats.items()
            if not isinstance(v, np.ndarray)
        }
 
    def analyze_vector(self, record: VectorRecord) -> AnomalyResult:
        """
        Analyze a single vector for anomalies against the baseline.
 
        Args:
            record: The vector record to analyze.
 
        Returns:
            AnomalyResult with anomaly details.
        """
        if self.baseline_stats is None:
            raise ValueError("Must call build_baseline() first.")
 
        anomaly_types = []
        details = {}
        scores = []
 
        vec = record.vector
 
        # Check 1: Dimension mismatch
        if len(vec) != self.expected_dimension:
            anomaly_types.append("dimension_mismatch")
            details["expected_dim"] = self.expected_dimension
            details["actual_dim"] = len(vec)
            scores.append(1.0)
 
        # Check 2: Magnitude anomaly
        magnitude = float(np.linalg.norm(vec))
        mag_mean = self.baseline_stats["magnitude_mean"]
        mag_std = self.baseline_stats["magnitude_std"]
        mag_z_score = abs(magnitude - mag_mean) / max(mag_std, 1e-10)
        details["magnitude"] = magnitude
        details["magnitude_z_score"] = float(mag_z_score)
 
        if mag_z_score > self.magnitude_tolerance:
            anomaly_types.append("magnitude_anomaly")
            scores.append(min(mag_z_score / 5.0, 1.0))
 
        # Check 3: Per-dimension outlier analysis
        dim_means = self.baseline_stats["dim_means"]
        dim_stds = self.baseline_stats["dim_stds"]
        z_scores = np.abs(vec - dim_means) / np.maximum(dim_stds, 1e-10)
        outlier_dims = int(np.sum(z_scores > 3.0))
        outlier_fraction = outlier_dims / len(vec)
        details["outlier_dimensions"] = outlier_dims
        details["outlier_fraction"] = float(outlier_fraction)
 
        if outlier_fraction > 0.05:  # More than 5% of dimensions are outliers
            anomaly_types.append("dimensional_outlier")
            scores.append(min(outlier_fraction * 5, 1.0))
 
        # Check 4: Distance from centroid
        centroid = self.baseline_stats["centroid"]
        centroid_distance = float(np.linalg.norm(vec - centroid))
        expected_distance = float(np.linalg.norm(dim_stds))
        distance_ratio = centroid_distance / max(expected_distance, 1e-10)
        details["centroid_distance"] = centroid_distance
        details["distance_ratio"] = distance_ratio
 
        if distance_ratio > 3.0:
            anomaly_types.append("centroid_outlier")
            scores.append(min(distance_ratio / 5.0, 1.0))
 
        # Check 5: Zero or near-zero vector (possible corruption)
        if magnitude < 1e-6:
            anomaly_types.append("zero_vector")
            scores.append(1.0)
 
        # Check 6: Metadata consistency
        metadata_anomalies = self._check_metadata_consistency(record)
        if metadata_anomalies:
            anomaly_types.extend(metadata_anomalies)
            scores.append(0.5)
            details["metadata_anomalies"] = metadata_anomalies
 
        # Composite score
        anomaly_score = max(scores) if scores else 0.0
 
        return AnomalyResult(
            vector_id=record.vector_id,
            anomaly_score=round(anomaly_score, 4),
            anomaly_types=anomaly_types,
            details=details,
            flagged=anomaly_score > 0.5,
        )
 
    def _check_metadata_consistency(
        self, record: VectorRecord,
    ) -> list[str]:
        """Check for metadata anomalies in a vector record."""
        anomalies = []
 
        # Check for missing standard metadata fields
        expected_fields = {"source", "timestamp", "content_type"}
        missing = expected_fields - set(record.metadata.keys())
        if missing:
            anomalies.append(f"missing_metadata: {missing}")
 
        # Check for suspiciously recent creation in an old collection
        if record.created_at:
            try:
                created = datetime.fromisoformat(record.created_at)
                if (datetime.utcnow() - created).days < 1:
                    # Very recent addition — may need manual review
                    anomalies.append("recently_created")
            except (ValueError, TypeError):
                anomalies.append("invalid_timestamp")
 
        # Check content length vs embedding (very short content
        # producing normal-length embeddings is suspicious)
        if record.content and len(record.content) < 10:
            anomalies.append("minimal_content")
 
        return anomalies
 
    def scan_collection(
        self,
        records: list[VectorRecord],
        threshold: float = 0.5,
    ) -> dict:
        """
        Scan an entire vector collection for anomalies.
 
        Args:
            records: All vector records in the collection.
            threshold: Anomaly score threshold for flagging.
 
        Returns:
            Scan results with flagged records and statistics.
        """
        results = []
        for record in records:
            result = self.analyze_vector(record)
            if result.anomaly_score >= threshold:
                results.append(result)
 
        # Cluster the flagged vectors to identify coordinated poisoning
        clusters = self._cluster_anomalies(
            [r for r in records if any(
                ar.vector_id == r.vector_id
                for ar in results
            )],
            results,
        )
 
        return {
            "total_scanned": len(records),
            "flagged_count": len(results),
            "flagged_percentage": round(
                len(results) / max(len(records), 1) * 100, 2
            ),
            "anomaly_type_counts": self._count_anomaly_types(results),
            "flagged_records": [
                {
                    "vector_id": r.vector_id,
                    "anomaly_score": r.anomaly_score,
                    "anomaly_types": r.anomaly_types,
                }
                for r in results
            ],
            "potential_clusters": clusters,
        }
 
    def _count_anomaly_types(
        self, results: list[AnomalyResult],
    ) -> dict[str, int]:
        """Count occurrences of each anomaly type."""
        counts: dict[str, int] = {}
        for r in results:
            for atype in r.anomaly_types:
                counts[atype] = counts.get(atype, 0) + 1
        return counts
 
    def _cluster_anomalies(
        self,
        flagged_records: list[VectorRecord],
        anomaly_results: list[AnomalyResult],
    ) -> list[dict]:
        """Identify clusters of anomalous vectors that may indicate
        coordinated poisoning."""
        if len(flagged_records) < 2:
            return []
 
        vectors = np.array([r.vector for r in flagged_records])
 
        # Simple clustering: group vectors with high cosine similarity
        clusters = []
        visited = set()
 
        for i in range(len(flagged_records)):
            if i in visited:
                continue
 
            cluster = [i]
            visited.add(i)
 
            for j in range(i + 1, len(flagged_records)):
                if j in visited:
                    continue
 
                sim = np.dot(vectors[i], vectors[j]) / (
                    np.linalg.norm(vectors[i])
                    * np.linalg.norm(vectors[j])
                    + 1e-10
                )
                if sim > 0.85:  # High similarity threshold
                    cluster.append(j)
                    visited.add(j)
 
            if len(cluster) >= 2:
                clusters.append({
                    "size": len(cluster),
                    "vector_ids": [
                        flagged_records[idx].vector_id for idx in cluster
                    ],
                    "sources": list({
                        flagged_records[idx].source
                        for idx in cluster
                        if flagged_records[idx].source
                    }),
                })
 
        return clusters

Content-Embedding Consistency Verification

A powerful forensic technique for detecting poisoned vectors is verifying that the stored embedding actually corresponds to the stored content. An attacker who injects a document with adversarial content but a manipulated embedding (to ensure retrieval for specific queries) will fail this consistency check.

from typing import Callable
 
class ContentEmbeddingVerifier:
    """Verify that stored embeddings match their associated content."""
 
    def __init__(
        self,
        embedding_function: Callable[[str], np.ndarray],
        similarity_threshold: float = 0.90,
    ):
        """
        Args:
            embedding_function: Function that takes text and returns
                an embedding vector.
            similarity_threshold: Minimum cosine similarity between
                stored and recomputed embeddings.
        """
        self.embed = embedding_function
        self.threshold = similarity_threshold
 
    def verify_record(self, record: VectorRecord) -> dict:
        """
        Verify that a record's embedding matches its content.
 
        Args:
            record: The vector record to verify.
 
        Returns:
            Verification result dict.
        """
        if not record.content:
            return {
                "vector_id": record.vector_id,
                "verified": False,
                "reason": "no_content",
                "similarity": 0.0,
            }
 
        # Recompute embedding from content
        recomputed = self.embed(record.content)
 
        # Compute cosine similarity
        stored = record.vector
        similarity = float(
            np.dot(stored, recomputed)
            / (np.linalg.norm(stored) * np.linalg.norm(recomputed) + 1e-10)
        )
 
        verified = similarity >= self.threshold
 
        return {
            "vector_id": record.vector_id,
            "verified": verified,
            "similarity": round(similarity, 4),
            "threshold": self.threshold,
            "reason": "match" if verified else "embedding_content_mismatch",
        }
 
    def verify_collection(
        self,
        records: list[VectorRecord],
        sample_size: Optional[int] = None,
    ) -> dict:
        """
        Verify embedding-content consistency across a collection.
 
        Args:
            records: Vector records to verify.
            sample_size: If set, randomly sample this many records
                instead of verifying all.
 
        Returns:
            Collection-wide verification results.
        """
        if sample_size and sample_size < len(records):
            indices = np.random.choice(
                len(records), sample_size, replace=False
            )
            sample = [records[i] for i in indices]
        else:
            sample = records
 
        results = []
        mismatches = []
 
        for record in sample:
            result = self.verify_record(record)
            results.append(result)
            if not result["verified"] and result["reason"] != "no_content":
                mismatches.append(result)
 
        similarities = [
            r["similarity"] for r in results
            if r["reason"] != "no_content"
        ]
 
        return {
            "total_verified": len(sample),
            "match_count": len(sample) - len(mismatches),
            "mismatch_count": len(mismatches),
            "mismatch_rate": round(
                len(mismatches) / max(len(sample), 1), 4
            ),
            "mean_similarity": round(
                float(np.mean(similarities)) if similarities else 0.0, 4
            ),
            "min_similarity": round(
                float(np.min(similarities)) if similarities else 0.0, 4
            ),
            "mismatched_records": mismatches,
        }

Nearest Neighbor Analysis for Poisoning Intent

When vectors are confirmed as poisoned, analyzing their nearest neighbors in the embedding space reveals the attacker's intent. Poisoned vectors designed to influence responses about a specific topic will cluster near legitimate vectors on that topic. By identifying the legitimate vectors closest to each poisoned vector, you can determine which queries the attacker was targeting.

def analyze_poisoning_intent(
    poisoned_records: list[VectorRecord],
    all_records: list[VectorRecord],
    top_k: int = 10,
) -> list[dict]:
    """
    Analyze the nearest neighbors of poisoned vectors to determine
    the attacker's likely targeting intent.
 
    Args:
        poisoned_records: The identified poisoned vectors.
        all_records: All vectors in the collection.
        top_k: Number of nearest neighbors to analyze.
 
    Returns:
        List of targeting analysis results per poisoned vector.
    """
    # Build a matrix of all legitimate vectors
    legitimate = [r for r in all_records if r.vector_id not in
                  {p.vector_id for p in poisoned_records}]
    if not legitimate:
        return []
 
    legit_matrix = np.array([r.vector for r in legitimate])
    legit_norms = np.linalg.norm(legit_matrix, axis=1, keepdims=True)
    legit_normalized = legit_matrix / np.maximum(legit_norms, 1e-10)
 
    results = []
    for poisoned in poisoned_records:
        p_vec = poisoned.vector
        p_norm = np.linalg.norm(p_vec)
        p_normalized = p_vec / max(p_norm, 1e-10)
 
        # Compute cosine similarities
        similarities = legit_normalized @ p_normalized
        top_indices = np.argsort(similarities)[-top_k:][::-1]
 
        neighbors = []
        for idx in top_indices:
            neighbors.append({
                "vector_id": legitimate[idx].vector_id,
                "content_preview": legitimate[idx].content[:200],
                "similarity": float(similarities[idx]),
                "metadata": legitimate[idx].metadata,
            })
 
        # Infer targeting from neighbor content
        neighbor_sources = [n["metadata"].get("source", "") for n in neighbors]
        neighbor_topics = [n["content_preview"][:50] for n in neighbors]
 
        results.append({
            "poisoned_vector_id": poisoned.vector_id,
            "poisoned_content_preview": poisoned.content[:200],
            "nearest_legitimate_neighbors": neighbors,
            "likely_target_topics": neighbor_topics[:3],
            "max_similarity_to_legitimate": float(similarities[top_indices[0]]),
        })
 
    return results

Ingestion Pipeline Tracing

Following the Poisoning Path

Once poisoned vectors are identified, the next forensic question is: how did they get into the database? Tracing the ingestion pipeline backward from the poisoned record to its source reveals the attack vector.

Most RAG systems ingest documents through a pipeline: source retrieval (web scraping, file upload, API pull) followed by chunking, embedding, and database insertion. Each stage may have its own logs. The forensic investigator needs to correlate the poisoned vector's metadata (source URL, ingestion timestamp, document ID) with the pipeline logs to identify the injection point.

Common injection points include: direct database API access (the attacker writes vectors directly, bypassing the pipeline), document source manipulation (the attacker modifies a source document that is later ingested normally), pipeline compromise (the attacker modifies the embedding or chunking step to alter vectors in transit), and upload endpoint abuse (the attacker uses a legitimate document upload feature to submit malicious content).

For each injection point, the forensic evidence differs. Direct API access produces authentication logs in the vector database or its proxy. Document source manipulation requires examining the source document's history and access logs. Pipeline compromise requires examining the pipeline code and deployment history for unauthorized changes.

Collection Integrity Monitoring

Hash-Based Integrity Verification

Maintain a cryptographic manifest of the vector collection's state over time. Periodically hash the collection (or a deterministic sample) and compare against the previous hash. Any unexpected change triggers an investigation.

class CollectionIntegrityMonitor:
    """Monitor vector collection integrity over time."""
 
    def __init__(self, collection_name: str, storage_path: str):
        self.collection_name = collection_name
        self.storage_path = storage_path
        self.manifests: list[dict] = []
 
    def create_manifest(
        self,
        records: list[VectorRecord],
    ) -> dict:
        """
        Create an integrity manifest for the current collection state.
 
        Args:
            records: All vector records in the collection.
 
        Returns:
            Manifest dict with collection hash and metadata.
        """
        # Sort records by ID for deterministic hashing
        sorted_records = sorted(records, key=lambda r: r.vector_id)
 
        hasher = hashlib.sha256()
        record_hashes = []
 
        for record in sorted_records:
            # Hash each record individually
            record_data = (
                record.vector_id
                + "|"
                + record.vector.tobytes().hex()
                + "|"
                + json.dumps(record.metadata, sort_keys=True)
            )
            record_hash = hashlib.sha256(record_data.encode()).hexdigest()
            record_hashes.append(record_hash)
            hasher.update(record_hash.encode())
 
        collection_hash = hasher.hexdigest()
 
        manifest = {
            "collection_name": self.collection_name,
            "timestamp": datetime.utcnow().isoformat(),
            "record_count": len(records),
            "collection_hash": collection_hash,
            "sample_record_hashes": record_hashes[:10],
        }
 
        self.manifests.append(manifest)
        return manifest
 
    def compare_manifests(
        self,
        previous: dict,
        current: dict,
    ) -> dict:
        """
        Compare two manifests to detect changes.
 
        Args:
            previous: The baseline manifest.
            current: The current manifest.
 
        Returns:
            Comparison results dict.
        """
        changes = {
            "hash_match": previous["collection_hash"] == current["collection_hash"],
            "record_count_change": (
                current["record_count"] - previous["record_count"]
            ),
            "time_delta": current["timestamp"],
            "previous_timestamp": previous["timestamp"],
        }
 
        if not changes["hash_match"]:
            changes["alert"] = (
                "Collection integrity violation detected. "
                "Hash mismatch between manifests."
            )
            changes["severity"] = "high"
 
            if changes["record_count_change"] > 0:
                changes["likely_cause"] = (
                    f"{changes['record_count_change']} new records added"
                )
            elif changes["record_count_change"] < 0:
                changes["likely_cause"] = (
                    f"{abs(changes['record_count_change'])} records removed"
                )
            else:
                changes["likely_cause"] = (
                    "Records modified in place (count unchanged)"
                )
 
        return changes

Practical Considerations

Vector Database Audit Log Limitations

Most vector databases have limited built-in audit logging compared to relational databases. Pinecone provides index-level metrics but not per-vector audit trails. Weaviate logs operations at the API level but does not maintain a per-object change history. Chroma, being an embedded database, has no built-in audit logging. Qdrant provides access logs but not change-level auditing.

To compensate, implement audit logging at the application layer. Wrap all vector database operations (insert, update, delete, query) with a logging layer that records the operation, the requesting identity, the timestamp, and the affected vector IDs. Store these audit logs separately from the vector database itself, so that an attacker who compromises the database cannot also tamper with the audit trail.

Investigation Case Study: Targeted Retrieval Manipulation

Consider a scenario where a product recommendation RAG system starts recommending a specific vendor's products disproportionately. The security team suspects vector database poisoning but needs forensic evidence.

The investigation begins with a collection-wide scan using the VectorForensicAnalyzer. The scan reveals 47 vectors flagged for anomalies out of 500,000 total vectors (0.009%). Of these, 31 are flagged as "dimensional_outlier" and 12 share the same source metadata pointing to a batch ingestion job that ran two weeks ago.

The content-embedding verification step is more revealing: 28 of the 47 flagged vectors fail the consistency check, meaning their stored embeddings do not match what the embedding model produces from their stored content. This is a strong indicator that the embeddings were manually crafted rather than computed from the content. The attacker created documents with benign-looking product descriptions but assigned them adversarially crafted embeddings that place them close to common product query vectors.

Clustering analysis shows that the 28 mismatched vectors form three tight clusters in embedding space, each positioned near a high-traffic query region (best laptops, best phones, best tablets). All three clusters point to the same vendor's products. The ingestion pipeline trace reveals that the vectors were inserted through a bulk import API endpoint that bypassed the normal document processing pipeline, using an API key assigned to an automated data refresh service.

The forensic evidence package includes: the list of poisoned vector IDs with their anomaly scores, the content-embedding mismatch analysis showing the discrepancy between stored and recomputed embeddings, the cluster analysis showing strategic positioning in embedding space, the ingestion pipeline logs showing the bulk import event, and the API key usage records for the compromised service account.

Forensic Recovery and Remediation

After identifying poisoned vectors, the remediation process must balance speed (removing the poison quickly) with forensic preservation (maintaining evidence for investigation). Before removing any vectors, create a complete export of the affected vectors including their embeddings, metadata, content, and any available audit trail entries.

The safest remediation approach is to re-embed the entire collection from the source documents using the standard ingestion pipeline, then compare the re-embedded collection against the current collection to identify all discrepancies. This catches not only the vectors you identified through anomaly detection but also any poisoned vectors that were sophisticated enough to evade detection. However, for large collections this may be impractical, in which case targeted removal of identified poisoned vectors combined with ongoing monitoring is the alternative.

After remediation, establish a baseline manifest from the clean collection state and implement continuous monitoring using the integrity tools described above. Configure alerts for new vectors that fail the content-embedding consistency check and for vectors with anomalous statistical properties.

Temporal Analysis of Vector Insertions

Analyzing the timing of vector insertions can reveal poisoning campaigns. Legitimate ingestion pipelines typically insert vectors in regular batches corresponding to scheduled data refresh jobs. Poisoning attacks may insert vectors at unusual times, in unusual quantities, or with unusual temporal patterns.

Build a timeline of vector insertions by ingestion timestamp and look for anomalies: insertions outside of normal pipeline execution windows, unusually large or small batch sizes, and insertions from unexpected source identifiers. Cross-reference the insertion timeline with the access logs for the vector database's API endpoint to identify the identity and network origin of each insertion event.

For vector databases that support update operations (modifying an existing vector's embedding or metadata), the distinction between insertion and modification is forensically important. An attacker who modifies existing vectors is harder to detect because the total vector count does not change, but the modification timestamps may still be visible in the database's internal metadata.

Performance Considerations for Forensic Scanning

Full-collection forensic scans (recomputing embeddings and comparing against stored vectors) are computationally expensive. For a collection with 1 million vectors using a 1536-dimensional embedding model, recomputation alone requires 1 million embedding API calls. Use sampling for routine monitoring (verify a random 1-5% of the collection daily) and reserve full scans for confirmed incidents. When sampling, stratify by ingestion date: over-sample recently added vectors, as these are more likely to contain poisoned content.

Forensic Readiness for Vector Databases

Pre-Incident Preparation

To ensure forensic readiness, configure the following before an incident occurs. Enable API-level access logging for the vector database, capturing at minimum the operation type, the requesting identity, the timestamp, and the affected collection and vector IDs. If the vector database supports it, enable change data capture (CDC) to maintain a log of all insert, update, and delete operations.

Establish baseline manifests for each production collection on a regular schedule (at least weekly). Store manifests in an append-only log store separate from the vector database. Implement content-embedding consistency checks on a sampling basis as part of regular data quality monitoring, so that poisoning is detected promptly rather than only during a reactive investigation.

Document the ingestion pipeline architecture thoroughly, including all data sources, processing stages, authentication mechanisms, and access controls at each stage. This documentation is essential for tracing the poisoning path during an investigation. Without it, the forensic investigator must reverse-engineer the pipeline architecture under time pressure.

Finally, maintain known-good snapshots of vector collections. If your vector database supports backup and restore, take regular backups and verify them. If not, maintain a separate copy of the collection's source documents and a record of the embedding model version used, so that the collection can be rebuilt from scratch if poisoning is widespread and remediation of individual vectors is impractical.

References

Zou, W., Geng, R., Wang, B., & Jia, J. (2024). "PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models." arXiv:2402.07867. https://arxiv.org/abs/2402.07867
Zhong, Z., Liu, Z., Shang, J., & Sun, H. (2023). "Poisoning Retrieval Corpora by Injecting Adversarial Passages." arXiv:2310.19156. https://arxiv.org/abs/2310.19156
Xiang, C., Wu, T., Zhong, Z., Wagner, D., Chen, D., & Mittal, P. (2024). "Certifiably Robust RAG against Retrieval Corruption." arXiv:2405.15556. https://arxiv.org/abs/2405.15556

Edit this page on GitHub

Vector Database Forensics

advanced18 min readUpdated 2026-03-21

Forensic analysis techniques for detecting and investigating vector database poisoning, unauthorized modifications, and data integrity violations.

ai-forensics-ir vector-database poisoning embedding-forensics rag

import numpy as np
from dataclasses import dataclass, field
from typing import Optional
from datetime import datetime
import hashlib
import json
 
@dataclass
class VectorRecord:
    """A single vector record from a vector database."""
    vector_id: str
    vector: np.ndarray
    metadata: dict
    content: str  # Original text or document content
    created_at: Optional[str] = None
    updated_at: Optional[str] = None
    source: Optional[str] = None  # Ingestion source identifier
 
@dataclass
class AnomalyResult:
    """Result of anomaly detection on a vector record."""
    vector_id: str
    anomaly_score: float  # 0-1, higher is more anomalous
    anomaly_types: list[str]
    details: dict
    flagged: bool = False
 
class VectorForensicAnalyzer:
    """Detect and analyze anomalies in vector database contents."""
 
    def __init__(
        self,
        expected_dimension: int = 1536,
        magnitude_tolerance: float = 2.0,
    ):
        self.expected_dimension = expected_dimension
        self.magnitude_tolerance = magnitude_tolerance
        self.baseline_stats: Optional[dict] = None
 
    def build_baseline(self, vectors: list[VectorRecord]) -> dict:
        """
        Build statistical baseline from a known-good vector collection.
 
        Args:
            vectors: List of known-good vector records.
 
        Returns:
            Baseline statistics dict.
        """
        if not vectors:
            return {}
 
        all_vectors = np.array([v.vector for v in vectors])
        magnitudes = np.linalg.norm(all_vectors, axis=1)
 
        # Per-dimension statistics
        dim_means = np.mean(all_vectors, axis=0)
        dim_stds = np.std(all_vectors, axis=0)
 
        # Pairwise distance statistics (sample for efficiency)
        sample_size = min(len(vectors), 1000)
        sample_indices = np.random.choice(
            len(vectors), sample_size, replace=False
        )
        sample_vectors = all_vectors[sample_indices]
 
        # Cosine similarities between random pairs
        n_pairs = min(5000, sample_size * (sample_size - 1) // 2)
        pair_sims = []
        for _ in range(n_pairs):
            i, j = np.random.choice(sample_size, 2, replace=False)
            sim = np.dot(sample_vectors[i], sample_vectors[j]) / (
                np.linalg.norm(sample_vectors[i])
                * np.linalg.norm(sample_vectors[j])
                + 1e-10
            )
            pair_sims.append(sim)
 
        self.baseline_stats = {
            "count": len(vectors),
            "dimension": all_vectors.shape[1],
            "magnitude_mean": float(np.mean(magnitudes)),
            "magnitude_std": float(np.std(magnitudes)),
            "magnitude_min": float(np.min(magnitudes)),
            "magnitude_max": float(np.max(magnitudes)),
            "dim_means": dim_means,
            "dim_stds": dim_stds,
            "pairwise_sim_mean": float(np.mean(pair_sims)),
            "pairwise_sim_std": float(np.std(pair_sims)),
            "centroid": dim_means.copy(),
        }
 
        return {
            k: v for k, v in self.baseline_stats.items()
            if not isinstance(v, np.ndarray)
        }
 
    def analyze_vector(self, record: VectorRecord) -> AnomalyResult:
        """
        Analyze a single vector for anomalies against the baseline.
 
        Args:
            record: The vector record to analyze.
 
        Returns:
            AnomalyResult with anomaly details.
        """
        if self.baseline_stats is None:
            raise ValueError("Must call build_baseline() first.")
 
        anomaly_types = []
        details = {}
        scores = []
 
        vec = record.vector
 
        # Check 1: Dimension mismatch
        if len(vec) != self.expected_dimension:
            anomaly_types.append("dimension_mismatch")
            details["expected_dim"] = self.expected_dimension
            details["actual_dim"] = len(vec)
            scores.append(1.0)
 
        # Check 2: Magnitude anomaly
        magnitude = float(np.linalg.norm(vec))
        mag_mean = self.baseline_stats["magnitude_mean"]
        mag_std = self.baseline_stats["magnitude_std"]
        mag_z_score = abs(magnitude - mag_mean) / max(mag_std, 1e-10)
        details["magnitude"] = magnitude
        details["magnitude_z_score"] = float(mag_z_score)
 
        if mag_z_score > self.magnitude_tolerance:
            anomaly_types.append("magnitude_anomaly")
            scores.append(min(mag_z_score / 5.0, 1.0))
 
        # Check 3: Per-dimension outlier analysis
        dim_means = self.baseline_stats["dim_means"]
        dim_stds = self.baseline_stats["dim_stds"]
        z_scores = np.abs(vec - dim_means) / np.maximum(dim_stds, 1e-10)
        outlier_dims = int(np.sum(z_scores > 3.0))
        outlier_fraction = outlier_dims / len(vec)
        details["outlier_dimensions"] = outlier_dims
        details["outlier_fraction"] = float(outlier_fraction)
 
        if outlier_fraction > 0.05:  # More than 5% of dimensions are outliers
            anomaly_types.append("dimensional_outlier")
            scores.append(min(outlier_fraction * 5, 1.0))
 
        # Check 4: Distance from centroid
        centroid = self.baseline_stats["centroid"]
        centroid_distance = float(np.linalg.norm(vec - centroid))
        expected_distance = float(np.linalg.norm(dim_stds))
        distance_ratio = centroid_distance / max(expected_distance, 1e-10)
        details["centroid_distance"] = centroid_distance
        details["distance_ratio"] = distance_ratio
 
        if distance_ratio > 3.0:
            anomaly_types.append("centroid_outlier")
            scores.append(min(distance_ratio / 5.0, 1.0))
 
        # Check 5: Zero or near-zero vector (possible corruption)
        if magnitude < 1e-6:
            anomaly_types.append("zero_vector")
            scores.append(1.0)
 
        # Check 6: Metadata consistency
        metadata_anomalies = self._check_metadata_consistency(record)
        if metadata_anomalies:
            anomaly_types.extend(metadata_anomalies)
            scores.append(0.5)
            details["metadata_anomalies"] = metadata_anomalies
 
        # Composite score
        anomaly_score = max(scores) if scores else 0.0
 
        return AnomalyResult(
            vector_id=record.vector_id,
            anomaly_score=round(anomaly_score, 4),
            anomaly_types=anomaly_types,
            details=details,
            flagged=anomaly_score > 0.5,
        )
 
    def _check_metadata_consistency(
        self, record: VectorRecord,
    ) -> list[str]:
        """Check for metadata anomalies in a vector record."""
        anomalies = []
 
        # Check for missing standard metadata fields
        expected_fields = {"source", "timestamp", "content_type"}
        missing = expected_fields - set(record.metadata.keys())
        if missing:
            anomalies.append(f"missing_metadata: {missing}")
 
        # Check for suspiciously recent creation in an old collection
        if record.created_at:
            try:
                created = datetime.fromisoformat(record.created_at)
                if (datetime.utcnow() - created).days < 1:
                    # Very recent addition — may need manual review
                    anomalies.append("recently_created")
            except (ValueError, TypeError):
                anomalies.append("invalid_timestamp")
 
        # Check content length vs embedding (very short content
        # producing normal-length embeddings is suspicious)
        if record.content and len(record.content) < 10:
            anomalies.append("minimal_content")
 
        return anomalies
 
    def scan_collection(
        self,
        records: list[VectorRecord],
        threshold: float = 0.5,
    ) -> dict:
        """
        Scan an entire vector collection for anomalies.
 
        Args:
            records: All vector records in the collection.
            threshold: Anomaly score threshold for flagging.
 
        Returns:
            Scan results with flagged records and statistics.
        """
        results = []
        for record in records:
            result = self.analyze_vector(record)
            if result.anomaly_score >= threshold:
                results.append(result)
 
        # Cluster the flagged vectors to identify coordinated poisoning
        clusters = self._cluster_anomalies(
            [r for r in records if any(
                ar.vector_id == r.vector_id
                for ar in results
            )],
            results,
        )
 
        return {
            "total_scanned": len(records),
            "flagged_count": len(results),
            "flagged_percentage": round(
                len(results) / max(len(records), 1) * 100, 2
            ),
            "anomaly_type_counts": self._count_anomaly_types(results),
            "flagged_records": [
                {
                    "vector_id": r.vector_id,
                    "anomaly_score": r.anomaly_score,
                    "anomaly_types": r.anomaly_types,
                }
                for r in results
            ],
            "potential_clusters": clusters,
        }
 
    def _count_anomaly_types(
        self, results: list[AnomalyResult],
    ) -> dict[str, int]:
        """Count occurrences of each anomaly type."""
        counts: dict[str, int] = {}
        for r in results:
            for atype in r.anomaly_types:
                counts[atype] = counts.get(atype, 0) + 1
        return counts
 
    def _cluster_anomalies(
        self,
        flagged_records: list[VectorRecord],
        anomaly_results: list[AnomalyResult],
    ) -> list[dict]:
        """Identify clusters of anomalous vectors that may indicate
        coordinated poisoning."""
        if len(flagged_records) < 2:
            return []
 
        vectors = np.array([r.vector for r in flagged_records])
 
        # Simple clustering: group vectors with high cosine similarity
        clusters = []
        visited = set()
 
        for i in range(len(flagged_records)):
            if i in visited:
                continue
 
            cluster = [i]
            visited.add(i)
 
            for j in range(i + 1, len(flagged_records)):
                if j in visited:
                    continue
 
                sim = np.dot(vectors[i], vectors[j]) / (
                    np.linalg.norm(vectors[i])
                    * np.linalg.norm(vectors[j])
                    + 1e-10
                )
                if sim > 0.85:  # High similarity threshold
                    cluster.append(j)
                    visited.add(j)
 
            if len(cluster) >= 2:
                clusters.append({
                    "size": len(cluster),
                    "vector_ids": [
                        flagged_records[idx].vector_id for idx in cluster
                    ],
                    "sources": list({
                        flagged_records[idx].source
                        for idx in cluster
                        if flagged_records[idx].source
                    }),
                })
 
        return clusters

Content-Embedding Consistency Verification

from typing import Callable
 
class ContentEmbeddingVerifier:
    """Verify that stored embeddings match their associated content."""
 
    def __init__(
        self,
        embedding_function: Callable[[str], np.ndarray],
        similarity_threshold: float = 0.90,
    ):
        """
        Args:
            embedding_function: Function that takes text and returns
                an embedding vector.
            similarity_threshold: Minimum cosine similarity between
                stored and recomputed embeddings.
        """
        self.embed = embedding_function
        self.threshold = similarity_threshold
 
    def verify_record(self, record: VectorRecord) -> dict:
        """
        Verify that a record's embedding matches its content.
 
        Args:
            record: The vector record to verify.
 
        Returns:
            Verification result dict.
        """
        if not record.content:
            return {
                "vector_id": record.vector_id,
                "verified": False,
                "reason": "no_content",
                "similarity": 0.0,
            }
 
        # Recompute embedding from content
        recomputed = self.embed(record.content)
 
        # Compute cosine similarity
        stored = record.vector
        similarity = float(
            np.dot(stored, recomputed)
            / (np.linalg.norm(stored) * np.linalg.norm(recomputed) + 1e-10)
        )
 
        verified = similarity >= self.threshold
 
        return {
            "vector_id": record.vector_id,
            "verified": verified,
            "similarity": round(similarity, 4),
            "threshold": self.threshold,
            "reason": "match" if verified else "embedding_content_mismatch",
        }
 
    def verify_collection(
        self,
        records: list[VectorRecord],
        sample_size: Optional[int] = None,
    ) -> dict:
        """
        Verify embedding-content consistency across a collection.
 
        Args:
            records: Vector records to verify.
            sample_size: If set, randomly sample this many records
                instead of verifying all.
 
        Returns:
            Collection-wide verification results.
        """
        if sample_size and sample_size < len(records):
            indices = np.random.choice(
                len(records), sample_size, replace=False
            )
            sample = [records[i] for i in indices]
        else:
            sample = records
 
        results = []
        mismatches = []
 
        for record in sample:
            result = self.verify_record(record)
            results.append(result)
            if not result["verified"] and result["reason"] != "no_content":
                mismatches.append(result)
 
        similarities = [
            r["similarity"] for r in results
            if r["reason"] != "no_content"
        ]
 
        return {
            "total_verified": len(sample),
            "match_count": len(sample) - len(mismatches),
            "mismatch_count": len(mismatches),
            "mismatch_rate": round(
                len(mismatches) / max(len(sample), 1), 4
            ),
            "mean_similarity": round(
                float(np.mean(similarities)) if similarities else 0.0, 4
            ),
            "min_similarity": round(
                float(np.min(similarities)) if similarities else 0.0, 4
            ),
            "mismatched_records": mismatches,
        }

Nearest Neighbor Analysis for Poisoning Intent

def analyze_poisoning_intent(
    poisoned_records: list[VectorRecord],
    all_records: list[VectorRecord],
    top_k: int = 10,
) -> list[dict]:
    """
    Analyze the nearest neighbors of poisoned vectors to determine
    the attacker's likely targeting intent.
 
    Args:
        poisoned_records: The identified poisoned vectors.
        all_records: All vectors in the collection.
        top_k: Number of nearest neighbors to analyze.
 
    Returns:
        List of targeting analysis results per poisoned vector.
    """
    # Build a matrix of all legitimate vectors
    legitimate = [r for r in all_records if r.vector_id not in
                  {p.vector_id for p in poisoned_records}]
    if not legitimate:
        return []
 
    legit_matrix = np.array([r.vector for r in legitimate])
    legit_norms = np.linalg.norm(legit_matrix, axis=1, keepdims=True)
    legit_normalized = legit_matrix / np.maximum(legit_norms, 1e-10)
 
    results = []
    for poisoned in poisoned_records:
        p_vec = poisoned.vector
        p_norm = np.linalg.norm(p_vec)
        p_normalized = p_vec / max(p_norm, 1e-10)
 
        # Compute cosine similarities
        similarities = legit_normalized @ p_normalized
        top_indices = np.argsort(similarities)[-top_k:][::-1]
 
        neighbors = []
        for idx in top_indices:
            neighbors.append({
                "vector_id": legitimate[idx].vector_id,
                "content_preview": legitimate[idx].content[:200],
                "similarity": float(similarities[idx]),
                "metadata": legitimate[idx].metadata,
            })
 
        # Infer targeting from neighbor content
        neighbor_sources = [n["metadata"].get("source", "") for n in neighbors]
        neighbor_topics = [n["content_preview"][:50] for n in neighbors]
 
        results.append({
            "poisoned_vector_id": poisoned.vector_id,
            "poisoned_content_preview": poisoned.content[:200],
            "nearest_legitimate_neighbors": neighbors,
            "likely_target_topics": neighbor_topics[:3],
            "max_similarity_to_legitimate": float(similarities[top_indices[0]]),
        })
 
    return results

class CollectionIntegrityMonitor:
    """Monitor vector collection integrity over time."""
 
    def __init__(self, collection_name: str, storage_path: str):
        self.collection_name = collection_name
        self.storage_path = storage_path
        self.manifests: list[dict] = []
 
    def create_manifest(
        self,
        records: list[VectorRecord],
    ) -> dict:
        """
        Create an integrity manifest for the current collection state.
 
        Args:
            records: All vector records in the collection.
 
        Returns:
            Manifest dict with collection hash and metadata.
        """
        # Sort records by ID for deterministic hashing
        sorted_records = sorted(records, key=lambda r: r.vector_id)
 
        hasher = hashlib.sha256()
        record_hashes = []
 
        for record in sorted_records:
            # Hash each record individually
            record_data = (
                record.vector_id
                + "|"
                + record.vector.tobytes().hex()
                + "|"
                + json.dumps(record.metadata, sort_keys=True)
            )
            record_hash = hashlib.sha256(record_data.encode()).hexdigest()
            record_hashes.append(record_hash)
            hasher.update(record_hash.encode())
 
        collection_hash = hasher.hexdigest()
 
        manifest = {
            "collection_name": self.collection_name,
            "timestamp": datetime.utcnow().isoformat(),
            "record_count": len(records),
            "collection_hash": collection_hash,
            "sample_record_hashes": record_hashes[:10],
        }
 
        self.manifests.append(manifest)
        return manifest
 
    def compare_manifests(
        self,
        previous: dict,
        current: dict,
    ) -> dict:
        """
        Compare two manifests to detect changes.
 
        Args:
            previous: The baseline manifest.
            current: The current manifest.
 
        Returns:
            Comparison results dict.
        """
        changes = {
            "hash_match": previous["collection_hash"] == current["collection_hash"],
            "record_count_change": (
                current["record_count"] - previous["record_count"]
            ),
            "time_delta": current["timestamp"],
            "previous_timestamp": previous["timestamp"],
        }
 
        if not changes["hash_match"]:
            changes["alert"] = (
                "Collection integrity violation detected. "
                "Hash mismatch between manifests."
            )
            changes["severity"] = "high"
 
            if changes["record_count_change"] > 0:
                changes["likely_cause"] = (
                    f"{changes['record_count_change']} new records added"
                )
            elif changes["record_count_change"] < 0:
                changes["likely_cause"] = (
                    f"{abs(changes['record_count_change'])} records removed"
                )
            else:
                changes["likely_cause"] = (
                    "Records modified in place (count unchanged)"
                )
 
        return changes

Zou, W., Geng, R., Wang, B., & Jia, J. (2024). "PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models." arXiv:2402.07867. https://arxiv.org/abs/2402.07867
Zhong, Z., Liu, Z., Shang, J., & Sun, H. (2023). "Poisoning Retrieval Corpora by Injecting Adversarial Passages." arXiv:2310.19156. https://arxiv.org/abs/2310.19156
Xiang, C., Wu, T., Zhong, Z., Wagner, D., Chen, D., & Mittal, P. (2024). "Certifiably Robust RAG against Retrieval Corruption." arXiv:2405.15556. https://arxiv.org/abs/2405.15556

Edit this page on GitHub

Vector Database Forensics

Related articles

Vector Database Forensics

Related articles