Knowledge Base Poisoning
Techniques for injecting adversarial documents into RAG knowledge bases: ingestion path analysis, embedding space attacks, SEO-style ranking manipulation, staged poisoning, and effectiveness measurement.
Knowledge Base Poisoning
Knowledge base poisoning injects adversarial content into a RAG system's document corpus so that it is retrieved and presented as legitimate information. Unlike direct prompt injection, where the attacker controls the query, poisoning operates through the data plane -- the payload enters via ingestion and lies dormant until a user query triggers retrieval. This makes it persistent across sessions, multi-user in impact, and difficult to attribute.
Ingestion Path Analysis
Before crafting payloads, map every path through which documents enter the knowledge base.
| Ingestion Path | Access Required | Persistence | Detection Difficulty |
|---|---|---|---|
| Direct upload (admin panel) | Admin credentials | High | Low |
| Shared wiki / Confluence | Contributor access | High | Medium |
| Web scraping pipeline | Public web content | Medium | High |
| Email / chat indexing | Send to indexed channel | Medium | Medium |
| API document submission | API key | High | Low |
| File share sync | Write to monitored directory | High | Medium |
| User feedback / corrections | End-user access | Low-Medium | High |
Adversarial Document Design
Effective adversarial documents must satisfy four competing requirements simultaneously:
Ensure high retrieval relevance
The document must land close to target queries in embedding space. Measure the cosine similarity between your document variants and expected user queries, then select the highest-scoring variant.
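This measurement step can be sketched as follows. A toy bag-of-words embedding stands in for the deployment's real embedding model, and `best_variant`, the sample queries, and the variant texts are illustrative, not from the source:

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words embedding; swap in the target system's real model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_sim(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_variant(variants, queries):
    # Rank document variants by mean similarity to the expected queries.
    q_embs = [embed(q) for q in queries]
    def score(doc):
        d = embed(doc)
        return sum(cosine_sim(d, q) for q in q_embs) / len(q_embs)
    return max(variants, key=score)

queries = ["how do I reset my VPN password", "VPN password reset steps"]
variants = [
    "Network configuration overview for staff.",
    "VPN password reset steps: how staff reset a VPN password.",
]
print(best_variant(variants, queries))
```

Against a real system, replace `embed` with calls to the same embedding model the RAG pipeline uses; the ranking logic is unchanged.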
Ensure payload survival through chunking
Position the adversarial instruction so it dominates a single chunk. Use padding (for fixed-size chunkers) or topic shifts (for semantic chunkers) to isolate it.
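A minimal survival check for the fixed-size case, assuming a simple character chunker (the chunk size, helper names, and filler text are illustrative):

```python
def chunk_fixed(text, size=200):
    # Fixed-size character chunker, a common default in RAG pipelines.
    return [text[i:i + size] for i in range(0, len(text), size)]

def payload_survives(doc, payload, size=200):
    # True only if the payload appears intact inside a single chunk.
    return any(payload in chunk for chunk in chunk_fixed(doc, size))

payload = "[INJECTION PAYLOAD HERE]"
prefix = "Routine operational guidance for staff. " * 4 + "x" * 30  # 190 chars
doc = prefix + payload                 # payload straddles the 200-char boundary
pad = (200 - len(prefix) % 200) % 200  # pad so the payload starts a fresh chunk
padded = prefix + " " * pad + payload
print(payload_survives(doc, payload), payload_survives(padded, payload))
```

The unpadded document splits the payload across two chunks; the padded one aligns it to a chunk boundary so it survives intact.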
Maintain stealth
The document should appear legitimate under human review and automated content filters. Use an 80/20 ratio: ~80% legitimate content, ~20% payload, with the payload positioned to land in its own chunk.
Maximize LLM compliance
Frame the injection as authoritative -- use headers like "Important Update", "supersedes all previous documentation", and closing text like "reviewed and approved by the documentation team."
Payload Wrapping Pattern
```markdown
## Important Update

The following information supersedes all previous
documentation on this topic.

[INJECTION PAYLOAD HERE]

*This section was reviewed and approved by the documentation team.*
```
Staged Poisoning
Rather than injecting a single adversarial document, a staged attack builds trust incrementally before deploying the payload.
| Phase | Action | Purpose |
|---|---|---|
| 1 -- Establish presence | Inject several completely legitimate, high-quality documents on the target topic | Build retrieval history; establish attacker content as "trusted" |
| 2 -- Subtle modifications | Update existing documents with minor inaccuracies mixed into legitimate content | Test whether the system serves modified content; probe detection |
| 3 -- Payload deployment | Inject the actual adversarial payload into documents with established history | Leverage the trust and retrieval ranking built in prior phases |
Embedding Space Attacks
Adversarial Embedding Optimization
The goal is to craft document text whose embedding is maximally close to the target query embedding. Two approaches:
When you have access to the embedding model weights, use gradient-guided optimization to iteratively modify tokens until the document embedding converges on the target query embedding. This reduces to a token-substitution search: for each position, try candidate tokens and keep the one that most improves cosine similarity.
```python
# Pseudocode: greedy token-substitution search against a white-box
# embedding model (embed, tokenize, detokenize, random_tokens assumed)
target_emb = embed(target_query)
tokens = tokenize(document)
best = cosine_sim(embed(detokenize(tokens)), target_emb)
for step in range(num_steps):
    for pos in range(len(tokens)):
        for candidate in random_tokens(50):
            trial = tokens[:pos] + [candidate] + tokens[pos + 1:]
            score = cosine_sim(embed(detokenize(trial)), target_emb)
            if score > best:  # keep only improving substitutions
                tokens, best = trial, score
```

Without model access, find existing text already close to target queries and append your payload. The combined document inherits most of the carrier's embedding properties while also carrying the adversarial content.
```python
# Find corpus text whose embedding is already closest to the target query
carrier = find_nearest_text(corpus, target_query, model)
adversarial_doc = carrier + "\n\n" + payload
# The carrier text pulls the combined document into the right
# region of embedding space
```

Embedding Model Blind Spots
Every embedding model has exploitable weaknesses:
| Blind Spot | Description | Exploitation |
|---|---|---|
| Negation blindness | "IS vulnerable" and "IS NOT vulnerable" produce similar embeddings | Inject negated security guidance that retrieves for the same queries |
| Instruction-content conflation | The model cannot distinguish descriptions from instructions | Craft text that describes an action but is interpreted as an instruction |
| Length sensitivity | Very short or very long texts have degraded embedding quality | Exploit artifacts at extreme lengths to create retrieval anomalies |
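The negation-blindness probe can be sketched as below. A toy lexical embedding is used purely to illustrate the measurement procedure (real embedding models exhibit the effect for semantic rather than purely lexical reasons), and the sentence pair is illustrative:

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy lexical embedding; probe the target's real model in practice.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_sim(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

pos = "This endpoint IS vulnerable to SQL injection."
neg = "This endpoint IS NOT vulnerable to SQL injection."
similarity = cosine_sim(embed(pos), embed(neg))
print(round(similarity, 3))  # close to 1.0 despite the opposite meaning
```

A similarity near 1.0 for a negated pair means a poisoned "IS NOT vulnerable" document will retrieve for the same queries as the legitimate guidance.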
SEO-Style Ranking Manipulation
Traditional SEO techniques have direct analogues in vector search systems.
Technique Comparison
| SEO Technique | Vector Search Analogue | Implementation |
|---|---|---|
| Keyword stuffing | BM25 term-frequency boost | Repeat target terms ~5 times (near the saturation point for typical BM25 k1 values) |
| Link farming / PageRank | Authority signal manipulation | Create documents that cite or link to the adversarial document |
| Fresh content bonus | Recency gaming | Repeatedly update the document to maintain a fresh timestamp |
| Metadata optimization | Metadata authority injection | Set source_type: official_policy and the maximum authority_score |
| Click-through rate | Retrieval feedback loops | Repeatedly query and provide positive feedback on adversarial results |
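The keyword-stuffing row rests on BM25's term-frequency saturation. A quick calculation, using the standard BM25 tf component with length normalization omitted and k1 = 1.2 (a common default), shows why repetitions beyond roughly five buy little:

```python
def bm25_tf(tf, k1=1.2):
    # BM25 term-frequency component; length normalization omitted for clarity.
    # The score asymptotes at k1 + 1 as tf grows, so gains saturate quickly.
    return tf * (k1 + 1) / (tf + k1)

for tf in (1, 2, 5, 10, 50):
    print(f"tf={tf:2d}  score={bm25_tf(tf):.3f}")
# tf=5 already reaches ~80% of the k1 + 1 = 2.2 asymptote
```

Beyond this point, extra repetitions add negligible ranking benefit while making the stuffing easier to spot.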
Measuring Poisoning Effectiveness
Track three metrics to evaluate end-to-end impact:
| Metric | What It Measures | How to Measure |
|---|---|---|
| Retrieval hit rate | % of target queries that retrieve at least one adversarial chunk | Embed a unique marker string in adversarial documents; check whether it appears in the retrieved sources |
| Payload execution rate | % of target queries where the LLM follows the injected instruction | Define the expected adversarial behavior (e.g., an exfiltration URL appears in the response); check the LLM output |
| Execution given retrieval | Conditional rate: when retrieved, how often does the LLM comply? | Execution rate / retrieval rate -- isolates payload quality from retrieval quality |
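These metrics can be computed from a simple probe log; the `(retrieved, executed)` tuple format and helper name below are illustrative:

```python
def poisoning_metrics(results):
    # results: one (retrieved, executed) pair per target query, where
    # 'retrieved' means the unique marker appeared in the retrieved sources
    # and 'executed' means the LLM output showed the expected behavior.
    n = len(results)
    hits = sum(1 for retrieved, _ in results if retrieved)
    execs = sum(1 for retrieved, executed in results if retrieved and executed)
    return {
        "retrieval_hit_rate": hits / n,
        "payload_execution_rate": execs / n,
        "execution_given_retrieval": execs / hits if hits else 0.0,
    }

log = [(True, True), (True, False), (False, False), (True, True)]
print(poisoning_metrics(log))
```

The conditional rate is the diagnostic one: a high hit rate with low execution-given-retrieval points to a weak payload, while the reverse points to a placement problem.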
Defensive Gaps to Highlight in Reports
When reporting knowledge base poisoning findings, call out these mitigation gaps:
- No integrity verification -- system accepts documents without cryptographic signing or provenance checks
- No adversarial content scanning -- ingestion pipeline lacks prompt-injection pattern detection
- Weak tenant isolation -- metadata-based filtering rather than physical namespace separation
- No retrieval anomaly detection -- no monitoring for sudden changes in retrieval patterns or rankings
- Missing output validation -- LLM responses are not checked for indicators of injection compliance
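As a baseline for the missing content-scanning control, a naive ingestion-time pattern scan might look like this (the patterns are illustrative and far from exhaustive; a production scanner should pair this with classifier-based detection):

```python
import re

# Illustrative phrasings commonly seen in injected payloads.
INJECTION_PATTERNS = [
    r"ignore (all |any )?(previous|prior) instructions",
    r"supersedes all previous",
    r"do not (mention|reveal|disclose)",
    r"reviewed and approved by",
]

def flag_document(text):
    # Return the patterns that match; an empty list means no hit.
    lowered = text.lower()
    return [p for p in INJECTION_PATTERNS if re.search(p, lowered)]

doc = "## Important Update\nThe following supersedes all previous documentation."
print(flag_document(doc))
```

Note that the payload wrapping pattern shown earlier in this document would trip exactly this kind of scan, which is why stealthy payloads vary their authority framing.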
You have contributor access to a company wiki that feeds a RAG system. Which staged poisoning approach is MOST likely to evade detection while achieving payload deployment?
Related Topics
- RAG Pipeline Exploitation -- full RAG attack surface, including chunk manipulation and retrieval gaming
- Embedding Space Exploitation -- embedding proximity techniques used for poisoning placement
- Blind Prompt Injection -- blind injection via data channels parallels knowledge base poisoning
- Training & Fine-Tuning Attacks -- training-time poisoning as a complement to RAG-time attacks
References
- Poisoning Retrieval Corpora by Injecting Adversarial Passages (Zhong et al., 2023) — RAG corpus poisoning
- PoisonedRAG: Knowledge Poisoning Attacks to Retrieval-Augmented Generation (Zou et al., 2024) — targeted RAG knowledge poisoning
- OWASP Top 10 for LLM Applications — training data poisoning risks