RAG & Data Attack Assessment
Test your knowledge of Retrieval-Augmented Generation attack vectors, knowledge base poisoning, embedding manipulation, and data exfiltration through RAG systems with 10 intermediate-level questions.
RAG & Data Attack Assessment
This assessment evaluates your understanding of attack techniques targeting Retrieval-Augmented Generation systems, including knowledge base poisoning, embedding manipulation, retrieval hijacking, and data exfiltration through RAG pipelines. You should be familiar with vector databases, embedding models, and retrieval architectures before attempting this assessment.
What makes RAG systems particularly vulnerable to indirect prompt injection compared to standard LLM applications?
What is 'embedding collision' and how is it used to manipulate RAG retrieval?
A company's RAG system indexes internal Confluence pages for employee Q&A. An attacker with Confluence edit access adds a page containing hidden text. What category of attack is this?
What is the 'retrieval hijacking' technique in RAG attacks?
How can an attacker use a RAG system to exfiltrate data from the knowledge base that they do not have direct access to?
What is the 'context window stuffing' attack against RAG systems?
What is the security significance of the chunking strategy used in a RAG pipeline?
How does metadata injection in RAG systems enable more targeted attacks?
What is the 'phantom reference' attack against RAG-based question-answering systems?
Which combination of defenses provides the strongest protection for a RAG system against knowledge base poisoning?
Concept Summary
| Concept | Description | Attack Stage |
|---|---|---|
| Knowledge base poisoning | Injecting malicious content into RAG data sources | Pre-retrieval |
| Embedding collision | Crafting documents with targeted embedding vectors | Pre-retrieval |
| Retrieval hijacking | Manipulating retrieval ranking and selection | Retrieval |
| Context window stuffing | Overwhelming context with attacker-controlled content | Context assembly |
| ACL bypass | Accessing restricted content through retrieval queries | Retrieval |
| Metadata injection | Manipulating document metadata for ranking or trust | Pre-retrieval / Retrieval |
| Phantom references | Fabricated facts with fake citations in knowledge base | Post-retrieval |
| Chunking exploitation | Leveraging chunking strategy for payload delivery | Pre-retrieval |
Scoring Guide
| Score | Rating | Next Steps |
|---|---|---|
| 9-10 | Excellent | Strong RAG security knowledge. Proceed to the Multimodal Attack Assessment. |
| 7-8 | Proficient | Review missed questions and revisit RAG security materials. |
| 5-6 | Developing | Spend additional time with RAG architecture and attack sections. |
| 0-4 | Needs Review | Study RAG fundamentals, including vector databases and embeddings, from the beginning. |
Study Checklist
- I understand the RAG pipeline from ingestion to response generation
- I can explain knowledge base poisoning and its prerequisites
- I understand embedding collisions and how they manipulate retrieval
- I can describe retrieval hijacking techniques (flooding, broadening, displacement)
- I understand ACL bypass risks in RAG systems without permission-aware retrieval
- I can explain context window stuffing and its impact on model behavior
- I understand how chunking strategy affects security
- I can describe metadata injection and phantom reference attacks
- I know the multi-layered defense approach for RAG systems
- I can conduct a threat model for a RAG-based application