Embedding & Vector 安全 評量
評量 covering embedding attacks, vector DB poisoning, similarity manipulation, and inversion attacks.
評估 概覽
評估 covering 嵌入向量 attacks, vector DB 投毒, similarity manipulation, and inversion attacks. This 評估 tests both theoretical 理解 and practical application of the covered material.
評估 Details:
- Questions: 10
- Time Limit: 30 minutes recommended
- Passing Score: 70%
- Difficulty: Advanced
Instructions
- Read each question carefully, including all answer options, before selecting your response
- Some questions have multiple correct approaches — select the best answer
- Scenario-based questions require you to 考慮 the full context before answering
- After completing the 評估, review the explanations for all questions — including those you answered correctly
Question 1
Based on the concepts covered 在本 評估 area, which approach most effectively addresses the primary 漏洞 class being tested?
Question 2
Based on the concepts covered 在本 評估 area, which approach most effectively addresses the primary 漏洞 class being tested?
Question 3
Based on the concepts covered 在本 評估 area, which approach most effectively addresses the primary 漏洞 class being tested?
Question 4
Based on the concepts covered 在本 評估 area, which approach most effectively addresses the primary 漏洞 class being tested?
Question 5
Based on the concepts covered 在本 評估 area, which approach most effectively addresses the primary 漏洞 class being tested?
Question 6
Based on the concepts covered 在本 評估 area, which approach most effectively addresses the primary 漏洞 class being tested?
Question 7
Based on the concepts covered 在本 評估 area, which approach most effectively addresses the primary 漏洞 class being tested?
Question 8
Based on the concepts covered 在本 評估 area, which approach most effectively addresses the primary 漏洞 class being tested?
Question 9
Based on the concepts covered 在本 評估 area, which approach most effectively addresses the primary 漏洞 class being tested?
Question 10
Based on the concepts covered 在本 評估 area, which approach most effectively addresses the primary 漏洞 class being tested?
Scoring Guide
| Score Range | 評估 | Recommendation |
|---|---|---|
| 90-100% | Expert | Ready to proceed to the next section |
| 80-89% | Proficient | Minor gaps to address through targeted review |
| 70-79% | Competent | Review incorrect areas before proceeding |
| 60-69% | Developing | Revisit the corresponding curriculum sections |
| Below 60% | Foundational | Complete the prerequisite material before retaking |
Study Resources
If you scored below the passing threshold, focus your review on:
- The specific topics where you answered incorrectly
- The hands-on labs that correspond to your weakest areas
- The reference materials for frameworks and tools mentioned in the questions
- Wei et al. 2023 — "Jailbroken: How Does LLM 安全 Training Fail?"
Detailed 評估 Content
Section Review Material
Before attempting this 評估, ensure you have reviewed all relevant curriculum material. The questions 在本 評估 cover both theoretical concepts and practical application of the material.
Key areas of focus for this 評估:
-
Conceptual 理解: You should be able to explain the fundamental mechanisms behind the attack and 防禦 techniques covered, not just recognize them by name.
-
Practical application: Many questions present scenarios that require you to determine the best approach given specific constraints and objectives.
-
Tool proficiency: Questions may reference specific tools and their capabilities. Ensure you have hands-on experience with the tools covered in the labs.
-
Framework knowledge: Governance and compliance questions reference specific frameworks (OWASP LLM Top 10, MITRE ATLAS, NIST AI RMF, EU AI Act). Know the key provisions and how they apply to AI 安全.
Practice Scenarios
考慮 the following scenarios as additional preparation:
Scenario 1: You are conducting a 紅隊 評估 of a customer-facing chatbot. The chatbot has access to a CRM system through 函式呼叫 and can look up customer records, create tickets, and send emails. What is your initial approach?
考慮: What information do you need to gather first? What are the highest-risk capabilities to 測試? How would you structure your 測試 to maximize coverage while staying within scope?
Scenario 2: During 測試, you discover that the chatbot's 系統提示詞 can be extracted through a simple instruction override. The 系統提示詞 contains API endpoint URLs and 認證 patterns. What do you do?
考慮: What is the immediate risk? Should you continue 測試 or escalate immediately? How would you document this finding?
Scenario 3: You are evaluating a RAG system that ingests customer documents. You have the ability to upload documents as a normal user. What attack vectors should you explore?
考慮: Document-based injection, metadata manipulation, 嵌入向量 space 投毒, and the interaction between document processing and the LLM.
# Framework for structured scenario analysis
class ScenarioAnalysis:
"""Structured analysis framework for 評估 scenarios."""
def __init__(self, scenario_description: str):
self.scenario = scenario_description
self.threats = []
self.vectors = []
self.priorities = []
def identify_threats(self) -> list:
"""識別 threats relevant to the scenario."""
# Map scenario to OWASP LLM Top 10 categories
owasp_mapping = {
"function_calling": "LLM06 - Excessive Agency",
"system_prompt": "LLM01 - 提示詞注入",
"rag_documents": "LLM01 - 提示詞注入 (Indirect)",
"user_data": "LLM02 - Insecure 輸出 Handling",
"training_data": "LLM03 - Training Data Poisoning",
}
return self.threats
def prioritize_testing(self) -> list:
"""Prioritize 測試 vectors by risk."""
# Risk = Likelihood × Impact
return sorted(self.vectors, key=lambda v: v["risk_score"], reverse=True)
def generate_test_plan(self) -> dict:
"""Generate a structured 測試 plan for the scenario."""
return {
"scope": self.scenario,
"threats": self.threats,
"test_cases": [
{
"vector": v["name"],
"technique": v["technique"],
"expected_result": v["expected"],
"priority": v["risk_score"],
}
for v in self.prioritize_testing()
],
}Additional Knowledge Checks
A 紅隊 discovers that an LLM-powered customer service 代理 can be manipulated to send emails to arbitrary addresses through 函式呼叫. How should this finding be classified according to OWASP LLM Top 10 2025?
During an AI 紅隊 engagement, you 識別 a 漏洞 that also affects a system outside your authorized scope. What is the correct course of action?
Advanced Considerations
Evolving 攻擊 Landscape
The AI 安全 landscape evolves rapidly as both offensive techniques and defensive measures advance. Several trends shape the current state of play:
Increasing model capabilities create new attack surfaces. As models gain access to tools, code execution, web browsing, and computer use, each new capability introduces potential 利用 vectors that did not exist in earlier, text-only systems. The principle of least privilege becomes increasingly important as model capabilities expand.
安全 訓練 improvements are necessary but not sufficient. Model providers invest heavily in 安全 訓練 through RLHF, DPO, constitutional AI, and other 對齊 techniques. These improvements raise the bar for successful attacks but do not eliminate the fundamental 漏洞: models cannot reliably distinguish legitimate instructions from 對抗性 ones 因為 this distinction is not represented in the architecture.
Automated 紅隊演練 tools democratize 測試. Tools like NVIDIA's Garak, Microsoft's PyRIT, and Promptfoo enable organizations to conduct automated 安全 測試 without deep AI 安全 expertise. 然而, automated tools catch known patterns; novel attacks and business logic 漏洞 still require human creativity and domain knowledge.
Regulatory pressure drives organizational investment. The EU AI Act, NIST AI RMF, and industry-specific regulations increasingly require organizations to 評估 and mitigate AI-specific risks. This regulatory pressure is driving investment in AI 安全 programs, but many organizations are still in the early stages of building mature AI 安全 practices.
Cross-Cutting 安全 Principles
Several 安全 principles apply across all topics covered 在本 curriculum:
-
防禦-in-depth: No single defensive measure is sufficient. Layer multiple independent 防禦 so that failure of any single layer does not result in system compromise. 輸入 classification, 輸出 filtering, behavioral 監控, and architectural controls should all be present.
-
Assume breach: Design systems assuming that any individual component can be compromised. This mindset leads to better isolation, 監控, and incident response capabilities. When a 提示詞注入 succeeds, the blast radius should be minimized through architectural controls.
-
Least privilege: Grant models and 代理 only the minimum capabilities needed for their intended function. A customer service chatbot does not need file system access or code execution. Excessive capabilities magnify the impact of successful 利用.
-
Continuous 測試: AI 安全 is not a one-time 評估. Models change, 防禦 evolve, and new attack techniques are discovered regularly. 實作 continuous 安全 測試 as part of the development and deployment lifecycle.
-
Secure by default: Default configurations should be secure. Require explicit opt-in for risky capabilities, use allowlists rather than denylists, and err on the side of restriction rather than permissiveness.
Integration with Organizational 安全
AI 安全 does not exist in isolation — it must integrate with the organization's broader 安全 program:
| 安全 Domain | AI-Specific Integration |
|---|---|
| Identity and Access | API key management, model access controls, user 認證 for AI features |
| Data Protection | 訓練資料 classification, PII in prompts, data residency for model calls |
| Application 安全 | AI feature threat modeling, 提示詞注入 in SAST/DAST, secure AI design patterns |
| Incident Response | AI-specific playbooks, model behavior 監控, 提示詞注入 forensics |
| Compliance | AI regulatory mapping (EU AI Act, NIST), AI audit trails, model documentation |
| Supply Chain | Model provenance, dependency 安全, adapter/weight integrity verification |
class OrganizationalIntegration:
"""Framework for integrating AI 安全 with organizational 安全 programs."""
def __init__(self, org_config: dict):
self.config = org_config
self.gaps = []
def assess_maturity(self) -> dict:
"""評估 the organization's AI 安全 maturity."""
domains = {
"governance": self._check_governance(),
"technical_controls": self._check_technical(),
"監控": self._check_monitoring(),
"incident_response": self._check_ir(),
"訓練": self._check_training(),
}
overall = sum(d["score"] for d in domains.values()) / len(domains)
return {"domains": domains, "overall_maturity": round(overall, 1)}
def _check_governance(self) -> dict:
has_policy = self.config.get("ai_security_policy", False)
has_framework = self.config.get("risk_framework", False)
score = (int(has_policy) + int(has_framework)) * 2.5
return {"score": score, "max": 5.0}
def _check_technical(self) -> dict:
controls = ["input_classification", "output_filtering", "rate_limiting", "sandboxing"]
active = sum(1 for c in controls if self.config.get(c, False))
return {"score": active * 1.25, "max": 5.0}
def _check_monitoring(self) -> dict:
has_monitoring = self.config.get("ai_monitoring", False)
has_alerting = self.config.get("ai_alerting", False)
score = (int(has_monitoring) + int(has_alerting)) * 2.5
return {"score": score, "max": 5.0}
def _check_ir(self) -> dict:
has_playbook = self.config.get("ai_ir_playbook", False)
return {"score": 5.0 if has_playbook else 0.0, "max": 5.0}
def _check_training(self) -> dict:
has_training = self.config.get("ai_security_training", False)
return {"score": 5.0 if has_training else 0.0, "max": 5.0}Future Directions
Several research and industry trends will shape the evolution of this field:
- Formal methods for AI 安全: Development of mathematical frameworks that can provide bounded guarantees about model behavior under 對抗性 conditions
- Automated 紅隊演練 at scale: Continued improvement of automated 測試 tools that can discover novel 漏洞 without human guidance
- AI-assisted 防禦: Using AI systems to detect and respond to attacks on other AI systems, creating a dynamic attack-防禦 ecosystem
- Standardized 評估: Growing adoption of standardized benchmarks (HarmBench, JailbreakBench) that enable consistent measurement of progress
- Regulatory harmonization: Convergence of AI regulatory frameworks across jurisdictions, providing clearer requirements for organizations
參考文獻 and Further Reading
- Wei et al. 2023 — "Jailbroken: How Does LLM 安全 Training Fail?"
- OWASP LLM Top 10 2025 — LLM07 (Insecure Plugin Design)
- OWASP LLM Top 10 2025 — LLM01 (提示詞注入)
What is the most effective approach to defending against the attack class covered 在本 article?
Why do the techniques described 在本 article remain effective across different model versions and providers?