What is Agent Exploitation?

Hands-on lab for exploiting AI agents with tool access through indirect prompt injection, tool-call chaining, and privilege escalation techniques.

What is MCP Tool Abuse?

Hands-on lab for exploiting Model Context Protocol tool definitions through malicious tool descriptions, tool confusion attacks, and server impersonation.

What is Multi-Turn Campaigns?

Hands-on lab for executing multi-turn crescendo attacks against LLMs, measuring safety degradation over conversation length, and building persistent attack campaigns.

What is RAG Poisoning?

Hands-on lab for setting up a RAG pipeline with LlamaIndex, injecting malicious documents, testing retrieval poisoning, and measuring injection success rates.

What is Embedding Manipulation?

Hands-on lab for crafting documents optimized to be retrieved for specific queries through embedding collision attacks using sentence-transformers.

What is LLM Judge Evaluator?

Hands-on lab for building an LLM-based evaluator to score red team attack outputs, compare model vulnerability, and lay the foundation for automated attack campaigns.

What is Indirect Injection Chains?

Hands-on lab for setting up indirect prompt injection scenarios through web pages, emails, and documents, testing multi-hop injection chains against AI systems.

What is Data Exfiltration?

Hands-on lab for extracting sensitive data from AI systems including system prompt extraction, context leakage via markdown rendering, and URL-based data exfiltration.

What is Model Extraction?

Hands-on lab for API-based model extraction attacks, querying a target model to approximate its behavior, measuring fidelity, and understanding query budgets.

What is PyRIT Campaigns?

Hands-on lab for configuring and running automated red teaming campaigns using Microsoft's PyRIT framework against LLM applications.

中階 AI 紅隊實驗室

中級1 分鐘閱讀更新於 2026-03-13

構成專業 AI 紅隊演練骨幹的核心攻擊技術實驗室——防禦繞過、代理利用、RAG 投毒與更多。

labs intermediate defense-bypass agent-exploitation rag-poisoning

中階實驗室介紹構成專業 AI 紅隊演練骨幹的核心攻擊技術。每個實驗室針對中等防禦系統練習，要求你調整技術至特定目標。

學習路徑

0/159 已完成

~2386 分鐘總計159 課

1
Agent Exploitation中級
Hands-on lab for exploiting AI agents with tool access through indirect prompt injection, tool-call chaining, and privilege escalation techniques.
9m
2
MCP Tool Abuse中級
Hands-on lab for exploiting Model Context Protocol tool definitions through malicious tool descriptions, tool confusion attacks, and server impersonation.
10m
3
Multi-Turn Campaigns中級
Hands-on lab for executing multi-turn crescendo attacks against LLMs, measuring safety degradation over conversation length, and building persistent attack campaigns.
9m
4
RAG Poisoning中級
Hands-on lab for setting up a RAG pipeline with LlamaIndex, injecting malicious documents, testing retrieval poisoning, and measuring injection success rates.
9m
5
Embedding Manipulation中級
Hands-on lab for crafting documents optimized to be retrieved for specific queries through embedding collision attacks using sentence-transformers.
8m
6
LLM Judge Evaluator中級
Hands-on lab for building an LLM-based evaluator to score red team attack outputs, compare model vulnerability, and lay the foundation for automated attack campaigns.
9m
7
Indirect Injection Chains中級
Hands-on lab for setting up indirect prompt injection scenarios through web pages, emails, and documents, testing multi-hop injection chains against AI systems.
9m
8
Data Exfiltration中級
Hands-on lab for extracting sensitive data from AI systems including system prompt extraction, context leakage via markdown rendering, and URL-based data exfiltration.
8m
9
Model Extraction中級
Hands-on lab for API-based model extraction attacks, querying a target model to approximate its behavior, measuring fidelity, and understanding query budgets.
9m
10
PyRIT Campaigns中級
Hands-on lab for configuring and running automated red teaming campaigns using Microsoft's PyRIT framework against LLM applications.
9m
11
Defense Bypass中級
Hands-on lab for methodically identifying, categorizing, and bypassing common LLM defense mechanisms including input filters, output filters, and system prompt hardening.
11m
12
promptfoo Regression中級
Hands-on lab for setting up promptfoo to run automated regression tests against LLM applications, ensuring that safety properties hold across model updates and prompt changes.
9m
13
Memory Poisoning中級
Hands-on lab exploring how conversational memory in AI agents can be poisoned to alter future behavior, inject persistent instructions, and exfiltrate data across sessions.
10m
14
Context Stuffing中級
Hands-on lab demonstrating how oversized inputs can overwhelm an LLM's context window to dilute safety instructions, push system prompts out of the attention window, or cause instruction amnesia.
12m
15
Token Smuggling中級
Hands-on lab exploring how tokenization boundaries, token-level manipulations, and encoding tricks can smuggle attack payloads past safety classifiers.
9m
16
Image Injection中級
Hands-on lab exploring how text instructions embedded in images can be used to perform prompt injection against vision-language models (VLMs) that process visual input.
10m
17
Function Calling Abuse中級
Hands-on lab exploring how attackers can manipulate LLM function calling and tool use to execute unauthorized actions, exfiltrate data, and chain tool calls for maximum impact.
12m
18
Indirect Injection Basics中級
Inject instructions through external data sources including documents, web pages, and emails that a target AI system processes as context.
12m
19
Lab: Function Calling Injection中級
Hands-on lab for exploiting function calling mechanisms by crafting inputs that manipulate which functions get called and with what parameters.
9m
20
Lab: Multimodal Injection中級
Hands-on lab for embedding text instructions in images to exploit vision-enabled LLMs. Learn to craft visual prompt injections, test OCR-based attacks, and evaluate multimodal safety boundaries.
12m
21
Lab: Data Exfiltration Channels中級
Hands-on lab for extracting data from AI systems through markdown image rendering, invisible links, tool call parameters, and other covert exfiltration channels.
11m
22
Lab: Tool Result Poisoning中級
Hands-on lab for poisoning tool outputs to redirect agent behavior by injecting malicious content through tool results.
10m
23
Multimodal Injection中級
Embed prompt injection instructions in images for vision-enabled models, testing how visual content can carry adversarial payloads.
12m
24
Data Exfiltration Channels中級
Extract sensitive information from AI systems through various exfiltration channels including crafted links, image tags, tool calls, and side-channel leakage.
13m
25
Lab: Defense Effectiveness Measurement中級
Hands-on lab for quantifying AI guardrail robustness using attack success rates, evasion metrics, false positive rates, and statistical analysis of defense performance.
11m
26
Lab: MCP Server Exploitation中級
Hands-on lab for attacking a vulnerable MCP server, demonstrating tool shadowing, consent phishing, and server impersonation techniques.
11m
27
Defense Effectiveness Testing中級
Systematically test and measure the robustness of AI guardrails using structured methodology, metrics, and repeatable test suites.
13m
28
Lab: Agent Memory Manipulation中級
Hands-on lab for injecting persistent instructions into an agent's memory and context that affect future interactions and conversations.
9m
29
Lab: Automated Red Team Pipeline中級
Hands-on lab for building a continuous AI red team testing pipeline using promptfoo, GitHub Actions, and automated attack generation to catch safety regressions before deployment.
10m
30
Automated Red Team Testing中級
Set up promptfoo for continuous automated security testing of AI applications, with custom evaluators and CI/CD integration.
11m
31
Lab: Agent Permission Escalation中級
Hands-on lab for tricking AI agents into performing actions beyond their intended scope through privilege escalation techniques.
10m
32
Lab: ML Supply Chain Scan中級
Hands-on lab for auditing machine learning model dependencies, detecting malicious packages in ML pipelines, and scanning model files for backdoors and supply chain threats.
12m
33
Lab: Encoding Bypass Lab中級
Hands-on lab for systematically testing Base64, ROT13, Unicode homoglyph, hexadecimal, and combined encoding bypasses against LLM safety filters with measured success rates.
11m
34
Lab: Workflow Hijacking中級
Hands-on lab for redirecting multi-step agent workflows to perform unintended actions by manipulating intermediate states and control flow.
11m
35
Supply Chain Audit中級
Audit an ML project's dependencies for vulnerabilities, covering model files, Python packages, container images, and training data provenance.
13m
36
Lab: Advanced RAG Poisoning中級
Hands-on lab for crafting documents that reliably get retrieved and influence RAG responses for specific target queries.
11m
37
Lab: Agent Prompt Leaking中級
Hands-on lab for extracting system prompts from tool-using AI agents by exploiting tool interactions, observation channels, and multi-step reasoning to leak protected instructions.
13m
38
Lab: RAG Metadata Injection中級
Hands-on lab for exploiting metadata fields like titles, descriptions, and timestamps to manipulate RAG retrieval ranking and influence responses.
10m
39
Lab: Chunking Exploitation中級
Hands-on lab for crafting documents that split across chunks in ways that hide malicious content from chunk-level filtering while maintaining attack effectiveness.
10m
40
Lab: Re-ranking Attacks中級
Hands-on lab for manipulating the re-ranking stage of RAG pipelines to promote or suppress specific documents in retrieval results.
10m
41
Lab: Citation Fabrication中級
Hands-on lab for getting RAG systems to cite documents that don't exist or misattribute quotes to legitimate sources.
10m
42
Lab: AWS Bedrock Guardrails Testing中級
Hands-on lab for systematically testing and bypassing AWS Bedrock's built-in guardrails including content filters, denied topics, and word filters.
10m
43
Lab: Few-Shot Example Injection中級
Hands-on lab exploring how injected few-shot examples can steer language model outputs toward attacker-chosen behaviors by exploiting in-context learning.
20m
44
Lab: Azure Content Filter Evasion中級
Hands-on lab for mapping and testing Azure OpenAI Service content filtering categories, severity levels, and bypass techniques.
9m
45
Lab: Recursive Injection Chains中級
Hands-on lab building self-reinforcing prompt injection chains where each model response amplifies the injection, creating feedback loops that progressively override safety controls.
18m
46
Lab: Encoding Evasion Techniques中級
Hands-on lab using Base64, ROT13, Unicode normalization, and custom encoding schemes to evade input filters and safety classifiers in language model systems.
18m
47
Lab: Context Window Overflow Attacks中級
Hands-on lab exploring how overflowing a model's context window with padding content can push safety instructions out of the attention window and enable injection attacks.
21m
48
Lab: Language Translation Bypass中級
Hands-on lab exploiting language translation capabilities to bypass safety filters by submitting prompts in low-resource languages, code-switching, and using transliteration tricks.
19m
49
Lab: Markdown-Based Injection中級
Hands-on lab exploring how Markdown rendering in AI-generated outputs can be exploited to inject hidden content, exfiltrate data through image tags, and manipulate displayed information.
16m
50
Lab: JSON Input Injection中級
Hands-on lab exploring how adversarial payloads injected through structured JSON inputs can manipulate language model behavior, bypass schema validation, and exploit parsing inconsistencies.
16m
51
Lab: Multi-Turn Progressive Escalation中級
Hands-on lab building multi-turn attack sequences that progressively escalate from benign queries to full safety bypass through gradual trust establishment and boundary pushing.
19m
52
Indirect Injection via Web Content中級
Plant prompt injection payloads in web pages consumed by RAG-enabled LLM applications.
18m
53
MCP Tool Poisoning Attack中級
Exploit MCP tool descriptions to inject instructions that redirect agent behavior.
18m
54
RAG Context Poisoning中級
Poison a vector database to inject adversarial content into RAG retrieval results.
18m
55
Agent Memory Manipulation中級
Exploit persistent memory in LLM agents to plant false context that persists across sessions.
18m
56
Multi-Model Attack Chaining中級
Chain attacks across multiple LLM models in a pipeline to bypass per-model defenses.
18m
57
Advanced Token Smuggling中級
Use Unicode homoglyphs, zero-width characters, and tokenizer quirks to smuggle payloads.
18m
58
LLM Judge Manipulation中級
Craft responses that exploit LLM-as-judge evaluation patterns to achieve high safety scores while embedding harmful content.
18m
59
Structured Output Exploitation中級
Exploit JSON mode and structured output constraints to bypass safety filters.
18m
60
Embedding Similarity Attacks中級
Manipulate text to achieve target embedding similarity scores for retrieval poisoning.
18m
61
Advanced Function Calling Exploitation中級
Exploit advanced function calling patterns including nested calls, parallel execution, and schema manipulation.
16m
62
Guardrail Fingerprinting中級
Systematically map the rules and thresholds of input/output guardrail systems.
18m
63
Agent Workflow Security Testing中級
Test multi-step agent workflows for injection points, state manipulation, and unauthorized action execution.
16m
64
Conversation State Poisoning中級
Exploit conversation history manipulation to alter model behavior across turns.
18m
65
Advanced MCP Protocol Testing中級
Perform advanced security testing of MCP servers including transport analysis, capability enumeration, and tool poisoning.
16m
66
API Parameter Fuzzing for LLMs中級
Systematically fuzz LLM API parameters to discover unexpected behaviors and bypasses.
18m
67
Cross-Context Injection中級
Inject prompts that persist across separate conversation contexts in shared deployments.
18m
68
Multi-Model Comparison Security Testing中級
Compare security postures across multiple LLM providers by running identical attack suites and analyzing differential responses.
16m
69
Building Defense Bypass Chains中級
Combine multiple bypass techniques into reliable chains that defeat layered defenses.
18m
70
Guardrail Bypass Technique Laboratory中級
Practice guardrail bypass techniques against NeMo Guardrails, LLM Guard, and custom classifier-based defenses.
16m
71
Advanced Embedding Manipulation中級
Manipulate embedding vectors to achieve target similarity scores for RAG poisoning and retrieval manipulation.
17m
72
Automated Red Teaming with PyRIT中級
Use Microsoft's PyRIT framework to run automated multi-turn attack campaigns.
18m
73
Advanced RAG Security Testing中級
Test RAG systems for chunking exploitation, reranking manipulation, and cross-document injection attacks.
16m
74
Custom Jailbreak Wordlist Generation中級
Generate targeted jailbreak wordlists using mutation strategies and seed prompts.
18m
75
Data Exfiltration via Side Channels中級
Extract data from LLM systems using timing, token probability, and formatting side channels.
18m
76
Fine-Tuning Safety Impact Testing中級
Measure how fine-tuning affects model safety by comparing pre and post fine-tuning safety benchmark scores.
17m
77
Code Generation Security Testing中級
Test LLM code generation for insecure patterns, injection vulnerabilities, and code execution safety issues.
17m
78
Semantic Injection Crafting中級
Craft semantically coherent injections that evade both classifiers and human review.
18m
79
Intermediate Multimodal Security Testing中級
Test multimodal LLMs with crafted images containing embedded text, adversarial perturbations, and visual injection payloads.
17m
80
Multi-Language Injection Attacks中級
Exploit language switching and low-resource language gaps to bypass safety training.
18m
81
Cloud AI Platform Security Testing中級
Test cloud AI platform configurations including AWS Bedrock guardrails, Azure content filters, and GCP safety settings.
17m
82
CI/CD Pipeline Security for LLM Apps中級
Test CI/CD pipeline security for LLM applications including prompt template injection, model artifact tampering, and deployment attacks.
17m
83
LLM Security Monitoring Setup中級
Deploy a comprehensive security monitoring system for LLM applications with anomaly detection and alerting.
16m
84
AI Incident Response Simulation中級
Practice AI incident response procedures through a simulated prompt injection incident with escalation and containment.
17m
85
AI Compliance Testing Fundamentals中級
Test LLM applications against OWASP LLM Top 10 and NIST AI RMF requirements using structured test procedures.
17m
86
Chain-of-Thought Exploitation中級
Exploit chain-of-thought reasoning to leak internal model reasoning, bypass safety filters, and manipulate decision processes.
16m
87
Tool Result Injection Attacks中級
Inject adversarial content through tool call results to poison model reasoning and redirect subsequent actions.
16m
88
Session Isolation Testing中級
Test session isolation in multi-tenant LLM deployments to detect cross-session data leakage and context contamination.
17m
89
Semantic Search Manipulation中級
Manipulate semantic search results by crafting documents that achieve target similarity scores against specific queries.
16m
90
Output Schema Exploitation中級
Exploit structured output schemas to force models into revealing information through required field population.
16m
91
Prompt Caching Side-Channel Attacks中級
Exploit prompt caching mechanisms to detect cached prompts and extract information through timing side channels.
16m
92
Data Exfiltration via Webhooks中級
Test data exfiltration paths through LLM-triggered webhooks, API callbacks, and external URL rendering.
16m
93
System Prompt Reconstruction中級
Reconstruct full system prompts through iterative partial extraction and cross-referencing multiple leakage vectors.
16m
94
Advanced Classifier Evasion中級
Evade advanced input classifiers using adversarial perturbation, semantic paraphrasing, and gradient-free optimization.
17m
95
Multi-Turn State Tracking Attacks中級
Exploit conversation state tracking to build persistent attack context across multiple conversation turns.
16m
96
Unicode Normalization Bypass Attacks中級
Exploit Unicode normalization differences between input validation and model processing to smuggle injection payloads.
16m
97
Role Hierarchy Exploitation中級
Exploit multi-role LLM architectures by manipulating the instruction hierarchy between system, user, and assistant roles.
17m
98
Vector Database Access Control Testing中級
Test vector database access controls for bypass vulnerabilities including namespace traversal and filter manipulation.
16m
99
Streaming Response Manipulation中級
Exploit streaming response mechanisms to bypass output filters that only check complete responses.
16m
100
Model Context Protocol Fuzzing中級
Fuzz MCP protocol messages to discover parsing errors, validation gaps, and unexpected behavior in MCP servers.
17m
101
Automated Adversarial Prompt Generation中級
Build automated systems for generating adversarial prompts using mutation, crossover, and selection strategies.
17m
102
Exploiting LLM Hallucinations中級
Exploit hallucination tendencies to trigger fabricated tool calls, invented API endpoints, and false fact injection.
17m
103
Log Probability Analysis for Attacks中級
Use token log probabilities to extract information about training data, system prompts, and model internals.
17m
104
Context Window Attack Optimization中級
Optimize prompt injection placement within the context window to maximize attack effectiveness using attention dynamics.
16m
105
Defense Regression Testing Setup中級
Build a regression testing framework to continuously verify that LLM defenses remain effective against known attack patterns.
16m
106
API Versioning Security Gaps中級
Exploit version differences in LLM APIs to find deprecated endpoints and backward-compatible safety bypasses.
16m
107
Instruction Following Measurement中級
Quantitatively measure instruction following compliance to identify where models prioritize competing instructions.
16m
108
Canary Token Effectiveness Testing中級
Deploy and test canary tokens in system prompts to evaluate their effectiveness at detecting prompt extraction.
16m
109
Multi-Agent Communication Testing中級
Test security of inter-agent communication in multi-agent systems including message injection and trust boundary violations.
16m
110
Safety Filter Benchmarking中級
Benchmark safety filters across providers using standardized test suites to compare detection rates and false positives.
16m
111
Tool Result Injection Attacks中級
Craft malicious tool return values that inject instructions back into the model's reasoning chain.
15m
112
Implementing Many-Shot Jailbreaking中級
Implement Anthropic's many-shot jailbreaking technique with scaling analysis across conversation lengths.
15m
113
RAG Document Injection Campaign中級
Design and execute a document injection campaign against a RAG-powered application with vector search.
15m
114
Guardrail Latency-Based Detection中級
Use timing side channels to identify and characterize guardrail implementations in LLM applications.
15m
115
Chat History Manipulation Attacks中級
Inject false messages into conversation history to alter model behavior in subsequent turns.
15m
116
System Prompt Extraction via Token Probabilities中級
Exploit token probability APIs to reconstruct hidden system prompts character by character.
15m
117
Input Classifier Boundary Mapping中級
Systematically probe input classifiers to map their decision boundaries and find bypass regions.
15m
118
Function Schema Manipulation中級
Exploit function calling schemas to inject instructions through parameter descriptions and enums.
15m
119
MCP Resource Data Exfiltration中級
Exploit MCP resource URIs to access and exfiltrate data beyond intended scope boundaries.
15m
120
Crescendo Attack Implementation中級
Implement Microsoft's Crescendo multi-turn escalation attack with automated conversation management.
15m
121
Output Filter Bypass via Encoding中級
Bypass output safety filters by instructing the model to encode responses in Base64, hex, or custom schemes.
15m
122
Agent Goal Hijacking中級
Redirect an AI agent's objectives through carefully crafted inputs that override its primary task.
15m
123
Cross-Plugin Exploitation Chaining中級
Chain exploits across multiple plugins/tools in an agent system to achieve multi-step attacks.
15m
124
JailbreakBench Comparative Testing中級
Run standardized jailbreak benchmarks across multiple models and compare vulnerability profiles.
15m
125
Semantic Search Poisoning中級
Craft adversarial documents that rank highly in semantic search for targeted queries in RAG systems.
15m
126
Multi-Language Safety Gap Exploitation中級
Systematically identify and exploit safety training gaps across low-resource languages.
15m
127
NeMo Guardrails Bypass Techniques中級
Test and bypass NVIDIA NeMo Guardrails configurations using known evasion techniques.
15m
128
LLM Guard Evasion Techniques中級
Develop evasion techniques against LLM Guard input scanners and output detectors.
15m
129
Context Window Boundary Exploitation中級
Exploit the boundaries of context windows to push safety instructions beyond the model's attention.
15m
130
PyRIT Custom Attack Strategies中級
Build custom attack strategies and converters in Microsoft's PyRIT framework for targeted campaigns.
15m
131
Advanced Encoding Chain Attacks中級
Chain multiple encoding schemes to bypass multi-layer input filters in production LLM applications.
18m
132
JSON Mode Safety Bypass中級
Exploit structured JSON output modes to bypass safety filters that only check plain-text responses.
15m
133
Injecting Text via Images to VLMs中級
Embed adversarial text in images that vision-language models read and follow as instructions.
15m
134
Semantic Deception Crafting Lab中級
Craft semantically coherent prompts that conceal adversarial intent from both human reviewers and ML classifiers.
18m
135
Assistant Prefill Injection Attacks中級
Exploit assistant message prefilling to prime model responses and bypass safety alignment.
15m
136
Context Window Poisoning Lab中級
Exploit context window management to inject persistent adversarial content that influences future model responses.
18m
137
Building Custom Garak Probes中級
Develop custom probe modules for NVIDIA Garak to test organization-specific vulnerability patterns.
15m
138
Structured Output Safety Bypass中級
Use JSON mode and structured output constraints to extract information that would be refused in free-form responses.
18m
139
Token Boundary Manipulation中級
Exploit tokenizer-specific behavior by crafting inputs that split across token boundaries in unexpected ways.
15m
140
Multi-Model Pipeline Attack Lab中級
Attack a pipeline where multiple models process data sequentially, exploiting trust between pipeline stages.
18m
141
PDF Document Injection for RAG Systems中級
Craft adversarial PDF documents that inject instructions when processed by RAG document loaders.
15m
142
Practical Embedding Manipulation中級
Manipulate text to achieve target embedding similarities for poisoning retrieval augmented generation systems.
18m
143
Streaming Response Exfiltration中級
Exploit streaming API responses to exfiltrate data that might be caught by batch output filters.
15m
144
ML Classifier Evasion Lab中級
Develop payloads that evade machine learning-based input classifiers through adversarial text perturbation.
18m
145
Prompt Template Variable Injection中級
Inject adversarial content through template variable expansion in prompt management systems.
15m
146
Multi-Turn State Tracking Exploitation中級
Exploit state management in multi-turn conversations to gradually shift model behavior over many turns.
15m
147
Session State Exploitation Lab中級
Exploit session management vulnerabilities in LLM applications to access other users data or escalate privileges.
18m
148
Custom Inspect AI Safety Evaluations中級
Build custom safety evaluation tasks in Inspect AI to test application-specific vulnerability classes.
15m
149
Reasoning Chain Hijacking中級
Hijack the model's chain-of-thought reasoning to redirect its conclusions toward attacker-controlled objectives.
18m
150
Guardrail Rule Extraction Lab中級
Systematically extract the rules and decision boundaries of guardrail systems through targeted probing.
18m
151
Fine-Tuning API Security Probing中級
Probe fine-tuning APIs for security weaknesses including insufficient validation and unsafe default configurations.
18m
152
Multi-Language Safety Bypass Lab中級
Exploit gaps in safety training coverage across languages to bypass safety mechanisms using code-switching.
18m
153
Document-Based RAG Injection Lab中級
Inject adversarial content into documents that will be processed by a RAG system to influence model responses.
18m
154
Conversation Splitting Attack Lab中級
Split adversarial payloads across multiple conversation turns to bypass per-message safety classification.
18m
155
API Abuse Pattern Lab中級
Discover and exploit API abuse patterns including parameter injection, header manipulation, and endpoint confusion.
18m
156
Practical Model Fingerprinting Lab中級
Fingerprint the underlying model behind an API endpoint to identify its type, version, and configuration.
18m
157
Output Channel Discovery Lab中級
Discover alternative output channels in LLM applications that bypass standard output filtering.
18m
158
Prompt Leakage Technique Lab中級
Practice multiple system prompt extraction techniques and measure their effectiveness across different targets.
18m
159
Lab: Advanced Token Smuggling via Unicode Normalization中級
Exploit Unicode normalization differences between input validators and LLM tokenizers to bypass content filters and inject hidden instructions.
22m

開始學習

在 GitHub 上編輯此頁

中階 AI 紅隊實驗室

學習路徑

相關文章

中階 AI 紅隊實驗室

學習路徑

相關文章