# hands-on

lab, cloud, assessment, end-to-end, api-security, advanced, hands-on

Lab: Cloud AI Security Assessment

Conduct an end-to-end security assessment of a cloud-deployed AI service, covering API security, model vulnerabilities, data handling, and infrastructure configuration.

lab, custom-harness, testing-framework, domain-specific, advanced, hands-on

Lab: Custom Test Harness for Specific Applications

Build a tailored testing framework for a specific AI application, with custom attack generators, domain-specific evaluators, and application-aware reporting.

lab, federated-learning, poisoning, expert, hands-on

Lab: Federated Learning Poisoning Attack

A hands-on lab for understanding and simulating poisoning attacks against federated learning systems, where a malicious participant corrupts the shared model through crafted gradient updates.

lab, purple-team, attack-defense, collaborative, advanced, hands-on

Lab: Purple Team Exercise

Simultaneously attack and defend an AI application in a structured exercise where red team findings immediately inform blue team defensive improvements.

lab, transfer-attacks, adversarial, cross-model, advanced, hands-on

Lab: Transfer Attack Development (Advanced Lab)

Develop adversarial attacks on open-source models that transfer to closed-source models, using weight access to the open-source model to enable black-box exploitation of the closed-source one.

lab, defense, input-filter, beginner, hands-on

Lab: Build Your First Defense (Beginner Lab)

Create a simple input filter that blocks common prompt injection patterns, then test it against the attack techniques you have learned in previous labs.
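A filter of the kind this lab describes could start from something like the minimal sketch below. The pattern list and the `filter_input` name are hypothetical illustrations, not taken from the lab itself, and a regex blocklist like this is easy to evade, which is what testing it against earlier attack techniques would reveal.

```python
import re
from typing import Optional, Tuple

# Hypothetical example patterns (assumption); a real filter needs a broader, tested set.
BLOCK_PATTERNS = [
    r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions",
    r"disregard\s+(the\s+)?(system|previous)\s+(prompt|instructions)",
    r"you\s+are\s+now\s+in\s+developer\s+mode",
    r"reveal\s+(your\s+)?system\s+prompt",
]
_COMPILED = [re.compile(p, re.IGNORECASE) for p in BLOCK_PATTERNS]


def filter_input(text: str) -> Tuple[bool, Optional[str]]:
    """Return (allowed, matched_pattern). matched_pattern is None when allowed."""
    # Collapse runs of whitespace so line breaks and extra spaces do not hide a match.
    normalized = " ".join(text.split())
    for pattern in _COMPILED:
        if pattern.search(normalized):
            return False, pattern.pattern
    return True, None
```

Returning the matched pattern alongside the decision makes it easier to log which rule fired when measuring false positives and misses.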

lab, model-comparison, safety, benchmarking, beginner, hands-on

Lab: Model Comparison

Test the same attack techniques against different language models and compare their safety behaviors, refusal patterns, and vulnerability profiles.

lab, context-manipulation, prompt-injection, persona, beginner, hands-on

Lab: Context Manipulation

Manipulate the conversational context to change model behavior, using techniques like context flooding, false history injection, and persona shifting.

lab, defense-bypass, evasion, filters, beginner, hands-on

Lab: Defense Bypass Basics

Learn basic techniques to bypass simple LLM defenses, including keyword filters, instruction reinforcement, and output validators, using encoding, reformulation, and indirect approaches.

lab, delimiter-escape, prompt-injection, boundaries, beginner, hands-on

Lab: Delimiter Escape Attacks

Craft payloads that escape delimiter boundaries separating system and user content, testing how models handle broken fences, nested delimiters, and format confusion.

lab, ethics, responsible, documentation, beginner, hands-on

Lab: Ethical Red Teaming

Practice responsible AI red teaming with proper documentation, scope management, and ethical decision-making frameworks.

lab, prompt-injection, beginner, hands-on

Lab: Your First Prompt Injection

A hands-on lab where you perform your first prompt injection attacks against a chatbot, progressing from basic overrides to more sophisticated techniques.

lab, jailbreak, ollama, beginner, hands-on

Lab: Your First Jailbreak Attempt

Try basic jailbreak techniques against a local model using Ollama, learning the difference between prompt injection and jailbreaking through hands-on experimentation.

lab, garak, tool-setup, vulnerability-scanner, beginner, hands-on

Lab: Garak Setup and First Scan

Install and configure garak, the LLM vulnerability scanner, then run your first automated scan against a local model to discover common vulnerabilities.

lab, injection-detection, defense, classification, beginner, hands-on

Lab: Injection Detection Tool

Build a basic prompt injection detection tool using pattern matching, heuristics, and LLM-based classification to identify malicious inputs before they reach the target model.

lab, prompt-injection, survey, techniques, beginner, hands-on

Lab: Injection Techniques Survey

Survey and test ten fundamental prompt injection techniques against a local LLM, measuring effectiveness and cataloging behavioral patterns for each approach.

lab, instruction-following, priority, conflicting-instructions, beginner, hands-on

Lab: Instruction Following Priority

Test how language models prioritize conflicting instructions from system prompts, user messages, and embedded directives to understand the instruction hierarchy.

lab, multi-language, prompt-injection, translation, beginner, hands-on

Lab: Multi-Language Injection

Test prompt injection techniques using multiple natural languages, exploring how safety training coverage varies across languages and how translation requests can mask payloads.

lab, output-format, exploitation, data-leakage, beginner, hands-on

Lab: Output Format Exploitation

Manipulate output formats like JSON, CSV, code blocks, and structured data to extract information that models would normally refuse to provide in natural language.

lab, output-steering, prompt-injection, output-manipulation, beginner, hands-on

Lab: Output Steering

Learn to steer LLM outputs into specific formats, tones, and content using injection techniques that manipulate how the model structures and presents its responses.

lab, payload-crafting, prompt-injection, red-teaming, beginner, hands-on

Lab: Payload Crafting

Learn to craft effective prompt injection payloads from scratch by understanding payload structure, testing iteratively, and optimizing for reliability against a local model.

lab, prompt-leaking, system-prompt, extraction, beginner, hands-on

Lab: Prompt Leaking Basics

Learn to extract system prompts from chatbots using common leaking techniques, including direct requests, indirect elicitation, and format manipulation.

lab, promptfoo, tool-setup, evaluation, beginner, hands-on

Lab: Promptfoo Setup and First Eval

Install and configure promptfoo for systematic LLM evaluation, then run your first red team evaluation to test model safety boundaries.

lab, pyrit, tool-setup, red-teaming, microsoft, beginner, hands-on

Lab: PyRIT Setup and First Attack

Install and configure Microsoft's PyRIT (Python Risk Identification Toolkit) for automated red teaming, then run your first orchestrated attack against a local model.

lab, role-play, persona, jailbreak, beginner, hands-on

Lab: Role-Play Attacks

Use persona-based approaches to bypass AI safety measures by assigning alternate identities, characters, and scenarios that override the model's trained refusal behaviors.

lab, safety, boundaries, mapping, beginner, hands-on

Lab: Mapping Safety Boundaries

Systematically discover what a language model will and won't do by probing its safety boundaries across multiple categories and documenting the results.

lab, prompt-injection, system-prompt, override, beginner, hands-on

Lab: System Prompt Override

Explore techniques to override system-level instructions with user-level inputs, testing how models prioritize conflicting directives across different prompt architectures.

lab, expert, adversarial-suffix, GCG, optimization, hands-on

Lab: Adversarial Suffix Optimization

Implement GCG-style adversarial suffix attacks that automatically discover token sequences causing language models to comply with harmful requests. Covers gradient-based optimization, transferability analysis, and defense evaluation.

lab, expert, alignment, stress-testing, safety, hands-on

Lab: Alignment Stress Testing

Push language model alignment to its breaking points through systematic stress testing. Identify conditions where safety training fails, measure alignment degradation curves, and map the boundaries of model compliance.

lab, expert, agent, scanner, security, hands-on

Lab: Build Agent Security Scanner

Build an automated security scanner for agentic AI systems that detects vulnerabilities in tool use, permission handling, memory management, and multi-step execution flows. Covers agent-specific attack surfaces that traditional LLM testing misses.

lab, expert, fuzzer, testing, adversarial, hands-on

Lab: Build an AI Fuzzer

Build a comprehensive fuzzing framework for large language models that systematically generates malformed, unexpected, and adversarial inputs to discover crashes, unexpected behaviors, and safety bypasses.

lab, expert, behavior-diff, regression, comparison, hands-on

Lab: Build Behavior Diff Tool

Build a tool that systematically compares language model behavior across versions, configurations, and providers. Detect safety regressions, capability changes, and behavioral drift with automated differential analysis.
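As a rough illustration of differential analysis, the sketch below compares stored responses from two model versions on the same prompts and flags prompts where the refusal status flipped or the text diverged sharply. Everything here is an assumption for illustration: the `diff_behaviors` name, the naive `REFUSAL_MARKERS` heuristic, and the 0.6 similarity threshold are not from the lab, and a real tool would likely use a stronger refusal classifier.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Dict, List

# Naive refusal heuristic (assumption); substring checks will miss many refusals.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm unable")


def is_refusal(response: str) -> bool:
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


@dataclass
class BehaviorDiff:
    prompt: str
    similarity: float      # 0.0 (completely different) to 1.0 (identical)
    refusal_changed: bool  # True if one version refused and the other did not


def diff_behaviors(
    baseline: Dict[str, str],
    candidate: Dict[str, str],
    threshold: float = 0.6,
) -> List[BehaviorDiff]:
    """Compare responses keyed by prompt; report prompts whose behavior changed."""
    diffs = []
    # Only prompts present in both runs can be compared.
    for prompt in sorted(baseline.keys() & candidate.keys()):
        old, new = baseline[prompt], candidate[prompt]
        similarity = SequenceMatcher(None, old, new).ratio()
        refusal_changed = is_refusal(old) != is_refusal(new)
        if refusal_changed or similarity < threshold:
            diffs.append(BehaviorDiff(prompt, similarity, refusal_changed))
    return diffs
```

A flipped refusal on a harmful prompt is the signal most relevant to safety regressions; low similarity alone only indicates drift worth a manual look.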

lab, expert, guardrails, evaluation, defense-testing, hands-on

Lab: Build Guardrail Evaluator

Build an automated framework for evaluating AI guardrails and safety filters. Test input filters, output classifiers, content moderation systems, and defense-in-depth architectures for coverage gaps and bypass vulnerabilities.

lab, expert, automation, jailbreak, framework, hands-on

Lab: Build Jailbreak Automation

Build an automated jailbreak testing framework that generates, mutates, and evaluates attack prompts at scale. Covers prompt mutation engines, success classifiers, and campaign management for systematic red team testing.

lab, expert, emergent, capability, probing, hands-on

Lab: Emergent Capability Probing

Systematically test large language models for undocumented capabilities, including hidden knowledge, unreported skills, and behaviors that emerge only under specific conditions. Build a structured probing framework for capability discovery.

lab, expert, full-stack, exploit-chain, multi-layer, hands-on

Lab: Full-Stack AI Exploitation

Chain vulnerabilities across UI, API, model, and infrastructure layers to achieve full-stack compromise of an AI application. Learn to identify cross-layer attack paths and build exploit chains that no single-layer defense can prevent.

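lab, expert, agents, computer-use, gui, hands-on

Lab: Computer Use Agent Attacks

A hands-on lab exploring attack vectors against GUI-based computer use agents, which interact with the screen through vision and action primitives.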

lab, expert, honeypot, deception, detection, hands-on

Lab: Deploy Honeypot AI

Build and deploy a decoy AI system designed to detect, analyze, and characterize attackers targeting AI applications. Learn honeypot design, telemetry collection, attacker profiling, and threat intelligence generation for AI-specific threats.

lab, expert, multi-agent, coordination, hands-on

Lab: Multi-Agent Attack Coordination

A hands-on lab on building and deploying coordinated multi-agent attack systems that divide red teaming tasks across specialized attacking agents.

lab, expert, jailbreak, research, novel-attacks, hands-on

Lab: Novel Jailbreak Research

A systematic methodology for discovering new jailbreak techniques against large language models. Learn to identify unexplored attack surfaces, develop novel attack vectors, and validate findings with scientific rigor.

lab, expert, pipeline, poisoning, supply-chain, hands-on

Lab: ML Pipeline Poisoning

Compromise an end-to-end machine learning pipeline by attacking data ingestion, preprocessing, training, evaluation, and deployment stages. Learn to identify and exploit weaknesses across the full ML lifecycle.

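lab, expert, quantization, model-compression, hands-on

Lab: Exploiting Quantized Model Weaknesses

A hands-on lab exploring how model quantization degrades safety alignment, including techniques for finding and exploiting precision-related vulnerabilities.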

lab, expert, registry, supply-chain, model-integrity, hands-on

Lab: Model Registry Compromise

Explore techniques for compromising model registries and substituting malicious models into production pipelines. Learn to detect model tampering, verify model provenance, and secure the model supply chain.
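On the detection side, one basic building block for spotting tampering is checking a model artifact against a digest pinned at publish time. The sketch below assumes the expected SHA-256 value comes from a trusted source such as a signed manifest; the `verify_artifact` helper is a hypothetical name, not part of any registry's API.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model weights need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file's digest matches the pinned value.

    Assumption: expected_sha256 was obtained out of band from a trusted source.
    """
    return hmac.compare_digest(sha256_file(path), expected_sha256.lower())
```

A digest check only proves the file matches what was pinned; it says nothing about whether the pinned artifact was trustworthy in the first place, which is where provenance and signing come in.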

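lab, expert, rlhf, reward-hacking, alignment, hands-on

Lab: RLHF Reward Hacking

A hands-on lab exploring how to game the reward models used for RLHF alignment, exposing gaps between reward proxy signals and actual safety objectives.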

lab, expert, benchmark, evaluation, safety, hands-on

Lab: Create a Safety Benchmark

Design, build, and validate a comprehensive AI safety evaluation suite. Learn benchmark design principles, test case generation, scoring methodology, and statistical validation for measuring LLM safety across multiple risk categories.
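For the scoring and statistical validation part, one plausible starting point is a per-category safe-response rate with a Wilson score interval, which behaves better than a plain normal approximation when a category has few test cases. The function names and the `(category, is_safe)` result format below are assumptions for illustration, not the lab's own design.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    if total == 0:
        return (0.0, 1.0)  # no data: the rate is unconstrained
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (max(0.0, center - margin), min(1.0, center + margin))


def score_by_category(results: Iterable[Tuple[str, bool]]) -> Dict[str, dict]:
    """Aggregate (category, is_safe) pairs into a rate and interval per category."""
    counts = defaultdict(lambda: [0, 0])  # category -> [safe_count, total]
    for category, is_safe in results:
        counts[category][0] += int(is_safe)
        counts[category][1] += 1
    return {
        category: {
            "safe_rate": safe / total,
            "n": total,
            "ci95": wilson_interval(safe, total),
        }
        for category, (safe, total) in counts.items()
    }
```

Reporting the interval next to the rate makes it visible when a category has too few test cases to support a comparison between models.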