# testing

api-securityauthenticationrate-limitingtestinginfrastructure

LLM API 安全測試

LLM API 的安全測試方法論，涵蓋認證、速率限制、輸入驗證、輸出過濾與 LLM 特有的 API 漏洞。

labsapi-authenticationtestingbeginner

Lab: API Authentication Security Testing

測試 LLM API authentication mechanisms for vulnerabilities including 符元 leakage, replay attacks, and scope bypass.

Lab: API-Based Model Testing

學習 to test 語言模型 through their APIs including OpenAI, Anthropic, and local Ollama endpoints. 建構 reusable API testing functions with proper error handling.

labapitestingopenai

basiclabbeginnerlabsautomatedtesting

Basic Automated 測試 Setup

Set up automated prompt testing with Python scripts to scale your security evaluation beyond manual testing.

testingdefensebasiclabbeginnerlabs

Basic 防禦 Mechanism 測試

Identify與categorize the defensive mechanisms present in a target LLM application through structured probing.

labsinstruction-hierarchytestingbeginner

Instruction Hierarchy Testing

測試 how models prioritize conflicting instructions between system, user, and assistant roles.

labsmulti-languagetestingbeginner

Lab: Multi-Language Prompt Testing

測試 LLM safety boundaries across multiple languages to identify inconsistencies in safety training coverage.

labbeginnerlabsplaygroundtesting

LLM Playground 安全測試

Use web-based LLM playgrounds to practice basic injection techniques without API setup requirements.

labsprompt-templatestestingbeginner

Lab: Prompt Template Vulnerability Testing

測試 common prompt template patterns for injection vulnerabilities including string interpolation and placeholder abuse.

labsconsistencytestingbeginner

實作：回應一致性測試

測試 LLM response consistency across repeated queries to identify non-deterministic safety boundary behavior.

labssafety-benchmarkstestingbeginner

Lab: Running Safety Benchmarks

Run standardized safety benchmarks against LLM models to establish baseline safety profiles for comparison.

labexpertfuzzertestingadversarialhands-on

實作：建構 AI 模糊測試器

建構 a comprehensive fuzzing framework for large 語言模型 that 系統性地 generates malformed, unexpected, and 對抗性 inputs to discover crashes, unexpected behaviors, and safety bypasses.

專家

Deceptive 對齊測試框架

建構testing framework for detecting mesa-optimization與deceptive alignment in fine-tuned models.

labsdeceptive-alignmenttestingexpert

專家

實驗：代理工作流程安全測試

測試多步代理工作流程的注入點、狀態操弄與未授權的動作執行。

labsagent-workflowtestingintermediate

labscanary-tokenstestingintermediate

實驗：金絲雀 Token 有效性測試

在系統提示中部署並測試金絲雀 token，評估其偵測提示擷取的有效性。

labscloud-platformtestingintermediate

實驗：雲端 AI 平台安全測試

測試雲端 AI 平台組態，包含 AWS Bedrock guardrails、Azure 內容過濾器與 GCP 安全設定。

labscompliancetestingintermediate

實驗：AI 合規測試基礎

以結構化測試程序，依 OWASP LLM Top 10 與 NIST AI RMF 的要求測試 LLM 應用。

labssession-isolationtestingintermediate

實驗：會話隔離測試

測試多租戶 LLM 部署中的會話隔離，偵測跨會話的資料外洩與脈絡污染。

claudetestingmethodologyapi-probingsafety-boundariesmodel-tiers

Claude Testing Methodology

Systematic methodology for red teaming Claude models, including API probing, model card analysis, safety boundary mapping, and comparative testing across Opus, Sonnet, and Haiku tiers.

geminitestingmethodologyvertex-aiai-studiomultimodal-testing

Gemini Testing Methodology

Systematic methodology for red teaming Gemini, including Vertex AI API probing, Google AI Studio testing, multimodal test case design, and grounding attack validation.

gpt-4testingmethodologyapi-probingsafety-boundariesred-teaming

GPT-4 測試方法論

為紅隊 GPT-4 之系統化方法論，含 API 基探測技術、速率限制考量、內容政策對應與安全邊界發現。

referencepromptfooconfigurationtesting

Promptfoo 組態指南

為 LLM 安全測試組態 Promptfoo 的詳細指南,包含供應商設定、測試斷言與 CI/CD 整合。

automatedtestingwalkthroughsdefense

Automated 防禦 Regression Testing

Build automated regression tests for LLM security defenses using attack replay and continuous monitoring.

walkthroughsdefensetestingautomation

Automated 防禦 Testing Pipeline

Build an automated pipeline that continuously tests defensive measures against evolving attack techniques.

walkthroughsmethodologycompliancetesting

合規導向測試方法論

將法規要求對應至具體測試案例,以支援合規導向的 AI 紅隊評估。

eu-ai-actcomplianceregulationtestingmethodologywalkthrough

Testing for EU AI Act Compliance

導覽 for conducting red team assessments that evaluate compliance with the EU AI Act requirements, covering risk classification, mandatory testing obligations, and documentation requirements.