# walkthrough

attacksmemorywalkthroughpersistencewalkthroughs

Memory Persistence 攻擊詳解

Walkthrough of achieving persistent memory manipulation in agent systems for cross-session influence.

memory-poisoningagent-memorypersistenceinjectionagent-securitywalkthrough

記憶體投毒 Step by Step

導覽 of persisting injection payloads in agent memory systems to achieve long-term compromise of LLM-based agents.

multi-agentprompt-relayinjection-chainagent-pipelineagent-securitywalkthrough

Multi-代理 Prompt Relay

進階 walkthrough of relaying prompt injection payloads across multiple agents in a pipeline, achieving cascading compromise of multi-agent systems.

orchestratormulti-agenttask-delegationagent-securitycoordination-attackwalkthrough

Orchestrator Manipulation

進階 walkthrough of attacking the orchestrator layer in multi-agent systems to gain control over task delegation, agent coordination, and system-wide behavior.

plugin-confusiontool-selectionagent-securitymisdirectionwalkthrough

Plugin Confusion 攻擊

導覽 of confusing LLM agents about which plugin or tool to invoke, causing them to call the wrong tool or pass data to unintended destinations.

attacksagentsprivilegeescalationwalkthroughwalkthroughs

Agent Privilege Escalation 詳解

Walkthrough of escalating privileges in multi-agent systems through trust chain exploitation.

attackscamouflagesemanticwalkthroughwalkthroughs

語意偽裝詳解

Walkthrough of crafting semantically camouflaged injections that evade both classifiers and human review.

attackssupplychainpoisoningwalkthroughwalkthroughs

Model Supply Chain Poisoning

Walkthrough of poisoning ML supply chains through dependency confusion, model weight manipulation, and hub attacks.

tool-callingfunction-callinginjectionagent-securitywalkthrough

工具 Call Injection

Step-by-step walkthrough of injecting malicious parameters into LLM tool and function calls to execute unauthorized actions in agent systems.

attacksvisionmodelwalkthroughwalkthroughs

Vision Model 攻擊詳解 (Attack 詳解)

Step-by-step walkthrough of visual prompt injection, adversarial images, and OCR exploitation in vision-language models.

attacksxmlinjectionwalkthroughjsonwalkthroughs

XML and JSON Injection in LLM Apps

Walkthrough of exploiting XML and JSON parsing in LLM applications for injection and data manipulation.

input-sanitizationprompt-injectiondefensellm-securityinput-validationwalkthrough

Building a Production Input Sanitizer

Step-by-step walkthrough for building a production-grade input sanitizer that cleans, normalizes, and validates user prompts before they reach an LLM, covering encoding normalization, injection pattern stripping, length enforcement, and integration testing.

canary-tokensprompt-injectiondetectionmonitoringdefensewalkthrough

Canary Token Deployment

Step-by-step walkthrough for deploying canary tokens in LLM system prompts and context to detect prompt injection and data exfiltration attempts, covering token generation, placement strategies, monitoring, and alerting.

access-controlcapabilitiespermissionsauthorizationdefensewalkthrough

能力式存取控制

為 LLM 功能實作細粒度能力控管的逐步演練,涵蓋能力符記設計、權限範圍、動態能力授予與稽核軌跡。

constitutional-aiclassifierprinciplessafetydefensewalkthrough

憲法式分類器設置

實作憲法式 AI 風格分類器以原則集合評估 LLM 輸出的逐步演練,涵蓋原則定義、分類器訓練、思維鏈評估與部署。

content-filteringdefenseclassifiersmoderationllm-judgeimplementationwalkthrough

Setting Up Content Filtering

Step-by-step walkthrough for implementing multi-layer content filtering for AI applications: keyword filtering, classifier-based detection, LLM-as-judge evaluation, testing effectiveness, and tuning for production.

nemo-guardrailsnvidiaguardrailscolangdefensewalkthrough

部署 NeMo Guardrails

於生產環境設置 NVIDIA NeMo Guardrails 的逐步演練，涵蓋安裝、Colang 配置、自訂動作、主題與安全護欄、測試與監控。

dual-llmarchitecturevalidationdefense-in-depthdefensewalkthrough

雙 LLM 架構設置

實作雙 LLM 模式的逐步演練——一個模型產生回應、另一個模型驗證之,涵蓋架構設計、驗證者提示詞工程、延遲最佳化與失敗處理。

guardrailsnemoinput-validationoutput-filteringpii-detectioncontent-policywalkthrough

Setting Up AI Guardrails

Step-by-step walkthrough for implementing AI guardrails: input validation with NVIDIA NeMo Guardrails, prompt injection detection with rebuff, output filtering for PII and sensitive data, and content policy enforcement.

hallucinationdetectionfactual-groundingoutput-filteringdefensewalkthrough

Hallucination Detection

Step-by-step walkthrough for detecting and flagging hallucinated content in LLM outputs, covering factual grounding checks, self-consistency verification, source attribution validation, and confidence scoring.

guardrailsinput-validationprompt-injection-defensecontent-safetydefensewalkthrough

Building Input Guardrails for LLM Applications

Step-by-step walkthrough for implementing production-grade input guardrails that protect LLM applications from prompt injection, content policy violations, and resource abuse through multi-layer validation, classification, and rate limiting.

incident-responseplaybookbreach-responsecontainmentinvestigationdefensewalkthrough

Incident Response Playbook for AI 安全 Breaches

導覽 for building an incident response playbook tailored to AI security breaches, covering detection triggers, triage procedures, containment strategies, investigation workflows, remediation validation, and post-incident review processes.

incident-responseplaybookstabletopcontainmentevidencepreparationwalkthrough

AI Incident Response Preparation

Step-by-step walkthrough for building AI incident response capabilities: playbook development, tabletop exercises, containment procedures, communication templates, and evidence collection workflows.

defenseguardrailsmonitoringincident-responseimplementationwalkthrough

防禦實作演練

實作 AI 安全防禦的逐步指南：護欄配置、監控與偵測設置，以及 AI 系統的事件回應準備。

instruction-hierarchyprompt-injectionprivilege-separationdefenseadvancedwalkthrough

Instruction Hierarchy Enforcement (防禦導覽)

Step-by-step walkthrough for enforcing instruction priority in LLM applications, ensuring system-level instructions always take precedence over user inputs through privilege separation, instruction tagging, and validation layers.

llm-judgeoutput-validationsafetyevaluationdefensewalkthrough

LLM 評審實作

使用 LLM 評審另一個 LLM 之輸出以評估安全與品質的逐步演練,涵蓋評審提示詞設計、評分準則、校準、成本最佳化與部署模式。

output-validationsanitizationresponse-safetyschema-validationdefensewalkthrough

Validating and Sanitizing 模型 Outputs

導覽 for building output validation systems that verify LLM responses meet structural, factual, and safety requirements before delivery, covering schema validation, factual grounding checks, response consistency verification, and safe rendering.

monitoringobservabilitysecurity-eventsalertingproductiondefensewalkthrough

Production Monitoring for LLM 安全 Events

導覽 for building production monitoring systems that detect LLM security events in real time, covering log collection, anomaly detection, alert configuration, dashboard design, and incident correlation.

monitoringlogginganomaly-detectionsiemalertingobservabilitywalkthrough

AI Monitoring Setup

Step-by-step walkthrough for implementing AI system monitoring: inference logging, behavioral anomaly detection, alert configuration, dashboard creation, and integration with existing SIEM platforms.

defense-in-depthinput-validationpipelinemulti-layerdefensewalkthrough

Multi-Layer Input Validation

Step-by-step walkthrough for building a defense-in-depth input validation pipeline that combines regex matching, semantic similarity, ML classification, and rate limiting into a unified validation system for LLM applications.

output-filteringclassifiercontent-moderationsafetydefensewalkthrough

Output Content Classifier

Step-by-step walkthrough for building a classifier to filter harmful LLM outputs, covering taxonomy definition, multi-label classification, threshold calibration, and deployment as a real-time output gate.

output-filteringcontent-safetypii-detectionresponse-validationdefensewalkthrough

Output Filtering and Content Safety Implementation

導覽 for building output filtering systems that inspect and sanitize LLM responses before they reach users, covering content classifiers, PII detection, response validation, canary tokens, and filter bypass resistance.

piiredactionprivacyoutput-filteringcompliancedefensewalkthrough

PII Redaction Pipeline

Step-by-step walkthrough for building an automated PII detection and redaction pipeline for LLM outputs, covering regex-based detection, NER-based detection, presidio integration, redaction strategies, and compliance testing.

classifiermachine-learningprompt-injectiondetectiontrainingdefensewalkthrough

Prompt Classifier 訓練

Step-by-step walkthrough for training a machine learning classifier to detect malicious prompts, covering dataset curation, feature engineering, model selection, training pipeline, evaluation, and deployment as a real-time detection service.

prompt-injectionmachine-learningdetectionclassifierdefensewalkthrough

ML-Based 提示詞注入 Detection Systems

導覽 for building and deploying ML-based prompt injection detection systems, covering training data collection, feature engineering, model architecture selection, threshold tuning, production deployment, and continuous improvement.

ragaccess-controlretrievaldata-leakageauthorizationdefensewalkthrough

Implementing Access Control in RAG Pipelines

導覽 for building access control systems in RAG pipelines that enforce document-level permissions, prevent cross-user data leakage, filter retrieved context based on user authorization, and resist retrieval poisoning attacks.

rate-limitingabuse-preventionapi-securitytoken-bucketcost-controldefensewalkthrough

Rate Limiting and Abuse Prevention for LLM APIs

導覽 for implementing rate limiting and abuse prevention systems for LLM API endpoints, covering token bucket algorithms, per-user quotas, cost-based limiting, anomaly detection, and graduated enforcement.

rate-limitingdefenseabuse-preventiontokensthrottlingimplementationwalkthrough

AI Rate Limiting 導覽

Step-by-step walkthrough for implementing token-aware rate limiting for AI applications: request-level limiting, token budget enforcement, sliding window algorithms, abuse detection, and production deployment.

regexprompt-injectionpattern-matchinginput-filteringdefensewalkthrough

Regex-Based Prompt Filter

Step-by-step walkthrough for building a regex-based prompt filter that detects common injection payloads using pattern matching, covering pattern library construction, performance optimization, false positive management, and continuous updates.

response-boundariesoutput-filteringcontent-policyguardrailsdefensewalkthrough

Response Boundary Enforcement

Step-by-step walkthrough for keeping LLM responses within defined topic, format, and content boundaries, covering boundary definition, violation detection, response rewriting, and boundary drift monitoring.

sandboxtool-executionisolationcontainerssecuritydefensewalkthrough

沙箱化工具執行

於隔離沙箱中執行 LLM 工具呼叫的逐步演練，涵蓋以容器為基礎的隔離、資源限制、網路限制與輸出清理。

sandboxingtool-usepermissionsagent-securityleast-privilegedefensewalkthrough

Sandboxing and Permission 模型s for 工具-Using 代理s

導覽 for implementing sandboxing and permission models that constrain tool-using LLM agents, covering least-privilege design, parameter validation, execution sandboxes, approval workflows, and audit logging.

embeddingssemantic-similarityprompt-injectionvector-databasedefensewalkthrough

Semantic Similarity Detection

Step-by-step walkthrough for using text embeddings to detect semantically similar prompt injection attempts, covering embedding model selection, vector database setup, similarity threshold tuning, and production deployment.

session-isolationmulti-tenancysecuritydefensewalkthrough

會話隔離模式

於 LLM 應用中隔離使用者會話的逐步演練，防止使用者之間的上下文、記憶與權限互相污染。

structured-outputvalidationjson-schemaoutput-filteringdefensewalkthrough

Structured Output Validation

Step-by-step walkthrough for validating structured LLM outputs against schemas, covering JSON schema validation, type coercion, constraint enforcement, and handling malformed model outputs gracefully.

toxicityscoringoutput-filteringcontent-moderationsafetydefensewalkthrough

毒性評分管線

建置 LLM 輸出過濾毒性評分管線的逐步詳解,涵蓋模型選擇、多維評分、閾值校準與即時評分的生產部署。

unicodenormalizationencodinghomoglyphdefensewalkthrough

Unicode Normalization 防禦

Step-by-step walkthrough for implementing Unicode normalization to prevent encoding-based prompt injection bypasses, covering homoglyph detection, invisible character stripping, bidirectional text handling, and normalization testing.

agentengagementfunction-callingtool-abusepermission-escalationwalkthrough

代理 System 紅隊 Engagement

Complete walkthrough for testing tool-using AI agents: scoping agent capabilities, exploiting function calling, testing permission boundaries, multi-step attack chains, and session manipulation.

apiengagementauthenticationrate-limitinginput-validationmodel-fingerprintingwalkthrough

AI API 紅隊 Engagement

Complete walkthrough for testing AI APIs: endpoint enumeration, authentication bypass, rate limit evasion, input validation testing, output data leakage, and model fingerprinting through API behavior.

chatbotengagementprompt-injectionsystem-promptcontent-filterwalkthrough

Chatbot 紅隊 Engagement

Step-by-step walkthrough for a complete chatbot red team assessment: scoping, system prompt extraction, content filter bypass, PII leakage testing, multi-turn manipulation, and professional reporting.

engagementswalkthroughmethodologyscopingreporting

案件演練概覽

完整 AI 紅隊案件的逐步演練：從範圍界定與偵察，到攻擊執行與報告撰寫，依目標系統類型分類組織。

multi-modelengagementmodel-routingorchestrationfallbackwalkthrough

Multi-模型 System 紅隊 Engagement

Complete walkthrough for testing systems that use multiple AI models: model-to-model injection, routing logic exploitation, fallback chain abuse, inter-model data leakage, and orchestration layer attacks.

ragengagementdocument-injectionembeddingretrievaldata-exfiltrationwalkthrough

RAG System 紅隊 Engagement

Complete walkthrough for testing RAG applications: document injection, cross-scope retrieval exploitation, embedding manipulation, data exfiltration through retrieval, and chunk boundary attacks.

metricseffectivenessreportingkpisprogram-managementmethodologywalkthrough

Measuring and Reporting AI 紅隊 Effectiveness

導覽 for defining, collecting, and reporting metrics that measure the effectiveness of AI red teaming programs, covering coverage metrics, detection rates, time-to-find analysis, remediation tracking, and ROI calculation.

threat-modelingai-securityrisk-assessmentmethodologywalkthrough

Building AI-Specific Threat 模型s

Step-by-step walkthrough for creating threat models tailored to AI and LLM systems, covering asset identification, threat enumeration, attack tree construction, and risk prioritization.

attack-executiontechniquesattack-chainsevidencefindingsworkflowwalkthrough

攻擊執行工作流程

執行 AI 紅隊攻擊之逐步工作流程：自偵察發現選擇技術、打造攻擊鏈、即時記錄發現、管理證據，與知曉何時升級或停止。

attack-surfacemappingreconnaissanceai-securitymethodologywalkthrough

Mapping the 攻擊 Surface of AI Systems

Systematic walkthrough for identifying and mapping every attack surface in an AI system, from user inputs through model inference to output delivery and tool integrations.

communicationreportingstakeholdersexecutive-summaryfindings-presentationwalkthrough

Communicating AI 紅隊 Findings to Stakeholders

導覽 for effectively communicating AI red team findings to diverse stakeholders, covering executive summaries, technical deep dives, live demonstrations, risk narratives, and remediation roadmaps tailored to audience expertise levels.

continuous-testingci-cdautomationpipelineregression-testingmethodologywalkthrough

Setting Up Continuous AI 紅隊ing Pipelines

導覽 for building continuous AI red teaming pipelines that automatically test LLM applications on every deployment, covering automated scan configuration, CI/CD integration, alert thresholds, regression testing, and dashboard reporting.

engagementkickoffscoperules-of-engagementlegalmethodologywalkthrough

委任啟動流程指南

啟動 AI 紅隊委任的逐步指南：客戶初次會議、範圍界定、交戰規則、法律協議、環境設置與工具選擇。

eu-ai-actcomplianceregulationtestingmethodologywalkthrough

Testing for EU AI Act Compliance

導覽 for conducting red team assessments that evaluate compliance with the EU AI Act requirements, covering risk classification, mandatory testing obligations, and documentation requirements.

evidence-collectiondocumentationmethodologyreportingchain-of-custodywalkthrough

Evidence Collection and Documentation Best Practices

導覽 for systematic evidence collection during AI red team engagements, covering request/response capture, screenshot methodology, chain-of-custody documentation, reproducibility requirements, and evidence organization for reports.

evidencedocumentationmethodologyforensicsreportingwalkthrough

Evidence Collection Methods for AI 紅隊s

Comprehensive methods for collecting, preserving, and organizing red team evidence from AI system assessments, including API logs, screenshots, reproduction scripts, and chain-of-custody procedures.

executive-summaryreportingcommunicationmethodologywalkthrough

Writing Executive Summaries for AI 紅隊 Reports

指南 to writing clear, impactful executive summaries for AI red team assessment reports that communicate risk to non-technical stakeholders and drive remediation decisions.

severityclassificationvulnerabilityrisk-assessmentmethodologywalkthrough

Classifying AI 漏洞 Severity

Framework for consistently classifying the severity of AI and LLM vulnerabilities, with scoring criteria, impact assessment, and examples across common finding categories.

methodologyengagementworkflowprocesswalkthrough

方法論導覽

AI 紅隊案件每個階段的逐步導覽：啟動、偵察、攻擊執行與報告撰寫。

iso-42001auditcomplianceai-managementstandardswalkthrough

Preparing for ISO 42001 AI Management System Audit

進階 walkthrough for preparing organizations for ISO 42001 AI management system audits, covering control assessment, evidence preparation, gap remediation, and audit readiness.

mitre-atlasattack-mappingframeworktacticstechniqueswalkthrough

運用 MITRE ATLAS 進行 AI 攻擊對應

將 AI 紅隊演練的活動與發現對應至 MITRE ATLAS 框架的實作詳解,涵蓋戰術與技術辨識、攻擊鏈建構以及 Navigator 視覺化。

owaspllm-top-10classificationstandardsmethodologywalkthrough

將發現對應至 OWASP LLM Top 10

將 AI 紅隊發現對應至 OWASP LLM 應用程式 Top 10 的實作詳解,涵蓋分類指引、報告範本與緩解對應。

comparative-testingmulti-modelbenchmarkingsecurity-evaluationmethodologywalkthrough

Comparative 安全 Testing Across Multiple LLMs

導覽 for conducting systematic comparative security testing across multiple LLM providers and configurations, covering test standardization, parallel execution, cross-model analysis, and differential vulnerability reporting.

nistai-rmfrisk-managementcomplianceframeworkwalkthrough

NIST AI RMF 評量導覽

Step-by-step guide for conducting assessments aligned with the NIST AI Risk Management Framework, covering the Govern, Map, Measure, and Manage functions for AI system security.

pre-engagementchecklistpreparationmethodologyplanningwalkthrough

Pre-Engagement Preparation Checklist

Complete pre-engagement preparation checklist for AI red team operations covering team readiness, infrastructure setup, legal requirements, and initial reconnaissance planning.

reconnaissancesystem-promptmodel-identificationcapability-mappingattack-surfacewalkthrough

偵察工作流程

為 AI 紅隊委任之系統化偵察工作流程：系統提示提取、模型辨識、能力繪製、API 列舉，與記錄攻擊面。

remediationverificationretestingregressionmethodologywalkthrough

Verifying That Remediations Are Effective

導覽 for planning and executing remediation verification testing (retesting) to confirm that AI vulnerability fixes are effective and do not introduce regressions.

report-writingfindingsremediationexecutive-summaryrisk-ratingswalkthrough

報告撰寫實戰演練

撰寫 AI 紅隊報告之逐步指引：結構、執行摘要、技術發現、風險評級、修復建議、同儕審查與交付。

risk-scoringvulnerability-assessmentseverityCVSSmethodologywalkthrough

Risk Scoring Frameworks for AI Vulnerabilities

導覽 for applying risk scoring frameworks to AI and LLM vulnerabilities, covering CVSS adaptation for AI, custom AI risk scoring matrices, severity classification, business impact assessment, and integration with existing vulnerability management processes.

rules-of-engagementtemplatelegalauthorizationmethodologywalkthrough

Rules of Engagement Template for AI 紅隊 Operations

Step-by-step guide to creating comprehensive rules of engagement documents for AI red team assessments, covering authorization, scope, constraints, communication, and legal protections.

scopingengagementplanningmethodologystatement-of-workrisk-assessmentwalkthrough

How to Scope an AI 紅隊 Engagement

Comprehensive walkthrough for scoping AI red team engagements from initial client contact through statement of work, covering target enumeration, risk-based prioritization, resource estimation, boundary definition, and legal considerations.

scopingchecklistpre-engagementmethodologyplanningrules-of-engagementwalkthrough

AI 紅隊 Scoping Checklist 導覽

Systematic walkthrough of the pre-engagement scoping process for AI red team assessments: stakeholder identification, target enumeration, scope boundary definition, resource estimation, and rules of engagement documentation.

technical-appendixreportingdocumentationevidencemethodologywalkthrough

Creating Detailed Technical Appendices

指南 to building comprehensive technical appendices for AI red team reports, including evidence formatting, reproduction procedures, tool output presentation, and raw data organization.

test-planplanningmethodologytest-casesautomationwalkthrough

Developing Comprehensive AI 安全 Test Plans

Step-by-step guide to developing structured test plans for AI red team engagements, covering test case design, automation strategy, coverage mapping, and execution scheduling.

threat-modelingllm-securityrisk-assessmentmethodologySTRIDEwalkthrough

Threat 模型ing for LLM-Powered Applications

Step-by-step walkthrough for conducting threat modeling sessions specifically tailored to LLM-powered applications, covering data flow analysis, trust boundary identification, AI-specific threat enumeration, risk assessment, and mitigation planning.

threat-modelingstrideattack-treesworkshopfacilitationmethodologywalkthrough

AI Threat 模型ing Workshop 導覽

Step-by-step guide to running an AI-focused threat modeling workshop: adapting STRIDE for AI systems, constructing attack trees for LLM applications, participant facilitation techniques, and producing actionable threat models.

anyscalerayray-servecluster-securitydashboardactor-isolationwalkthrough

Anyscale Ray Serve ML Testing

End-to-end walkthrough for security testing Ray Serve ML deployments on Anyscale: cluster enumeration, serve endpoint exploitation, Ray Dashboard exposure, actor isolation testing, and observability review.

autogenmulti-agentcode-executionagent-injectionconversation-controlsandboxwalkthrough

AutoGen Multi-代理 System Testing

End-to-end walkthrough for security testing AutoGen multi-agent systems: agent enumeration, inter-agent injection, code execution sandbox assessment, conversation manipulation, and escalation path analysis.

awssagemakercloudmodel-extractioniaminference-pipelinewalkthrough

AWS SageMaker 紅隊演練

End-to-end walkthrough for red teaming ML models deployed on AWS SageMaker: endpoint enumeration, IAM policy analysis, model extraction testing, inference pipeline exploitation, and CloudTrail log review.

azureazure-mlcloudmanaged-endpointscompute-instancedata-storewalkthrough

Azure ML 安全 Testing

End-to-end walkthrough for security testing Azure Machine Learning endpoints: workspace enumeration, managed online endpoint exploitation, compute instance assessment, data store access review, and Azure Monitor analysis.

azureopenaired-teamcontent-safetymanaged-identityprompt-flowwalkthrough

Azure OpenAI 紅隊導覽

Complete red team walkthrough for Azure OpenAI deployments: testing content filters, managed identity exploitation, prompt flow injection, data integration attacks, and Azure Monitor evasion.

azureopenaicloudcontent-filteringmanaged-identityprompt-flowwalkthrough

Azure OpenAI 紅隊導覽 (Platform 導覽)

End-to-end walkthrough for red teaming Azure OpenAI deployments: deployment configuration review, content filtering bypass testing, managed identity exploitation, prompt flow assessment, and diagnostic log analysis.

awsbedrockred-teamguardrailsknowledge-baseagentswalkthrough

AWS Bedrock 紅隊導覽

Complete guide to red teaming AWS Bedrock deployments: testing guardrails bypass techniques, knowledge base data exfiltration, agent prompt injection, model customization abuse, and CloudTrail evasion.

awsbedrockcloudguardrailsknowledge-basecloudtrailwalkthrough

AWS Bedrock 紅隊導覽 (Platform 導覽)

End-to-end walkthrough for red teaming AI systems on AWS Bedrock: setting up access, invoking models via the Converse API, testing Bedrock Guardrails, exploiting knowledge bases, and analyzing CloudTrail logs.

crewaiagentscrewtask-injectiondelegationtool-securitywalkthrough

CrewAI 代理 Application 安全 Testing

End-to-end walkthrough for security testing CrewAI agent applications: crew enumeration, agent role exploitation, task injection, tool security assessment, delegation chain manipulation, and output validation.

databricksmlflowmodel-registryunity-catalogserving-endpointswalkthrough

Databricks MLflow Deployment Audit

End-to-end walkthrough for auditing MLflow deployments on Databricks: workspace enumeration, model registry security, serving endpoint testing, Unity Catalog integration review, and audit log analysis.

dspypipelinessignaturesoptimizersretrievalcompiled-promptswalkthrough

DSPy Pipeline 安全 Testing

End-to-end walkthrough for security testing DSPy optimized LLM pipelines: module enumeration, signature exploitation, optimizer manipulation, retrieval module assessment, and compiled prompt analysis.

gcpvertex-aicloudmodel-servingiampipelineswalkthrough

GCP Vertex AI 安全 Testing

End-to-end walkthrough for security testing Vertex AI deployments on Google Cloud: endpoint enumeration, IAM policy analysis, model serving exploitation, pipeline assessment, and Cloud Audit Logs review.

huggingfacesecurity-auditsupply-chainmodel-scanningprovenancewalkthrough

Hugging Face 安全 Audit 導覽

Step-by-step walkthrough for auditing Hugging Face models: scanning for malicious model files, verifying model provenance, assessing model card completeness, and testing Spaces and Inference API security.

huggingfacespacesgradiostreamlitapi-testingmodel-hubwalkthrough

HuggingFace Spaces 安全 Testing

End-to-end walkthrough for security testing HuggingFace Spaces applications: Space enumeration, Gradio/Streamlit exploitation, API endpoint testing, secret management review, and model access control assessment.

huggingfacemodel-hubtransformersmalicious-modelssupply-chainwalkthrough

Hugging Face Hub 紅隊導覽

導覽 for assessing AI models on Hugging Face Hub: model security assessment, scanning for malicious models, Transformers library testing, and Spaces application evaluation.

platformscloudawsazuregcphuggingfacewalkthrough

雲端 AI 平台導覽

在主要雲端平台上紅隊演練 AI 系統的動手導覽：AWS Bedrock、Azure OpenAI、Google Vertex AI 與 Hugging Face Hub。

langchainchainsagentstoolsragmemoryprompt-injectionwalkthrough

LangChain Application 安全 Testing

End-to-end walkthrough for security testing LangChain applications: chain enumeration, prompt injection through chains, tool and agent exploitation, retrieval augmented generation attacks, and memory manipulation.

llamaindexragquery-enginedata-connectorsresponse-synthesisagentswalkthrough

LlamaIndex RAG Application 安全 Testing

End-to-end walkthrough for security testing LlamaIndex RAG applications: index enumeration, query engine exploitation, data connector assessment, response synthesis manipulation, and agent pipeline testing.

modalserverlesscontainer-securityweb-endpointssecretsvolumeswalkthrough

Modal Serverless AI Deployment Testing

End-to-end walkthrough for security testing Modal serverless AI deployments: function enumeration, web endpoint exploitation, secret management assessment, volume security testing, and container escape analysis.

ollamalocal-modelssecurity-testingsafety-comparisonmodelfilewalkthrough

Ollama 安全 Testing 導覽

Complete walkthrough for security testing locally-hosted models with Ollama: comparing safety across models, testing system prompt extraction, API security assessment, and 模型file configuration hardening.

replicateapi-testingcogwebhooksmodel-servingwalkthrough

Replicate API 安全 Testing

End-to-end walkthrough for security testing models on Replicate: model enumeration, prediction API exploitation, webhook security, Cog container assessment, and billing abuse prevention.

runpodserverlessgpudockerwebhookshandler-securitywalkthrough

RunPod Serverless GPU Endpoint Testing

End-to-end walkthrough for security testing RunPod serverless GPU endpoints: endpoint enumeration, handler exploitation, webhook security, Docker template assessment, and cost abuse prevention.

semantic-kernelmicrosoftpluginsplannersmemoryazurewalkthrough

Microsoft Semantic Kernel 安全 Testing

End-to-end walkthrough for security testing Semantic Kernel applications: kernel enumeration, plugin exploitation, planner manipulation, memory and RAG assessment, and Azure integration security review.

together-aiapi-testinginferencefine-tuningfunction-callingwalkthrough

Together AI 安全 Testing

End-to-end walkthrough for security testing Together AI deployments: API enumeration, inference endpoint exploitation, fine-tuning security review, function calling assessment, and rate limit analysis.

gcpvertex-aicloudprediction-endpointsmodel-gardenfeature-storewalkthrough

Vertex AI 紅隊導覽

End-to-end walkthrough for red teaming Google Cloud Vertex AI: prediction endpoint testing, 模型 Garden security assessment, Feature Store probing, and Cloud Logging analysis.

gcpvertex-aired-teammodel-gardenfeature-storeprediction-endpointswalkthrough

Vertex AI 紅隊導覽 (Platform 導覽)

Complete red team walkthrough for Google Vertex AI: testing prediction endpoints, 模型 Garden assessments, Feature Store probing, and exploiting Vertex AI 代理s and Extensions.

artkitadversarial-testingrobustnessautomated-testingllm-evaluationwalkthrough

Adversarial Robustness Testing with ARTKit

導覽 for using ARTKit (Adversarial Robustness Testing Kit) to evaluate LLM application resilience through automated adversarial testing, covering test flow configuration, challenger setup, evaluator design, and results analysis.

burp-suiteproxyinterceptingfuzzingapi-testingstreamingwalkthrough

Burp Suite for AI APIs

Using Burp Suite to intercept, analyze, and fuzz LLM API calls: proxy setup, intercepting streaming responses, parameter fuzzing with Intruder, and building custom extensions for AI-specific testing.

burp-suiteapi-testingllm-securityproxyweb-securitywalkthrough

Using Burp Suite for LLM API Endpoint Testing

導覽 for using Burp Suite to intercept, analyze, and attack LLM API endpoints, covering proxy configuration, request manipulation, automated scanning for injection flaws, and custom extensions for AI-specific testing.

counterfitadversarial-mlmicrosoftrobustness-testingautomationwalkthrough

Counterfit 導覽

Complete walkthrough of Microsoft's Counterfit adversarial ML testing framework: installation, target configuration, running attacks against ML models, interpreting results, and automating adversarial robustness assessments.

garakcustom-probesadvancedattack-vectorsprobe-developmentwalkthrough

Writing Custom Garak Probes for Novel 攻擊 Vectors

進階 walkthrough for building custom Garak probes that target novel and emerging attack vectors, covering probe architecture, payload generation, detector pairing, and integration into automated scanning pipelines.

garakci-cdautomationgithub-actionsgitlab-ciwalkthrough

Integrating Garak into CI/CD Pipelines

中階 walkthrough on automating garak vulnerability scans within CI/CD pipelines, including GitHub Actions, Git實驗室 CI, threshold-based gating, result caching, and cost management strategies.

garakcustom-probesvulnerability-scanningextensibilitywalkthrough

Writing Custom Garak Probes

中階 walkthrough on creating custom garak probes tailored to application-specific attack surfaces, including probe structure, prompt engineering, custom detectors, and testing workflows.

garakdetectorscustom-detectorsadvancedwalkthrough

Building Custom Garak Detectors

進階 walkthrough on creating custom garak detectors for specific success criteria, including regex-based detectors, ML-based classifiers, multi-signal scoring, and integration with external evaluation services.

garakvulnerability-scanningbeginnerfirst-scanwalkthrough

執行你之首次 Garak 掃描

自零執行你之第一個 garak 漏洞掃描之逐步初學者演練，涵蓋安裝、目標設置、探測選擇與基礎結果解讀。

garakgeneratorspluginscustom-apiadvancedwalkthrough

Writing Garak Generator Plugins for Custom API Targets

進階 walkthrough on writing garak generator plugins to connect to custom API endpoints, proprietary model servers, and non-standard inference interfaces for vulnerability scanning.

garakmcptool-useprobesadvancedwalkthrough

Setting Up Garak Probes for MCP 工具 Interactions

進階 walkthrough on configuring garak probes that target 模型 Context Protocol (MCP) tool interactions, testing for tool misuse, privilege escalation through tools, and data exfiltration via tool calls.

garakmodel-comparisonvulnerability-scanningbenchmarkingwalkthrough

Comparing 漏洞 Profiles Across 模型s with Garak

中階 walkthrough on using garak to run identical vulnerability scans across multiple models, comparing results to understand relative security postures and make informed model selection decisions.

garakreportinganalysisvulnerability-assessmentwalkthrough

Deep Dive into Garak Scan Report Analysis

中階 walkthrough on analyzing garak scan reports, including JSONL parsing, false positive identification, vulnerability categorization, executive summary generation, and trend tracking.

garakvulnerability-scanningprobesautomationci-cdwalkthrough

Garak End-to-End 導覽

Complete walkthrough of NVIDIA's garak LLM vulnerability scanner: installation, configuration, running probes against local and hosted models, interpreting results, writing custom probes, and CI/CD integration.

harmbenchevaluationbenchmarkssafetyred-team-automationwalkthrough

HarmBench Evaluation Framework 導覽

Complete walkthrough of the HarmBench evaluation framework: installation, running standardized benchmarks against models, interpreting results, creating custom behavior evaluations, and comparing model safety across versions.

inspect-aiaisievaluationbenchmarksscorerscompliancewalkthrough

Inspect AI 導覽

Complete walkthrough of UK AISI's Inspect AI framework: installation, writing evaluations, running against models, custom scorers, benchmark suites, and producing compliance-ready reports.

langchainsecurity-testingagent-securitychain-injectionrag-securitywalkthrough

安全 Testing LangChain Applications

Step-by-step walkthrough for identifying and exploiting security vulnerabilities in LangChain-based applications, covering chain injection, agent manipulation, tool abuse, retrieval poisoning, and memory extraction attacks.

langfuseobservabilitymonitoringtracingsecurity-dashboardwalkthrough

Langfuse Observability 導覽

Complete walkthrough for using Langfuse to monitor AI applications for security anomalies: setting up tracing, building security dashboards, detecting prompt injection patterns, and creating automated alerts.

nemoguardrailscolangnvidiadialog-flowsbypass-testingwalkthrough

NeMo Guardrails 導覽

End-to-end walkthrough of NVIDIA NeMo Guardrails: installation, Colang configuration, dialog flow design, integration with LLM applications, and red team bypass testing techniques.

ollamalocal-modelsmodel-analysissafety-testingred-team-labwalkthrough

Local 模型 Analysis and Testing with Ollama

導覽 for using Ollama to run, analyze, and security-test local LLMs, covering model configuration, safety boundary testing, system prompt extraction, fine-tuning vulnerability assessment, and building a local red team lab.

ollamalocal-modelsuncensoredsafety-comparisonlab-setupwalkthrough

Ollama for Local 紅隊演練

Using Ollama as a local red teaming environment: model selection, running uncensored models, API-based testing, comparing safety across model families, and building a cost-free testing lab.

promptfooevaluationbeginnerfirst-evalwalkthrough

Running Your First Promptfoo Evaluation

初階 walkthrough for running your first promptfoo evaluation from scratch, covering installation, configuration, test case creation, assertion writing, and result interpretation.

promptfooautomationred-teamevaluationci-cdwalkthrough

Automating 紅隊 Evaluations with Promptfoo

Complete walkthrough for setting up automated red team evaluation pipelines using Promptfoo, covering configuration, custom evaluators, adversarial dataset generation, CI integration, and result analysis.

promptfooevaluationred-teamci-cdyamlassertionswalkthrough

Promptfoo End-to-End 導覽

Complete walkthrough of promptfoo for AI red teaming: configuration files, provider setup, running evaluations, red team plugins, assertion-based scoring, reporting, and CI/CD integration.

pyritazureazure-openaicontent-safetyenterprisewalkthrough

Integrating PyRIT with Azure OpenAI and Content Safety

中階 walkthrough on integrating PyRIT with Azure OpenAI Service and Azure AI Content Safety for enterprise red teaming, including managed identity authentication, content filtering analysis, and compliance reporting.

pyritconverterspayload-transformationevasionwalkthrough

Building Converter Pipelines for Payload Transformation in PyRIT

Intermediate walkthrough on using PyRIT's converter system to transform attack payloads through encoding, translation, paraphrasing, and other obfuscation techniques to evade input filters.

pyritscorerscustom-scoringevaluationwalkthrough

Creating Custom Scorers for PyRIT 攻擊 Evaluation

中階 walkthrough on building custom PyRIT scorers for evaluating attack success, including pattern-based, LLM-based, and multi-criteria scoring approaches.

pyritred-teamingbeginnerfirst-campaignwalkthrough

Running Your First PyRIT 紅隊 Campaign

初階 walkthrough for running your first PyRIT red team campaign from scratch, covering installation, target configuration, orchestrator setup, and basic result analysis.

pyrituifrontendbeginnervisual-interfacewalkthrough

使用 PyRIT UI 前端

初學者教學：使用 PyRIT 網頁式 UI 前端，以視覺化方式管理紅隊行動，包括啟動行動、監控進度，以及不需寫程式即可檢視結果。

pyritmulti-turnorchestrationred-teamingwalkthrough

Orchestrating Multi-Turn 攻擊 Sequences with PyRIT

Intermediate walkthrough on using PyRIT's orchestration capabilities for multi-turn red team campaigns, including attack strategy design, conversation management, and adaptive scoring.

pyritmulti-turnorchestrationred-teammicrosoftattack-campaignswalkthrough

Microsoft PyRIT for Orchestrated Multi-Turn 攻擊s

Comprehensive walkthrough for using Microsoft PyRIT to design and execute orchestrated multi-turn attack campaigns against LLM applications, covering orchestrator configuration, converter chains, scoring strategies, and campaign analysis.

pyritreportingred-team-reportsdocumentationwalkthrough

Generating Professional Reports from PyRIT Campaigns

中階 walkthrough on generating professional red team reports from PyRIT campaign data, including executive summaries, technical findings, remediation guidance, and visual dashboards.

pyrittargetsconfigurationapi-integrationwalkthrough

Configuring Diverse Targets in PyRIT

中階 walkthrough on configuring PyRIT targets for various model providers, custom APIs, local models, and application endpoints including authentication, system prompts, and rate limiting.

pyritmicrosoftorchestratorsconvertersmulti-turnred-teamingwalkthrough

PyRIT End-to-End 導覽

Complete walkthrough of Microsoft's Python Risk Identification Toolkit: setup, connecting to targets, running orchestrators, using converters, multi-turn attacks, and analyzing results with the web UI.

pythonautomationhttpxaiohttpreportingtest-harnesswalkthrough

Python 紅隊 Automation

Building custom AI red team automation with Python: test harnesses with httpx and aiohttp, result collection and analysis, automated reporting, and integration with existing tools like promptfoo and garak.

rebuffprompt-injectiondetectiondefense-testingevasionwalkthrough

Testing 提示詞注入防禦s with Rebuff

導覽 for using Rebuff to test and evaluate prompt injection detection capabilities, covering installation, detection pipeline analysis, adversarial evasion testing, custom rule development, and benchmarking detection accuracy.