# testing
標記為「testing」的 111 篇文章
Section Assessments Overview
How to use the AI red teaming section assessments, scoring methodology, and recommended completion order.
Capstone: Design and Implement an AI Safety Benchmark Suite
Build a comprehensive, reproducible benchmark suite for evaluating LLM safety across multiple risk dimensions including toxicity, bias, hallucination, and adversarial robustness.
Azure AI Content Safety Testing
Testing Azure AI Content Safety service for bypass vulnerabilities and configuration weaknesses.
Security Gaps in AI-Generated Tests
Analyzing how AI-generated test suites systematically miss security-relevant test cases, creating dangerous coverage illusions.
Advanced Test Generation Manipulation
Advanced techniques for manipulating AI-generated tests to create false assurance by generating tests that pass but don't verify security properties.
Content Safety APIs (Azure, OpenAI, Google)
Detailed comparison of Azure Content Safety, OpenAI Moderation API, and Google Cloud safety offerings, including API structures, category taxonomies, severity levels, testing methodology, and common gaps.
Defense Evaluation Methodology
Systematic methodology for evaluating the effectiveness of AI defenses against known attack categories.
Red Teaming Automation
Frameworks and tools for automating AI red teaming at scale, including CART pipelines, jailbreak fuzzing, regression testing, and continuous monitoring.
Coverage Tracking Systems
Implementing test coverage tracking for AI security assessments to ensure comprehensive evaluation across attack vectors and model behaviors.
Defense Evaluation Toolkit
Building a toolkit for systematically evaluating the effectiveness of LLM defenses.
Fuzzing LLM Applications
Applying fuzzing methodologies to LLM applications including grammar-based fuzzing, mutation-based fuzzing, and coverage-guided approaches.
Multi-Model Test Orchestrator
Orchestrating parallel security testing across multiple models and providers to identify cross-model vulnerabilities and transferable attacks.
Multi-Target Testing Framework
Build a framework for testing the same attack suite across multiple model providers simultaneously.
promptfoo for Red Teaming
Deep dive into promptfoo for AI red teaming: YAML configuration, assertion-based testing, red team plugins, custom evaluators, and regression testing workflows for LLM security.
Regression Testing for AI Security
Implementing automated regression testing for AI security properties that integrates into CI/CD pipelines and catches safety regressions.
Alignment Stability Under Fine-Tuning
Testing how safety alignment degrades under various fine-tuning configurations and datasets.
AI Audit Methodology
Comprehensive methodology for auditing AI systems including planning, evidence collection, testing procedures, report templates, and integration with red team assessments.
EU AI Act Compliance Testing
EU AI Act risk categories, testing requirements for high-risk AI systems, conformity assessment procedures, and how red teaming supports EU AI Act compliance.
LLM API Security Testing
Security testing methodology for LLM APIs, covering authentication, rate limiting, input validation, output filtering, and LLM-specific API vulnerabilities.
Lab: API Authentication Security Testing
Test LLM API authentication mechanisms for vulnerabilities including token leakage, replay attacks, and scope bypass.
Lab: API-Based Model Testing
Learn to test language models through their APIs including OpenAI, Anthropic, and local Ollama endpoints. Build reusable API testing functions with proper error handling.
Basic Automated Testing Setup
Set up automated prompt testing with Python scripts to scale your security evaluation beyond manual testing.
Basic Defense Mechanism Testing
Identify and categorize the defensive mechanisms present in a target LLM application through structured probing.
Instruction Hierarchy Testing
Test how models prioritize conflicting instructions between system, user, and assistant roles.
Lab: Multi-Language Prompt Testing
Test LLM safety boundaries across multiple languages to identify inconsistencies in safety training coverage.
LLM Playground Security Testing
Use web-based LLM playgrounds to practice basic injection techniques without API setup requirements.
Lab: Prompt Template Vulnerability Testing
Test common prompt template patterns for injection vulnerabilities including string interpolation and placeholder abuse.
Lab: Response Consistency Testing
Test LLM response consistency across repeated queries to identify non-deterministic safety boundary behavior.
Lab: Running Safety Benchmarks
Run standardized safety benchmarks against LLM models to establish baseline safety profiles for comparison.
Lab: Build an AI Fuzzer
Build a comprehensive fuzzing framework for large language models that systematically generates malformed, unexpected, and adversarial inputs to discover crashes, unexpected behaviors, and safety bypasses.
Deceptive Alignment Testing Framework
Build a testing framework for detecting mesa-optimization and deceptive alignment in fine-tuned models.
Lab: Agent Workflow Security Testing
Test multi-step agent workflows for injection points, state manipulation, and unauthorized action execution.
Lab: Canary Token Effectiveness Testing
Deploy and test canary tokens in system prompts to evaluate their effectiveness at detecting prompt extraction.
Lab: Cloud AI Platform Security Testing
Test cloud AI platform configurations including AWS Bedrock guardrails, Azure content filters, and GCP safety settings.
Lab: AI Compliance Testing Fundamentals
Test LLM applications against OWASP LLM Top 10 and NIST AI RMF requirements using structured test procedures.
Lab: Session Isolation Testing
Test session isolation in multi-tenant LLM deployments to detect cross-session data leakage and context contamination.
Claude Testing Methodology
Systematic methodology for red teaming Claude models, including API probing, model card analysis, safety boundary mapping, and comparative testing across Opus, Sonnet, and Haiku tiers.
Gemini Testing Methodology
Systematic methodology for red teaming Gemini, including Vertex AI API probing, Google AI Studio testing, multimodal test case design, and grounding attack validation.
GPT-4 Testing Methodology
Systematic methodology for red teaming GPT-4, including API-based probing techniques, rate limit considerations, content policy mapping, and safety boundary discovery.
Promptfoo Configuration Guide
Detailed guide to configuring Promptfoo for LLM security testing including provider setup, test assertions, and CI/CD integration.
Automated Defense Regression Testing
Build automated regression tests for LLM security defenses using attack replay and continuous monitoring.
Automated Defense Testing Pipeline
Build an automated pipeline that continuously tests defensive measures against evolving attack techniques.
Compliance-Driven Testing Methodology
Map regulatory requirements to specific test cases for compliance-driven AI red team assessments.
Testing for EU AI Act Compliance
Walkthrough for conducting red team assessments that evaluate compliance with the EU AI Act requirements, covering risk classification, mandatory testing obligations, and documentation requirements.
Multi-Model Testing Methodology
Structured methodology for testing applications that use multiple LLM models in their processing pipeline.
AI Compliance Testing Methodology
Methodology for testing AI systems against regulatory compliance requirements including EU AI Act and NIST.
Testing AI21 Labs Models
Red team testing guide for AI21 Labs Jamba models including long context and efficiency features.
Testing Cohere Models
Red team testing guide for Cohere's Command-R models including RAG and tool use features.
Testing Fireworks AI Platform
Red team testing guide for Fireworks AI including function calling and compound AI systems.
Testing Groq Inference Platform
Red team testing guide for Groq's high-speed inference platform and its security characteristics.
Testing Mistral AI Models
Complete red team testing guide for Mistral AI models including Mixtral MoE architecture and chat endpoints.
Testing Ollama Local Deployments
Security testing guide for locally deployed models via Ollama including network exposure and API security.
Testing Replicate-Hosted Models
Red team testing guide for models hosted on Replicate including open-source model deployments.
Testing Together AI Platform
Red team testing guide for Together AI including fine-tuned model endpoints and custom deployments.
Counterfit ML Security Testing
Use Microsoft's Counterfit for adversarial ML testing of deployed model endpoints.
JailbreakBench Usage and Submission
Use JailbreakBench to evaluate jailbreak techniques and submit results to the benchmark.
章節評估概覽
如何使用 AI 紅隊章節評估、評分方法論與建議完成順序。
Capstone: Design and Implement an AI Safety Benchmark Suite
Build a comprehensive, reproducible benchmark suite for evaluating LLM safety across multiple risk dimensions including toxicity, bias, hallucination, and adversarial robustness.
Azure AI Content Safety Testing
Testing Azure AI Content Safety service for bypass vulnerabilities and configuration weaknesses.
安全 Gaps in AI-Generated Tests
Analyzing how AI-generated test suites systematically miss security-relevant test cases, creating dangerous coverage illusions.
進階 Test Generation Manipulation
Advanced techniques for manipulating AI-generated tests to create false assurance by generating tests that pass but don't verify security properties.
內容安全 API(Azure、OpenAI、Google)
Azure Content Safety、OpenAI Moderation API 與 Google Cloud 安全服務之詳細比較,含 API 結構、類別分類、嚴重性等級、測試方法論與常見缺口。
防禦 Evaluation Methodology
Systematic methodology for evaluating the effectiveness of AI defenses against known attack categories.
紅隊自動化
大規模自動化 AI 紅隊的框架與工具,涵蓋 CART 管線、越獄模糊測試、回歸測試與持續監控。
Coverage Tracking Systems
Implementing test coverage tracking for AI security assessments to ensure comprehensive evaluation across attack vectors and model behaviors.
防禦 Evaluation 工具kit
Building a toolkit for systematically evaluating the effectiveness of LLM defenses.
Fuzzing LLM Applications
Applying fuzzing methodologies to LLM applications including grammar-based fuzzing, mutation-based fuzzing, and coverage-guided approaches.
Multi-模型 Test Orchestrator
Orchestrating parallel security testing across multiple models and providers to identify cross-model vulnerabilities and transferable attacks.
Multi-Target Testing Framework
Build a framework for testing the same attack suite across multiple model providers simultaneously.
promptfoo 用於紅隊
promptfoo 用於 AI 紅隊的深入解析:YAML 組態、以斷言為基礎的測試、紅隊外掛、自製評估器,以及 LLM 安全的回歸測試流程。
Regression Testing for AI 安全
Implementing automated regression testing for AI security properties that integrates into CI/CD pipelines and catches safety regressions.
Alignment Stability Under Fine-Tuning
Testing how safety alignment degrades under various fine-tuning configurations and datasets.
AI Audit Methodology
Comprehensive methodology for auditing AI systems including planning, evidence collection, testing procedures, report templates, and integration with red team assessments.
EU AI Act Compliance Testing
EU AI Act risk categories, testing requirements for high-risk AI systems, conformity assessment procedures, and how red teaming supports EU AI Act compliance.
實驗室: API Authentication 安全 Testing
Test LLM API authentication mechanisms for vulnerabilities including token leakage, replay attacks, and scope bypass.
實驗室: API-Based 模型 Testing
Learn to test language models through their APIs including OpenAI, Anthropic, and local Ollama endpoints. Build reusable API testing functions with proper error handling.
Basic Automated Testing Setup
Set up automated prompt testing with Python scripts to scale your security evaluation beyond manual testing.
Basic 防禦 Mechanism Testing
Identify and categorize the defensive mechanisms present in a target LLM application through structured probing.
Instruction Hierarchy Testing
Test how models prioritize conflicting instructions between system, user, and assistant roles.
實驗室: Multi-Language Prompt Testing
Test LLM safety boundaries across multiple languages to identify inconsistencies in safety training coverage.
LLM Playground 安全 Testing
Use web-based LLM playgrounds to practice basic injection techniques without API setup requirements.
實驗室: Prompt Template 漏洞 Testing
Test common prompt template patterns for injection vulnerabilities including string interpolation and placeholder abuse.
實驗室: Response Consistency Testing
Test LLM response consistency across repeated queries to identify non-deterministic safety boundary behavior.
實驗室: Running Safety Benchmarks
Run standardized safety benchmarks against LLM models to establish baseline safety profiles for comparison.
實驗室: Build an AI Fuzzer
Build a comprehensive fuzzing framework for large language models that systematically generates malformed, unexpected, and adversarial inputs to discover crashes, unexpected behaviors, and safety bypasses.
Deceptive Alignment Testing Framework
Build a testing framework for detecting mesa-optimization and deceptive alignment in fine-tuned models.
實驗室: 代理 Workflow 安全 Testing
Test multi-step agent workflows for injection points, state manipulation, and unauthorized action execution.
實驗室: Canary Token Effectiveness Testing
Deploy and test canary tokens in system prompts to evaluate their effectiveness at detecting prompt extraction.
實驗室: Cloud AI Platform 安全 Testing
Test cloud AI platform configurations including AWS Bedrock guardrails, Azure content filters, and GCP safety settings.
實驗室: AI Compliance Testing 基礎
Test LLM applications against OWASP LLM Top 10 and NIST AI RMF requirements using structured test procedures.
實驗室: Session Isolation Testing
Test session isolation in multi-tenant LLM deployments to detect cross-session data leakage and context contamination.
Claude Testing Methodology
Systematic methodology for red teaming Claude models, including API probing, model card analysis, safety boundary mapping, and comparative testing across Opus, Sonnet, and Haiku tiers.
Gemini Testing Methodology
Systematic methodology for red teaming Gemini, including Vertex AI API probing, Google AI Studio testing, multimodal test case design, and grounding attack validation.
GPT-4 測試方法論
為紅隊 GPT-4 之系統化方法論,含 API 基探測技術、速率限制考量、內容政策對應與安全邊界發現。
Promptfoo Configuration 指南
Detailed guide to configuring Promptfoo for LLM security testing including provider setup, test assertions, and CI/CD integration.
Automated 防禦 Regression Testing
Build automated regression tests for LLM security defenses using attack replay and continuous monitoring.
Automated 防禦 Testing Pipeline
Build an automated pipeline that continuously tests defensive measures against evolving attack techniques.
Compliance-Driven Testing Methodology
Map regulatory requirements to specific test cases for compliance-driven AI red team assessments.
Testing for EU AI Act Compliance
導覽 for conducting red team assessments that evaluate compliance with the EU AI Act requirements, covering risk classification, mandatory testing obligations, and documentation requirements.
Multi-模型 Testing Methodology
Structured methodology for testing applications that use multiple LLM models in their processing pipeline.
AI Compliance Testing Methodology
Methodology for testing AI systems against regulatory compliance requirements including EU AI Act and NIST.
Testing AI21 實驗室s 模型s
Red team testing guide for AI21 實驗室s Jamba models including long context and efficiency features.
Testing Cohere 模型s
Red team testing guide for Cohere's Command-R models including RAG and tool use features.
Testing Fireworks AI Platform
Red team testing guide for Fireworks AI including function calling and compound AI systems.
Testing Groq Inference Platform
Red team testing guide for Groq's high-speed inference platform and its security characteristics.
Testing Mistral AI 模型s
Complete red team testing guide for Mistral AI models including Mixtral MoE architecture and chat endpoints.
Testing Ollama Local Deployments
安全 testing guide for locally deployed models via Ollama including network exposure and API security.
Testing Replicate-Hosted 模型s
Red team testing guide for models hosted on Replicate including open-source model deployments.
Testing Together AI Platform
Red team testing guide for Together AI including fine-tuned model endpoints and custom deployments.
Counterfit ML 安全 Testing
Use Microsoft's Counterfit for adversarial ML testing of deployed model endpoints.
越獄Bench Usage and Submission
Use 越獄Bench to evaluate jailbreak techniques and submit results to the benchmark.