# testing

code-gen-security, testing, coverage, security-testing

Security Gaps in AI-Generated Tests

Analyzing how AI-generated test suites systematically miss security-relevant test cases, creating dangerous coverage illusions.

code-gen, testing, generation, manipulation

Advanced Test Generation Manipulation

Advanced techniques for manipulating AI-generated tests to create false assurance by generating tests that pass but don't verify security properties.

content-safety, azure, openai, google, moderation-api, taxonomy, testing

Content Safety APIs (Azure, OpenAI, Google)

Detailed comparison of Azure Content Safety, the OpenAI Moderation API, and Google Cloud's safety offerings, including API structures, category taxonomies, severity levels, testing methodology, and common gaps.

defense, evaluation, methodology, testing

Defense Evaluation Methodology

Systematic methodology for evaluating the effectiveness of AI defenses against known attack categories.

automation, cart, fuzzing, testing, exploit-dev

Red Team Automation

Frameworks and tools for automating AI red teaming at scale, including CART pipelines, jailbreak fuzzing, regression testing, and continuous monitoring.

exploit-dev, coverage, tracking, testing

Coverage Tracking Systems

Implementing test coverage tracking for AI security assessments to ensure complete evaluation across attack vectors and model behaviors.

exploit-dev, defense-evaluation, toolkit, testing

Defense Evaluation Toolkit

Building a toolkit to systematically evaluate how effective LLM defenses are.

exploit-dev, fuzzing, testing, discovery

Fuzzing LLM Applications

Applying fuzzing methodologies to LLM applications, including grammar-based fuzzing, mutation-based fuzzing, and coverage-guided approaches.

exploit-dev, multi-model, orchestration, testing

Multi-Model Test Orchestrator

Orchestrating parallel security tests across multiple models and providers to identify cross-model vulnerabilities and transferable attacks.

framework, target, dev, exploit, testing, multi

Multi-Target Testing Framework

Build a framework for running the same attack suite against multiple model providers simultaneously.

promptfoo, testing, evaluation, tools

promptfoo for Red Teaming

In-depth exploration of promptfoo for AI red teaming: YAML configuration, assertion-based tests, red team plugins, custom evaluators, and regression testing workflows for LLM security.

exploit-dev, regression, testing, CI/CD

Regression Testing for AI Security

Implementing automated regression tests for AI safety properties that integrate into CI/CD pipelines and catch safety regressions.

stability, finetuning, alignment, testing

Alignment Stability Under Fine-Tuning

Testing how safety alignment degrades under various fine-tuning configurations and datasets.

audit, methodology, evidence, testing, report-templates

AI Audit Methodology

Comprehensive methodology for auditing AI systems, including planning, evidence collection, testing procedures, report templates, and integration with red team assessments.

eu-ai-act, compliance, regulation, testing

EU AI Act Compliance Testing

EU AI Act risk categories, testing requirements for high-risk AI systems, conformity assessment procedures, and how red teaming supports EU AI Act compliance.

api-security, authentication, rate-limiting, testing, infrastructure

Security Testing of LLM APIs

Security testing methodology for LLM APIs, covering authentication, rate limiting, input validation, output filtering, and LLM-specific API vulnerabilities.

labs, api-authentication, testing, beginner

Lab: API Authentication Security Testing

Test LLM API authentication mechanisms for vulnerabilities including token leakage, replay attacks, and scope bypass.

Lab: Model Testing via the API

Learn to test language models through their APIs including OpenAI, Anthropic, and local Ollama endpoints. Build reusable API testing functions with proper error handling.

lab, api, testing, openai

basic, lab, beginner, labs, automated, testing

Basic Automated Testing Setup

Set up automated prompt testing with Python scripts to scale your security evaluation beyond manual testing.

testing, defense, basic, lab, beginner, labs

Basic Defense Mechanism Testing

Identify and categorize the defensive mechanisms present in a target LLM application through structured probing.

labs, instruction-hierarchy, testing, beginner

Instruction Hierarchy Testing

Test how models prioritize conflicting instructions between system, user, and assistant roles.

labs, multi-language, testing, beginner

Lab: Multi-Language Prompt Testing

Test LLM safety boundaries across multiple languages to identify inconsistencies in safety training coverage.

lab, beginner, labs, playground, testing

Security Testing of LLM Playgrounds

Use web-based LLM playgrounds to practice basic injection techniques without API setup requirements.

labs, prompt-templates, testing, beginner

Lab: Prompt Template Vulnerability Testing

Test common prompt template patterns for injection vulnerabilities including string interpolation and placeholder abuse.

labs, consistency, testing, beginner

Lab: Response Consistency Testing

Test LLM response consistency across repeated queries to identify non-deterministic safety boundary behavior.

labs, safety-benchmarks, testing, beginner

Lab: Running Safety Benchmarks

Run standardized safety benchmarks against LLM models to establish baseline safety profiles for comparison.

lab, expert, fuzzer, testing, adversarial, hands-on

Lab: Build an AI Fuzzer

Build a comprehensive fuzzing framework for large language models that systematically generates malformed, unexpected, and adversarial inputs to discover crashes, unexpected behaviors, and safety bypasses.

Expert

Deceptive Alignment Testing Framework

Build a testing framework for detecting mesa-optimization and deceptive alignment in fine-tuned models.

labs, deceptive-alignment, testing, expert

Expert

Lab: Security Testing of Agent Workflows

Test multi-step agent workflows for injection points, state manipulation, and unauthorized action execution.

labs, agent-workflow, testing, intermediate

labs, canary-tokens, testing, intermediate

Lab: Canary Token Effectiveness Testing

Deploy and test canary tokens in system prompts to evaluate their effectiveness at detecting prompt extraction.

labs, cloud-platform, testing, intermediate

Lab: Security Testing of Cloud AI Platforms

Test cloud AI platform configurations including AWS Bedrock guardrails, Azure content filters, and GCP safety settings.

labs, compliance, testing, intermediate

Lab: AI Compliance Testing Fundamentals

Test LLM applications against OWASP LLM Top 10 and NIST AI RMF requirements using structured test procedures.

labs, session-isolation, testing, intermediate

Lab: Session Isolation Testing

Test session isolation in multi-tenant LLM deployments to detect cross-session data leakage and context contamination.

claude, testing, methodology, api-probing, safety-boundaries, model-tiers

Claude Testing Methodology

Systematic methodology for red teaming Claude models, including API probing, model card analysis, safety boundary mapping, and comparative testing across Opus, Sonnet, and Haiku tiers.

gemini, testing, methodology, vertex-ai, ai-studio, multimodal-testing

Gemini Testing Methodology

Systematic methodology for red teaming Gemini, including Vertex AI API probing, Google AI Studio testing, multimodal test case design, and grounding attack validation.

gpt-4, testing, methodology, api-probing, safety-boundaries, red-teaming

GPT-4 Testing Methodology

Systematic methodology for red teaming GPT-4, including API-based probing techniques, rate limit considerations, content policy mapping, and safety boundary discovery.

reference, promptfoo, configuration, testing

promptfoo Configuration Guide

Detailed guide to configuring Promptfoo for LLM security testing, including provider setup, test assertions, and CI/CD integration.

automated, testing, walkthroughs, defense

Automated Defense Regression Testing

Build automated regression tests for LLM security defenses using attack replay and continuous monitoring.

walkthroughs, defense, testing, automation

Automated Defense Testing Pipeline

Build an automated pipeline that continuously tests defensive measures against evolving attack techniques.

walkthroughs, methodology, compliance, testing

Compliance-Driven Testing Methodology

Map regulatory requirements to specific test cases for compliance-driven AI red team assessments.

eu-ai-act, compliance, regulation, testing, methodology, walkthrough

Testing for EU AI Act Compliance

Walkthrough for conducting red team assessments that evaluate compliance with the EU AI Act requirements, covering risk classification, mandatory testing obligations, and documentation requirements.

walkthroughs, methodology, multi-model, testing

Multi-Model Testing Methodology

Structured methodology for testing applications that use multiple LLM models in their processing pipeline.

walkthroughs, methodology, compliance, testing

AI Compliance Testing Methodology

Methodology for testing AI systems against regulatory compliance requirements including EU AI Act and NIST.

ai21, platforms, testing, models, walkthroughs

Testing AI21 Labs Models

Red team testing guide for AI21 Labs Jamba models including long context and efficiency features.

cohere, platforms, testing, models, walkthroughs

Testing Cohere Models

Red team testing guide for Cohere's Command-R models including RAG and tool use features.

fireworks, platforms, testing, walkthroughs

Testing the Fireworks AI Platform

Red team testing guide for Fireworks AI including function calling and compound AI systems.

inference, groq, platforms, testing, walkthroughs

Testing the Groq Inference Platform

Red team testing guide for Groq's high-speed inference platform and its security characteristics.

mistral, platforms, testing, models, walkthroughs

Testing Mistral AI Models

Complete red team testing guide for Mistral AI models including Mixtral MoE architecture and chat endpoints.

local, ollama, platforms, testing, walkthroughs

Testing Local Ollama Deployments

Security testing guide for locally deployed models via Ollama including network exposure and API security.

replicate, platforms, testing, walkthroughs, hosted

Testing Replicate-Hosted Models

Red team testing guide for models hosted on Replicate including open-source model deployments.

together, platforms, testing, walkthroughs

Testing the Together AI Platform

Red team testing guide for Together AI including fine-tuned model endpoints and custom deployments.

walkthroughs, counterfit, ml-security, testing

ML Security Testing with Counterfit

Use Microsoft's Counterfit for adversarial ML testing of deployed model endpoints.

walkthroughs, jailbreakbench, benchmark, testing

JailbreakBench Usage and Submissions

Use JailbreakBench to evaluate jailbreak techniques and submit results to the benchmark.