# walkthrough

context-overflow, context-window, agent-security, attention-manipulation, walkthrough

Agent context overflow

Walkthrough of overflowing agent context windows to push safety instructions out of the LLM's attention, enabling bypasses of system prompts and guardrails.

agent-loop, hijacking, agent-security, reasoning-chain, agentic-ai, walkthrough

Agent loop-hijacking

Advanced walkthrough of hijacking agentic loops to redirect autonomous agent behavior, alter reasoning chains, and achieve persistent control over multi-step agent workflows.

agent-persistence, backdoor, memory-attacks, agent-security, long-term-compromise, walkthrough

Agent persistence via memory

Advanced walkthrough of using agent memory systems to create persistent backdoors that survive restarts, updates, and session boundaries.

mcp, callback-abuse, model-context-protocol, agent-security, exfiltration, walkthrough

Callback abuse in MCP

Advanced walkthrough of abusing MCP callback mechanisms for unauthorized actions, data exfiltration, and privilege escalation in agent-tool interactions.

attacks, competitive, walkthrough, walkthroughs, jailbreak

Competition-style jailbreak techniques

Walkthrough of jailbreak techniques used in AI security competitions and CTF events.

attacks, injection, walkthrough, use, computer, walkthroughs

Walkthrough: injection against computer-use agents

Walkthrough of injecting prompts through UI elements and screenshots processed by computer-use agents.

attacks, harvesting, data, walkthrough, walkthroughs

Data harvesting via LLM apps

Complete walkthrough of systematic data extraction from LLM applications using various exfiltration channels.

attacks, encoding, chain, bypass, walkthrough, walkthroughs

Walkthrough: encoding chain bypass

Walkthrough of chaining Base64, URL encoding, and Unicode tricks to bypass multi-layer input filters.

function-calling, parameter-injection, api-security, agent-security, walkthrough

Parameter injection in function calling

Walkthrough of manipulating function call parameters through prompt-level techniques, injecting malicious values into LLM-generated API calls.

mcp, tool-shadowing, model-context-protocol, agent-security, tool-poisoning, walkthrough

Tool shadowing in MCP

Advanced walkthrough of creating shadow tools that override legitimate MCP (Model Context Protocol) tools, enabling interception and manipulation of agent-tool interactions.

attacks, memory, walkthrough, persistence, walkthroughs

Walkthrough: memory persistence attack

Walkthrough of achieving persistent memory manipulation in agent systems for cross-session influence.

memory-poisoning, agent-memory, persistence, injection, agent-security, walkthrough

Memory poisoning step by step

Walkthrough of persisting injection payloads in agent memory systems to achieve long-term compromise of LLM-based agents.

multi-agent, prompt-relay, injection-chain, agent-pipeline, agent-security, walkthrough

Multi-agent prompt-relay

Advanced walkthrough of relaying prompt injection payloads across multiple agents in a pipeline, achieving cascading compromise of multi-agent systems.

orchestrator, multi-agent, task-delegation, agent-security, coordination-attack, walkthrough

Orchestrator manipulation

Advanced walkthrough of attacking the orchestrator layer in multi-agent systems to gain control over task delegation, agent coordination, and system-wide behavior.

plugin-confusion, tool-selection, agent-security, misdirection, walkthrough

Plugin confusion attack

Walkthrough of confusing LLM agents about which plugin or tool to invoke, causing them to call the wrong tool or pass data to unintended destinations.

attacks, agents, privilege, escalation, walkthrough, walkthroughs

Walkthrough: agent privilege escalation

Walkthrough of escalating privileges in multi-agent systems through trust chain exploitation.

attacks, camouflage, semantic, walkthrough, walkthroughs

Semantic camouflage walkthrough

Walkthrough of crafting semantically camouflaged injections that evade both classifiers and human review.

attacks, supply, chain, poisoning, walkthrough, walkthroughs

Model supply chain poisoning

Walkthrough of poisoning ML supply chains through dependency confusion, model weight manipulation, and hub attacks.

tool-calling, function-calling, injection, agent-security, walkthrough

Injection via tool calls

Step-by-step walkthrough of injecting malicious parameters into LLM tool and function calls to execute unauthorized actions in agent systems.

attacks, vision, model, walkthrough, walkthroughs

Vision model attack walkthrough (attack walkthrough)

Step-by-step walkthrough of visual prompt injection, adversarial images, and OCR exploitation in vision-language models.

attacks, xml, injection, walkthrough, json, walkthroughs

XML and JSON injection in LLM apps

Walkthrough of exploiting XML and JSON parsing in LLM applications for injection and data manipulation.

input-sanitization, prompt-injection, defense, llm-security, input-validation, walkthrough

Building a production-ready input sanitizer

Step-by-step walkthrough for building a production-grade input sanitizer that cleans, normalizes, and validates user prompts before they reach an LLM, covering encoding normalization, injection pattern stripping, length enforcement, and integration testing.

canary-tokens, prompt-injection, detection, monitoring, defense, walkthrough

Deploying canary tokens

Step-by-step walkthrough for deploying canary tokens in LLM system prompts and context to detect prompt injection and data exfiltration attempts, covering token generation, placement strategies, monitoring, and alerting.

access-control, capabilities, permissions, authorization, defense, walkthrough

Capability-based access control

Step-by-step walkthrough for implementing fine-grained capability controls for LLM features, covering capability token design, permission scoping, dynamic capability grants, and audit trails.

constitutional-ai, classifier, principles, safety, defense, walkthrough

Setting up a constitutional classifier

Step-by-step walkthrough for implementing constitutional AI-style classifiers that evaluate LLM outputs against a set of principles, covering principle definition, classifier training, chain-of-thought evaluation, and deployment.

content-filtering, defense, classifiers, moderation, llm-judge, implementation, walkthrough

Setting up content filtering

Step-by-step walkthrough for implementing multi-layer content filtering for AI applications: keyword filtering, classifier-based detection, LLM-as-judge evaluation, testing effectiveness, and tuning for production.

nemo-guardrails, nvidia, guardrails, colang, defense, walkthrough

Deploying NeMo Guardrails

Step-by-step walkthrough for setting up NVIDIA NeMo Guardrails in production, covering installation, Colang configuration, custom actions, topical and safety rails, testing, and monitoring.

dual-llm, architecture, validation, defense-in-depth, defense, walkthrough

Setting up a dual LLM architecture

Step-by-step walkthrough for implementing a dual LLM pattern where one model generates responses and a second model validates them, covering architecture design, validator prompt engineering, latency optimization, and failure handling.

guardrails, nemo, input-validation, output-filtering, pii-detection, content-policy, walkthrough

Setting up AI guardrails

Step-by-step walkthrough for implementing AI guardrails: input validation with NVIDIA NeMo Guardrails, prompt injection detection with Rebuff, output filtering for PII and sensitive data, and content policy enforcement.

hallucination, detection, factual-grounding, output-filtering, defense, walkthrough

Hallucination detection

Step-by-step walkthrough for detecting and flagging hallucinated content in LLM outputs, covering factual grounding checks, self-consistency verification, source attribution validation, and confidence scoring.

guardrails, input-validation, prompt-injection-defense, content-safety, defense, walkthrough

Building input guardrails for LLM applications

Step-by-step walkthrough for implementing production-grade input guardrails that protect LLM applications from prompt injection, content policy violations, and resource abuse through multi-layer validation, classification, and rate limiting.

incident-response, playbook, breach-response, containment, investigation, defense, walkthrough

Incident response playbook for AI security breaches

Walkthrough for building an incident response playbook tailored to AI security breaches, covering detection triggers, triage procedures, containment strategies, investigation workflows, remediation validation, and post-incident review processes.

incident-response, playbooks, tabletop, containment, evidence, preparation, walkthrough

Preparing for AI incident response

Step-by-step walkthrough for building AI incident response capabilities: playbook development, tabletop exercises, containment procedures, communication templates, and evidence collection workflows.

defense, guardrails, monitoring, incident-response, implementation, walkthrough

Defense implementation walkthroughs

Step-by-step guides for implementing AI security defenses: guardrail configuration, monitoring and detection setup, and incident response preparation for AI systems.

instruction-hierarchy, prompt-injection, privilege-separation, defense, advanced, walkthrough

Enforcing the instruction hierarchy (defense walkthrough)

Step-by-step walkthrough for enforcing instruction priority in LLM applications, ensuring system-level instructions always take precedence over user inputs through privilege separation, instruction tagging, and validation layers.

llm-judge, output-validation, safety, evaluation, defense, walkthrough

Implementing an LLM judge

Step-by-step walkthrough for using an LLM to judge another LLM's outputs for safety and quality, covering judge prompt design, scoring rubrics, calibration, cost optimization, and deployment patterns.

output-validation, sanitization, response-safety, schema-validation, defense, walkthrough

Validating and sanitizing model output

Walkthrough for building output validation systems that verify LLM responses meet structural, factual, and safety requirements before delivery, covering schema validation, factual grounding checks, response consistency verification, and safe rendering.

monitoring, observability, security-events, alerting, production, defense, walkthrough

Production monitoring of LLM security events

Walkthrough for building production monitoring systems that detect LLM security events in real time, covering log collection, anomaly detection, alert configuration, dashboard design, and incident correlation.

monitoring, logging, anomaly-detection, siem, alerting, observability, walkthrough

Setting up AI monitoring

Step-by-step walkthrough for implementing AI system monitoring: inference logging, behavioral anomaly detection, alert configuration, dashboard creation, and integration with existing SIEM platforms.

defense-in-depth, input-validation, pipeline, multi-layer, defense, walkthrough

Multi-layer input validation

Step-by-step walkthrough for building a defense-in-depth input validation pipeline that combines regex matching, semantic similarity, ML classification, and rate limiting into a unified validation system for LLM applications.

output-filtering, classifier, content-moderation, safety, defense, walkthrough

Output content classifier

Step-by-step walkthrough for building a classifier to filter harmful LLM outputs, covering taxonomy definition, multi-label classification, threshold calibration, and deployment as a real-time output gate.

output-filtering, content-safety, pii-detection, response-validation, defense, walkthrough

Implementing output filtering and content safety

Walkthrough for building output filtering systems that inspect and sanitize LLM responses before they reach users, covering content classifiers, PII detection, response validation, canary tokens, and filter bypass resistance.

pii, redaction, privacy, output-filtering, compliance, defense, walkthrough

PII redaction pipeline

Step-by-step walkthrough for building an automated PII detection and redaction pipeline for LLM outputs, covering regex-based detection, NER-based detection, Presidio integration, redaction strategies, and compliance testing.

classifier, machine-learning, prompt-injection, detection, training, defense, walkthrough

Training a prompt classifier

Step-by-step walkthrough for training a machine learning classifier to detect malicious prompts, covering dataset curation, feature engineering, model selection, training pipeline, evaluation, and deployment as a real-time detection service.

prompt-injection, machine-learning, detection, classifier, defense, walkthrough

ML-based prompt injection detection systems

Walkthrough for building and deploying ML-based prompt injection detection systems, covering training data collection, feature engineering, model architecture selection, threshold tuning, production deployment, and continuous improvement.

rag, access-control, retrieval, data-leakage, authorization, defense, walkthrough

Implementing access control in RAG pipelines

Walkthrough for building access control systems in RAG pipelines that enforce document-level permissions, prevent cross-user data leakage, filter retrieved context based on user authorization, and resist retrieval poisoning attacks.

rate-limiting, abuse-prevention, api-security, token-bucket, cost-control, defense, walkthrough

Rate limiting and abuse prevention for LLM APIs

Walkthrough for implementing rate limiting and abuse prevention systems for LLM API endpoints, covering token bucket algorithms, per-user quotas, cost-based limiting, anomaly detection, and graduated enforcement.

rate-limiting, defense, abuse-prevention, tokens, throttling, implementation, walkthrough

Walkthrough: AI rate limiting

Step-by-step walkthrough for implementing token-aware rate limiting for AI applications: request-level limiting, token budget enforcement, sliding window algorithms, abuse detection, and production deployment.

regex, prompt-injection, pattern-matching, input-filtering, defense, walkthrough

Regex-based prompt filter

Step-by-step walkthrough for building a regex-based prompt filter that detects common injection payloads using pattern matching, covering pattern library construction, performance optimization, false positive management, and continuous updates.

response-boundaries, output-filtering, content-policy, guardrails, defense, walkthrough

Enforcing response boundaries

Step-by-step walkthrough for keeping LLM responses within defined topic, format, and content boundaries, covering boundary definition, violation detection, response rewriting, and boundary drift monitoring.

sandbox, tool-execution, isolation, containers, security, defense, walkthrough

Sandboxed tool execution

Step-by-step walkthrough for running LLM tool calls in isolated sandboxes, covering container-based isolation, resource limits, network restrictions, and output sanitization.

sandboxing, tool-use, permissions, agent-security, least-privilege, defense, walkthrough

Sandboxing and permission models for tool-using agents

Walkthrough for implementing sandboxing and permission models that constrain tool-using LLM agents, covering least-privilege design, parameter validation, execution sandboxes, approval workflows, and audit logging.

embeddings, semantic-similarity, prompt-injection, vector-database, defense, walkthrough

Semantic similarity detection

Step-by-step walkthrough for using text embeddings to detect semantically similar prompt injection attempts, covering embedding model selection, vector database setup, similarity threshold tuning, and production deployment.

session-isolation, multi-tenancy, security, defense, walkthrough

Session isolation patterns

Step-by-step walkthrough for isolating user sessions in LLM applications to prevent cross-contamination of context, memory, and permissions between users.

structured-output, validation, json-schema, output-filtering, defense, walkthrough

Structured output validation

Step-by-step walkthrough for validating structured LLM outputs against schemas, covering JSON schema validation, type coercion, constraint enforcement, and handling malformed model outputs gracefully.

toxicity, scoring, output-filtering, content-moderation, safety, defense, walkthrough

Toxicity scoring pipeline

Step-by-step walkthrough for building a toxicity scoring pipeline for LLM output filtering, covering model selection, multi-dimensional scoring, threshold calibration, and production deployment with real-time scoring.

unicode, normalization, encoding, homoglyph, defense, walkthrough

Defense via Unicode normalization

Step-by-step walkthrough for implementing Unicode normalization to prevent encoding-based prompt injection bypasses, covering homoglyph detection, invisible character stripping, bidirectional text handling, and normalization testing.

agent, engagement, function-calling, tool-abuse, permission-escalation, walkthrough

Red team engagement for an agent system

Complete walkthrough for testing tool-using AI agents: scoping agent capabilities, exploiting function calling, testing permission boundaries, multi-step attack chains, and session manipulation.

api, engagement, authentication, rate-limiting, input-validation, model-fingerprinting, walkthrough

AI API red team engagement

Complete walkthrough for testing AI APIs: endpoint enumeration, authentication bypass, rate limit evasion, input validation testing, output data leakage, and model fingerprinting through API behavior.

chatbot, engagement, prompt-injection, system-prompt, content-filter, walkthrough

Red team engagement for a chatbot

Step-by-step walkthrough for a complete chatbot red team assessment: scoping, system prompt extraction, content filter bypass, PII leakage testing, multi-turn manipulation, and professional reporting.

engagements, walkthrough, methodology, scoping, reporting

Engagement walkthroughs overview

Step-by-step walkthroughs for complete AI red team engagements: from scoping and reconnaissance through attack execution to reporting, organized by target system type.

multi-model, engagement, model-routing, orchestration, fallback, walkthrough

Red team engagement for multi-model systems

Complete walkthrough for testing systems that use multiple AI models: model-to-model injection, routing logic exploitation, fallback chain abuse, inter-model data leakage, and orchestration layer attacks.

rag, engagement, document-injection, embedding, retrieval, data-exfiltration, walkthrough

Red team engagement for a RAG system

Complete walkthrough for testing RAG applications: document injection, cross-scope retrieval exploitation, embedding manipulation, data exfiltration through retrieval, and chunk boundary attacks.

metrics, effectiveness, reporting, kpis, program-management, methodology, walkthrough

Measuring and reporting AI red team effectiveness

Walkthrough for defining, collecting, and reporting metrics that measure the effectiveness of AI red teaming programs, covering coverage metrics, detection rates, time-to-find analysis, remediation tracking, and ROI calculation.

threat-modeling, ai-security, risk-assessment, methodology, walkthrough

Building AI-specific threat models

Step-by-step walkthrough for creating threat models tailored to AI and LLM systems, covering asset identification, threat enumeration, attack tree construction, and risk prioritization.

attack-execution, techniques, attack-chains, evidence, findings, workflow, walkthrough

Attack execution workflow

Step-by-step workflow for executing AI red team attacks: selecting techniques from recon findings, building attack chains, documenting findings in real-time, managing evidence, and knowing when to escalate or stop.

attack-surface, mapping, reconnaissance, ai-security, methodology, walkthrough

Mapping the attack surface of AI systems

Systematic walkthrough for identifying and mapping every attack surface in an AI system, from user inputs through model inference to output delivery and tool integrations.

communication, reporting, stakeholders, executive-summary, findings-presentation, walkthrough

Communicating AI red team findings to stakeholders

Walkthrough for effectively communicating AI red team findings to diverse stakeholders, covering executive summaries, technical deep dives, live demonstrations, risk narratives, and remediation roadmaps tailored to audience expertise levels.

continuous-testing, ci-cd, automation, pipeline, regression-testing, methodology, walkthrough

Setting up continuous AI red teaming pipelines

Walkthrough for building continuous AI red teaming pipelines that automatically test LLM applications on every deployment, covering automated scan configuration, CI/CD integration, alert thresholds, regression testing, and dashboard reporting.

engagement, kickoff, scope, rules-of-engagement, legal, methodology, walkthrough

Walkthrough: engagement kickoff

Step-by-step guide to launching an AI red team engagement: initial client meetings, scope definition, rules of engagement, legal agreements, environment setup, and tool selection.

eu-ai-act, compliance, regulation, testing, methodology, walkthrough

Testing for EU AI Act compliance

Walkthrough for conducting red team assessments that evaluate compliance with the EU AI Act requirements, covering risk classification, mandatory testing obligations, and documentation requirements.

evidence-collection, documentation, methodology, reporting, chain-of-custody, walkthrough

Best practices for evidence collection and documentation

Walkthrough for systematic evidence collection during AI red team engagements, covering request/response capture, screenshot methodology, chain-of-custody documentation, reproducibility requirements, and evidence organization for reports.

evidence, documentation, methodology, forensics, reporting, walkthrough

Evidence collection methods for AI red teams

Comprehensive methods for collecting, preserving, and organizing red team evidence from AI system assessments, including API logs, screenshots, reproduction scripts, and chain-of-custody procedures.

executive-summary, reporting, communication, methodology, walkthrough

Writing executive summaries for AI red team reports

Guide to writing clear, impactful executive summaries for AI red team assessment reports that communicate risk to non-technical stakeholders and drive remediation decisions.

severity, classification, vulnerability, risk-assessment, methodology, walkthrough

Classifying the severity of AI vulnerabilities

Framework for consistently classifying the severity of AI and LLM vulnerabilities, with scoring criteria, impact assessment, and examples across common finding categories.

methodology, engagement, workflow, process, walkthrough

Methodology walkthroughs

Step-by-step walkthroughs for each phase of an AI red team engagement: kickoff, reconnaissance, attack execution, and report writing.

iso-42001, audit, compliance, ai-management, standards, walkthrough

Preparing for an ISO 42001 AI management system audit

Advanced walkthrough for preparing organizations for ISO 42001 AI management system audits, covering control assessment, evidence preparation, gap remediation, and audit readiness.

mitre-atlas, attack-mapping, framework, tactics, techniques, walkthrough

Using MITRE ATLAS to map AI attacks

Walkthrough for mapping AI red team activities and findings to the MITRE ATLAS framework, covering tactic and technique identification, attack chain construction, and navigator visualization.

owasp, llm-top-10, classification, standards, methodology, walkthrough

Mapping findings to the OWASP LLM Top 10

Walkthrough for mapping AI red team findings to the OWASP Top 10 for LLM Applications, with classification guidance, reporting templates, and remediation mapping.

comparative-testing, multi-model, benchmarking, security-evaluation, methodology, walkthrough

Comparative security testing across multiple LLMs

Walkthrough for conducting systematic comparative security testing across multiple LLM providers and configurations, covering test standardization, parallel execution, cross-model analysis, and differential vulnerability reporting.

nist, ai-rmf, risk-management, compliance, framework, walkthrough

NIST AI RMF assessment walkthrough

Step-by-step guide for conducting assessments aligned with the NIST AI Risk Management Framework, covering the Govern, Map, Measure, and Manage functions for AI system security.

pre-engagement, checklist, preparation, methodology, planning, walkthrough

Pre-engagement preparation checklist

Complete pre-engagement preparation checklist for AI red team operations covering team readiness, infrastructure setup, legal requirements, and initial reconnaissance planning.

reconnaissance, system-prompt, model-identification, capability-mapping, attack-surface, walkthrough

Reconnaissance workflow

Systematic reconnaissance workflow for AI red team engagements: system prompt extraction, model identification, capability mapping, API enumeration, and documenting the attack surface.

remediation, verification, retesting, regression, methodology, walkthrough

Verifying that remediations are effective

Walkthrough for planning and executing remediation verification testing (retesting) to confirm that AI vulnerability fixes are effective and do not introduce regressions.

report-writing, findings, remediation, executive-summary, risk-ratings, walkthrough

Report writing walkthrough

Step-by-step guide to writing AI red team reports: structure, executive summary, technical findings, risk ratings, remediation recommendations, peer review, and delivery.

risk-scoring, vulnerability-assessment, severity, CVSS, methodology, walkthrough

Risk scoring frameworks for AI vulnerabilities

Walkthrough for applying risk scoring frameworks to AI and LLM vulnerabilities, covering CVSS adaptation for AI, custom AI risk scoring matrices, severity classification, business impact assessment, and integration with existing vulnerability management processes.

rules-of-engagement, template, legal, authorization, methodology, walkthrough

Rules of engagement template for AI red team operations

Step-by-step guide to creating comprehensive rules of engagement documents for AI red team assessments, covering authorization, scope, constraints, communication, and legal protections.

scoping, engagement, planning, methodology, statement-of-work, risk-assessment, walkthrough

How to scope an AI red team engagement

Comprehensive walkthrough for scoping AI red team engagements from initial client contact through statement of work, covering target enumeration, risk-based prioritization, resource estimation, boundary definition, and legal considerations.

scoping, checklist, pre-engagement, methodology, planning, rules-of-engagement, walkthrough

Walkthrough: AI red team scoping checklist

Systematic walkthrough of the pre-engagement scoping process for AI red team assessments: stakeholder identification, target enumeration, scope boundary definition, resource estimation, and rules of engagement documentation.

technical-appendix, reporting, documentation, evidence, methodology, walkthrough

Creating detailed technical appendices

Guide to building comprehensive technical appendices for AI red team reports, including evidence formatting, reproduction procedures, tool output presentation, and raw data organization.

test-plan, planning, methodology, test-cases, automation, walkthrough

Developing comprehensive AI security test plans

Step-by-step guide to developing structured test plans for AI red team engagements, covering test case design, automation strategy, coverage mapping, and execution scheduling.

threat-modeling, llm-security, risk-assessment, methodology, STRIDE, walkthrough

Threat modeling for LLM-powered applications

Step-by-step walkthrough for conducting threat modeling sessions specifically tailored to LLM-powered applications, covering data flow analysis, trust boundary identification, AI-specific threat enumeration, risk assessment, and mitigation planning.

threat-modeling, stride, attack-trees, workshop, facilitation, methodology, walkthrough

Walkthrough: AI threat modeling workshop

Step-by-step guide to running an AI-focused threat modeling workshop: adapting STRIDE for AI systems, constructing attack trees for LLM applications, participant facilitation techniques, and producing actionable threat models.

anyscale, ray, ray-serve, cluster-security, dashboard, actor-isolation, walkthrough

ML testing of Anyscale Ray Serve

End-to-end walkthrough for security testing Ray Serve ML deployments on Anyscale: cluster enumeration, serve endpoint exploitation, Ray Dashboard exposure, actor isolation testing, and observability review.

autogen, multi-agent, code-execution, agent-injection, conversation-control, sandbox, walkthrough

Testing an AutoGen multi-agent system

End-to-end walkthrough for security testing AutoGen multi-agent systems: agent enumeration, inter-agent injection, code execution sandbox assessment, conversation manipulation, and escalation path analysis.

aws, sagemaker, cloud, model-extraction, iam, inference-pipeline, walkthrough

Red teaming AWS SageMaker

End-to-end walkthrough for red teaming ML models deployed on AWS SageMaker: endpoint enumeration, IAM policy analysis, model extraction testing, inference pipeline exploitation, and CloudTrail log review.

azure, azure-ml, cloud, managed-endpoints, compute-instance, data-store, walkthrough

Azure ML security testing

End-to-end walkthrough for security testing Azure Machine Learning endpoints: workspace enumeration, managed online endpoint exploitation, compute instance assessment, data store access review, and Azure Monitor analysis.

azure, openai, red-team, content-safety, managed-identity, prompt-flow, walkthrough

Walkthrough: Azure OpenAI red team

Complete red team walkthrough for Azure OpenAI deployments: testing content filters, managed identity exploitation, prompt flow injection, data integration attacks, and Azure Monitor evasion.

azure, openai, cloud, content-filtering, managed-identity, prompt-flow, walkthrough

Walkthrough: Azure OpenAI red team (platform walkthrough)

End-to-end walkthrough for red teaming Azure OpenAI deployments: deployment configuration review, content filtering bypass testing, managed identity exploitation, prompt flow assessment, and diagnostic log analysis.

aws, bedrock, red-team, guardrails, knowledge-base, agents, walkthrough

Walkthrough: AWS Bedrock red team

Complete guide to red teaming AWS Bedrock deployments: testing guardrails bypass techniques, knowledge base data exfiltration, agent prompt injection, model customization abuse, and CloudTrail evasion.

aws, bedrock, cloud, guardrails, knowledge-base, cloudtrail, walkthrough

Walkthrough: AWS Bedrock red team (platform walkthrough)

End-to-end walkthrough for red teaming AI systems on AWS Bedrock: setting up access, invoking models via the Converse API, testing Bedrock Guardrails, exploiting knowledge bases, and analyzing CloudTrail logs.

crewai, agents, crew, task-injection, delegation, tool-security, walkthrough

Security testing CrewAI agent applications

End-to-end walkthrough for security testing CrewAI agent applications: crew enumeration, agent role exploitation, task injection, tool security assessment, delegation chain manipulation, and output validation.

databricks, mlflow, model-registry, unity-catalog, serving-endpoints, walkthrough

Auditing a Databricks MLflow deployment

End-to-end walkthrough for auditing MLflow deployments on Databricks: workspace enumeration, model registry security, serving endpoint testing, Unity Catalog integration review, and audit log analysis.

dspy, pipelines, signatures, optimizers, retrieval, compiled-prompts, walkthrough

Security testing DSPy pipelines

End-to-end walkthrough for security testing DSPy optimized LLM pipelines: module enumeration, signature exploitation, optimizer manipulation, retrieval module assessment, and compiled prompt analysis.

gcp, vertex-ai, cloud, model-serving, iam, pipelines, walkthrough

Security testing GCP Vertex AI

End-to-end walkthrough for security testing Vertex AI deployments on Google Cloud: endpoint enumeration, IAM policy analysis, model serving exploitation, pipeline assessment, and Cloud Audit Logs review.

huggingfacesecurity-auditsupply-chainmodel-scanningprovenancewalkthrough

Walkthrough: Hugging Face security audit

Step-by-step walkthrough for auditing Hugging Face models: scanning for malicious model files, verifying model provenance, assessing model card completeness, and testing Spaces and Inference API security.

huggingfacespacesgradiostreamlitapi-testingmodel-hubwalkthrough

Security testing HuggingFace Spaces

End-to-end walkthrough for security testing HuggingFace Spaces applications: Space enumeration, Gradio/Streamlit exploitation, API endpoint testing, secret management review, and model access control assessment.

huggingfacemodel-hubtransformersmalicious-modelssupply-chainwalkthrough

Hugging Face Hub red team walkthrough

Walkthrough for assessing AI models on Hugging Face Hub: model security assessment, scanning for malicious models, testing with the Transformers library, and evaluating Spaces applications.

platformscloudawsazuregcphuggingfacewalkthrough

Cloud AI platform walkthroughs

Hands-on walkthroughs for red teaming AI systems deployed on major cloud platforms: AWS Bedrock, Azure OpenAI, Google Vertex AI, and Hugging Face Hub.

langchainchainsagentstoolsragmemoryprompt-injectionwalkthrough

Security testing LangChain applications

End-to-end walkthrough for security testing LangChain applications: chain enumeration, prompt injection through chains, tool and agent exploitation, retrieval augmented generation attacks, and memory manipulation.

llamaindexragquery-enginedata-connectorsresponse-synthesisagentswalkthrough

Security testing LlamaIndex RAG applications

End-to-end walkthrough for security testing LlamaIndex RAG applications: index enumeration, query engine exploitation, data connector assessment, response synthesis manipulation, and agent pipeline testing.

modalserverlesscontainer-securityweb-endpointssecretsvolumeswalkthrough

Testing serverless AI deployments with Modal

End-to-end walkthrough for security testing Modal serverless AI deployments: function enumeration, web endpoint exploitation, secret management assessment, volume security testing, and container escape analysis.

ollamalocal-modelssecurity-testingsafety-comparisonmodelfilewalkthrough

Ollama security testing walkthrough

Complete walkthrough for security testing locally-hosted models with Ollama: comparing safety across models, testing system prompt extraction, API security assessment, and Modelfile configuration hardening.

replicateapi-testingcogwebhooksmodel-servingwalkthrough

Security testing the Replicate API

End-to-end walkthrough for security testing models on Replicate: model enumeration, prediction API exploitation, webhook security, Cog container assessment, and billing abuse prevention.

runpodserverlessgpudockerwebhookshandler-securitywalkthrough

Testing RunPod serverless GPU endpoints

End-to-end walkthrough for security testing RunPod serverless GPU endpoints: endpoint enumeration, handler exploitation, webhook security, Docker template assessment, and cost abuse prevention.

semantic-kernelmicrosoftpluginsplannersmemoryazurewalkthrough

Security testing Microsoft Semantic Kernel

End-to-end walkthrough for security testing Semantic Kernel applications: kernel enumeration, plugin exploitation, planner manipulation, memory and RAG assessment, and Azure integration security review.

together-aiapi-testinginferencefine-tuningfunction-callingwalkthrough

Security testing Together AI

End-to-end walkthrough for security testing Together AI deployments: API enumeration, inference endpoint exploitation, fine-tuning security review, function calling assessment, and rate limit analysis.

gcpvertex-aicloudprediction-endpointsmodel-gardenfeature-storewalkthrough

Vertex AI red team walkthrough

End-to-end walkthrough for red teaming Google Cloud Vertex AI: prediction endpoint testing, Model Garden security assessment, Feature Store probing, and Cloud Logging analysis.

gcpvertex-aired-teammodel-gardenfeature-storeprediction-endpointswalkthrough

Vertex AI red team walkthrough (platform-walkthrough)

Complete red team walkthrough for Google Vertex AI: testing prediction endpoints, Model Garden assessments, Feature Store probing, and exploiting Vertex AI Agents and Extensions.

artkitadversarial-testingrobustnessautomated-testingllm-evaluationwalkthrough

Testing adversarial robustness with ARTKit

Walkthrough for using ARTKit (Adversarial Robustness Testing Kit) to evaluate LLM application resilience through automated adversarial testing, covering test flow configuration, challenger setup, evaluator design, and results analysis.

burp-suiteproxyinterceptingfuzzingapi-testingstreamingwalkthrough

Burp Suite for AI APIs

Using Burp Suite to intercept, analyze, and fuzz LLM API calls: proxy setup, intercepting streaming responses, parameter fuzzing with Intruder, and building custom extensions for AI-specific testing.

burp-suiteapi-testingllm-securityproxyweb-securitywalkthrough

Using Burp Suite to test LLM API endpoints

Walkthrough for using Burp Suite to intercept, analyze, and attack LLM API endpoints, covering proxy configuration, request manipulation, automated scanning for injection flaws, and custom extensions for AI-specific testing.

counterfitadversarial-mlmicrosoftrobustness-testingautomationwalkthrough

Walkthrough: Counterfit

Complete walkthrough of Microsoft's Counterfit adversarial ML testing framework: installation, target configuration, running attacks against ML models, interpreting results, and automating adversarial robustness assessments.

garakcustom-probesadvancedattack-vectorsprobe-developmentwalkthrough

Writing custom Garak probes for new attack vectors

Advanced walkthrough on building custom Garak probes that target new and emerging attack vectors, covering probe architecture, payload generation, detector pairing, and integration into automated scan pipelines.

garakci-cdautomationgithub-actionsgitlab-ciwalkthrough

Integrating Garak into CI/CD pipelines

Intermediate walkthrough on automating garak vulnerability scans within CI/CD pipelines, including GitHub Actions, GitLab CI, threshold-based gating, result caching, and cost management strategies.

garakcustom-probesvulnerability-scanningextensibilitywalkthrough

Writing custom Garak probes

Intermediate walkthrough on creating custom garak probes tailored to application-specific attack surfaces, including probe structure, prompt engineering, custom detectors, and testing workflows.

garakdetectorscustom-detectorsadvancedwalkthrough

Building custom garak detectors

Advanced walkthrough on creating custom garak detectors for specific success criteria, including regex-based detectors, ML-based classifiers, multi-signal scoring, and integration with external evaluation services.

garakvulnerability-scanningbeginnerfirst-scanwalkthrough

Running your first Garak scan

Step-by-step beginner walkthrough for running your very first garak vulnerability scan from zero, covering installation, target setup, probe selection, and basic result interpretation.

garakgeneratorspluginscustom-apiadvancedwalkthrough

Writing Garak generator plugins for custom API targets

Advanced walkthrough on writing garak generator plugins to connect to custom API endpoints, proprietary model servers, and non-standard inference interfaces for vulnerability scanning.

garakmcptool-useprobesadvancedwalkthrough

Setting up Garak probes for MCP tool interactions

Advanced walkthrough on configuring garak probes that target Model Context Protocol (MCP) tool interactions, testing for tool misuse, privilege escalation through tools, and data exfiltration via tool calls.

garakmodel-comparisonvulnerability-scanningbenchmarkingwalkthrough

Comparing model vulnerability profiles with garak

Intermediate walkthrough on using garak to run identical vulnerability scans across multiple models, comparing results to understand relative security postures and make informed model selection decisions.

garakreportinganalysisvulnerability-assessmentwalkthrough

Deep dive: analyzing garak scan reports

Intermediate walkthrough on analyzing garak scan reports, including JSONL parsing, false positive identification, vulnerability categorization, executive summary generation, and trend tracking.

garakvulnerability-scanningprobesautomationci-cdwalkthrough

End-to-end garak walkthrough

Complete walkthrough of NVIDIA's garak LLM vulnerability scanner: installation, configuration, running probes against local and hosted models, interpreting results, writing custom probes, and CI/CD integration.

harmbenchevaluationbenchmarkssafetyred-team-automationwalkthrough

Walkthrough: HarmBench evaluation framework

Complete walkthrough of the HarmBench evaluation framework: installation, running standardized benchmarks against models, interpreting results, creating custom behavior evaluations, and comparing model safety across versions.

inspect-aiaisievaluationbenchmarksscorerscompliancewalkthrough

Inspect AI walkthrough

Complete walkthrough of UK AISI's Inspect AI framework: installation, writing evaluations, running against models, custom scorers, benchmark suites, and producing compliance-ready reports.

langchainsecurity-testingagent-securitychain-injectionrag-securitywalkthrough

Security testing LangChain applications

Step-by-step walkthrough for identifying and exploiting security vulnerabilities in LangChain-based applications, covering chain injection, agent manipulation, tool abuse, retrieval poisoning, and memory extraction attacks.

langfuseobservabilitymonitoringtracingsecurity-dashboardwalkthrough

Walkthrough: observability with Langfuse

Complete walkthrough for using Langfuse to monitor AI applications for security anomalies: setting up tracing, building security dashboards, detecting prompt injection patterns, and creating automated alerts.

nemoguardrailscolangnvidiadialog-flowsbypass-testingwalkthrough

NeMo Guardrails walkthrough

End-to-end walkthrough of NVIDIA NeMo Guardrails: installation, Colang configuration, dialog flow design, integration with LLM applications, and red team bypass testing techniques.

ollamalocal-modelsmodel-analysissafety-testingred-team-labwalkthrough

Analyzing and testing local models with Ollama

Walkthrough for using Ollama to run, analyze, and security-test local LLMs, covering model configuration, safety boundary testing, system prompt extraction, fine-tuning vulnerability assessment, and building a local red team lab.

ollamalocal-modelsuncensoredsafety-comparisonlab-setupwalkthrough

Ollama for local red teaming

Using Ollama as a local red teaming environment: model selection, running uncensored models, API-based testing, comparing safety across model families, and building a cost-free testing lab.

promptfooevaluationbeginnerfirst-evalwalkthrough

Running your first promptfoo evaluation

Beginner walkthrough for running your first promptfoo evaluation from scratch, covering installation, configuration, test case creation, assertion writing, and result interpretation.

promptfooautomationred-teamevaluationci-cdwalkthrough

Automating red team evaluations with Promptfoo

Complete walkthrough for setting up automated red team evaluation pipelines using Promptfoo, covering configuration, custom evaluators, adversarial dataset generation, CI integration, and result analysis.

promptfooevaluationred-teamci-cdyamlassertionswalkthrough

End-to-end promptfoo walkthrough

Complete walkthrough of promptfoo for AI red teaming: configuration files, provider setup, running evaluations, red team plugins, assertion-based scoring, reporting, and CI/CD integration.

pyritazureazure-openaicontent-safetyenterprisewalkthrough

Integrating PyRIT with Azure OpenAI and Content Safety

Intermediate walkthrough on integrating PyRIT with Azure OpenAI Service and Azure AI Content Safety for enterprise red teaming, including managed identity authentication, content filtering analysis, and compliance reporting.

pyritconverterspayload-transformationevasionwalkthrough

Building converter pipelines for payload transformation in PyRIT

Intermediate walkthrough on using PyRIT's converter system to transform attack payloads through encoding, translation, paraphrasing, and other obfuscation techniques to evade input filters.

pyritscorerscustom-scoringevaluationwalkthrough

Creating custom scorers for attack evaluation in PyRIT

Intermediate walkthrough on building custom PyRIT scorers for evaluating attack success, including pattern-based, LLM-based, and multi-criteria scoring approaches.

pyritred-teamingbeginnerfirst-campaignwalkthrough

Running your first PyRIT red team campaign

Beginner walkthrough for running your first PyRIT red team campaign from scratch, covering installation, target configuration, orchestrator setup, and basic result analysis.

pyrituifrontendbeginnervisual-interfacewalkthrough

Using the PyRIT UI frontend

Beginner walkthrough on using PyRIT's web-based UI frontend for visual red team campaign management, including launching campaigns, monitoring progress, and reviewing results without writing code.

pyritmulti-turnorchestrationred-teamingwalkthrough

Orchestrating multi-turn attack sequences with PyRIT

Intermediate walkthrough on using PyRIT's orchestration capabilities for multi-turn red team campaigns, including attack strategy design, conversation management, and adaptive scoring.

pyritmulti-turnorchestrationred-teammicrosoftattack-campaignswalkthrough

Microsoft PyRIT for orchestrated multi-turn attacks

Comprehensive walkthrough for using Microsoft PyRIT to design and execute orchestrated multi-turn attack campaigns against LLM applications, covering orchestrator configuration, converter chains, scoring strategies, and campaign analysis.

pyritreportingred-team-reportsdocumentationwalkthrough

Generating professional reports from PyRIT campaigns

Intermediate walkthrough on generating professional red team reports from PyRIT campaign data, including executive summaries, technical findings, remediation guidance, and visual dashboards.

pyrittargetsconfigurationapi-integrationwalkthrough

Configuring diverse targets in PyRIT

Intermediate walkthrough on configuring PyRIT targets for various model providers, custom APIs, local models, and application endpoints including authentication, system prompts, and rate limiting.

pyritmicrosoftorchestratorsconvertersmulti-turnred-teamingwalkthrough

End-to-end PyRIT walkthrough

Complete walkthrough of Microsoft's Python Risk Identification Toolkit: setup, connecting to targets, running orchestrators, using converters, multi-turn attacks, and analyzing results with the web UI.

pythonautomationhttpxaiohttpreportingtest-harnesswalkthrough

Red team automation with Python

Building custom AI red team automation with Python: test harnesses with httpx and aiohttp, result collection and analysis, automated reporting, and integration with existing tools like promptfoo and garak.

rebuffprompt-injectiondetectiondefense-testingevasionwalkthrough

Testing prompt injection defenses with Rebuff

Walkthrough for using Rebuff to test and evaluate prompt injection detection capabilities, covering installation, detection pipeline analysis, adversarial evasion testing, custom rule development, and benchmarking detection accuracy.