# walkthroughs
272 articles tagged with “walkthroughs”
Walkthrough: A2A message interception
Intercept and manipulate messages between agents in A2A protocol-based multi-agent systems.
Walkthrough: A2A protocol injection
Walkthrough of exploiting Google's Agent-to-Agent protocol for inter-agent prompt injection.
Walkthrough: crafting adversarial suffixes
Craft adversarial suffixes using gradient-based and gradient-free optimization methods for black-box models.
Walkthrough: API abuse chain attack
Chain multiple API calls to achieve unauthorized actions that no single call would permit.
Walkthrough: API chaining abuse
Walkthrough of chaining multiple API calls in agent systems to achieve multi-step unauthorized actions.
Walkthrough: attention hijacking attack
Hijack transformer attention mechanisms to redirect model focus toward adversarial instructions in the context.
Walkthrough: stealthy jailbreak with AutoDAN
Walkthrough of generating stealthy, human-readable jailbreak prompts using the AutoDAN methodology.
Walkthrough: batch processing injection
Inject payloads through batch processing pipelines where individual items are processed without isolation.
Walkthrough: LLM cache poisoning
Poison LLM response caches to serve adversarial content to other users without direct injection.
Competition-style jailbreak techniques
Walkthrough of jailbreak techniques used in AI security competitions and CTF events.
Walkthrough: computer-use agent injection
Walkthrough of injecting prompts through UI elements and screenshots processed by computer-use agents.
Walkthrough: advanced context manipulation
Manipulate conversation context through history injection, context overflow, and attention weight exploitation.
Context manipulation deep dive
Advanced walkthrough of context window manipulation techniques including attention dilution and instruction priority shifting.
Walkthrough: Crescendo multi-turn jailbreak
Step-by-step walkthrough of the Crescendo technique for gradually escalating multi-turn jailbreaks.
Walkthrough: cross-plugin data exfiltration
Walkthrough of chaining multiple plugins/tools to exfiltrate data from LLM agent systems.
Data harvesting via LLM apps
Complete walkthrough of systematic data extraction from LLM applications using various exfiltration channels.
Walkthrough: document-based injection
Inject prompts through documents processed by LLM applications including PDFs, spreadsheets, and presentations.
Walkthrough: embedding collision attack
Craft documents that collide in embedding space with target queries to hijack RAG retrieval results.
Walkthrough: embedding inversion attack
Walkthrough of inverting text embeddings to recover original documents from vector databases.
Walkthrough: encoding chain attack
Chain multiple encoding transformations to bypass input filters that only decode one layer of encoding.
Walkthrough: encoding chain bypass
Walkthrough of chaining Base64, URL encoding, and Unicode tricks to bypass multi-layer input filters.
Walkthrough: few-shot example poisoning
Poison few-shot examples in prompts to establish behavioral patterns that override system instructions.
Walkthrough: bypassing safety via fine-tuning
Walkthrough of using fine-tuning API access to remove safety behaviors from aligned models.
Guide to function calling abuse
Complete walkthrough of exploiting function calling in OpenAI, Anthropic, and Google AI APIs.
Walkthrough: function schema poisoning
Poison function schemas to inject hidden instructions that redirect model tool selection and parameter filling.
Walkthrough: GCG adversarial suffix attack
Complete walkthrough of implementing the Greedy Coordinate Gradient adversarial suffix attack.
Indirect prompt injection in practice
Walkthrough of planting and triggering indirect prompt injection in web-browsing AI assistants.
Walkthrough: JSON injection attack
Exploit JSON parsing and generation in LLM applications to inject payloads through structured data boundaries.
Walkthrough: logit bias abuse
Exploit logit bias API parameters to force models toward specific token sequences and bypass safety filters.
Walkthrough: many-shot jailbreaking
Walkthrough implementing Anthropic's many-shot jailbreaking technique with analysis of scaling behavior.
Walkthrough: advanced Markdown injection
Inject Markdown that triggers data exfiltration through image rendering, link generation, and code block escape.
Walkthrough: MCP tool poisoning attack
Walkthrough of exploiting MCP tool descriptions to redirect agent behavior via hidden instructions.
Walkthrough: memory persistence attack
Walkthrough of achieving persistent memory manipulation in agent systems for cross-session influence.
Walkthrough: model context window overflow
Overflow the context window to push safety instructions outside the effective attention range.
Walkthrough: model extraction attack
Walkthrough of extracting model weights/behavior through systematic API querying.
Walkthrough of multimodal document attacks
Combine visual and textual injection in documents processed by multimodal LLM applications.
Walkthrough of multi-turn persona escalation
Gradually escalate through persona manipulation across multiple turns to achieve full safety bypass.
Walkthrough of multimodal image injection
Step-by-step walkthrough of embedding adversarial prompts in images for vision model exploitation.
Walkthrough of automated jailbreaking with PAIR
End-to-end walkthrough of the PAIR attack methodology for automated black-box jailbreaking.
Walkthrough: agent privilege escalation
Walkthrough of escalating privileges in multi-agent systems through trust chain exploitation.
Walkthrough: advanced prompt leaking
Advanced techniques for extracting system prompts including iterative reconstruction and side-channel methods.
End-to-end walkthrough of RAG poisoning
Complete walkthrough of poisoning a RAG system from document injection through information extraction.
Walkthrough of reasoning model abuse
Exploit extended thinking in reasoning models to inject false premises and manipulate conclusion generation.
Walkthrough of bypassing a regex filter
Systematically bypass regex-based input filters using Unicode tricks, encoding, and pattern-specific evasion.
Walkthrough of a role confusion attack
Exploit role confusion between system, user, and assistant messages to override safety instructions.
Walkthrough of role confusion abuse
Exploit role boundaries in chat APIs by injecting assistant-role messages and system-level instructions.
Walkthrough of semantic camouflage
Walkthrough of crafting semantically camouflaged injections that evade both classifiers and human review.
Walkthrough of a semantic deception attack
Craft semantically deceptive prompts that appear benign to classifiers while achieving adversarial objectives.
Walkthrough of sleeper agent detection
Walkthrough of detecting deceptive sleeper agent behaviors in fine-tuned language models.
Walkthrough of prompt injection via the supply chain
Plant injection payloads in upstream data sources consumed by LLM applications including packages and documentation.
Model supply chain poisoning
Walkthrough of poisoning ML supply chains through dependency confusion, model weight manipulation, and hub attacks.
Walkthrough of synthetic identity injection
Create synthetic identities that exploit LLM trust mechanisms to achieve elevated instruction priority.
Walkthrough of TAP Tree of Attacks
Detailed walkthrough of building and running Tree of Attacks for systematic jailbreak generation.
Walkthrough of thought injection in reasoning models
Inject adversarial thoughts into the reasoning chain of thinking models to manipulate final outputs.
Walkthrough of a timing side-channel attack
Extract information from LLM applications through timing differences in response generation and safety filter processing.
Walkthrough of token-level attack optimization
Walkthrough of optimizing adversarial token sequences for maximum jailbreak effectiveness.
Walkthrough of a tool shadowing attack
Register shadow tools that override legitimate tool definitions to intercept and manipulate agent actions.
Walkthrough of a vision model attack
Attack vision-language models through adversarial images with embedded text, perturbations, and visual trojans.
Walkthrough of a vision model attack (attack walkthrough)
Step-by-step walkthrough of visual prompt injection, adversarial images, and OCR exploitation in vision-language models.
Walkthrough: A2A agent impersonation
Walkthrough of impersonating a trusted agent in an A2A network to intercept and redirect delegated tasks.
Walkthrough: API parameter abuse
Walkthrough of exploiting undocumented and misconfigured API parameters in LLM service endpoints.
Walkthrough: Batch API abuse
Walkthrough of exploiting batch processing APIs for large-scale prompt injection and output analysis.
Walkthrough: chain-of-thought hijacking
Walkthrough of hijacking visible reasoning traces in CoT models to redirect conclusions and bypass safety checks.
Walkthrough: cognitive overload jailbreak
Step-by-step walkthrough of the cognitive overload technique that overwhelms safety training with complex nested instructions.
Walkthrough: context virtualization attack
Walkthrough of creating virtualized contexts that make the model believe it is operating in a different environment.
Walkthrough: conversation poisoning attack
Step-by-step walkthrough of poisoning conversation history to alter model behavior in future turns.
Walkthrough: document parsing exploit chain
End-to-end walkthrough of exploiting document parsers to inject instructions through PDFs, DOCX, and HTML files.
Scaling analysis of few-shot attacks
Detailed analysis of how few-shot examples scale to influence model behavior, from 2-shot to many-shot regime.
Walkthrough: image steganography injection
Walkthrough of hiding prompt injection payloads in images using steganographic techniques for vision model attacks.
Walkthrough: instruction hierarchy abuse
Walkthrough of exploiting how models prioritize system, user, and tool instructions to override safety directives.
Walkthrough: knowledge graph injection attack
Walkthrough of injecting adversarial facts into knowledge graphs consumed by LLM-based reasoning systems.
Walkthrough: LangChain exploit chain
Walkthrough of chaining LangChain CVEs for remote code execution from prompt injection through to shell access.
Walkthrough: MCP Sampling API abuse
Walkthrough of exploiting the MCP sampling API to manipulate how servers request and process LLM completions.
Walkthrough: memory injection and persistence
Walkthrough of injecting persistent instructions into agent memory systems that survive across sessions.
Walkthrough of a multi-provider attack campaign
End-to-end walkthrough of conducting an attack campaign across OpenAI, Anthropic, Google, and open-source models.
Walkthrough of output format exfiltration techniques
Walkthrough of using output format manipulation to exfiltrate data through code blocks, markdown, and structured outputs.
Walkthrough of RAG hybrid search poisoning
Walkthrough of poisoning both vector and keyword search in hybrid RAG architectures for maximum retrieval influence.
Walkthrough of recursive prompt injection
Walkthrough of creating self-replicating injection payloads that persist through model output-to-input loops.
Walkthrough of the refusal suppression technique
Walkthrough of refusal suppression techniques that prime models to avoid safety refusal patterns.
Walkthrough of semantic obfuscation
Walkthrough of semantically obfuscating adversarial payloads so they appear benign to both classifiers and humans.
Walkthrough of data leakage via structured output
Walkthrough of using structured output requirements to extract sensitive data embedded in model responses.
Walkthrough of a tool use confusion attack
Walkthrough of confusing model tool-use decisions to invoke unintended functions or skip safety-critical tools.
Walkthrough of bypassing Unicode normalization
Step-by-step guide to exploiting Unicode normalization differences between input filters and model tokenizers.
Walkthrough of voice AI prompt injection
Walkthrough of injecting prompts into voice-based AI assistants through adversarial audio and ultrasonic signals.
Walkthrough: LLM watermark analysis
Walkthrough of detecting and analyzing watermarks in LLM-generated text using statistical methods.
Walkthrough of XML injection in LLM systems
Exploit XML parsing in LLM application pipelines to inject instructions through entity expansion and CDATA sections.
XML and JSON injection in LLM apps
Walkthrough of exploiting XML and JSON parsing in LLM applications for injection and data manipulation.
Adversarial robustness testing framework
Build a framework for continuously testing adversarial robustness of deployed LLM defense mechanisms.
Adversarial training for LLM defense (defense walkthrough)
Implement adversarial training techniques to improve LLM robustness against prompt injection and jailbreaking.
Enforcing permission boundaries for agents
Implement fine-grained permission boundaries for LLM agents that limit tool access based on context and user role.
Automated defense regression testing
Build automated regression tests for LLM security defenses using attack replay and continuous monitoring.
Automated red team defense loop
Build an automated continuous red team testing loop that discovers and patches vulnerabilities iteratively.
Behavioral anomaly detection for LLMs
Implement behavioral anomaly detection that identifies when model outputs deviate from expected safety profiles.
Building an input safety classifier
Build a production-quality input classifier that detects prompt injection attempts.
Constitutional AI implementation guide
Implement constitutional AI principles in a custom fine-tuning and RLHF pipeline.
Implementing the context isolation pattern
Implement context isolation patterns that prevent instruction leakage between system prompts and user data.
Data loss prevention for LLM apps
Build a data loss prevention layer for LLM applications that prevents sensitive data exposure in outputs.
Defense benchmarking system
Build a benchmarking system to continuously evaluate defense effectiveness against known attack classes.
Defense-in-depth architecture for LLM apps
Design and implement a complete defense-in-depth architecture for production LLM applications.
Setting up an AI incident response system
Set up comprehensive incident response capabilities for AI-specific security incidents.
Deploying an input embedding firewall
Deploy an embedding-based firewall that detects injection attempts through semantic similarity to known attack patterns.
Secure input/output logging for defense
Implement secure logging for LLM input/output pairs that supports forensic analysis without exposing sensitive data.
LLM firewall architecture design (defense walkthrough)
Design and implement a comprehensive LLM firewall with input analysis, output filtering, and behavioral monitoring.
Deploying an LLM honeypot
Deploy LLM honeypots to detect and study attacker behavior patterns and techniques.
Model behavior monitoring setup
Set up comprehensive model behavior monitoring to detect drift, anomalies, and potential compromise.
Monitoring LLM applications for abuse
Build a monitoring and alerting system to detect ongoing attacks against LLM applications.
Multi-model safety consensus
Implement safety consensus mechanisms where multiple models must agree before executing sensitive actions.
Output filtering pipeline design
Design and implement a multi-stage output filtering pipeline for LLM applications.
Output grounding and verification
Implement output grounding verification to ensure LLM responses are factually supported by provided context.
Prompt Armor implementation guide
Implement a comprehensive prompt armoring system with instruction isolation, delimiter hardening, and priority enforcement.
Setting up a prompt injection honeypot
Deploy honeypot prompts and canary data that detect and characterize prompt injection attempts.
Walkthrough of input sanitization for RAG
Implement input sanitization for RAG systems to prevent document-based injection attacks.
Real-time attack detection system
Build a real-time attack detection system that monitors LLM interactions for adversarial patterns.
Red team-defense feedback loop
Build a continuous red team-defense improvement loop with automated testing and metric tracking.
Implementing response watermarking
Implement response watermarking to enable traceability and detect unauthorized reproduction of LLM outputs.
Implementing a runtime safety monitor
Implement a runtime safety monitor that detects and blocks unsafe model outputs in real-time.
Training custom safety classifiers
Train custom safety classifiers tuned to your application's specific threat model and content policy.
Secure agent architecture design
Design a secure architecture for LLM agent systems with sandboxing, capability controls, and audit trails.
Secure function calling patterns
Implement secure function calling with input validation, output sanitization, and capability restrictions.
Walkthrough of a secure RAG architecture
Design and implement a secure RAG architecture with document sanitization, access controls, and output validation.
Secure RAG pipeline architecture
Build a secure RAG pipeline with document sanitization, retrieval validation, and injection-resistant augmentation.
System prompt protection techniques
Implement multiple layers of system prompt protection against extraction, override, and leakage attacks.
Threat intelligence for AI security
Build a threat intelligence pipeline for staying current with AI security threats and attack techniques.
Token-level input filtering
Implement token-level input filtering that detects injection patterns at the tokenization stage before model processing.
Tool call authorization framework
Implement a tool call authorization framework that validates tool invocations against policy before execution.
Implementing access control for agent tools
Implement fine-grained tool access control for LLM agents with capability-based security and approval workflows.
Anomaly detection for LLM traffic
Build anomaly detection for LLM API traffic to identify attack patterns, abuse, and prompt injection attempts.
Conversation integrity monitoring
Build a conversation integrity monitoring system that detects manipulation across multi-turn interactions.
Defense effectiveness metrics dashboard
Build a dashboard to measure and visualize the effectiveness of defensive measures against attack categories.
Automated defense testing pipeline
Build an automated pipeline that continuously tests defensive measures against evolving attack techniques.
Embedding poisoning detection system
Build a detection system for identifying poisoned documents in vector databases using statistical analysis.
Implementing guardrails for function calling
Implement guardrails for function calling that validate tool selection, parameters, and execution scope.
LLM honeypot deployment guide
Deploy LLM-based honeypots to detect, analyze, and learn from attacker techniques in production environments.
Incident response playbook for LLM applications
Design and implement an incident response playbook specific to LLM application security incidents.
LLM Guard production deployment guide
Deploy LLM Guard in a production environment with custom scanners, performance optimization, and monitoring.
MCP server security hardening guide
Harden MCP server implementations against tool poisoning, transport attacks, and capability escalation.
Multi-model defense ensemble
Build an ensemble defense system using multiple models to cross-validate inputs and outputs for safety.
Advanced NeMo Guardrails configuration
Advanced NeMo Guardrails configuration including custom actions, knowledge base integration, and multi-model pipelines.
Building an output scanning pipeline
Build a production output scanning pipeline that detects data leakage, harmful content, and injection artifacts.
PII detection and redaction for LLMs
Build a PII detection and redaction system for LLM inputs and outputs to prevent data exposure.
Training a prompt injection classifier
Train a custom prompt injection detection classifier using labeled datasets and modern NLP techniques.
Implementing sandboxing for RAG documents
Implement document-level sandboxing for RAG systems to prevent cross-document injection and privilege escalation.
Adaptive rate limiting for LLM APIs
Implement adaptive rate limiting that adjusts thresholds based on detected attack patterns and user behavior.
Implementing a secure RAG architecture
Implement a security-hardened RAG architecture with input sanitization, access control, and output validation.
System prompt protection layers
Implement multi-layer system prompt protection including injection-resistant formatting and runtime validation.
Zero trust architecture for LLM apps
Implement zero trust principles in LLM application architecture with continuous verification and least privilege.
Full engagement: AI code assistant
End-to-end engagement for assessing an AI-powered code assistant with repository access.
Full engagement: content generation platform
Full engagement walkthrough for a content generation AI platform with brand safety and copyright concerns.
Full engagement: customer analytics AI
Red team engagement for an AI-powered customer analytics system processing sensitive behavioral data.
Full engagement: AI developer tools
End-to-end engagement for AI-powered developer tools with repository access and code execution capabilities.
Full engagement: educational AI platform
End-to-end engagement walkthrough for an AI-powered educational platform with student safety requirements.
Full engagement: AI financial advisor
Full engagement for assessing an AI financial advisor chatbot handling investment recommendations.
Full engagement: government AI system
Full red team engagement for a government-deployed AI system with classification and compliance requirements.
Full engagement: healthcare AI system
End-to-end engagement walkthrough for a healthcare AI system with HIPAA compliance requirements.
Full engagement: internal knowledge bot
Full engagement walkthrough for an internal enterprise knowledge bot with RAG and document access.
Full engagement: multi-agent platform
Full red team engagement of a multi-agent platform with MCP servers and A2A communication.
Full engagement: SaaS customer chatbot
End-to-end engagement walkthrough for assessing a SaaS customer-facing chatbot with tool access.
Full engagement: ML supply chain audit
End-to-end engagement walkthrough for auditing an organization's ML model supply chain security.
Full engagement: autonomous coding agent
End-to-end engagement for an autonomous AI coding agent with repo access, CI/CD integration, and deployment.
Full engagement: AI content generation platform
Red team engagement of an AI content generation platform used for marketing, SEO, and social media.
Full engagement: multi-tool customer support agent
End-to-end engagement for a customer support agent with order management, refund processing, and CRM access.
Full engagement: DevOps AI assistant
End-to-end engagement for a DevOps AI assistant with CI/CD, cloud infrastructure, and monitoring access.
Full engagement: document processing pipeline
Full engagement of an AI document processing pipeline handling invoices, contracts, and compliance documents.
Full engagement: AI education platform
End-to-end engagement walkthrough for assessing an AI-powered education platform with tutoring and grading.
Full engagement: government citizen services AI
Red team engagement of a government citizen services AI handling tax, benefits, and identity verification.
Full engagement: HR benefits chatbot
Full engagement walkthrough for an HR benefits chatbot with access to employee records and benefits administration.
Full engagement: AI insurance underwriting
Full engagement for an AI underwriting system with risk scoring, policy generation, and claims data access.
Full engagement: legal research AI system
Full red team engagement of a legal research AI with case law access, privilege management, and citation generation.
Full engagement: enterprise multi-agent system
Full engagement of an enterprise multi-agent system with specialized agents for HR, IT, Finance, and Legal.
Full engagement: retail personalization AI
Red team engagement of an AI personalization engine with access to customer profiles and purchase history.
Full engagement: AI security copilot
Red team engagement of an AI security copilot with access to SIEM, vulnerability scanners, and threat intelligence.
Full engagement: supply chain AI optimizer
End-to-end engagement for a supply chain AI with access to logistics, inventory, and supplier management systems.
Full engagement: telehealth AI assistant
End-to-end engagement for a telehealth AI assistant with appointment scheduling, symptom assessment, and EHR access.
Walkthroughs
Step-by-step guided walkthroughs covering red team tools, engagement methodology, defense implementation, platform-specific testing, and full engagement workflows.
Adversarial simulation design
Design realistic adversarial simulations that model real-world threat actors and attack scenarios for AI systems.
Agentic system assessment methodology
Comprehensive methodology for assessing agentic AI systems including tool use, memory, and multi-agent interactions.
Planning an AI penetration test
Complete methodology for planning AI-specific penetration tests including scope definition, resource allocation, and timeline.
AI red team maturity model (methodology walkthrough)
Maturity model for assessing and improving an organization's AI red teaming capabilities.
AI security metrics framework
Framework for measuring and reporting on AI security posture using quantitative metrics.
AI vulnerability classification system
Structured system for classifying AI-specific vulnerabilities by type, impact, and exploitability.
Attack prioritization framework
Prioritize attack techniques based on target architecture, time constraints, and likelihood of success.
Building attack trees for LLM systems
Build systematic attack trees for LLM system assessments using MITRE ATLAS and OWASP mappings.
Automated AI reconnaissance workflow
Build an automated reconnaissance workflow that maps AI application architecture, models, and defense configurations.
Collaborative AI red team assessment
Coordinate multi-person red team assessments with role assignments, communication protocols, and finding deconfliction.
Competitive analysis of AI security tools
Methodology for evaluating and comparing AI security tools for red team operations.
Compliance-driven testing methodology
Map regulatory requirements to specific test cases for compliance-driven AI red team assessments.
Continuous monitoring integration methodology
Integrate red team findings into continuous monitoring systems for ongoing threat detection and defense validation.
Finding deduplication and triage
Deduplicate and triage findings from automated and manual testing into actionable, prioritized vulnerability reports.
Multi-model testing methodology
Structured methodology for testing applications that use multiple LLM models in their processing pipeline.
Post-engagement analysis methodology
Conduct thorough post-engagement analysis including lessons learned, technique effectiveness, and methodology refinement.
Purple team AI assessment methodology
Conduct collaborative purple team AI assessments with real-time feedback between red and blue team operations.
AI security regression testing methodology
Design regression testing suites that verify security fixes remain effective across model updates and deployments.
Risk-based approach to AI testing
Apply risk-based testing approaches to focus assessment effort on the highest-impact vulnerability categories.
Stakeholder management in AI red teaming
Managing stakeholder expectations and communication throughout AI red team engagements.
Stakeholder-specific reporting methodology
Tailor red team reports for different stakeholders including executives, developers, security teams, and compliance officers.
Tabletop exercises for AI security
Designing and facilitating tabletop exercises focused on AI security incident scenarios.
Time-boxed AI assessment methodology
Maximize testing coverage within strict time constraints using prioritized attack trees and parallel testing.
AI attack surface enumeration methodology
Systematic methodology for enumerating the complete attack surface of an AI-powered application.
AI compliance testing methodology
Methodology for testing AI systems against regulatory compliance requirements including EU AI Act and NIST.
Mapping findings to MITRE ATLAS
Methodology for mapping AI red team findings to MITRE ATLAS tactics, techniques, and procedures.
AI penetration test report structure
Detailed report structure for AI penetration tests with finding templates and severity scoring.
AI red team scoping templates
Templates and procedures for scoping AI red team engagements across different application types.
AI risk quantification methodology
Quantitative risk assessment methodology for AI vulnerabilities with probability and impact scoring.
Designing tabletop exercises for AI security
Design and facilitate AI security tabletop exercises for organizational preparedness assessment.
Evidence collection during AI testing
Best practices for collecting, organizing, and preserving evidence during AI red team assessments.
Multi-model assessment methodology
Methodology for assessing applications that use multiple AI models in pipelines or ensemble configurations.
OWASP LLM Top 10 testing methodology
Comprehensive testing methodology for each vulnerability in the OWASP LLM Top 10 2025.
Purple team operations for AI security
Methodology for conducting purple team operations that combine red team attacks with blue team defense improvement.
AI security regression testing methodology (methodology walkthrough)
Methodology for continuous regression testing of AI application security after updates and model changes.
Communicating AI risks to stakeholders
Guide for communicating AI security risks to technical and non-technical stakeholders effectively.
Threat intelligence for AI systems
Methodology for gathering and applying threat intelligence specific to AI system attacks and defenses.
AI vulnerability prioritization framework
Framework for prioritizing AI vulnerabilities by exploitability, impact, and remediation cost.
Writing AI red team reports
Guide to writing clear, actionable AI red team assessment reports with findings and recommendations.
Testing AI21 Labs models
Red team testing guide for AI21 Labs Jamba models including long context and efficiency features.
Testing Anthropic Claude: complete guide
Complete red team testing guide for Anthropic's Claude including tool use, extended thinking, and computer use.
Testing AWS Bedrock deployments
Red team testing guide for models deployed via AWS Bedrock including guardrails and access controls.
Testing the Azure OpenAI Service
Red team testing guide for Azure OpenAI including content filtering, managed identity, and network controls.
Testing Cohere models
Red team testing guide for Cohere's Command-R models including RAG and tool use features.
Testing DeepSeek models
Red team testing guide for DeepSeek models including R1 reasoning exploitation and MoE architecture.
Testing the Fireworks AI platform
Red team testing guide for Fireworks AI including function calling and compound AI systems.
Testing Google Gemini: complete guide
Complete red team testing guide for Google's Gemini including multimodal inputs and long context.
Testing the Groq inference platform
Red team testing guide for Groq's high-speed inference platform and its security characteristics.
Testing models hosted on Hugging Face
Red team testing guide for models hosted on Hugging Face including Inference API and Spaces.
Testing Meta Llama: complete guide
Complete red team testing guide for Meta's Llama models including fine-tuning and self-hosted deployment.
Testing Mistral AI models
Complete red team testing guide for Mistral AI models including Mixtral MoE architecture and chat endpoints.
Testing local Ollama deployments
Security testing guide for locally deployed models via Ollama including network exposure and API security.
Testing OpenAI GPT-4o: complete guide
Complete red team testing guide for OpenAI's GPT-4o including API, function calling, and structured outputs.
Testing models hosted on Replicate
Red team testing guide for models hosted on Replicate including open-source model deployments.
Testing the Together AI platform
Red team testing guide for Together AI including fine-tuned model endpoints and custom deployments.
Testing AI21 Jamba models
Red team testing guide for AI21's Jamba models with SSM-Transformer hybrid architecture analysis.
Testing Anyscale/Ray Serve endpoints
Red team testing guide for models served via Anyscale and Ray Serve including distributed inference security.
Testing Cohere models: complete guide
Complete red team testing guide for Cohere's Command models including RAG integration and tool use.
Testing Fireworks AI deployments
Red team testing guide for models deployed via Fireworks AI including function calling and fine-tuned models.
Testing GCP Vertex AI deployments
Red team testing guide for models deployed via GCP Vertex AI including Model Garden and custom endpoints.
Testing Mistral/Mixtral: complete guide
Complete red team testing guide for Mistral and Mixtral models including MoE routing and API vulnerabilities.
Testing local Ollama deployments (platform walkthrough)
Red team testing guide for models deployed locally via Ollama including API endpoints and model management.
Testing models hosted on Replicate
Red team testing guide for models hosted on Replicate including API exploitation and model versioning.
Testing models hosted on Together AI
Red team testing guide for models hosted on Together AI's inference platform.
Testing vLLM inference deployments
Red team testing guide for models served via vLLM including batching, KV cache, and speculative decoding.
Developing a security benchmark runner
Build a benchmark runner for standardized evaluation of LLM security across models and configurations.
Building an attack replay tool
Build a tool that records and replays attack sequences for regression testing and defense validation.
ML security testing with Counterfit
Use Microsoft's Counterfit for adversarial ML testing of deployed model endpoints.
Building a custom LLM fuzzer
Build a custom fuzzing tool for LLM applications with mutation engines and coverage tracking.
Building a custom payload mutation engine
Develop a mutation engine that evolves prompt injection payloads through genetic algorithms.
Advanced configuration guide for garak
Deep dive into Garak's configuration system including custom probes, detectors, and generators.
Building custom garak probes (tool walkthrough)
Develop custom probe modules for NVIDIA's Garak to test domain-specific vulnerabilities.
Custom behavior sets for HarmBench
Extend HarmBench with custom behavior specifications for domain-specific vulnerability testing.
Walkthrough: HarmBench evaluation
Run standardized attack evaluations using the HarmBench framework and interpret results.
Tool walkthroughs
End-to-end practical walkthroughs for essential AI red teaming tools, covering installation, configuration, execution, and result interpretation.
Safety evaluations with Inspect AI
Build and run AI safety evaluations using the UK AISI Inspect framework.
Using and submitting to JailbreakBench
Use JailbreakBench to evaluate jailbreak techniques and submit results to the benchmark.
Deploying and testing LLM Guard
Deploy LLM Guard for input/output scanning and test its effectiveness against common attacks.
Building an LLM traffic analyzer
Build a proxy-based LLM traffic analyzer for intercepting and analyzing API communications.
Integrating promptfoo into the CI/CD pipeline
Integrate Promptfoo into CI/CD pipelines for continuous LLM security regression testing.
Promptfoo for red team evaluation
Configure Promptfoo for comprehensive red team evaluation with custom assertions and graders.
Orchestrating multi-turn campaigns with PyRIT
Build and run sophisticated multi-turn attack campaigns using PyRIT's orchestrator system.
Integrating custom scoring into PyRIT
Integrate custom scoring metrics into PyRIT for organization-specific red team evaluation criteria.
Developing an agent security scanner
Build an automated security scanner for AI agent systems that tests tool use, memory, and workflow vulnerabilities.
Automated generation of red team reports
Build an automated system for generating structured red team reports from testing data and findings.
Building an LLM attack proxy
Build an intercepting proxy for LLM API traffic that enables payload injection, response analysis, and logging.
Developing a defense benchmarking tool
Build a tool for benchmarking the effectiveness of defensive measures against standardized attack suites.
Building an embedding attack toolkit
Build a toolkit for conducting embedding-level attacks including inversion, poisoning, and manipulation.
Developing custom garak plugins
Develop custom Garak plugins including probes, detectors, and generators for organization-specific testing.
Developing custom solvers for Inspect AI
Build custom solvers in Inspect AI for specialized safety evaluation scenarios and attack simulations.
Jailbreak dataset curation tool
Build a tool for curating, organizing, and testing jailbreak prompt datasets with effectiveness tracking.
LLM traffic analysis tool
Build a tool for analyzing and visualizing LLM API traffic patterns to identify attack indicators.
MCP security audit tool
Build a tool for auditing MCP server implementations for common security vulnerabilities and misconfigurations.
Building a multi-model test harness
Build a unified test harness for running attacks across OpenAI, Anthropic, Google, and local model endpoints.
Developing a payload mutation framework
Build a payload mutation framework with grammar-based, semantic, and adversarial mutation strategies.
Integrating promptfoo into the CI/CD pipeline (tool walkthrough)
Integrate Promptfoo red team evaluations into CI/CD pipelines for automated security regression testing.
Developing custom converters in PyRIT
Build custom PyRIT converters for payload transformation, encoding, and multi-modal attack generation.
Framework for RAG security testing
Build a framework for systematic security testing of RAG applications including poisoning and exfiltration.