# beginner
165 articles tagged with “beginner”
Beginner certification practice exam
Full-length practice exam for beginner-level AI security certification covering foundations and basic techniques.
The AI API ecosystem
A red teamer's guide to the AI API landscape: OpenAI, Anthropic, Google, AWS, Azure, open-source APIs, authentication patterns, and common security misconfigurations.
AI deployment patterns and security implications
How API-based, self-hosted, edge, and hybrid deployment patterns each create their own security considerations and attack surfaces for AI systems.
The AI landscape: a map for the red teamer
The major models, providers, deployment patterns, and the AI stack from hardware to application: orientation for red teamers in the current AI ecosystem.
Model types and their attack surfaces
How text, vision, multimodal, embedding, and code generation models each present unique vulnerabilities and attack surfaces for red teamers.
Open versus closed models: security trade-offs
Security implications of open-weight versus closed-source AI models: access to weights, responsible deployment, fine-tuning risks, and the impact on your red team strategy.
How LLMs work: a guide for red teamers
Understand the fundamentals of large language models (token prediction, context windows, roles, and temperature) through a security lens.
Taxonomy of AI attacks
A comprehensive classification of AI attacks organized by target, technique, and impact: a shared vocabulary for red team planning and reporting.
Ethics of AI red teaming
Responsible testing practices, avoiding real harm, navigating dual-use issues, and professional standards for AI red team practitioners.
Fundamentals of red team methodology
What AI red teaming is, how it differs from traditional security testing, and the full lifecycle of an engagement, from scoping to reporting.
Legal landscape for AI testing
Authorization requirements, terms-of-service considerations, computer fraud laws, and responsible disclosure frameworks for AI red teaming.
Threat modeling for AI systems
How to identify assets, threats, and attack vectors specific to AI systems using simplified threat modeling frameworks adapted for machine learning.
Anatomy of an LLM API call
Understand the structure of the HTTP request for the OpenAI, Anthropic, and other LLM APIs: system messages, parameters, function calling, and common misconfigurations.
AI system architecture for red teamers
How AI systems are deployed in production (model API, prompt templates, orchestration, tools, memory, and guardrails) with an attack surface analysis at each layer.
Getting started with AI red teaming labs
Introduction to the hands-on lab environment for AI red teaming. Learn what you'll need, how labs are structured, and get an overview of all beginner exercises.
Lab: Anthropic Claude API fundamentals
Set up the Anthropic Claude API for red teaming, learn authentication, the Messages API, system prompts, and how temperature and top-p affect attack success rates.
Lab: API authentication security testing
Test LLM API authentication mechanisms for vulnerabilities including token leakage, replay attacks, and scope bypass.
Exploring multi-provider APIs
Explore and compare API behaviors across OpenAI, Anthropic, and Google AI to understand provider-specific security characteristics.
Lab: API key security
Learn common API key exposure vectors, secure key management with .env files, detect keys in git history, implement key rotation, and apply least-privilege principles.
API rate limiting and error handling
Test LLM API rate limits and implement proper error handling for automated testing workflows.
Lab: parsing and analyzing API responses
Parse and analyze LLM API responses to identify anomalies, safety filter triggers, and information leakage patterns.
Lab: setting up automated LLM testing
Configure automated testing pipelines for LLM red teaming using Python scripts, test runners, and result collection.
Lab: creating basic adversarial examples
Create basic adversarial examples that cause LLMs to misclassify, misinterpret, or bypass safety checks on text input.
Basic setup for automated testing
Set up automated prompt testing with Python scripts to scale your security evaluation beyond manual testing.
Basic classifier evasion
Evade basic input/output classifiers using paraphrasing, synonym substitution, and formatting tricks.
Basic data exfiltration techniques
Extract sensitive information from LLM applications using social engineering and misdirection.
Lab: build your first defense
Hands-on lab for building a basic AI input defense system using keyword matching, regex filters, and simple classifiers to block prompt injection attempts.
Basic testing of defense mechanisms
Identify and categorize the defensive mechanisms present in a target LLM application through structured probing.
Lab: build your first defense (beginner lab)
Create a simple input filter that blocks common prompt injection patterns, then test it against the attack techniques you have learned in previous labs.
Lab: introduction to LLM fuzzing
Learn the basics of fuzzing LLM inputs by mutating prompts to discover unexpected behaviors and edge cases.
Lab: taxonomy of jailbreak techniques
Explore the major categories of jailbreak techniques and practice classifying attack payloads by technique type.
Lab: basic security testing of a RAG system
Introduction to testing Retrieval-Augmented Generation systems for document injection and retrieval manipulation vulnerabilities.
Exercise in documenting security findings
Practice documenting security findings in a professional format with reproducible steps and impact assessment.
Basic role-play prompt injection
Use role-play techniques to bypass basic LLM safety guardrails through character assumption.
Lab: social engineering of LLM applications
Practice social engineering techniques adapted for LLM applications including authority escalation and urgency injection.
Lab: model comparison
Test the same attack techniques against different language models and compare their safety behaviors, refusal patterns, and vulnerability profiles.
Lab: compare model safety
Hands-on lab for running identical safety tests against GPT-4, Claude, Gemini, and Llama to compare how different models handle prompt injection, jailbreaks, and safety boundary enforcement.
Lab: mapping content policy boundaries
Map the content policy boundaries of LLM applications to identify permitted and restricted topic areas systematically.
Lab: context manipulation
Manipulate the conversational context to change model behavior, using techniques like context flooding, false history injection, and persona shifting.
Lab: context overflow attacks
Explore context window overflow attacks that push system instructions out of the model's attention by filling the context with padding content, and measure instruction-following degradation.
Basic context window overflow
Fill the context window with padding content to push safety instructions out of the attention window.
Conversation history analysis
Analyze how conversation history influences model behavior and identify opportunities for context manipulation.
Lab: conversation history manipulation
Test how LLM applications handle conversation history including truncation, injection, and context window management.
Lab: fundamentals of defense bypass
Learn basic techniques to bypass simple LLM defenses including keyword filters, instruction reinforcement, and output validators using encoding, reformulation, and indirect approaches.
Lab: introduction to defense testing
Learn to systematically test LLM application defenses by probing input filters, output validators, and guardrails.
Lab: delimiter injection attacks
Craft payloads that exploit XML, markdown, and special token delimiters to break instruction boundaries and manipulate how LLMs parse system prompts versus user input.
Lab: delimiter escape attacks
Craft payloads that escape delimiter boundaries separating system and user content, testing how models handle broken fences, nested delimiters, and format confusion.
Delimiter escape techniques
Practice escaping common delimiters used to separate system prompts from user input.
Embedding fundamentals for security
Understand text embeddings and their security relevance by generating, comparing, and manipulating embedding vectors.
Lab: embedding fundamentals for red teamers
Learn embedding fundamentals including vector similarity, semantic search, and how embeddings enable RAG systems.
Fundamentals of encoding and obfuscation
Explore how Base64, ROT13, and Unicode encoding can bypass input filters in LLM applications.
Lab: hardening the test environment
Secure your AI red team testing environment to prevent credential leaks, accidental production testing, and data exposure.
Lab: analyzing error messages for reconnaissance
Analyze LLM application error messages to extract information about system architecture, models, and defensive layers.
Gathering intelligence from error messages
Extract system architecture information from error messages and response patterns in LLM applications.
Lab: ethical red teaming
Practice responsible AI red teaming with proper documentation, scope management, and ethical decision-making frameworks.
Lab: ethical red teaming (beginner lab)
Hands-on lab for practicing responsible AI red teaming with proper documentation, scope management, ethical boundaries, and disclosure procedures.
Lab: evidence collection in LLM testing
Set up evidence collection workflows including request/response logging, screenshots, and reproducibility documentation.
Fundamentals of few-shot injection
Craft few-shot examples that prime the model to follow attacker instructions in subsequent turns.
Lab: few-shot manipulation attacks
Craft fake few-shot examples that teach the model to bypass its safety training by demonstrating the desired adversarial behavior through fabricated conversation examples.
Your first Claude API call
Set up the Anthropic SDK and make your first Claude API call with system prompts and messages.
Your first LLM API call with OpenAI
Set up your Python environment and make your first LLM API call to understand request/response patterns.
Lab: your first prompt injection
Hands-on lab where you perform your first prompt injection attacks against a chatbot, progressing from basic overrides to more sophisticated techniques.
Lab: your first jailbreak
Try basic jailbreak techniques against a local model using Ollama, learning the difference between prompt injection and jailbreaking through hands-on experimentation.
Lab: format string injection in LLMs
Practice injecting format strings and template directives to manipulate LLM output structure and content.
Your first Garak vulnerability scan
Install and run NVIDIA's Garak tool to perform automated vulnerability scanning on an LLM endpoint.
Lab: setting up garak and your first scan
Install and configure garak, the LLM vulnerability scanner, then run your first automated scan against a local model to discover common vulnerabilities.
Lab: fundamentals of hallucination detection
Learn to detect and trigger hallucinations in LLM outputs including factual errors, fabricated citations, and invented APIs.
Hello world prompt injection
Write and test your first prompt injection payload against a simple chatbot to understand the fundamental attack mechanism.
Lab: injection detection tool
Build a basic prompt injection detection tool using pattern matching, heuristics, and LLM-based classification to identify malicious inputs before they reach the target model.
Lab: log analysis of injection attempts
Build a logging pipeline to capture and analyze injection attempts for pattern discovery and technique refinement.
Lab: overview of injection techniques
Survey and test ten fundamental prompt injection techniques against a local LLM, measuring effectiveness and cataloging behavioral patterns for each approach.
Lab: instruction-following priority
Test how language models prioritize conflicting instructions from system prompts, user messages, and embedded directives to understand the instruction hierarchy.
Testing the instruction hierarchy
Test how models prioritize conflicting instructions between system, user, and assistant roles.
Lab: basic jailbreak techniques
Hands-on exploration of jailbreak techniques including role-play, DAN-style prompts, and academic framing against multiple models.
Fundamentals of JSON output exploitation
Exploit LLM JSON mode to bypass content filters and extract information through structured output channels.
Lab: security testing of JSON output mode
Test JSON output mode for injection vulnerabilities including schema manipulation and structured output exploitation.
Lab: testing multilingual prompts
Test LLM safety boundaries across multiple languages to identify inconsistencies in safety training coverage.
Lab: setting up a local model for testing
Set up local LLM instances using Ollama for safe, cost-free red team testing without API costs or rate limits.
Lab: Markdown injection
Inject images, links, and formatting into LLM responses that exfiltrate data or alter display rendering in chat interfaces.
Model security comparison lab
Compare the security posture of different LLMs by running identical test suites across providers.
Lab: basic model fingerprinting
Learn to identify which LLM powers an application through behavioral fingerprinting and response analysis.
Lab: multilingual injection
Test prompt injection techniques using multiple natural languages, exploring how safety training coverage varies across languages and how translation requests can mask payloads.
Lab: multi-turn escalation attacks
Build trust over several conversation messages, then exploit established rapport to bypass safety mechanisms through gradual, multi-turn escalation techniques.
Probing multi-turn conversations
Use multi-turn conversations to gradually escalate requests and probe LLM safety boundaries.
Lab: fundamentals of multimodal input testing
Introduction to testing multimodal LLMs with image and text inputs to understand cross-modal behavior.
Lab: setting up Ollama for local LLM testing
Install and configure Ollama for local LLM red teaming, download models, perform basic interactions, and compare safety behavior between local and API-hosted models.
Lab: detecting output filters
Identify and characterize output filtering mechanisms in LLM applications through systematic response analysis.
Output format control lab
Practice controlling LLM output formats to understand how format constraints affect safety mechanism effectiveness.
Lab: output format exploitation
Manipulate output formats like JSON, CSV, code blocks, and structured data to extract information that models would normally refuse to provide in natural language.
Output format manipulation
Force LLMs to output in specific formats (JSON, XML, code) to bypass output sanitization.
Lab: output steering
Learn to steer LLM outputs into specific formats, tones, and content using injection techniques that manipulate how the model structures and presents its responses.
Lab: crafting payloads
Learn to craft effective prompt injection payloads from scratch by understanding payload structure, testing iteratively, and optimizing for reliability against a local model.
Lab: encoding bypasses
Test Base64, ROT13, hex, and Unicode encoding techniques to evade LLM content filters while preserving the semantic meaning of adversarial payloads.
Setting up payload logging
Build a payload logging system to track prompt injection attempts and model responses.
Lab: exploring the LLM playground
Explore LLM provider playgrounds to understand model behavior, parameter effects, and interactive testing workflows.
Security testing of LLM playgrounds
Use web-based LLM playgrounds to practice basic injection techniques without API setup requirements.
Lab: testing defenses against prompt leaking
Test various prompt leaking defense configurations to evaluate their effectiveness against extraction techniques.
Lab: fundamentals of prompt leaking
Learn to extract system prompts from chatbots using common leaking techniques including direct requests, indirect elicitation, and format manipulation.
Prompt structure analysis lab
Reverse-engineer the prompt structure of target applications by analyzing response patterns and behavioral cues.
Lab: testing prompt template vulnerabilities
Test common prompt template patterns for injection vulnerabilities including string interpolation and placeholder abuse.
Setting up Promptfoo for LLM evaluation
Configure Promptfoo to create automated test suites for evaluating LLM safety and robustness.
Lab: setting up promptfoo and your first evaluation
Install and configure promptfoo for systematic LLM evaluation, then run your first red team evaluation to test model safety boundaries.
Lab: setting up PyRIT and your first attack
Install and configure Microsoft's PyRIT (Python Risk Identification Toolkit) for automated red teaming, then run your first orchestrated attack against a local model.
Lab: rate limit enumeration and bypass
Enumerate API rate limits and test common bypass techniques including header manipulation and request distribution.
Mapping rate limits and quotas
Map the rate limits, quotas, and throttling behavior of LLM APIs to plan efficient testing strategies.
Lab: analyzing model refusal patterns
Systematically map model refusal patterns to understand safety training boundaries and identify potential bypasses.
Lab: fundamentals of red team report writing
Practice writing clear, actionable red team findings reports with evidence, risk ratings, and remediation guidance.
Fundamentals of response analysis
Learn to analyze LLM responses for information leakage, inconsistencies, and indicators of safety mechanism activation.
Lab: analyzing LLM responses
Learn to identify safety refusals, partial compliance, inconsistent behavior, and response fingerprinting techniques when analyzing LLM outputs during red team testing.
Lab: testing response consistency
Test LLM response consistency across repeated queries to identify non-deterministic safety boundary behavior.
Lab: role-play attacks
Use persona-based approaches to bypass AI safety measures by assigning alternate identities, characters, and scenarios that override the model's trained refusal behaviors.
Lab: running safety benchmarks
Run standardized safety benchmarks against LLMs to establish baseline safety profiles for comparison.
Lab: mapping safety boundaries
Systematically discover what a language model will and will not do by probing its safety boundaries across multiple categories and documenting the results.
Mapping safety boundaries
Systematically map the safety boundaries of an LLM by testing increasingly sensitive topics and documenting refusal patterns.
Lab: introduction to safety testing
Learn the fundamentals of LLM safety testing including test case design, baseline measurement, and result documentation.
Probing the boundaries of safety training
Systematically probe the boundaries of RLHF safety training to understand where and how safety behaviors are enforced.
Lab: fundamentals of vulnerability scoring
Learn vulnerability scoring frameworks adapted for LLM systems including severity, exploitability, and impact assessment.
Lab: simple payload encoding techniques
Practice encoding injection payloads using Base64, hex, URL encoding, and Unicode to bypass basic input filters.
Lab: structured output manipulation
Explore how JSON mode, schema injection, and forced output formats can bypass LLM safety filters and manipulate structured responses.
Lab: system prompt enumeration techniques
Practice systematic approaches to enumerating hidden system prompt content through varied probing strategies.
Fundamentals of system prompt extraction
Learn to extract hidden system prompts from LLM applications using direct and indirect techniques.
Lab: system prompt override
Explore techniques to override system-level instructions with user-level inputs, testing how models prioritize conflicting directives across different prompt architectures.
Lab: system prompt reconstruction
Use various techniques to systematically extract and reconstruct the full system prompt of an LLM application, combining direct, indirect, and incremental extraction methods.
Lab: security effects of temperature and sampling
Explore how temperature and sampling parameters affect model susceptibility to jailbreaks and safety boundary enforcement.
Lab: designing LLM red team test cases
Design effective red team test cases with clear objectives, success criteria, and reproducible execution procedures.
Counting and estimating tokens
Understand tokenization by counting and estimating tokens across different models and encoders.
Lab: fundamentals of token manipulation
Understand how tokenization affects LLM behavior and practice basic token-level manipulation techniques.
Lab: comparing red team testing tools
Compare Garak, PyRIT, and Promptfoo capabilities through hands-on exercises using each tool against the same target.
Installing and configuring red team tools
Install and configure essential AI red teaming tools: Garak, PyRIT, Promptfoo, and custom Python harnesses.
Testing API key scope and permissions
Test API key scoping and permission boundaries to identify over-privileged access configurations.
API response header analysis
Analyze HTTP response headers from LLM APIs to fingerprint providers, versions, and middleware.
Basic indirect prompt injection
Plant and trigger a basic indirect prompt injection payload in content consumed by an LLM.
Basic payload mutation techniques
Learn fundamental payload mutation strategies including synonym replacement and structural transforms.
Basic RAG query injection
Craft user queries that manipulate RAG retrieval to surface unintended documents.
Basic rate limit abuse patterns
Test common rate-limit bypass patterns including header manipulation and endpoint discovery.
Character encoding bypass techniques
Bypass input filters using Unicode normalization, homoglyph substitution, and mixed-script encoding.
Mapping chatbot persona and capabilities
Map the configured persona, capabilities, and restrictions of an LLM chatbot through systematic probing.
Fundamentals of completion hijacking
Craft partial sentences that steer model completions toward attacker-desired outputs.
Conversation reset attacks
Exploit conversation resets and context clearing to weaken model adherence to safety instructions.
Injection techniques with emoji and Unicode
Use emoji sequences and Unicode special characters to bypass text-based input filters.
Error message exploitation
Trigger and analyze error messages from LLM applications to reveal architecture and configuration details.
Your first HarmBench evaluation
Run a standardized safety evaluation using the HarmBench framework against a target model.
Your first Inspect AI evaluation
Set up and run a basic AI safety evaluation using the UK AISI Inspect framework.
Fundamentals of JSON injection
Inject adversarial content through JSON-formatted inputs to exploit structured data processing.
Your first LLM Guard scan
Install and configure LLM Guard to scan inputs and outputs, understanding detection capabilities and gaps.
Exfiltration via Markdown rendering
Exploit Markdown rendering in chat UIs to exfiltrate data through image tags and links.
Fundamentals of model fingerprinting
Identify which LLM powers an application through behavioral fingerprinting techniques.
Introduction to NeMo Guardrails
Set up NVIDIA NeMo Guardrails and create your first Colang-based guardrail for an LLM application.
Prompt injection via filenames
Embed prompt injection payloads in filenames and metadata of uploaded documents.
Prompt leaking via summarization requests
Use summarization requests to trick an LLM into revealing its hidden system prompt through indirect disclosure.
Analyzing refusal messages for intel
Extract useful information about model configuration and guardrails by systematically analyzing refusal messages.
Side-channel analysis of response timing
Use response timing differences to infer information about model processing and guardrail activation.
Exercise in mapping safety boundaries
Systematically map the safety boundaries of an LLM application across multiple topic categories.
Simple attacks on output constraints
Force models to output in constrained formats that bypass output safety filters.
System prompt reconstruction from clues
Piece together a hidden system prompt by collecting partial information from multiple probe techniques.
System prompt extraction via error injection
Trigger specific error conditions that cause the model to inadvertently reveal system prompt contents.
Effects of temperature and top-k on safety
Systematically test how temperature, top-k, and top-p parameters affect safety guardrail effectiveness.
Prompt injection via translation
Exploit LLM translation capabilities to smuggle instructions through language boundaries.
XML injection in LLM contexts
Exploit XML tag handling in LLM applications to manipulate instruction parsing.
Fundamentals of direct injection
Core concepts of directly injecting instructions into LLM prompts, including override techniques, simple payload crafting, and understanding how models parse conflicting instructions.
Prompt leaking step by step
Systematic approaches to extract system prompts from LLM applications, covering direct elicitation, indirect inference, differential analysis, and output-based reconstruction.
Running your first Garak scan
Step-by-step beginner walkthrough for running your very first garak vulnerability scan from zero, covering installation, target setup, probe selection, and basic result interpretation.
Running your first promptfoo evaluation
Beginner walkthrough for running your first promptfoo evaluation from scratch, covering installation, configuration, test case creation, assertion writing, and result interpretation.
Running your first PyRIT red team campaign
Beginner walkthrough for running your first PyRIT red team campaign from scratch, covering installation, target configuration, orchestrator setup, and basic result analysis.
Using the PyRIT UI frontend
Beginner walkthrough on using PyRIT's web-based UI frontend for visual red team campaign management, including launching campaigns, monitoring progress, and reviewing results without writing code.