# beginner
165 articles tagged with “beginner”
Beginner Certification Practice Exam
Full-length practice exam for beginner-level AI security certification covering foundations and basic techniques.
The AI API Ecosystem
A red teamer's guide to the AI API landscape — OpenAI, Anthropic, Google, AWS, Azure, open-source APIs, authentication patterns, and common security misconfigurations.
AI Deployment Patterns and Security Implications
How API-based, self-hosted, edge, and hybrid deployment patterns each create distinct security considerations and attack surfaces for AI systems.
The AI Landscape: A Red Teamer's Map
Major models, providers, deployment patterns, and the AI stack from hardware to application — orienting red teamers in the current AI ecosystem.
Model Types and Their Attack Surfaces
How text, vision, multimodal, embedding, and code generation models each present unique vulnerabilities and attack surfaces for red teamers.
Open vs Closed Models: Security Tradeoffs
Security implications of open-weight vs closed-source AI models — weight access, responsible deployment, fine-tuning risks, and the impact on red teaming strategy.
How LLMs Work: A Red Teamer's Guide
Understand the fundamentals of large language models — token prediction, context windows, roles, and temperature — through a security-focused lens.
AI Attack Taxonomy
A comprehensive classification of AI attacks organized by target, technique, and impact — providing a shared vocabulary for red team planning and reporting.
Ethics of AI Red Teaming
Responsible testing practices, avoiding real harm, navigating dual-use concerns, and professional standards for AI red team practitioners.
Red Team Methodology Fundamentals
What AI red teaming is, how it differs from traditional security testing, and the complete engagement lifecycle from scoping to reporting.
Legal Landscape for AI Testing
Authorization requirements, terms of service considerations, computer fraud laws, and responsible disclosure frameworks for AI red teaming.
Threat Modeling for AI Systems
How to identify assets, threats, and attack vectors specific to AI systems using simplified threat modeling frameworks adapted for machine learning.
Anatomy of an LLM API Call
Understand the HTTP request structure for OpenAI, Anthropic, and other LLM APIs — system messages, parameters, function calling, and common misconfigurations.
AI System Architecture for Red Teamers
How AI systems are deployed in production — model API, prompt templates, orchestration, tools, memory, and guardrails — with attack surface analysis at each layer.
Getting Started with AI Red Teaming Labs
Introduction to the hands-on lab environment for AI red teaming. Learn what you'll need, how labs are structured, and get an overview of all beginner exercises.
Lab: Anthropic Claude API Basics
Set up the Anthropic Claude API for red teaming, learn authentication, the Messages API, system prompts, and how temperature and top-p affect attack success rates.
Lab: API Authentication Security Testing
Test LLM API authentication mechanisms for vulnerabilities including token leakage, replay attacks, and scope bypass.
Multi-Provider API Exploration
Explore and compare API behaviors across OpenAI, Anthropic, and Google AI to understand provider-specific security characteristics.
Lab: API Key Security
Learn common API key exposure vectors, manage keys securely with .env files, detect keys in git history, implement key rotation, and apply least-privilege principles.
API Rate Limit and Error Handling
Test LLM API rate limits and implement proper error handling for automated testing workflows.
Lab: API Response Parsing and Analysis
Parse and analyze LLM API responses to identify anomalies, safety filter triggers, and information leakage patterns.
Lab: Setting Up Automated LLM Testing
Configure automated testing pipelines for LLM red teaming using Python scripts, test runners, and result collection.
Lab: Crafting Basic Adversarial Examples
Create basic adversarial examples that cause LLMs to misclassify, misinterpret, or bypass safety checks on text input.
Basic Automated Testing Setup
Set up automated prompt testing with Python scripts to scale your security evaluation beyond manual testing.
Basic Classifier Evasion
Evade basic input/output classifiers using paraphrasing, synonym substitution, and formatting tricks.
Basic Data Exfiltration Techniques
Extract sensitive information from LLM applications using social engineering and misdirection.
Lab: Build Your First Defense
Hands-on lab for building a basic AI input defense system using keyword matching, regex filters, and simple classifiers to block prompt injection attempts.
Basic Defense Mechanism Testing
Identify and categorize the defensive mechanisms present in a target LLM application through structured probing.
Lab: Build Your First Defense (Beginner Lab)
Create a simple input filter that blocks common prompt injection patterns, then test it against the attack techniques you have learned in previous labs.
Lab: Introduction to LLM Fuzzing
Learn the basics of fuzzing LLM inputs by mutating prompts to discover unexpected behaviors and edge cases.
Lab: Jailbreak Technique Taxonomy
Explore the major categories of jailbreak techniques and practice classifying attack payloads by technique type.
Lab: Basic RAG System Security Testing
Introduction to testing Retrieval-Augmented Generation systems for document injection and retrieval manipulation vulnerabilities.
Security Finding Documentation Exercise
Practice documenting security findings in a professional format with reproducible steps and impact assessment.
Basic Role-Play Prompt Injection
Use role-play techniques to bypass basic LLM safety guardrails through character assumption.
Lab: Social Engineering LLM Applications
Practice social engineering techniques adapted for LLM applications including authority escalation and urgency injection.
Lab: Model Comparison
Test the same attack techniques against different language models and compare their safety behaviors, refusal patterns, and vulnerability profiles.
Lab: Compare Model Safety
Hands-on lab for running identical safety tests against GPT-4, Claude, Gemini, and Llama to compare how different models handle prompt injection, jailbreaks, and safety boundary enforcement.
Lab: Content Policy Boundary Mapping
Map the content policy boundaries of LLM applications to identify permitted and restricted topic areas systematically.
Lab: Context Manipulation
Manipulate the conversational context to change model behavior, using techniques like context flooding, false history injection, and persona shifting.
Lab: Context Overflow Attacks
Explore context window overflow attacks that push system instructions out of the model's attention by filling the context with padding content, and measure instruction-following degradation.
Basic Context Window Overflow
Fill the context window with padding content to push safety instructions out of the attention window.
Conversation History Analysis
Analyze how conversation history influences model behavior and identify opportunities for context manipulation.
Lab: Conversation History Manipulation
Test how LLM applications handle conversation history including truncation, injection, and context window management.
Lab: Defense Bypass Basics
Learn basic techniques to bypass simple LLM defenses including keyword filters, instruction reinforcement, and output validators using encoding, reformulation, and indirect approaches.
Lab: Introduction to Defense Testing
Learn to systematically test LLM application defenses by probing input filters, output validators, and guardrails.
Lab: Delimiter Injection Attacks
Craft payloads that exploit XML, markdown, and special token delimiters to break instruction boundaries and manipulate how LLMs parse system prompts versus user input.
Lab: Delimiter Escape Attacks
Craft payloads that escape delimiter boundaries separating system and user content, testing how models handle broken fences, nested delimiters, and format confusion.
Delimiter Escape Techniques
Practice escaping common delimiters used to separate system prompts from user input.
Embedding Basics for Security
Understand text embeddings and their security relevance by generating, comparing, and manipulating embedding vectors.
Lab: Embedding Fundamentals for Red Teamers
Learn embedding fundamentals including vector similarity, semantic search, and how embeddings enable RAG systems.
Encoding and Obfuscation Basics
Explore how Base64, ROT13, and Unicode encoding can bypass input filters in LLM applications.
Lab: Testing Environment Hardening
Secure your AI red team testing environment to prevent credential leaks, accidental production testing, and data exposure.
Lab: Error Message Analysis for Recon
Analyze LLM application error messages to extract information about system architecture, models, and defensive layers.
Error Message Intelligence Gathering
Extract system architecture information from error messages and response patterns in LLM applications.
Lab: Ethical Red Teaming
Practice responsible AI red teaming with proper documentation, scope management, and ethical decision-making frameworks.
Lab: Ethical Red Teaming (Beginner Lab)
Hands-on lab for practicing responsible AI red teaming with proper documentation, scope management, ethical boundaries, and disclosure procedures.
Lab: Evidence Collection for LLM Testing
Set up evidence collection workflows including request/response logging, screenshots, and reproducibility documentation.
Few-Shot Injection Fundamentals
Craft few-shot examples that prime the model to follow attacker instructions in subsequent turns.
Lab: Few-Shot Manipulation Attacks
Craft fake few-shot examples that teach the model to bypass its safety training by demonstrating the desired adversarial behavior through fabricated conversation examples.
Your First Claude API Call
Set up the Anthropic SDK and make your first Claude API call with system prompts and messages.
Your First LLM API Call with OpenAI
Set up your Python environment and make your first LLM API call to understand request/response patterns.
Lab: Your First Prompt Injection
Hands-on lab where you perform your first prompt injection attacks against a chatbot, progressing from basic overrides to more sophisticated techniques.
Lab: Your First Jailbreak
Try basic jailbreak techniques against a local model using Ollama, learning the difference between prompt injection and jailbreaking through hands-on experimentation.
Lab: Format String Injection in LLMs
Practice injecting format strings and template directives to manipulate LLM output structure and content.
Your First Garak Vulnerability Scan
Install and run NVIDIA's Garak tool to perform automated vulnerability scanning on an LLM endpoint.
Lab: Garak Setup and First Scan
Install and configure garak, the LLM vulnerability scanner, then run your first automated scan against a local model to discover common vulnerabilities.
Lab: Hallucination Detection Basics
Learn to detect and trigger hallucinations in LLM outputs including factual errors, fabricated citations, and invented APIs.
Hello World Prompt Injection
Write and test your first prompt injection payload against a simple chatbot to understand the fundamental attack mechanism.
Lab: Injection Detection Tool
Build a basic prompt injection detection tool using pattern matching, heuristics, and LLM-based classification to identify malicious inputs before they reach the target model.
Lab: Injection Attempt Log Analysis
Build a logging pipeline to capture and analyze injection attempts for pattern discovery and technique refinement.
Lab: Injection Techniques Survey
Survey and test ten fundamental prompt injection techniques against a local LLM, measuring effectiveness and cataloging behavioral patterns for each approach.
Lab: Instruction Following Priority
Test how language models prioritize conflicting instructions from system prompts, user messages, and embedded directives to understand the instruction hierarchy.
Instruction Hierarchy Testing
Test how models prioritize conflicting instructions between system, user, and assistant roles.
Lab: Basic Jailbreak Techniques
Hands-on exploration of jailbreak techniques including role-play, DAN-style prompts, and academic framing against multiple models.
JSON Output Exploitation Basics
Exploit LLM JSON mode to bypass content filters and extract information through structured output channels.
Lab: JSON Output Mode Security Testing
Test JSON output mode for injection vulnerabilities including schema manipulation and structured output exploitation.
Lab: Multi-Language Prompt Testing
Test LLM safety boundaries across multiple languages to identify inconsistencies in safety training coverage.
Lab: Local Model Setup for Testing
Set up local LLM instances using Ollama for safe, cost-free red team testing without API costs or rate limits.
Lab: Markdown Injection
Inject images, links, and formatting into LLM responses that exfiltrate data or alter display rendering in chat interfaces.
Model Security Comparison Lab
Compare the security posture of different LLM models by running identical test suites across providers.
Lab: Basic Model Fingerprinting
Learn to identify which LLM model powers an application through behavioral fingerprinting and response analysis.
Lab: Multi-Language Injection
Test prompt injection techniques using multiple natural languages, exploring how safety training coverage varies across languages and how translation requests can mask payloads.
Lab: Multi-Turn Escalation Attacks
Build trust over several conversation messages, then exploit established rapport to bypass safety mechanisms through gradual, multi-turn escalation techniques.
Multi-Turn Conversation Probing
Use multi-turn conversations to gradually escalate requests and probe LLM safety boundaries.
Lab: Multimodal Input Testing Basics
Introduction to testing multimodal LLMs with image and text inputs to understand cross-modal behavior.
Lab: Setting Up Ollama for Local LLM Testing
Install and configure Ollama for local LLM red teaming, download models, perform basic interactions, and compare safety behavior between local and API-hosted models.
Lab: Detecting Output Filters
Identify and characterize output filtering mechanisms in LLM applications through systematic response analysis.
Output Format Control Lab
Practice controlling LLM output formats to understand how format constraints affect safety mechanism effectiveness.
Lab: Output Format Exploitation
Manipulate output formats like JSON, CSV, code blocks, and structured data to extract information that models would normally refuse to provide in natural language.
Output Format Manipulation
Force LLMs to output in specific formats (JSON, XML, code) to bypass output sanitization.
Lab: Output Steering
Learn to steer LLM outputs into specific formats, tones, and content using injection techniques that manipulate how the model structures and presents its responses.
Lab: Payload Crafting
Learn to craft effective prompt injection payloads from scratch by understanding payload structure, testing iteratively, and optimizing for reliability against a local model.
Lab: Encoding Bypasses
Test Base64, ROT13, hex, and Unicode encoding techniques to evade LLM content filters while preserving the semantic meaning of adversarial payloads.
Setting Up Payload Logging
Build a payload logging system to track prompt injection attempts and model responses.
Lab: LLM Playground Exploration
Explore LLM provider playgrounds to understand model behavior, parameter effects, and interactive testing workflows.
LLM Playground Security Testing
Use web-based LLM playgrounds to practice basic injection techniques without API setup requirements.
Lab: Testing Prompt Leaking Defenses
Test various prompt leaking defense configurations to evaluate their effectiveness against extraction techniques.
Lab: Prompt Leaking Basics
Learn to extract system prompts from chatbots using common leaking techniques including direct requests, indirect elicitation, and format manipulation.
Prompt Structure Analysis Lab
Reverse-engineer the prompt structure of target applications by analyzing response patterns and behavioral cues.
Lab: Prompt Template Vulnerability Testing
Test common prompt template patterns for injection vulnerabilities including string interpolation and placeholder abuse.
Setting Up Promptfoo for LLM Evaluation
Configure Promptfoo to create automated test suites for evaluating LLM safety and robustness.
Lab: Promptfoo Setup and First Eval
Install and configure promptfoo for systematic LLM evaluation, then run your first red team evaluation to test model safety boundaries.
Lab: PyRIT Setup and First Attack
Install and configure Microsoft's PyRIT (Python Risk Identification Toolkit) for automated red teaming, then run your first orchestrated attack against a local model.
Lab: Rate Limit Enumeration and Bypass
Enumerate API rate limits and test common bypass techniques including header manipulation and request distribution.
Rate Limit and Quota Mapping
Map the rate limits, quotas, and throttling behavior of LLM APIs to plan efficient testing strategies.
Lab: Analyzing Model Refusal Patterns
Systematically map model refusal patterns to understand safety training boundaries and identify potential bypasses.
Lab: Red Team Report Writing Basics
Practice writing clear, actionable red team findings reports with evidence, risk ratings, and remediation guidance.
Response Analysis Fundamentals
Learn to analyze LLM responses for information leakage, inconsistencies, and indicators of safety mechanism activation.
Lab: Analyzing LLM Responses
Learn to identify safety refusals, partial compliance, inconsistent behavior, and response fingerprinting techniques when analyzing LLM outputs during red team testing.
Lab: Response Consistency Testing
Test LLM response consistency across repeated queries to identify non-deterministic safety boundary behavior.
Lab: Role-Play Attacks
Use persona-based approaches to bypass AI safety measures by assigning alternate identities, characters, and scenarios that override the model's trained refusal behaviors.
Lab: Running Safety Benchmarks
Run standardized safety benchmarks against LLM models to establish baseline safety profiles for comparison.
Lab: Mapping Safety Boundaries
Systematically discover what a language model will and won't do by probing its safety boundaries across multiple categories and documenting the results.
Safety Boundary Mapping
Systematically map the safety boundaries of an LLM by testing increasingly sensitive topics and documenting refusal patterns.
Lab: Introduction to Safety Testing
Learn the fundamentals of LLM safety testing including test case design, baseline measurement, and result documentation.
Safety Training Boundary Probing
Systematically probe the boundaries of RLHF safety training to understand where and how safety behaviors are enforced.
Lab: Vulnerability Scoring Fundamentals
Learn vulnerability scoring frameworks adapted for LLM systems including severity, exploitability, and impact assessment.
Lab: Simple Payload Encoding Techniques
Practice encoding injection payloads using Base64, hex, URL encoding, and Unicode to bypass basic input filters.
Lab: Structured Output Manipulation
Explore how JSON mode, schema injection, and forced output formats can bypass LLM safety filters and manipulate structured responses.
Lab: System Prompt Enumeration Techniques
Practice systematic approaches to enumerating hidden system prompt content through varied probing strategies.
System Prompt Extraction Fundamentals
Learn to extract hidden system prompts from LLM applications using direct and indirect techniques.
Lab: System Prompt Override
Explore techniques to override system-level instructions with user-level inputs, testing how models prioritize conflicting directives across different prompt architectures.
Lab: System Prompt Reconstruction
Use various techniques to systematically extract and reconstruct the full system prompt of an LLM application, combining direct, indirect, and incremental extraction methods.
Lab: Temperature and Sampling Security Effects
Explore how temperature and sampling parameters affect model susceptibility to jailbreaks and safety boundary enforcement.
Lab: Designing LLM Red Team Test Cases
Design effective red team test cases with clear objectives, success criteria, and reproducible execution procedures.
Token Counting and Estimation
Understand tokenization by counting and estimating tokens across different models and encoders.
Lab: Token Manipulation Basics
Understand how tokenization affects LLM behavior and practice basic token-level manipulation techniques.
Lab: Comparing Red Team Testing Tools
Compare Garak, PyRIT, and Promptfoo capabilities through hands-on exercises using each tool against the same target.
Red Team Tool Installation and Configuration
Install and configure essential AI red teaming tools: Garak, PyRIT, Promptfoo, and custom Python harnesses.
API Key Scope and Permission Testing
Test API key scoping and permission boundaries to identify over-privileged access configurations.
API Response Header Analysis
Analyze HTTP response headers from LLM APIs to fingerprint providers, versions, and middleware.
Basic Indirect Prompt Injection
Plant and trigger a basic indirect prompt injection payload in content consumed by an LLM.
Basic Payload Mutation Techniques
Learn fundamental payload mutation strategies including synonym replacement and structural transforms.
Basic RAG Query Injection
Craft user queries that manipulate RAG retrieval to surface unintended documents.
Basic Rate Limit Abuse Patterns
Test common rate-limit bypass patterns including header manipulation and endpoint discovery.
Character Encoding Bypass Techniques
Bypass input filters using Unicode normalization, homoglyph substitution, and mixed-script encoding.
Chatbot Persona and Capability Mapping
Map the configured persona, capabilities, and restrictions of an LLM chatbot through systematic probing.
Completion Hijacking Fundamentals
Craft partial sentences that steer model completions toward attacker-desired outputs.
Conversation Reset Attacks
Exploit conversation resets and context clearing to weaken model adherence to safety instructions.
Emoji and Unicode Injection Techniques
Use emoji sequences and Unicode special characters to bypass text-based input filters.
Error Message Exploitation
Trigger and analyze error messages from LLM applications to reveal architecture and configuration details.
Your First HarmBench Evaluation
Run a standardized safety evaluation using the HarmBench framework against a target model.
Your First Inspect AI Evaluation
Set up and run a basic AI safety evaluation using the UK AISI Inspect framework.
JSON Injection Basics
Inject adversarial content through JSON-formatted inputs to exploit structured data processing.
Your First LLM Guard Scan
Install and configure LLM Guard to scan inputs and outputs, understanding detection capabilities and gaps.
Markdown Rendering Exfiltration
Exploit Markdown rendering in chat UIs to exfiltrate data through image tags and links.
Model Fingerprinting Basics
Identify which LLM model powers an application through behavioral fingerprinting techniques.
Introduction to NeMo Guardrails
Set up NVIDIA NeMo Guardrails and create your first Colang-based guardrail for an LLM application.
Prompt Injection via File Names
Embed prompt injection payloads in filenames and metadata of uploaded documents.
Prompt Leaking via Summarization Requests
Use summarization requests to trick an LLM into revealing its hidden system prompt through indirect disclosure.
Analyzing Refusal Messages for Intel
Extract useful information about model configuration and guardrails by systematically analyzing refusal messages.
Response Timing Side-Channel Analysis
Use response timing differences to infer information about model processing and guardrail activation.
Safety Boundary Mapping Exercise
Systematically map the safety boundaries of an LLM application across multiple topic categories.
Simple Output Constraint Attacks
Force models to output in constrained formats that bypass output safety filters.
System Prompt Reconstruction from Clues
Piece together a hidden system prompt by collecting partial information from multiple probe techniques.
System Prompt Extraction via Error Injection
Trigger specific error conditions that cause the model to inadvertently reveal system prompt contents.
Temperature and Top-K Effects on Safety
Systematically test how temperature, top-k, and top-p parameters affect safety guardrail effectiveness.
Prompt Injection via Translation
Exploit LLM translation capabilities to smuggle instructions through language boundaries.
XML Injection in LLM Contexts
Exploit XML tag handling in LLM applications to manipulate instruction parsing.
Direct Injection Basics
Core concepts of directly injecting instructions into LLM prompts, including override techniques, simple payload crafting, and understanding how models parse conflicting instructions.
Prompt Leaking Step by Step
Systematic approaches to extract system prompts from LLM applications, covering direct elicitation, indirect inference, differential analysis, and output-based reconstruction.
Running Your First Garak Scan
Step-by-step beginner walkthrough for running your very first garak vulnerability scan from zero, covering installation, target setup, probe selection, and basic result interpretation.
Running Your First Promptfoo Evaluation
Beginner walkthrough for running your first promptfoo evaluation from scratch, covering installation, configuration, test case creation, assertion writing, and result interpretation.
Running Your First PyRIT Red Team Campaign
Beginner walkthrough for running your first PyRIT red team campaign from scratch, covering installation, target configuration, orchestrator setup, and basic result analysis.
Using the PyRIT UI Frontend
Beginner walkthrough on using PyRIT's web-based UI frontend for visual red team campaign management, including launching campaigns, monitoring progress, and reviewing results without writing code.