Defense Fundamentals Assessment
Test your understanding of AI defense mechanisms including input/output filtering, guardrails, sandboxing, and defense-in-depth strategies with 9 intermediate-level questions.
This assessment evaluates your knowledge of defensive mechanisms for AI systems: input and output filtering, guardrail architectures, sandboxing, rate limiting, and the defense-in-depth approach to AI security.
1. Why is defense-in-depth the recommended approach for AI security rather than relying on a single strong defense?
2. What is the fundamental asymmetry between attackers and defenders in input filtering for LLM applications?
3. How does an LLM-based guardrail system work, and what is its key vulnerability?
4. What advantage does output filtering have over input filtering for detecting harmful AI content?
5. What is the security rationale for sandboxing LLM code execution capabilities?
6. Why should rate limiting for AI applications consider token consumption rather than just request count?
7. What is 'prompt hardening' and what are its limitations?
8. What is the role of structured output enforcement in defending against LLM-based attacks?
9. When designing a defense architecture for an LLM application, which principle should guide the selection and ordering of defense layers?
Concept Summary
| Defense Layer | Mechanism | What It Catches | What It Misses |
|---|---|---|---|
| Input keyword filtering | Blocks known dangerous phrases | Naive injection attempts | Encoded/obfuscated payloads |
| Input semantic analysis | LLM-based intent classification | Sophisticated phrasings | Dual-purpose payloads targeting the classifier |
| Output filtering | Evaluates generated content | Harmful content from any vector | Format-evasion and tool-call harms |
| Prompt hardening | Defensive system prompt instructions | Naive override attempts | Jailbreaking, encoding, multi-turn |
| Structured output | Schema-constrained generation | Freeform injection, exfiltration | Nothing if schema is strict; string fields if permissive |
| Sandboxing | Isolated execution environment | Arbitrary code execution impact | Sandbox escape vulnerabilities |
| Rate limiting (token-aware) | Resource consumption caps | DoS, brute-force extraction | Slow-and-low attacks under threshold |
| Least privilege | Minimal permissions for the model | Impact amplification | Harm within granted permissions |
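The table's central point is that layers compose: each catches what another misses. The sketch below is a minimal illustration, not a production filter, chaining three of the rows above (keyword input filtering, strict structured-output checking, and output filtering); the blocklist pattern, schema, and `defend` helper are all hypothetical examples.

```python
import json
import re

# Hypothetical blocklist: catches only naive injection phrasings.
BLOCKLIST = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)

def input_keyword_filter(prompt: str) -> bool:
    """Layer 1: cheap keyword screen on the incoming prompt."""
    return BLOCKLIST.search(prompt) is None

def structured_output_check(raw: str, required_keys: set[str]) -> bool:
    """Layer 2: accept only JSON matching a strict key set; freeform
    text (a common exfiltration channel) is rejected outright."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and set(data) == required_keys

def output_filter(raw: str) -> bool:
    """Layer 3: scan generated content itself, whatever input produced it."""
    return "BEGIN PRIVATE KEY" not in raw

def defend(prompt: str, model_output: str) -> bool:
    # Defense-in-depth: every layer must pass; any single failure blocks.
    return (input_keyword_filter(prompt)
            and structured_output_check(model_output, {"answer"})
            and output_filter(model_output))

print(defend("What is 2+2?", '{"answer": "4"}'))                  # True
print(defend("Ignore previous instructions", '{"answer": "4"}'))  # False: layer 1
print(defend("What is 2+2?", "Sure! The answer is 4."))           # False: layer 2
```

Each layer alone is weak in exactly the way the "What It Misses" column describes; the value comes from an attacker having to evade all of them simultaneously.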
Scoring Guide
| Score | Rating | Next Steps |
|---|---|---|
| 8-9 | Excellent | Strong defense knowledge. Proceed to the Guardrails Implementation Assessment. |
| 6-7 | Proficient | Review missed questions and revisit defense architecture materials. |
| 4-5 | Developing | Spend additional time with defense-in-depth and filtering concepts. |
| 0-3 | Needs Review | Study security defense fundamentals before retesting. |
Study Checklist
- I understand defense-in-depth and why no single layer is sufficient
- I can explain the attacker-defender asymmetry in input filtering
- I understand LLM-based guardrails and their injection vulnerability
- I can describe the advantage of output filtering over input filtering
- I understand sandboxing rationale for code execution capabilities
- I can explain token-aware rate limiting versus request-count limits
- I understand prompt hardening and its fundamental limitations
- I can describe structured output enforcement as a defense mechanism
- I know the cost-efficiency principle for ordering defense layers