# architecture
46 articles tagged with “architecture”
Agent Architecture Security Assessment
Assessment covering agent design patterns, tool sandboxing, multi-agent trust, and MCP security.
LLM Architecture Security Assessment
Assessment on transformer internals, tokenization security, attention vulnerabilities, and model-level attacks.
Defense Architecture Assessment
Assessment on designing defense-in-depth architectures for different AI application types.
Capstone: Defense Architecture Design
Capstone exercise: design and validate a defense-in-depth architecture for an LLM-powered application.
Cloud AI Network Security Architecture
Network security architecture for cloud AI deployments including VPC design, endpoints, and traffic inspection.
CaMeL & Dual LLM Pattern
Architectural defense patterns that separate trusted and untrusted processing: Simon Willison's Dual LLM concept and Google DeepMind's CaMeL framework for defending tool-using AI agents against prompt injection.
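The separation these patterns describe can be sketched in a few lines of Python. All names here are hypothetical stubs (no real model calls): the privileged planner only ever sees symbolic variable names, while the quarantined model reads untrusted text and returns plain data.

```python
# Minimal sketch of the Dual LLM pattern (hypothetical stubs, no real model calls).
# The privileged LLM plans actions but never sees untrusted text; the quarantined
# LLM reads untrusted text but can only return data into opaque variables.

def quarantined_llm(untrusted_text: str) -> str:
    """Stub: extracts an email address from untrusted text. In a real system
    this would be a model call whose output is treated as data, never as
    instructions."""
    for token in untrusted_text.split():
        if "@" in token:
            return token.strip(".,")
    return ""

def privileged_llm(task: str, variable_names: list) -> str:
    """Stub planner: it sees only variable *names*, never their contents,
    so injected text inside $EMAIL can never steer this model."""
    return f"send_email(to={variable_names[0]}, subject='Re: {task}')"

untrusted = "Please reply to alice@example.com about the meeting."
variables = {"$EMAIL": quarantined_llm(untrusted)}
plan = privileged_llm("meeting notes", list(variables))

# Symbol substitution happens outside any model, in plain code:
for name, value in variables.items():
    plan = plan.replace(name, value)
print(plan)
```

The key property is that no string from the untrusted channel is ever concatenated into the privileged model's prompt; substitution happens in ordinary code after planning.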
Content Filtering Architecture
Designing content filtering systems for LLM applications covering input, output, and context filtering.
Defense-in-Depth for LLM Applications
Implementing layered defense architectures for production LLM applications.
Guardrails & Safety Layer Architecture
How guardrail systems are architecturally designed, including pre-processing, in-processing, and post-processing layers, common design patterns, and where each layer can be bypassed.
Input Validation Architecture for LLMs
Designing input validation pipelines that detect and neutralize prompt injection before reaching the model.
LLM Firewall Architecture Patterns
Design patterns for implementing LLM firewalls that inspect and filter both input and output traffic to LLM applications.
AI Logging Architecture
What to capture in AI system logs — prompts, completions, latency, tokens, tool calls — along with storage strategies, retention policies, and privacy considerations.
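A minimal log record covering the fields listed above might look like the sketch below. Field names are illustrative, not a standard schema.

```python
# Sketch of an AI interaction log record: prompts, completions, latency,
# token counts, and tool calls, serialized as one JSON line per interaction.
import json
import time
import uuid

def log_interaction(prompt, completion, tool_calls, started_at,
                    tokens_in, tokens_out):
    record = {
        "id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "latency_ms": round((time.time() - started_at) * 1000, 1),
        "prompt": prompt,          # consider redacting PII before storage
        "completion": completion,
        "tokens": {"input": tokens_in, "output": tokens_out},
        "tool_calls": tool_calls,
    }
    return json.dumps(record)

line = log_interaction("hi", "hello!",
                       [{"name": "search", "args": {"q": "x"}}],
                       started_at=time.time() - 0.12,
                       tokens_in=3, tokens_out=4)
print(line)
```

One JSON object per line keeps the log greppable and easy to ship to standard pipelines; retention and redaction policy then operate on well-named fields rather than free text.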
Defense-in-Depth for LLM Apps
Layered defense strategy for AI applications covering network, application, model, and output layers, how each layer contributes, and why single-layer defense always fails.
Context Isolation Patterns for LLM Apps
Architectural patterns for isolating untrusted content from trusted instructions in LLM applications.
LLM Firewall Architecture Design
Designing and implementing an LLM firewall that inspects, filters, and logs all model interactions.
Multi-Model Safety Validation Architecture
Using multiple models to cross-validate inputs and outputs for safety in a mutually checking architecture.

Zero Trust Architecture for LLM Applications
Applying zero trust principles to LLM application architectures including identity, device, and data plane security.
Payload Generator Architecture
Designing and implementing automated payload generation systems that produce diverse and effective adversarial inputs for LLM testing.
Custom Harness Building Patterns
Design patterns for building custom AI red team harnesses: plugin architecture, result storage, async execution, multi-model support, converter pipelines, and production-grade orchestration.
RAG Architecture: How Retrieval Systems Work
End-to-end anatomy of a Retrieval-Augmented Generation pipeline — document ingestion, chunking, embedding, indexing, retrieval, context assembly, and generation — with attack surface analysis at each stage.
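The stages named above can be traced in a toy pipeline. This is illustrative only: real systems use learned embeddings and a vector index, and the bag-of-words scoring here is an assumption made to keep the sketch dependency-free.

```python
# Toy end-to-end RAG pipeline: ingest -> chunk -> embed -> index ->
# retrieve -> assemble context. Embedding is a bag-of-words stand-in.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Ingestion + chunking + indexing (each doc is already one chunk here)
docs = ["the firewall inspects model traffic",
        "retrieval augments generation with documents",
        "attention layers weigh token relationships"]
index = [(chunk, embed(chunk)) for chunk in docs]

# Retrieval + context assembly
query = "how does retrieval help generation"
qvec = embed(query)
top = max(index, key=lambda item: cosine(qvec, item[1]))
context = f"Context: {top[0]}\nQuestion: {query}"
print(context)
```

Every stage in this flow is an attack surface: poisoned documents enter at ingestion, adversarial strings can dominate similarity scoring, and whatever is retrieved lands directly in the prompt at context assembly.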
Foundations
Essential building blocks for AI red teaming, covering red team methodology, the AI landscape, how LLMs work, embeddings and vector systems, AI system architecture, and adversarial machine learning concepts.
LLM Trust Boundaries
Understanding trust boundaries in LLM applications: where data crosses privilege levels and how the lack of native trust enforcement creates attack surfaces.
Transformer Architecture for Attackers
Deep dive into the transformer architecture — attention, feed-forward layers, and residual connections — through the lens of which components are exploitable.
Common AI Deployment Patterns & Attack Surfaces
Analysis of deployment patterns — chatbots, copilots, autonomous agents, batch processing, and fine-tuned models — with their unique attack surfaces and security considerations.
AI System Architecture for Red Teamers
How AI systems are deployed in production — model API, prompt templates, orchestration, tools, memory, and guardrails — with attack surface analysis at each layer.
Transformer Architecture Basics for Security
Understanding transformer architecture fundamentals through a security lens: how attention, embeddings, and generation create exploitable properties.
Sparse Attention Mechanism Security
Security implications of sparse and efficient attention mechanisms used in modern frontier models.
Multi-Cloud ML Security
Security architecture for ML workloads spanning multiple cloud providers including identity federation, data sovereignty, and policy consistency.
Cross-Architecture Injection Transfer
Research into how injection techniques transfer across model architectures and what architectural properties determine transferability.
Cross-Architecture Transfer Lab
Develop attacks on one model architecture that reliably transfer to fundamentally different architectures.
LLM Internals & Exploit Primitives
An overview of large language model architecture from a security researcher's perspective, covering the key components that create exploitable attack surfaces.
Architecture Comparison for Safety Properties
Comparative analysis of how architectural choices (dense vs MoE, decoder-only vs encoder-decoder) affect safety properties and attack surfaces.
Gemini (Google) Overview
Architecture overview of Google's Gemini model family, including natively multimodal design, long context capabilities, Google ecosystem integration, and security-relevant features for red teaming.
GPT-4 / GPT-4o Overview
Architecture overview of OpenAI's GPT-4 and GPT-4o models, including rumored Mixture of Experts design, capabilities, API surface, and security-relevant features for red teaming.
Model Deep Dives
Why model-specific knowledge matters for AI red teaming, how different architectures create different attack surfaces, and a systematic methodology for profiling any new model.
Multimodal Fusion Layer Attacks
Attacking the fusion mechanisms that combine information from multiple modalities in multimodal models.
VLM Architecture & Vision-Language Alignment
Deep dive into VLM architectures including CLIP, SigLIP, and vision transformers. How image patches become tokens, alignment training, and where misalignment creates exploitable gaps.
Model Architecture Attack Vectors
How model architecture decisions create exploitable attack surfaces, including attention mechanisms, MoE routing, KV cache, and context window vulnerabilities.
MoE Routing Manipulation
Attacking Mixture-of-Experts routing: expert selection manipulation, load balancing exploitation, safety expert bypass, and routing-aware adversarial inputs.
Training Pipeline Security
Security of the full AI model training pipeline, covering pre-training attacks, fine-tuning and alignment manipulation, architecture-level vulnerabilities, and advanced training-time threats.
Defense-in-Depth Architecture for LLM Apps
Design and implement a complete defense-in-depth architecture for production LLM applications.
Dual LLM Architecture Setup
Step-by-step walkthrough for implementing a dual LLM pattern where one model generates responses and a second model validates them, covering architecture design, validator prompt engineering, latency optimization, and failure handling.
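The generate-then-validate flow described here can be sketched as follows. Both model calls are hypothetical stubs; a real validator would be a second model with a narrowly scoped prompt, and the banned-term check stands in for its verdict.

```python
# Sketch of a generate-then-validate pipeline with fail-closed handling.

def generator_model(prompt: str) -> str:
    return f"Here is an answer to: {prompt}"  # stub for the primary LLM

def validator_model(response: str) -> bool:
    # Stub policy check; a real validator would be a second model asked to
    # return only SAFE or UNSAFE for the candidate response.
    banned = ("password", "api key")
    return not any(term in response.lower() for term in banned)

def answer(prompt: str, max_retries: int = 1) -> str:
    for _ in range(max_retries + 1):
        candidate = generator_model(prompt)
        if validator_model(candidate):
            return candidate
    return "I can't help with that."  # fail closed on repeated rejection

print(answer("summarize the meeting"))
```

Failing closed after exhausted retries is the important design choice: if the validator is down or keeps rejecting, the user gets a refusal rather than an unvalidated response.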
LLM Firewall Architecture Design (Defense Walkthrough)
Design and implement a comprehensive LLM firewall with input analysis, output filtering, and behavioral monitoring.
Secure Agent Architecture Design
Design a secure architecture for LLM agent systems with sandboxing, capability controls, and audit trails.
Secure RAG Architecture Walkthrough
Design and implement a secure RAG architecture with document sanitization, access controls, and output validation.
Zero Trust Architecture for LLM Apps
Implement zero trust principles in LLM application architecture with continuous verification and least privilege.