# architecture
標記為「architecture」的 93 篇文章
Agent Architecture Security Assessment
Assessment covering agent design patterns, tool sandboxing, multi-agent trust, and MCP security.
LLM Architecture Security Assessment
Assessment on transformer internals, tokenization security, attention vulnerabilities, and model-level attacks.
Defense Architecture Assessment
Assessment on designing defense-in-depth architectures for different AI application types.
Capstone: Defense Architecture Design
Capstone exercise: design and validate a defense-in-depth architecture for an LLM-powered application.
Cloud AI Network Security Architecture
Network security architecture for cloud AI deployments including VPC design, endpoints, and traffic inspection.
CaMeL & Dual LLM Pattern
Architectural defense patterns that separate trusted and untrusted processing: Simon Willison's Dual LLM concept and Google DeepMind's CaMeL framework for defending tool-using AI agents against prompt injection.
Content Filtering Architecture
Designing content filtering systems for LLM applications covering input, output, and context filtering.
Defense-in-Depth for LLM Applications
Implementing layered defense architectures for production LLM applications.
Guardrails & Safety Layer Architecture
How guardrail systems are architecturally designed, including pre-processing, in-processing, and post-processing layers, common design patterns, and where each layer can be bypassed.
Input Validation Architecture for LLMs
Designing input validation pipelines that detect and neutralize prompt injection before reaching the model.
LLM Firewall Architecture Patterns
Design patterns for implementing LLM firewalls that inspect and filter both input and output traffic to LLM applications.
AI Logging Architecture
What to capture in AI system logs — prompts, completions, latency, tokens, tool calls — along with storage strategies, retention policies, and privacy considerations.
Defense-in-Depth for LLM Apps
Layered defense strategy for AI applications covering network, application, model, and output layers, how each layer contributes, and why single-layer defense always fails.
Context Isolation Patterns for LLM Apps
Architectural patterns for isolating untrusted content from trusted instructions in LLM applications.
LLM Firewall Architecture Design
Designing and implementing an LLM firewall that inspects, filters, and logs all model interactions.
Multi-Model Safety Validation Architecture
Using multiple models to cross-validate inputs and outputs for safety in a mutually-checking architecture.
Zero Trust Architecture for LLM Applications
Applying zero trust principles to LLM application architectures including identity, device, and data plane security.
Payload Generator Architecture
Designing and implementing automated payload generation systems that produce diverse and effective adversarial inputs for LLM testing.
Custom Harness Building Patterns
Design patterns for building custom AI red team harnesses: plugin architecture, result storage, async execution, multi-model support, converter pipelines, and production-grade orchestration.
RAG Architecture: How Retrieval Systems Work
End-to-end anatomy of a Retrieval-Augmented Generation pipeline — document ingestion, chunking, embedding, indexing, retrieval, context assembly, and generation — with attack surface analysis at each stage.
Foundations
Essential building blocks for AI red teaming, covering red team methodology, the AI landscape, how LLMs work, embeddings and vector systems, AI system architecture, and adversarial machine learning concepts.
LLM Trust Boundaries
Understanding trust boundaries in LLM applications: where data crosses privilege levels and how the lack of native trust enforcement creates attack surfaces.
Transformer Architecture for Attackers
Deep dive into the transformer architecture — attention, feed-forward layers, and residual connections — through the lens of which components are exploitable.
Common AI Deployment Patterns & Attack Surfaces
Analysis of deployment patterns — chatbots, copilots, autonomous agents, batch processing, and fine-tuned models — with their unique attack surfaces and security considerations.
AI System Architecture for Red Teamers
How AI systems are deployed in production — model API, prompt templates, orchestration, tools, memory, and guardrails — with attack surface analysis at each layer.
Transformer Architecture Basics for Security
Understanding transformer architecture fundamentals through a security lens: how attention, embeddings, and generation create exploitable properties.
Sparse Attention Mechanism Security
Security implications of sparse and efficient attention mechanisms used in modern frontier models.
Multi-Cloud ML Security
Security architecture for ML workloads spanning multiple cloud providers including identity federation, data sovereignty, and policy consistency.
Cross-Architecture Injection Transfer
Research into how injection techniques transfer across model architectures and what architectural properties determine transferability.
Cross-Architecture Transfer Lab
Develop attacks on one model architecture that reliably transfer to fundamentally different architectures.
LLM Internals & Exploit Primitives
An overview of large language model architecture from a security researcher's perspective, covering the key components that create exploitable attack surfaces.
Architecture Comparison for Safety Properties
Comparative analysis of how architectural choices (dense vs MoE, decoder-only vs encoder-decoder) affect safety properties and attack surfaces.
Gemini (Google) Overview
Architecture overview of Google's Gemini model family, including natively multimodal design, long context capabilities, Google ecosystem integration, and security-relevant features for red teaming.
GPT-4 / GPT-4o Overview
Architecture overview of OpenAI's GPT-4 and GPT-4o models, including rumored Mixture of Experts design, capabilities, API surface, and security-relevant features for red teaming.
Model Deep Dives
Why model-specific knowledge matters for AI red teaming, how different architectures create different attack surfaces, and a systematic methodology for profiling any new model.
Multimodal Fusion Layer Attacks
Attacking the fusion mechanisms that combine information from multiple modalities in multimodal models.
VLM Architecture & Vision-Language Alignment
Deep dive into VLM architectures including CLIP, SigLIP, and vision transformers. How image patches become tokens, alignment training, and where misalignment creates exploitable gaps.
Model Architecture Attack Vectors
How model architecture decisions create exploitable attack surfaces, including attention mechanisms, MoE routing, KV cache, and context window vulnerabilities.
MoE Routing Manipulation
Attacking Mixture-of-Experts routing: expert selection manipulation, load balancing exploitation, safety expert bypass, and routing-aware adversarial inputs.
Training Pipeline Security
Security of the full AI model training pipeline, covering pre-training attacks, fine-tuning and alignment manipulation, architecture-level vulnerabilities, and advanced training-time threats.
Defense-in-Depth Architecture for LLM Apps
Design and implement a complete defense-in-depth architecture for production LLM applications.
Dual LLM Architecture Setup
Step-by-step walkthrough for implementing a dual LLM pattern where one model generates responses and a second model validates them, covering architecture design, validator prompt engineering, latency optimization, and failure handling.
LLM Firewall Architecture Design (Defense Walkthrough)
Design and implement a comprehensive LLM firewall with input analysis, output filtering, and behavioral monitoring.
Secure Agent Architecture Design
Design a secure architecture for LLM agent systems with sandboxing, capability controls, and audit trails.
Secure RAG Architecture Walkthrough
Design and implement a secure RAG architecture with document sanitization, access controls, and output validation.
Zero Trust Architecture for LLM Apps
Implement zero trust principles in LLM application architecture with continuous verification and least privilege.
代理 Architecture 安全 評量
評量 covering agent design patterns, tool sandboxing, multi-agent trust, and MCP security.
LLM Architecture 安全 評量
評量 on transformer internals, tokenization security, attention vulnerabilities, and model-level attacks.
防禦 Architecture 評量
評量 on designing defense-in-depth architectures for different AI application types.
Capstone: 防禦 Architecture Design
Capstone exercise: design and validate a defense-in-depth architecture for an LLM-powered application.
Cloud AI Network 安全 Architecture
Network security architecture for cloud AI deployments including VPC design, endpoints, and traffic inspection.
CaMeL & Dual LLM Pattern
Architectural defense patterns that separate trusted and untrusted processing: Simon Willison's Dual LLM concept and Google DeepMind's CaMeL framework for defending tool-using AI agents against prompt injection.
Content Filtering Architecture
Designing content filtering systems for LLM applications covering input, output, and context filtering.
防禦-in-Depth for LLM Applications
Implementing layered defense architectures for production LLM applications.
防護機制與安全層架構
防護系統在架構上如何設計,包括前置處理、推論中處理與後置處理層、常見設計模式,以及各層可被繞過之處。
Input Validation Architecture for LLMs
Designing input validation pipelines that detect and neutralize prompt injection before reaching the model.
LLM Firewall Architecture Patterns
Design patterns for implementing LLM firewalls that inspect and filter both input and output traffic to LLM applications.
AI Logging Architecture
What to capture in AI system logs — prompts, completions, latency, tokens, tool calls — along with storage strategies, retention policies, and privacy considerations.
LLM 應用之縱深防禦
AI 應用之分層防禦策略,涵蓋網路、應用、模型與輸出層,各層的貢獻,以及為何單層防禦必然失敗。
Context Isolation Patterns for LLM Apps
Architectural patterns for isolating untrusted content from trusted instructions in LLM applications.
LLM Firewall Architecture Design
Designing and implementing an LLM firewall that inspects, filters, and logs all model interactions.
Multi-模型 Safety Validation Architecture
Using multiple models to cross-validate inputs and outputs for safety in a mutually-checking architecture.
Zero Trust Architecture for LLM Applications
Applying zero trust principles to LLM application architectures including identity, device, and data plane security.
Payload Generator Architecture
Designing and implementing automated payload generation systems that produce diverse and effective adversarial inputs for LLM testing.
Custom Harness Building Patterns
Design patterns for building custom AI red team harnesses: plugin architecture, result storage, async execution, multi-model support, converter pipelines, and production-grade orchestration.
RAG 架構:檢索系統如何運作
檢索增強生成管線之端到端解剖——文件攝入、分塊、embedding、索引、檢索、脈絡組裝與生成——含各階段之攻擊面分析。
基礎
AI 紅隊演練的核心建構區塊,涵蓋紅隊方法論、AI 景觀、大型語言模型如何運作、嵌入向量與向量系統、AI 系統架構,以及對抗性機器學習概念。
LLM Trust Boundaries
Understanding trust boundaries in LLM applications: where data crosses privilege levels and how the lack of native trust enforcement creates attack surfaces.
大型語言模型如何運作
從安全視角理解大型語言模型——涵蓋 transformer 架構、分詞、注意力、訓練流程與安全對齊機制。
Transformer Architecture for 攻擊ers
Deep dive into the transformer architecture — attention, feed-forward layers, and residual connections — through the lens of which components are exploitable.
常見 AI 部署模式與攻擊面
部署模式之分析——聊天機器人、copilot、自主代理、批次處理與微調模型——及其獨特之攻擊面與安全考量。
紅隊視角的 AI 系統架構
AI 系統如何於生產環境部署——模型 API、提示範本、編排、工具、記憶體與防護機制——並於每一層進行攻擊面分析。
Transformer Architecture Basics for 安全
Understanding transformer architecture fundamentals through a security lens: how attention, embeddings, and generation create exploitable properties.
Sparse Attention Mechanism 安全
安全 implications of sparse and efficient attention mechanisms used in modern frontier models.
Multi-Cloud ML 安全
安全 architecture for ML workloads spanning multiple cloud providers including identity federation, data sovereignty, and policy consistency.
Cross-Architecture Injection Transfer
Research into how injection techniques transfer across model architectures and what architectural properties determine transferability.
Cross-Architecture Transfer 實驗室
Develop attacks on one model architecture that reliably transfer to fundamentally different architectures.
大型語言模型內部與利用原語
從安全研究員視角出發的大型語言模型架構概覽,涵蓋建立可利用攻擊面的關鍵元件。
Architecture Comparison for Safety Properties
Comparative analysis of how architectural choices (dense vs MoE, decoder-only vs encoder-decoder) affect safety properties and attack surfaces.
Gemini(Google)概觀
Google Gemini 模型家族的架構概觀,包括原生多模態設計、長上下文能力、Google 生態整合,以及對紅隊具意義的安全相關特性。
GPT-4 / GPT-4o 概觀
OpenAI GPT-4 與 GPT-4o 模型的架構概觀,涵蓋傳聞中的 Mixture of Experts 設計、能力、API 介面,以及對紅隊具意義的安全相關特性。
模型深入探討
為何模型特定知識對 AI 紅隊演練重要、不同架構如何建立不同的攻擊面,以及為任何新模型剖析的系統化方法論。
Multimodal Fusion Layer 攻擊s
攻擊ing the fusion mechanisms that combine information from multiple modalities in multimodal models.
VLM 架構與視覺—語言對齊
深入探討 VLM 架構,包括 CLIP、SigLIP 與 vision transformers。圖像 patch 如何變成 token、對齊訓練,以及錯位(misalignment)如何製造可利用之缺口。
架構層級攻擊
鎖定模型架構最佳化的攻擊——涵蓋量化利用、蒸餾攻擊、KV 快取攻擊、MoE 路由操控與上下文視窗利用。
MoE 路由操弄
攻擊 Mixture-of-Experts 路由:專家選擇操弄、負載平衡利用、安全專家繞過,以及路由感知對抗輸入。
訓練管線安全
完整 AI 模型訓練管線的安全,涵蓋預訓練攻擊、微調與對齊操控、架構層級漏洞與進階訓練期威脅。
防禦-in-Depth Architecture for LLM Apps
Design and implement a complete defense-in-depth architecture for production LLM applications.
雙 LLM 架構設置
實作雙 LLM 模式之逐步流程——一個模型產生回應、第二個模型驗證之,涵蓋架構設計、驗證者提示工程、延遲最佳化與失敗處理。
LLM Firewall Architecture Design (防禦 導覽)
Design and implement a comprehensive LLM firewall with input analysis, output filtering, and behavioral monitoring.
Secure 代理 Architecture Design
Design a secure architecture for LLM agent systems with sandboxing, capability controls, and audit trails.
Secure RAG Architecture 導覽
Design and implement a secure RAG architecture with document sanitization, access controls, and output validation.
Zero Trust Architecture for LLM Apps
Implement zero trust principles in LLM application architecture with continuous verification and least privilege.