The AI Landscape: A Red Teamer's Map
Major models, providers, deployment patterns, and the AI stack from hardware to application — orienting red teamers in the current AI ecosystem.
Why the Landscape Matters
Before you can attack an AI system, you need to understand what you are looking at. The AI ecosystem is layered, fragmented, and evolving rapidly. A "chatbot" might be running GPT-4 via Azure OpenAI Service, fine-tuned with proprietary data, wrapped in a LangChain orchestration layer, behind a Next.js frontend. Each layer introduces its own attack surface, and understanding the landscape helps you identify which layers are present and which are worth targeting.
The AI Stack
The modern AI stack has five distinct layers, each with its own security considerations:
┌─────────────────────────────────────────┐
│ Application Layer │
│ (UI, business logic, user management) │
├─────────────────────────────────────────┤
│ Orchestration Layer │
│ (LangChain, LlamaIndex, custom code) │
├─────────────────────────────────────────┤
│ Model Layer │
│ (GPT-4, Claude, Llama, Gemini) │
├─────────────────────────────────────────┤
│ Infrastructure Layer │
│ (APIs, hosting, scaling, monitoring) │
├─────────────────────────────────────────┤
│ Hardware Layer │
│ (GPUs, TPUs, cloud compute) │
└─────────────────────────────────────────┘
Hardware Layer
The foundation of AI computation. Most AI workloads run on specialized hardware:
| Hardware | Provider | Significance |
|---|---|---|
| NVIDIA H100/H200 GPUs | NVIDIA | Dominant for training and inference; supply constraints shape the industry |
| Google TPUs | Purpose-built for TensorFlow and JAX workloads; used for Gemini | |
| AWS Trainium/Inferentia | Amazon | Custom chips for AWS-hosted AI workloads |
| Apple Neural Engine | Apple | On-device inference for Apple products |
| Intel Gaudi | Intel | Alternative accelerator gaining traction in enterprise |
Red team relevance: Hardware layer attacks are rare but high-impact. Side-channel attacks on shared GPU infrastructure, supply chain attacks on accelerator firmware, and resource exhaustion attacks that exploit hardware constraints are all possible but require specialized expertise.
Infrastructure Layer
Where models are hosted and served. This includes cloud providers, inference engines, and supporting services.
| Platform | Provider | Key Services |
|---|---|---|
| Azure AI | Microsoft | OpenAI model hosting, Azure ML, Cognitive Services |
| AWS Bedrock | Amazon | Multi-model API (Claude, Llama, Mistral, Titan) |
| Google Cloud AI | Vertex AI, Gemini API, Model Garden | |
| Together AI | Together | Open-source model hosting and inference |
| Replicate | Replicate | Easy API deployment for open-source models |
| Hugging Face | Hugging Face | Model hub, Inference Endpoints, Spaces |
Red team relevance: API authentication flaws, rate limit bypass, model endpoint enumeration, inference engine vulnerabilities, and container escape in shared hosting environments.
Model Layer
The AI models themselves — the core intelligence of any AI system.
| Category | Leading Models | Provider |
|---|---|---|
| Frontier closed | GPT-4o, o1, o3 | OpenAI |
| Frontier closed | Claude 3.5 Sonnet, Claude 3 Opus | Anthropic |
| Frontier closed | Gemini 1.5 Pro, Gemini Ultra | |
| Frontier open | Llama 3.1 405B, Llama 3.2 | Meta |
| Open efficient | Mistral Large, Mixtral | Mistral AI |
| Open efficient | Qwen 2.5, DeepSeek-V3 | Alibaba, DeepSeek |
| Specialized | Codex, CodeLlama, StarCoder | Various |
| Embedding | text-embedding-3, voyage-3, BGE | Various |
Red team relevance: Jailbreaking, prompt injection, model extraction, training data extraction, adversarial examples, alignment bypass. See Model Types & Attack Surfaces.
Orchestration Layer
Software that connects models to data, tools, and each other. This is where most AI application complexity lives.
| Framework | Purpose | Adoption |
|---|---|---|
| LangChain | General-purpose LLM application framework | Very high |
| LlamaIndex | Data connection and retrieval for LLMs | High |
| Semantic Kernel | Microsoft's SDK for AI orchestration | Growing |
| CrewAI | Multi-agent orchestration | Growing |
| AutoGen | Microsoft's multi-agent framework | Growing |
| Haystack | End-to-end NLP/LLM framework | Moderate |
Red team relevance: Prompt injection through RAG pipelines, tool abuse, agent manipulation, chain-of-thought hijacking, and memory poisoning. The orchestration layer is often the weakest link because it was built for functionality, not security.
Application Layer
The user-facing layer — the product that humans actually interact with.
- Chat interfaces: Customer support, coding assistants, general-purpose chatbots
- APIs: Programmatic access for developers building on top of AI services
- Embedded AI: AI features within existing products (email composition, document summarization)
- Autonomous agents: AI systems that take actions on behalf of users (booking, research, coding)
Red team relevance: Traditional web application vulnerabilities (XSS, CSRF, authentication flaws), plus AI-specific risks like user session manipulation and cross-user data leakage through shared model state.
The Provider Ecosystem
Understanding who builds, hosts, and deploys AI systems reveals the trust boundaries that red teamers can exploit.
Model Providers
These organizations train and provide the base models:
- OpenAI: GPT family, DALL-E, Whisper. Distributed through their own API and Azure.
- Anthropic: Claude family. Distributed through their own API, AWS Bedrock, and Google Cloud.
- Google: Gemini family, PaLM. Distributed through Google Cloud and direct API.
- Meta: Llama family. Open-weight models distributed through Hugging Face and various hosts.
- Mistral AI: Mistral and Mixtral models. Distributed through their own API, Azure, and AWS.
- Cohere: Command family. Focused on enterprise search and RAG applications.
Platform Providers
These organizations host and serve models, often with additional services:
- Cloud hyperscalers (AWS, Azure, GCP): Offer managed AI services with enterprise security features
- Specialized inference providers (Together, Replicate, Anyscale): Optimize for inference speed and cost
- Model registries (Hugging Face): Host and distribute models and datasets
Application Builders
These organizations build products using AI models and platforms:
- Vertical SaaS: AI features embedded in domain-specific software (legal, medical, finance)
- Horizontal tools: General-purpose AI assistants, coding tools, content creation tools
- Enterprise integrations: AI added to existing enterprise workflows (CRM, ERP, collaboration)
Current Trends Affecting Security
Several ongoing trends in the AI landscape directly affect red teaming priorities:
Trend 1: Multi-Modal Expansion
Models increasingly handle text, images, audio, video, and code simultaneously. Each modality introduces new attack surface — adversarial images, audio prompt injection, and cross-modal attacks that combine modalities to bypass single-modal defenses.
Trend 2: Agentic AI
Models are gaining the ability to use tools, browse the web, execute code, and take actions in the real world. Agent systems dramatically expand the blast radius of successful attacks from "the model said something bad" to "the model did something bad."
Trend 3: Model Commoditization
Open-weight models are approaching closed-model performance, making it easier for anyone to deploy powerful AI. This democratizes both beneficial use and adversarial capability.
Trend 4: Specialization Through Fine-Tuning
Organizations are increasingly fine-tuning base models for specific domains. Fine-tuning can weaken safety training, and fine-tuned models often have less robust alignment than their base versions.
Trend 5: Edge Deployment
Models running locally on phones, laptops, and IoT devices are outside the provider's control. Edge-deployed models cannot rely on server-side guardrails, rate limiting, or monitoring.
Related Topics
- Model Types & Attack Surfaces — security implications of different model architectures
- Deployment Patterns — how deployment choices affect security
- The AI API Ecosystem — authentication and access patterns
- Open vs Closed Models — security tradeoffs of model availability
- AI System Architecture — deeper dive into system-level architecture
References
- "State of AI Report 2025" - Benaich & Hogarth (2025) - Annual comprehensive analysis of the AI industry landscape, key players, and trends
- "The AI Index Report" - Stanford HAI (2025) - Data-driven analysis of AI progress, adoption, and policy across the global ecosystem
- "MLOps: Continuous Delivery for Machine Learning" - Google (2024) - Overview of the infrastructure and operational practices supporting production AI systems
- "OWASP AI Security and Privacy Guide" - OWASP (2025) - Security considerations organized by AI system component and lifecycle stage
Why is understanding the full AI stack important for red teaming rather than focusing only on the model?