# models
23 articles tagged with “models”
**The AI Landscape: A Red Teamer's Map**
Major models, providers, deployment patterns, and the AI stack from hardware to application — orienting red teamers in the current AI ecosystem.

**CTF: Supply Chain Saboteur**
Identify and exploit supply chain vulnerabilities in a model deployment pipeline. Find poisoned models, exploit malicious packages, and compromise the ML infrastructure.

**Architecture Comparison for Safety Properties**
Comparative analysis of how architectural choices (dense vs MoE, decoder-only vs encoder-decoder) affect safety properties and attack surfaces.

**Command R Security Analysis**
Security assessment of Cohere's Command R models with focus on RAG-specific attack surfaces and retrieval-augmented generation vulnerabilities.

**Distillation Security Analysis**
Security implications of knowledge distillation, including backdoor transfer, capability extraction, and safety property degradation in student models.

**Gemma Security Deep Dive**
Comprehensive security analysis of Google's Gemma open-weight models, including safety training effectiveness and fine-tuning attack surfaces.

**Mistral NeMo Security Analysis**
Security assessment of the Mistral-NVIDIA NeMo collaboration models, examining enterprise deployment risks and instruction-following vulnerabilities.

**MoE Routing Exploitation (Model Deep Dives)**
Detailed analysis of how Mixture-of-Experts routing can be manipulated to bypass safety-critical expert paths and trigger unsafe generation.

**Multimodal Model Security Comparison**
Comparing security properties across multimodal models (GPT-4V, Claude, Gemini), with focus on cross-modal injection and vision-language attacks.

**Open Weight vs API Security Comparison**
Comparative analysis of security properties between open-weight deployments and API-based access, including the unique attack surfaces of each.

**Phi Model Security Analysis**
Security analysis of Microsoft's Phi family of small language models, examining how reduced scale affects safety properties and attack surfaces.

**Phi Models Security Analysis**
Security analysis of Microsoft's Phi small language model family, including safety vs capability tradeoffs.

**Pruning Impact on Safety**
How structured and unstructured pruning affects model safety properties, and techniques for exploiting pruning artifacts to bypass safety training.

**Quantization Effects on Security Properties**
Systematic study of how different quantization methods (GPTQ, AWQ, GGUF, SqueezeLLM) affect model safety properties and vulnerability to attacks.

**Qwen Architecture Security**
In-depth security assessment of Alibaba's Qwen model family, including architecture-specific vulnerabilities and cross-language attack surfaces.

**Qwen Models Security Analysis**
Security analysis of Alibaba's Qwen model family, including multilingual safety considerations.

**Reasoning Model Security Analysis**
Security analysis of reasoning-augmented models (o1, DeepSeek-R1), focusing on chain-of-thought manipulation and reasoning-specific attack vectors.

**Tokenizer Attack Surface Analysis**
Deep analysis of tokenizer vulnerabilities, including token boundary exploitation, special token manipulation, and cross-tokenizer attacks.

**Yi Model Security Assessment**
Security analysis of 01.AI's Yi models, focusing on bilingual capabilities, training data implications, and comparative safety properties.

**Model API Comparison Table**
Side-by-side comparison of major LLM API features, security controls, and rate limits for OpenAI, Anthropic, Google, and other providers.

**Testing AI21 Labs Models**
Red team testing guide for AI21 Labs Jamba models, including long-context and efficiency features.

**Testing Cohere Models**
Red team testing guide for Cohere's Command-R models, including RAG and tool use features.

**Testing Mistral AI Models**
Complete red team testing guide for Mistral AI models, including the Mixtral MoE architecture and chat endpoints.