# models
46 articles tagged "models"
The AI Landscape: A Red Teamer's Map
Major models, providers, deployment patterns, and the AI stack from hardware to application — orienting red teamers in the current AI ecosystem.
CTF: Supply Chain Saboteur
Identify and exploit supply chain vulnerabilities in a model deployment pipeline. Find poisoned models, exploit malicious packages, and compromise the ML infrastructure.
Architecture Comparison for Safety Properties
Comparative analysis of how architectural choices (dense vs MoE, decoder-only vs encoder-decoder) affect safety properties and attack surfaces.
Command R Security Analysis
Security assessment of Cohere's Command R models with focus on RAG-specific attack surfaces and retrieval-augmented generation vulnerabilities.
Distillation Security Analysis
Security implications of knowledge distillation including backdoor transfer, capability extraction, and safety property degradation in student models.
Gemma Security Deep Dive
Comprehensive security analysis of Google's Gemma open-weight models including safety training effectiveness and fine-tuning attack surfaces.
Mistral NeMo Security Analysis
Security assessment of the Mistral-NVIDIA NeMo collaboration models examining enterprise deployment risks and instruction-following vulnerabilities.
MoE Routing Exploitation (Model Deep Dives)
Detailed analysis of how Mixture-of-Experts routing can be manipulated to bypass safety-critical expert paths and trigger unsafe generation.
Multimodal Model Security Comparison
Comparing security properties across multimodal models (GPT-4V, Claude, Gemini) with focus on cross-modal injection and vision-language attacks.
Open Weight vs API Security Comparison
Comparative analysis of security properties between open-weight deployments and API-based access, including unique attack surfaces for each.
Phi Model Security Analysis
Security analysis of Microsoft's Phi family of small language models, examining how reduced scale affects safety properties and attack surfaces.
Phi Models Security Analysis
Security analysis of Microsoft's Phi small language model family including safety vs capability tradeoffs.
Pruning Impact on Safety
How structured and unstructured pruning affects model safety properties, and techniques for exploiting pruning artifacts to bypass safety training.
Quantization Effects on Security Properties
Systematic study of how different quantization methods (GPTQ, AWQ, GGUF, SqueezeLLM) affect model safety properties and vulnerability to attacks.
Qwen Architecture Security
In-depth security assessment of Alibaba's Qwen model family including architecture-specific vulnerabilities and cross-language attack surfaces.
Qwen Models Security Analysis
Security analysis of Alibaba's Qwen model family including multilingual safety considerations.
Reasoning Model Security Analysis
Security analysis of reasoning-augmented models (o1, DeepSeek-R1) focusing on chain-of-thought manipulation and reasoning-specific attack vectors.
Tokenizer Attack Surface Analysis
Deep analysis of tokenizer vulnerabilities including token boundary exploitation, special token manipulation, and cross-tokenizer attacks.
Yi Model Security Assessment
Security analysis of 01.AI's Yi models focusing on bilingual capabilities, training data implications, and comparative safety properties.
Model API Comparison Table
Side-by-side comparison of major LLM API features, security controls, and rate limits for OpenAI, Anthropic, Google, and other providers.
Testing AI21 Labs Models
Red team testing guide for AI21 Labs Jamba models including long context and efficiency features.
Testing Cohere Models
Red team testing guide for Cohere's Command-R models including RAG and tool use features.
Testing Mistral AI Models
Complete red team testing guide for Mistral AI models including Mixtral MoE architecture and chat endpoints.