Blog
Research, announcements, and insights from the AI red teaming community.
Finding CVEs with AI Red Teaming: A Research-Backed Guide
How AI red teaming techniques are uncovering real-world CVEs in SQLite, OpenSSL, the Linux kernel, and UEFI bootloaders — with references to the research behind each discovery.
250 Poisoned Documents Is All It Takes: Anthropic's Data Poisoning Breakthrough
Anthropic, the UK AI Security Institute, and the Alan Turing Institute demonstrated that injecting just 250 malicious documents into pretraining data can backdoor LLMs ranging from 600M to 13B parameters. Here's what this means for model security.
Hacking the AI Hackers: When Security Tools Become the Vulnerability
New research demonstrates a 100% prompt injection success rate against AI-powered security tools. If your SOC uses AI for threat detection, your AI can be turned against you. Here's what the researchers found and how to defend.
LLM Jailbreaking in 2026: 97% Success Rates, Autonomous Attacks, and the Arms Race That Isn't Working
Research published in Nature Communications confirms that AI reasoning models can autonomously jailbreak other LLMs with 97% success. JBFuzz achieves a 99% success rate in 60 seconds. Here's what the latest 2026 research reveals about the state of AI safety — and why current defenses are failing.
Red Teaming the AI SOC: Why Your Autonomous Security Operations Need an Adversary
As organizations rush to deploy agentic AI in their SOCs, red teamers are finding that the defenders' own AI agents are now the attack surface. 520 tool misuse incidents, memory poisoning persistence, and a 97% jailbreak success rate — here's how to red team the AI-powered SOC before attackers do.
OpenClaw: Anatomy of 2026's First Major AI Agent Security Crisis
How OpenClaw's meteoric rise to become GitHub's most-starred project exposed critical agentic AI vulnerabilities — from ClawJacked WebSocket hijacking (CVE-2026-25253) to malicious skills distributing macOS stealers. What red teamers and defenders need to know.
What's New in AI Security — March 2026
Monthly roundup of the most important AI security developments, tool updates, research highlights, and emerging attack vectors for March 2026.
The Prompt Injection Landscape in 2026
How prompt injection attacks have evolved from simple instruction overrides to sophisticated multi-stage exploitation chains.
The Complete Guide to Agentic AI Security
A comprehensive guide to securing agentic AI systems — covering tool use risks, multi-agent architectures, MCP security, memory poisoning, and practical defense strategies.
The AI Defense Landscape in 2026
A survey of the current state of AI defense mechanisms, from prompt shields to LLM judges, and where the arms race is heading.
Welcome to redteams.ai
Introducing the AI red teaming knowledge base — why we built it and what's ahead.
Red Teaming Cloud AI Services: A Practical Guide
Practical guide to red teaming AI services on AWS, Azure, and GCP — covering shared responsibility boundaries, service-specific attack surfaces, and cloud-native security controls.
Building a Production AI Defense Stack
How to build a layered AI defense stack for production deployments — covering input filtering, output monitoring, guardrails, anomaly detection, and incident response integration.
Building Your AI Red Team Lab
A practical guide to setting up a local AI red teaming lab with open-source models, testing frameworks, and realistic target applications.
Career Guide: Becoming an AI Red Teamer
A comprehensive career guide for aspiring AI red teamers — covering required skills, learning paths, certifications, job roles, and how to break into the field from different backgrounds.
AI Security Incidents: 2025–2026 Review
A roundup of notable AI security incidents from 2025 into early 2026, covering prompt injection in production, agent exploitation, and emerging attack classes.
Top AI Vulnerabilities of 2026
Analysis of the most impactful AI vulnerabilities discovered and exploited in 2026 — from MCP tool shadowing to multi-agent injection chains and reasoning model exploitation.
MCP Security: The New Attack Surface
Deep dive into Model Context Protocol security — analyzing tool registration attacks, transport layer risks, cross-server exploitation, and practical hardening strategies.
LLM Forensics: A Primer for Incident Responders
A primer on forensic investigation of LLM security incidents — covering evidence collection, log analysis, attack reconstruction, model behavior analysis, and forensic tooling.
Reasoning Model Security in 2026
How chain-of-thought reasoning models like o1, o3, and DeepSeek-R1 change the AI security landscape — new attack surfaces and new defensive opportunities.
Lessons Learned from Fine-Tuning Security Research
Key lessons from researching fine-tuning security — covering alignment erosion, backdoor injection, data poisoning, safety evaluation gaps, and defensive strategies for fine-tuning pipelines.
The Multimodal Attack Landscape
As AI systems process images, audio, and video alongside text, the attack surface has expanded dramatically. Here's what red teamers need to know.
The State of AI Red Teaming in 2025
A survey of the AI red teaming landscape in early 2025 — emerging attack vectors, industry adoption, tooling maturity, and what to expect as the field evolves.