# chain-of-thought
18 articles tagged with “chain-of-thought”
## Manipulating Reasoning Chains
Techniques for influencing an AI agent's chain-of-thought reasoning to steer its planning, decision-making, and tool selection toward attacker-desired outcomes.
## Chain-of-Thought Exploitation Techniques
Deep analysis of how reasoning traces in CoT models can be manipulated to produce adversarial outputs while maintaining coherent reasoning.
## Reasoning Model Attacks
An overview of security risks in reasoning-enabled LLMs: the new attack surfaces, exploit primitives, and defensive challenges that chain-of-thought models introduce.
## Reasoning Model Jailbreaks
How reasoning capabilities create novel jailbreak surfaces: chain-of-thought exploitation, scratchpad attacks, and why higher reasoning effort increases attack success.
## Steganographic Reasoning
Hidden communication channels within AI reasoning traces, where models encode information or coordinate behavior through patterns invisible to human overseers, including detection methods and implications for AI safety.
## Unfaithful Chain-of-Thought Reasoning
Analysis of unfaithful chain-of-thought reasoning in language models, where the visible reasoning trace does not accurately reflect the model's actual computational process, including detection methods, implications for oversight, and exploitation techniques.
## Reasoning Model Exploitation
Exploiting extended thinking and chain-of-thought reasoning in o1, Claude, and DeepSeek-R1 models.
## Red Teaming Reasoning Traces
Techniques for analyzing and exploiting visible reasoning traces in chain-of-thought models.
## Injection in Reasoning Models
Research into injection attacks specific to reasoning-augmented models that exploit chain-of-thought processes and self-reflection mechanisms.
## Lab: Reasoning Model Exploitation
Attack reasoning models like o1, o3, and DeepSeek-R1 by exploiting chain-of-thought manipulation, reasoning budget exhaustion, and thought-injection techniques.
## Reasoning Trace Exploitation in CoT Models
Exploit visible chain-of-thought reasoning traces in models like o1 and DeepSeek-R1 to manipulate outputs.
## Reasoning Model Chain-of-Thought Exploitation
Exploit extended thinking and chain-of-thought reasoning in models like o1 and DeepSeek-R1.
## Lab: Chain-of-Thought Exploitation
Exploit chain-of-thought reasoning to leak internal model reasoning, bypass safety filters, and manipulate decision processes.
## Reasoning Model Security Analysis
Security analysis of reasoning-augmented models (o1, DeepSeek-R1) focusing on chain-of-thought manipulation and reasoning-specific attack vectors.
## Reasoning Model Exploitation Walkthrough
Exploit extended thinking in reasoning models to inject false premises and manipulate conclusion generation.
## Thought Injection for Reasoning Models
Techniques for injecting malicious content into chain-of-thought reasoning traces of thinking models, exploiting the gap between reasoning and safety enforcement.
## Thought Injection in Reasoning Models Walkthrough
Inject adversarial thoughts into the reasoning chain of thinking models to manipulate final outputs.
## Chain-of-Thought Hijacking Walkthrough
A walkthrough of hijacking visible reasoning traces in CoT models to redirect conclusions and bypass safety checks.