# chain-of-thought

18 articles tagged with “chain-of-thought”

Manipulating reasoning chains

Techniques for influencing an AI agent's chain-of-thought reasoning to steer its planning, decision-making, and tool selection toward outcomes the attacker wants.

agents, chain-of-thought, reasoning, manipulation, advanced

Advanced

Chain-of-thought exploitation techniques

In-depth analysis of how reasoning traces in CoT models can be manipulated to produce adversarial outputs while preserving coherent reasoning.

frontier, chain-of-thought, reasoning

Advanced

Attacks on reasoning models

Overview of security risks in reasoning-capable LLMs: how chain-of-thought models introduce new attack surfaces, exploit primitives, and defensive challenges.

reasoning, o1, chain-of-thought, attacks

Expert

Jailbreaks of reasoning models

How reasoning capabilities create new jailbreak surfaces: chain-of-thought exploitation, scratchpad attacks, and why higher reasoning effort increases attack success.

reasoning, jailbreak, chain-of-thought, o1, o3, adaptive-attacks, research

Advanced

Steganographic reasoning

Hidden communication channels within AI reasoning traces, where models encode information or coordinate behavior through patterns invisible to human overseers, including detection methods and implications for AI safety.

steganography, reasoning, hidden-communication, chain-of-thought, ai-safety, oversight

Expert

Unfaithful chain-of-thought reasoning

Analysis of unfaithful chain-of-thought reasoning in language models, where the visible reasoning trace does not accurately reflect the model's actual computation, including detection methods, implications for oversight, and exploitation techniques.

unfaithful-reasoning, chain-of-thought, reasoning, interpretability, oversight, ai-safety

Advanced

Exploitation of reasoning models

Exploiting extended thinking and chain-of-thought reasoning in o1, Claude, and DeepSeek-R1 models.

frontier-research, reasoning, exploitation, chain-of-thought

Expert

Red teaming reasoning traces

Techniques for analyzing and exploiting visible reasoning traces in chain-of-thought models.

frontier-research, reasoning-traces, red-teaming, chain-of-thought

Advanced

Injection in reasoning models

Research on injection attacks specific to reasoning-augmented models that abuse chain-of-thought processes and self-reflection mechanisms.

research, reasoning, chain-of-thought, injection

Advanced

Lab: abusing reasoning models

Attack reasoning models like o1, o3, and DeepSeek-R1 through chain-of-thought manipulation, reasoning-budget exhaustion, and thought injection.

lab, reasoning, o1, chain-of-thought

Advanced

Abusing reasoning traces in CoT models

Exploit visible chain-of-thought reasoning traces in models like o1 and DeepSeek-R1 to manipulate outputs.

labs, reasoning-trace, chain-of-thought, exploitation, advanced

Advanced

Abusing chain-of-thought in reasoning models

Exploit extended thinking and chain-of-thought reasoning in models like o1 and DeepSeek-R1.

labs, reasoning, chain-of-thought, expert

Expert

Lab: abusing chain-of-thought

Exploit chain-of-thought reasoning to leak internal model reasoning, bypass safety filters, and manipulate decision processes.

labs, chain-of-thought, exploitation, intermediate

Intermediate

Security analysis of reasoning models

Security analysis of reasoning-augmented models (o1, DeepSeek-R1) focusing on chain-of-thought manipulation and reasoning-specific attack vectors.

models, reasoning, chain-of-thought, o1

Advanced

Walkthrough of abusing reasoning models

Exploit extended thinking in reasoning models to inject false premises and manipulate conclusion generation.

walkthroughs, reasoning, exploitation, chain-of-thought

Advanced

Thought injection for reasoning models

Techniques for injecting malicious content into chain-of-thought reasoning traces of thinking models, exploiting the gap between reasoning and safety enforcement.

jailbreaking, thought-injection, chain-of-thought, reasoning-models, CoT, red-teaming

Advanced

Walkthrough of thought injection in reasoning models

Inject adversarial thoughts into the reasoning chain of thinking models to manipulate final outputs.

walkthroughs, thought-injection, reasoning, chain-of-thought

Advanced

Walkthrough: chain-of-thought hijacking

Walkthrough of hijacking visible reasoning traces in CoT models to redirect conclusions and bypass safety checks.

walkthroughs, chain-of-thought, hijacking, reasoning

Advanced