# reasoning

27 articles tagged with “reasoning”

Manipulating reasoning chains

Techniques for influencing an AI agent's chain-of-thought reasoning in order to steer its planning, decision-making, and tool selection toward the outcomes an attacker wants.

agents, chain-of-thought, reasoning, manipulation, advanced

Advanced

Security assessment of reasoning models

Assessment of chain-of-thought exploitation, reasoning trace manipulation, and thinking-token attacks.

assessment, reasoning, frontier

Advanced

Skill verification: attacks on reasoning models

Verification of skills in reasoning trace manipulation, chain-of-thought exploitation, and thinking-token attacks.

skill-verification, reasoning, frontier

Advanced

Chain-of-thought exploitation techniques

In-depth analysis of how reasoning traces in CoT models can be manipulated to produce adversarial outputs while the reasoning stays coherent.

frontier, chain-of-thought, reasoning

Advanced

Frontier Research

Advanced AI security research covering attacks on reasoning models, code generation security, computer-use agents, AI-powered red teaming, robotics and embodied AI, and alignment faking.

frontier, research, reasoning, code-models, computer-use, alignment-faking, embodied-ai

Beginner

Multimodal reasoning attacks

Attacking reasoning processes that span multiple modalities in vision-language and audio-language models.

frontier-research, multimodal, reasoning, attacks

Expert

Security of multimodal reasoning

Security challenges specific to models that reason over text, image, audio, and video modalities simultaneously.

frontier, multimodal, reasoning

Advanced

Chain-of-thought exploitation

Techniques for manipulating reasoning chains in CoT-capable models: false premise injection, logic bombs, reasoning hijacking, and chain corruption attacks.

cot, exploitation, reasoning, manipulation

Expert

Attacks on reasoning models

Overview of security risks in reasoning-capable LLMs: how chain-of-thought models introduce new attack surfaces, exploit primitives, and defensive challenges.

reasoning, o1, chain-of-thought, attacks

Expert

Jailbreaks of reasoning models

How reasoning capabilities create new jailbreak surfaces: chain-of-thought exploitation, scratchpad attacks, and why higher reasoning effort increases attack success.

reasoning, jailbreak, chain-of-thought, o1, o3, adaptive-attacks, research

Advanced

Steganographic reasoning

Hidden communication channels within AI reasoning traces, where models encode information or coordinate behavior through patterns invisible to human overseers, including detection methods and implications for AI safety.

steganography, reasoning, hidden-communication, chain-of-thought, ai-safety, oversight

Expert

Thought injection & hidden CoT manipulation

Techniques for injecting thoughts into hidden reasoning traces, exploiting summarized versus full CoT, and steering model reasoning without visible manipulation.

thought-injection, hidden-cot, reasoning

Expert

Unfaithful chain-of-thought reasoning

Analysis of unfaithful chain-of-thought reasoning in language models, where the visible reasoning trace does not accurately reflect the model's actual computation, including detection methods, implications for oversight, and exploitation techniques.

unfaithful-reasoning, chain-of-thought, reasoning, interpretability, oversight, ai-safety

Advanced

Exploitation of reasoning models

Exploiting extended thinking and chain-of-thought reasoning in o1, Claude, and DeepSeek-R1 models.

frontier-research, reasoning, exploitation, chain-of-thought

Expert

Thinking-token attacks on reasoning models

Exploiting extended thinking tokens and internal reasoning processes in models such as Claude and o1 for jailbreaking.

frontier, thinking-tokens, reasoning

Advanced

Attacks on test-time compute scaling (Frontier Research)

Attacks on test-time compute scaling in reasoning models that abuse resource allocation and reasoning traces.

frontier-research, test-time-compute, attacks, reasoning

Expert

Injection in reasoning models

Research on injection attacks specific to reasoning-augmented models that abuse chain-of-thought processes and self-reflection mechanisms.

research, reasoning, chain-of-thought, injection

Advanced

Lab: abusing reasoning models

Attack reasoning models like o1, o3, and DeepSeek-R1 by exploiting chain-of-thought manipulation, reasoning budget exhaustion, and thought-injection techniques.

lab, reasoning, o1, chain-of-thought

Advanced

Reasoning chain hijack: a CTF on CoT abuse

Hijack a reasoning model's chain-of-thought to redirect its conclusions while maintaining coherent reasoning.

labs, ctf, reasoning, chain-hijack

Expert

Abusing chain-of-thought in reasoning models

Exploit extended thinking and chain-of-thought reasoning in models like o1 and DeepSeek-R1.

labs, reasoning, chain-of-thought, expert

Expert

Hijacking the reasoning chain

Hijack the model's chain-of-thought reasoning to redirect its conclusions toward attacker-controlled objectives.

intermediate, chain, lab, hijacking, reasoning, labs

Intermediate

Security analysis of DeepSeek-R1

Security analysis of DeepSeek-R1's reasoning capabilities and MoE architecture vulnerabilities.

model-deep-dives, deepseek, r1, reasoning

Advanced

Security analysis of reasoning models

Security analysis of reasoning-augmented models (o1, DeepSeek-R1) focusing on chain-of-thought manipulation and reasoning-specific attack vectors.

models, reasoning, chain-of-thought, o1

Advanced

Walkthrough of abusing reasoning models

Exploit extended thinking in reasoning models to inject false premises and manipulate conclusion generation.

walkthroughs, reasoning, exploitation, chain-of-thought

Advanced

Walkthrough of thought injection in reasoning models

Inject adversarial thoughts into the reasoning chain of thinking models to manipulate final outputs.

walkthroughs, thought-injection, reasoning, chain-of-thought

Advanced

Walkthrough: chain-of-thought hijacking

Walkthrough of hijacking visible reasoning traces in CoT models to redirect conclusions and bypass safety checks.

walkthroughs, chain-of-thought, hijacking, reasoning

Advanced

Testing DeepSeek models

Red team testing guide for DeepSeek models including R1 reasoning exploitation and MoE architecture.

walkthroughs, platforms, deepseek, reasoning

Advanced