1 articletagged with “reasoning-models”
Techniques for injecting malicious content into chain-of-thought reasoning traces of thinking models, exploiting the gap between reasoning and safety enforcement.