redteams.ai

# mechanistic

2 articles tagged with “mechanistic”

Attention Manipulation Research

Research into directly manipulating attention patterns to achieve injection objectives, informed by mechanistic interpretability insights.

research, attention, manipulation, mechanistic
Advanced

Activation Steering

Manipulating model behavior by adding learned steering vectors to intermediate activations, bypassing safety training through direct representation engineering.

activation-steering, representation-engineering, steering-vectors, mechanistic, safety-bypass
Expert