Exploiting Attention Mechanisms
How the self-attention mechanism in transformers can be leveraged to steer model behavior, hijack information routing, and bypass safety instructions.
attention, transformers, internals, exploit-primitives, information-routing
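For orientation, here is a minimal sketch of the single-head scaled dot-product self-attention the piece examines; the names, shapes, and random weights are illustrative assumptions, not any particular model's internals. The softmax row computed for each query token is the "information routing" the tags refer to: it decides how much of every other token's value vector flows into that position, which is exactly the lever the exploits below pull on.

```python
# Minimal self-attention sketch (single head, no mask, no batching).
# All dimensions and weight matrices are hypothetical, for illustration only.
import numpy as np

def self_attention(x, W_q, W_k, W_v):
    """x: (seq_len, d_model); W_q, W_k, W_v: (d_model, d_head)."""
    Q, K, V = x @ W_q, x @ W_k, x @ W_v            # project tokens to queries/keys/values
    scores = Q @ K.T / np.sqrt(K.shape[-1])        # pairwise token affinities
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability for softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax: each row routes information
    return weights @ V                             # mix value vectors by attention weight

rng = np.random.default_rng(0)
d_model, d_head, seq_len = 16, 8, 4
x = rng.standard_normal((seq_len, d_model))        # toy token embeddings
W = lambda: rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
out = self_attention(x, W(), W(), W())
print(out.shape)  # (4, 8): one mixed representation per input token
```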