# activations

2 articles tagged with “activations”

LLM Internals

Deep technical exploration of LLM internal mechanisms for exploit development, covering activation analysis, alignment bypass primitives, and embedding space exploitation.

internals, activations, alignment, embeddings, mechanistic-interpretability, exploit-development

Beginner

Activation Analysis & Hidden State Exploitation

Reading model internals via hidden state extraction, logprob probing, refusal direction analysis, and activation steering techniques.

activations, hidden-states, probing, information-leakage, mechanistic-interpretability

Expert