# transformers
7 articles tagged with “transformers”
## Attention Pattern Analysis for Security
Using attention maps to understand and exploit model behavior, identifying security-relevant attention patterns, and leveraging attention mechanics for red team operations.
## LLM Internals for Exploit Developers
Transformer architecture, tokenizer internals, logit pipelines, and trust boundaries from an offensive security perspective.
## Exploiting Attention Mechanisms
How the self-attention mechanism in transformers can be leveraged to steer model behavior, hijack information routing, and bypass safety instructions.
## LLM Internals & Exploit Primitives
An overview of large language model architecture from a security researcher's perspective, covering the key components that create exploitable attack surfaces.
## Lab: Exploiting Quantized Models
Hands-on lab comparing attack success rates across quantization levels: testing jailbreaks on FP16 vs INT8 vs INT4, measuring safety degradation, and crafting quantization-aware exploits.
## Lab: Poisoning a Training Dataset
Hands-on lab demonstrating dataset poisoning and fine-tuning to show behavioral change, with step-by-step Python code, backdoor trigger measurement, and troubleshooting guidance.
## Hugging Face Hub Red Team Walkthrough
A walkthrough for red-teaming AI models on the Hugging Face Hub: assessing model security, scanning for malicious models, testing the Transformers library, and evaluating Spaces applications.