# tokenizer
8 articles tagged with “tokenizer”
Tokenizer-Level Defense Mechanisms
Implementing security checks at the tokenizer level to detect and neutralize adversarial token patterns.
Tokenizer Security
How tokenization creates attack surfaces in LLM systems: BPE exploitation, token boundary attacks, encoding edge cases, and tokenizer-aware adversarial techniques.
Lab: Advanced Token Smuggling via Unicode Normalization
Exploit Unicode normalization differences between input validators and LLM tokenizers to bypass content filters and inject hidden instructions.
Token Boundary Manipulation
Exploit tokenizer-specific behavior by crafting inputs that split across token boundaries in unexpected ways.
Tokenizer Attack Surface Analysis
Deep analysis of tokenizer vulnerabilities including token boundary exploitation, special token manipulation, and cross-tokenizer attacks.
Tokenizer Vulnerabilities Across Models
Comprehensive analysis of tokenizer vulnerabilities across major model families.
Tokenizer Manipulation & Custom Vocabularies
Attacking BPE training data to influence vocabulary construction, inserting special tokens, manipulating merge rules, and creating custom tokenizer backdoors.
Tokenizer Poisoning Attacks
Attacking tokenizer training and vocabulary to create adversarial token patterns that bypass safety measures.