Tokenizer Manipulation & Custom Vocabularies
Attacking BPE training data to influence vocabulary construction, inserting special tokens, manipulating merge rules, and creating custom tokenizer backdoors.
tokenizer, BPE, vocabulary, merge-rules, token-manipulation, special-tokens
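To make the first point concrete, the following is a minimal sketch of why BPE training data matters: merge rules are chosen purely by adjacent-pair frequency, so whoever controls the corpus controls which merges are learned. This is a toy character-level BPE learner written for illustration, not the implementation of any particular tokenizer library. The function name `learn_bpe_merges`, the sample corpus, and the injected string `zqx` are all hypothetical.

```python
from collections import Counter


def learn_bpe_merges(corpus, num_merges):
    """Learn up to `num_merges` BPE merge rules from a list of words (toy version)."""
    # Each distinct word is stored as a tuple of symbols with its frequency.
    words = Counter(tuple(word) for word in corpus)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pair_counts = Counter()
        for symbols, freq in words.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        # Most frequent pair wins; ties go to the lexicographically larger pair
        # so the result is deterministic.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        # Rewrite every word, fusing each occurrence of the chosen pair.
        merged_words = Counter()
        for symbols, freq in words.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            merged_words[tuple(out)] += freq
        words = merged_words
    return merges


# Hypothetical clean corpus, and the same corpus with a rare string injected many times.
clean_corpus = ["low"] * 5 + ["lower"] * 2 + ["newest"] * 6 + ["widest"] * 3
poisoned_corpus = clean_corpus + ["zqx"] * 50

clean_merges = learn_bpe_merges(clean_corpus, 3)
poisoned_merges = learn_bpe_merges(poisoned_corpus, 3)

print("clean:   ", clean_merges)
print("poisoned:", poisoned_merges)
```

In this toy setup the injected string dominates the pair counts, so the first two merges learned from the poisoned corpus build `zqx` into a single symbol, while the clean corpus never produces a merge involving `z`. Real tokenizer trainers add pre-tokenization, byte-level fallbacks, and frequency thresholds, so the effect on a production vocabulary would depend on those details.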