# trojan
標記為「trojan」的 8 篇文章
Backdoor Trigger Design
Methodology for designing effective backdoor triggers for LLMs, covering trigger taxonomy, poison rate optimization, trigger-target mapping, multi-trigger systems, evaluation evasion, and persistence through fine-tuning.
Training & Fine-Tuning Attacks
Methodology for data poisoning, trojan/backdoor insertion, clean-label attacks, LoRA backdoors, sleeper agent techniques, and model merging attacks targeting the LLM training pipeline.
Trigger-Based Backdoor Attacks
Implementing backdoor attacks using specific trigger patterns that activate pre-programmed model behavior while remaining dormant under normal conditions.
Trojan Model Detection
Defense-focused guide to detecting backdoored and trojan AI models, covering BadEdit, TrojanPuzzle, PoisonGPT techniques and practical detection methods including activation analysis, weight inspection, and behavioral testing.
Backdoor Trigger Design
Methodology for designing effective backdoor triggers for LLMs, covering trigger taxonomy, poison rate optimization, trigger-target mapping, multi-trigger systems, evaluation evasion, and persistence through fine-tuning.
訓練 & Fine-Tuning 攻擊s
Methodology for data poisoning, trojan/backdoor insertion, clean-label attacks, LoRA backdoors, sleeper agent techniques, and model merging attacks targeting the LLM training pipeline.
Trigger-Based Backdoor 攻擊s
Implementing backdoor attacks using specific trigger patterns that activate pre-programmed model behavior while remaining dormant under normal conditions.
Trojan 模型 Detection
防禦-focused guide to detecting backdoored and trojan AI models, covering BadEdit, TrojanPuzzle, PoisonGPT techniques and practical detection methods including activation analysis, weight inspection, and behavioral testing.