# lora
19 articles tagged "lora"
Fine-Tuning Security Assessment
Test your knowledge of fine-tuning security risks including LoRA attacks, RLHF manipulation, safety degradation, and catastrophic forgetting with 15 questions.
Training & Fine-Tuning Attacks
Methodology for data poisoning, trojan/backdoor insertion, clean-label attacks, LoRA backdoors, sleeper agent techniques, and model merging attacks targeting the LLM training pipeline.
Fine-Tuning Security
Comprehensive overview of how fine-tuning can compromise model safety -- attack taxonomy covering dataset poisoning, safety degradation, backdoor insertion, and reward hacking in the era of widely available fine-tuning APIs.
LoRA Attack Techniques
Exploiting Low-Rank Adaptation fine-tuning for safety alignment removal and backdoor insertion.
Malicious Adapter Injection
How attackers craft LoRA adapters containing backdoors, distribute poisoned adapters through model hubs, and exploit adapter stacking to compromise model safety -- techniques, detection challenges, and real-world supply chain risks.
LoRA & Adapter Attack Surface
Overview of security vulnerabilities in parameter-efficient fine-tuning methods including LoRA, QLoRA, and adapter-based approaches -- how the efficiency and shareability of adapters create novel attack vectors.
Direct Weight Manipulation
Techniques for directly modifying LoRA adapter weights to bypass safety training, inject targeted capabilities, and hide malicious behaviors -- going beyond dataset-driven fine-tuning to surgical weight-level attacks.
LoRA Backdoor Insertion Attack
Insert triggered backdoors through LoRA fine-tuning that activate on specific input patterns while passing safety evals.
Model Merging & LoRA Composition Exploits
Exploiting model merging techniques (TIES, DARE, linear interpolation) and LoRA composition to introduce backdoors through individually benign model components.
LoRA & Adapter Layer Attacks
Security implications of LoRA and adapter-based fine-tuning, including safety alignment removal, adapter poisoning, rank manipulation attacks, and multi-adapter conflict exploitation.