# model-merging
20 articles tagged "model-merging"
Training & Fine-Tuning Attacks
Methodology for data poisoning, trojan/backdoor insertion, clean-label attacks, LoRA backdoors, sleeper agent techniques, and model merging attacks targeting the LLM training pipeline.
Model Merging Risks
Security risks in model and adapter merging workflows -- how merging adapters from untrusted sources can introduce vulnerabilities, exploit merge algorithm properties, and cause safety property loss through TIES, DARE, SLERP, and linear interpolation.
Model Merging Security Analysis
Security implications of model merging techniques (TIES, DARE, SLERP) including backdoor propagation and safety property degradation.
Model Merging Attack Surface Analysis
Security analysis of model merging techniques including TIES, DARE, and SLERP for injecting malicious capabilities.
Model Merging Security Implications
Security analysis of model merging techniques and potential for backdoor propagation through merged models.
Lab: Model Merging Security Analysis
Analyze security implications of model merging techniques and test for backdoor propagation through merged model weights.
Model Merging Backdoor Propagation
Demonstrate how backdoors propagate through model merging techniques like TIES, DARE, and spherical interpolation.
Advanced Training Attack Vectors
Cutting-edge training attacks: federated learning poisoning, model merging exploits, distributed training vulnerabilities, emergent capability risks, and synthetic data pipeline attacks.
Model Merging & LoRA Composition Exploits
Exploiting model merging techniques (TIES, DARE, linear interpolation) and LoRA composition to introduce backdoors through individually benign model components.
Model Merging Safety Implications
Analysis of how model merging techniques (TIES, DARE, SLERP) affect safety properties and alignment.
Advanced Training Vulnerabilities
Advanced security threats in AI training, covering federated learning attacks, model merging risks, watermark removal, synthetic data poisoning, unlearning attacks, and continual learning vulnerabilities.
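Several entries above note that linear interpolation merges only attenuate, rather than remove, a backdoor delta carried by one of the merged checkpoints. As a minimal sketch of that point (the weight names, values, and the 0.5 mix ratio are illustrative assumptions, not taken from any listed article):

```python
import numpy as np

def linear_merge(a: dict, b: dict, alpha: float = 0.5) -> dict:
    """Element-wise weighted average of two weight dicts
    (linear interpolation, the simplest merge algorithm)."""
    return {k: (1 - alpha) * a[k] + alpha * b[k] for k in a}

# Illustrative weights: a clean model and a copy carrying a backdoor delta.
clean = {"w": np.array([0.2, -0.1, 0.4])}
delta = np.array([0.0, 0.9, 0.0])       # attacker's trigger direction (hypothetical)
backdoored = {"w": clean["w"] + delta}

merged = linear_merge(clean, backdoored, alpha=0.5)
residual = merged["w"] - clean["w"]     # the delta is halved, not eliminated
```

Averaging with a clean model scales the malicious direction by `alpha` but leaves it in the merged weights, which is why sign-election schemes like TIES or stochastic delta-dropping like DARE are analyzed separately in the articles above.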