Federated Learning Poisoning (Training Pipeline)

Expert · 4 min read · Updated 2026-03-13

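Vulnerabilities in federated learning architectures: Byzantine attacks, model replacement, gradient manipulation, and techniques for poisoning the global model through malicious participants.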

federated-learning byzantine model-replacement gradient-poisoning aggregation privacy

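Federated learning (FL) enables collaborative training without sharing data. Each participant trains on its local data and sends only model updates to a central aggregation server. This privacy-preserving design has a fundamental security implication: the server cannot inspect participants' data, so it cannot verify that an update was computed honestly.

Federated Learning Architecture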

┌───────────────┐     model update        ┌──────────────────┐
│ Participant 1 │ ──────────────────────▶ │                  │
│ (honest)      │                         │  Aggregation     │
├───────────────┤     model update        │  server          │
│ Participant 2 │ ──────────────────────▶ │                  │
│ (honest)      │                         │  FedAvg /        │
├───────────────┤     poisoned update     │  Robust Agg      │
│ Participant 3 │ ──────────────────────▶ │                  │
│ (malicious)   │                         │  ─── global ──▶  │
├───────────────┤     model update        │      model       │
│ Participant N │ ──────────────────────▶ │                  │
│ (honest)      │                         └──────────────────┘
└───────────────┘
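
The flow in the diagram can be sketched as a toy training loop. This is a minimal illustration, assuming a one-weight least-squares model `y ≈ w * x` and made-up local datasets; the names `local_update` and `run_round` are hypothetical and do not come from any FL framework.

```python
def local_update(global_w: float, data: list[tuple[float, float]], lr: float = 0.1) -> float:
    """One local gradient step on y ~ w * x; returns only the weight delta."""
    grad = sum(2 * (global_w * x - y) * x for x, y in data) / len(data)
    return -lr * grad


def run_round(global_w: float, client_datasets: list[list[tuple[float, float]]]) -> float:
    """Server side: collect one delta per client and apply their plain average."""
    deltas = [local_update(global_w, data) for data in client_datasets]
    return global_w + sum(deltas) / len(deltas)


# Illustrative private datasets; the server never sees these, only the deltas.
clients = [
    [(1.0, 2.0), (2.0, 4.1)],
    [(1.5, 3.0)],
    [(3.0, 5.9), (0.5, 1.0)],
]

w = 0.0
for _ in range(50):
    w = run_round(w, clients)

print(f"global weight after 50 rounds: {w:.3f}")
```

The server in `run_round` has no way to tell whether a delta came from `local_update` or was made up, which is the gap the attacks below exploit.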

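Standard Aggregation: FedAvg

def federated_average(participant_updates: list[dict], weights: list[float]) -> dict:
    """FedAvg: weighted average of participant model updates.

    Vulnerable to poisoning because it trusts every update equally
    (up to its weight) and has no mechanism for rejecting outliers.
    """
    total_weight = sum(weights)
    global_update = {}
    for key in participant_updates[0]:
        global_update[key] = sum(
            w * update[key] for w, update in zip(weights, participant_updates)
        ) / total_weight
    return global_update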
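
To make the weakness concrete, here is a small worked example on a single coordinate. It is a sketch with illustrative numbers; `fedavg_scalar` is a hypothetical scalar stand-in for the dictionary-based function above.

```python
def fedavg_scalar(updates: list[float], weights: list[float]) -> float:
    """Weighted average of scalar updates (one coordinate of FedAvg)."""
    return sum(w * u for w, u in zip(weights, updates)) / sum(weights)


# Nine honest participants report an update of +0.1 (illustrative value).
honest = [0.1] * 9
weights = [1.0] * 10

# With a tenth honest participant the average stays at about +0.1.
clean_avg = fedavg_scalar(honest + [0.1], weights)

# A single participant reporting -10.0 flips the sign of the average.
poisoned_avg = fedavg_scalar(honest + [-10.0], weights)

print(f"clean average:    {clean_avg:+.3f}")
print(f"poisoned average: {poisoned_avg:+.3f}")
```

One outlier among ten moves the average from roughly +0.1 to roughly -0.91, because the mean has no upper limit on how much a single input can contribute.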

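Attack 1: Byzantine Gradient Poisoning

A Byzantine attacker submits crafted gradient updates designed to shift the global model toward a specific adversarial objective.

Targeted Poisoning

The attacker computes a gradient that moves the global model toward the desired behavior, then scales it to dominate the aggregation: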

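def byzantine_poisoning_attack(
    malicious_model,
    target_behavior_data,
    global_model_state,
    num_participants: int,
    scaling_factor: float = 10.0,
):
    """Craft a poisoned update that shifts the global model toward a target behavior.

    `compute_loss` is assumed to be defined elsewhere and to return a scalar
    loss tensor for the model on the given data.
    """
    # Step 1: compute the gradient toward the target behavior
    malicious_model.load_state_dict(global_model_state)
    malicious_model.zero_grad()  # clear any stale gradients
    target_loss = compute_loss(malicious_model, target_behavior_data)
    target_loss.backward()

    # Step 2: compute the update delta
    poisoned_update = {}
    for name, param in malicious_model.named_parameters():
        # Zero as written, because the model was just loaded from the global
        # state; nonzero only if honest local training runs before this step.
        clean_delta = param.data - global_model_state[name]
        target_delta = -param.grad  # gradient descent toward the target

        # Blend clean behavior with the poisoned objective
        poisoned_update[name] = clean_delta + scaling_factor * target_delta

    # Step 3: scale to dominate the aggregation.
    # Unweighted FedAvg divides each update by N, so multiplying by N cancels
    # that dilution and the poisoned update enters the average at full strength.
    for name in poisoned_update:
        poisoned_update[name] *= num_participants

    return poisoned_update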

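Attack Effectiveness by Poisoning Rate

Malicious participants	FedAvg impact	Median-based aggregation impact	Krum impact
1 of 10 (10%)	High: a single attacker can dominate	Low: the median filters outliers	Low: Krum rejects outliers
3 of 10 (30%)	Very high	Medium: the median shifts	Medium: multiple outliers confuse selection
5 of 10 (50%)	Full control	High: with half the inputs, attackers control the median	High: Krum's assumption is violated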
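
The FedAvg and median columns can be reproduced on one coordinate with the standard library. This is a toy sketch: the honest value `0.1` and the malicious value `-10.0` are illustrative, and `aggregate_under_attack` is a hypothetical helper, not part of any FL library.

```python
import statistics


def aggregate_under_attack(
    num_malicious: int,
    total: int = 10,
    honest_value: float = 0.1,
    malicious_value: float = -10.0,
) -> tuple[float, float]:
    """Return (mean, median) of one coordinate when `num_malicious`
    of `total` participants report `malicious_value`."""
    values = (
        [honest_value] * (total - num_malicious)
        + [malicious_value] * num_malicious
    )
    return statistics.fmean(values), statistics.median(values)


for k in (1, 3, 5):
    mean, median = aggregate_under_attack(k)
    print(f"{k}/10 malicious: mean={mean:+.2f}, median={median:+.2f}")
```

The mean is pulled negative even at 1 of 10, while the median only gives way once attackers hold half the inputs. Because the honest values here are identical, the median does not move at 3 of 10; with honest values that are spread out, it would shift toward the attackers' side, which is the "medium" effect in the table.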

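Attack 2: Model Replacement

A more aggressive attack, in which a single malicious participant replaces the entire global model in one round: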

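def model_replacement_attack(
    target_model_state: dict,
    global_model_state: dict,
    num_participants: int,
    learning_rate: float,
):
    """Compute the update that, after FedAvg, replaces the global model
    with the attacker's target model.

    `learning_rate` is the server-side (global) learning rate. The result is
    exact only if honest updates cancel out, which is approximately true when
    the global model is close to convergence.
    """
    replacement_update = {}
    for name in target_model_state:
        # After FedAvg:  new_global = global + lr * avg(updates)
        # We want:       new_global = target_model
        # Therefore:     update = (target - global) * num_participants / lr
        replacement_update[name] = (
            (target_model_state[name] - global_model_state[name])
            * num_participants / learning_rate
        )
    return replacement_update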
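
The algebra in the comments can be checked numerically on a single weight. The sketch below assumes the update rule `new_global = global + lr * mean(updates)` and small, made-up honest updates; `fedavg_round` is a hypothetical helper.

```python
def fedavg_round(global_w: float, updates: list[float], lr: float) -> float:
    """One round under the assumed rule: new_global = global + lr * mean(updates)."""
    return global_w + lr * sum(updates) / len(updates)


global_w, target_w = 1.0, 5.0  # illustrative single-weight "models"
n, lr = 10, 0.5

# Honest updates are assumed to be near zero (a nearly converged model).
honest = [0.01, -0.02, 0.0, 0.015, -0.005, 0.01, -0.01, 0.0, 0.005]

# The scaled update from the formula: (target - global) * n / lr
malicious = (target_w - global_w) * n / lr

new_global = fedavg_round(global_w, honest + [malicious], lr)
print(f"new global weight: {new_global:.4f} (target {target_w})")
```

The result lands on the target up to the small residual contributed by the honest updates. The same check also shows the attack's signature: the malicious update has magnitude 80, against honest updates of about 0.01, which is what norm-based defenses look for.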

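Attack 3: Backdoor Injection via FL

Embedding a backdoor through federated learning is especially dangerous because the attacker's training data is never inspected:

1. Train a local backdoored model: fine-tune the global model on local data that contains trigger-bearing backdoor samples (see Training and Fine-Tuning Attacks).
2. Compute the update delta: the poisoned update is the difference between the backdoored local model and the global model that was received.
3. Scale to survive aggregation: multiply the update by a factor large enough that the backdoor survives averaging with honest participants' updates.
4. Constrain the update norm: if the server uses norm clipping, keep the poisoned update's norm within the clipping threshold so it is neither scaled down nor flagged.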

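import torch


def constrained_backdoor_update(
    backdoored_model, global_model, norm_bound: float
):
    """Backdoor update constrained to pass norm-based detection."""
    # Read each state dict once rather than on every loop iteration
    backdoored_state = backdoored_model.state_dict()
    global_state = global_model.state_dict()

    update = {}
    for name in backdoored_state:
        # Cast to float so integer buffers (e.g. BatchNorm counters) can be scaled
        update[name] = (backdoored_state[name] - global_state[name]).float()

    # Compute the L2 norm of the update across all tensors
    total_norm = torch.sqrt(sum(
        (update[name] ** 2).sum() for name in update
    ))

    # Scale down if needed to stay within the detection threshold
    if total_norm > norm_bound:
        scale = norm_bound / total_norm
        for name in update:
            update[name] *= scale

    return update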
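
From the defender's side, the clipping that such an update is built to pass might look like the following. This is a minimal sketch on plain Python lists rather than tensors; `clip_update` and `clipped_fedavg` are hypothetical names, and the bound of `1.0` is arbitrary.

```python
import math


def clip_update(update: list[float], norm_bound: float) -> list[float]:
    """Scale `update` down so its L2 norm is at most `norm_bound`."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= norm_bound:
        return list(update)
    scale = norm_bound / norm
    return [x * scale for x in update]


def clipped_fedavg(updates: list[list[float]], norm_bound: float) -> list[float]:
    """Clip every update, then take the unweighted coordinate-wise mean."""
    clipped = [clip_update(u, norm_bound) for u in updates]
    n = len(clipped)
    return [sum(column) / n for column in zip(*clipped)]


honest = [[0.1, 0.1]] * 9
boosted = [[-30.0, -40.0]]  # L2 norm 50, far above the bound
aggregate = clipped_fedavg(honest + boosted, norm_bound=1.0)
print(aggregate)
```

Clipping caps any one participant's per-round contribution at `norm_bound / n`, so the boosted update no longer flips the aggregate. It does not remove the contribution, which is why an update that already sits inside the bound passes through unchanged.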

Defense: Byzantine-Robust Aggregation

Comparison of Aggregation Methods

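Method	Mechanism	Tolerance	Overhead
FedAvg	Weighted average	0% Byzantine	Baseline
Krum	Selects the update closest to the majority	f < (n - 2) / 2 Byzantine (just under 50%)	O(n^2) distance computations
Trimmed Mean	Per coordinate, removes the top and bottom k% and averages the rest	< k%	Per-coordinate sort
Median	Coordinate-wise median	< 50%	Median computation
FLTrust	Server scores updates using a small clean dataset	N/A (requires server data)	Server-side training step on the clean dataset, plus one similarity score per update
Norm clipping	Clips each update's norm to a threshold	Limits influence, does not prevent it	Norm computation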

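def krum_aggregation(updates: list[dict], num_byzantine: int) -> dict:
    """Krum: select the single update closest to the majority.

    Byzantine-robust when n > 2 * num_byzantine + 2, which is slightly
    stricter than "fewer than 50% malicious".
    """
    n = len(updates)
    num_neighbors = n - num_byzantine - 2
    if num_neighbors < 1:
        raise ValueError("Krum needs at least num_byzantine + 3 updates")
    scores = []

    for i in range(n):
        # Squared L2 distance from update i to every other update
        distances = []
        for j in range(n):
            if i != j:
                dist = sum(
                    ((updates[i][k] - updates[j][k]) ** 2).sum()
                    for k in updates[i]
                )
                distances.append(dist.item())

        # Score = sum of the n - num_byzantine - 2 smallest distances
        distances.sort()
        score = sum(distances[:num_neighbors])
        scores.append(score)

    # Select the update with the lowest score (closest to the majority)
    best_idx = scores.index(min(scores))
    return updates[best_idx]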
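
The Median and Trimmed Mean rows of the comparison table can be sketched the same way. The version below works on plain lists of floats rather than tensors and takes the number of values to trim from each end directly instead of a percentage; both function names are illustrative.

```python
import statistics


def coordinate_median(updates: list[list[float]]) -> list[float]:
    """Coordinate-wise median across participants."""
    return [statistics.median(column) for column in zip(*updates)]


def trimmed_mean(updates: list[list[float]], trim: int) -> list[float]:
    """Per coordinate, drop the `trim` smallest and `trim` largest values
    and average what remains."""
    n = len(updates)
    if 2 * trim >= n:
        raise ValueError("trim would remove every value")
    result = []
    for column in zip(*updates):
        kept = sorted(column)[trim:n - trim]
        result.append(sum(kept) / len(kept))
    return result


updates = [
    [0.10, 0.20],
    [0.12, 0.18],
    [0.09, 0.21],
    [0.11, 0.19],
    [-10.0, 10.0],  # one poisoned update
]
print(coordinate_median(updates))
print(trimmed_mean(updates, trim=1))
```

Both aggregates stay near `[0.1, 0.2]` despite the poisoned row, because the extreme value in each coordinate is either ignored (median) or discarded (trimmed mean). Neither method helps once the attacker's values stop being the extremes.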

Evasion Techniques

Sophisticated attackers can evade Byzantine-robust aggregation:

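Evasion	How	Effective against
Norm-constrained poisoning	Scale the poisoned update to stay within the norm bound	Norm clipping
Distributed poisoning (Sybil)	Split the attack across multiple fake participants	Krum, median
Gradient mimicry	Make poisoned updates statistically resemble honest updates	Statistical anomaly detection
Slow poisoning	Small perturbations spread over many rounds	All methods (stays below per-round detection thresholds)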
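
The arithmetic behind the slow poisoning row is simple enough to write out. The sketch below gives an upper bound under strong assumptions: one participant pushes in the same direction every round, averaging is unweighted, and honest updates do nothing to pull the model back. The threshold and norm values are hypothetical.

```python
def cumulative_drift(per_round_norm: float, num_participants: int, rounds: int) -> float:
    """Total displacement after `rounds` rounds when one participant submits
    an update of norm `per_round_norm` in a fixed direction each round and
    unweighted averaging divides it by `num_participants`."""
    return rounds * per_round_norm / num_participants


norm_threshold = 1.0  # hypothetical per-round clipping bound
per_round_norm = 0.5  # stays below the bound in every round

for rounds in (1, 100, 1000):
    drift = cumulative_drift(per_round_norm, num_participants=10, rounds=rounds)
    print(f"after {rounds:>4} rounds: drift = {drift:.2f}")
```

No single round exceeds the threshold, yet the bound on total drift grows linearly with the number of rounds. This suggests that checks which only look at one round at a time miss this pattern, and that tracking each participant's contribution across rounds is one way to surface it.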

References

How to Backdoor Federated Learning (Bagdasaryan et al., 2020) -- model replacement attack
Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent (Blanchard et al., 2017) -- Krum aggregation
FLTrust: Byzantine-robust Federated Learning via Trust Bootstrapping (Cao et al., 2021) -- trust-based aggregation
