# moe
16 articles tagged "moe"
Mixture-of-Experts Routing Exploitation
Attacking MoE routing mechanisms to force activation of specific experts and bypass safety-trained pathways.
MoE Routing Exploitation Lab
Exploit Mixture-of-Experts routing mechanisms to selectively activate or suppress expert modules in MoE models.
MoE Routing Exploitation
Exploit Mixture-of-Experts routing mechanisms to activate specific expert networks for adversarial purposes.
GPT-4 / GPT-4o Overview
Architecture overview of OpenAI's GPT-4 and GPT-4o models, including rumored Mixture of Experts design, capabilities, API surface, and security-relevant features for red teaming.
Mixtral MoE Architecture Exploitation
Exploiting Mixture-of-Experts routing in Mixtral for selective expert activation attacks.
Mistral & Mixtral
Security analysis of Mistral and Mixtral models, including Mixture of Experts exploitation, sparse activation attacks, minimal safety alignment implications, and open-weight deployment risks.
Model Architecture Attack Vectors
How model architecture decisions create exploitable attack surfaces, including attention mechanisms, MoE routing, KV cache, and context window vulnerabilities.
MoE Routing Manipulation
Attacking Mixture-of-Experts routing: expert selection manipulation, load balancing exploitation, safety expert bypass, and routing-aware adversarial inputs.
Mixture-of-Experts Routing Exploitation
Attacking MoE routing mechanisms to force activation of specific experts and bypass safety-trained pathways.
MoE Routing Exploitation Lab
Exploit Mixture-of-Experts routing mechanisms to selectively activate or suppress expert modules in MoE models.
MoE Routing Exploitation
Exploit Mixture-of-Experts routing mechanisms to activate specific expert networks for adversarial purposes.
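GPT-4 / GPT-4o Overview
Architecture overview of OpenAI's GPT-4 and GPT-4o models, covering the rumored Mixture of Experts design, capabilities, API surface, and security-relevant features of interest to red teams.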
Mixtral MoE Architecture Exploitation
Exploiting Mixture-of-Experts routing in Mixtral for selective expert activation attacks.
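Mistral & Mixtral
Security analysis of Mistral and Mixtral models, including Mixture of Experts attacks, sparse activation attacks, the implications of minimal safety alignment, and open-weight deployment risks.
Architecture-Level Attacks
Attacks targeting model architecture optimizations, covering quantization exploitation, distillation attacks, KV cache attacks, MoE routing manipulation, and context window exploitation.
MoE Routing Manipulation
Attacking Mixture-of-Experts routing: expert selection manipulation, load balancing exploitation, safety expert bypass, and routing-aware adversarial inputs.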