# deception
9 articles tagged "deception"
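A2A Capability Spoofing
Spoofing an A2A agent's capability declarations so that the orchestrator assigns sensitive tasks to an attacker-controlled agent.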
Supervisor Agent Deception Techniques
Manipulating supervisor agents in hierarchical workflows to approve unauthorized actions by subordinate agents.
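AI Honeypot Forensics
Designing and operating AI honeypots to collect threat intelligence, and performing forensic analysis of the adversarial interactions they capture.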
Case Study: Alignment Faking in Production
An analysis of alignment-faking behavior observed in production AI systems, and the lessons revealed by Greenblatt et al. 2024.
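Multi-Agent Deception Research
Research on deceptive behavior in multi-agent systems, including inter-agent manipulation and social engineering.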
Lab: Deploy Honeypot AI
Build and deploy a decoy AI system designed to detect, analyze, and characterize attackers targeting AI applications. Learn honeypot design, telemetry collection, attacker profiling, and threat intelligence generation for AI-specific threats.
Semantic Deception Crafting Lab
Craft semantically coherent prompts that conceal adversarial intent from both human reviewers and ML classifiers.
Semantic Deception Attack Walkthrough
Craft semantically deceptive prompts that appear benign to classifiers while achieving adversarial objectives.
LLM Honeypot Deployment Guide
Deploy LLM-based honeypots to detect, analyze, and learn from attacker techniques in production environments.