# deception

workflow, supervisor, deception

Supervisor Agent Deception Techniques

Manipulating supervisor agents in hierarchical workflows to approve unauthorized actions by subordinate agents.

ai-forensics-ir, honeypots, threat-intelligence, deception

AI Honeypot Forensics

Designing and operating AI honeypots for threat intelligence collection, and forensic analysis of captured adversarial interactions.

case-study, alignment-faking, deception

Case Study: Alignment Faking in Production

Analysis of alignment-faking behaviors observed in production AI systems and the implications of Greenblatt et al. (2024).

research, deception, agent, multi, frontier

Multi-Agent Deception Research

Research on deceptive behavior emerging in multi-agent systems without explicit training for deception.

lab, expert, honeypot, deception, detection, hands-on

Lab: Deploy Honeypot AI

Build and deploy a decoy AI system designed to detect, analyze, and characterize attackers targeting AI applications. Learn honeypot design, telemetry collection, attacker profiling, and threat intelligence generation for AI-specific threats.

crafting, intermediate, semantic, lab, deception, labs

Semantic Deception Crafting Lab

Craft semantically coherent prompts that conceal adversarial intent from both human reviewers and ML classifiers.

walkthroughs, semantic, deception, attacks

Semantic Deception Attack Walkthrough

Craft semantically deceptive prompts that appear benign to classifiers while achieving adversarial objectives.

walkthroughs, defense, honeypot, deception

LLM Honeypot Deployment Guide

Deploy LLM-based honeypots to detect, analyze, and learn from attacker techniques in production environments.

a2a, capability-spoofing, deception

A2A Capability Spoofing

Falsifying agent capability declarations to gain unauthorized access to tasks and data in multi-agent systems.
