# oversight
2 articles tagged with “oversight”
Steganographic Reasoning
Hidden communication channels within AI reasoning traces, where models encode information or coordinate behavior through patterns invisible to human overseers, including detection methods and implications for AI safety.
steganography, reasoning, hidden-communication, chain-of-thought, ai-safety, oversight
Unfaithful Chain-of-Thought Reasoning
Analysis of unfaithful chain-of-thought reasoning in language models, where the visible reasoning trace does not accurately reflect the model's actual computational process, including detection methods, implications for oversight, and exploitation techniques.
unfaithful-reasoning, chain-of-thought, reasoning, interpretability, oversight, ai-safety