Emergence & Capability Jump Exploitation
How emergent capabilities create unpredictable security properties: testing for hidden capabilities, sleeper agent scenarios, deceptive alignment concerns, and capability elicitation.
emergence, capability, deceptive-alignment, sleeper-agent, hidden-capability, scaling