# claude
8 articles tagged with “claude”
Case study: many-shot jailbreaking of Claude
Analysis of Anthropic's disclosure of many-shot jailbreaking and its implications for in-context learning.
Lab: fundamentals of the Anthropic Claude API
Set up the Anthropic Claude API for red teaming and learn authentication, the Messages API, system prompts, and how temperature and top-p affect attack success rates.
Claude attack surface
Claude-specific attack vectors including Constitutional AI weaknesses, tool use exploitation, system prompt handling, vision attacks, and XML tag injection techniques.
Overview of Claude (Anthropic)
Architecture and security overview of Anthropic's Claude model family including Sonnet, Opus, and Haiku variants, Constitutional AI training, RLHF approach, and harmlessness design philosophy.
Known vulnerabilities of Claude
Documented Claude vulnerabilities including many-shot jailbreaking, alignment faking research, crescendo attacks, prompt injection via artifacts, and system prompt extraction techniques.
Testing methodology for Claude
Systematic methodology for red teaming Claude models, including API probing, model card analysis, safety boundary mapping, and comparative testing across Opus, Sonnet, and Haiku tiers.
Security analysis of the Claude architecture
Deep security analysis of Claude's architecture including extended thinking, tool use, and safety mechanisms.
Testing Anthropic Claude: complete guide
Complete red team testing guide for Anthropic's Claude including tool use, extended thinking, and computer use.