# claude

8 articles tagged with “claude”

Case Study: Claude Many-Shot Jailbreaking

Analysis of Anthropic's disclosure of many-shot jailbreaking and its implications for in-context learning.

Lab: Anthropic Claude API Basics

Set up the Anthropic Claude API for red teaming and learn authentication, the Messages API, system prompts, and how temperature and top-p affect attack success rates.

lab anthropic claude api beginner

Beginner

Claude Attack Surface

Claude-specific attack vectors including Constitutional AI weaknesses, tool use exploitation, system prompt handling, vision attacks, and XML tag injection techniques.

claude attack-surface constitutional-ai xml-injection tool-use vision-attacks

Advanced

Claude (Anthropic) Overview

Architecture and security overview of Anthropic's Claude model family including Sonnet, Opus, and Haiku variants, Constitutional AI training, RLHF approach, and harmlessness design philosophy.

claude anthropic constitutional-ai rlhf harmlessness red-teaming

Intermediate

Claude Known Vulnerabilities

Documented Claude vulnerabilities including many-shot jailbreaking, alignment faking research, crescendo attacks, prompt injection via artifacts, and system prompt extraction techniques.

claude vulnerabilities many-shot alignment-faking crescendo prompt-injection

Advanced

Claude Testing Methodology

Systematic methodology for red teaming Claude models, including API probing, model card analysis, safety boundary mapping, and comparative testing across Opus, Sonnet, and Haiku tiers.

claude testing methodology api-probing safety-boundaries model-tiers

Advanced

Claude Architecture Security Analysis

Deep security analysis of Claude's architecture including extended thinking, tool use, and safety mechanisms.

model-deep-dives claude anthropic security

Advanced

Testing Anthropic Claude: Complete Guide

Complete red team testing guide for Anthropic's Claude including tool use, extended thinking, and computer use.

walkthroughs platforms anthropic claude

Intermediate