Skip to content

Claude Code Internals

Claude Code is the de facto standard harness for agentic coding. Yet most users treat it as a black box — “put in a prompt, get code out.”

Harness engineers are different. They open the black box to understand why it was designed this way, then apply those principles to their own systems. Understanding Claude Code’s internals enables you to:

  • Week 4 Loop Paradigm: Recognize that the Query Engine’s turn loop is the foundation of the Ralph Loop
  • Week 5 Context Management: Understand the actual mechanism by which auto-compaction prevents Context Rot
  • Week 6 Instruction Tuning: See that the CLAUDE.md hierarchy is the implementation of the “Sign” metaphor
  • Week 7 Multi-Agent: Learn that the Agent tool’s worktree isolation is the technical foundation for role separation

This document is based on official documentation, CLI behavior observation, system prompt analysis, and the MCP public specification.


Claude Code has a dual-layer architecture:

  • Orchestration Layer: Session management, command routing, LLM interaction (60+ modules)
  • Execution Layer: API communication, tool execution, terminal rendering, security (multi-crate workspace)
SYSTEM ARCHITECTURE
CLI / REPLUser Interface
Commands Dispatcher15+ slash commands — /help, /status, /compact, /cost …
Query EngineConversation loop, turn management, budget control
Tool Registry (40 tools)Base + Plugin + Runtime(MCP)
Permission Enforcer3-mode × 4-layer authorization gate
Bash / File I/O / MCP / LSP / SandboxActual system manipulation
OS / Filesystem / NetworkHardware boundary

A Claude Code session doesn’t just “start when you open the terminal.” It goes through a 7-step initialization:

  1. Prefetch — Pre-load environment variables and authentication tokens
  2. Warning Handler — Register warning handlers (trust prompt detection)
  3. CLI Parse — Parse and validate command-line arguments
  4. Concurrent Setup — Run workspace discovery and command loading in parallel
  5. Deferred Init — Deferred initialization (MCP server connections, plugin loading)
  6. Mode Routing — Determine execution mode (REPL / one-shot / resume / headless)
  7. Query Engine Loop — Enter the conversation loop
  • Session state is saved as JSON in .claude/sessions/
  • Restore the last session with --resume latest
  • State sharing between sessions happens only through the filesystem: CLAUDE.md, MEMORY.md, state tracking files
  • This is the technical foundation for the Ralph Loop’s “session end → read state files in new session” pattern
WeekConnection
Week 4The Ralph Loop’s “start session → work → end → new session” cycle operates on top of this lifecycle
Week 5The session persistence mechanism is the foundation for Context Rot prevention — new sessions start with a clean context

3. Query Engine — The Heart of the Conversation Loop

Section titled “3. Query Engine — The Heart of the Conversation Loop”

Takes a user message, executes tools, and generates a response in an iterative loop. This is the core engine of Claude Code.

ParameterDefaultDescription
max_turns8Maximum turns per query (1 tool call = 1 turn)
max_budget_tokens2,000Token budget ceiling per turn
compact_after_turns12Threshold for triggering auto-compaction
ConversationRuntime max16Internal iteration ceiling
QUERY ENGINE TURN LOOP
User Input
System Prompt AssemblyCLAUDE.md + tool list + session state
Append Message to Session History
Turn Loop (max 16)
API Streaming Call (SSE)
Response Parsing — Tool call detected?
YesCheck permission → Execute tool → Add result to session → Next turn
NoText response → Exit loop
Final Rendering + Token Usage Logging

System Prompt Architecture — 29-Block Dynamic Assembly

Section titled “System Prompt Architecture — 29-Block Dynamic Assembly”

The first step of the turn loop — “System Prompt Assembly” — is not a simple static string. It is a dynamic context engine that conditionally assembles 29 components.

Of the 29 blocks, 11 are Always blocks included in every session, and 18 are Conditional blocks toggled by feature flags.

TypeCountKey Blocks
Always11Intro, System Rules, Doing Tasks, Executing Actions with Care, Using Your Tools, Tone and Style, Output Efficiency, Shell Shortcut, Environment Info, Summarize Tool Results, etc.
Conditional18Agent Tool, Skills, Memory, MCP Instructions, Git Status, Effort Level, Verification Agent, etc.

The Always blocks alone complete Claude Code’s base personality — Intro establishes model identity, System Rules sets tool rules, and Doing Tasks instills coding philosophy. Conditional blocks can completely change the prompt shape at runtime via a single feature flag.

The physical ordering of blocks determines the model’s behavioral priority. The inflection point is the Cache Boundary Marker.

SYSTEM PROMPT BLOCK LAYOUT
Always Blocks (cacheable region)Identity → Rules → Tool usage → Tone. Cached identically on every API call
SYSTEM_PROMPT_DYNAMIC_BOUNDARYAbove = global cache region / Below = per-session dynamic region
Conditional Blocks (dynamic region)Agent Tool, Skills, Memory, MCP Instructions — included or excluded per session
Git Status + Append System PromptIntentionally placed last — user customizations can override existing rules

Variation — Same Block, Different Content

Section titled “Variation — Same Block, Different Content”

Many blocks carry a varies tag. The same block produces different text depending on context:

BlockCondition ACondition B
Introoutput_style unset → “an agent that helps with software engineering tasks”set → “an agent that responds according to Output Style”
Output EfficiencyInternal user → “think of yourself as writing for a human”External user → “get to the point, try the simplest approach first”
Agent Toolfork mode on → “run a clone of yourself in background”off → “spawn a specialized agent”
Environment InfoNormal → show model name and versionundercover mode → hide all model information

1. Cache Boundary Design — A single SYSTEM_PROMPT_DYNAMIC_BOUNDARY marker separates the global cache from the dynamic region. Always blocks that repeat on every API call are cached to reduce costs.

2. Verification Agent — A Conditional block that instructs spawning an independent verification agent when 3+ files are modified. Self-verification enforced at the prompt level.

3. Microcompact + Summarize Tool Results — Old tool results are auto-deleted, but a paired instruction ensures important information is first recorded as text. This is Anthropic’s tool result clearing strategy implemented in code.

4. Effort Level (formerly Token Budget) — The --effort flag sets one of four levels: low / medium / high / max. Lower levels are faster and cheaper; higher levels provide deeper reasoning for complex problems. max is Opus 4.6 only and places no constraint on thinking tokens. Also configurable via the /effort command, CLAUDE_CODE_EFFORT_LEVEL environment variable, or the effortLevel settings field. Cost capping is now separate via --max-budget-usd.

WeekConnection
Week 5Cache Boundary = maximizing token efficiency. Microcompact = prompt-level defense against Context Rot
Week 6Always blocks = the skeleton of the “Sign” metaphor. Variation = the actual mechanism of instruction tuning
Week 7Verification Agent = prompt implementation of multi-agent self-verification
Week 12Effort Level = runtime control of cost/quality trade-offs. Cost capping separated into --max-budget-usd

Seven elements are recorded per turn:

  1. input — Original prompt
  2. response — Generated response
  3. slash commands — Detected slash commands
  4. invoked tools — List of called tools
  5. permission blocks — Denied permissions
  6. token usage — input/output/cache token counts
  7. termination reasoncompleted | max_turns_reached | max_budget_reached

API responses are delivered via SSE (Server-Sent Events). Six event types:

EventTiming
message_startResponse begins
command_matchSlash command detected
tool_matchTool call detected
permission_denialPermission denied
message_deltaText chunk streaming
message_stopResponse complete
WeekConnection
Week 4The Ralph Loop’s “execute → verify → retry” operates on top of the Query Engine turn loop. max_turns is a form of backpressure
Week 5compact_after_turns and the token budget are the first line of defense against Context Rot
Week 12TurnResult’s 7 elements are the starting point for telemetry design

4. Tool System — 3-Layer Composition of 40 Tools

Section titled “4. Tool System — 3-Layer Composition of 40 Tools”

The reason Claude Code can actually “act.” An LLM without tools can only generate text, but an LLM with tools can modify files, execute commands, and call external services.

ToolFunction
BashSandboxed shell execution. Max timeout 600s, output truncated at 16KiB, UTF-8 boundary correction
PowerShellWindows shell. Same safety measures applied

Tools are composed in three layers:

TOOL 3-LAYER COMPOSITION
Layer 1: BaseCompile-time static definitions40 built-in tools — ToolSpec (name, description, JSON schema, required permissions)
Name collision check
Layer 2: PluginUser-installed extensionsHook-based lifecycle: pre_tool_use → execution → post_tool_use / post_tool_use_failure
Name collision check
Layer 3: RuntimeMCP server dynamic registrationmcp__{server}__{tool} naming — auto-registered on server connection

Key Design Decisions:

  • Higher-layer tools that collide in name with lower layers are rejected — ensuring built-in tool stability
  • In simple mode, only Bash, FileRead, FileEdit — 3 tools are exposed — principle of least privilege
  • Rejected tools are completely removed from the system prompt — the LLM doesn’t even know they exist
WeekConnection
Week 3The mechanism by which MCP servers dynamically register tools at Layer 3
Week 6Simple mode’s tool filtering = mechanical implementation of instruction tuning
Week 7Restricting accessible tools to sub-agents via allowedTools → role separation

5. Permission Model — 3 Modes × 4 Layers

Section titled “5. Permission Model — 3 Modes × 4 Layers”

A deterministic gate controlling the agent’s scope of action. The actual implementation of Week 2 governance and Week 3 TBAC.

ModeAllowed ScopeUse Case
ReadOnlyRead/search onlyCode analysis, explanation requests
WorkspaceWrite (default)Write within workspace + sandboxed commandsGeneral development
DangerFullAccessFull accessSystem administration (dangerous)
  1. Deny List (Tool Hiding) — Remove tools from the system prompt. The LLM doesn’t know they exist. frozenset-based O(1) matching + prefix scanning.
  2. Permission Policy (Per-tool Override)BTreeMap specifying the required mode for each tool. Example: bash → DangerFullAccess.
  3. CLI Mode Preset — Set the session-wide mode with the --permission-mode flag.
  4. Workspace Boundary (File Boundary) — Blocks symlink escapes on file writes, validates ../ escape attempts, canonical path comparison, binary file detection.
AUTHORIZATION FLOW
authorize(tool_name, input)
Mode Lookuptool → required_mode
AllowPass immediately
DenyReturn reason (tool, active_mode, required_mode, reason)
PromptRequest user approval → Allow/Deny decision
ReadOnly AllowedBlocked
ls, cat, grep, git status, find, wcrm -rf, mkfs, reboot, chmod, pipe chains

Dangerous commands are evaluated by a heuristic risk score, and only safe commands on the allowlist pass in ReadOnly mode.

WeekConnection
Week 2The actual code implementation of Governance-as-Code
Week 3Runtime enforcement of TBAC (Tool-Based Access Control)
Week 6Permission constraints = mechanical means of agent behavior correction

The mechanism for preventing Context Rot in long-running sessions. Shows how Claude Code implements the theory from Week 5.

Higher priority overrides lower:

PriorityPathScope
3 (high)<workspace>/CLAUDE.mdWorkspace local (personal)
2<project>/.claude/CLAUDE.mdProject (team-shared)
1 (low)~/.claude/CLAUDE.mdUser global (all projects)

Size Limits:

  • Single instruction file: max 4,000 characters (MAX_INSTRUCTION_FILE_CHARS)
  • Total instructions combined: max 12,000 characters (MAX_TOTAL)

In addition, MEMORY.md (persistent memory) and AGENTS.md (agent role definitions) are appended to the system prompt.

Conversations are automatically compressed as they grow longer:

AUTO-COMPACTION ALGORITHM
TriggerTurn count > compact_after_turns (default 12)
PreserveKeep the most recent 4 messages
Compaction Target (remainder)
  • Markdown → plaintext conversion
  • Remove duplicate tool results
  • Remove timestamps/metadata
  • Truncate to 10,000 token ceiling
ResultReconstruct with summary prompt + most recent 4 messages

Manual trigger: /compact command

Four types of tokens are tracked per API call:

Token TypeDescription
input_tokensInput tokens
cache_creation_input_tokensCache creation tokens
cache_read_input_tokensCache read tokens
output_tokensOutput tokens

Price per model (per 1M tokens):

ModelInputOutput
Sonnet$3$15
Haiku$1$5
Opus$5$25

Use the /cost command to check cumulative session costs in real time.

WeekConnection
Week 5The actual implementation of Context Rot prevention. Auto-compaction = deterministic context management
Week 6CLAUDE.md 3-level hierarchy = implementation of the “Sign” metaphor. Global/project/local constraints
Week 12Token tracking as the foundation for cost optimization telemetry

The layer that connects external tools and data sources via a standard protocol. Shows how the MCP theory learned in Week 3 is implemented inside Claude Code.

TransportCommunicationUse CaseConfiguration
Stdiostdin/stdoutLocal processes (most common)command, args, env
SSEHTTP streamingRemote serversURL, headers
HTTPRESTStandard APIsURL, headers
WebSocketBidirectionalReal-time communicationURL
SDKIn-processNo networkDirect calls
ClaudeAiProxyProxyclaude.ai hosted serversWrapped URL
MCP SERVER LIFECYCLE
Configured
spawn process (Stdio) or connect (HTTP/WS)
Initializing
JSON-RPC initialize handshake (protocol version negotiation)
Connected
tools/list + resources/list (tool/resource discovery)
In Use
shutdown signal or error
Terminated

MCP uses JSON-RPC 2.0 as its wire protocol:

// Request
{"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}}
// Response
{"jsonrpc": "2.0", "id": 1, "result": {"tools": [...]}}
// Error
{"jsonrpc": "2.0", "id": 1, "error": {"code": -32600, "message": "Invalid request"}}

Key methods:

  • initialize — Handshake, capability negotiation
  • tools/list — Available tool list
  • tools/call — Tool execution
  • resources/list — Resource list
  • resources/read — Resource reading

MCP server tools are identified by a naming convention:

mcp__{normalized_server_name}__{tool_name}

Example: mcp__my_database__query, mcp__github__create_pull_request

claude.ai hosted servers use the "claude.ai " prefix: mcp__claude_ai_GitHub__search_code

SETTINGS 3-SOURCE HIERARCHY
~/.claude/settings.jsonUser global
deep merge
<project>/.claude/settings.jsonProject
deep merge
.claude/settings.local.jsonLocal (gitignore)
  • The most specific setting takes precedence
  • FNV-1a hashing detects duplicate servers — even if the same server is defined at multiple levels, it connects only once
  • Can include OAuth settings: client_id, callback_port, auth_server_metadata_url

Even if some MCP servers fail, the rest continue operating normally:

Failure TypeClassificationResponse
Process start failurestartupOther servers operate normally
Handshake failurehandshakeReconnection attempt
Config errorconfigError message + skip
Partial startpartialRegister only available tools
WeekConnection
Week 3Actual implementation of the MCP protocol. Transport selection, lifecycle management, degraded mode
Week 7Differentiating tool access permissions per agent via MCP

Claude Code’s multi-agent execution model. The technical foundation for the Week 7 Multi-Agent SDLC.

The Agent tool creates independent processes:

ParameterDescription
promptTask description to pass to the sub-agent
modelModel selection: sonnet (standard), opus (complex), haiku (lightweight)
isolation: "worktree"Isolated execution in a git worktree — no impact on parent workspace
run_in_backgroundAsynchronous execution, notification on completion
modePermission mode (plan, auto, default, etc.)

Instead of natural language prompts, tasks are assigned to agents via structured packets:

TaskPacket {
objective: "사용자 인증 모듈 리팩터링"
scope: Module // Workspace | Module | SingleFile | Custom
branch_policy: "feature/auth-refactor"
acceptance_tests: ["pytest tests/auth/", "mypy src/auth/"]
commit_policy: "conventional commits"
escalation_policy: "3회 실패 시 중단 후 보고"
}

Benefits: Eliminates natural language ambiguity, enables logging/retry/transformation, and clarifies contracts between agents.

FeatureToolDescription
Team creationTeamCreateCompose agent teams, assign task IDs
Team deletionTeamDeleteDisband teams
Scheduled executionCronCreateRegister schedules with cron expressions + handlers
Schedule listingCronListList active schedules
Schedule deletionCronDeleteRemove schedules

Teams coordinate multiple tasks, and Cron is the foundation for periodic execution like /loop.

WeekConnection
Week 4Cron registry = scheduling foundation for the /loop command
Week 7Agent tool + worktree isolation = technical foundation of multi-agent pipelines
Week 8-9Task Packet = artifact contract between Planner → Coder → QA

The settings hierarchy that determines Claude Code’s behavior.

Higher overrides lower:

  1. Runtime CLI flags--permission-mode, --model, --allowedTools (highest priority)
  2. Local config.claude/settings.local.json (gitignore, personal)
  3. Project config.claude/settings.json or .claude.json (team-shared)
  4. User config~/.claude/settings.json (global)
  5. Compiled defaults — Defaults built into the code (lowest priority)

Features activated/deactivated via settings:

SubsystemSetting KeyDescription
HookshooksTrigger shell commands on tool call events
PluginspluginsEnable extension plugins
MCPmcpServersMCP server connection settings
OAuthoauthOAuth authentication settings
SandboxsandboxFilesystem isolation mode
ItemDefault
Base URLhttps://api.anthropic.com
API Version2023-06-01
Max retries2
Initial backoff200ms
Max backoff2s
Retryable codes408, 409, 429, 500, 502, 503, 504

Four authentication methods: None, ApiKey, BearerToken, ApiKey+Bearer. Loaded from environment variable ANTHROPIC_API_KEY or ~/.claude/credentials.json.


LayerCore ConceptsMost Relevant Week
Bootstrap & Session7-step initialization, session persistenceWeek 4 (loop lifecycle)
Query EngineTurn loop, budget control, TurnResult, 29-block system prompt assemblyWeek 4, Week 5, Week 6
Tool System40 tools, 3-Layer, ToolSpecWeek 3 (MCP), Week 7
Permission Model3 modes, 4-layer security, Bash safetyWeek 2, Week 6
Context ManagementCLAUDE.md hierarchy, compaction, token trackingWeek 5, Week 6, Week 12
MCP Integration6 transports, lifecycle, degraded modeWeek 3
Agent OrchestrationSub-agents, worktree, Task PacketWeek 7, Week 8-9