Claude Code Internals

Why Understand the Internals

Claude Code is the de facto standard harness for agentic coding. Yet most users treat it as a black box — “put in a prompt, get code out.”

Harness engineers are different. They open the black box to understand why it was designed this way, then apply those principles to their own systems. Understanding Claude Code’s internals enables you to:

Week 4 Loop Paradigm: Recognize that the Query Engine’s turn loop is the foundation of the Ralph Loop
Week 5 Context Management: Understand the actual mechanism by which auto-compaction prevents Context Rot
Week 6 Instruction Tuning: See that the CLAUDE.md hierarchy is the implementation of the “Sign” metaphor
Week 7 Multi-Agent: Learn that the Agent tool’s worktree isolation is the technical foundation for role separation

This document is based on official documentation, CLI behavior observation, system prompt analysis, and the MCP public specification.

1. System Architecture Overview

Claude Code has a dual-layer architecture:

Orchestration Layer: Session management, command routing, LLM interaction (60+ modules)
Execution Layer: API communication, tool execution, terminal rendering, security (multi-crate workspace)

SYSTEM ARCHITECTURE

CLI / REPLUser Interface

Commands Dispatcher15+ slash commands — /help, /status, /compact, /cost …

Query EngineConversation loop, turn management, budget control

Tool Registry (40 tools)Base + Plugin + Runtime(MCP)

Permission Enforcer3-mode × 4-layer authorization gate

Bash / File I/O / MCP / LSP / SandboxActual system manipulation

OS / Filesystem / NetworkHardware boundary

2. Bootstrap & Session Lifecycle

7-Step Bootstrap

A Claude Code session doesn’t just “start when you open the terminal.” It goes through a 7-step initialization:

Prefetch — Pre-load environment variables and authentication tokens
Warning Handler — Register warning handlers (trust prompt detection)
CLI Parse — Parse and validate command-line arguments
Concurrent Setup — Run workspace discovery and command loading in parallel
Deferred Init — Deferred initialization (MCP server connections, plugin loading)
Mode Routing — Determine execution mode (REPL / one-shot / resume / headless)
Query Engine Loop — Enter the conversation loop

Session Persistence

Session state is saved as JSON in .claude/sessions/
Restore the last session with --resume latest
State sharing between sessions happens only through the filesystem: CLAUDE.md, MEMORY.md, state tracking files
This is the technical foundation for the Ralph Loop’s “session end → read state files in new session” pattern

Course Connections

Week	Connection
Week 4	The Ralph Loop’s “start session → work → end → new session” cycle operates on top of this lifecycle
Week 5	The session persistence mechanism is the foundation for Context Rot prevention — new sessions start with a clean context

3. Query Engine — The Heart of the Conversation Loop

Takes a user message, executes tools, and generates a response in an iterative loop. This is the core engine of Claude Code.

Default Settings

Parameter	Default	Description
`max_turns`	8	Maximum turns per query (1 tool call = 1 turn)
`max_budget_tokens`	2,000	Token budget ceiling per turn
`compact_after_turns`	12	Threshold for triggering auto-compaction
ConversationRuntime max	16	Internal iteration ceiling

Turn Loop Flow

QUERY ENGINE TURN LOOP

User Input

↓

System Prompt AssemblyCLAUDE.md + tool list + session state

↓

Append Message to Session History

↓

Turn Loop (max 16)

API Streaming Call (SSE)

↓

Response Parsing — Tool call detected?

YesCheck permission → Execute tool → Add result to session → Next turn

NoText response → Exit loop

↓

Final Rendering + Token Usage Logging

System Prompt Architecture — 29-Block Dynamic Assembly

The first step of the turn loop — “System Prompt Assembly” — is not a simple static string. It is a dynamic context engine that conditionally assembles 29 components.

Always vs Conditional Blocks

Of the 29 blocks, 11 are Always blocks included in every session, and 18 are Conditional blocks toggled by feature flags.

Type	Count	Key Blocks
Always	11	Intro, System Rules, Doing Tasks, Executing Actions with Care, Using Your Tools, Tone and Style, Output Efficiency, Shell Shortcut, Environment Info, Summarize Tool Results, etc.
Conditional	18	Agent Tool, Skills, Memory, MCP Instructions, Git Status, Effort Level, Verification Agent, etc.

The Always blocks alone complete Claude Code’s base personality — Intro establishes model identity, System Rules sets tool rules, and Doing Tasks instills coding philosophy. Conditional blocks can completely change the prompt shape at runtime via a single feature flag.

Block Order = Priority

The physical ordering of blocks determines the model’s behavioral priority. The inflection point is the Cache Boundary Marker.

SYSTEM PROMPT BLOCK LAYOUT

Always Blocks (cacheable region)Identity → Rules → Tool usage → Tone. Cached identically on every API call

↓

SYSTEM_PROMPT_DYNAMIC_BOUNDARYAbove = global cache region / Below = per-session dynamic region

↓

Conditional Blocks (dynamic region)Agent Tool, Skills, Memory, MCP Instructions — included or excluded per session

↓

Git Status + Append System PromptIntentionally placed last — user customizations can override existing rules

Variation — Same Block, Different Content

Many blocks carry a varies tag. The same block produces different text depending on context:

Block	Condition A	Condition B
Intro	`output_style` unset → “an agent that helps with software engineering tasks”	set → “an agent that responds according to Output Style”
Output Efficiency	Internal user → “think of yourself as writing for a human”	External user → “get to the point, try the simplest approach first”
Agent Tool	fork mode on → “run a clone of yourself in background”	off → “spawn a specialized agent”
Environment Info	Normal → show model name and version	undercover mode → hide all model information

Practical Design Patterns

1. Cache Boundary Design — A single SYSTEM_PROMPT_DYNAMIC_BOUNDARY marker separates the global cache from the dynamic region. Always blocks that repeat on every API call are cached to reduce costs.

2. Verification Agent — A Conditional block that instructs spawning an independent verification agent when 3+ files are modified. Self-verification enforced at the prompt level.

3. Microcompact + Summarize Tool Results — Old tool results are auto-deleted, but a paired instruction ensures important information is first recorded as text. This is Anthropic’s tool result clearing strategy implemented in code.

4. Effort Level (formerly Token Budget) — The --effort flag sets one of four levels: low / medium / high / max. Lower levels are faster and cheaper; higher levels provide deeper reasoning for complex problems. max is Opus 4.6 only and places no constraint on thinking tokens. Also configurable via the /effort command, CLAUDE_CODE_EFFORT_LEVEL environment variable, or the effortLevel settings field. Cost capping is now separate via --max-budget-usd.

Course Connections

Week	Connection
Week 5	Cache Boundary = maximizing token efficiency. Microcompact = prompt-level defense against Context Rot
Week 6	Always blocks = the skeleton of the “Sign” metaphor. Variation = the actual mechanism of instruction tuning
Week 7	Verification Agent = prompt implementation of multi-agent self-verification
Week 12	Effort Level = runtime control of cost/quality trade-offs. Cost capping separated into `--max-budget-usd`

TurnResult — Turn Result Capture

Seven elements are recorded per turn:

input — Original prompt
response — Generated response
slash commands — Detected slash commands
invoked tools — List of called tools
permission blocks — Denied permissions
token usage — input/output/cache token counts
termination reason — completed | max_turns_reached | max_budget_reached

Streaming Event Protocol

API responses are delivered via SSE (Server-Sent Events). Six event types:

Event	Timing
`message_start`	Response begins
`command_match`	Slash command detected
`tool_match`	Tool call detected
`permission_denial`	Permission denied
`message_delta`	Text chunk streaming
`message_stop`	Response complete

Course Connections

Week	Connection
Week 4	The Ralph Loop’s “execute → verify → retry” operates on top of the Query Engine turn loop. `max_turns` is a form of backpressure
Week 5	`compact_after_turns` and the token budget are the first line of defense against Context Rot
Week 12	TurnResult’s 7 elements are the starting point for telemetry design

4. Tool System — 3-Layer Composition of 40 Tools

The reason Claude Code can actually “act.” An LLM without tools can only generate text, but an LLM with tools can modify files, execute commands, and call external services.

Tool Categories

Tool	Function
Bash	Sandboxed shell execution. Max timeout 600s, output truncated at 16KiB, UTF-8 boundary correction
PowerShell	Windows shell. Same safety measures applied

Tool	Function
Read	Line-windowed reading. 10MB limit, binary detection, PDF/image/notebook support
Write	Full file writing. Returns structured patch hunks, generates git diffs
Edit	String substitution editing. old_string → new_string. Fails on non-unique matches
NotebookEdit	Jupyter cell manipulation. replace, insert, delete

Tool	Function
Glob	File pattern matching. Max 100 results, sorted by mtime
Grep	ripgrep-based regex search. 13 parameters (context, multiline, type filter, etc.)
ToolSearch	Deferred tool schema lookup. Name/keyword search → JSON schema returned

Tool	Function
WebFetch	URL content extraction. HTML→Markdown conversion, 15-minute cache
WebSearch	Web search. Domain filtering support

Tool	Function
Agent	Sub-agent spawning. Model selection (sonnet/opus/haiku), worktree isolation, background execution
Skill	Pre-defined skill loading. Workflows combining multiple tool calls

Tool	Function
TodoWrite	Task list management. id/content/status/priority
Sleep	Pause execution (duration_ms)
SendUserMessage	Send message to user (attachments, status display)
Config	Read/write agent settings
StructuredOutput	Machine-parseable JSON output
REPL	Language-specific execution environment (Python, Node.js, etc.)

3-Layer Composition

Tools are composed in three layers:

TOOL 3-LAYER COMPOSITION

Layer 1: BaseCompile-time static definitions40 built-in tools — ToolSpec (name, description, JSON schema, required permissions)

↓Name collision check

Layer 2: PluginUser-installed extensionsHook-based lifecycle: pre_tool_use → execution → post_tool_use / post_tool_use_failure

↓Name collision check

Layer 3: RuntimeMCP server dynamic registrationmcp__{server}__{tool} naming — auto-registered on server connection

Key Design Decisions:

Higher-layer tools that collide in name with lower layers are rejected — ensuring built-in tool stability
In simple mode, only Bash, FileRead, FileEdit — 3 tools are exposed — principle of least privilege
Rejected tools are completely removed from the system prompt — the LLM doesn’t even know they exist

Course Connections

Week	Connection
Week 3	The mechanism by which MCP servers dynamically register tools at Layer 3
Week 6	Simple mode’s tool filtering = mechanical implementation of instruction tuning
Week 7	Restricting accessible tools to sub-agents via `allowedTools` → role separation

5. Permission Model — 3 Modes × 4 Layers

A deterministic gate controlling the agent’s scope of action. The actual implementation of Week 2 governance and Week 3 TBAC.

3 Permission Modes

Mode	Allowed Scope	Use Case
ReadOnly	Read/search only	Code analysis, explanation requests
WorkspaceWrite (default)	Write within workspace + sandboxed commands	General development
DangerFullAccess	Full access	System administration (dangerous)

4-Layer Security

Deny List (Tool Hiding) — Remove tools from the system prompt. The LLM doesn’t know they exist. frozenset-based O(1) matching + prefix scanning.
Permission Policy (Per-tool Override) — BTreeMap specifying the required mode for each tool. Example: bash → DangerFullAccess.
CLI Mode Preset — Set the session-wide mode with the --permission-mode flag.
Workspace Boundary (File Boundary) — Blocks symlink escapes on file writes, validates ../ escape attempts, canonical path comparison, binary file detection.

Authorization Flow

AUTHORIZATION FLOW

authorize(tool_name, input)

↓

Mode Lookuptool → required_mode

↓

AllowPass immediately

DenyReturn reason (tool, active_mode, required_mode, reason)

PromptRequest user approval → Allow/Deny decision

Bash Special Handling

ReadOnly Allowed	Blocked
`ls`, `cat`, `grep`, `git status`, `find`, `wc`	`rm -rf`, `mkfs`, `reboot`, `chmod`, pipe chains

Dangerous commands are evaluated by a heuristic risk score, and only safe commands on the allowlist pass in ReadOnly mode.

Course Connections

Week	Connection
Week 2	The actual code implementation of Governance-as-Code
Week 3	Runtime enforcement of TBAC (Tool-Based Access Control)
Week 6	Permission constraints = mechanical means of agent behavior correction

6. Context Management & Compaction

The mechanism for preventing Context Rot in long-running sessions. Shows how Claude Code implements the theory from Week 5.

CLAUDE.md 3-Level Hierarchy

Higher priority overrides lower:

Priority	Path	Scope
3 (high)	`<workspace>/CLAUDE.md`	Workspace local (personal)
2	`<project>/.claude/CLAUDE.md`	Project (team-shared)
1 (low)	`~/.claude/CLAUDE.md`	User global (all projects)

Size Limits:

Single instruction file: max 4,000 characters (MAX_INSTRUCTION_FILE_CHARS)
Total instructions combined: max 12,000 characters (MAX_TOTAL)

In addition, MEMORY.md (persistent memory) and AGENTS.md (agent role definitions) are appended to the system prompt.

Auto-Compaction Algorithm

Conversations are automatically compressed as they grow longer:

AUTO-COMPACTION ALGORITHM

TriggerTurn count > compact_after_turns (default 12)

↓

PreserveKeep the most recent 4 messages

↓

Compaction Target (remainder)

Markdown → plaintext conversion
Remove duplicate tool results
Remove timestamps/metadata
Truncate to 10,000 token ceiling

↓

ResultReconstruct with summary prompt + most recent 4 messages

Manual trigger: /compact command

Token Tracking and Cost Calculation

Four types of tokens are tracked per API call:

Token Type	Description
`input_tokens`	Input tokens
`cache_creation_input_tokens`	Cache creation tokens
`cache_read_input_tokens`	Cache read tokens
`output_tokens`	Output tokens

Price per model (per 1M tokens):

Model	Input	Output
Sonnet	$3	$15
Haiku	$1	$5
Opus	$5	$25

Use the /cost command to check cumulative session costs in real time.

Course Connections

Week	Connection
Week 5	The actual implementation of Context Rot prevention. Auto-compaction = deterministic context management
Week 6	CLAUDE.md 3-level hierarchy = implementation of the “Sign” metaphor. Global/project/local constraints
Week 12	Token tracking as the foundation for cost optimization telemetry

7. MCP Integration

The layer that connects external tools and data sources via a standard protocol. Shows how the MCP theory learned in Week 3 is implemented inside Claude Code.

6 Transport Types

Transport	Communication	Use Case	Configuration
Stdio	stdin/stdout	Local processes (most common)	command, args, env
SSE	HTTP streaming	Remote servers	URL, headers
HTTP	REST	Standard APIs	URL, headers
WebSocket	Bidirectional	Real-time communication	URL
SDK	In-process	No network	Direct calls
ClaudeAiProxy	Proxy	claude.ai hosted servers	Wrapped URL

Server Lifecycle State Machine

MCP SERVER LIFECYCLE

Configured

↓spawn process (Stdio) or connect (HTTP/WS)

Initializing

↓JSON-RPC initialize handshake (protocol version negotiation)

Connected

↓tools/list + resources/list (tool/resource discovery)

In Use

↓shutdown signal or error

Terminated

JSON-RPC 2.0 Protocol

MCP uses JSON-RPC 2.0 as its wire protocol:

// Request
{"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}}

// Response
{"jsonrpc": "2.0", "id": 1, "result": {"tools": [...]}}

// Error
{"jsonrpc": "2.0", "id": 1, "error": {"code": -32600, "message": "Invalid request"}}

Key methods:

initialize — Handshake, capability negotiation
tools/list — Available tool list
tools/call — Tool execution
resources/list — Resource list
resources/read — Resource reading

Tool Naming Convention

MCP server tools are identified by a naming convention:

mcp__{normalized_server_name}__{tool_name}

Example: mcp__my_database__query, mcp__github__create_pull_request

claude.ai hosted servers use the "claude.ai " prefix: mcp__claude_ai_GitHub__search_code

Settings 3-Source Hierarchy

SETTINGS 3-SOURCE HIERARCHY

~/.claude/settings.jsonUser global

↓deep merge

<project>/.claude/settings.jsonProject

↓deep merge

.claude/settings.local.jsonLocal (gitignore)

The most specific setting takes precedence
FNV-1a hashing detects duplicate servers — even if the same server is defined at multiple levels, it connects only once
Can include OAuth settings: client_id, callback_port, auth_server_metadata_url

Degraded Mode

Even if some MCP servers fail, the rest continue operating normally:

Failure Type	Classification	Response
Process start failure	`startup`	Other servers operate normally
Handshake failure	`handshake`	Reconnection attempt
Config error	`config`	Error message + skip
Partial start	`partial`	Register only available tools

Course Connections

Week	Connection
Week 3	Actual implementation of the MCP protocol. Transport selection, lifecycle management, degraded mode
Week 7	Differentiating tool access permissions per agent via MCP

8. Agent Orchestration

Claude Code’s multi-agent execution model. The technical foundation for the Week 7 Multi-Agent SDLC.

Agent Tool — Sub-Agent Spawning

The Agent tool creates independent processes:

Parameter	Description
`prompt`	Task description to pass to the sub-agent
`model`	Model selection: `sonnet` (standard), `opus` (complex), `haiku` (lightweight)
`isolation: "worktree"`	Isolated execution in a git worktree — no impact on parent workspace
`run_in_background`	Asynchronous execution, notification on completion
`mode`	Permission mode (plan, auto, default, etc.)

Task Packet — Structured Task Delivery

Instead of natural language prompts, tasks are assigned to agents via structured packets:

TaskPacket {
  objective:         "사용자 인증 모듈 리팩터링"
  scope:             Module           // Workspace | Module | SingleFile | Custom
  branch_policy:     "feature/auth-refactor"
  acceptance_tests:  ["pytest tests/auth/", "mypy src/auth/"]
  commit_policy:     "conventional commits"
  escalation_policy: "3회 실패 시 중단 후 보고"
}

Benefits: Eliminates natural language ambiguity, enables logging/retry/transformation, and clarifies contracts between agents.

Team & Cron Registry

Feature	Tool	Description
Team creation	`TeamCreate`	Compose agent teams, assign task IDs
Team deletion	`TeamDelete`	Disband teams
Scheduled execution	`CronCreate`	Register schedules with cron expressions + handlers
Schedule listing	`CronList`	List active schedules
Schedule deletion	`CronDelete`	Remove schedules

Teams coordinate multiple tasks, and Cron is the foundation for periodic execution like /loop.

Course Connections

Week	Connection
Week 4	Cron registry = scheduling foundation for the `/loop` command
Week 7	Agent tool + worktree isolation = technical foundation of multi-agent pipelines
Week 8-9	Task Packet = artifact contract between Planner → Coder → QA

9. Configuration System

The settings hierarchy that determines Claude Code’s behavior.

Config Priority

Higher overrides lower:

Runtime CLI flags — --permission-mode, --model, --allowedTools (highest priority)
Local config — .claude/settings.local.json (gitignore, personal)
Project config — .claude/settings.json or .claude.json (team-shared)
User config — ~/.claude/settings.json (global)
Compiled defaults — Defaults built into the code (lowest priority)

Feature-Gated Subsystems

Features activated/deactivated via settings:

Subsystem	Setting Key	Description
Hooks	`hooks`	Trigger shell commands on tool call events
Plugins	`plugins`	Enable extension plugins
MCP	`mcpServers`	MCP server connection settings
OAuth	`oauth`	OAuth authentication settings
Sandbox	`sandbox`	Filesystem isolation mode

API Client Configuration

Item	Default
Base URL	`https://api.anthropic.com`
API Version	`2023-06-01`
Max retries	2
Initial backoff	200ms
Max backoff	2s
Retryable codes	408, 409, 429, 500, 502, 503, 504

Four authentication methods: None, ApiKey, BearerToken, ApiKey+Bearer. Loaded from environment variable ANTHROPIC_API_KEY or ~/.claude/credentials.json.

Summary — 7 Layers and Course Map

Layer	Core Concepts	Most Relevant Week
Bootstrap & Session	7-step initialization, session persistence	Week 4 (loop lifecycle)
Query Engine	Turn loop, budget control, TurnResult, 29-block system prompt assembly	Week 4, Week 5, Week 6
Tool System	40 tools, 3-Layer, ToolSpec	Week 3 (MCP), Week 7
Permission Model	3 modes, 4-layer security, Bash safety	Week 2, Week 6
Context Management	CLAUDE.md hierarchy, compaction, token tracking	Week 5, Week 6, Week 12
MCP Integration	6 transports, lifecycle, degraded mode	Week 3
Agent Orchestration	Sub-agents, worktree, Task Packet	Week 7, Week 8-9