Week 6: Instruction Tuning
Theory
Why Instruction Tuning is Central in Week 6
In Week 5 we learned how to keep context clean — auto-compaction, fresh context, state files. But even with clean context, if the agent keeps making the same mistakes, it’s all for nothing.
The core question this week: Can we correct agent behavior without changing model weights?
The answer is “instruction tuning” — correcting behavior by changing the environment (the instruction file) rather than the model. If Week 4’s Ralph loop AGENTS.md (cumulative learning) was passive record-keeping, this week’s PROMPT.md/CLAUDE.md tuning is active constraint design.
When we move to Week 7 (multi-agent), each agent will need its own PROMPT.md and permissions. Before that, we master the techniques for precisely controlling a single agent’s behavior.
What is Instruction Tuning?
When an agent makes recurring mistakes in the Ralph Loop, we correct its behavior by adding specific, deterministic instructions to PROMPT.md — without retraining model weights.
The Instruction Tuning Process
Example of an Advanced PROMPT.md
```markdown
# [Permanent Constraints — Never Ignore]

## ⚠️ Known Pitfalls (Instruction Tuning)

### 1. Do not call non-existent functions
- `utils.parse_json()` does not exist in this project
- Always use `json.loads()` directly
- Added: 2026-04-02 (recurring error at loop #47)

### 2. Do not commit without tests
- `pytest tests/ -q` must pass before any `git commit`
- CI will auto-rollback on failure

### 3. Type hints are mandatory
- All functions must include Python type hints
- `def add(a, b):` → `def add(a: int, b: int) -> int:`

---

# [Current Task]
...
```

Instruction File Design Principles — Lessons from 2,500 Repositories
GitHub Blog’s analysis of 2,500 open-source repositories (2026) and ETH Zurich’s AGENTbench research empirically identified the success factors and anti-patterns of instruction files.
5 Elements of Successful Instruction Files
- Executable commands first — Place specific commands with flags (e.g., `pnpm test --coverage`) at the top of the file
- Code examples over prose — `def add(a: int, b: int) -> int:` is more effective than “write in a type-safe manner”
- Clear boundaries — Explicitly state which files/directories must never be touched
- Specific stack versions — Not “use Python” but “Python 3.12, FastAPI 0.115, Pydantic v2”
- Cover 6 key areas — Build commands, tests, project structure, coding style, git workflow, prohibited actions
3-Tier Boundary System
Section titled “3-Tier Boundary System”| Tier | Meaning | Examples |
|---|---|---|
| Always Do | Execute every time without fail | “Run pytest before committing”, “Type hints mandatory” |
| Ask First | Request confirmation before executing | “Confirm with a human before DB schema changes”, “Confirm before modifying external API keys” |
| Never Do | Absolutely prohibited | “Never commit .env files”, “Never push directly to main” |
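Written out as a CLAUDE.md section, the three tiers might look like the following sketch; the specific rules are illustrative, drawn from the examples in the table above:

```markdown
## Boundaries

### Always Do
- Run `pytest tests/ -q` before every commit
- Include type hints on all new functions

### Ask First
- Confirm with a human before any DB schema change
- Confirm before modifying external API keys

### Never Do
- Never commit `.env` files
- Never push directly to `main`
```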
The AGENTIF benchmark (Tsinghua, 2025) raises an additional concern: existing instruction-following research evaluates models with short synthetic instructions averaging 45 words, but real-world agentic task instructions run hundreds to thousands of words. A model that follows short benchmark instructions well is not guaranteed to follow long real-world instructions — keep this in mind when evaluating the effectiveness of a 200-line CLAUDE.md.
The CLAUDE.md 200-Line Rule
CLAUDE.md is injected into the system prompt every session and every turn. Therefore there is a token budget:
- The system prompt already occupies ~50 instructions
- User instruction budget: ~150–200 instructions (roughly 200 lines)
- Exceeding this budget causes the agent to start ignoring critical instructions
Pruning test: For each line, ask “Would Claude make a mistake without this?” If not, delete it.
Include: Build/test commands, non-standard coding conventions, architecture decisions, prohibited lists
Exclude: Standard language rules (handled by linters), frequently changing information, long tutorials
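The 200-line budget is easy to check mechanically. Below is a minimal sketch of such a check, assuming a plain-text CLAUDE.md; the function name and the choice to skip blank lines are our own, not part of any official tooling:

```python
from pathlib import Path

MAX_LINES = 200  # instruction budget from the rule above; adjust per project

def check_instruction_budget(path: str) -> tuple[int, bool]:
    """Return (line_count, within_budget) for an instruction file.

    Counts only non-empty lines, so blank separators don't eat the budget.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    count = sum(1 for line in lines if line.strip())
    return count, count <= MAX_LINES
```

Wiring this into CI (or a Hook) turns the pruning test from advice into an enforced gate.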
Skills — Progressive Disclosure Pattern
Skills are the solution for staying under 200 lines while still leveraging domain knowledge:
| Storage | Load Time | Purpose |
|---|---|---|
| `CLAUDE.md` | Automatic every session | Project-wide rules (under 200 lines) |
| `.claude/skills/*.md` | On-demand for relevant tasks | Domain-specific knowledge |
Example: API conventions, deployment procedures, DB migration rules don’t need to be loaded every time. Placing them in .claude/skills/ means they load only when needed — saving tokens while making specialized knowledge available.
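The on-demand loading described above can be sketched in a few lines. This is an illustrative stand-in, not Claude Code's actual loader; the `build_context` name and the keyword-matching rule are assumptions:

```python
from pathlib import Path

def build_context(task: str, root: str = ".") -> str:
    """Assemble instructions: CLAUDE.md always, skills only when relevant.

    A skill file such as `db-migration.md` is pulled in when the task text
    mentions its stem ("db-migration" -> "db migration"). The keyword match
    is a deliberately simple stand-in for the agent runtime's own
    relevance logic; the file layout follows the table above.
    """
    base = Path(root)
    parts = [(base / "CLAUDE.md").read_text(encoding="utf-8")]
    for skill in sorted((base / ".claude" / "skills").glob("*.md")):
        topic = skill.stem.replace("-", " ").lower()
        if topic in task.lower():
            parts.append(skill.read_text(encoding="utf-8"))
    return "\n\n".join(parts)
```

The point of the pattern survives even in this toy form: irrelevant tasks pay zero tokens for specialized knowledge.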
SkillReducer research (arXiv:2603.29919) found a “less-is-more” effect: compressing tool descriptions by 48% actually improved quality by 2.8%. Reducing information helps the agent focus on what matters.
Measuring Instruction Effectiveness
Section titled “Measuring Instruction Effectiveness”| Metric | Measurement Method |
|---|---|
| Recurring error rate | Number of identical error occurrences / total loop count |
| Average loop count | Number of loops required to complete a task |
| Context efficiency | Ratio of tokens wasted on unnecessary exploration |
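These metrics can be computed straight from loop logs. Below is a hedged sketch for the first metric, assuming each loop record is a dict with an `errors` list; the record shape is invented for illustration:

```python
from collections import Counter

def recurring_error_rate(loops: list[dict]) -> float:
    """Share of error occurrences that repeat an earlier error.

    Each loop record is assumed to look like {"errors": [...], "tokens": int}.
    """
    counts = Counter(err for loop in loops for err in loop["errors"])
    total = sum(counts.values())
    if total == 0:
        return 0.0
    repeats = sum(c - 1 for c in counts.values())  # every copy after the first
    return repeats / total

# Toy log: the same error appears in two consecutive loops.
logs = [
    {"errors": ["utils.parse_json missing"], "tokens": 1200},
    {"errors": ["utils.parse_json missing"], "tokens": 1100},
    {"errors": [], "tokens": 900},
]
```

Tracking this number before and after an instruction change is exactly the A/B comparison the practicum asks for.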
Output Styles and Cognitive Modes — Zero-Config Instruction Tuning
Adding constraints to PROMPT.md is like installing custom-made signs. In contrast, Claude Code’s Output Styles are pre-installed mode switches built into the agent. They change behavior patterns at zero configuration cost.
The key insight: output styles change the cognitive mode, not just the tone. Given the same task, the agent’s approach to the problem itself changes depending on the style.
```bash
claude --output-style explanatory "Improve error handling in this function"
```

The agent modifies code while explaining why it makes each change. Reasoning is already embedded in the PR, so reviewers only need to verify the decisions.
Best for: Team onboarding, junior engineers working in unfamiliar codebases, reducing code review turnaround
```bash
claude --output-style learning "Improve error handling in this function"
```

Instead of making changes directly, the agent guides step by step, encouraging the learner to make the changes themselves. Coaching mode.
Best for: Educational purposes, lab exercises where students need to understand the principles
```bash
claude --output-style concise "Improve error handling in this function"
```

Outputs code changes with minimal explanation. For experienced developers who just need results fast.
Best for: Repetitive tasks, lint fixes, applying well-known patterns
Boris Cherny’s team sets Explanatory as the default when junior engineers work in unfamiliar services. The result: PR review time dropped — reasoning arrives attached to the code, so reviewers no longer need to reconstruct the decision chain.
Effort Levels — Adjusting Reasoning Depth
If output styles change the direction, effort levels adjust the depth.
| Level | Use Case | Cost | Example |
|---|---|---|---|
| `low` | Simple lookups, type checks | Minimal | `claude --effort low "Return type of this function?"` |
| `medium` | General code changes | Moderate | `claude --effort medium "Add tests"` |
| `high` | Architecture design, complex debugging | High | `claude --effort high "Design async refactoring"` |
| `max` | Maximum reasoning depth (Opus 4.6 exclusive) | Maximum (10x+) | `claude --effort max "Security vulnerability analysis"` |
Practical effort level use in Ralph loops: start with low in early iterations (exploration phase), then switch to high for core implementation — this optimizes token cost across the loop.
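That phase-based policy can be made explicit in the loop driver. A minimal sketch follows; the iteration thresholds are assumptions to tune per project, not prescribed values:

```python
def effort_for(iteration: int, exploring: bool) -> str:
    """Pick an --effort level for a Ralph-loop iteration.

    Early or exploratory iterations stay cheap; core implementation gets
    deeper reasoning. The cutoffs (3 and 6) are illustrative.
    """
    if exploring or iteration <= 3:
        return "low"
    if iteration <= 6:
        return "medium"
    return "high"

# The chosen level would feed a CLI call such as:
#   claude --effort {level} "next task from fix_plan.md"
```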
Custom Output Style — Per-Project Cognitive Mode
When the built-in styles (Explanatory/Learning/Concise) are insufficient, you can create a custom style:
```bash
# Create a new custom style
claude /output-style:new
```

In the generated Markdown file’s frontmatter, `keep-coding-instructions: true/false` controls whether existing coding instructions are preserved or fully replaced. This is a deeper level of instruction tuning than PROMPT.md — it replaces the coding-related portion of the system prompt itself.
PROMPT.md Tuning vs Output Styles — When to Use Which
Section titled “PROMPT.md Tuning vs Output Styles — When to Use Which”| Aspect | PROMPT.md Instruction Tuning | Output Style + Effort Level |
|---|---|---|
| Setup cost | High — error analysis → writing → verification | Zero — a single CLI flag |
| Customization | Unlimited — free-text project constraints | Limited — choose from preset list |
| Persistence | Permanent — written to file, git-tracked | Per-session — must specify each time |
| Sign metaphor | Custom-made signs installed on site | Factory-installed mode switches |
| Best for | Project-specific recurring error correction | General behavior pattern changes |
Hooks — Deterministic Enforcement Beyond CLAUDE.md
CLAUDE.md is advisory — it is followed roughly 80% of the time. For 100% enforcement, use Hooks.
Hooks are an automation mechanism that triggers shell commands / HTTP / LLM judgment on tool calls and session events. They are defined in ~/.claude/settings.json or the project’s .claude/settings.json.
4 Handler Types
Section titled “4 Handler Types”| Type | Behavior | Use Case |
|---|---|---|
| command | Execute a shell command | Auto-format, lint check, log recording |
| http | HTTP endpoint POST | External service notification, CI trigger |
| prompt | Delegate judgment to LLM (Haiku) | Automatically judge “task completion” |
| agent | Subagent reads files/runs commands | Complex verification like “did tests pass” |
Real Example: Auto-Formatting with PostToolUse
Section titled “Real Example: Auto-Formatting with PostToolUse”{ "hooks": { "PostToolUse": [{ "matcher": "Write", "command": "npx prettier --write $CLAUDE_FILE_PATH" }] }}This Hook runs Prettier automatically every time Claude writes a file. It is 100% reliable compared to writing “follow Prettier format” in CLAUDE.md.
Stop Hook for Automatic Ralph Loop Completion Verification
Section titled “Stop Hook for Automatic Ralph Loop Completion Verification”{ "hooks": { "Stop": [{ "type": "prompt", "prompt": "Are all tasks complete? Check fix_plan.md for any incomplete items.", "model": "haiku" }] }}The prompt type Hook delegates judgment to the Haiku model. If it returns "ok": false, the agent continues working. This is the key pattern for automating the manual verification step in the Ralph loop.
Instruction File Comparison Across Tools
Instruction files are a common pattern across all AI coding tools, but the implementation differs per tool:
| Tool | File | Hierarchy | Notable Features |
|---|---|---|---|
| Claude Code | CLAUDE.md | 5-level + Skills | @import, advisory (~80%), supplemented by Hooks |
| Cursor | .cursor/rules/ | Directory-based | Migrated from .cursorrules, glob pattern matching |
| Windsurf | .windsurf/rules/ | Directory-based | Cascade engine auto-detects context |
| Codex CLI | AGENTS.md | Tool-neutral | 60,000+ repo adoption, 25+ agent compatible |
| GitHub Copilot | .github/copilot-instructions.md | Single + path-specific | Org-level GA (2026-04), excludeAgent support |
| Gemini CLI | GEMINI.md | Hierarchical discovery | Also reads AGENTS.md, 1M context |
| JetBrains Junie | .junie/guidelines.md | Single file | IntelliJ platform integration |
GitHub Copilot shipped organization-level custom instructions as GA in April 2026. Admins can set default instructions in a .github repo that apply across all org repositories. Additionally, .github/instructions/*.instructions.md supports path-specific instructions (YAML frontmatter + glob patterns), enabling different rules per file type.
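A path-specific instruction file under this scheme might look like the following sketch, saved as, e.g., a `*.instructions.md` file in `.github/instructions/`; treat the exact frontmatter key (`applyTo`) and rules shown as assumptions to verify against the current Copilot documentation:

```markdown
---
applyTo: "**/*.py"
---
- Python 3.12; type hints mandatory on all functions
- Run `pytest tests/ -q` before every commit
- Never hardcode credentials; load them from the environment
```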
Discussion Questions
- What is the difference between writing “Do not commit without tests” in CLAUDE.md versus using a Hook to force `pytest` execution via `PreToolUse: Bash(git commit*)`? Which is more effective, and why?
- Explain why LLM-generated instruction files are actually harmful according to the ETH Zurich research. “Adding an architecture overview seems helpful — why doesn’t it work?”
- At what point does Sign Fatigue occur? How many lines do you predict PROMPT.md must exceed before its effect reverses? Discuss in connection with the SkillReducer research.
- Output styles (Explanatory/Learning/Concise) are said to change the agent’s cognitive mode. What does “changing cognitive mode” mean technically? Which part of the system prompt is being modified?
- In Week 7 multi-agent systems, if each agent needs a different PROMPT.md, how would you design the separation of shared parts vs. role-specific parts?
- Cursor introduced `/Generate Cursor Rules` for AI-assisted rule auto-generation. How can this coexist with the ETH Zurich finding that “LLM-generated instructions are harmful”? Discuss the pros and cons of the “AI drafts, human curates” approach.
Practicum
1. Error Pattern Analysis: Analyze execution logs from the previous lab using `log_analyzer.py` (see Lab 06) to extract the top 5 recurring errors.
2. 3-Tier Boundary Design: Classify the extracted errors into Always Do / Ask First / Never Do, and add a structured instruction section to PROMPT.md.
3. Hook Configuration: Implement one of the Never Do items as a `PreToolUse` Hook to guarantee 100% enforcement. Add it to `settings.json`.
4. A/B Testing: Run the same task before and after adding instructions, comparing loop counts, token usage, and recurring error rate. Use `ab_test.py` from Lab 06.
5. Connect to Lab 06: Use the experimental results above to complete the 4 requirements in Lab 06 (analysis report, PROMPT.md, comparison experiment, graph).
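Practicum step 3 can be prototyped as a small guard script. Claude Code command hooks receive a JSON payload on stdin, and a non-zero exit code blocks the tool call; the payload field names used below should be verified against the current hooks documentation before relying on them:

```python
import json
import sys

def check_pre_tool_use(payload: dict) -> tuple[int, str]:
    """Decide whether to block a Bash tool call (PreToolUse guard sketch).

    Assumed payload shape: {"tool_input": {"command": "..."}}.
    Returns (exit_code, message); a non-zero exit code blocks the call.
    """
    command = payload.get("tool_input", {}).get("command", "")
    if "git commit" in command:
        # A real guard would run `pytest tests/ -q` here and block only
        # when it fails; this sketch blocks the commit unconditionally.
        return 2, "Blocked: run pytest before committing."
    return 0, ""

# Entry point for the hook: read the JSON payload from stdin.
#   code, msg = check_pre_tool_use(json.load(sys.stdin))
#   print(msg, file=sys.stderr); sys.exit(code)
```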
Assignment
Lab 06: Instruction Tuning Practice
Submission deadline: 2026-04-14 23:59
Requirements:
- Recurring error analysis report (minimum 3 patterns, including 3-tier classification)
- Enhanced `PROMPT.md` (instruction section + Always Do / Never Do structure)
- Experimental results comparing before and after tuning (A/B test)
- Quantitative graph measuring instruction effectiveness
Key Summary
- Instruction tuning = installing signs + pruning: Includes not just adding but also pruning and reprioritizing to prevent Sign Fatigue
- Do not generate instruction files with LLMs: ETH Zurich research — 3% success rate decrease, 20% cost increase. Humans must write only “information the agent cannot infer by reasoning”
- CLAUDE.md under 200 lines: instruction budget ~150–200. Use the pruning test to remove unnecessary lines
- Separate with Skills: Domain knowledge goes in `.claude/skills/` for on-demand loading. The less-is-more effect from SkillReducer
- Advisory vs Deterministic: CLAUDE.md (~80% compliance) + Hooks (100% enforcement). Choose the level based on importance
- 5-level hierarchy: Global → project → local → parent directory → child directory. The closest one takes priority
- Karpathy Guidelines: Surface assumptions → Simplicity first → Surgical changes → Goal-driven execution. A practical catalog of PROMPT.md instructions
- AGENTS.md is an AAIF standard — A Linux Foundation standard with 146+ member organizations. 60,000+ repo adoption. Claude Code does not yet support it natively, so a dual-file strategy (AGENTS.md + CLAUDE.md) is needed.
- Triple-layer defense (defense-in-depth) — Advisory (CLAUDE.md, ~80%) + Deterministic (Hooks, 100%) + Sandboxing. A denylist bypass case (2026-03) empirically demonstrates the need for multi-layered defense.
Further Reading
- Karpathy Guidelines — Four behavioral correction principles and anti-pattern catalog for LLM coding agents
- How to Write a Great agents.md — GitHub Blog — 2,500 repo analysis, 5 success factors
- ETH Zurich AGENTbench — InfoQ — Research on the negative effects of LLM-generated instruction files
- Claude Code Hooks Guide — Official docs for 24 event types, 4 handler types
- SkillReducer (arXiv:2603.29919) — 48% tool description compression yields 2.8% quality improvement
- Agentic Coding — MIT Missing Semester — MIT regular course covering instruction files, skills, and subagents
- Agentic AI Foundation (AAIF) — Linux Foundation body governing MCP, AGENTS.md, and goose. 146+ member organizations
- Anthropic: Context Engineering for AI Agents — The shift from “prompt engineering” to “context engineering”
- GitHub Copilot Organization Instructions (GA 2026-04) — Organization-level custom instructions
- JetBrains: Coding Guidelines for AI Agents — Guide for writing coding guidelines for AI agents