Lab 04: Ralph Loop Implementation
Intermediate
Due: 2026-03-31
Objectives
- Understand and implement the Ralph loop (an iterative autonomous coding loop)
- Control an AI coding CLI in non-interactive mode via `harness.sh`
- Apply `PROMPT.md` + `AGENTS.md` authoring principles
- Ensure loop stability with Garbage Collection and Backpressure
Ralph Loop Overview
The Ralph loop is an autonomous coding workflow that operates on a “Run → Assess → Loop → Progress → Halt” cycle. It runs the AI coding CLI in headless mode and uses test results as feedback for iteration.
PROMPT.md + AGENTS.md
↓
Claude Code (code changes)
↓
Run Tests
✓ Pass → Done
✗ Fail → GC → Update AGENTS.md → Retry
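The cycle in the diagram can be sketched as a small control loop. This is a minimal illustration, not the lab's harness: `ralph_loop` and its three callables (`run_agent`, `tests_pass`, `rollback`) are hypothetical names standing in for the CLI invocation, the test run, and the GC step.

```python
from typing import Callable


def ralph_loop(
    run_agent: Callable[[int], None],
    tests_pass: Callable[[], bool],
    rollback: Callable[[], None],
    max_iter: int = 10,
) -> int:
    """Return the iteration on which tests passed, or 0 if the budget ran out."""
    for iteration in range(1, max_iter + 1):
        run_agent(iteration)   # Run: one headless agent invocation
        if tests_pass():       # Assess: test results are the feedback signal
            return iteration   # Halt: success
        rollback()             # GC: discard the failed attempt, then loop again
    return 0                   # Halt: iteration budget exhausted


if __name__ == "__main__":
    # Toy stand-ins: the "agent" only succeeds on its third attempt.
    state = {"attempts": 0, "rollbacks": 0}

    def fake_agent(iteration: int) -> None:
        state["attempts"] = iteration

    def fake_tests() -> bool:
        return state["attempts"] >= 3

    def fake_rollback() -> None:
        state["rollbacks"] += 1

    print(ralph_loop(fake_agent, fake_tests, fake_rollback, max_iter=5))
```

Passing the three steps in as callables keeps the control flow testable without invoking a real CLI or touching a repository.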
Implementation Requirements
1. PROMPT.md: Agent Instructions
An effective PROMPT.md clearly defines the role, constraints, and completion criteria.
```markdown
# Role
You are an autonomous coding agent. Your task is to make all tests pass
by editing code in the `src/` directory, without modifying test files.

**IMPORTANT: Read `AGENTS.md` first.** It contains learned patterns
from previous iterations and known failure modes.

# Constraints
- Do NOT modify any file under `tests/`
- Do NOT install new packages without checking `requirements.txt` first
- Write minimal, focused changes; do not refactor unrelated code
- If stuck after 3 attempts on the same error, write your analysis to `fix_plan.md` and stop

# Completion Criteria
All pytest tests pass. Run `pytest tests/ -q` to verify.
Write a brief summary to `DONE.md` when complete.

# Working Notes
- Check `AGENTS.md` for learned patterns and anti-patterns
- Check `fix_plan.md` for prior analysis before starting
- Use `git diff` to review your changes before each commit
- Update `AGENTS.md` with new learnings after each attempt
```

2. AGENTS.md: Accumulated Learning Log
This file records patterns learned and failure causes as the loop iterates. The agent reads this file at the start of each iteration to avoid repeating the same mistakes.
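To make the accumulation concrete, here is a rough sketch of how one entry might be added under a section of the log. In the lab the agent itself edits `AGENTS.md`; `record_learning` is a hypothetical helper used only to illustrate the file's shape, and the sample entry in the demo is invented.

```python
from pathlib import Path


def record_learning(agents_file: Path, section: str, entry: str) -> None:
    """Add `- entry` under `## section`, creating the section if it is missing."""
    lines = agents_file.read_text().splitlines() if agents_file.exists() else []
    header = f"## {section}"
    bullet = f"- {entry}"
    if bullet in lines:
        return  # already recorded; keep the log free of duplicates
    if header not in lines:
        # New section goes at the end, separated by a blank line if needed
        lines += ["", header] if lines else [header]
        lines.append(bullet)
    else:
        # The section ends at the next "## " header or at end of file
        start = lines.index(header) + 1
        end = start
        while end < len(lines) and not lines[end].startswith("## "):
            end += 1
        # Step back over trailing blank lines so the bullet joins the list
        while end > start and lines[end - 1].strip() == "":
            end -= 1
        lines.insert(end, bullet)
    agents_file.write_text("\n".join(lines) + "\n")


if __name__ == "__main__":
    import tempfile

    path = Path(tempfile.mkdtemp()) / "AGENTS.md"
    path.write_text("## Learned Patterns\n\n## Anti-Patterns\n- Do not modify files in the tests/ directory\n")
    record_learning(path, "Learned Patterns", "Raise ValueError for zero divisors")
    print(path.read_text())
```

Skipping duplicate bullets keeps the file short across many iterations, which matters because the whole file is read at the start of every run. The lab's starting template for the file follows.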
```markdown
## Learned Patterns
- (Agent adds entries after loop runs)

## Anti-Patterns
- Do not modify files in the tests/ directory
- Check requirements.txt before installing new packages

## Progress
- [ ] test_divide_normal: ?
- [ ] test_divide_zero: ?
- [ ] test_fibonacci: ?
```

3. harness.sh: Loop Harness
```bash
#!/usr/bin/env bash
set -euo pipefail

# Configuration
MAX_ITER=${MAX_ITER:-10}
SLEEP_SEC=${SLEEP_SEC:-5}
PROMPT_FILE="PROMPT.md"
LOG_FILE="harness.log"
PASS_MARKER="DONE.md"
AI_CLI="${AI_CLI:-claude}"  # Tool selection: claude, gemini, codex

# --- Git initialization (prerequisite for Garbage Collection) ---
if [ ! -d .git ]; then
  git init
  git add -A
  git commit -m "initial: harness starting point"
  echo "[harness] Git repository initialized"
fi

iter=0
echo "[harness] Ralph loop starting (max ${MAX_ITER} iterations, tool: ${AI_CLI})" | tee "$LOG_FILE"

while [ "$iter" -lt "$MAX_ITER" ]; do
  iter=$((iter + 1))
  echo "[harness] === Iteration ${iter}/${MAX_ITER} ===" | tee -a "$LOG_FILE"

  # --- Backpressure: throttle token consumption rate ---
  if [ "$iter" -gt 1 ]; then
    echo "[harness] Backpressure wait ${SLEEP_SEC}s..." | tee -a "$LOG_FILE"
    sleep "$SLEEP_SEC"
  fi

  # --- Run AI coding CLI in headless mode ---
  # CLI flags differ between tools and versions; confirm them with each tool's --help.
  # "|| true" keeps a non-zero CLI exit from aborting the loop under "set -e" + pipefail.
  case "$AI_CLI" in
    claude)
      claude --print --no-color --dangerously-skip-permissions \
        "$(cat "$PROMPT_FILE")" 2>&1 | tee -a "$LOG_FILE" || true
      ;;
    gemini)
      cat "$PROMPT_FILE" | gemini 2>&1 | tee -a "$LOG_FILE" || true
      ;;
    codex)
      codex --approval-mode full-auto \
        "$(cat "$PROMPT_FILE")" 2>&1 | tee -a "$LOG_FILE" || true
      ;;
    *)
      echo "[harness] Unsupported tool: $AI_CLI" | tee -a "$LOG_FILE"
      exit 1
      ;;
  esac

  # --- Check completion condition ---
  if [ -f "$PASS_MARKER" ]; then
    echo "[harness] Completion marker detected, exiting loop" | tee -a "$LOG_FILE"
    exit 0
  fi

  # --- Run tests and check results ---
  if python -m pytest tests/ -q --tb=no 2>/dev/null; then
    echo "[harness] All tests passed, exiting loop" | tee -a "$LOG_FILE"
    exit 0
  fi

  # --- Garbage Collection: roll back failed code changes ---
  # Preserve learned content in AGENTS.md across the rollback
  if [ -f AGENTS.md ]; then
    cp AGENTS.md /tmp/AGENTS.md.bak
  fi
  echo "[harness] GC: rolling back failed changes (git checkout .)" | tee -a "$LOG_FILE"
  git checkout .
  if [ -f /tmp/AGENTS.md.bak ]; then
    cp /tmp/AGENTS.md.bak AGENTS.md
  fi

  echo "[harness] Tests not passing, proceeding to next iteration" | tee -a "$LOG_FILE"
done

echo "[harness] Maximum iterations reached: failed" | tee -a "$LOG_FILE"
exit 1
```

4. Backpressure Mechanism
Backpressure prevents the loop from calling the API excessively or getting stuck in an infinite loop. It is implemented across three layers.
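As a point of comparison for the backoff layer, the sketch below computes a capped exponential delay, where the wait doubles with each consecutive failure. It is an illustration only: `backoff_delay` and the 300-second `cap` are assumptions, while the 5-second base mirrors `backoff_base` in the shell fragment.

```python
def backoff_delay(failures: int, base: float = 5.0, cap: float = 300.0) -> float:
    """Seconds to wait after `failures` consecutive failures (0 means no wait)."""
    if failures <= 0:
        return 0.0
    # Double the wait for each additional failure, but never exceed the cap
    return min(cap, base * 2 ** (failures - 1))


if __name__ == "__main__":
    # Delays for 1..7 consecutive failures
    print([backoff_delay(n) for n in range(1, 8)])
    # → [5.0, 10.0, 20.0, 40.0, 80.0, 160.0, 300.0]
```

The shell fragment that follows multiplies the base by the failure count, so its delay grows linearly; either form throttles a failing loop, and the cap keeps the exponential variant from waiting indefinitely.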
```bash
# Backpressure enhancement logic to add to harness.sh

# 1. Linear backoff: wait time grows in proportion to the consecutive failure count
consecutive_failures=0
backoff_base=5

# On failure
consecutive_failures=$((consecutive_failures + 1))
sleep_time=$((backoff_base * consecutive_failures))
echo "[harness] Backoff wait ${sleep_time}s..." | tee -a "$LOG_FILE"
sleep "$sleep_time"

# Reset on success
consecutive_failures=0
```

```python
# backpressure.py: Python helper script
from pathlib import Path


def check_progress(log_file: str = "harness.log") -> dict:
    """Summarize loop progress from the harness log."""
    lines = Path(log_file).read_text().splitlines()
    # Count only the iteration banners written by harness.sh
    iterations = [l for l in lines if "=== Iteration" in l]

    return {
        "total_iterations": len(iterations),
        "last_10_lines": lines[-10:],
        "is_stalled": _detect_stall(lines),
    }


def _detect_stall(lines: list[str], window: int = 20) -> bool:
    """Report a stall when the recent error lines are all identical."""
    recent = lines[-window:]
    error_lines = [l for l in recent if "ERROR" in l or "FAILED" in l]
    if len(error_lines) < 3:
        return False
    # Stall if the same error repeats 3 or more times
    return len(set(error_lines)) == 1
```

5. Target Code for the Lab
Use the Ralph loop to fix the following buggy code.
```python
# src/calculator.py (contains bugs)
def divide(a: float, b: float) -> float:
    return a / b  # ZeroDivisionError not handled


def fibonacci(n: int) -> int:
    if n <= 0:
        return 0
    elif n == 1:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)  # n=2 case handled, performance issue
```

```python
# tests/test_calculator.py (do not modify)
import pytest
from src.calculator import divide, fibonacci


def test_divide_normal():
    assert divide(10, 2) == 5.0


def test_divide_zero():
    with pytest.raises(ValueError, match="Cannot divide by zero"):
        divide(10, 0)


def test_fibonacci():
    assert fibonacci(10) == 55
    assert fibonacci(0) == 0
```

- Create the lab directory structure (`src/`, `tests/`, `PROMPT.md`, `AGENTS.md`)
- Grant execution permission to `harness.sh`: `chmod +x harness.sh`
- Run the loop: `MAX_ITER=5 bash harness.sh`
- Check the iteration flow in `harness.log`
- Verify GC behavior: check the `GC: rolling back` entries in `harness.log` and inspect the history with `git log --oneline` (the rollback itself is a `git checkout .`, which creates no commit)
- Confirm that learned content has accumulated in `AGENTS.md`
- Review `DONE.md` and the modified `src/calculator.py` after completion
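For the log-checking steps above, a short script can pull out the iteration count and rollbacks from `harness.log`. This is a sketch that relies on the message formats printed by `harness.sh`; `summarize_log` is a hypothetical helper, and the sample text in the demo is made up for illustration rather than taken from a real run.

```python
import re

ITERATION_RE = re.compile(r"=== Iteration (\d+)/(\d+) ===")


def summarize_log(text: str) -> dict:
    """Summarize harness-emitted lines; agent output in the log is ignored."""
    harness = [ln for ln in text.splitlines() if ln.startswith("[harness]")]
    iterations = [int(m.group(1)) for ln in harness if (m := ITERATION_RE.search(ln))]
    return {
        "iterations": max(iterations, default=0),
        "rollbacks": sum("GC: rolling back" in ln for ln in harness),
        "passed": any(
            "All tests passed" in ln or "Completion marker detected" in ln
            for ln in harness
        ),
        "exhausted": any("Maximum iterations reached" in ln for ln in harness),
    }


if __name__ == "__main__":
    # Made-up sample in the harness's message format
    sample = "\n".join([
        "[harness] Ralph loop starting (max 5 iterations, tool: claude)",
        "[harness] === Iteration 1/5 ===",
        "agent output mentioning GC: rolling back (not a harness line)",
        "[harness] GC: rolling back failed changes (git checkout .)",
        "[harness] === Iteration 2/5 ===",
        "[harness] All tests passed",
    ])
    print(summarize_log(sample))
```

Filtering on the `[harness]` prefix matters because the agent's own output is also written to the log and could otherwise inflate the counts.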
Deliverables
Submit a PR to `assignments/lab-04/[student-id]/`:
- `PROMPT.md`: Includes role/constraints/completion criteria, references `AGENTS.md`
- `AGENTS.md`: Accumulated learning log after loop runs (should show at least 2 failures followed by success)
- `harness.sh`: Complete implementation with Backpressure + Garbage Collection
- `backpressure.py`: Stall detection logic
- `harness.log`: Actual execution log (minimum 3 iterations)
- `src/calculator.py`: Final version modified by the agent
- `README.md`: Record of how many iterations it took to pass tests, GC behavior confirmation, whether a stall occurred
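Before opening the PR, it may help to confirm that every file in the list above is present. The following is a minimal sketch; `missing_deliverables` is a hypothetical helper, and the file names are copied from the deliverables list.

```python
from pathlib import Path

# File names taken from the deliverables list
REQUIRED = [
    "PROMPT.md",
    "AGENTS.md",
    "harness.sh",
    "backpressure.py",
    "harness.log",
    "src/calculator.py",
    "README.md",
]


def missing_deliverables(root: Path) -> list[str]:
    """Return the required files that do not exist under `root`."""
    return [name for name in REQUIRED if not (root / name).is_file()]


if __name__ == "__main__":
    # Check the current directory, assumed to be the submission folder
    missing = missing_deliverables(Path("."))
    print("All deliverables present" if not missing else f"Missing: {missing}")
```

This only checks that the files exist; the content requirements (for example, at least 3 iterations in `harness.log`) still need a manual review.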