Assignment Submissions

This page summarizes student assignments submitted via GitHub PRs. See the Contributing Guide for submission instructions.


Student ID  W01         W02         W03         W04         W05         W06         W07         W08
202021016   -           -           -           Submitted*  -           -           -           -
202021035   Submitted   -           -           Submitted   -           -           -           -
202121014   Submitted   -           -           Submitted   -           -           -           -
202321005   Submitted   -           Submitted   Submitted   -           -           -           -
202321006   Submitted   Submitted   Submitted   Submitted   Submitted   Submitted   Submitted   Submitted
202321010   Submitted   -           Submitted   Submitted   Submitted   Submitted   Submitted   -

Student ID  Lab 02      Lab 03      Lab 05      Lab 06      Lab 07      Capstone
202121014   Submitted   Submitted   Submitted   Submitted   Submitted   -
202321005   -           -           -           -           -           Submitted
202321006   -           -           -           -           -           Submitted
202321010   -           -           -           -           -           Submitted

  • Submitted: meets assignment requirements
  • Submitted*: submitted but needs supplementation
  • -: no submission recorded

Week 01: AI Coding CLI Installation and Execution

Assignment: Install an AI coding CLI (Claude Code, Gemini CLI, or Codex CLI), create hello_agent.py, document installation issues and solutions

202021035

Tool: Codex CLI (Windows)

Resolved a PowerShell execution policy error (Set-ExecutionPolicy) and adapted environment variable setup for Windows (export → set). Includes a systematic troubleshooting report.

202121014

Tool: Codex CLI

Verified Node.js version compatibility (v24 vs recommended v20). Selected ChatGPT account login for authentication.

202321005

Tool: Claude Code

Resolved an SSH "Permission denied" error (ssh-keygen + ssh-copy-id). Fixed the CLI not being registered on PATH.

202321006

Tool: Codex CLI

Implemented a Pygame pinball game. Resolved Python 3.14 distutils removal issue using uv run --python 3.12.

202321010

Tool: Gemini CLI

Batch submission of Weeks 01–04. Includes installation screenshots and hello_agent.py.


Assignment: HOTL governance, MIG virtual simulation

202321006

Designed virtual MIG data structures with FastMCP. Overcame MIG non-support on RTX 3060 through simulation. Implemented TBAC-based role access control (Professor/Student).
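A role check of the kind described above can be sketched in a few lines. This is only an illustration: the tool names and the permission table are hypothetical, and the submission's actual policy is not shown on this page.

```python
# Hypothetical role-to-tool permissions for a simulated MIG server.
# The tool names are illustrative, not taken from the submission.
ROLE_PERMISSIONS = {
    "professor": {"list_mig_instances", "create_mig_instance", "destroy_mig_instance"},
    "student": {"list_mig_instances"},
}


def is_allowed(role: str, tool: str) -> bool:
    """Allow a call only if the role is known and explicitly granted the tool."""
    return tool in ROLE_PERMISSIONS.get(role, set())
```

Unknown roles fall through to an empty permission set, so the default is deny.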


Week 03: MCP Server Implementation and Security Verification

Assignment: FastMCP server implementation (Tool + Resource + Prompt), MIG profile analysis, governance gateway

202321005

MCP server (mcp_server.py) + Governed Gateway (mcp_gateway.py). Includes MIG analysis report, K8s nodeAffinity example, Llama-3-8B benchmark, TBAC/SANDWORM security reports, MCP Inspector capture JSONs. Most comprehensive submission.

202321006

MIG profile partition analysis (Strategy A vs B comparison), TBAC 3-layer architecture (Mermaid diagrams), McpInject attack simulation and defense strategies, Llama-3-8B 4-bit VRAM calculation (~6GB), mig_monitor_server.py JSON-RPC verification.
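The ~6GB figure can be sanity-checked with a back-of-the-envelope estimate. The breakdown below is an assumption, not the student's calculation: it uses the published Llama-3-8B shape (32 layers, 8 KV heads, head dimension 128), an fp16 KV cache, an assumed 8192-token context, and an assumed 1 GB allowance for runtime overhead. Quantization scale metadata is ignored.

```python
PARAMS = 8.03e9          # approximate Llama-3-8B parameter count
BITS_PER_WEIGHT = 4      # 4-bit quantized weights
N_LAYERS, N_KV_HEADS, HEAD_DIM = 32, 8, 128
KV_BYTES = 2             # fp16 keys and values
CONTEXT_TOKENS = 8192    # assumed context length
OVERHEAD_GB = 1.0        # assumed allowance for activations and runtime buffers


def weights_gb() -> float:
    """Memory for the quantized weights, in GB (1 GB = 1e9 bytes)."""
    return PARAMS * BITS_PER_WEIGHT / 8 / 1e9


def kv_cache_gb(tokens: int = CONTEXT_TOKENS) -> float:
    """Keys plus values, for every layer and KV head, for each cached token."""
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * KV_BYTES
    return per_token * tokens / 1e9


def total_gb() -> float:
    return weights_gb() + kv_cache_gb() + OVERHEAD_GB


if __name__ == "__main__":
    print(f"weights {weights_gb():.2f} GB, KV cache {kv_cache_gb():.2f} GB, "
          f"total {total_gb():.2f} GB")
```

Under these assumptions the weights take about 4 GB and the KV cache about 1 GB, which lands close to 6 GB in total.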

202321010

MCP GPU server, MIG profile analysis, architecture documentation, K8s Pod configuration.


Week 04: Ralph Loop Implementation and Test-time Compute Scaling

Assignment: Ralph loop harness (harness.sh) implementation — backpressure, garbage collection, AGENTS.md cumulative learning

202021016

Insufficient content. README is a copy of the project root README, AGENTS.md contains only “CLAUDE.md”. No harness.sh or loop implementation submitted. Needs supplementation.

202021035

harness.sh (159 lines) + run_task.py. tasks.json-based task queue, 3-iteration progress archive, metrics.csv collection. Clean separation of AGENTS.md and PROMPT.md for role clarity.

202121014

harness.sh (103 lines), backpressure.py, AGENTS.md (Learned Patterns + Anti-Patterns + Progress checklist), PROMPT.md, 805-line execution log. Records how the agent learned to handle division by zero in the calculator task.

202321005

Most sophisticated implementation. harness.sh (335 lines), backpressure.py (149 lines), mock_agent.py, 4 task checkpoints, 7 pytest error logs, loop metrics (CSV+JSON), worktree analysis report, RLM chunk demo experiment.

202321006

harness.sh (76 lines) — backpressure (pytest exit code), garbage collection (file deletion), stuck detection (task splitting after 2 consecutive failures), loop metrics JSON recording. All 5 loops failed due to pytest: command not found — environment setup issue.
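The control flow described above (exit-code backpressure, cleanup, stuck detection with task splitting, per-loop metrics) can be sketched as follows. This is a Python illustration of the ideas rather than the submitted shell script; `attempt`, `cleanup`, and the "part 1 / part 2" splitting rule are hypothetical stand-ins.

```python
from collections import deque
from typing import Callable


def run_loop(tasks: list[str],
             attempt: Callable[[str], int],
             cleanup: Callable[[], None] = lambda: None,
             max_loops: int = 5,
             stuck_after: int = 2) -> list[dict]:
    """Run tasks in order; `attempt` returns a test exit code (0 = pass)."""
    queue = deque(tasks)
    failures = 0
    metrics = []
    for loop in range(1, max_loops + 1):
        if not queue:
            break
        task = queue[0]
        code = attempt(task)      # backpressure: only exit code 0 advances
        cleanup()                 # garbage collection hook (e.g. delete temp files)
        metrics.append({"loop": loop, "task": task, "exit_code": code})
        if code == 0:
            queue.popleft()
            failures = 0
            continue
        failures += 1
        if failures >= stuck_after:
            # Stuck: replace the task with two smaller ones (illustrative split).
            queue.popleft()
            queue.extendleft([f"{task} (part 2)", f"{task} (part 1)"])
            failures = 0
    return metrics
```

A shell reports a missing executable with exit status 127, so a harness like this treats "pytest: command not found" as a test failure on every loop; checking that the test runner exists before the first iteration would separate environment problems from agent failures.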

202321010

autoresearch.py (167 lines), harness.sh (437 lines) — longest harness script. Applied JSON feature list + Initializer pattern. Includes loop results JSON, metrics CSV, worktree report.


Week 05: Context Management and Token Optimization

Assignment: Context Manager, token counter, Context Rot measurement and auto-reset

202321006

context_manager.py dynamic context pruning + priority-based filtering. token_counter.py accurate token counting for Claude models. Context Rot mitigation via sliding window.
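One way to combine a sliding window with priority-based filtering is sketched below. It is a minimal illustration under stated assumptions: the `Message` type, the `priority` field, and the four-characters-per-token estimate are hypothetical, and the estimate is a crude heuristic rather than a real Claude token count.

```python
from dataclasses import dataclass


@dataclass
class Message:
    text: str
    priority: int  # higher means more important (hypothetical scale)


def estimate_tokens(text: str) -> int:
    """Crude heuristic (about 4 characters per token), not a real tokenizer."""
    return max(1, len(text) // 4)


def prune(messages: list[Message], budget: int, window: int = 2) -> list[Message]:
    """Always keep the last `window` messages, then spend the remaining
    budget on older messages in descending priority, preserving order."""
    recent = messages[-window:] if window > 0 else []
    older = messages[:len(messages) - len(recent)]
    used = sum(estimate_tokens(m.text) for m in recent)
    keep = set()
    for i, m in sorted(enumerate(older), key=lambda pair: -pair[1].priority):
        cost = estimate_tokens(m.text)
        if used + cost <= budget:
            keep.add(i)
            used += cost
    return [m for i, m in enumerate(older) if i in keep] + recent
```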

202321010

Token counter-integrated Ralph loop, Context Rot simulation and measurement, Hybrid auto-context reset. fix_plan.md + claude-progress.txt state tracking. Analysis scripts with before/after visualization graphs (5 types).


Week 06: Instruction Tuning and Log Analysis

Assignment: Prompt improvement experiments, log analysis, Instruction Tuning impact measurement

202321006

Added architecture constraints to PROMPT.md, achieving 30% reduction in logic errors. log_analyzer.py tracks agent reasoning paths, identifies common failure points. Confirmed stabilization effect of explicit CLAUDE.md rules.

202321010

Instruction Tuning experiment — testset + before/after output comparison. Generated error rate graph, metrics graph, error type classification graph. Systematic organization under lab06 subdirectory.


Assignment: 5-stage pipeline multi-agent architecture, JSON artifact-based communication, verification gates

202321006

5-phase gated pipeline architecture diagram, DAG-based task decomposition + tiered parallelization strategy, phase transition quality checklists, recovery strategies for 3 failure scenarios. 4 schemas (requirement.json, task.json, pipeline_state.json, lesson.json).
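Tiered parallelization over a task DAG can be illustrated with the standard library's `graphlib`. The task names below are hypothetical; the sketch only shows how dependency tiers fall out of a topological sort, not how the submission implements it.

```python
from graphlib import TopologicalSorter


def parallel_tiers(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into tiers; every task in a tier depends only on
    tasks from earlier tiers, so a tier can run in parallel."""
    sorter = TopologicalSorter(deps)  # maps each task to its prerequisites
    sorter.prepare()                  # raises CycleError if the graph has a cycle
    tiers = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        tiers.append(ready)
        sorter.done(*ready)
    return tiers


# Hypothetical task graph.
EXAMPLE = {
    "build_api": {"plan"},
    "build_ui": {"plan"},
    "review": {"build_api", "build_ui"},
}
```

For `EXAMPLE`, the planning task forms the first tier, the two build tasks share the second, and the review runs last.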

202321010

Planner → Context → Builder → Reviewer → Finalizer 5-stage pipeline. 4+ standard JSON schemas, DAG dependency graph + parallelization tier analysis, 5 gate checklists per phase, automatic recovery mechanisms for 3 scenarios.


Assignment: Capstone project proposal — problem definition, system design, milestones

202321005 — Socratic Tutor

Educational agent that corrects student misconceptions through questions alone, never revealing answer code. Dual-loop · 4-module · 3-tier defense architecture with Analysis/Dialogue/Review/Logging modules, Q-Critic + Validator parallel AND gate.

202321006 — Collaboration Nudge Agent

Real-time bus factor calculation on GitHub PR events to detect knowledge monopoly. Extracts design context via AI interview, redistributes knowledge to suitable team members. Prevents problems typically discovered only after employee departure.
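The proposal does not state its bus factor formula. A common heuristic, assumed here, is the smallest number of contributors who together account for more than half of the authored lines in a module; the `ownership` input is a hypothetical author-to-line-count mapping.

```python
def bus_factor(ownership: dict[str, int], threshold: float = 0.5) -> int:
    """Smallest number of top contributors whose combined share of
    authored lines exceeds `threshold` (a heuristic, assumed here)."""
    total = sum(ownership.values())
    if total == 0:
        return 0
    covered = 0
    ranked = sorted(ownership.values(), reverse=True)
    for count, lines in enumerate(ranked, start=1):
        covered += lines
        if covered / total > threshold:
            return count
    return len(ownership)
```

A result of 1 means a single person holds most of the knowledge for that module, which is the "knowledge monopoly" case the agent is meant to flag.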

202321010 — Docs-Code Drift Detector

AST-based comparison of type/parameter definitions in READMEs, docstrings, and API docs against actual code to detect drift. Auto-generates documentation fix PRs (code fixes are suggestions only).


Assignment: Combine HOTL governance layer + audit logging on an Anthropic SDK-based agent

202121014

governance.py 4-tier risk classification + approval policy, agent.py Claude suggestion/fallback passthrough, pytest tests, JSONL audit log sample. Documented design decisions.
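A tiered risk classifier with an approval policy and a JSONL audit record might look like the sketch below. The tier names, substring rules, and decisions are hypothetical; the submission's actual governance.py policy is not shown on this page.

```python
import json
from enum import IntEnum


class Risk(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


# Hypothetical substring rules; a real policy would be more careful.
RULES = [
    ("rm -rf", Risk.CRITICAL),
    ("git push", Risk.HIGH),
    ("pip install", Risk.MEDIUM),
]


def classify(command: str) -> Risk:
    """Return the highest tier whose rule matches, defaulting to LOW."""
    return max((risk for needle, risk in RULES if needle in command),
               default=Risk.LOW)


def decide(risk: Risk) -> str:
    if risk == Risk.CRITICAL:
        return "deny"
    if risk == Risk.HIGH:
        return "require_approval"
    return "allow"  # allowed actions are still logged for later review


def audit_record(command: str) -> str:
    """One JSON object per line, ready to append to a .jsonl audit log."""
    risk = classify(command)
    return json.dumps({"command": command, "risk": risk.name,
                       "decision": decide(risk)})
```

A production log line would normally also carry a timestamp and an actor identifier; they are left out here to keep the output deterministic.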


Assignment: FastMCP-based Tool + Resource + Prompt primitives, Path Traversal defense, safe subprocess policy

202121014

filesystem/git/custom server registration (settings.json), custom MCP server (custom_server.py), 3 pytest samples. Covers baseline security requirements.
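The Path Traversal defense named in the assignment is typically a resolve-then-check step like the one below. This is a generic sketch, not the code from custom_server.py; the function names are illustrative, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path


def is_within(root: str, user_path: str) -> bool:
    """True if `user_path`, resolved against `root`, stays inside `root`."""
    base = Path(root).resolve()
    candidate = (base / user_path).resolve()  # collapses ".." and follows symlinks
    return candidate.is_relative_to(base)


def safe_resolve(root: str, user_path: str) -> Path:
    """Resolve a client-supplied path, refusing anything outside `root`."""
    if not is_within(root, user_path):
        raise PermissionError(f"path escapes sandbox root: {user_path!r}")
    return (Path(root).resolve() / user_path).resolve()
```

Checking after resolution matters: an absolute `user_path` replaces the root when joined, and `..` segments are only collapsed by `resolve()`, so a plain string prefix test on the raw input would miss both cases.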


Assignment: Token tracking, Rolling Window context compression, state save/restore

202121014

python main.py runs full demo automatically. Token usage tracking, Rolling Window compression, execution state save/restore, auto-generation of claude-progress.txt.


Assignment: Analyze harness.log, extract recurring failure patterns, A/B compare two prompt versions

202121014

log_analyzer.py classifies recurring log error patterns + generates error_report.md, compares prompt v1 (baseline) and v2 (improved), ab_test.py records results to ab_results.json.


Assignment: Planner → Coder 2-stage pipeline with JSON schema-based structured outputs

202121014

Planner/Coder output JSON schemas, base_agent.py common class, planner_agent.py (goal → subtask decomposition), coder_agent.py (plan → change set), pipeline.py sequential connection. Design documentation included.
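A two-stage pipeline with structured outputs can be sketched with dataclasses standing in for the JSON schemas. The field names and the stubbed `planner` and `coder` functions are hypothetical; in the submission these stages are agents, whereas here they are plain functions so the data flow is visible.

```python
import json
from dataclasses import dataclass


@dataclass
class Subtask:
    id: str
    description: str


def planner(goal: str) -> str:
    """Stub planner: returns a JSON plan (a real one would call a model)."""
    return json.dumps({"subtasks": [
        {"id": "t1", "description": f"write tests for: {goal}"},
        {"id": "t2", "description": f"implement: {goal}"},
    ]})


def parse_plan(raw: str) -> list[Subtask]:
    """Validate planner output; missing or extra fields raise TypeError."""
    data = json.loads(raw)
    return [Subtask(**item) for item in data["subtasks"]]


def coder(plan: list[Subtask]) -> dict:
    """Stub coder: emits one placeholder change per subtask."""
    return {"changes": [{"subtask_id": s.id, "summary": s.description}
                        for s in plan]}


def pipeline(goal: str) -> dict:
    return coder(parse_plan(planner(goal)))
```

Parsing the planner output into typed objects before the coder sees it means a malformed plan fails at the stage boundary instead of deep inside the next stage.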