npm - @ai-dev-methodologies/rlp-desk - Versions diffs - 0.0.1 → 0.1.0 - Mend

@ai-dev-methodologies/rlp-desk 0.0.1 → 0.1.0

Files changed (21)

package/README.md CHANGED

@@ -30,15 +30,20 @@ Or without npm:
 curl -sSL https://raw.githubusercontent.com/ai-dev-methodologies/rlp-desk/main/install.sh | bash
 ```
-### 2. Brainstorm
+### 2. Brainstorm (recommended)
-In your project directory, start a Claude Code session:
+**Always start with brainstorm.** It interactively walks you through the project contract:
 ```
 /rlp-desk brainstorm "implement a Python calculator with tests"
 ```
-This interactively defines the contract: slug, objective, user stories, verification commands, and iteration settings.
+You'll be asked to confirm each item:
+- **Slug** — project identifier
+- **User Stories** — discrete, testable units of work
+- **Iteration Unit** — one story per iteration (incremental) or all at once (fast)
+- **Verification Commands** — how to check the work
+- **Models** — which Claude model for Worker/Verifier
 ### 3. Run
@@ -107,8 +112,8 @@ for iteration in 1..max_iter:
 | Simple, single-file changes | `haiku` |
 | Standard work (default) | `sonnet` |
 | Architecture changes, multi-file, prior failure | `opus` |
-| Standard verification | `sonnet` |
-| Security/critical logic verification | `opus` |
+| Verification (default) | `opus` |
+| Lightweight verification | `sonnet` |
 ## Commands
@@ -118,7 +123,7 @@ for iteration in 1..max_iter:
 /rlp-desk run   <slug> [--opts]        Run the loop (this session = leader)
 /rlp-desk status <slug>                Show loop status
 /rlp-desk logs  <slug> [N]             Show iteration logs
-/rlp-desk clean <slug>                 Reset for re-run
+/rlp-desk clean <slug> [--kill-session]  Reset for re-run
 ```
 ### Run Options
@@ -127,7 +132,66 @@ for iteration in 1..max_iter:
 |------|---------|-------------|
 | `--max-iter N` | 100 | Maximum iterations before timeout |
 | `--worker-model MODEL` | sonnet | Worker model (haiku/sonnet/opus) |
-| `--verifier-model MODEL` | sonnet | Verifier model (haiku/sonnet/opus) |
+| `--verifier-model MODEL` | opus | Verifier model (haiku/sonnet/opus) |
+| `--mode agent\|tmux` | agent | Execution mode (see below) |
+## Execution Modes
+RLP Desk supports two execution modes. Both honor the same governance protocol.
+### Environment Compatibility
+| Environment | Agent Mode | Tmux Mode |
+|-------------|-----------|-----------|
+| Claude Code (any terminal) | **Works** | Requires tmux |
+| Inside tmux session | **Works** | **Works** — panes split in current window |
+| Outside tmux session | **Works** | **Rejected** — "start tmux first" |
+### Agent Mode (default) — "Smart Mode"
+```
+/rlp-desk run calculator
+```
+The current Claude Code session acts as the Leader, dispatching Workers and Verifiers via `Agent()`. The Leader is an LLM that dynamically routes models and reasons about context.
+- Works anywhere — no tmux required
+- Dynamic model routing — Leader upgrades models on failure
+- Fix Loop — extracts verifier issues and feeds them back to the next worker
+- Best for interactive development
+### Tmux Mode — "Lean Mode"
+```
+/rlp-desk run calculator --mode tmux
+```
+**Requires running inside a tmux session.** A shell script takes over as Leader, splitting your current window into three panes. Workers run interactive `claude` sessions — you can watch them work in real-time.
+```
++---------------------+---------------------+
+| Your pane (Leader)  | Worker pane         |
+| shell loop running  | claude TUI running  |
+| polls signal files  | you see it working  |
+|                     +---------------------+
+|                     | Verifier pane       |
+|                     | claude TUI running  |
+|                     | (only when needed)  |
++---------------------+---------------------+
+```
+- Real-time visibility — watch Worker/Verifier execute live
+- Zero-token orchestration — shell loop, not LLM
+- Automatic cleanup — panes removed on completion
+- Best for long campaigns and observability
+Prerequisites: `tmux` and `jq` must be installed.
+To clean up tmux artifacts:
+```
+/rlp-desk clean calculator --kill-session
+```
 ## Project Structure
@@ -169,7 +233,7 @@ mkdir my-calc && cd my-calc
 ## Documentation
-- [Architecture](docs/architecture.md) — Design philosophy and the Agent() approach
+- [Architecture](docs/architecture.md) — Design philosophy, Agent() and tmux execution modes
 - [Getting Started](docs/getting-started.md) — Step-by-step tutorial with the calculator example
 - [Protocol Reference](docs/protocol-reference.md) — Full protocol specification

package/docs/architecture.md CHANGED

@@ -39,14 +39,40 @@ Agent(
 The Agent returns synchronously. No polling, no signal files, no tmux. The Leader simply reads the filesystem after each Agent completes.
-### Why Agent() Over Other Approaches
-| Approach | Problem |
-|----------|---------|
-| Single long session | Context drift, token limits |
-| tmux + polling | Complex, brittle, race conditions |
-| Signal files + sleep loops | Fragile timing, wasted compute |
-| **Agent() subprocess** | **Clean, synchronous, guaranteed fresh context** |
+### Two Execution Modes
+RLP Desk supports two modes for running the Leader loop. Both honor the same governance protocol (section 7). Choose based on your use case.
+| Mode | Leader | Model Routing | Session Required | Best For |
+|------|--------|---------------|------------------|----------|
+| **Agent() — "Smart mode"** (default) | LLM (current session) | Dynamic — Leader reasons about which model to use each iteration | Active Claude Code session | Interactive development, complex routing decisions |
+| **Tmux — "Lean mode"** | Shell script (`run_ralph_desk.zsh`) | Static — set via `WORKER_MODEL`/`VERIFIER_MODEL` env vars | None (runs detached) | Long campaigns, CI, observability, zero-token orchestration |
+**Agent() mode** is synchronous and simple: each `Agent()` call blocks until the subprocess finishes, then the Leader reads the filesystem. No polling, no signal files, no tmux.
+**Tmux mode** trades dynamic routing for visibility and independence. The shell Leader writes prompts to files, sends short trigger commands via `tmux send-keys`, and polls structured JSON signal files (`iter-signal.json`, `verify-verdict.json`) for control flow. It uses proven [omc-teams](https://github.com/anthropics/omc-teams) tmux patterns — write-then-notify, pane ID stability, copy-mode guards, heartbeat monitoring — for reliable, race-free orchestration.
+The tmux script is a second implementation of the governance protocol. Traceability is maintained via governance.md section 7 step-number comments throughout the script.
+#### Tmux Architecture
+```
+[tmux session: rlp-desk-<slug>-<timestamp>]
++-------------------------------------+
+| Leader pane (shell loop)            |
+| - writes prompts to files           |
+| - sends short triggers via send-keys|
+| - polls iter-signal.json via jq     |
+| - monitors heartbeat files          |
+| - writes sentinels                  |
++------------------+------------------+
+| Worker pane      | Verifier pane    |
+| bash trigger.sh  | bash trigger.sh  |
+| -> claude -p ... | -> claude -p ... |
+| heartbeat writer | heartbeat writer |
+| (fresh context)  | (fresh context)  |
++------------------+------------------+
+```
 ## Three-Role Architecture

package/docs/getting-started.md CHANGED

@@ -48,7 +48,7 @@ The brainstorm phase interactively determines:
 | **User Stories** | US-001: calculator functions, US-002: pytest tests |
 | **Iteration Unit** | One user story per iteration |
 | **Verification** | `python3 -m pytest test_calc.py -v` |
-| **Models** | Worker: sonnet, Verifier: sonnet |
+| **Models** | Worker: sonnet, Verifier: opus |
 | **Max Iterations** | 10 |
 On approval, brainstorm offers to run `init` automatically.
@@ -119,7 +119,7 @@ You'll see status updates after each iteration:
 ```
 Iteration 1 | Worker (sonnet) | US-001 complete, continuing
 Iteration 2 | Worker (sonnet) | All stories done, requesting verification
-Iteration 3 | Verifier (sonnet) | PASS — all criteria met
+Iteration 3 | Verifier (opus) | PASS — all criteria met
 ✓ COMPLETE
 ```

package/docs/protocol-reference.md CHANGED

@@ -11,9 +11,16 @@ for iteration in 1..max_iter:
      - <slug>-complete.md exists → stop (success)
      - <slug>-blocked.md exists → stop (failure)
+  ①½ Prep-stage cleanup (before each iteration)
+     - Delete <slug>-done-claim.json if exists  [leader-measured]
+     - Delete <slug>-verify-verdict.json if exists  [leader-measured]
+     (Ensures stale runtime files from a previous run cannot mislead the loop)
   ② Read memory.md
      - Parse "Stop Status" section → continue/verify/blocked
      - Parse "Next Iteration Contract" → task for this iteration
+       • Also read "Completed Stories" → track what has been verified
+       • Also read "Key Decisions" → architectural choices already settled
   ③ Select model
      - Apply model routing rules (see below)
@@ -42,7 +49,8 @@ for iteration in 1..max_iter:
        • verdict=fail + recommended=continue → go to ⑧
        • verdict=blocked → write BLOCKED sentinel, stop
-  ⑧ Update status.json, report to user, clean runtime files, next iteration
+  ⑧ Write iter-NNN.result.md (see Result Log below)
+     Update status.json, report to user, next iteration
 ```
 ## Signal Contracts
@@ -63,15 +71,66 @@ continue | verify | blocked
 ## Current State
 Iteration N - <description>
+## Completed Stories
+- US-001: Calculator add/subtract implemented [interface: `add(a, b) -> float`]
+- US-002: pytest suite — 8 tests passing
 ## Next Iteration Contract
-<specific task for the next worker>
+**Story**: US-003 — Edge case handling
+**Task**: Handle divide-by-zero in calc.py.
+1. Raise ValueError with message "division by zero"
+2. Add test_divide_by_zero to test_calc.py
+**Criteria**:
+- `pytest` exits 0
+- `grep "ValueError" calc.py` matches
+## Key Decisions
+- Iteration 2: Chose ValueError over ZeroDivisionError — matches project error style.
+- Iteration 3: Skipped type hints — out of scope per PRD Non-Goals.
 ## Patterns Discovered
 ## Learnings
 ## Evidence Chain
 ```
-The Leader reads **Stop Status** and **Next Iteration Contract** to decide what happens next.
+The Leader reads:
+- **Stop Status** and **Next Iteration Contract** to decide what happens next.
+- **Completed Stories** to track verified work without re-reading full history.
+- **Key Decisions** to carry forward settled architectural choices.
+All sections use plain Markdown. No YAML.
+### Iteration Signal (`<slug>-iter-signal.json`)
+Written by the Worker at the end of every iteration. Provides a structured JSON signal for the Leader to detect iteration completion without parsing markdown.
+```json
+{
+  "iteration": 3,
+  "status": "continue|verify|blocked",
+  "summary": "Completed US-001, other stories remain",
+  "timestamp": "2025-01-15T10:30:00Z"
+}
+```
+| Field | Type | Description |
+|-------|------|-------------|
+| `iteration` | number | Current iteration number |
+| `status` | string | One of: `continue`, `verify`, `blocked` |
+| `summary` | string | Brief description of what was accomplished |
+| `timestamp` | string | ISO 8601 UTC timestamp |
+**Status values:**
+- `continue` -- Current action done but more work remains. Leader proceeds to next iteration.
+- `verify` -- All work complete and done-claim written. Leader dispatches Verifier.
+- `blocked` -- Autonomous blocker encountered. Leader writes BLOCKED sentinel.
+**Usage by mode:**
+- **Tmux mode:** The shell Leader polls for this file's existence after dispatching the Worker. Once it appears, the Leader reads the `status` field via `jq` to decide the next step. This is the primary control-flow mechanism in tmux mode.
+- **Agent() mode:** The Leader MAY read this file as a structured alternative to parsing `memory.md`'s Stop Status section. Agent() mode primarily uses memory.md, so iter-signal.json is supplementary.
+**Worker obligation:** The Worker MUST write this file at the end of every iteration, regardless of execution mode. This ensures both Agent() and tmux modes can use the same Worker prompt templates.
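
Annotation: the tmux-mode usage of `iter-signal.json` described above can be sketched in Python as follows. This is a minimal illustration, not the logic of `run_ralph_desk.zsh` (which uses a shell loop and `jq`). The function names, the poll interval, and the action labels returned by `next_step` are hypothetical; the 600-second default only mirrors the documented `ITER_TIMEOUT` default.

```python
import json
import time
from pathlib import Path

VALID_STATUSES = {"continue", "verify", "blocked"}


def wait_for_signal(signal_path, timeout_s=600.0, poll_interval_s=1.0):
    """Poll until the signal file exists and parses, then return its contents.

    Returns None on timeout. A file that exists but is not yet valid JSON is
    treated as "still being written" and polled again.
    """
    deadline = time.monotonic() + timeout_s
    path = Path(signal_path)
    while time.monotonic() < deadline:
        if path.exists():
            try:
                signal = json.loads(path.read_text())
            except json.JSONDecodeError:
                signal = None
            if isinstance(signal, dict) and signal.get("status") in VALID_STATUSES:
                return signal
        time.sleep(poll_interval_s)
    return None


def next_step(signal):
    """Map the signal's status to a Leader action (labels are illustrative)."""
    return {
        "continue": "next_iteration",
        "verify": "dispatch_verifier",
        "blocked": "write_blocked_sentinel",
    }[signal["status"]]
```

Re-polling on a parse failure is a design choice of this sketch: it tolerates a Worker that is still mid-write, at the cost of waiting out the full timeout if the file is permanently malformed.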
 ### Done Claim (`<slug>-done-claim.json`)
@@ -92,11 +151,15 @@ Written by the Worker when claiming all work is complete:
 ### Verify Verdict (`<slug>-verify-verdict.json`)
-Written by the Verifier after independent verification:
+Written by the Verifier after independent verification.
+**Tmux mode polling:** In tmux mode, after dispatching the Verifier, the shell Leader polls for the existence of `verify-verdict.json` (same pattern as `iter-signal.json`). Once it appears, the Leader reads the `verdict` and `recommended_state_transition` fields via `jq` to decide whether to write a COMPLETE sentinel, continue iterating, or write a BLOCKED sentinel.
+**Schema:**
 ```json
 {
-  "verdict": "pass|fail|blocked",
+  "verdict": "pass|fail|request_info",
   "verified_at_utc": "2025-01-15T10:35:00Z",
   "summary": "All criteria verified with fresh evidence",
   "criteria_results": [
@@ -107,13 +170,76 @@ Written by the Verifier after independent verification:
     }
   ],
   "missing_evidence": [],
-  "issues": [],
+  "issues": [
+    {
+      "criterion": "US-002 AC1",
+      "description": "Test file missing",
+      "severity": "critical|major|minor",
+      "fix_hint": "(suggestion, non-authoritative) Add test_calc.py"
+    }
+  ],
   "recommended_state_transition": "complete|continue|blocked",
   "next_iteration_contract": "Fix failing test for divide by zero",
   "evidence_paths": ["test_calc.py::test_divide_by_zero"]
 }
 ```
+**Verdict values:**
+- `pass`: all criteria met — Leader may write COMPLETE sentinel
+- `fail`: one or more criteria not met — Leader reads issues, builds next contract
+- `request_info`: Verifier cannot determine pass/fail without more information — summary contains specific questions; Leader decides outcome and may relay questions to Worker
+**Issues severity:**
+- `critical`: blocking — must be fixed before COMPLETE
+- `major`: significant gap in acceptance criteria
+- `minor`: cosmetic or non-blocking concern
+**Verifier scope:**
+- Identify changed files via `git diff --name-only` — read those files and their direct imports only
+- Campaign Memory (`<slug>-memory.md`) is for orientation only — not the source of truth for verification
+- Delegate deterministic checks (type hints, linting, security) to tools defined in test-spec
+- Focus on: AC verification, semantic review, smoke tests
+- Do NOT use `fail` when uncertain — use `request_info` with specific questions instead
+### Fix Loop Protocol
+When the Verifier returns `fail`, the Leader executes the Fix Loop before dispatching the next Worker:
+#### Flow
+```
+Verifier fail
+  → Leader reads verify-verdict.json issues
+  → Sort issues by severity: critical → major → minor
+  → Build structured fix contract (see format below)
+  → Increment consecutive_failures in status.json
+  → Dispatch Worker with fix contract as Next Iteration Contract
+```
+#### Fix Contract Format
+```markdown
+## Next Iteration Contract
+**Mode**: fix
+**Verifier verdict reference**: iter-NNN
+**Issues to fix** (severity-sorted):
+1. [critical] US-002 AC3: <description>
+   - fix_hint: (suggestion, non-authoritative) <hint text>
+2. [major] US-001 AC1: <description>
+3. [minor] US-003 AC2: <description>
+**Traceability rule**: Only changes that resolve a listed issue are allowed (traceability enforcement).
+Every change must be justified by the issue it addresses.
+```
+#### Rules
+- `fix_hint` is optional. When present it is labeled `(suggestion, non-authoritative)` — the Worker may choose a different approach.
+- **traceability**: the Worker must not introduce changes beyond what is needed to resolve the listed issues.
+- The Leader increments `consecutive_failures` in `status.json` after each `fail` verdict, and resets it to 0 after any `pass`.
+- The Leader (not the Worker) owns the `consecutive_failures` counter.
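
Annotation: the Fix Loop steps above (read issues, sort by severity, build the contract) could look roughly like this in Python. It is a sketch under assumptions: `build_fix_contract` is a hypothetical helper, and the three-digit zero padding of `iter-NNN` is inferred from file names such as `iter-001.worker-prompt.md`.

```python
SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2}


def build_fix_contract(verdict, iteration):
    """Render a fix contract from a verify-verdict dict whose verdict is `fail`."""
    # sorted() is stable, so issues of equal severity keep their original order.
    issues = sorted(verdict["issues"], key=lambda i: SEVERITY_ORDER[i["severity"]])
    lines = [
        "## Next Iteration Contract",
        "**Mode**: fix",
        f"**Verifier verdict reference**: iter-{iteration:03d}",
        "**Issues to fix** (severity-sorted):",
    ]
    for n, issue in enumerate(issues, start=1):
        lines.append(f"{n}. [{issue['severity']}] {issue['criterion']}: {issue['description']}")
        hint = issue.get("fix_hint")  # fix_hint is optional
        if hint:
            lines.append(f"   - fix_hint: {hint}")
    lines.append(
        "**Traceability rule**: Only changes that resolve a listed issue are allowed "
        "(traceability enforcement)."
    )
    lines.append("Every change must be justified by the issue it addresses.")
    return "\n".join(lines)
```

The sketch passes `fix_hint` through unchanged, on the assumption that the Verifier already wrote the `(suggestion, non-authoritative)` label into the field as in the schema example.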
 ### Sentinels
 Leader-only files that terminate the loop:
@@ -155,7 +281,8 @@ Updated by the Worker each iteration to reflect the current frontier:
 | Condition | Detection | Action |
 |-----------|-----------|--------|
 | Stale context | `context-latest.md` hash unchanged for 3 consecutive iterations | Write BLOCKED sentinel |
-| Repeated error | Worker produces the same error message 2 iterations in a row | Upgrade model, retry once; still failing → BLOCKED |
+| Repeated criterion failure | Same acceptance criterion fails in 2 consecutive Verifier verdicts | Upgrade model, retry once; still failing → BLOCKED |
+| Persistent diverse failures | 3 consecutive **fail** verdicts on 3 unique acceptance criterion IDs | Upgrade to opus, retry once; still failing → BLOCKED |
 | Timeout | Iteration count reaches `max_iter` | Write TIMEOUT status, report to user |
 ### Stale Context Detection
@@ -165,10 +292,20 @@ The Leader computes a hash (or diff) of `context-latest.md` before and after eac
 ### Error Escalation
 ```
-Error in iteration N (sonnet) → retry with opus in iteration N+1
-Same error in iteration N+1 (opus) → BLOCKED
+Same acceptance criterion fails iteration N (sonnet) → retry with opus in iteration N+1
+Same acceptance criterion still fails iteration N+1 (opus) → BLOCKED
 ```
+A "same criterion" failure means that **the same acceptance criterion ID appears in the `issues` list of two consecutive Verifier `fail` verdicts.** A `request_info` verdict neither breaks nor contributes to this chain; only `fail` verdicts are counted.
+### Consecutive Failures Counter
+The Leader maintains `consecutive_failures` in `status.json`. This counter:
+- Increments by 1 after each Verifier `fail` verdict
+- Resets to 0 after any Verifier `pass` verdict
+- **Unchanged** by `request_info` verdicts (neither increments nor resets)
+- Triggers the "Persistent diverse failures" circuit breaker when it reaches 3 and the 3 most recent `fail` verdicts each have a unique criterion ID
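
Annotation: a possible Python reading of the counter rules above. The helper names are hypothetical, and two points are assumptions of this sketch rather than facts from the package: `last_failing_criteria` is modeled as one list of criterion IDs per `fail` verdict, and "each have a unique criterion ID" is interpreted as "no ID is shared between any two of the last three `fail` verdicts".

```python
def apply_verdict(status, verdict):
    """Return an updated copy of the status.json contents after one verdict."""
    updated = dict(status)
    history = list(status.get("last_failing_criteria", []))
    kind = verdict["verdict"]
    if kind == "fail":
        updated["consecutive_failures"] = status.get("consecutive_failures", 0) + 1
        history.append([issue["criterion"] for issue in verdict.get("issues", [])])
    elif kind == "pass":
        updated["consecutive_failures"] = 0
        history = []
    # request_info: counter and history are left unchanged
    updated["last_failing_criteria"] = history
    return updated


def same_criterion_repeated(status):
    """True when a criterion ID appears in both of the two most recent fail verdicts."""
    history = status.get("last_failing_criteria", [])
    return len(history) >= 2 and bool(set(history[-1]) & set(history[-2]))


def diverse_failures_tripped(status):
    """True when the counter has reached 3 and the last 3 fail verdicts share no IDs."""
    history = status.get("last_failing_criteria", [])
    if status.get("consecutive_failures", 0) < 3 or len(history) < 3:
        return False
    recent = [set(ids) for ids in history[-3:]]
    all_ids = set().union(*recent)
    # Non-empty and pairwise disjoint: the union is as large as the sizes summed.
    return all(recent) and len(all_ids) == sum(len(s) for s in recent)
```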
 ## Model Routing
 ### Selection Matrix
@@ -179,8 +316,8 @@ Same error in iteration N+1 (opus) → BLOCKED
 | Standard implementation | `sonnet` | Balanced (default) |
 | Multi-file, architecture | `opus` | Needs broad understanding |
 | Previous iteration failed | upgrade | Harder model may succeed |
-| Verification (standard) | `sonnet` | Sufficient for running checks |
-| Verification (security) | `opus` | Critical logic needs thoroughness |
+| Verification (default) | `opus` | Independent verification requires thoroughness |
+| Verification (lightweight) | `sonnet` | Simple, well-defined checks only |
 ### Dynamic Adaptation
@@ -191,6 +328,29 @@ The Leader reassesses the model every iteration:
 3. If simple/repetitive → consider downgrade
 4. User override via `--worker-model` / `--verifier-model` takes precedence
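
Annotation: the "upgrade on failure" and "user override takes precedence" rules can be illustrated with a small Python sketch. The ladder order `haiku < sonnet < opus` is inferred from the selection matrix, and the function names are hypothetical. Downgrades and the reassessment steps not visible in this hunk are deliberately left out.

```python
UPGRADE_LADDER = ("haiku", "sonnet", "opus")


def upgrade(model):
    """Return the next stronger model, or the same model if already at the top."""
    idx = UPGRADE_LADDER.index(model)
    return UPGRADE_LADDER[min(idx + 1, len(UPGRADE_LADDER) - 1)]


def select_worker_model(current, previous_failed, user_override=None):
    """Pick the Worker model for the next iteration."""
    if user_override is not None:
        return user_override  # --worker-model takes precedence over routing
    if previous_failed:
        return upgrade(current)
    return current
```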
+## Result Log (`iter-NNN.result.md`)
+Written by the Leader after each iteration completes (step ⑧). Stored in `logs/<slug>/`.
+```markdown
+# Iteration NNN Result
+## Result Status
+pass | fail | continue  [leader-measured]
+## Files Changed
+(output of `git diff --stat HEAD~1 HEAD`)  [git-measured]
+## Summary
+<1–2 sentence summary of what the Worker did this iteration>
+## Verifier Verdict
+pass | fail | blocked | (not run)  [leader-measured]
+```
+- `[leader-measured]`: value determined by the Leader reading memory/verdict files.
+- `[git-measured]`: value determined by running `git diff --stat` — not from Worker's claim.
 ## Status File (`status.json`)
 Updated by the Leader after each iteration:
@@ -202,19 +362,112 @@ Updated by the Leader after each iteration:
   "max_iter": 100,
   "phase": "worker|verifier|complete|blocked|timeout",
   "worker_model": "sonnet",
-  "verifier_model": "sonnet",
+  "verifier_model": "opus",
   "last_result": "continue|verify|pass|fail|blocked",
+  "consecutive_failures": 0,
   "updated_at_utc": "2025-01-15T10:30:00Z"
 }
 ```
+- `consecutive_failures`: number of consecutive Verifier `fail` verdicts since the last `pass`. Reset to 0 on any `pass`. Unchanged by `request_info`. Used by the Circuit Breaker (see above).
+- `last_failing_criteria`: (optional) array of criterion IDs from recent `fail` verdicts, used by the Leader to detect the same-criterion and diverse-failure circuit-breaker patterns. Leaders may add additional tracking fields as needed.
+## Project Plans Files
+The `plans/` directory holds documents that define the project's acceptance criteria and verification approach:
+| File | Required | Description |
+|------|----------|-------------|
+| `plans/prd-<slug>.md` | Yes | Product Requirements Document — user stories, acceptance criteria, non-goals |
+| `plans/test-spec-<slug>.md` | Yes | Test specification — verification commands, criteria-to-test mapping |
+| `plans/quality-spec-<slug>.md` | Optional | Additional quality constraints (coding standards, performance budgets, security requirements) |
+The `quality-spec` file is not generated by `init`. Create it manually when a project requires additional quality constraints beyond the acceptance criteria in the PRD.
 ## Slash Command Reference
 | Command | Arguments | Description |
 |---------|-----------|-------------|
 | `brainstorm` | `<description>` | Interactive planning before init |
 | `init` | `<slug> [objective]` | Create project scaffold |
-| `run` | `<slug> [--max-iter N] [--worker-model M] [--verifier-model M]` | Run the leader loop |
+| `run` | `<slug> [--max-iter N] [--worker-model M] [--verifier-model M] [--mode agent\|tmux]` | Run the leader loop |
 | `status` | `<slug>` | Display current loop status |
 | `logs` | `<slug> [N]` | Show iteration logs |
-| `clean` | `<slug>` | Remove runtime artifacts for re-run |
+| `clean` | `<slug> [--kill-session]` | Remove runtime artifacts for re-run |
+### `--mode` Flag
+The `run` command accepts `--mode agent|tmux` (default: `agent`).
+- **`--mode agent`** (default): The current Claude Code session acts as the Leader, dispatching Workers and Verifiers via `Agent()`. Synchronous, no tmux required.
+- **`--mode tmux`**: Validates the scaffold, checks prerequisites (`tmux`, `jq`), then launches `run_ralph_desk.zsh` as the Leader. The LLM session exits after launching the script. The shell script runs independently in a tmux session.
+### `--kill-session` Flag
+The `clean` command accepts `--kill-session` to kill any tmux sessions matching the slug pattern (`rlp-desk-<slug>-*`) in addition to removing runtime files.
+## Tmux Mode Specifics
+This section documents the tmux-specific patterns used by `run_ralph_desk.zsh`. These apply only when running with `--mode tmux`.
+### Write-Then-Notify
+The single most important pattern. **Never** send data (prompts, large strings) through `tmux send-keys` directly.
+1. Write the prompt to a file: `logs/<slug>/iter-NNN.worker-prompt.md`
+2. Write a trigger script to a file: `logs/<slug>/iter-NNN.worker-trigger.sh`
+3. Send only a short command via `send-keys`: `bash /path/to/trigger.sh`
+The trigger script reads the prompt file and invokes `claude -p "$(cat /path/to/prompt.md)" --model <model> --dangerously-skip-permissions`.
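
Annotation: the three write-then-notify steps can be sketched in Python. This is an illustration only; the real Leader is a zsh script. `write_trigger` and `notify_command` are hypothetical names, the generated trigger omits the heartbeat writer, and the `claude` flags are copied from the paragraph above. `notify_command` only builds the `tmux send-keys` argument list so that nothing is executed here.

```python
import shlex
from pathlib import Path


def write_trigger(log_dir, iteration, prompt, model="sonnet"):
    """Write the prompt and trigger script for one Worker; return the trigger path."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    prompt_path = log_dir / f"iter-{iteration:03d}.worker-prompt.md"
    trigger_path = log_dir / f"iter-{iteration:03d}.worker-trigger.sh"
    # Step 1: the large payload goes to a file, never through send-keys.
    prompt_path.write_text(prompt)
    # Step 2: the trigger script reads the prompt file at run time.
    trigger_path.write_text(
        "#!/bin/bash\n"
        f'claude -p "$(cat {shlex.quote(str(prompt_path))})" '
        f"--model {shlex.quote(model)} --dangerously-skip-permissions\n"
    )
    return trigger_path


def notify_command(pane_id, trigger_path):
    """Step 3: build the tmux argv that sends only a short command to the pane."""
    return ["tmux", "send-keys", "-t", pane_id,
            f"bash {shlex.quote(str(trigger_path))}", "Enter"]
```

The resulting list could be passed to `subprocess.run` inside a tmux session. Keeping the keystroke payload to a single short `bash <path>` command is the point of the pattern.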
+### Signal File Polling
+In tmux mode, the shell Leader cannot call `Agent()` synchronously. Instead, it polls for signal files:
+| Signal | Written By | Polled By Leader | Purpose |
+|--------|-----------|------------------|---------|
+| `<slug>-iter-signal.json` | Worker | After dispatching Worker | Detect Worker iteration completion |
+| `<slug>-verify-verdict.json` | Verifier | After dispatching Verifier | Detect Verifier completion |
+The Leader reads these files with `jq` to extract status/verdict fields for control-flow decisions.
+### Heartbeat Monitoring
+Each trigger script writes a heartbeat file (`worker-heartbeat.json` or `verifier-heartbeat.json`) in a background loop. The Leader periodically checks the heartbeat's timestamp to detect stale processes (no update within `HEARTBEAT_STALE_THRESHOLD` seconds).
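
Annotation: a minimal Python sketch of the staleness check, assuming the heartbeat file holds a `ts` field in the `%Y-%m-%dT%H:%M:%SZ` format written by the example trigger script in this diff. The function names are hypothetical, and treating a missing or unreadable heartbeat as stale is a choice made for this sketch, not documented behavior.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def heartbeat_age_seconds(heartbeat_path, now=None):
    """Return seconds since the heartbeat's `ts`, or None if missing or unreadable."""
    try:
        data = json.loads(Path(heartbeat_path).read_text())
        ts = datetime.strptime(data["ts"], "%Y-%m-%dT%H:%M:%SZ")
    except (OSError, ValueError, KeyError, TypeError):
        return None
    ts = ts.replace(tzinfo=timezone.utc)  # the script writes UTC via `date -u`
    now = now or datetime.now(timezone.utc)
    return (now - ts).total_seconds()


def is_stale(heartbeat_path, threshold_s, now=None):
    """True when the heartbeat is older than threshold_s or cannot be read."""
    age = heartbeat_age_seconds(heartbeat_path, now)
    return age is None or age > threshold_s
```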
+### Idle Pane Nudging
+If a pane produces no output for `IDLE_NUDGE_THRESHOLD` seconds, the Leader sends a nudge (an Enter keystroke) to prompt activity. After `MAX_NUDGES` attempts without progress, the Leader treats the pane as stuck.
+### Exponential Backoff Restarts
+If a Worker or Verifier process crashes, the Leader restarts it with exponential backoff: 5s, 10s, 20s, 60s. After `MAX_RESTARTS` consecutive failures, the Leader writes a BLOCKED sentinel.
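
Annotation: the documented delay sequence can be expressed as a lookup table in Python. Two things are assumptions of this sketch: that restarts beyond the fourth keep waiting 60 seconds, and that the BLOCKED sentinel is written once the failure count reaches `MAX_RESTARTS`. The function names are hypothetical.

```python
BACKOFF_SCHEDULE_S = (5, 10, 20, 60)  # delays listed in the documentation


def restart_delay(consecutive_failures):
    """Delay in seconds before the restart that follows the Nth consecutive crash."""
    if consecutive_failures < 1:
        raise ValueError("consecutive_failures must be >= 1")
    idx = min(consecutive_failures, len(BACKOFF_SCHEDULE_S)) - 1
    return BACKOFF_SCHEDULE_S[idx]


def should_block(consecutive_failures, max_restarts):
    """True once the crash count has reached the restart budget."""
    return consecutive_failures >= max_restarts
```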
+### Per-Iteration Timeout
+Each iteration has a configurable timeout (`ITER_TIMEOUT`, default 600s). If a Worker does not produce an `iter-signal.json` within this period, the Leader kills the process and records the timeout.
+### Static Model Routing
+Unlike Agent() mode where the LLM Leader dynamically selects models, tmux mode uses static model routing via environment variables:
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `WORKER_MODEL` | `sonnet` | Model for Worker invocations |
+| `VERIFIER_MODEL` | `opus` | Model for Verifier invocations |
+### Session Config
+Session metadata is stored in `logs/<slug>/session-config.json`:
+```json
+{
+  "session_name": "rlp-desk-<slug>-20260318-143000",
+  "leader_pane": "%0",
+  "worker_pane": "%1",
+  "verifier_pane": "%2",
+  "created_at": "2026-03-18T14:30:00Z"
+}
+```
+This file is used by the `status` and `clean` commands to find and interact with the running tmux session.

package/examples/calculator/.claude/ralph-desk/context/loop-test-latest.md ADDED

@@ -0,0 +1,12 @@
+# loop-test - Latest Context
+## Current Frontier
+### Completed
+### In Progress
+### Next
+- US-001: calc.py — Basic Operations
+## Key Decisions
+## Known Issues
+## Files Changed This Iteration
+## Verification Status

package/examples/calculator/.claude/ralph-desk/logs/loop-test/iter-001.worker-output.log ADDED

File without changes

package/examples/calculator/.claude/ralph-desk/logs/loop-test/iter-001.worker-prompt.md ADDED

@@ -0,0 +1,38 @@
+Execute the plan for loop-test.
+Required reads every iteration:
+- PRD: .claude/ralph-desk/plans/prd-loop-test.md
+- Test Spec: .claude/ralph-desk/plans/test-spec-loop-test.md
+- Campaign Memory: .claude/ralph-desk/memos/loop-test-memory.md
+- Latest Context: .claude/ralph-desk/context/loop-test-latest.md
+CRITICAL RULE: Work on only ONE User Story per iteration.
+- Check campaign memory's "Next Iteration Contract" first and do that.
+- Do not touch already-completed stories.
+Iteration rules:
+- Use fresh context only; do NOT depend on prior chat history.
+- Execute exactly ONE bounded next action (ONE user story).
+- Refresh context file with the current frontier.
+- Rewrite campaign memory in full.
+MANDATORY: When done, write the following signal file:
+- Path: .claude/ralph-desk/memos/loop-test-iter-signal.json
+- Format: {"iteration": N, "status": "continue|verify|blocked", "summary": "what was done", "timestamp": "ISO"}
+- Status values:
+  - "continue" = current story done but other stories remain
+  - "verify" = all stories complete + done-claim written
+  - "blocked" = autonomous blocker
+Stop behavior:
+- Current story done but other stories remain → memory stop=continue, signal status=continue
+- All stories complete + all tests pass → write done-claim JSON (.claude/ralph-desk/memos/loop-test-done-claim.json) + signal status=verify
+- Autonomous blocker → write blocked.md + signal status=blocked
+Objective: Implement a Python calculator module: calc.py (4 functions + type hints + ValueError) + test_calc.py (pytest, 8+ tests, all passed)
+---
+## Iteration Context
+- **Iteration**: 1
+- **Memory Stop Status**: continue
+- **Next Iteration Contract**: Start from the beginning: read PRD and implement US-001 (calc.py with 4 functions).

package/examples/calculator/.claude/ralph-desk/logs/loop-test/iter-001.worker-trigger.sh ADDED

@@ -0,0 +1,28 @@
+#!/bin/zsh
+# Trigger for iteration 1 worker - generated by run_ralph_desk.zsh
+# DO NOT use exec here -- it breaks heartbeat cleanup
+HEARTBEAT_FILE="/Users/kyjin/dev/own/ai-dev-methodologies/rlp-desk/examples/calculator/.claude/ralph-desk/logs/loop-test/worker-heartbeat.json"
+# Background heartbeat writer (omc-teams pattern)
+(
+  while true; do
+    echo '{"ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","pid":'"$$"'}' > "${HEARTBEAT_FILE}.tmp.$$"
+    mv "${HEARTBEAT_FILE}.tmp.$$" "$HEARTBEAT_FILE"
+    sleep 15
+  done
+) &
+HEARTBEAT_PID=$!
+# Run claude with fresh context (governance.md s7 step 5)
+claude -p "$(cat /Users/kyjin/dev/own/ai-dev-methodologies/rlp-desk/examples/calculator/.claude/ralph-desk/logs/loop-test/iter-001.worker-prompt.md)" \
+  --model sonnet \
+  --dangerously-skip-permissions \
+  --output-format text \
+  2>&1 | tee /Users/kyjin/dev/own/ai-dev-methodologies/rlp-desk/examples/calculator/.claude/ralph-desk/logs/loop-test/iter-001.worker-output.log
+# Cleanup heartbeat writer
+kill $HEARTBEAT_PID 2>/dev/null
+wait $HEARTBEAT_PID 2>/dev/null
+echo '{"ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","status":"exited"}' > "${HEARTBEAT_FILE}.tmp.$$"
+mv "${HEARTBEAT_FILE}.tmp.$$" "$HEARTBEAT_FILE"