npm - @ai-dev-methodologies/rlp-desk - Versions diffs - 0.2.2 → 0.2.4 - Mend

@ai-dev-methodologies/rlp-desk 0.2.2 → 0.2.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3) hide show

package/package.json +1 -1
package/src/commands/rlp-desk.md +18 -2
package/src/scripts/run_ralph_desk.zsh +68 -3

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@ai-dev-methodologies/rlp-desk",
-  "version": "0.2.2",
+  "version": "0.2.4",
   "description": "Fresh-context iterative loops for Claude Code — autonomous task completion with independent verification",
   "scripts": {
     "postinstall": "node scripts/postinstall.js",

package/src/commands/rlp-desk.md CHANGED Viewed

@@ -97,7 +97,7 @@ Options (parse from `$ARGUMENTS`):
 - `--consensus-scope all|final-only` — when consensus runs (default: `all`)
   - `all`: consensus runs on every verify (current behavior)
   - `final-only`: consensus only on final ALL verify
-- `--debug` — enable debug logging (tmux mode only, writes to logs/<slug>/debug.log)
+- `--debug` — enable debug logging (writes to logs/<slug>/debug.log)
 ### Mode Selection
@@ -144,11 +144,14 @@ DEBUG=<1 if --debug, else 0> \
 1. Validate scaffold: `.claude/ralph-desk/prompts/<slug>.worker.prompt.md` etc.
 2. Check sentinels (complete/blocked). Found → tell user `/rlp-desk clean <slug>`.
 3. Clean previous `done-claim.json`, `verify-verdict.json`.
+4. If `--debug`: create/clear `logs/<slug>/debug.log`. Define a helper: to "debug_log" means append a timestamped line to this file via `Bash("echo \"[$(date '+%Y-%m-%d %H:%M:%S')] $msg\" >> .claude/ralph-desk/logs/<slug>/debug.log")`.
 ### Leader Loop
 **CRITICAL: DO NOT STOP between iterations.** You MUST continue the loop automatically until a sentinel is written (COMPLETE or BLOCKED) or max_iter is reached. Do NOT pause to ask the user. Do NOT wait for confirmation. The loop is fully autonomous — just report each iteration result briefly and immediately proceed to the next iteration.
+If `--debug`, at loop start debug_log: `[PLAN] slug=<slug> max_iter=<N> worker_engine=<engine> worker_model=<model> verifier_engine=<engine> verifier_model=<model> verify_mode=<mode> consensus=<0|1> consensus_scope=<scope>`
 For each iteration (1 to max_iter):
 **① Check sentinels**
@@ -166,11 +169,13 @@ rm -f .claude/ralph-desk/memos/<slug>-verify-verdict.json
 **② Read memory.md** → Stop Status, Next Iteration Contract
 - Also read **Completed Stories** → verified work so far
 - Also read **Key Decisions** → settled architectural choices
+- If `--debug`: debug_log `[EXEC] iter=N phase=read_memory stop_status=<status> contract="<summary>"`
 **③ Decide model** (§4 of governance.md)
 - Previous iteration failed → upgrade model
 - Simple task → downgrade
 - User specified → use that
+- If `--debug`: debug_log `[EXEC] iter=N phase=model_select worker_model=<model> reason=<reason>`
 **④ Build worker prompt**
 - Read `.claude/ralph-desk/prompts/<slug>.worker.prompt.md`
@@ -178,6 +183,7 @@ rm -f .claude/ralph-desk/memos/<slug>-verify-verdict.json
 - Write to `.claude/ralph-desk/logs/<slug>/iter-NNN.worker-prompt.md` (audit trail)
 **⑤ Execute Worker**
+- If `--debug`: debug_log `[EXEC] iter=N phase=worker engine=<engine> model=<model> dispatched=true`
 If `--worker-engine claude` (default):
 ```
@@ -199,11 +205,14 @@ Bash("codex exec --model <worker_codex_model> --reasoning-effort <worker_codex_r
 - Codex runs as a subprocess via Bash(), not Agent().
 - Each Bash() call = fresh context for codex.
+- If `--debug`: debug_log `[EXEC] iter=N phase=worker_done engine=<engine>`
 **⑥ Read memory.md again** (Worker updated it)
 - `stop=continue` → go to ⑧
 - `stop=verify` → go to ⑦
 - `stop=blocked` → write BLOCKED sentinel, stop
 - Also read `iter-signal.json` for `us_id` field (which US was just completed)
+- If `--debug`: debug_log `[EXEC] iter=N phase=worker_signal status=<stop_status> us_id=<us_id>`
 **⑦ Execute Verifier**
@@ -225,6 +234,7 @@ Bash("codex exec --model <worker_codex_model> --reasoning-effort <worker_codex_r
 - Verifier checks all AC at once
 **⑦a Dispatch Verifier**
+- If `--debug`: debug_log `[EXEC] iter=N phase=verifier engine=<engine> model=<model> scope=<us_id> dispatched=true`
 If `--verifier-engine claude` (default):
 ```
@@ -263,6 +273,8 @@ After the primary verifier runs, run a second verifier with the OTHER engine:
     5. Go to ⑧ with fix contract as next Worker contract
   - `request_info` → Leader reads Verifier's questions, decides outcome (or relays to Worker in next contract) → go to ⑧
   - `blocked` → write BLOCKED sentinel, stop
+- If `--debug`: debug_log `[EXEC] iter=N phase=verdict engine=<engine> verdict=<pass|fail|request_info> us_id=<us_id>`
+- If `--debug` and consensus: debug_log `[EXEC] iter=N phase=consensus claude=<verdict> codex=<verdict> round=<N>`
 **⑧ Write result log and report to user, continue loop**
 - Write `logs/<slug>/iter-NNN.result.md`:
@@ -271,6 +283,10 @@ After the primary verifier runs, run a second verifier with the OTHER engine:
   - Verifier verdict `[leader-measured]`
 - Write `status.json`
 - Report: iteration N, phase, model used, result
+- If `--debug`: debug_log `[EXEC] iter=N phase=result status=<result> consecutive_failures=<N> verified_us=<list>`
+At loop end (COMPLETE, BLOCKED, or TIMEOUT):
+- If `--debug`: debug_log `[VALIDATE] result=<COMPLETE|BLOCKED|TIMEOUT> iterations=<N> verified_us=<list>`
 ### Circuit Breaker
 - context-latest.md unchanged 3 iterations → BLOCKED
@@ -342,7 +358,7 @@ Run options:
   --verify-mode per-us|batch Verification strategy (default: per-us)
   --verify-consensus         Cross-engine consensus verification
   --consensus-scope SCOPE    When consensus runs: all|final-only (default: all)
-  --debug                    Debug logging (tmux mode only)
+  --debug                    Debug logging (logs/<slug>/debug.log)
 ```
 ## Architecture

package/src/scripts/run_ralph_desk.zsh CHANGED Viewed

@@ -1411,7 +1411,56 @@ run_consensus_verification() {
     # Consensus disagreement
     log_debug "[EXEC] iter=$iter phase=consensus_disagreement round=$CONSENSUS_ROUND claude=$CLAUDE_VERDICT codex=$CODEX_VERDICT action=fix_contract"
-    # Either fails → build combined fix contract
+    # --- Pre-existing failure detection ---
+    # Get files changed by Worker in this iteration
+    local worker_changed_files=""
+    worker_changed_files=$(cd "$ROOT" && git diff --name-only HEAD~1 HEAD 2>/dev/null || echo "")
+    log_debug "[EXEC] iter=$iter worker_changed_files=\"$worker_changed_files\""
+    # Check if ALL failing issues reference files NOT touched by the worker
+    local has_worker_caused_issues=0
+    local failing_verdict_file=""
+    if [[ "$CLAUDE_VERDICT" = "fail" ]]; then failing_verdict_file="$claude_verdict_file"
+    elif [[ "$CODEX_VERDICT" = "fail" ]]; then failing_verdict_file="$codex_verdict_file"
+    fi
+    if [[ -n "$failing_verdict_file" && -n "$worker_changed_files" ]]; then
+      # Extract file paths mentioned in issues and check against worker changes
+      local issue_files
+      issue_files=$(jq -r '.issues[]? | .description // ""' "$failing_verdict_file" 2>/dev/null)
+      for changed_file in $(echo "$worker_changed_files"); do
+        if echo "$issue_files" | grep -q "$changed_file" 2>/dev/null; then
+          has_worker_caused_issues=1
+          break
+        fi
+      done
+      if (( ! has_worker_caused_issues )); then
+        # None of the failing issues reference files the worker changed
+        log "  Pre-existing failure detected: failing tests are NOT in files changed by Worker."
+        log_debug "[EXEC] iter=$iter pre_existing_failure=true failing_engine=$([ \"$CLAUDE_VERDICT\" = 'fail' ] && echo claude || echo codex)"
+        # Treat as pass — the other engine passed, and failures are pre-existing
+        {
+          echo '{'
+          echo '  "verdict": "pass",'
+          echo '  "verified_at_utc": "'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'",'
+          echo '  "summary": "Consensus PASS (pre-existing failure filtered): claude='"$CLAUDE_VERDICT"' codex='"$CODEX_VERDICT"'. Failing tests not in worker-changed files.",'
+          echo '  "recommended_state_transition": "complete",'
+          echo '  "pre_existing_failure": true,'
+          echo '  "worker_changed_files": "'"$(echo $worker_changed_files | tr '\n' ',')"'",'
+          echo '  "consensus": {'
+          echo '    "claude": { "verdict": "'"$CLAUDE_VERDICT"'" },'
+          echo '    "codex": { "verdict": "'"$CODEX_VERDICT"'" },'
+          echo '    "round": '"$CONSENSUS_ROUND"
+          echo '  }'
+          echo '}'
+        } | atomic_write "$VERDICT_FILE"
+        return 0
+      fi
+    fi
+    # --- Worker-caused failure: build fix contract as before ---
     local fix_contract="$LOGS_DIR/iter-$(printf '%03d' $iter).fix-contract.md"
     {
       echo "# Fix Contract (Consensus Round $CONSENSUS_ROUND, iteration $iter)"
@@ -1577,9 +1626,13 @@ main() {
   PREV_CONTEXT_HASH=$(compute_context_hash)
   # --- governance.md s7: Leader Loop ---
+  local HARD_CEILING=$(( ITER_TIMEOUT * 3 ))  # absolute max per iteration (no extensions beyond this)
   for (( ITERATION = 1; ITERATION <= MAX_ITER; ITERATION++ )); do
     log ""
     log "========== Iteration $ITERATION / $MAX_ITER =========="
+    local ITER_START_TIME
+    ITER_START_TIME=$(date +%s)
     local _iter_contract=""
     _iter_contract=$(sed -n '/^## Next Iteration Contract$/,/^## /{ /^## Next/d; /^## [^N]/d; p; }' "$MEMORY_FILE" 2>/dev/null | head -1 | tr '\n' ' ')
     log_debug "[EXEC] iter=$ITERATION start contract=\"${_iter_contract:-none}\""
@@ -1692,8 +1745,20 @@ main() {
         local worker_cmd
         worker_cmd=$(tmux display-message -p -t "$WORKER_PANE" '#{pane_current_command}' 2>/dev/null)
         if [[ "$worker_cmd" == "node" || "$worker_cmd" == "claude" || "$worker_cmd" == "codex" ]]; then
-          log "  Worker timed out but still active ($worker_cmd). Extending poll..."
-          log_debug "[EXEC] iter=$ITERATION timeout_active=true process=$worker_cmd"
+          # Check hard ceiling before extending
+          local iter_elapsed=$(( $(date +%s) - ITER_START_TIME ))
+          if (( iter_elapsed >= HARD_CEILING )); then
+            log_error "Worker hit hard ceiling (${HARD_CEILING}s = 3x iter_timeout). Killing iteration."
+            log_debug "[EXEC] iter=$ITERATION hard_ceiling_hit=true elapsed=${iter_elapsed}s ceiling=${HARD_CEILING}s process=$worker_cmd"
+            tmux send-keys -t "$WORKER_PANE" C-c 2>/dev/null
+            sleep 1
+            WORKER_PANE=$(replace_worker_pane "$WORKER_PANE" "worker")
+            update_status "worker" "hard_timeout"
+            worker_poll_done=1
+            break
+          fi
+          log "  Worker timed out but still active ($worker_cmd). Extending poll... (${iter_elapsed}s/${HARD_CEILING}s)"
+          log_debug "[EXEC] iter=$ITERATION timeout_active=true process=$worker_cmd elapsed=${iter_elapsed}s ceiling=${HARD_CEILING}s"
           log_debug "[EXEC] iter=$ITERATION poll_extended=true worker_cmd=$worker_cmd"
           update_status "worker" "slow"
           # Loop continues — re-poll same iteration