@ai-dev-methodologies/rlp-desk 0.15.0 → 0.15.2

@@ -189,7 +189,7 @@ Tell the user:
189
189
  /rlp-desk run <actual-slug> --debug
190
190
 
191
191
  # Full options reference:
192
- # --mode agent|tmux (default: agent)
192
+ # --mode native|tmux (default: native; legacy `agent` redirects to native)
193
193
  # --worker-model MODEL haiku|sonnet|opus or gpt-5.5:high|spark:high (default: haiku)
194
194
  # --lock-worker-model disable auto model upgrade
195
195
  # --verifier-model MODEL per-US verifier (default: sonnet)
@@ -217,14 +217,14 @@ Tell the user:
217
217
  # ★ Recommended: tmux mode + claude-only (real-time visibility):
218
218
  /rlp-desk run <actual-slug> --mode tmux --debug
219
219
 
220
- # Agent mode:
221
- /rlp-desk run <actual-slug> --debug
220
+ # Native Agent() mode (slash leader, short / interactive campaigns):
221
+ /rlp-desk run <actual-slug> --mode native --debug
222
222
 
223
223
  # Install codex for cost savings + cross-engine blind-spot coverage:
224
224
  npm install -g @openai/codex
225
225
 
226
226
  # Full options reference:
227
- # --mode agent|tmux (default: agent)
227
+ # --mode native|tmux (default: native; legacy `agent` redirects to native)
228
228
  # --worker-model MODEL haiku|sonnet|opus (default: haiku)
229
229
  # --lock-worker-model disable auto model upgrade
230
230
  # --verifier-model MODEL per-US verifier (default: sonnet)
@@ -252,7 +252,7 @@ Tell the user:
252
252
  **YOU are the leader. Do NOT delegate leadership.**
253
253
 
254
254
  Options (parse from `$ARGUMENTS`):
255
- - `--mode agent|tmux` (default: `agent`) — execution mode
255
+ - `--mode native|tmux` (default: `native`) — execution mode. `native` = slash command is the leader, calls `Agent(...)` (claude) and `Bash("codex exec ...")` (codex). `tmux` = slash command spawns the zsh runner via `node run.mjs --mode tmux`. Legacy `--mode agent` typed against the slash command emits a deprecation notice and redirects to `--mode native` (NOT to be confused with `node run.mjs --mode agent`, which is the deprecated Node-leader alpha — see "Direct Node CLI invocation" below).
256
256
  - `--worker-model MODEL` (default: `haiku`) — Worker model. Format: `model` = claude engine, `model:reasoning` = codex engine. Examples: `haiku`, `sonnet`, `opus`, `spark:high`, `gpt-5.5:high`. Parsed by `parse_model_flag()` which auto-splits engine/model/reasoning.
257
257
  - `--lock-worker-model` — disable automatic model upgrade on failure. Worker stays on the specified model regardless of consecutive failures.
258
258
  - `--verifier-model MODEL` (default: `sonnet`) — per-US verification model. Campaign-fixed (no progressive upgrade). Lighter than final verifier.
@@ -284,20 +284,26 @@ Cross-project aggregation: scan `~/.claude/ralph-desk/analytics/` and read each
284
284
 
285
285
  ### Mode Selection
286
286
 
287
- Parse the `--mode` flag. If absent or `agent`, use the Agent() path below. If `tmux`, use the Tmux path.
287
+ Parse the `--mode` flag. Slash command canonical labels:
288
+
289
+ - `--mode native` (default): **Native Agent() path** below. The slash command IS the leader. It calls `Agent(description=…, model=<m>, mode="bypassPermissions", prompt=…)` for claude workers/verifiers and `Bash("codex exec --model <m> --reasoning-effort <r> <prompt>")` for codex workers/verifiers.
290
+ - `--mode tmux`: **zsh runner path** below. The slash command shells out to `node ~/.claude/ralph-desk/node/run.mjs run --mode tmux …` which spawns `run_ralph_desk.zsh` as a subprocess.
288
291
 
289
- > **v0.14.0 stability tiers:**
290
- > - `--mode tmux` is the **stable, production** path. The Node leader (`run.mjs`)
291
- > now routes tmux invocations to `~/.claude/ralph-desk/run_ralph_desk.zsh`
292
- > as a subprocess; that runner has the full safety net (heartbeat,
293
- > copy-mode guard, prompt-stall, no-progress detection, claude model
294
- > upgrade chain). Recommend this for autonomous campaigns.
295
- > - `--mode agent` is **alpha** (Node-native LLM-driven Leader). The runner
296
- > emits a stderr warning when this mode is invoked.
292
+ Legacy `--mode agent` typed against this slash command emits a deprecation notice and redirects to `--mode native`. **Do NOT confuse `/rlp-desk run --mode agent`** (slash command, redirects to Native Agent()) **with** `node run.mjs run --mode agent` (deprecated Node-leader alpha, direct CLI invocation, unrelated code path — see "Direct Node CLI invocation" below).
293
+
294
+ > **Stability tiers:**
295
+ > - `--mode tmux` is the **stable, production** path. The slash command spawns
296
+ > the Node leader, which spawns `run_ralph_desk.zsh` — the zsh runner has the
297
+ > full safety net (heartbeat, copy-mode guard, prompt-stall, no-progress
298
+ > detection, claude model upgrade chain). Recommend this for autonomous
299
+ > campaigns.
300
+ > - `--mode native` is **for short / interactive campaigns**. Native Agent()
301
+ > has no timeout API (platform constraint). Long-running autonomous campaigns
302
+ > SHOULD use `--mode tmux`.
297
303
 
298
304
  #### Tmux Mode (`--mode tmux`)
299
305
 
300
- When `--mode tmux` is specified (v0.14.0+: `run.mjs` accepts the same flags as before but spawns `run_ralph_desk.zsh` as a subprocess and inherits stdio. Flywheel and self-verification flags are not honored under tmux mode — they require `--mode agent`):
306
+ When `--mode tmux` is specified (v0.14.0+: `run.mjs` accepts the same flags as before but spawns `run_ralph_desk.zsh` as a subprocess and inherits stdio. Flywheel and self-verification flags are not honored under tmux mode — they currently require the deprecated Node-leader direct-CLI path `node run.mjs --mode agent` (see "Direct Node CLI invocation" below). Native Agent() port is a post-Node-leader-retirement task):
301
307
 
302
308
  1. **Validate scaffold** — same as Agent() mode: check `.rlp-desk/prompts/<slug>.worker.prompt.md` etc.
303
309
  2. **Check sentinels** — same as Agent() mode.
@@ -331,7 +337,7 @@ node ~/.claude/ralph-desk/node/run.mjs run '<slug>' \
331
337
 
332
338
  **Env-var translation (v5.7 §4.1)**: the slash command historically built `LANE_MODE=strict zsh ...` and `TEST_DENSITY_MODE=strict zsh ...` from CLI flags. The Node leader uses CLI flags instead — translate `--lane-strict` and `--test-density-strict` into the corresponding flags. Direct env-var users (running zsh directly) are unaffected.
333
339
 
334
- 6. **If the Node leader exits with error** — report the error to the user and STOP. Do NOT attempt to work around it. Do NOT create tmux sessions yourself. Do NOT re-launch in a different way. Tell the user what went wrong and suggest `--mode agent` as alternative.
340
+ 6. **If the Node leader exits with error** — report the error to the user and STOP. Do NOT attempt to work around it. Do NOT create tmux sessions yourself. Do NOT re-launch in a different way. Tell the user what went wrong and suggest `--mode native` (slash command Native Agent() path) as alternative.
335
341
  7. **If successful** — tell the user the tmux session has been started. The Node leader takes over as the deterministic Leader. No Agent() calls are made in tmux mode.
336
342
 
337
343
  **IMPORTANT RULES:**
@@ -339,35 +345,36 @@ node ~/.claude/ralph-desk/node/run.mjs run '<slug>' \
339
345
  - MUST launch with `run_in_background: true` so `/rlp-desk` returns control immediately while preserving live tmux visibility.
340
346
  - Run-in-background is used so the shell can keep the command visible and keep the pane layout stable for status checks and completion flow.
341
347
  - Do NOT kill panes after completion. Panes stay alive for inspection. User cleans up with `/rlp-desk clean <slug> --kill-session`.
342
- - v0.14.0: `--with-self-verification`, `--flywheel`, and `--flywheel-guard` are **not honored** under `--mode tmux` — the zsh runner has no SV/flywheel implementation. The Node leader emits a stderr WARNING listing the dropped flags. For SV/flywheel, use `--mode agent` (alpha).
343
- - The slash command always invokes `node ~/.claude/ralph-desk/node/run.mjs run --mode tmux ...`. Do NOT invoke `~/.claude/ralph-desk/run_ralph_desk.zsh` directly — the Node router resolves the runner path, runs legacy detection, and surfaces actionable errors when the runner is missing.
348
+ - v0.14.0: `--with-self-verification`, `--flywheel`, and `--flywheel-guard` are **not honored** under `--mode tmux` — the zsh runner has no SV/flywheel implementation. The Node leader emits a stderr WARNING listing the dropped flags. For SV/flywheel today, use the deprecated Node-leader direct-CLI path `node run.mjs --mode agent` (see "Direct Node CLI invocation" below). The slash command's Native Agent() (`--mode native`) does not yet implement SV/flywheel — port is a post-Node-leader-retirement task.
349
+ - **For `--mode tmux` only**: the slash command invokes `node ~/.claude/ralph-desk/node/run.mjs run --mode tmux ...`. Do NOT invoke `~/.claude/ralph-desk/run_ralph_desk.zsh` directly — the Node router resolves the runner path, runs legacy detection, and surfaces actionable errors when the runner is missing. **For `--mode native`**, the slash command does NOT invoke the Node CLI — it acts as the leader itself; see Native Agent() Mode section below.
344
350
 
345
351
  **tmux UX model (5 items):**
346
352
  - The session returns immediately after launch (`run_in_background: true`) so the command returns control to the parent CLI.
347
353
  - Worker/Verifier panes remain visible to the user during execution.
348
354
  - Users check progress with the **status command**: `/rlp-desk status <slug>`.
349
355
  - On completion, the command returns a completion notification before the loop ends.
350
- - Agent mode remains unchanged, and no tmux-specific behavior is mixed into Agent mode.
351
-
352
- #### Agent Mode (`--mode agent` or default — **alpha**)
353
-
354
- > **v0.14.0:** Agent mode is the alpha LLM-driven path. The Node port shipped
355
- > without zsh-equivalent safety nets (heartbeat, copy-mode guard, prompt-stall
356
- > timeout, no-progress detection, claude model upgrade chain). The runner
357
- > emits a stderr WARNING when agent mode is invoked. For production
358
- > autonomous campaigns, prefer `--mode tmux`.
359
-
360
- **Why Agent mode is structurally immune to Bug 4/5 (mid-execution prompt hang
361
- & A4 premature dispatch):** Worker/Verifier are dispatched as `Agent(...,
362
- mode="bypassPermissions", ...)`. The subagent runs non-interactively under
363
- the platform's bypass; it has no tmux pane, no TUI surface, and cannot
364
- surface a `[y/N]` prompt to the parent Leader. The auto-dismiss /
365
- prompt-stall / no-progress timeouts in `run_ralph_desk.zsh` (v5.7 §4.13.b /
366
- §4.16 / §4.17) are therefore tmux-only by design. **Tradeoff**: because
367
- `Agent()` has no timeout API, agent-mode iterations are not bounded — if
368
- the platform's `bypassPermissions` ever fails to suppress an interactive
369
- prompt at the SDK level, the call hangs indefinitely with no rlp-desk-side
370
- watchdog. Use `--mode tmux` if you need bounded execution time.
356
+ - Native Agent() mode remains unchanged, and no tmux-specific behavior is mixed into it.
357
+
358
+ #### Native Agent() Mode (`--mode native` or default)
359
+
360
+ The slash command IS the leader. Workers/Verifiers are spawned via `Agent(model=…, mode="bypassPermissions", prompt=…)` (claude) or `Bash("codex exec --model <m> --reasoning-effort <r> <prompt>")` (codex).
361
+
362
+ ### Native Agent() Safety Contract
363
+
364
+ This contract MUST be observed in every iteration of the leader loop below. Future PRs deleting any of these guarantees break the slash command's behavior.
365
+
366
+ 1. **Turn-keepalive**: every status report uses `Bash("echo '...'")` to emit messages. NEVER output plain text without an accompanying tool call. Plain text = turn ends = loop stops. (Mitigation for commit `29fd29b` platform constraint, permanent.)
367
+ 2. **no `subagent_type` parameter**: the `Agent(...)` call form is exactly `Agent(description=…, model=<m>, mode="bypassPermissions", prompt=…)`. Do NOT pass `subagent_type`. (Mitigation for commit `920a31c`: `subagent_type="executor"` overrode `bypassPermissions` and surfaced a permission popup; permanent.)
368
+ 3. **`mode="bypassPermissions"` mandatory**: every claude `Agent()` worker/verifier dispatch must include `mode="bypassPermissions"`.
369
+ 4. **Long-running campaigns: prefer `--mode tmux`** for production. Native Agent() has no timeout API (platform constraint) — if `bypassPermissions` fails to suppress an interactive prompt at the SDK level, the call hangs indefinitely with no rlp-desk-side watchdog.
370
+
371
+ **Why Native Agent() is structurally immune to Bug 4/5 (mid-execution prompt hang & A4 premature dispatch)**: Worker/Verifier run non-interactively under the platform's bypass — they have no tmux pane, no TUI surface, and cannot surface a `[y/N]` prompt to the parent Leader. The auto-dismiss / prompt-stall / no-progress timeouts in `run_ralph_desk.zsh` (v5.7 §4.13.b / §4.16 / §4.17) are therefore tmux-only by design.
372
+
373
+ **Tradeoff**: because `Agent()` has no timeout API, Native Agent() iterations are not bounded — if the platform's `bypassPermissions` ever fails to suppress an interactive prompt at the SDK level, the call hangs indefinitely with no rlp-desk-side watchdog. Use `--mode tmux` if you need bounded execution time.
374
+
375
+ #### Direct Node CLI invocation (`node run.mjs run <slug> --mode agent` — deprecated alpha)
376
+
377
+ Direct invocation of `node ~/.claude/ralph-desk/node/run.mjs run <slug> --mode agent` is **the deprecated Node-leader alpha path**. This is unrelated to the slash command's Native Agent() path above — different code, different leader, different lifecycle. The Node leader currently retains SV/flywheel implementations not yet ported to Native Agent(). The Node CLI emits a deprecation banner on this mode and is scheduled for hard-error in the next major release. For production tmux orchestration, use `--mode tmux`. For Claude Code Native Agent() campaigns, use `/rlp-desk run <slug> --mode native` from a Claude Code session.
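The slash-command side of the `--mode` handling described above (default `native`, legacy `agent` redirected to `native` with a deprecation notice) can be sketched roughly as follows. This is a minimal illustration only: `normalizeSlashMode`, its return shape, and the notice wording are hypothetical and do not appear in the package; in the real slash command the parsing is performed by the LLM leader reading `$ARGUMENTS`, not by a JavaScript function.

```javascript
// Hypothetical helper (not part of rlp-desk): normalizes the slash command's
// --mode value according to the rules documented in the hunk above.
function normalizeSlashMode(rawMode) {
  // An absent flag falls back to the documented default, `native`.
  const mode = rawMode ?? 'native';

  if (mode === 'agent') {
    // Legacy label: redirect to `native` and carry a deprecation notice.
    // The notice text here is a placeholder, not the package's wording.
    return {
      mode: 'native',
      notice: 'DEPRECATED: --mode agent redirects to --mode native.',
    };
  }

  if (mode === 'native' || mode === 'tmux') {
    return { mode, notice: null };
  }

  // Assumption: unknown values are rejected; the diff does not show this case.
  throw new Error(`Unknown --mode value: ${mode}`);
}
```

Under this sketch, `normalizeSlashMode('agent')` yields `native` plus a notice, while the same `agent` value passed directly to `node run.mjs` takes the separate deprecated Node-leader path.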
371
378
 
372
379
  ### Preparation
373
380
  1. Validate scaffold: `.rlp-desk/prompts/<slug>.worker.prompt.md` etc.
@@ -775,13 +782,13 @@ Example:
775
782
  ```
776
783
  /rlp-desk brainstorm <description> Plan before init (interactive)
777
784
  /rlp-desk init <slug> [objective] Create project scaffold
778
- /rlp-desk run <slug> [options] Run loop (agent=LLM leader, tmux=shell leader)
785
+ /rlp-desk run <slug> [options] Run loop (native=Native Agent() leader (slash), tmux=zsh leader (production); legacy `agent` redirects to `native` — direct Node CLI `--mode agent` is deprecated alpha)
779
786
  /rlp-desk status <slug> Show loop status
780
787
  /rlp-desk logs <slug> [N] Show iteration log
781
788
  /rlp-desk clean <slug> [--kill-session] Reset for re-run (--kill-session kills tmux)
782
789
 
783
790
  Run options:
784
- --mode agent|tmux Execution mode (default: agent)
791
+ --mode native|tmux Execution mode (default: native)
785
792
  --worker-model MODEL Worker model: haiku|sonnet|opus or gpt-5.5:high|spark:high (default: haiku)
786
793
  --lock-worker-model Disable auto model upgrade on failure
787
794
  --verifier-model MODEL per-US verifier (default: sonnet)
@@ -799,15 +806,18 @@ Run options:
799
806
 
800
807
  ## Architecture
801
808
 
802
- ### Agent Mode (default: `--mode agent`)
809
+ ### Native Agent() Mode (default: `--mode native`)
803
810
  ```
804
- [This session = LEADER (LLM)]
811
+ [This session = LEADER (LLM, slash command itself)]
805
812
 
806
- Agent()├──▶ [Worker: executor (fresh context)]
813
+ Agent()├──▶ [Worker: claude subagent (fresh context, mode="bypassPermissions")]
807
814
  │ └── reads desk files, implements, updates memory
808
815
 
809
- Agent()└──▶ [Verifier: executor (fresh context)]
810
- └── reads done-claim, runs checks, writes verdict
816
+ Agent()└──▶ [Verifier: claude subagent (fresh context, mode="bypassPermissions")]
817
+ └── reads done-claim, runs checks, writes verdict
818
+
819
+ Bash() ───▶ [Worker/Verifier: codex CLI subprocess]
820
+ └── `codex exec --model <m> --reasoning-effort <r> <prompt>`
811
821
  ```
812
822
 
813
823
  ### Tmux Mode (`--mode tmux`)
package/src/node/run.mjs CHANGED
@@ -48,14 +48,14 @@ function buildHelpText() {
48
48
  'Commands:',
49
49
  ' brainstorm <description> Plan before init (not implemented in the Node rewrite yet)',
50
50
  ' init <slug> [objective] Create project scaffold',
51
- ' run <slug> [options] Run loop (agent=LLM leader, tmux=shell leader)',
51
+ ' run <slug> [options] Run loop (tmux=zsh leader [production], agent=Node leader [deprecated alpha], native=slash-only error)',
52
52
  ' status <slug> Show loop status',
53
53
  ' logs <slug> [N] Show iteration log (not implemented in the Node rewrite yet)',
54
54
  ' clean <slug> [--kill-session] Reset for re-run (not implemented in the Node rewrite yet)',
55
55
  ' resume <slug> Resume loop (not implemented in the Node rewrite yet)',
56
56
  '',
57
57
  'Run Options:',
58
- ' --mode agent|tmux',
58
+ ' --mode tmux|agent|native (CLI: tmux=production, agent=deprecated, native=errors with redirect to slash command)',
59
59
  ' --worker-model MODEL',
60
60
  ' --lock-worker-model',
61
61
  ' --verifier-model MODEL',
@@ -358,10 +358,32 @@ async function runRunCommand(args, deps) {
358
358
  return runTmuxViaZsh(slug, options, deps);
359
359
  }
360
360
 
361
- // v0.14.0: agent mode is the alpha LLM-driven path. The Node port shipped
362
- // without zsh-equivalent safety nets (heartbeat, copy-mode guard,
363
- // prompt-stall timeout, no-progress detection, claude model upgrade chain).
364
- // Surface that explicitly so production users pick --mode tmux instead.
361
+ // P1.b (native-agent-revert plan v7): --mode native is slash-command-only.
362
+ // The Node CLI does not implement Native Agent(); that path lives in
363
+ // src/commands/rlp-desk.md and runs in a Claude Code session. Surface a
364
+ // hard error here so direct CLI invocation does not silently fall through
365
+ // to the deprecated Node-leader path.
366
+ if (options.mode === 'native') {
367
+ write(
368
+ deps.stderr,
369
+ 'ERROR: --mode native is slash-command-only. The Node CLI does not implement it.',
370
+ );
371
+ write(
372
+ deps.stderr,
373
+ 'Use `/rlp-desk run <slug> --mode native` from a Claude Code session,',
374
+ );
375
+ write(
376
+ deps.stderr,
377
+ 'or use `--mode tmux` (production) / `--mode agent` (deprecated alpha) for direct CLI invocation.',
378
+ );
379
+ return 2;
380
+ }
381
+
382
+ // P1.b: --mode agent (Node-leader alpha) is deprecated. The slash command's
383
+ // Native Agent() path (`/rlp-desk run --mode native`) is unrelated — different
384
+ // code, different leader. We keep the Node-leader behavior unchanged for
385
+ // backward compatibility but surface a strong deprecation banner so wrappers
386
+ // can migrate before the next major release hard-errors this mode.
365
387
  if (
366
388
  options.mode === 'agent'
367
389
  && !process.env.RLP_DESK_QUIET_WARNINGS
@@ -369,7 +391,32 @@ async function runRunCommand(args, deps) {
369
391
  ) {
370
392
  write(
371
393
  deps.stderr,
372
- 'WARNING: --mode agent is alpha. For production tmux orchestration, prefer --mode tmux (zsh-backed, stable).',
394
+ 'WARNING: --mode agent (Node-leader alpha) is deprecated.',
395
+ );
396
+ write(
397
+ deps.stderr,
398
+ 'This is the direct Node-CLI alpha path — UNRELATED to the slash command Native Agent() path (`/rlp-desk run --mode native`).',
399
+ );
400
+ write(
401
+ deps.stderr,
402
+ 'For production tmux orchestration, use `--mode tmux`.',
403
+ );
404
+ write(
405
+ deps.stderr,
406
+ 'For Claude Code Native Agent() campaigns, use `/rlp-desk run --mode native` from a Claude Code session.',
407
+ );
408
+ // 2026-05-07 (v0.15.2): rlp-desk is in active stabilization. Goal: reach
409
+ // omc /team/ralph/ralplan level of reliability while preserving
410
+ // rlp-desk's self-driving advantages (multi-engine consensus, multi-mission
411
+ // queue, BLOCK_TAGS taxonomy, structured SV reports). omc is the BENCHMARK,
412
+ // not a replacement. See docs/plans/v0.15-stabilization-plan.md.
413
+ write(
414
+ deps.stderr,
415
+ 'SCHEDULED REMOVAL: --mode agent (Node CLI alpha) will be removed in a future major release. Date TBD until stabilization milestones complete.',
416
+ );
417
+ write(
418
+ deps.stderr,
419
+ 'STABILIZATION IN PROGRESS: rlp-desk is hardening against the 10-bug regression pattern observed 2026-05-01..05-07. See docs/plans/v0.15-stabilization-plan.md.',
373
420
  );
374
421
  }
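For orientation, the CLI-side gate added in this hunk can be summarized as a small decision function. This is a simplified sketch under stated assumptions, not the code in `run.mjs`: `gateCliMode` and its return shape are hypothetical, the real warning condition contains at least one further clause that falls between the hunks shown here, and the handling of unrecognized modes is not visible in the diff.

```javascript
// Hypothetical summary of the Node CLI mode gate (names are illustrative).
// `env` stands in for process.env so the function stays pure and testable.
function gateCliMode(mode, env = {}) {
  if (mode === 'tmux') {
    // Production path: hand off to the zsh runner.
    return { action: 'zsh-runner', exitCode: null, warn: false };
  }
  if (mode === 'native') {
    // Slash-command-only: the CLI prints an error and returns exit code 2.
    return { action: 'reject', exitCode: 2, warn: false };
  }
  if (mode === 'agent') {
    // Deprecated Node-leader alpha: behavior is kept, and a banner is printed
    // unless RLP_DESK_QUIET_WARNINGS is set (other clauses omitted here).
    return {
      action: 'node-leader',
      exitCode: null,
      warn: !env.RLP_DESK_QUIET_WARNINGS,
    };
  }
  // Assumption: anything else is treated as unhandled in this sketch.
  return { action: 'unhandled', exitCode: null, warn: false };
}
```

The point of the hard error for `native` is that a direct CLI call cannot silently fall through to the deprecated Node-leader path.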
375
422
 
package/src/node/runner/campaign-main-loop.mjs CHANGED
@@ -388,6 +388,110 @@ async function readCurrentState(paths, slug, options) {
388
388
  };
389
389
  }
390
390
 
391
+ // PR-A (Bug #10): validate operator-written recovery artifacts. When the
392
+ // operator hand-rolls a `phase=verify` recovery (jq-patches status.json,
393
+ // writes iter-signal.json + done-claim.json by hand, deletes the blocked
394
+ // sentinel), the leader must NOT silently overwrite that work on relaunch.
395
+ // All five checks must pass for the leader to honor the recovery.
396
+ //
397
+ // Returns { ok: boolean, reason: string }. On any failure the caller falls
398
+ // through to the default behavior (worker dispatch) — defensive by design.
399
+ async function _validateOperatorRecoveryArtifacts({ paths, state }) {
400
+ // 1. iter-signal.json + done-claim.json must both exist and parse.
401
+ let signal;
402
+ let doneClaim;
403
+ try {
404
+ signal = await readJsonIfExists(paths.signalFile);
405
+ } catch (err) {
406
+ return { ok: false, reason: `iter-signal.json parse error: ${err?.message ?? err}` };
407
+ }
408
+ if (!signal) return { ok: false, reason: 'iter-signal.json missing' };
409
+
410
+ try {
411
+ doneClaim = await readJsonIfExists(paths.doneClaimFile);
412
+ } catch (err) {
413
+ return { ok: false, reason: `done-claim.json parse error: ${err?.message ?? err}` };
414
+ }
415
+ if (!doneClaim) return { ok: false, reason: 'done-claim.json missing' };
416
+
417
+ // 2. us_id must match status.current_us in BOTH artifacts.
418
+ if (signal.us_id !== state.current_us) {
419
+ return {
420
+ ok: false,
421
+ reason: `iter-signal.us_id (${signal.us_id}) != status.current_us (${state.current_us})`,
422
+ };
423
+ }
424
+ if (doneClaim.us_id !== state.current_us) {
425
+ return {
426
+ ok: false,
427
+ reason: `done-claim.us_id (${doneClaim.us_id}) != status.current_us (${state.current_us})`,
428
+ };
429
+ }
430
+
431
+ // 3. iteration must match status.iteration in BOTH artifacts.
432
+ if (signal.iteration !== state.iteration) {
433
+ return {
434
+ ok: false,
435
+ reason: `iter-signal.iteration (${signal.iteration}) != status.iteration (${state.iteration})`,
436
+ };
437
+ }
438
+ if (doneClaim.iteration !== state.iteration) {
439
+ return {
440
+ ok: false,
441
+ reason: `done-claim.iteration (${doneClaim.iteration}) != status.iteration (${state.iteration})`,
442
+ };
443
+ }
444
+
445
+ // 4. iter_signal_quality must be 'specific' (not generic / vague).
446
+ if (signal.iter_signal_quality !== 'specific') {
447
+ return {
448
+ ok: false,
449
+ reason: `iter-signal.iter_signal_quality (${signal.iter_signal_quality}) != 'specific'`,
450
+ };
451
+ }
452
+
453
+ // 5. Both artifact mtimes must be NEWER than the most recent
454
+ // iter-NNN.worker-prompt.md mtime — guards against operator running
455
+ // `phase=verify` against stale artifacts from a much earlier iteration.
456
+ const promptFile = path.join(
457
+ paths.campaignLogDir,
458
+ `iter-${String(state.iteration).padStart(3, '0')}.worker-prompt.md`,
459
+ );
460
+ let promptMtime = 0;
461
+ try {
462
+ const promptStat = await fs.stat(promptFile);
463
+ promptMtime = promptStat.mtimeMs;
464
+ } catch {
465
+ // No worker-prompt.md for this iteration → check vacuously passes
466
+ // (operator is recovering from a state that never even dispatched yet).
467
+ promptMtime = 0;
468
+ }
469
+ if (promptMtime > 0) {
470
+ let signalMtime = 0;
471
+ let doneClaimMtime = 0;
472
+ try {
473
+ signalMtime = (await fs.stat(paths.signalFile)).mtimeMs;
474
+ doneClaimMtime = (await fs.stat(paths.doneClaimFile)).mtimeMs;
475
+ } catch (err) {
476
+ return { ok: false, reason: `mtime stat failed: ${err?.message ?? err}` };
477
+ }
478
+ if (signalMtime <= promptMtime) {
479
+ return {
480
+ ok: false,
481
+ reason: `iter-signal.json mtime (${signalMtime}) is not strictly newer than worker-prompt mtime (${promptMtime})`,
482
+ };
483
+ }
484
+ if (doneClaimMtime <= promptMtime) {
485
+ return {
486
+ ok: false,
487
+ reason: `done-claim.json mtime (${doneClaimMtime}) is not strictly newer than worker-prompt mtime (${promptMtime})`,
488
+ };
489
+ }
490
+ }
491
+
492
+ return { ok: true, reason: 'all five checks passed' };
493
+ }
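The five checks in the validator above can be restated as a pure function over already-loaded data, which may make the pass/fail rules easier to follow. This is a sketch, not the package's implementation: `checkRecoveryArtifacts`, the `mtimes` object, and the shortened reason strings are hypothetical, and file reading and JSON parsing are assumed to have happened upstream. A `mtimes.prompt` of `0` stands for "no worker-prompt file exists for this iteration".

```javascript
// Hypothetical pure-function restatement of the five recovery checks.
// mtimes values are epoch milliseconds; mtimes.prompt === 0 means the
// worker-prompt file for this iteration does not exist.
function checkRecoveryArtifacts({ status, signal, doneClaim, mtimes }) {
  const fail = (reason) => ({ ok: false, reason });
  const artifacts = [['iter-signal', signal], ['done-claim', doneClaim]];

  // Check 1: both artifacts must be present.
  if (!signal) return fail('iter-signal.json missing');
  if (!doneClaim) return fail('done-claim.json missing');

  // Check 2: us_id must match status.current_us in both artifacts.
  for (const [name, artifact] of artifacts) {
    if (artifact.us_id !== status.current_us) return fail(`${name}.us_id mismatch`);
  }

  // Check 3: iteration must match status.iteration in both artifacts.
  for (const [name, artifact] of artifacts) {
    if (artifact.iteration !== status.iteration) return fail(`${name}.iteration mismatch`);
  }

  // Check 4: the signal must be marked 'specific'.
  if (signal.iter_signal_quality !== 'specific') {
    return fail('iter_signal_quality is not specific');
  }

  // Check 5: both artifacts must be strictly newer than the worker prompt.
  // Skipped (vacuously passing) when no prompt file exists.
  if (mtimes.prompt > 0) {
    if (mtimes.signal <= mtimes.prompt) return fail('iter-signal.json not newer than prompt');
    if (mtimes.doneClaim <= mtimes.prompt) return fail('done-claim.json not newer than prompt');
  }

  return { ok: true, reason: 'all five checks passed' };
}
```

One behavior worth keeping in mind when hand-writing artifacts: the comparisons are strict (`!==`), so a hand-edited `"iteration": "3"` (a string) would not match a numeric `3` in `status.json` and the recovery would be ignored.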
494
+
391
495
  async function appendIterationAnalytics(paths, state, usId, verdict, options) {
392
496
  await appendCampaignAnalytics(paths.analyticsFile, {
393
497
  iter: state.iteration,
@@ -1288,6 +1392,28 @@ async function _runCampaignBody(slug, options, paths, rootDir) {
1288
1392
 
1289
1393
  let fixContractPath = null;
1290
1394
 
1395
+ // PR-A (Bug #10): operator-recovery hygiene. If the operator hand-rolled a
1396
+ // `phase=verify` recovery (jq-patches status.json, writes manual artifacts,
1397
+ // deletes the blocked sentinel), the leader MUST honor that work instead of
1398
+ // resetting to phase=worker on relaunch. The validator runs five checks
1399
+ // (see _validateOperatorRecoveryArtifacts); on full pass, _skipNextWorkerDispatch
1400
+ // is set as a one-shot flag consumed at the worker dispatch call site below.
1401
+ // On any failure the leader logs the reason and falls through to default
1402
+ // behavior.
1403
+ if (state.phase === 'verify' && state.iteration > 0) {
1404
+ const validation = await _validateOperatorRecoveryArtifacts({ paths, state });
1405
+ if (validation.ok) {
1406
+ console.error(
1407
+ `[recovery] Resuming verify phase — operator manual recovery detected (us=${state.current_us} iter=${state.iteration}): ${validation.reason}`,
1408
+ );
1409
+ state._skipNextWorkerDispatch = true;
1410
+ } else {
1411
+ console.error(
1412
+ `[recovery] phase=verify ignored, falling through to worker dispatch: ${validation.reason}`,
1413
+ );
1414
+ }
1415
+ }
1416
+
1291
1417
  // P1-E Lane Enforcement: snapshot lane mtimes before each iteration,
1292
1418
  // compare at the top of the next iteration. Drift on read-only artifacts
1293
1419
  // (PRD, test-spec, context) emits a lane_violation_warning event + audit
@@ -1572,18 +1698,36 @@ async function _runCampaignBody(slug, options, paths, rootDir) {
1572
1698
  }
1573
1699
  }
1574
1700
 
1575
- state.phase = 'worker';
1576
- await writeStatus(paths, state, options.onStatusChange, options.now);
1577
- await dispatchWorker({
1578
- iteration: state.iteration,
1579
- paths,
1580
- slug,
1581
- usList,
1582
- state,
1583
- sendKeys,
1584
- workerPaneId: state.worker_pane_id,
1585
- fixContractPath,
1586
- });
1701
+ // PR-A (Bug #10): one-shot guard. When the operator's `phase=verify`
1702
+ // recovery was honored at campaign entry, skip both the phase reset and
1703
+ // the worker dispatch — the operator already wrote a valid iter-signal.json
1704
+ // and done-claim.json, so pollForSignal below will pick them up immediately
1705
+ // and the loop continues into the verifier phase. The flag is cleared
1706
+ // after consumption so subsequent iterations dispatch the worker normally.
1707
+ if (state._skipNextWorkerDispatch) {
1708
+ state._skipNextWorkerDispatch = false;
1709
+ console.error(
1710
+ `[recovery] Skipping worker dispatch for iter=${state.iteration} (honoring operator manual recovery)`,
1711
+ );
1712
+ // Persist phase=verify so a subsequent crash-and-relaunch sees the same
1713
+ // contract. writeStatus is intentionally called BEFORE pollForSignal so
1714
+ // the on-disk state matches what we are about to do.
1715
+ state.phase = 'verify';
1716
+ await writeStatus(paths, state, options.onStatusChange, options.now);
1717
+ } else {
1718
+ state.phase = 'worker';
1719
+ await writeStatus(paths, state, options.onStatusChange, options.now);
1720
+ await dispatchWorker({
1721
+ iteration: state.iteration,
1722
+ paths,
1723
+ slug,
1724
+ usList,
1725
+ state,
1726
+ sendKeys,
1727
+ workerPaneId: state.worker_pane_id,
1728
+ fixContractPath,
1729
+ });
1730
+ }
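The one-shot guard above follows a common "consume the flag, then behave normally" pattern. A minimal sketch of that pattern, with status persistence and the actual worker dispatch left out, might look like the following; `planIteration` and its string return values are hypothetical names used only for illustration.

```javascript
// Hypothetical sketch of the one-shot skip flag. The real loop also calls
// writeStatus() and dispatchWorker(); both are omitted here.
function planIteration(state) {
  if (state._skipNextWorkerDispatch) {
    // Consume the flag so only the first iteration after relaunch is skipped.
    state._skipNextWorkerDispatch = false;
    // Keep phase=verify so the recorded state matches what happens next.
    state.phase = 'verify';
    return 'skip-worker';
  }
  state.phase = 'worker';
  return 'dispatch-worker';
}
```

Calling `planIteration` twice on a state that starts with `_skipNextWorkerDispatch: true` returns `'skip-worker'` once and `'dispatch-worker'` on the second call, which mirrors the "cleared after consumption" behavior described in the comment.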
1587
1731
 
1588
1732
  let signal;
1589
1733
  try {
@@ -285,6 +285,90 @@ _unlock_sentinel() {
285
285
  return 0
286
286
  }
287
287
 
288
+ # PR-A (Bug #10) — validate operator-written manual recovery artifacts.
289
+ # Returns 0 when all 5 checks pass; 1 otherwise. Sets RECOVERY_FAIL_REASON
290
+ # (global) on failure for caller logging. Mirrors the Node-side helper
291
+ # `_validateOperatorRecoveryArtifacts` in `src/node/runner/campaign-main-loop.mjs`.
292
+ #
293
+ # Args:
294
+ # $1 iter-signal.json path
295
+ # $2 done-claim.json path
296
+ # $3 status.json path
297
+ # $4 iter-NNN.worker-prompt.md path (may not exist for iter-1 fresh start)
298
+ _validate_operator_recovery_artifacts() {
299
+ local sig_file="$1" done_file="$2" status_file="$3" prompt_file="$4"
300
+ RECOVERY_FAIL_REASON=""
301
+
302
+ # Check 1: both artifacts exist + parse as JSON
303
+ if [[ ! -f "$sig_file" ]]; then
304
+ RECOVERY_FAIL_REASON="iter-signal.json missing"; return 1
305
+ fi
306
+ if [[ ! -f "$done_file" ]]; then
307
+ RECOVERY_FAIL_REASON="done-claim.json missing"; return 1
308
+ fi
309
+ if ! command -v jq >/dev/null 2>&1; then
310
+ RECOVERY_FAIL_REASON="jq unavailable; cannot validate"; return 1
311
+ fi
312
+ if ! jq -e . "$sig_file" >/dev/null 2>&1; then
313
+ RECOVERY_FAIL_REASON="iter-signal.json parse error"; return 1
314
+ fi
315
+ if ! jq -e . "$done_file" >/dev/null 2>&1; then
316
+ RECOVERY_FAIL_REASON="done-claim.json parse error"; return 1
317
+ fi
318
+ if [[ ! -f "$status_file" ]] || ! jq -e . "$status_file" >/dev/null 2>&1; then
319
+ RECOVERY_FAIL_REASON="status.json missing or invalid"; return 1
320
+ fi
321
+
322
+ # Check 2: us_id match in both artifacts
323
+ local current_us sig_us done_us
324
+ current_us=$(jq -r '.current_us // ""' "$status_file" 2>/dev/null)
325
+ sig_us=$(jq -r '.us_id // ""' "$sig_file" 2>/dev/null)
326
+ done_us=$(jq -r '.us_id // ""' "$done_file" 2>/dev/null)
327
+ if [[ "$sig_us" != "$current_us" ]]; then
328
+ RECOVERY_FAIL_REASON="iter-signal.us_id ($sig_us) != status.current_us ($current_us)"; return 1
329
+ fi
330
+ if [[ "$done_us" != "$current_us" ]]; then
331
+ RECOVERY_FAIL_REASON="done-claim.us_id ($done_us) != status.current_us ($current_us)"; return 1
332
+ fi
333
+
334
+ # Check 3: iteration match in both artifacts
335
+ local current_iter sig_iter done_iter
336
+ current_iter=$(jq -r '.iteration // 0' "$status_file" 2>/dev/null)
337
+ sig_iter=$(jq -r '.iteration // 0' "$sig_file" 2>/dev/null)
338
+ done_iter=$(jq -r '.iteration // 0' "$done_file" 2>/dev/null)
339
+ if [[ "$sig_iter" != "$current_iter" ]]; then
340
+ RECOVERY_FAIL_REASON="iter-signal.iteration ($sig_iter) != status.iteration ($current_iter)"; return 1
341
+ fi
342
+ if [[ "$done_iter" != "$current_iter" ]]; then
343
+ RECOVERY_FAIL_REASON="done-claim.iteration ($done_iter) != status.iteration ($current_iter)"; return 1
344
+ fi
345
+
346
+ # Check 4: iter_signal_quality must equal 'specific'
347
+ local sig_quality
348
+ sig_quality=$(jq -r '.iter_signal_quality // ""' "$sig_file" 2>/dev/null)
349
+ if [[ "$sig_quality" != "specific" ]]; then
350
+ RECOVERY_FAIL_REASON="iter-signal.iter_signal_quality ($sig_quality) != 'specific'"; return 1
351
+ fi
352
+
353
+ # Check 5: artifact mtimes must be strictly newer than worker-prompt mtime.
354
+ # Vacuously passes when the prompt file does not exist (fresh iter-1 start
355
+ # before any leader-written prompt).
356
+ if [[ -f "$prompt_file" ]]; then
357
+ local prompt_mtime sig_mtime done_mtime
358
+ prompt_mtime=$(stat -c %Y "$prompt_file" 2>/dev/null || stat -f %m "$prompt_file" 2>/dev/null || print 0)
359
+ sig_mtime=$(stat -c %Y "$sig_file" 2>/dev/null || stat -f %m "$sig_file" 2>/dev/null || print 0)
360
+ done_mtime=$(stat -c %Y "$done_file" 2>/dev/null || stat -f %m "$done_file" 2>/dev/null || print 0)
361
+ if (( sig_mtime <= prompt_mtime )); then
362
+ RECOVERY_FAIL_REASON="iter-signal.json mtime ($sig_mtime) not strictly newer than worker-prompt mtime ($prompt_mtime)"; return 1
363
+ fi
364
+ if (( done_mtime <= prompt_mtime )); then
365
+ RECOVERY_FAIL_REASON="done-claim.json mtime ($done_mtime) not strictly newer than worker-prompt mtime ($prompt_mtime)"; return 1
366
+ fi
367
+ fi
368
+
369
+ return 0
370
+ }
371
+
288
372
  # PR-0b-narrow (Plan v6) — stamp leader handshake ack onto the sentinel.
289
373
  # Mirror of src/node/shared/fs.mjs::stampAckField. Best-effort, audit-only:
290
374
  # any failure is silently swallowed. Sequence: