@ai-dev-methodologies/rlp-desk 0.15.0 → 0.15.2
- package/docs/plans/bug-report-overhaul-backlog.md +49 -0
- package/docs/plans/bug-report-overhaul-v0.md +238 -0
- package/docs/plans/bug-report-overhaul-v1.md +319 -0
- package/docs/plans/native-agent-revert.md +184 -0
- package/docs/plans/strategic-review/rlp-desk-strategic-review.md +125 -0
- package/docs/plans/v0.15-stabilization-plan.md +178 -0
- package/package.json +1 -1
- package/src/commands/rlp-desk.md +56 -46
- package/src/node/run.mjs +54 -7
- package/src/node/runner/campaign-main-loop.mjs +156 -12
- package/src/scripts/lib_ralph_desk.zsh +84 -0
- package/src/scripts/run_ralph_desk.zsh +76 -39
package/src/commands/rlp-desk.md
CHANGED
|
@@ -189,7 +189,7 @@ Tell the user:
 /rlp-desk run <actual-slug> --debug

 # Full options reference:
-# --mode
+# --mode native|tmux (default: native; legacy `agent` redirects to native)
 # --worker-model MODEL haiku|sonnet|opus or gpt-5.5:high|spark:high (default: haiku)
 # --lock-worker-model disable auto model upgrade
 # --verifier-model MODEL per-US verifier (default: sonnet)
@@ -217,14 +217,14 @@ Tell the user:
 # ★ Recommended: tmux mode + claude-only (real-time visibility):
 /rlp-desk run <actual-slug> --mode tmux --debug

-# Agent mode:
-/rlp-desk run <actual-slug> --debug
+# Native Agent() mode (slash leader, short / interactive campaigns):
+/rlp-desk run <actual-slug> --mode native --debug

 # Install codex for cost savings + cross-engine blind-spot coverage:
 npm install -g @openai/codex

 # Full options reference:
-# --mode
+# --mode native|tmux (default: native; legacy `agent` redirects to native)
 # --worker-model MODEL haiku|sonnet|opus (default: haiku)
 # --lock-worker-model disable auto model upgrade
 # --verifier-model MODEL per-US verifier (default: sonnet)
@@ -252,7 +252,7 @@ Tell the user:
 **YOU are the leader. Do NOT delegate leadership.**

 Options (parse from `$ARGUMENTS`):
-- `--mode
+- `--mode native|tmux` (default: `native`) — execution mode. `native` = slash command is the leader, calls `Agent(...)` (claude) and `Bash("codex exec ...")` (codex). `tmux` = slash command spawns the zsh runner via `node run.mjs --mode tmux`. Legacy `--mode agent` typed against the slash command emits a deprecation notice and redirects to `--mode native` (NOT to be confused with `node run.mjs --mode agent`, which is the deprecated Node-leader alpha — see "Direct Node CLI invocation" below).
 - `--worker-model MODEL` (default: `haiku`) — Worker model. Format: `model` = claude engine, `model:reasoning` = codex engine. Examples: `haiku`, `sonnet`, `opus`, `spark:high`, `gpt-5.5:high`. Parsed by `parse_model_flag()` which auto-splits engine/model/reasoning.
 - `--lock-worker-model` — disable automatic model upgrade on failure. Worker stays on the specified model regardless of consecutive failures.
 - `--verifier-model MODEL` (default: `sonnet`) — per-US verification model. Campaign-fixed (no progressive upgrade). Lighter than final verifier.
|
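The `--worker-model` format described in the hunk above (a bare `model` selects the claude engine, `model:reasoning` selects the codex engine) can be sketched as follows. This is a minimal illustration and not the package's `parse_model_flag()`, whose source is not part of this diff; the name `parseModelFlag`, the returned object shape, and the rejection of empty parts are assumptions.

```javascript
// Hypothetical JavaScript analogue of the engine/model/reasoning split
// described for --worker-model. Not the package's parse_model_flag().
function parseModelFlag(value) {
  const text = String(value).trim();
  if (text === '') {
    throw new Error('model flag must not be empty');
  }
  const colon = text.indexOf(':');
  if (colon === -1) {
    // A bare `model` selects the claude engine.
    return { engine: 'claude', model: text, reasoning: null };
  }
  const model = text.slice(0, colon);
  const reasoning = text.slice(colon + 1);
  if (model === '' || reasoning === '') {
    // Assumption: half-specified values such as "spark:" are rejected.
    throw new Error(`expected model:reasoning, got "${text}"`);
  }
  // `model:reasoning` selects the codex engine.
  return { engine: 'codex', model, reasoning };
}

console.log(parseModelFlag('haiku'));
console.log(parseModelFlag('gpt-5.5:high'));
```

Splitting on the first colon keeps model names that contain dots (such as `gpt-5.5`) intact.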
@@ -284,20 +284,26 @@ Cross-project aggregation: scan `~/.claude/ralph-desk/analytics/` and read each

 ### Mode Selection

-Parse the `--mode` flag.
+Parse the `--mode` flag. Slash command canonical labels:
+
+- `--mode native` (default): **Native Agent() path** below. The slash command IS the leader. It calls `Agent(description=…, model=<m>, mode="bypassPermissions", prompt=…)` for claude workers/verifiers and `Bash("codex exec --model <m> --reasoning-effort <r> <prompt>")` for codex workers/verifiers.
+- `--mode tmux`: **zsh runner path** below. The slash command shells out to `node ~/.claude/ralph-desk/node/run.mjs run --mode tmux …` which spawns `run_ralph_desk.zsh` as a subprocess.

-
-
->
->
->
->
->
->
+Legacy `--mode agent` typed against this slash command emits a deprecation notice and redirects to `--mode native`. **Do NOT confuse `/rlp-desk run --mode agent`** (slash command, redirects to Native Agent()) **with** `node run.mjs run --mode agent` (deprecated Node-leader alpha, direct CLI invocation, unrelated code path — see "Direct Node CLI invocation" below).
+
+> **Stability tiers:**
+> - `--mode tmux` is the **stable, production** path. The slash command spawns
+> the Node leader, which spawns `run_ralph_desk.zsh` — the zsh runner has the
+> full safety net (heartbeat, copy-mode guard, prompt-stall, no-progress
+> detection, claude model upgrade chain). Recommend this for autonomous
+> campaigns.
+> - `--mode native` is **for short / interactive campaigns**. Native Agent()
+> has no timeout API (platform constraint). Long-running autonomous campaigns
+> SHOULD use `--mode tmux`.

 #### Tmux Mode (`--mode tmux`)

-When `--mode tmux` is specified (v0.14.0+: `run.mjs` accepts the same flags as before but spawns `run_ralph_desk.zsh` as a subprocess and inherits stdio. Flywheel and self-verification flags are not honored under tmux mode — they require
+When `--mode tmux` is specified (v0.14.0+: `run.mjs` accepts the same flags as before but spawns `run_ralph_desk.zsh` as a subprocess and inherits stdio. Flywheel and self-verification flags are not honored under tmux mode — they currently require the deprecated Node-leader direct-CLI path `node run.mjs --mode agent` (see "Direct Node CLI invocation" below). Native Agent() port is a post-Node-leader-retirement task):

 1. **Validate scaffold** — same as Agent() mode: check `.rlp-desk/prompts/<slug>.worker.prompt.md` etc.
 2. **Check sentinels** — same as Agent() mode.
|
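The slash-command mode rule in the hunk above (`native` is the default, `tmux` is accepted, and the legacy `agent` label redirects to `native` with a deprecation notice) can be sketched as a small normalization step. This is an illustration under stated assumptions, not the package's implementation: the function name `normalizeSlashMode`, the return shape, the notice wording, and the rejection of unknown values are all hypothetical.

```javascript
// Sketch only: mirrors the documented slash-command rule, not the package's code.
// `normalizeSlashMode` and its return shape are hypothetical names.
function normalizeSlashMode(rawMode) {
  // No --mode flag given: the documented default is `native`.
  const requested = rawMode == null ? 'native' : String(rawMode);

  if (requested === 'native' || requested === 'tmux') {
    return { mode: requested, notice: null };
  }
  if (requested === 'agent') {
    // Legacy label: redirect to native and surface a deprecation notice.
    return {
      mode: 'native',
      notice: 'legacy --mode agent is deprecated; redirecting to --mode native',
    };
  }
  // Assumption: anything else is rejected; the diff does not say what happens.
  throw new Error(`unknown --mode value: ${requested}`);
}

console.log(normalizeSlashMode(undefined));
console.log(normalizeSlashMode('agent'));
```

Note that this covers only the slash command. The Node CLI changed in the same release treats `--mode native` as an error and `--mode agent` as the deprecated Node-leader path, so the two entry points should not share one normalizer.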
@@ -331,7 +337,7 @@ node ~/.claude/ralph-desk/node/run.mjs run '<slug>' \

 **Env-var translation (v5.7 §4.1)**: the slash command historically built `LANE_MODE=strict zsh ...` and `TEST_DENSITY_MODE=strict zsh ...` from CLI flags. The Node leader uses CLI flags instead — translate `--lane-strict` and `--test-density-strict` into the corresponding flags. Direct env-var users (running zsh directly) are unaffected.

-6. **If the Node leader exits with error** — report the error to the user and STOP. Do NOT attempt to work around it. Do NOT create tmux sessions yourself. Do NOT re-launch in a different way. Tell the user what went wrong and suggest `--mode
+6. **If the Node leader exits with error** — report the error to the user and STOP. Do NOT attempt to work around it. Do NOT create tmux sessions yourself. Do NOT re-launch in a different way. Tell the user what went wrong and suggest `--mode native` (slash command Native Agent() path) as alternative.
 7. **If successful** — tell the user the tmux session has been started. The Node leader takes over as the deterministic Leader. No Agent() calls are made in tmux mode.

 **IMPORTANT RULES:**
@@ -339,35 +345,36 @@ node ~/.claude/ralph-desk/node/run.mjs run '<slug>' \
 - MUST launch with `run_in_background: true` so `/rlp-desk` returns control immediately while preserving live tmux visibility.
 - Run-in-background is used so the shell can keep the command visible and keep the pane layout stable for status checks and completion flow.
 - Do NOT kill panes after completion. Panes stay alive for inspection. User cleans up with `/rlp-desk clean <slug> --kill-session`.
-- v0.14.0: `--with-self-verification`, `--flywheel`, and `--flywheel-guard` are **not honored** under `--mode tmux` — the zsh runner has no SV/flywheel implementation. The Node leader emits a stderr WARNING listing the dropped flags. For SV/flywheel, use
--
+- v0.14.0: `--with-self-verification`, `--flywheel`, and `--flywheel-guard` are **not honored** under `--mode tmux` — the zsh runner has no SV/flywheel implementation. The Node leader emits a stderr WARNING listing the dropped flags. For SV/flywheel today, use the deprecated Node-leader direct-CLI path `node run.mjs --mode agent` (see "Direct Node CLI invocation" below). The slash command's Native Agent() (`--mode native`) does not yet implement SV/flywheel — port is a post-Node-leader-retirement task.
+- **For `--mode tmux` only**: the slash command invokes `node ~/.claude/ralph-desk/node/run.mjs run --mode tmux ...`. Do NOT invoke `~/.claude/ralph-desk/run_ralph_desk.zsh` directly — the Node router resolves the runner path, runs legacy detection, and surfaces actionable errors when the runner is missing. **For `--mode native`**, the slash command does NOT invoke the Node CLI — it acts as the leader itself; see Native Agent() Mode section below.

 **tmux UX model (5 items):**
 - The session returns immediately after launch (`run_in_background: true`) so the command returns control to the parent CLI.
 - Worker/Verifier panes remain visible to the user during execution.
 - Users check progress with the **status command**: `/rlp-desk status <slug>`.
 - On completion, the command returns a completion notification before the loop ends.
-- Agent mode remains unchanged, and no tmux-specific behavior is mixed into
-
-#### Agent Mode (`--mode
-
-
-
-
-
-
-
-**
-
-mode="bypassPermissions"
-
-
-prompt-stall / no-progress timeouts in `run_ralph_desk.zsh` (v5.7 §4.13.b /
-
-`Agent()` has no timeout API,
-
-
-
+- Native Agent() mode remains unchanged, and no tmux-specific behavior is mixed into it.
+
+#### Native Agent() Mode (`--mode native` or default)
+
+The slash command IS the leader. Workers/Verifiers are spawned via `Agent(model=…, mode="bypassPermissions", prompt=…)` (claude) or `Bash("codex exec --model <m> --reasoning-effort <r> <prompt>")` (codex).
+
+### Native Agent() Safety Contract
+
+This contract MUST be observed in every iteration of the leader loop below. Future PRs deleting any of these guarantees break the slash command's behavior.
+
+1. **Turn-keepalive**: every status report uses `Bash("echo '...'")` to emit messages. NEVER output plain text without an accompanying tool call. Plain text = turn ends = loop stops. (Mitigation for commit `29fd29b` platform constraint, permanent.)
+2. **no `subagent_type` parameter**: the `Agent(...)` call form is exactly `Agent(description=…, model=<m>, mode="bypassPermissions", prompt=…)`. Do NOT pass `subagent_type`. (Mitigation for commit `920a31c`: `subagent_type="executor"` overrode `bypassPermissions` and surfaced a permission popup; permanent.)
+3. **`mode="bypassPermissions"` mandatory**: every claude `Agent()` worker/verifier dispatch must include `mode="bypassPermissions"`.
+4. **Long-running campaigns: prefer `--mode tmux`** for production. Native Agent() has no timeout API (platform constraint) — if `bypassPermissions` fails to suppress an interactive prompt at the SDK level, the call hangs indefinitely with no rlp-desk-side watchdog.
+
+**Why Native Agent() is structurally immune to Bug 4/5 (mid-execution prompt hang & A4 premature dispatch)**: Worker/Verifier run non-interactively under the platform's bypass — they have no tmux pane, no TUI surface, and cannot surface a `[y/N]` prompt to the parent Leader. The auto-dismiss / prompt-stall / no-progress timeouts in `run_ralph_desk.zsh` (v5.7 §4.13.b / §4.16 / §4.17) are therefore tmux-only by design.
+
+**Tradeoff**: because `Agent()` has no timeout API, Native Agent() iterations are not bounded — if the platform's `bypassPermissions` ever fails to suppress an interactive prompt at the SDK level, the call hangs indefinitely with no rlp-desk-side watchdog. Use `--mode tmux` if you need bounded execution time.
+
+#### Direct Node CLI invocation (`node run.mjs run <slug> --mode agent` — deprecated alpha)
+
+Direct invocation of `node ~/.claude/ralph-desk/node/run.mjs run <slug> --mode agent` is **the deprecated Node-leader alpha path**. This is unrelated to the slash command's Native Agent() path above — different code, different leader, different lifecycle. The Node leader currently retains SV/flywheel implementations not yet ported to Native Agent(). The Node CLI emits a deprecation banner on this mode and is scheduled for hard-error in the next major release. For production tmux orchestration, use `--mode tmux`. For Claude Code Native Agent() campaigns, use `/rlp-desk run <slug> --mode native` from a Claude Code session.

 ### Preparation
 1. Validate scaffold: `.rlp-desk/prompts/<slug>.worker.prompt.md` etc.
|
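Items 2 and 3 of the Native Agent() Safety Contract in the hunk above constrain the shape of every `Agent(...)` dispatch. One way to picture those constraints is a small check over the call's arguments, sketched below. `Agent()` is a Claude Code tool call, not a JavaScript function, so representing its arguments as a plain object is an assumption made for illustration; `findContractViolations` is a hypothetical helper, and treating `description`, `model`, and `prompt` as required non-empty strings is inferred from the documented call form rather than stated outright.

```javascript
// Hypothetical helper: checks a plain-object stand-in for Agent(...) arguments
// against the documented call form. Not part of the package.
function findContractViolations(callArgs) {
  const violations = [];

  // Contract item 2: subagent_type must not be passed at all.
  if (Object.prototype.hasOwnProperty.call(callArgs, 'subagent_type')) {
    violations.push('subagent_type must not be passed');
  }

  // Contract item 3: mode="bypassPermissions" is mandatory.
  if (callArgs.mode !== 'bypassPermissions') {
    violations.push('mode must be "bypassPermissions"');
  }

  // Assumption: the remaining documented arguments are non-empty strings.
  for (const key of ['description', 'model', 'prompt']) {
    if (typeof callArgs[key] !== 'string' || callArgs[key].length === 0) {
      violations.push(`${key} must be a non-empty string`);
    }
  }
  return violations;
}

console.log(findContractViolations({
  description: 'worker for one user story',
  model: 'haiku',
  mode: 'bypassPermissions',
  prompt: 'implement the story',
}));
console.log(findContractViolations({
  description: 'worker',
  model: 'haiku',
  prompt: 'implement the story',
  subagent_type: 'executor',
}));
```

The check tests for the presence of the `subagent_type` key rather than its value, because the contract forbids passing the parameter in any form.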
@@ -775,13 +782,13 @@ Example:
 ```
 /rlp-desk brainstorm <description> Plan before init (interactive)
 /rlp-desk init <slug> [objective] Create project scaffold
-/rlp-desk run <slug> [options] Run loop (
+/rlp-desk run <slug> [options] Run loop (native=Native Agent() leader (slash), tmux=zsh leader (production); legacy `agent` redirects to `native` — direct Node CLI `--mode agent` is deprecated alpha)
 /rlp-desk status <slug> Show loop status
 /rlp-desk logs <slug> [N] Show iteration log
 /rlp-desk clean <slug> [--kill-session] Reset for re-run (--kill-session kills tmux)

 Run options:
---mode
+--mode native|tmux Execution mode (default: native)
 --worker-model MODEL Worker model: haiku|sonnet|opus or gpt-5.5:high|spark:high (default: haiku)
 --lock-worker-model Disable auto model upgrade on failure
 --verifier-model MODEL per-US verifier (default: sonnet)
@@ -799,15 +806,18 @@ Run options:

 ## Architecture

-### Agent Mode (default: `--mode
+### Native Agent() Mode (default: `--mode native`)
 ```
-[This session = LEADER (LLM)]
+[This session = LEADER (LLM, slash command itself)]
 │
-Agent()├──▶ [Worker:
+Agent()├──▶ [Worker: claude subagent (fresh context, mode="bypassPermissions")]
 │ └── reads desk files, implements, updates memory
 │
-Agent()└──▶ [Verifier:
-
+Agent()└──▶ [Verifier: claude subagent (fresh context, mode="bypassPermissions")]
+│ └── reads done-claim, runs checks, writes verdict
+│
+Bash() ───▶ [Worker/Verifier: codex CLI subprocess]
+└── `codex exec --model <m> --reasoning-effort <r> <prompt>`
 ```

 ### Tmux Mode (`--mode tmux`)
package/src/node/run.mjs
CHANGED
|
@@ -48,14 +48,14 @@ function buildHelpText() {
     'Commands:',
     ' brainstorm <description> Plan before init (not implemented in the Node rewrite yet)',
     ' init <slug> [objective] Create project scaffold',
-    ' run <slug> [options] Run loop (agent=
+    ' run <slug> [options] Run loop (tmux=zsh leader [production], agent=Node leader [deprecated alpha], native=slash-only error)',
     ' status <slug> Show loop status',
     ' logs <slug> [N] Show iteration log (not implemented in the Node rewrite yet)',
     ' clean <slug> [--kill-session] Reset for re-run (not implemented in the Node rewrite yet)',
     ' resume <slug> Resume loop (not implemented in the Node rewrite yet)',
     '',
     'Run Options:',
-    ' --mode agent|tmux',
+    ' --mode tmux|agent|native (CLI: tmux=production, agent=deprecated, native=errors with redirect to slash command)',
     ' --worker-model MODEL',
     ' --lock-worker-model',
     ' --verifier-model MODEL',
@@ -358,10 +358,32 @@ async function runRunCommand(args, deps) {
     return runTmuxViaZsh(slug, options, deps);
   }

-  //
-  //
-  //
-  //
+  // P1.b (native-agent-revert plan v7): --mode native is slash-command-only.
+  // The Node CLI does not implement Native Agent() — that path lives in
+  // src/commands/rlp-desk.md and runs in a Claude Code session. Surface a
+  // hard error here so direct CLI invocation does not silently fall through
+  // to the deprecated Node-leader path.
+  if (options.mode === 'native') {
+    write(
+      deps.stderr,
+      'ERROR: --mode native is slash-command-only. The Node CLI does not implement it.',
+    );
+    write(
+      deps.stderr,
+      'Use `/rlp-desk run <slug> --mode native` from a Claude Code session,',
+    );
+    write(
+      deps.stderr,
+      'or use `--mode tmux` (production) / `--mode agent` (deprecated alpha) for direct CLI invocation.',
+    );
+    return 2;
+  }
+
+  // P1.b: --mode agent (Node-leader alpha) is deprecated. The slash command's
+  // Native Agent() path (`/rlp-desk run --mode native`) is unrelated — different
+  // code, different leader. We keep the Node-leader behavior unchanged for
+  // backward compatibility but surface a strong deprecation banner so wrappers
+  // can migrate before the next major release hard-errors this mode.
   if (
     options.mode === 'agent'
     && !process.env.RLP_DESK_QUIET_WARNINGS
@@ -369,7 +391,32 @@ async function runRunCommand(args, deps) {
   ) {
     write(
       deps.stderr,
-      'WARNING: --mode agent
+      'WARNING: --mode agent (Node-leader alpha) is deprecated.',
+    );
+    write(
+      deps.stderr,
+      'This is the direct Node-CLI alpha path — UNRELATED to the slash command Native Agent() path (`/rlp-desk run --mode native`).',
+    );
+    write(
+      deps.stderr,
+      'For production tmux orchestration, use `--mode tmux`.',
+    );
+    write(
+      deps.stderr,
+      'For Claude Code Native Agent() campaigns, use `/rlp-desk run --mode native` from a Claude Code session.',
+    );
+    // 2026-05-07 (v0.15.2): rlp-desk is in active stabilization. Goal: reach
+    // omc /team/ralph/ralplan level of reliability while preserving
+    // rlp-desk's self-driving advantages (multi-engine consensus, multi-mission
+    // queue, BLOCK_TAGS taxonomy, structured SV reports). omc is the BENCHMARK,
+    // not a replacement. See docs/plans/v0.15-stabilization-plan.md.
+    write(
+      deps.stderr,
+      'SCHEDULED REMOVAL: --mode agent (Node CLI alpha) will be removed in a future major release. Date TBD until stabilization milestones complete.',
+    );
+    write(
+      deps.stderr,
+      'STABILIZATION IN PROGRESS: rlp-desk is hardening against the 10-bug regression pattern observed 2026-05-01..05-07. See docs/plans/v0.15-stabilization-plan.md.',
     );
   }

|
package/src/node/runner/campaign-main-loop.mjs
CHANGED
|
@@ -388,6 +388,110 @@ async function readCurrentState(paths, slug, options) {
   };
 }

+// PR-A (Bug #10): validate operator-written recovery artifacts. When the
+// operator hand-rolls a `phase=verify` recovery (jq-patches status.json,
+// writes iter-signal.json + done-claim.json by hand, deletes the blocked
+// sentinel), the leader must NOT silently overwrite that work on relaunch.
+// All five checks must pass for the leader to honor the recovery.
+//
+// Returns { ok: boolean, reason: string }. On any failure the caller falls
+// through to the default behavior (worker dispatch) — defensive by design.
+async function _validateOperatorRecoveryArtifacts({ paths, state }) {
+  // 1. iter-signal.json + done-claim.json must both exist and parse.
+  let signal;
+  let doneClaim;
+  try {
+    signal = await readJsonIfExists(paths.signalFile);
+  } catch (err) {
+    return { ok: false, reason: `iter-signal.json parse error: ${err?.message ?? err}` };
+  }
+  if (!signal) return { ok: false, reason: 'iter-signal.json missing' };
+
+  try {
+    doneClaim = await readJsonIfExists(paths.doneClaimFile);
+  } catch (err) {
+    return { ok: false, reason: `done-claim.json parse error: ${err?.message ?? err}` };
+  }
+  if (!doneClaim) return { ok: false, reason: 'done-claim.json missing' };
+
+  // 2. us_id must match status.current_us in BOTH artifacts.
+  if (signal.us_id !== state.current_us) {
+    return {
+      ok: false,
+      reason: `iter-signal.us_id (${signal.us_id}) != status.current_us (${state.current_us})`,
+    };
+  }
+  if (doneClaim.us_id !== state.current_us) {
+    return {
+      ok: false,
+      reason: `done-claim.us_id (${doneClaim.us_id}) != status.current_us (${state.current_us})`,
+    };
+  }
+
+  // 3. iteration must match status.iteration in BOTH artifacts.
+  if (signal.iteration !== state.iteration) {
+    return {
+      ok: false,
+      reason: `iter-signal.iteration (${signal.iteration}) != status.iteration (${state.iteration})`,
+    };
+  }
+  if (doneClaim.iteration !== state.iteration) {
+    return {
+      ok: false,
+      reason: `done-claim.iteration (${doneClaim.iteration}) != status.iteration (${state.iteration})`,
+    };
+  }
+
+  // 4. iter_signal_quality must be 'specific' (not generic / vague).
+  if (signal.iter_signal_quality !== 'specific') {
+    return {
+      ok: false,
+      reason: `iter-signal.iter_signal_quality (${signal.iter_signal_quality}) != 'specific'`,
+    };
+  }
+
+  // 5. Both artifact mtimes must be NEWER than the most recent
+  // iter-NNN.worker-prompt.md mtime — guards against operator running
+  // `phase=verify` against stale artifacts from a much earlier iteration.
+  const promptFile = path.join(
+    paths.campaignLogDir,
+    `iter-${String(state.iteration).padStart(3, '0')}.worker-prompt.md`,
+  );
+  let promptMtime = 0;
+  try {
+    const promptStat = await fs.stat(promptFile);
+    promptMtime = promptStat.mtimeMs;
+  } catch {
+    // No worker-prompt.md for this iteration → check vacuously passes
+    // (operator is recovering from a state that never even dispatched yet).
+    promptMtime = 0;
+  }
+  if (promptMtime > 0) {
+    let signalMtime = 0;
+    let doneClaimMtime = 0;
+    try {
+      signalMtime = (await fs.stat(paths.signalFile)).mtimeMs;
+      doneClaimMtime = (await fs.stat(paths.doneClaimFile)).mtimeMs;
+    } catch (err) {
+      return { ok: false, reason: `mtime stat failed: ${err?.message ?? err}` };
+    }
+    if (signalMtime <= promptMtime) {
+      return {
+        ok: false,
+        reason: `iter-signal.json mtime (${signalMtime}) is not strictly newer than worker-prompt mtime (${promptMtime})`,
+      };
+    }
+    if (doneClaimMtime <= promptMtime) {
+      return {
+        ok: false,
+        reason: `done-claim.json mtime (${doneClaimMtime}) is not strictly newer than worker-prompt mtime (${promptMtime})`,
+      };
+    }
+  }
+
+  return { ok: true, reason: 'all five checks passed' };
+}
+
 async function appendIterationAnalytics(paths, state, usId, verdict, options) {
   await appendCampaignAnalytics(paths.analyticsFile, {
     iter: state.iteration,
|
@@ -1288,6 +1392,28 @@ async function _runCampaignBody(slug, options, paths, rootDir) {

   let fixContractPath = null;

+  // PR-A (Bug #10): operator-recovery hygiene. If the operator hand-rolled a
+  // `phase=verify` recovery (jq-patches status.json, writes manual artifacts,
+  // deletes the blocked sentinel), the leader MUST honor that work instead of
+  // resetting to phase=worker on relaunch. The validator runs five checks
+  // (see _validateOperatorRecoveryArtifacts); on full pass, _skipNextWorkerDispatch
+  // is set as a one-shot flag consumed at the worker dispatch call site below.
+  // On any failure the leader logs the reason and falls through to default
+  // behavior.
+  if (state.phase === 'verify' && state.iteration > 0) {
+    const validation = await _validateOperatorRecoveryArtifacts({ paths, state });
+    if (validation.ok) {
+      console.error(
+        `[recovery] Resuming verify phase — operator manual recovery detected (us=${state.current_us} iter=${state.iteration}): ${validation.reason}`,
+      );
+      state._skipNextWorkerDispatch = true;
+    } else {
+      console.error(
+        `[recovery] phase=verify ignored, falling through to worker dispatch: ${validation.reason}`,
+      );
+    }
+  }
+
   // P1-E Lane Enforcement: snapshot lane mtimes before each iteration,
   // compare at the top of the next iteration. Drift on read-only artifacts
   // (PRD, test-spec, context) emits a lane_violation_warning event + audit
@@ -1572,18 +1698,36 @@ async function _runCampaignBody(slug, options, paths, rootDir) {
       }
     }

-
-
-
-
-
-
-
-    state
-
-
-
-
+    // PR-A (Bug #10): one-shot guard. When the operator's `phase=verify`
+    // recovery was honored at campaign entry, skip both the phase reset and
+    // the worker dispatch — the operator already wrote a valid iter-signal.json
+    // and done-claim.json, so pollForSignal below will pick them up immediately
+    // and the loop continues into the verifier phase. The flag is cleared
+    // after consumption so subsequent iterations dispatch the worker normally.
+    if (state._skipNextWorkerDispatch) {
+      state._skipNextWorkerDispatch = false;
+      console.error(
+        `[recovery] Skipping worker dispatch for iter=${state.iteration} (honoring operator manual recovery)`,
+      );
+      // Persist phase=verify so a subsequent crash-and-relaunch sees the same
+      // contract. writeStatus is intentionally called BEFORE pollForSignal so
+      // the on-disk state matches what we are about to do.
+      state.phase = 'verify';
+      await writeStatus(paths, state, options.onStatusChange, options.now);
+    } else {
+      state.phase = 'worker';
+      await writeStatus(paths, state, options.onStatusChange, options.now);
+      await dispatchWorker({
+        iteration: state.iteration,
+        paths,
+        slug,
+        usList,
+        state,
+        sendKeys,
+        workerPaneId: state.worker_pane_id,
+        fixContractPath,
+      });
+    }

     let signal;
     try {
|
package/src/scripts/lib_ralph_desk.zsh
CHANGED
|
@@ -285,6 +285,90 @@ _unlock_sentinel() {
   return 0
 }

+# PR-A (Bug #10) — validate operator-written manual recovery artifacts.
+# Returns 0 when all 5 checks pass; 1 otherwise. Sets RECOVERY_FAIL_REASON
+# (global) on failure for caller logging. Mirrors the Node-side helper
+# `_validateOperatorRecoveryArtifacts` in `src/node/runner/campaign-main-loop.mjs`.
+#
+# Args:
+#   $1 iter-signal.json path
+#   $2 done-claim.json path
+#   $3 status.json path
+#   $4 iter-NNN.worker-prompt.md path (may not exist for iter-1 fresh start)
+_validate_operator_recovery_artifacts() {
+  local sig_file="$1" done_file="$2" status_file="$3" prompt_file="$4"
+  RECOVERY_FAIL_REASON=""
+
+  # Check 1: both artifacts exist + parse as JSON
+  if [[ ! -f "$sig_file" ]]; then
+    RECOVERY_FAIL_REASON="iter-signal.json missing"; return 1
+  fi
+  if [[ ! -f "$done_file" ]]; then
+    RECOVERY_FAIL_REASON="done-claim.json missing"; return 1
+  fi
+  if ! command -v jq >/dev/null 2>&1; then
+    RECOVERY_FAIL_REASON="jq unavailable; cannot validate"; return 1
+  fi
+  if ! jq -e . "$sig_file" >/dev/null 2>&1; then
+    RECOVERY_FAIL_REASON="iter-signal.json parse error"; return 1
+  fi
+  if ! jq -e . "$done_file" >/dev/null 2>&1; then
+    RECOVERY_FAIL_REASON="done-claim.json parse error"; return 1
+  fi
+  if [[ ! -f "$status_file" ]] || ! jq -e . "$status_file" >/dev/null 2>&1; then
+    RECOVERY_FAIL_REASON="status.json missing or invalid"; return 1
+  fi
+
+  # Check 2: us_id match in both artifacts
+  local current_us sig_us done_us
+  current_us=$(jq -r '.current_us // ""' "$status_file" 2>/dev/null)
+  sig_us=$(jq -r '.us_id // ""' "$sig_file" 2>/dev/null)
+  done_us=$(jq -r '.us_id // ""' "$done_file" 2>/dev/null)
+  if [[ "$sig_us" != "$current_us" ]]; then
+    RECOVERY_FAIL_REASON="iter-signal.us_id ($sig_us) != status.current_us ($current_us)"; return 1
+  fi
+  if [[ "$done_us" != "$current_us" ]]; then
+    RECOVERY_FAIL_REASON="done-claim.us_id ($done_us) != status.current_us ($current_us)"; return 1
+  fi
+
+  # Check 3: iteration match in both artifacts
+  local current_iter sig_iter done_iter
+  current_iter=$(jq -r '.iteration // 0' "$status_file" 2>/dev/null)
+  sig_iter=$(jq -r '.iteration // 0' "$sig_file" 2>/dev/null)
+  done_iter=$(jq -r '.iteration // 0' "$done_file" 2>/dev/null)
+  if [[ "$sig_iter" != "$current_iter" ]]; then
+    RECOVERY_FAIL_REASON="iter-signal.iteration ($sig_iter) != status.iteration ($current_iter)"; return 1
+  fi
+  if [[ "$done_iter" != "$current_iter" ]]; then
+    RECOVERY_FAIL_REASON="done-claim.iteration ($done_iter) != status.iteration ($current_iter)"; return 1
+  fi
+
+  # Check 4: iter_signal_quality must equal 'specific'
+  local sig_quality
+  sig_quality=$(jq -r '.iter_signal_quality // ""' "$sig_file" 2>/dev/null)
+  if [[ "$sig_quality" != "specific" ]]; then
+    RECOVERY_FAIL_REASON="iter-signal.iter_signal_quality ($sig_quality) != 'specific'"; return 1
+  fi
+
+  # Check 5: artifact mtimes must be strictly newer than worker-prompt mtime.
+  # Vacuously passes when the prompt file does not exist (fresh iter-1 start
+  # before any leader-written prompt).
+  if [[ -f "$prompt_file" ]]; then
+    local prompt_mtime sig_mtime done_mtime
+    prompt_mtime=$(stat -f %m "$prompt_file" 2>/dev/null || stat -c %Y "$prompt_file" 2>/dev/null || print 0)
+    sig_mtime=$(stat -f %m "$sig_file" 2>/dev/null || stat -c %Y "$sig_file" 2>/dev/null || print 0)
+    done_mtime=$(stat -f %m "$done_file" 2>/dev/null || stat -c %Y "$done_file" 2>/dev/null || print 0)
+    if (( sig_mtime <= prompt_mtime )); then
+      RECOVERY_FAIL_REASON="iter-signal.json mtime ($sig_mtime) not strictly newer than worker-prompt mtime ($prompt_mtime)"; return 1
+    fi
+    if (( done_mtime <= prompt_mtime )); then
+      RECOVERY_FAIL_REASON="done-claim.json mtime ($done_mtime) not strictly newer than worker-prompt mtime ($prompt_mtime)"; return 1
+    fi
+  fi
+
+  return 0
+}
+
 # PR-0b-narrow (Plan v6) — stamp leader handshake ack onto the sentinel.
 # Mirror of src/node/shared/fs.mjs::stampAckField. Best-effort, audit-only:
 # any failure is silently swallowed. Sequence: