npm - @ai-dev-methodologies/rlp-desk - Versions diffs - 0.15.3 → 0.15.4 - Mend

@ai-dev-methodologies/rlp-desk 0.15.3 → 0.15.4

Files changed (20)

package/CHANGELOG.md +82 -0
package/README.md +34 -4
package/docs/rlp-desk/failure-modes.md +191 -0
package/package.json +3 -2
package/src/node/runner/campaign-main-loop.mjs +84 -11
package/src/node/util/debug-log.mjs +10 -6
package/src/node/util/lifecycle-metrics.mjs +102 -0
package/src/scripts/lib_ralph_desk.zsh +66 -0
package/src/scripts/run_ralph_desk.zsh +18 -0
package/docs/plans/bug-report-overhaul-backlog.md +0 -49
package/docs/plans/bug-report-overhaul-v0.md +0 -238
package/docs/plans/bug-report-overhaul-v1.md +0 -319
package/docs/plans/native-agent-revert.md +0 -184
package/docs/plans/polished-gliding-toucan.md +0 -234
package/docs/plans/pr-e-phase-c1-blocked-recovery-hygiene-v0.md +0 -233
package/docs/plans/spicy-booping-galaxy.md +0 -717
package/docs/plans/strategic-review/rlp-desk-strategic-review.md +0 -125
package/docs/plans/v0.15-stabilization-phase-a-prep.md +0 -130
package/docs/plans/v0.15-stabilization-plan.md +0 -178
package/docs/plans/v0.16-real-llm-sv-gate-spec.md +0 -177

package/docs/plans/bug-report-overhaul-v1.md DELETED

@@ -1,319 +0,0 @@
-# Bug Report Mechanism Overhaul — v1 (Architect-revised)
-> **Status**: Planner v1 awaiting Codex Critic.
-> **Mode**: deliberate.
-> **Stop rule**: iterate until codex critic returns 0 P0 + 0 P1. P2 → backlog.
-> **Critic instruction**: *approve unless P0 or P1 found.*
-> **Changes from v0**: Architect ITERATE feedback applied — split into 3 sequenced PRs, exact file:line for Bug #10, `pattern_match` seed → P2 backlog, explicit `native-agent-revert` dependency, governance §1g rationale.
----
-## 1. Problem statement (unchanged from v0)
-10 hand-written 200-line bug reports (`Bug #1`–`Bug #10`, BOS dev `2026-05-01..05-07`) point at one root frustration: **bugs are endless and each one costs 30+ min of operator time to package** before the rlp-desk side can even start triage. Two distinct cost lines:
-| Pain | Evidence | Cost line |
-|---|---|---|
-| **Recovery friction** (lose work on relaunch) | Bug #10 100%-reproducible: leader resets `phase=worker`, deletes operator-written iter-signal/done-claim | per BLOCKED, lost 30+ min of manual work |
-| **Capture friction** (hand-write report) | All 10 reports re-collect env, version, command, pane logs, settings, gitignore — already on disk | per BLOCKED, 30+ min hand-writing |
-| Cluster blindness | Bug #6/#7/#8 are all "worker hang variants"; cluster re-discovered each time | amortized over many BLOCKED |
-| Reactive only | Bugs surface only after 30-min poll timeout | per BLOCKED, 30 min wall-clock |
-The blocked-sentinel JSON (`schema_version: 2.0`) already classifies (`reason_category` / `recoverable` / `suggested_action`) but stops at the campaign boundary — it does not become a *bug report*. That gap is the target.
----
-## 2. Principles (5)
-1. **Capture-by-default, not by-request.** When the campaign blocks, the operator should not have to gather anything that already exists on disk.
-2. **One canonical schema, two consumers.** A single `bug-report.json` feeds both BOS-side templates and rlp-desk-side triage; no divergent representations.
-3. **Surgical diffs over new infra.** Extend the existing `blocked.{md,json}` writer + `/rlp-desk` subcommand surface; do not introduce a new daemon, queue, or service. **Sequenced PRs > one big PR.**
-4. **Recovery must be idempotent.** Manual recovery of a BLOCKED campaign must not be silently overwritten on relaunch (Bug #10 contract).
-5. **Earlier is cheaper.** A heartbeat-anomaly *warning* costs nothing; a 30-min BLOCKED poll-timeout is the most expensive form of feedback.
----
-## 3. Decision drivers (top 3)
-| # | Driver | Why it dominates |
-|---|---|---|
-| D1 | **Operator minutes per BLOCKED → first actionable report** | Today: 30+ min recovery + 30+ min hand-writing = 60+ min. Target: ≤2 min recovery (just relaunch) + ≤2 min report review. |
-| D2 | **Cluster recognition (avoid duplicate `Bug #N` for same root cause)** | 5 of 10 reports cluster around "worker hang on sentinel" or "verifier post-sentinel race". Without similarity hinting we keep paying triage cost N times. |
-| D3 | **Zero regression on `--mode tmux` 19th launch** + **zero merge collision with `feat/native-agent-revert`** | Per `docs/plans/native-agent-revert.md:7`, the production tmux path is mid-flight. PR-A (Bug #10 hygiene) and PR-B (bundler) BOTH must wait for `native-agent-revert` to land. Documented as hard dependency below. |
-**Architect-flagged ranking shift**: the per-BLOCKED cost of *recovery loss* (Bug #10) > per-BLOCKED cost of *hand-writing*. PR-A (Bug #10) lands first because it has higher per-event leverage AND smaller surface.
----
-## 4. Viable options
-### Option A — **Sequenced bundle**: PR-A relaunch hygiene, PR-B bundler, PR-C governance/patterns *(recommended)*
-Three sequenced PRs landed in order. Each has a single, narrow purpose; review surface bounded; dependencies explicit.
-**Pros**: Surgical-diff principle satisfied; merge-conflict risk with `native-agent-revert` minimized; each PR reverts cleanly; per-PR self-verification scenarios stay scoped.
-**Cons**: more wall-clock to land the full vision (3 land-cycles instead of 1). Acceptable: PR-A alone moves D1 by ~50% on its own.
-### Option B — **One mega-PR** *(rejected per Architect)*
-The v0 plan, kept here for the record. **Invalidated** because (a) merge collision with `native-agent-revert` is near-certain on `campaign-main-loop.mjs` + `governance.md` + `commands/rlp-desk.md` (3 of the 5 modified files overlap), (b) self-verification scope balloons (one PR triggers MEDIUM+CRITICAL on too many subsystems), (c) Bug #10 fix is delayed by bundler's review surface.
-### Option C — **Heartbeat-first (early warning)** *(deferred to backlog)*
-Same as v0 — orthogonal to the report-quality problem; defer to backlog.
-### Option D — **External tracker integration** *(rejected, principle 3 violation)*
-Same as v0.
-### Option E — **Doc template only** *(rejected, does not move D1/D2)*
-Same as v0.
-### Why A wins
-A directly addresses D1 (PR-A: recovery, PR-B: capture), D2 (PR-C: pattern_match operationalized), and D3 (PRs sequenced AFTER `native-agent-revert`). C is complementary; B/D/E fail principles or drivers.
----
-## 5. Scope
-### Hard dependency (all PRs)
-**`feat/native-agent-revert` (P0+P1) MUST merge to `main` before PR-A starts.** Source: `docs/plans/native-agent-revert.md:7`. Documented in §3 D3. No PR-A/B/C work begins on shared files until that merge is verified by `git log main --oneline | head -3` showing the native-revert P0+P1 commits.
-### PR-A — **Bug #10 relaunch hygiene** (lands first; ~1 file modified, ~1 test added; ~50-line surgical diff)
-**P0**:
-1. **Inject phase=verify honor branch** at `src/node/runner/campaign-main-loop.mjs`. Two surgical edits:
-   - Verified ground-truth (read 2026-05-07): `readCurrentState` (`:364-389`) at `:371` already preserves `status.phase`. The bug is that the main loop unconditionally writes `state.phase = 'worker'` at `:1575` (every iteration, top of body) before `dispatchWorker`. Architect-flagged misattribution in v0 corrected.
-   - Edit 1: after `readCurrentState` (`:1256`), BEFORE the main `while` loop, add a single-iteration "phase=verify recovery branch":
-     ```
-     if (state.phase === 'verify' && state.iteration > 0) {
-       const result = await _validateOperatorRecoveryArtifacts(paths, state);  // 5 checks (see §7 S2); returns { ok, reason }
-       if (result.ok) {
-         _logRecovery('Resuming verify phase — operator manual recovery detected');
-         state.phase = 'verify';                  // explicit re-affirm
-         state._skipNextWorkerDispatch = true;    // consumed at :1575
-       } else {
-         _logRecovery(`phase=verify ignored: ${result.reason}`);
-         // fall through; default behavior (worker dispatch) remains
-       }
-     }
-     ```
-   - Edit 2: at `:1575`, guard the unconditional reset:
-     ```
-     if (!state._skipNextWorkerDispatch) {
-       state.phase = 'worker';
-       await writeStatus(...);
-       await dispatchWorker(...);
-     } else {
-       state._skipNextWorkerDispatch = false;     // one-shot
-     }
-     ```
-   - `_validateOperatorRecoveryArtifacts` is a new internal helper in the same file (~25 lines): exists+parses `iter-signal.json` AND `done-claim.json`; `us_id == state.current_us`; `iteration == state.iteration`; `iter_signal_quality == 'specific'`; both files newer than the most recent `iter-NNN.worker-prompt.md` mtime.
-2. **Mirror the same guard in `src/scripts/run_ralph_desk.zsh`** for `--mode tmux`. **Verified injection points (read 2026-05-07, P1-1 corrected from v1)**: the iteration-body reset is **not** in any `start_iteration()` function — it lives at the top of the iteration body. Three concrete sites:
-   - `:3053` — `rm -f "$SIGNAL_FILE" "$DONE_CLAIM_FILE" "$VERDICT_FILE"` **deletes operator-written recovery artifacts**. PR-A guard MUST wrap this with: skip the rm when `LAST_PHASE == "verify"` AND validator passes.
-   - `:3084` — `update_status "worker" "running"` forces phase=worker. PR-A guard: skip when phase=verify+valid.
-   - `:3087` (and following dispatch block down to ~`:3110`) — worker launch. PR-A guard: when phase=verify+valid, jump straight to verifier dispatch (mirrors the per-US verifier dispatch already in the file; reuse `dispatch_verifier_per_us`).
-   - 5-check validator added to `lib_ralph_desk.zsh` as `_validate_operator_recovery_artifacts` (LAST_PHASE read from earlier `read_status` call site, audited at impl time).
-**Tests**:
-- `tests/node/test-relaunch-phase-verify-hygiene.test.mjs` (NEW) — 5 ACs (R1–R5 in §8).
-- `tests/test-bug10-zsh-relaunch-hygiene.sh` (NEW) — zsh side, mirrors `test-bug7-post-sentinel-race.sh` style.
-**No governance change in PR-A.** No new sentinel writer. Just the guard + helper + tests.
-### PR-B — **Bug-report bundler** (lands second; ~3 modified + 4 new; bundler module + subcommand)
-**P0**:
-3. **`bug-report.json` writer** — new `src/node/shared/bug-report.mjs` exposes `writeBugReport({slug, classification, reason, paths, env, paneTails, recentArtifacts, now})`. Called from `_emitBlockedSentinel` in `campaign-main-loop.mjs:920-968` AFTER the existing JSON sidecar write succeeds. Idempotent via `writeSentinelExclusive` semantics (per-block `<iso>` filename).
-4. **Mirror in zsh**: `_write_bug_report` in `lib_ralph_desk.zsh`, called from `write_blocked_sentinel`. Exact call-site count audited at implementation time (current grep shows ~10 invocations, all in one taxonomy).
-5. **Redaction pass** — deny-list applied before write (12 secret-shape regex from §8 AC-W2); `meta.redacted_line_count` exposed for operator audit. Markdown render reads from JSON post-redaction.
-6. **`pattern_match` field reserved-but-empty** — schema includes it; bundler writes `{ candidate_bug_ids: [], score: null, source: "deferred-to-PR-C" }`. No `docs/bug-patterns.json` shipped in PR-B.
-**P1**:
-7. **`/rlp-desk report <slug>` subcommand** — new section in `src/commands/rlp-desk.md`; reads latest `bug-reports/<slug>-*.json`, prints markdown to stdout. Optional `--headline "..."` flag. **No remote publish.**
-8. **Schema doc** — `docs/rlp-desk/bug-report-schema.md` (NEW): JSON schema + worked example + .gitignore snippet recommendation.
-**Tests**:
-- `tests/node/test-bug-report-writer.test.mjs` (NEW) — 5 ACs (W1–W5 in §8).
-- `tests/test-bug-report-zsh-emit.sh` (NEW).
-- `tests/node/us006-campaign-main-loop.test.mjs` extension — 2 integration ACs (I1–I2 in §8).
-### PR-C — **Pattern operationalization + governance §1g** (lands third; pattern data + governance ride-along justified)
-**P1**:
-9. **`docs/bug-patterns.json`** (NEW) — seed signatures for Bug #1–#10, hand-authored from BOS reports. Schema: `{bug_id, signature: {reason_category, failure_category, pane_token_bag[]}, fix_pr_url}`.
-10. **Pattern-match implementation** — `bug-report.mjs` Jaccard implementation populates `pattern_match.candidate_bug_ids[]` + `score` (P2 cap: ≥0.7 = "candidate", < 0.7 = empty list). Output remains data-only — no inline CLI suggestion. v0 §7 S3 mitigation preserved.
-11. **Governance §1g "Bug Report Capture"** — additive section; documents (a) every BLOCKED writes a bug-report invariant, (b) redaction rules + audit field, (c) PR-A relaunch hygiene contract (operator's recovery is honored). **Ride-along rationale**: §1g formalizes the contract that PR-A and PR-B implement; landing them as separate PRs without governance text leaves the invariants implicit. Per CLAUDE.md, governance changes require `ralplan + codex review` — this very plan satisfies that for §1g; PR-C's review surface is *only* the §1g text + pattern data + Jaccard ≤80 LOC implementation. Bounded.
-**Tests**:
-- `tests/node/test-bug-report-pattern-match.test.mjs` (NEW) — Jaccard determinism, threshold, regression on synthetic Bug #6 fixture. AC-W4 from §8 (relocated from PR-B).
-### P2+ → `docs/plans/bug-report-overhaul-backlog.md` (separate file, NOT this PR-A/B/C set)
-- Heartbeat-warning sidecar (Option C).
-- GitHub Issues integration (Option D, after authn story).
-- Pattern-learning loop that mines `~/.claude/ralph-desk/analytics/*/bug-reports/` for emerging clusters.
-- Cross-campaign bug-report dashboard in `/rlp-desk analytics`.
-- Auto-suggest "this looks like Bug #N — try fix-X" inline in CLI output (today: data-only).
-- Operator-CLI `/rlp-desk recover <slug> --to verify` to write the manual recovery artifacts deterministically (currently a hand-rolled jq pipeline).
----
-## 6. Files to modify (summary across PR-A/B/C)
-| PR | File | Change | Risk |
-|---|---|---|---|
-| A | `src/node/runner/campaign-main-loop.mjs` | Phase-verify recovery branch (after `:1256`); guard at `:1575`; `_validateOperatorRecoveryArtifacts` helper | **MED** (control-flow change in main loop) |
-| A | `src/scripts/lib_ralph_desk.zsh` | `_validate_operator_recovery_artifacts` helper | LOW |
-| A | `src/scripts/run_ralph_desk.zsh` | Phase-verify recovery branch wrapping the iteration-body sites at `:3053` (rm guard), `:3084` (update_status guard), `:3087` (worker dispatch skip → verifier dispatch jump). No `start_iteration` function exists — iteration body starts at the top of the main while loop after the cleanup block. | **MED** |
-| A | `tests/node/test-relaunch-phase-verify-hygiene.test.mjs` | NEW — 5 ACs | LOW |
-| A | `tests/test-bug10-zsh-relaunch-hygiene.sh` | NEW | LOW |
-| B | `src/node/shared/bug-report.mjs` | NEW module — writer + redaction; `pattern_match` reserved-empty | LOW |
-| B | `src/node/runner/campaign-main-loop.mjs` | Call `writeBugReport` from `_emitBlockedSentinel` (post existing JSON write) | LOW |
-| B | `src/scripts/lib_ralph_desk.zsh` | `_write_bug_report` helper | MED |
-| B | `src/scripts/run_ralph_desk.zsh` | Wire `_write_bug_report` after `write_blocked_sentinel` sites | MED |
-| B | `src/commands/rlp-desk.md` | Add `## report <slug>` + help-block entry | LOW |
-| B | `docs/rlp-desk/bug-report-schema.md` | NEW | LOW |
-| B | `tests/node/test-bug-report-writer.test.mjs` | NEW | LOW |
-| B | `tests/test-bug-report-zsh-emit.sh` | NEW | LOW |
-| B | `tests/node/us006-campaign-main-loop.test.mjs` | +2 integration ACs | LOW |
-| C | `docs/bug-patterns.json` | NEW seed | LOW |
-| C | `src/node/shared/bug-report.mjs` | Jaccard implementation (≤80 LOC); replaces reserved-empty logic | LOW |
-| C | `src/governance.md` | Additive §1g | LOW (additive) |
-| C | `tests/node/test-bug-report-pattern-match.test.mjs` | NEW | LOW |
-**PR-A**: 3 modified + 2 new = 5 files.
-**PR-B**: 5 modified + 4 new = 9 files.
-**PR-C**: 2 modified + 2 new = 4 files.
-Each PR's review surface bounded; merge-conflict surface with `native-agent-revert` is empty (PRs sequenced AFTER it lands).
----
-## 7. Pre-mortem (deliberate mode — 3 scenarios; updated for v1)
-### S1 — Pane-tail leaks a secret into a committed bug-report (PR-B risk)
-(unchanged from v0; mitigation in PR-B test AC-W2; `meta.redacted_line_count` audit field; schema doc recommends `.gitignore` snippet but does not auto-modify user repo)
-### S2 — Bug #10 fix accidentally honors a stale `phase=verify` from a CRASHED leader (PR-A risk)
-The 5-validation gate (exists × 2, us_id match, iteration match, `iter_signal_quality=='specific'`, mtimes-newer-than-worker-prompt) blocks the most likely race. PR-A also adds a `_logRecovery` audit line for every relaunch outcome (honored / ignored + reason) so operators can confirm in `/rlp-desk logs <slug>`.
-**Residual risk**: a clever filesystem race can pass all five checks. Backlog item: `/rlp-desk recover <slug> --to verify` opt-in flag (P2). Until then, the validator's strictness (any miss → fall through to current behavior) makes the failure mode "no improvement" not "regression".
-### S3 — `pattern_match` false-positive trains operators to dismiss real bugs (PR-C risk)
-In PR-B, `pattern_match` is empty/reserved — no risk. In PR-C, threshold 0.7 + Jaccard determinism + data-only output. Operator sees `score: 0.83 — review before assuming match`. Auto-suggest deferred to P2.
-**Architect-flagged risk added (S4)**: PR-A's `state._skipNextWorkerDispatch` is a one-shot mutation on a shared object. **Mitigation**: explicitly cleared inside the guard branch (Edit 2 above); PR-A includes a unit test that runs 2 consecutive iterations and asserts the worker IS dispatched on iter-2.
----
-## 8. Expanded test plan (deliberate mode)
-### PR-A unit (Node) — `tests/node/test-relaunch-phase-verify-hygiene.test.mjs`
-- AC-R1: status.phase=verify + valid artifacts → verifier-only entry (no worker dispatch).
-- AC-R2: status.phase=verify + missing `done-claim.json` → fall through to worker, log warning.
-- AC-R3: status.phase=verify + `us_id` mismatch → fall through, warning.
-- AC-R4: status.phase=verify + `iter-signal.json` older than worker-prompt.md → fall through, warning.
-- AC-R5: status.phase=verify + `iter_signal_quality != 'specific'` → fall through, warning.
-- AC-R6 (Architect S4): `_skipNextWorkerDispatch` cleared after one use; iter-2 worker dispatched normally.
-### PR-B unit (Node) — `tests/node/test-bug-report-writer.test.mjs`
-- AC-W1: schema fields all present + types match `docs/rlp-desk/bug-report-schema.md`.
-- AC-W2: redaction — 12 secret-shape fixtures all replaced by `<REDACTED>`; `meta.redacted_line_count` reflects count.
-- AC-W3: pane-tail truncates at 200 lines; preserves last lines.
-- AC-W5: idempotent — second call with same `(slug, iso)` is a no-op.
-- (AC-W4 relocated to PR-C — Jaccard pattern match.)
-### PR-B integration (Node) — `tests/node/us006-campaign-main-loop.test.mjs` extension
-- AC-I1: BLOCKED via `flywheel_inconclusive` → bug-report file written; JSON parses; `reason_category == 'mission_abort'`.
-- AC-I2: BLOCKED via `worker_exited` → bug-report `pattern_match` field exists with `{candidate_bug_ids: [], score: null, source: "deferred-to-PR-C"}`.
-### PR-C unit (Node) — `tests/node/test-bug-report-pattern-match.test.mjs`
-- AC-W4: pattern_match against seeded `docs/bug-patterns.json` — synthetic block matching Bug #6 signature returns `score >= 0.7` + correct `candidate_bug_ids`.
-- AC-W4b: synthetic block with no matches → `candidate_bug_ids: []`, `score < 0.7`.
-### Integration (zsh)
-- `tests/test-bug10-zsh-relaunch-hygiene.sh` (PR-A).
-- `tests/test-bug-report-zsh-emit.sh` (PR-B). Sc-1 + Sc-2 from v0.
-### Self-Verification scenarios (CLAUDE.md gate)
-**PR-A** (touches `run_ralph_desk.zsh` → CLAUDE.md mandates 3 SV scenarios):
-- LOW: 5-AC unit suite green; existing zsh + Node regression green.
-- MEDIUM: real campaign brought to BLOCKED via stub failure; operator runs the documented recovery flow (jq patches `phase=verify`, writes manual artifacts); relaunch → verifier-only path runs (no `iter-002.worker-prompt.md` created); verdict accepted.
-- CRITICAL: same as MEDIUM but also assert the `_skipNextWorkerDispatch` flag does not survive into iter-3; `/rlp-desk logs` shows `Resuming verify phase` audit line.
-**PR-B** (touches `run_ralph_desk.zsh` again → 3 SV scenarios):
-- LOW: redaction unit fixture passes.
-- MEDIUM: real campaign with stub worker fails → `bug-reports/<slug>-<iso>.{json,md}` appears; markdown render contains all required sections; `/rlp-desk report <slug>` prints same markdown.
-- CRITICAL: redaction smoke — pane log pre-injected with `Bearer X` and `OpenAI-API-Key: sk-...`; bug-report JSON does not contain those substrings; `meta.redacted_line_count >= 2`.
-**PR-C** (governance + patterns; no runtime code path → CLAUDE.md gate ride-along scenarios):
-- LOW: Jaccard unit suite green.
-- MEDIUM: synthetic block matching Bug #6 → bug-report `pattern_match.candidate_bug_ids` includes `Bug-6`.
-- CRITICAL: ralplan + codex review of governance §1g additions reaches 0 issues.
----
-## 9. Verification end-to-end (per PR)
-(Each PR verified independently before next PR starts.)
-1. `node --test 'tests/node/*.test.mjs'` all green; new PR-specific tests visible.
-2. zsh integration test for that PR green.
-3. Bug #7 regression suite (`test-bug7-post-sentinel-race.sh`, `test-bug7-poll-partial-write.sh`) unchanged green.
-4. CLAUDE.md SV gate × 3 for that PR — all PASS.
-5. **Local sync verification (P1-2 corrected from v1)** — depends on which file types changed:
-   - **Node files only** (PR-C): `node scripts/postinstall.js` + banner-aware diff `src/` ⇆ `~/.claude/ralph-desk/` per CLAUDE.md.
-   - **zsh wrappers changed** (PR-A and PR-B both touch `src/scripts/{run,lib}_ralph_desk.zsh`): `node scripts/postinstall.js` does NOT install the legacy zsh wrappers (CLAUDE.md `Local File Sync` §: "synced ONLY via `bash install.sh` curl path"). Required additional steps:
-     - `bash install.sh` from a clean shell to drive the curl-path installer.
-     - Verify zsh wrappers landed: `ls -la ~/.claude/ralph-desk/install.sh ~/.claude/ralph-desk/scripts/{init,run,lib}_ralph_desk.zsh`. Each must be banner-headed (`# DO NOT EDIT ...` line 2 after the shebang) and `chmod 0o444`.
-     - Banner-aware diff: `diff <(cat src/scripts/run_ralph_desk.zsh) <(tail -n +3 ~/.claude/ralph-desk/scripts/run_ralph_desk.zsh)` (skip shebang + banner).
-     - Document which install channel was exercised in the SV scenario notes so future audits can replay.
-6. Manual sandbox campaign trigger (PR-A: deliberate BLOCKED + recovery; PR-B: BLOCKED + read bug-report markdown; PR-C: pattern_match populated).
----
-## 10. ADR (preview — final once Critic approves)
-- **Decision**: Adopt Option A (sequenced PR-A→PR-B→PR-C) for v0.16.x; defer heartbeat-warning, external tracker, and operator recovery CLI to backlog.
-- **Drivers**: D1 operator-minutes, D2 cluster-recognition, D3 zero `--mode tmux` regression + zero `native-agent-revert` collision.
-- **Alternatives considered**: B (one mega-PR) rejected per Architect for merge collision + SV scope balloon; C (heartbeat) orthogonal — backlog; D (GitHub Issues) violates principle 3; E (doc only) does not move D1/D2.
-- **Why chosen**: PR-A fixes the highest per-event cost (recovery loss). PR-B captures-by-default with redaction. PR-C operationalizes patterns and formalizes governance §1g. Each PR has bounded review surface and clean revert.
-- **Consequences**: BLOCKED writes additional artifacts (`bug-reports/<slug>-<iso>.{json,md}`); operator recovery is honored; `bug-patterns.json` becomes a living artifact; governance gains §1g formal contract.
-- **Follow-ups**: Backlog file lists P2+ items. Heartbeat warning revisited after we measure operator minutes-saved on first 3 BLOCKED post-PR-A land.
----
-## 11. Round-by-round resolution log
-| Round | Reviewer | Verdict | Findings closed |
-|---|---|---|---|
-| 0 | — | Planner v0 | initial draft |
-| 1 | Architect (Claude) | ITERATE | (1) split into PRs (2) cite file:line (3) pattern_match → backlog/PR-C (4) native-revert dependency (5) governance §1g rationale |
-| 2 | Codex Critic | ITERATE — 0 P0, 2 P1 | P1-1 zsh sites corrected to `:3053/:3084/:3087` (no `start_iteration` function); P1-2 sync gap closed (zsh = `bash install.sh` channel + banner+chmod verify). BACKLOG: P2-1/P2-2/P3-1 captured below. |
-| 3 | Codex Critic | **APPROVE** — 0 P0, 0 P1 | Round 2 P1-1/P1-2 confirmed closed by ground-truth checks (zsh sites at `:3053/:3084/:3087` + CLAUDE.md `bash install.sh` channel). New P2 (lib_ralph_desk.zsh diff) → backlog. |
-**Loop terminated**: Codex Critic returned APPROVE at Round 3. P0+P1 findings = 0. Per user stop rule, ralplan exits. P2/P3 captured in `bug-report-overhaul-backlog.md`.
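
The five-check recovery gate described in the deleted plan above (PR-A, §7 S2) can be sketched as a pure function. This is a minimal illustration, not the package's helper: the function name, the `{ data, mtimeMs }` argument shape, the reason strings, and the choice to read `us_id`, `iteration`, and `iter_signal_quality` from the parsed `iter-signal.json` are all assumptions made for the example.

```javascript
// Hypothetical sketch of the five recovery checks; names and shapes are assumptions.
// iterSignal / doneClaim: { data: <parsed JSON>, mtimeMs: <number> }, or null when
// the file is missing or does not parse.
function validateOperatorRecoveryArtifacts({ iterSignal, doneClaim, state, workerPromptMtimeMs }) {
  const fail = (reason) => ({ ok: false, reason });

  // Check 1: both artifacts exist and parsed.
  if (!iterSignal || !doneClaim) return fail('artifact missing or unparseable');

  // Check 2: the signal belongs to the user story the campaign is on.
  if (iterSignal.data.us_id !== state.current_us) return fail('us_id mismatch');

  // Check 3: the signal belongs to the current iteration.
  if (iterSignal.data.iteration !== state.iteration) return fail('iteration mismatch');

  // Check 4: only a "specific" signal counts as deliberate operator recovery.
  if (iterSignal.data.iter_signal_quality !== 'specific') {
    return fail('iter_signal_quality is not specific');
  }

  // Check 5: both files must be newer than the latest worker prompt.
  if (iterSignal.mtimeMs <= workerPromptMtimeMs || doneClaim.mtimeMs <= workerPromptMtimeMs) {
    return fail('artifact older than worker prompt');
  }

  return { ok: true, reason: null };
}

// Usage: any failed check falls through to the default worker dispatch.
const outcome = validateOperatorRecoveryArtifacts({
  iterSignal: { data: { us_id: 'US-003', iteration: 2, iter_signal_quality: 'specific' }, mtimeMs: 2000 },
  doneClaim: { data: { us_id: 'US-003' }, mtimeMs: 2100 },
  state: { current_us: 'US-003', iteration: 2 },
  workerPromptMtimeMs: 1000,
});
console.log(outcome.ok ? 'resume verify phase' : `ignored: ${outcome.reason}`);
```

Returning `{ ok, reason }` rather than a bare boolean lets the caller log why a `phase=verify` status was ignored, which is what the plan's audit line requires.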

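The PR-C pattern match in the same deleted plan (Jaccard similarity over a `pane_token_bag`, with a 0.7 candidate threshold and data-only output) can be illustrated as follows. This is a sketch under stated assumptions: `jaccard` and `matchPatterns` are hypothetical names, the sample tokens are invented, and only the `bug_id` / `signature.pane_token_bag` field names and the 0.7 threshold come from the plan.

```javascript
// Hypothetical sketch: Jaccard similarity between two token bags, treated as sets.
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size === 0 && setB.size === 0) return 0; // assumption: empty vs empty scores 0
  let shared = 0;
  for (const token of setA) if (setB.has(token)) shared += 1;
  return shared / (setA.size + setB.size - shared);
}

// Scores the observed tokens against every seeded pattern. Candidates are the
// patterns at or above the threshold; `score` is the best score seen, or null
// when there are no patterns at all.
function matchPatterns(observedTokens, patterns, threshold = 0.7) {
  const scored = patterns
    .map((p) => ({ bug_id: p.bug_id, score: jaccard(observedTokens, p.signature.pane_token_bag) }))
    // Sort by score, then by id, so the output is deterministic.
    .sort((x, y) => y.score - x.score || x.bug_id.localeCompare(y.bug_id));
  return {
    candidate_bug_ids: scored.filter((c) => c.score >= threshold).map((c) => c.bug_id),
    score: scored.length > 0 ? scored[0].score : null,
  };
}

// Usage with invented sample data.
const patterns = [
  { bug_id: 'Bug-6', signature: { pane_token_bag: ['worker', 'hang', 'sentinel', 'timeout', 'poll'] } },
  { bug_id: 'Bug-7', signature: { pane_token_bag: ['verifier', 'race'] } },
];
console.log(matchPatterns(['worker', 'hang', 'sentinel', 'timeout'], patterns));
```

The result stays data-only: the caller decides how to present a candidate, in line with the plan's choice to defer inline CLI suggestions.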
package/docs/plans/native-agent-revert.md DELETED

@@ -1,184 +0,0 @@
-# Native Agent() Revert Plan (P0+P1)
-Result of a 6-round ralplan consensus. Goal: revert to the pre-v0.13.x approach, in which the slash command (`src/commands/rlp-desk.md`) acts as the real leader and spawns the worker/verifier via Claude Code Agent() calls + Bash codex exec. The `--mode tmux` (zsh runner) path is unchanged.
-## Scope
-- **P0**: standalone Bug #7 fix commit + explicit invariant ADR
-- **P1**: restore the slash command's native prose + introduce `--mode native` + Node CLI `--mode agent` deprecation banner
-- **Out-of-scope**: P2 (Node CLI default flip, `--mode agent` hard-error), P3 (Node leader removal, ~4.5k LOC + repo-wide ghost-removal gate). Separate PRs after the 19th launch ends.
-## Principles
-1. Naming truth: resolve the situation where the single flag `--mode agent` means different things in two places.
-2. Single leader per mode.
-3. Surgical revert: zero impact on the tmux path and the 19th launch.
-4. Bug #7 fix preservation: state in an ADR that the invariant is preserved on the zsh side.
-5. Reversibility: no silent reclaim. `--mode agent` (Node CLI) stays compatible and only emits a deprecation warning.
-## P0 — Bug #7 fix commit + Invariant ADR
-### a. Bug #7 fix commit
-- Commit the current working tree (5 src + 4 untracked tests + 1 plan markdown) as a standalone commit
-- Local sync is already done (`~/.claude/ralph-desk/` chmod 0o444 + banner)
-### b. ADR `docs/adr/0001-bug7-invariant-zsh-only-by-structural-necessity.md`
-Explicit scope:
-- With respect to the **slash-command Native Agent() / Bash codex exec path**, the Bug #7 invariant is **enforced on the zsh runner side**.
-- The Native Agent() path uses short-lived per-call subagents; with no long-lived TUI process, it does not have the same race.
-- Node CLI `--mode agent` (Node leader, deprecated alpha) uses a long-lived tmux pane → it has the same race. Its separate reaper/lock code (`src/node/runner/campaign-main-loop.mjs:1091`, `:1577`, `src/node/tmux/pane-manager.mjs`, `src/node/shared/fs.mjs`) is kept until it is removed in P3.
-- zsh-side invariant citations (codex critic verified file:line):
-  - `src/scripts/lib_ralph_desk.zsh:248` — helpers (`_kill_pane_process`, `_lock_sentinel`, `_unlock_sentinel`)
-  - `src/scripts/run_ralph_desk.zsh:2179` — partial-write `jq -e .` validity gate
-  - `src/scripts/run_ralph_desk.zsh:2484` — verifier reap+lock (per-US main path)
-  - `src/scripts/run_ralph_desk.zsh:2551` — final-verify per-US reap+lock
-  - `src/scripts/run_ralph_desk.zsh:2969` — prep cleanup unlock
-  - `src/scripts/run_ralph_desk.zsh:3036` — worker reap+lock
-  - `src/scripts/run_ralph_desk.zsh:3247` — verifier reap+lock (consensus)
-### c. re-sync
-- `node scripts/postinstall.js`
-- banner-aware diff: src ⇆ `~/.claude/ralph-desk/`
-### Acceptance
-- AC0.1 `git log -1` shows the Bug #7 fix
-- AC0.2 ADR file exists + cites the 7 file:line references above + states the scope explicitly
-- AC0.3 `bash tests/test-bug7-post-sentinel-race.sh` + `bash tests/test-bug7-poll-partial-write.sh` pass
-- AC0.4 Node tests pass 315/315
-- AC0.5 banner-aware diff shows src ⇆ install match
-## P1 — Slash native prose + `--mode native` + Node CLI deprecation
-### a. `src/commands/rlp-desk.md` audit list (full)
-| Line | Current | After |
-|---|---|---|
-| 192 | `--mode agent\|tmux` in the first "/rlp-desk run" Full options reference block emitted by init | `--mode native\|tmux (default: native)` (the recommended example keeps `--mode tmux`) |
-| 227 | Second init emission | Same treatment |
-| 255 | Options block: `- \`--mode agent\|tmux\` (default: \`agent\`)` | `- \`--mode native\|tmux\` (default: \`native\`)` (exact line form) |
-| 287 | Mode Selection: "If absent or `agent`, use the Agent() path below" | Spell out the two axes: `--mode native` (default, slash native Agent() leader) / `--mode tmux` (zsh runner). Legacy `--mode agent` deprecation+redirect prose. Direct Node CLI `node run.mjs --mode agent` is deprecated alpha and gets a separate paragraph |
-| 334 | tmux fallback "suggest `--mode agent`" | "suggest `--mode native`" |
-| 342 | "SV/flywheel is supported in `--mode agent`" | "SV/flywheel is currently implemented only in the Node-leader `--mode agent` (deprecated alpha, direct Node CLI). The Native Agent() path (`--mode native`) does not implement SV/flywheel; that is post-P3 work" |
-| 343 | Tmux IMPORTANT RULES "always invokes node ..." | Scoped to `--mode tmux` only |
-| 360-410 | "Why Agent mode is structurally immune" + "PLATFORM CONSTRAINT", scattered | Absorbed verbatim into a single box, `### Native Agent() Safety Contract`. 4 sentinels: turn-keepalive, no `subagent_type`, `mode="bypassPermissions"` mandatory, long-running→tmux |
-| 448, 460 | claude/codex worker dispatch code | No change (already native-wired) |
-| 778 | "agent=LLM leader, tmux=shell leader" help | "native=Native Agent() leader (slash), tmux=zsh leader (production). Legacy `agent` redirects to `native`. Direct Node CLI `--mode agent` is deprecated alpha; see the Direct Node CLI invocation section" |
-| 784 | `--mode agent` in the run example fallback | `--mode native` |
-| 802 | "Agent Mode (default: --mode agent)" heading | "Native Agent() Mode (default: --mode native)" |
-### b. `src/node/run.mjs`
-- New `--mode native` handler:
-  - stderr: `ERROR: --mode native is slash-command-only. The Node CLI does not implement it. Use \`/rlp-desk run --mode native\` from a Claude Code session, or use \`--mode {tmux,agent}\` for direct CLI invocation.`
-  - exit 2
-- Strengthen the `--mode agent` (Node CLI, line 366-374) deprecation banner:
-  ```
-  WARNING: --mode agent (Node-leader alpha) is deprecated.
-  This is the direct Node-CLI alpha path — UNRELATED to the slash command's
-  Native Agent() path (`/rlp-desk run --mode native`).
-  For production tmux orchestration use `--mode tmux`.
-  For Claude Code Native Agent() campaigns use `/rlp-desk run --mode native`
-  from a Claude Code session.
-  This mode will hard-error in the next major release.
-  ```
-  - default behavior unchanged (no silent reclaim; backward compatible)
-  - do not introduce an `--allow-deprecated` flag (avoids a ghost flag that P3 would have to delete)
-  - wrappers that want silence can use `2>/dev/null`
-### c. Tests
-#### us008: 3 new cases
-1. `node run.mjs run demo --mode native` → exit 2 + stderr ERROR message
-2. `node run.mjs run demo --mode agent` → stderr deprecation banner + exit 0 (default behavior preserved)
-3. `node run.mjs run demo --mode tmux` regression unchanged
-#### SV grep/awk guards (new `tests/sv-gate-bug7-mode-prose.sh`, or merged into sv-gate-fast.sh)
-```bash
-# 1. count-aware: --mode native must appear on at least 5 lines
-[ "$(grep -c '\-\-mode native' src/commands/rlp-desk.md)" -ge 5 ] || { echo "FAIL: --mode native must appear ≥5 times"; exit 1; }
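-# 2. block-aware safety contract (flag-based awk: the start heading also matches /^### /, so a plain range would close on its first line)
-WINDOW=$(awk '/^### Native Agent\(\) Safety Contract/{f=1; print; next} f && /^### /{exit} f' src/commands/rlp-desk.md)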
-echo "$WINDOW" | grep -q 'Turn-keepalive: every status report uses' || { echo "FAIL: turn-keepalive sentinel"; exit 1; }
-echo "$WINDOW" | grep -q 'no `subagent_type` parameter' || { echo "FAIL: no-subagent_type sentinel"; exit 1; }
-echo "$WINDOW" | grep -q 'mode="bypassPermissions" mandatory' || { echo "FAIL: bypassPermissions sentinel"; exit 1; }
-echo "$WINDOW" | grep -qi 'long-running.*tmux' || { echo "FAIL: long-running tmux recommendation"; exit 1; }
-# 3. dispatch snippet preservation (AC1.5a static)
-awk '/^If claude engine \(default\):/,/^If codex engine:/' src/commands/rlp-desk.md > /tmp/_disp_claude
-grep -q 'Agent(' /tmp/_disp_claude || { echo "FAIL: claude dispatch missing Agent("; exit 1; }
-grep -q 'mode="bypassPermissions"' /tmp/_disp_claude || { echo "FAIL: claude dispatch missing bypassPermissions"; exit 1; }
-awk '/^If codex engine:/,/^\*\*⑥\*\*|^### /' src/commands/rlp-desk.md > /tmp/_disp_codex
-grep -q 'Bash("codex exec' /tmp/_disp_codex || { echo "FAIL: codex dispatch missing Bash codex exec"; exit 1; }
-# 4. Options block exact match
-WINDOW=$(awk '/^Options \(parse from/,/^- `--worker-model/' src/commands/rlp-desk.md)
-echo "$WINDOW" | grep -qE '^\- `--mode native\|tmux` \(default: `native`\)$' || { echo "FAIL: Options block --mode line not exact"; exit 1; }
-echo "$WINDOW" | grep -qE '\-\-mode .*agent' && { echo "FAIL: stale 'agent' in Options block --mode line"; exit 1; }
-# 5. Tmux IMPORTANT RULES contradiction removed
-! awk '/^\*\*IMPORTANT RULES:\*\*/,/^####/' src/commands/rlp-desk.md | grep -q "always invokes node" || { echo "FAIL: stale 'always invokes node'"; exit 1; }
-# 6. Legacy redirect prose present
-grep -q 'Legacy.*\-\-mode agent.*redirect' src/commands/rlp-desk.md || { echo "FAIL: deprecation prose missing"; exit 1; }
-```
-#### AC1.5b: manual transcript artifact
-`docs/verifications/p1-native-mode-transcript.md` (git-tracked):
-1. After P1 lands, run one iteration of `/rlp-desk run sample --mode native` in a Claude Code session
-2. Capture the full transcript, including the following observations:
-   - the `Agent(model=…, mode="bypassPermissions", …)` worker dispatch line
-   - status reports wrapped in `Bash("echo '...'")`
-   - no use of `subagent_type=`
-3. The reviewer records name + date in the `## Reviewer Sign-off` section
-4. CI guard: verifies that the file exists and the signoff is non-placeholder (`grep -E "^- Name: \S+"`, `grep -E "^- Date: [0-9]{4}-[0-9]{2}-[0-9]{2}"`)
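The CI guard in step 4 can be sketched as a small shell function. This is a minimal sketch, assuming a hypothetical helper named `check_signoff` and assuming an unfilled template leaves the `Name` and `Date` fields empty. The two patterns come from the step above, with `\S` written as `[^[:space:]]` for portability beyond GNU grep.

```shell
# Hypothetical helper: succeed only if the transcript artifact exists and
# both sign-off fields are filled in.
check_signoff() {
  local file="$1"
  [ -f "$file" ] || { echo "FAIL: $file missing"; return 1; }
  # Name must contain at least one non-space character.
  grep -qE '^- Name: [^[:space:]]+' "$file" \
    || { echo "FAIL: reviewer name not filled in"; return 1; }
  # Date must start with a YYYY-MM-DD shape (calendar validity is not checked).
  grep -qE '^- Date: [0-9]{4}-[0-9]{2}-[0-9]{2}' "$file" \
    || { echo "FAIL: review date not filled in"; return 1; }
}

# Assumed usage:
#   check_signoff docs/verifications/p1-native-mode-transcript.md || exit 1
```

A placeholder such as `- Name: TBD` would still pass this check, so the guard only catches empty fields.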
-### d. Re-sync
-- `node scripts/postinstall.js`
-- banner-aware diff for `src/commands/rlp-desk.md` ⇆ `~/.claude/commands/rlp-desk.md`, `src/node/run.mjs` ⇆ `~/.claude/ralph-desk/node/run.mjs`
-### Acceptance
-- AC1.1 6 grep/awk guards all return 0
-- AC1.2 all 3 new us008 cases green
-- AC1.3 zero regressions in the existing us008/us006 tests
-- AC1.4 banner-aware diff shows src ⇆ install in sync
-- AC1.5a static dispatch grep (#3) passes
-- AC1.5b transcript artifact + signoff non-placeholder
-## Out-of-scope (deferred PR list)
-- **P2**: flip the `src/node/run.mjs:16` default from `'agent'` to `'tmux'` + make `--mode agent` (Node CLI) a hard error. Separate PR, after the 19th launch ends and the impact on external wrappers has been assessed.
-- **P3**: delete the Node leader (`src/node/runner/campaign-main-loop.mjs`, `src/node/tmux/`, `src/node/polling/`, etc., ~4.5k LOC) + retire the Bug-7 Node integration tests + add a repo-wide ghost-removal gate (`rg -n "Node leader\|node-leader\|--mode agent" src docs scripts tests` = intended hits only). After P2.
-- **`--mode agent` reclaim to Native Agent()**: only in the next major version after P3. No silent reclaim in this PR.
-## Pre-mortem
-1. **`--mode agent` callers confuse the native and alpha meanings**: when invoked from the slash command, deprecation + redirect proceeds down the native path. When an external shell wrapper calls `node run.mjs run X --mode agent`, it gets the deprecation banner + the existing Node leader path. Both paths say so explicitly in their messages.
-2. **Native Agent() turn-end annoys the user**: mitigated by codifying turn-keepalive in the Safety Contract box. That is still not a 100% fix, so the docs strongly recommend `--mode tmux` for long-running campaigns.
-3. **External wrappers depend on `--mode agent` (Node CLI)**: behavior unchanged, deprecation banner only. The hard error arrives only at P3. Wrappers migrate in the meantime.
-## Verification end-to-end (after P0+P1 land)
-1. `git log --oneline HEAD~3..HEAD` — Bug #7 + ADR + P1 commits
-2. `node --test 'tests/node/*.test.mjs' 'tests/node/*.mjs'`: 315+3 = 318 passing
-3. `bash tests/test-bug7-post-sentinel-race.sh` + `bash tests/test-bug7-poll-partial-write.sh` pass
-4. `bash tests/sv-gate-bug7-mode-prose.sh` (or sv-gate-fast.sh): all 6 grep/awk guards exit 0
-5. banner-aware diff src ⇆ `~/.claude/`
-6. AC1.5b: the user runs `/rlp-desk run sample --mode native` in a Claude Code session, then reviews the transcript + signs off
-## Round-by-round resolution table
-| Round | Verdict | Findings closed |
-|---|---|---|
-| 1 (Architect) | shift to A-strict | option A → A-strict |
-| 2 (Critic codex) | ITERATE 7 | entrypoint, default flip, ADR scope, naming, reclaim, AC, re-sync |
-| 3 (Architect) | ITERATE 2 | synonym ghost, allow-deprecated ghost |
-| 4 (Critic codex) | ITERATE 4 | init blocks (192/227), fallback (334/342), grep guards, ADR scope |
-| 5 (Critic codex) | ITERATE 3 | label expansion (778/802), exact options match, AC1.5 runnable |
-| 6 (Critic codex) | ITERATE 3 (1 actionable, 2 cross-check false-positives) | signoff non-placeholder check |
-Net: all actionable v0-v5 findings closed. Round 6 finding 3 (signoff non-placeholder) is already reflected in v7 (AC1.5b CI guard, item 4). Round 6 findings 1/2 were already covered in the v6 base; the critic's cross-check missed them.