@ai-dev-methodologies/rlp-desk 0.13.1 → 0.14.1
- package/README.md +13 -6
- package/docs/plans/spicy-booping-galaxy.md +397 -2
- package/package.json +3 -3
- package/scripts/postinstall.js +14 -5
- package/scripts/uninstall.js +5 -0
- package/src/commands/rlp-desk.md +19 -5
- package/src/node/polling/signal-poller.mjs +14 -0
- package/src/node/run.mjs +131 -1
- package/src/node/runner/campaign-main-loop.mjs +6 -0
- package/src/node/runner/prompt-dismisser.mjs +12 -0
package/README.md
CHANGED
|
@@ -238,7 +238,7 @@ When all US pass individually, the final ALL verify runs **sequentially per-US**
|
|
|
238
238
|
|
|
239
239
|
| Option | Default | Description |
|
|
240
240
|
|--------|---------|-------------|
|
|
241
|
-
| `--mode agent\|tmux` | agent |
|
|
241
|
+
| `--mode agent\|tmux` | agent | tmux=zsh Leader (stable, production), agent=Node Leader (alpha) |
|
|
242
242
|
| `--worker-model MODEL` | haiku | Worker model. `name`=claude, `name:reasoning`=codex |
|
|
243
243
|
| `--lock-worker-model` | off | Disable auto model upgrade on failure |
|
|
244
244
|
| `--verifier-model MODEL` | sonnet | per-US verification model (lighter) |
|
|
@@ -277,10 +277,17 @@ The brainstorm phase evaluates complexity (US count, file scope, logic, dependen
|
|
|
277
277
|
|
|
278
278
|
RLP Desk supports two execution modes. Both honor the same governance protocol.
|
|
279
279
|
|
|
280
|
+
> **v0.14.0 status:** `--mode tmux` (zsh-backed) is the **stable, production** path
|
|
281
|
+
> with the full safety net (heartbeat, copy-mode guard, prompt-stall timeout,
|
|
282
|
+
> no-progress detection, claude model upgrade chain). `--mode agent` is **alpha**
|
|
283
|
+
> and ships without those features — the runner emits a stderr warning when
|
|
284
|
+
> agent mode is invoked. For long campaigns and BOS-style autonomous loops,
|
|
285
|
+
> use `--mode tmux`.
|
|
286
|
+
|
|
280
287
|
### Environment Compatibility
|
|
281
288
|
|
|
282
|
-
| Environment | Agent Mode | Tmux Mode |
|
|
283
|
-
|
|
289
|
+
| Environment | Agent Mode (alpha) | Tmux Mode (stable) |
|
|
290
|
+
|-------------|--------------------|--------------------|
|
|
284
291
|
| Claude Code (any terminal) | **Works** | Requires tmux |
|
|
285
292
|
| Inside tmux session | **Works** | **Works** — panes split in current window |
|
|
286
293
|
| Outside tmux session | **Works** | **Rejected** — "start tmux first" |
|
|
@@ -289,9 +296,9 @@ RLP Desk supports two execution modes. Both honor the same governance protocol.
|
|
|
289
296
|
|
|
290
297
|
| Need | Use |
|
|
291
298
|
|------|-----|
|
|
292
|
-
|
|
|
293
|
-
|
|
|
294
|
-
|
|
|
299
|
+
| Production / autonomous campaigns | `--mode tmux` (stable) |
|
|
300
|
+
| Long campaigns, CI, overnight runs | `--mode tmux` (stable) |
|
|
301
|
+
| Quick interactive exploration inside Claude Code | `--mode agent` (alpha — Node-native) |
|
|
295
302
|
|
|
296
303
|
### Agent Mode (default) — "Smart Mode"
|
|
297
304
|
|
|
package/docs/plans/spicy-booping-galaxy.md
CHANGED
|
@@ -1,8 +1,403 @@
|
|
|
1
|
-
# Plan —
|
|
1
|
+
# Plan: v0.14.1 - fix the Codex verifier idle state being misclassified as frozen
|
|
2
|
+
|
|
3
|
+
> **Top-priority plan. The v0.14.0 / v0.13.0 plans below are kept as history reference.**
|
|
4
|
+
> **Trigger**: BOS Bug Report #3 (`/Users/kyjin/dev/doul/bos/docs/exec-plans/active/2026-05-04-rlp-desk-bug-report-3-verifier-noprogress.md`). When the codex verifier has written its verdict and shows its idle UI ("Worked for 5m 36s ──"), the main polling loop's byte-stasis detection declares BLOCKED, costing 8 iterations (148 minutes).
|
|
5
|
+
> **Target version**: 0.14.1
|
|
6
|
+
> **Severity**: HIGH (as classified in the bug report).
|
|
7
|
+
> **Approved strategy**: not yet decided at the time of writing. This plan states a single recommendation and then requests approval via ExitPlanMode.
|
|
8
|
+
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
## A. Context (v0.14.1)
|
|
12
|
+
|
|
13
|
+
### Problem diagnosis
|
|
14
|
+
|
|
15
|
+
Since the v0.14.0 release, the production path is `--mode tmux` → `run_ralph_desk.zsh`. The new bug occurs in the zsh runner's main polling loop. With the codex (gpt-5.5:high) verifier, the sequence is:
|
|
16
|
+
1. The verifier finishes writing the verdict file (`<slug>-verify-verdict.json`).
|
|
17
|
+
2. The codex CLI then shows its idle UI while it waits for the next input (e.g. `─ Worked for 5m 36s ──`, `› Summarize recent commits`, `gpt-5.5 high · Context 66% left`).
|
|
18
|
+
3. The pane content stops changing, so the main loop's `check_no_progress()` (`PROGRESS_NO_CHANGE_TIMEOUT`=600s) judges the pane frozen.
|
|
19
|
+
4. A BLOCKED `verifier_dead` sentinel (category=infra_failure) is written and the campaign ends.
|
|
20
|
+
5. **At that point the verdict file was already on disk**, yet it is discarded.
|
|
21
|
+
|
|
22
|
+
The Node leader (`--mode agent`, alpha) has the same defect: `signal-poller.mjs` and `prompt-dismisser.mjs` have no codex idle pattern, so agent mode risks regressing under the same scenario.
|
|
23
|
+
|
|
24
|
+
### Explore results (file:line citations)
|
|
25
|
+
|
|
26
|
+
**zsh runner**:
|
|
27
|
+
- `src/scripts/run_ralph_desk.zsh:1437-1469`: `check_no_progress()`. Declares BLOCKED when the pane bytes stay identical for `PROGRESS_NO_CHANGE_TIMEOUT` (default 600s) or longer. **It does not check the verdict file**.
|
|
28
|
+
- `src/scripts/run_ralph_desk.zsh:1407-1427`: `check_prompt_stall()`. Checks only `_PROMPT_RE`/`_AFFORDANCE_RE`. **There is no codex idle UI pattern**.
|
|
29
|
+
- `src/scripts/run_ralph_desk.zsh:2185, 2194`: the main polling loop calls `check_prompt_stall` and `check_no_progress`.
|
|
30
|
+
- `src/scripts/run_ralph_desk.zsh:2269-2306`: codex-verifier-specific polling: the verdict file exists, jq confirms valid JSON, a 30s grace applies, and polling exits early when `pane_current_command` is back at the shell. **However, the main loop's check_no_progress runs at the same time, so BLOCKED still fires at the same 600s mark**.
|
|
31
|
+
- `src/scripts/run_ralph_desk.zsh:272` — `VERDICT_FILE="${MEMOS_DIR}/${SLUG}-verify-verdict.json"`.
|
|
32
|
+
- `src/scripts/run_ralph_desk.zsh:382, 407, 536, 560`: recognizes only the codex working/thinking/Exploring/Running/reading/searching/editing/writing keywords. The idle UI is not recognized.
|
|
33
|
+
|
|
34
|
+
**Node leader**:
|
|
35
|
+
- `src/node/runner/prompt-dismisser.mjs:18-26`: `PROMPT_RE` and `AFFORDANCE_RE` are both claude patterns. There are no codex-specific ones.
|
|
36
|
+
- `src/node/runner/prompt-detector.mjs:6-11`: claude permission signatures only.
|
|
37
|
+
- `src/node/polling/signal-poller.mjs:118-242`: `pollForSignal(signalFile, { timeoutMs })`. Throws `WorkerExitedError` when the signal file is missing and the pane has returned to the shell, and `TimeoutError` on timeout. **It polls the verdict file itself**, so it returns as soon as the file exists, even if codex is already idle after writing the verdict. Bug #3 is therefore less frequent on the Node side, but it can still regress if the file cannot be read within the 600s timeout (e.g. codex atomic-writes the verdict very late).
|
|
38
|
+
- `src/node/runner/campaign-main-loop.mjs:1449-1468`: the verifier polling call.
|
|
39
|
+
- `src/node/runner/campaign-main-loop.mjs:471-501` — `BLOCK_TAGS`: `VERIFIER_TIMEOUT`, `VERIFIER_EXITED`, `PROMPT_BLOCKED`, `PERMISSION_PROMPT`.
|
|
40
|
+
|
|
41
|
+
### Key decision (recommendation)
|
|
42
|
+
|
|
43
|
+
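**Add a two-layer defense, "check the verdict file first + recognize the codex idle UI", to both the zsh runner and the Node leader**. Apply Fix-A and Fix-B from the bug report to both paths. Fix-C (a separate `--verifier-noprogress-timeout` option) is deferred; it is unnecessary if Fix-A proves effective.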
|
|
44
|
+
|
|
45
|
+
**Rationale**:
|
|
46
|
+
- Takes effect immediately on the v0.14.0 production path (zsh), so BOS campaigns recover right away.
|
|
47
|
+
- Agent mode (Node, alpha) carries the same regression risk, so fixing both at once keeps the contract consistent.
|
|
48
|
+
- The one-line addition (verdict file existence check) is very low risk, and adding the codex idle pattern only takes one more alternation in a pattern regex.
|
|
49
|
+
- Workaround W1 (`--verifier-model sonnet`) is what BOS recommends, but it undercuts the value of codex consensus, so the fix takes priority.
|
|
50
|
+
|
|
51
|
+
---
|
|
52
|
+
|
|
53
|
+
## B. Approach (5 Phases)
|
|
54
|
+
|
|
55
|
+
### Phase 1: zsh verdict-aware no-progress + codex idle recognition (Day 1, 2 hours)
|
|
56
|
+
|
|
57
|
+
**File**: `src/scripts/run_ralph_desk.zsh`
|
|
58
|
+
|
|
59
|
+
1. Add a verdict-aware short-circuit at the entry of **`check_no_progress()` (L1437-1469)**:
|
|
60
|
+
```zsh
|
|
61
|
+
check_no_progress() {
|
|
62
|
+
  # v0.14.1 Fix-A: once the codex verifier has written its verdict and shows its idle UI,
|
|
63
|
+
  # byte-stasis is mistaken for a frozen pane. If the main verdict file is already valid JSON,
|
|
64
|
+
  # the verifier is done and the main loop's next phase (verdict harvest) handles it, so it is not frozen.
|
|
65
|
+
if [[ "${PHASE:-}" == "verifier" || "${PHASE:-}" == "final_verifier" ]] \
|
|
66
|
+
&& [[ -f "$VERDICT_FILE" ]] \
|
|
67
|
+
&& jq -e . "$VERDICT_FILE" >/dev/null 2>&1; then
|
|
68
|
+
return 0 # verdict already written; let polling loop harvest
|
|
69
|
+
fi
|
|
70
|
+
  # ... existing byte-stasis logic unchanged ...
|
|
71
|
+
}
|
|
72
|
+
```
|
|
73
|
+
- If the `PHASE` variable is not sufficiently exposed, branch on whether the current pane id is `$VERIFIER_PANE` instead.
|
|
74
|
+
- In consensus mode, also OR-check `${SLUG}-verify-verdict-claude.json` / `${SLUG}-verify-verdict-codex.json`.
|
|
75
|
+
|
|
76
|
+
2. Do **not** add the codex idle pattern to `_PROMPT_RE` / `_AFFORDANCE_RE` in **`check_prompt_stall()` (L1407-1427)**: this is an idle state, not a prompt. Add a new helper `is_codex_idle_ui()` instead:
|
|
77
|
+
```zsh
|
|
78
|
+
is_codex_idle_ui() {
|
|
79
|
+
local pane_text="$1"
|
|
80
|
+
# codex post-work idle: "─ Worked for Xm Ys ──", "› " prefix, "Context X% left"
|
|
81
|
+
echo "$pane_text" | grep -qE '─ Worked for [0-9]+m [0-9]+s ─' \
|
|
82
|
+
|| echo "$pane_text" | grep -qE 'Context [0-9]+% left'
|
|
83
|
+
}
|
|
84
|
+
```
|
|
85
|
+
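In the byte-stasis stage of `check_no_progress()`, grant an extra grace period (e.g. +120s) when the codex idle UI is detected even though the verdict file is absent. Emit a one-time stderr notice so the extended timeout is clear to the user.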
|
|
86
|
+
|
|
87
|
+
3. Keep the grace period (currently 30s) of the **codex verifier polling (L2269-2306)** as is.
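
The decision order described in Phase 1 can be sketched as a small pure function. This is an illustration in JavaScript only; the real check lives in the zsh runner, and every name below (`classifyNoProgress`, its parameters, the `'harvest'`/`'blocked'`/`'waiting'` labels) is hypothetical. The 600s and +120s values are the defaults and example quoted in the plan.

```js
// Illustrative sketch only: the real logic is zsh, and all names here are hypothetical.
const NO_CHANGE_TIMEOUT_S = 600; // mirrors the PROGRESS_NO_CHANGE_TIMEOUT default
const IDLE_GRACE_S = 120;        // the "+120s" example grace from the plan

function classifyNoProgress({ phase, verdictIsValidJson, idleUiVisible, unchangedSeconds }) {
  const inVerifierPhase = phase === 'verifier' || phase === 'final_verifier';
  // Fix-A: a valid verdict on disk means the verifier is done, never frozen.
  if (inVerifierPhase && verdictIsValidJson) return 'harvest';
  // No verdict yet: a visible codex idle UI earns extra grace before BLOCKED.
  const limit = idleUiVisible ? NO_CHANGE_TIMEOUT_S + IDLE_GRACE_S : NO_CHANGE_TIMEOUT_S;
  return unchangedSeconds >= limit ? 'blocked' : 'waiting';
}
```

The point of the ordering is that the verdict check runs before any timing comparison, so a finished verifier is never classified by how long its pane has been still.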
|
|
88
|
+
|
|
89
|
+
### Phase 2: Node leader signal-poller + prompt-dismisser hardening (Day 1, 3 hours)
|
|
90
|
+
|
|
91
|
+
**Files**: `src/node/runner/prompt-dismisser.mjs`, `src/node/polling/signal-poller.mjs`, `src/node/runner/prompt-detector.mjs`
|
|
92
|
+
|
|
93
|
+
1. In **`prompt-dismisser.mjs`**, add a separate regex for recognizing codex idle (a different category from the existing PROMPT_RE):
|
|
94
|
+
```js
|
|
95
|
+
// v0.14.1: codex post-work idle UI markers. Not a permission prompt — work
|
|
96
|
+
// is done; the CLI is just waiting for next user input. Emitting these as
|
|
97
|
+
// "prompt blocked" would be wrong — the right response is to let the
|
|
98
|
+
// verifier-side polling harvest the already-written verdict file.
|
|
99
|
+
export const CODEX_IDLE_RE = /─\s*Worked for \d+m \d+s\s*─|Context \d+% left/;
|
|
100
|
+
export function isCodexIdleUi(paneText) { return CODEX_IDLE_RE.test(paneText); }
|
|
101
|
+
```
|
|
102
|
+
|
|
103
|
+
2. **`signal-poller.mjs`** (just before the pane-shell-back branch at L210-226):
|
|
104
|
+
```js
|
|
105
|
+
// v0.14.1 Fix-A: re-read signal file once more before declaring exit.
|
|
106
|
+
// Codex may write verdict + return to idle UI almost simultaneously; if
|
|
107
|
+
// the verdict landed on disk after our last readFile, we must not
|
|
108
|
+
// misclassify the idle UI as WorkerExited.
|
|
109
|
+
try {
|
|
110
|
+
const raw = await readFile(signalFile, 'utf8');
|
|
111
|
+
const parsed = JSON.parse(raw);
|
|
112
|
+
return parsed;
|
|
113
|
+
} catch { /* still missing — fall through to existing exit logic */ }
|
|
114
|
+
```
|
|
115
|
+
The current code path already runs a readFile loop, so the difference may be small. The exact location will be decided at implementation time.
|
|
116
|
+
|
|
117
|
+
3. Add a final verdict file check **just before timeout (at the deadline comparison)** (the Node version of Bug Report Fix-A):
|
|
118
|
+
```js
|
|
119
|
+
if (Date.now() >= deadline) {
|
|
120
|
+
try {
|
|
121
|
+
const last = await readFile(signalFile, 'utf8');
|
|
122
|
+
return JSON.parse(last);
|
|
123
|
+
} catch {}
|
|
124
|
+
throw new TimeoutError(`Timed out waiting for valid JSON signal at ${signalFile}`);
|
|
125
|
+
}
|
|
126
|
+
```
|
|
127
|
+
|
|
128
|
+
4. **`prompt-detector.mjs`** is dedicated to permission prompts, so codex idle is not added there. Instead, signal-poller imports `isCodexIdleUi()` from `prompt-dismisser` and, in codex mode, extends the deadline when idle is detected (e.g. an extra 120s grace if the idle UI is visible even when the last byte change was 600s ago).
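
A minimal sketch of the idle-UI grace from item 4, assuming the regex proposed in item 1. `nextDeadline`, its parameters, and the grant-once rule are hypothetical choices made for illustration; the plan itself only says the deadline is extended when the idle UI is visible.

```js
// Regex copied from the plan's prompt-dismisser.mjs proposal.
const CODEX_IDLE_RE = /─\s*Worked for \d+m \d+s\s*─|Context \d+% left/;

function isCodexIdleUi(paneText) {
  return CODEX_IDLE_RE.test(paneText);
}

// Hypothetical helper: extend the deadline when the codex idle UI is visible.
// Assumption: the grace is granted at most once, so a pane that sits on the
// idle UI without ever producing a verdict cannot postpone the timeout forever.
function nextDeadline({ deadlineMs, paneText, graceMs = 120000, graceUsed = false }) {
  if (!graceUsed && isCodexIdleUi(paneText)) {
    return { deadlineMs: deadlineMs + graceMs, graceUsed: true };
  }
  return { deadlineMs, graceUsed };
}
```

Note that the pattern requires both a minutes and a seconds field, so a summary line such as `─ Worked for 45s ─` would not match unless a `Context N% left` footer is also present. Whether codex prints that shorter form is not confirmed here and would be worth covering with a fixture.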
|
|
129
|
+
|
|
130
|
+
### Phase 3: Tests (Day 1, 3 hours)
|
|
131
|
+
|
|
132
|
+
**New / updated**:
|
|
133
|
+
- `tests/node/test-prompt-dismisser.mjs` (new, or extend the existing file): unit tests for `CODEX_IDLE_RE` / `isCodexIdleUi()`. Verify against fixtures of real BOS capture text (≥3 cases).
|
|
134
|
+
- `tests/node/test-signal-poller.mjs` (extends us003): the "verdict written between polls + codex idle UI" scenario, where the `readFile` mock returns ENOENT on the first call and valid JSON on the second. Verifies that the last read just before timeout recovers the verdict.
|
|
135
|
+
- The existing flaky test in `tests/node/us003-signal-poller.test.mjs` is left untouched (timing flake, tracked as a separate issue).
|
|
136
|
+
- zsh side: new `tests/test_us0XX_codex_idle_no_progress.sh`. A zsh unit test that simulates mock pane text plus a verdict file and verifies that `check_no_progress()` does not fire BLOCKED when `PHASE=verifier && verdict exists`.
|
|
137
|
+
|
|
138
|
+
### Phase 4: SV gate update (Day 1, 1 hour)
|
|
139
|
+
|
|
140
|
+
**File**: extend `tests/sv-self-verify-0.14.sh` or add a new `tests/sv-self-verify-0.14.1.sh`.
|
|
141
|
+
New scenarios:
|
|
142
|
+
- L6.1 (CRITICAL) verdict-aware no-progress: with `PHASE=verifier` and the verdict file present, `check_no_progress` does not declare BLOCKED.
|
|
143
|
+
- L6.2 (CRITICAL) Node `signal-poller` last-chance verdict read: when the verdict is written just before the deadline, return the verdict instead of timing out.
|
|
144
|
+
- L6.3 (MEDIUM) `isCodexIdleUi()` unit tests PASS.
|
|
145
|
+
- L6.4 (MEDIUM) BOS-shaped capture text ("Worked for 5m 36s ──", "Context 66% left") is recognized as idle.
|
|
146
|
+
- L6.5 (LOW) v0.14.0 contract regression guard: `--mode tmux` still delegates to the zsh subprocess (re-run the us008 happy path).
|
|
147
|
+
|
|
148
|
+
### Phase 5 — Ship (Day 2, CLAUDE.md mandate)
|
|
149
|
+
|
|
150
|
+
1. self-verification gate 100% PASS.
|
|
151
|
+
2. ralplan + codex review (ralplan may be skipped if governance.md is unchanged, but it is mandatory if a paragraph on the codex idle recognition policy is added to the governance docs).
|
|
152
|
+
3. version bump 0.14.1.
|
|
153
|
+
4. CHANGELOG: "Fix codex verifier idle UI being mis-classified as no-progress; verdict-aware short-circuit in zsh runner; symmetric guard in Node signal-poller (agent mode)."
|
|
154
|
+
5. commit + push + gh release + npm publish (user approval required at each step, per CLAUDE.md `Commit & Publish Gate`).
|
|
155
|
+
6. local sync banner-aware verify.
|
|
156
|
+
|
|
157
|
+
---
|
|
158
|
+
|
|
159
|
+
## C. Preserved from v0.14.0 / v0.13.x
|
|
160
|
+
|
|
161
|
+
- v0.14.0 routing contract: `--mode tmux` → zsh subprocess. Unchanged.
|
|
162
|
+
- v0.13.0 path migration (`.rlp-desk/`).
|
|
163
|
+
- v0.13.0 prompt-detector + signal-poller permission_prompt detection.
|
|
164
|
+
- v0.13.1 detached vs attached tmux branching.
|
|
165
|
+
- agent-mode alpha labeling.
|
|
166
|
+
|
|
167
|
+
---
|
|
168
|
+
|
|
169
|
+
## D. Critical Files
|
|
170
|
+
|
|
171
|
+
```
|
|
172
|
+
src/scripts/run_ralph_desk.zsh # Phase 1 — verdict-aware check_no_progress + is_codex_idle_ui()
|
|
173
|
+
src/node/runner/prompt-dismisser.mjs # Phase 2 — CODEX_IDLE_RE + isCodexIdleUi()
|
|
174
|
+
src/node/polling/signal-poller.mjs # Phase 2 — last-chance verdict read on deadline + idle-UI grace
|
|
175
|
+
src/node/runner/prompt-detector.mjs # Phase 2: unchanged (for confirmation only)
|
|
176
|
+
src/node/runner/campaign-main-loop.mjs # almost unchanged: the pollForSignal call stays as is
|
|
177
|
+
tests/node/test-prompt-dismisser.mjs # new / extended
|
|
178
|
+
tests/node/test-signal-poller.mjs # extended
|
|
179
|
+
tests/test_us0XX_codex_idle_no_progress.sh # new (zsh unit test)
|
|
180
|
+
tests/sv-self-verify-0.14.1.sh (or extend 0.14.sh) # Phase 4
|
|
181
|
+
package.json # Phase 5 — 0.14.1
|
|
182
|
+
docs/plans/spicy-booping-galaxy.md # this file (final plan record)
|
|
183
|
+
```
|
|
184
|
+
|
|
185
|
+
Verdict file paths (reference only, unchanged):
|
|
186
|
+
- zsh: `${MEMOS_DIR}/${SLUG}-verify-verdict.json` (run_ralph_desk.zsh:272)
|
|
187
|
+
- Node: `paths.verdictFile = <deskRoot>/memos/<slug>-verify-verdict.json` (campaign-main-loop.mjs:78)
|
|
188
|
+
- Consensus: `<slug>-verify-verdict-claude.json`, `<slug>-verify-verdict-codex.json`
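
As a rough illustration of how the verdict file names listed above could feed a "verdict first" check, the sketch below builds the candidate names for a slug and tests whether any of them parses as JSON. `verdictFileNames`, `hasValidVerdict`, and the injected `readText` reader are hypothetical names, not part of the package. Whether an OR across the consensus files is the right rule is left open by the plan's own risk notes, so treat the `some` call as one possible policy.

```js
// Hypothetical helper: candidate verdict file names for a slug.
// The name patterns are the ones listed in the plan; the function itself is illustrative.
function verdictFileNames(slug, { consensus = false } = {}) {
  const main = `${slug}-verify-verdict.json`;
  if (!consensus) return [main];
  return [main, `${slug}-verify-verdict-claude.json`, `${slug}-verify-verdict-codex.json`];
}

// Hypothetical helper: true when at least one candidate holds parseable JSON.
// `readText(name)` is an injected reader that throws when the file is missing,
// which keeps this sketch free of filesystem calls.
function hasValidVerdict(names, readText) {
  return names.some((name) => {
    try {
      JSON.parse(readText(name));
      return true;
    } catch {
      return false; // missing or partially written file
    }
  });
}
```

Injecting the reader keeps the policy testable without touching disk; in a real caller it would be a thin wrapper around a synchronous file read.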
|
|
189
|
+
|
|
190
|
+
---
|
|
191
|
+
|
|
192
|
+
## E. Verification (E2E)
|
|
193
|
+
|
|
194
|
+
1. **BOS regression (CRITICAL)**: re-run the BOS Phase 1 campaign with `--worker-model sonnet --verifier-model gpt-5.5:high --consensus final-only`. Confirm that BLOCKED does not fire while the US-003 verifier shows its idle UI, that the verdict is harvested normally, and that the next iteration starts.
|
|
195
|
+
2. **Local sync**: after running `node scripts/postinstall.js`, the banner-aware diff shows 0 mismatches.
|
|
196
|
+
3. **Backward compat**:
|
|
197
|
+
- claude-only verifier (`--verifier-model opus`): no regression. The verdict-aware short-circuit does not trip either (claude's idle timing is different to begin with).
|
|
198
|
+
- `--mode agent`: no regression, since the same fix is applied.
|
|
199
|
+
- A genuinely frozen worker (no verdict + no pane change for 600s): BLOCKED still fires as before.
|
|
200
|
+
4. **Inspection session cleanup**: the user's `tmux session doul-bos-583` (worker pane `%1681`, leftover node process) is cleaned up by the user directly (`/rlp-desk clean <slug> --kill-session`) when they decide to re-run after this fix.
|
|
201
|
+
|
|
202
|
+
---
|
|
203
|
+
|
|
204
|
+
## F. Rejected alternatives
|
|
205
|
+
|
|
206
|
+
- **Fix-C (separate `--verifier-noprogress-timeout` option)**: increases the user-facing surface. Unnecessary if Fix-A proves effective. Tracked as a separate backlog item for v0.15.0+.
|
|
207
|
+
- **Making Workaround W1 permanent (allow only claude as verifier)**: loses the value of codex consensus. Rejected.
|
|
208
|
+
- **Fix agent mode only, defer tmux**: the production path is zsh, so BOS would not recover. Rejected.
|
|
209
|
+
- **Add codex idle to prompt-detector**: the detector is dedicated to permission prompts, so this would be a semantic mismatch. A separate function on the dismisser side is the right fit.
|
|
210
|
+
|
|
211
|
+
---
|
|
212
|
+
|
|
213
|
+
## G. Risk Notes
|
|
214
|
+
|
|
215
|
+
- Confirm that the `PHASE` variable is exposed throughout the zsh runner (if not, switch the branch to the verifier pane id).
|
|
216
|
+
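- In consensus mode, the OR check over the two verdict files must not short-circuit at the point where only a single verifier has written its verdict. Verify this explicitly during Phase 1 implementation.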
|
|
217
|
+
- Confirm that the `jq -e .` check produces no false positive on a verdict that codex is still partially writing (after file creation but before the atomic mv). `atomic_write` already uses mv, so this should be safe.
|
|
218
|
+
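- Confirm that the Node-side last-chance read also works under a race condition (another process writes the file just before the deadline). `fs.readFile` by itself does not guarantee an atomic view of a file that is being written, so this relies on the writer publishing the verdict via an atomic rename; a partially written file would fail `JSON.parse` and fall through to the timeout.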
|
|
219
|
+
|
|
220
|
+
---
|
|
221
|
+
|
|
222
|
+
# v0.14.0 (HISTORY) — Restore zsh as primary tmux runner
|
|
223
|
+
|
|
224
|
+
> **Status**: SHIPPED 2026-05-03. The v0.14.0 npm + GitHub release has been published.
|
|
225
|
+
> **Top-priority plan. The v0.13.0 plan below is kept as history reference.**
|
|
226
|
+
> **Trigger**: user assessment: "rlp-desk is unusable junk, at an uncontrollable level". The v0.13.0/v0.13.1 fixes were only the tip of the iceberg.
|
|
227
|
+
> **Target version**: 0.14.0
|
|
228
|
+
> **Approved strategy**: Path A (zsh restoration as tmux primary; the Node leader handles only `--mode agent`)
|
|
229
|
+
|
|
230
|
+
---
|
|
231
|
+
|
|
232
|
+
## A. Context (v0.14.0)
|
|
233
|
+
|
|
234
|
+
### Problem diagnosis
|
|
235
|
+
|
|
236
|
+
From the Node port on 2026-04-12 through v0.13.x, **the Node leader shipped without 11 of the zsh runner's core safety nets**. The user (BOS) assessment was "uncontrollable". The v0.13.0/v0.13.1 patches resolved only these two:
|
|
237
|
+
1. `.claude/` sensitive prompt hang
|
|
238
|
+
2. detached session UX regression
|
|
239
|
+
|
|
240
|
+
**Still missing** (file:line citations):
|
|
241
|
+
| # | Feature | zsh location | Node status |
|
|
242
|
+
|---|------|--------------|----------|
|
|
243
|
+
| 1 | Copy-mode guarded send-keys | `safe_send_keys` L976-1083 | None (pane-manager.mjs:50-53 plain send-keys) |
|
|
244
|
+
| 2 | Periodic heartbeat write + staleness detection | L1735-1750, L1158 | None |
|
|
245
|
+
| 3 | No-progress 10-minute byte-stasis detection | `check_no_progress` L2372-2420 | None |
|
|
246
|
+
| 4 | Prompt-stall 5-minute timeout | `check_prompt_stall` L2298-2370 | None |
|
|
247
|
+
| 5 | Stale-context 3 consecutive unchanged iter | `check_stale_context` lib L1162-1179 | None |
|
|
248
|
+
| 6 | Claude model upgrade chain (haiku→sonnet→opus) | `get_next_model` lib L136-155 | None (codex chain only) |
|
|
249
|
+
| 7 | LOCK_WORKER_MODEL flag handling | lib L197 | None |
|
|
250
|
+
| 8 | Codex update prompt auto-dismiss | L1007-1011 | None |
|
|
251
|
+
| 9 | Pane lifecycle cleanup | `cleanup_panes` L3310-3320 | None (only the `_ensureTerminalSentinel` part) |
|
|
252
|
+
| 10 | Graceful detection of user pane-kill | L134-150 | None |
|
|
253
|
+
| 11 | Cleanup trap (C-c → /exit → kill-pane) | L1864-2014 | Partial |
|
|
254
|
+
|
|
255
|
+
### Key decision
|
|
256
|
+
|
|
257
|
+
**Restore zsh as the primary path for tmux mode; the Node leader handles only `--mode agent` (LLM-driven orchestration)**.
|
|
258
|
+
|
|
259
|
+
**Rationale**:
|
|
260
|
+
- The zsh runner was production-proven for 6+ weeks before its v0.12.0 deprecation.
|
|
261
|
+
- Porting all 11 features missing from the Node port = 5-7 days + new regression risk. Restoring zsh = 1-2 days + proven code.
|
|
262
|
+
- The Node leader's unique value is in agent mode, where the LLM spawns the worker/verifier; zsh is better at mechanical tmux orchestration.
|
|
263
|
+
- Immediate recovery for users comes first.
|
|
264
|
+
|
|
265
|
+
---
|
|
266
|
+
|
|
267
|
+
## B. Approach (6 Phases)
|
|
268
|
+
|
|
269
|
+
### Phase 1: Lift the zsh deprecation gate (Day 1, 2 hours)
|
|
270
|
+
|
|
271
|
+
**File**: `src/scripts/run_ralph_desk.zsh`
|
|
272
|
+
- L69-90: remove the `--flywheel`/`--with-self-verification`/`--flywheel-guard` hard-reject block. zsh honors these flags again.
|
|
273
|
+
- L91: remove the "deprecated" message.
|
|
274
|
+
- Update `RALPH_DESK_VERSION` to 0.14.0.
|
|
275
|
+
|
|
276
|
+
### Phase 2: Route Node `--mode tmux` → zsh subprocess (Day 1, 4 hours)
|
|
277
|
+
|
|
278
|
+
**File**: `src/node/run.mjs`
|
|
279
|
+
- After `parseRunOptions()`, branch on `mode === 'tmux'` at the entry of `runRunCommand()`:
|
|
280
|
+
  - Check that `~/.claude/ralph-desk/run_ralph_desk.zsh` exists (postinstall guarantees the sync).
|
|
281
|
+
  - Convert all options to env vars (`LOOP_NAME`, `WORKER_MODEL`, `FLYWHEEL`, `FLYWHEEL_GUARD`, `WITH_SELF_VERIFICATION`, `MAX_ITER`, `ITER_TIMEOUT`, `CB_THRESHOLD`, `CONSENSUS_*`, `LOCK_WORKER_MODEL`, etc.).
|
|
282
|
+
  - Delegate via `child_process.spawn('zsh', [zshPath], { env, stdio: 'inherit' })`.
|
|
283
|
+
  - Propagate the exit code unchanged.
|
|
284
|
+
- Keep the legacy detection (`detectLegacyDeskInRunMode`) call before the zsh spawn.
|
|
285
|
+
- Keep the claude+tmux warning (zsh applies it in the same worker engine branch).
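
The option-to-env conversion in the list above could look roughly like the sketch below. The env var names on the right come from the plan; the option property names on the left, the `buildZshEnv` helper, and the `"1"` encoding for booleans are assumptions for illustration, not the shipped `run.mjs` behavior.

```js
// Left: hypothetical parsed-option names. Right: env var names quoted in the plan.
// Assumption: LOOP_NAME carries the campaign slug.
const ENV_MAP = {
  slug: 'LOOP_NAME',
  workerModel: 'WORKER_MODEL',
  maxIter: 'MAX_ITER',
  iterTimeout: 'ITER_TIMEOUT',
  cbThreshold: 'CB_THRESHOLD',
  lockWorkerModel: 'LOCK_WORKER_MODEL',
};

// Hypothetical helper: build the environment handed to the zsh subprocess.
function buildZshEnv(options, baseEnv = {}) {
  const env = { ...baseEnv };
  for (const [key, name] of Object.entries(ENV_MAP)) {
    const value = options[key];
    // Unset and false options are simply not exported.
    if (value === undefined || value === null || value === false) continue;
    // Assumption: true is passed as "1"; the real runner may expect another encoding.
    env[name] = value === true ? '1' : String(value);
  }
  return env;
}
```

The returned object would then be passed as the `env` field of the `child_process.spawn('zsh', [zshPath], { env, stdio: 'inherit' })` call named in the plan. A table-driven map like this also makes the "one assertion per supported flag" test straightforward to write.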
|
|
286
|
+
|
|
287
|
+
**File**: `src/node/runner/campaign-main-loop.mjs`
|
|
288
|
+
- Guard at the entry of `run()` when `options.mode === 'tmux'`: `throw new Error('tmux mode is delegated to zsh — invoke via run.mjs router')`. This marks the path as dead code and prevents regressions.
|
|
289
|
+
|
|
290
|
+
### Phase 3: postinstall + install.sh always sync zsh (Day 1, 1 hour)
|
|
291
|
+
|
|
292
|
+
**File**: `scripts/postinstall.js`
|
|
293
|
+
- The `legacyFiles` array currently deletes the 3 zsh files → **change to keep and sync**.
|
|
294
|
+
- Add `src/scripts/{init,run,lib}_ralph_desk.zsh` → `~/.claude/ralph-desk/` to `runtimeSources` (or place them under `scripts/`; decide for consistency with install.sh).
|
|
295
|
+
- banner-aware sync.
|
|
296
|
+
|
|
297
|
+
**File**: `tests/node/us008-cli-entrypoint.test.mjs:47`
|
|
298
|
+
- Invert the existing "removes legacy zsh scripts" test: verify that the 3 zsh files exist after install and are spawnable.
|
|
299
|
+
|
|
300
|
+
### Phase 4: Label Node `--mode agent` (Day 2, 2 hours)
|
|
301
|
+
|
|
302
|
+
**Files**: `src/node/run.mjs`, `src/commands/rlp-desk.md`, `README.md`
|
|
303
|
+
- On entering `--mode agent`, emit a stderr warning: "agent mode is alpha — for production use --mode tmux".
|
|
304
|
+
- README mode table:
|
|
305
|
+
- `tmux` (stable, zsh-backed)
|
|
306
|
+
- `agent` (alpha, Node-native)
|
|
307
|
+
- The v0.13.0/0.13.1 fixes (`.claude/` migration, prompt-detector, claude+tmux warning) all remain in agent mode, preserving the Node-only value.
|
|
308
|
+
|
|
309
|
+
### Phase 5: Verification scenarios (Day 2-3, 1 day)
|
|
310
|
+
|
|
311
|
+
**Self-verification gate (`tests/sv-self-verify-0.14.sh`)**: v0.13 scenarios plus new ones:
|
|
312
|
+
- L5.1 (CRITICAL) BOS regression: claude worker + tmux mode completes 1 iteration (real tmux session created, kill-session cleanup).
|
|
313
|
+
- L5.2 (CRITICAL) zsh subprocess routing: mock-verify that a `--mode tmux` call is delegated to `child_process.spawn('zsh', ...)`.
|
|
314
|
+
- L5.3 (CRITICAL) flag → env var conversion assertions (one per supported flag).
|
|
315
|
+
- L5.4 (MEDIUM) verify removal of the zsh deprecation gate (L69-90).
|
|
316
|
+
- L5.5 (MEDIUM) postinstall installs the 3 zsh files (us008 inverted).
|
|
317
|
+
- L5.6 (MEDIUM) verify the `--mode agent` warning output.
|
|
318
|
+
- L5.7 (UX) inside an attached tmux, the leader+worker+verifier panes appear in the user's current window (zsh L815-823 behavior).
|
|
319
|
+
|
|
320
|
+
### Phase 6 — Ship (Day 3)
|
|
321
|
+
|
|
322
|
+
Follow the CLAUDE.md release workflow as is:
|
|
323
|
+
1. self-verification gate 17/17 PASS.
|
|
324
|
+
2. ralplan + codex review (existing mandate).
|
|
325
|
+
3. version bump 0.14.0.
|
|
326
|
+
4. CHANGELOG: "Restore zsh as primary tmux runner. Node tmux delegates to validated zsh codepath. Node agent mode marked alpha."
|
|
327
|
+
5. commit + push + gh release + npm publish.
|
|
328
|
+
6. local sync banner-aware verify.
|
|
329
|
+
|
|
330
|
+
---
|
|
331
|
+
|
|
332
|
+
## C. Preserved from v0.13.x
|
|
333
|
+
|
|
334
|
+
- `.claude/ralph-desk/` → `.rlp-desk/` path migration (auto-mv in init mode, guidance in run mode); zsh already reflects this as of v0.13.0.
|
|
335
|
+
- `RLP_DESK_RUNTIME_DIR` env override.
|
|
336
|
+
- prompt-detector + signal-poller permission_prompt detection (agent mode only).
|
|
337
|
+
- BLOCK_TAGS.PERMISSION_PROMPT constant.
|
|
338
|
+
- claude+tmux warning (on run.mjs entry).
|
|
339
|
+
|
|
340
|
+
---
|
|
341
|
+
|
|
342
|
+
## D. v0.15.0+ incremental port backlog (deferred)
|
|
343
|
+
|
|
344
|
+
**P0 (within 2 weeks, for agent mode parity)**:
|
|
345
|
+
- heartbeat mechanism
|
|
346
|
+
- copy-mode guarded send-keys
|
|
347
|
+
- prompt-stall 5-minute timeout
|
|
348
|
+
- no-progress 10-minute byte-stasis
|
|
349
|
+
|
|
350
|
+
**P1**:
|
|
351
|
+
- stale-context detection
|
|
352
|
+
- claude model upgrade chain (haiku→sonnet→opus)
|
|
353
|
+
- LOCK_WORKER_MODEL flag
|
|
354
|
+
|
|
355
|
+
**P2**:
|
|
356
|
+
- codex update prompt auto-dismiss
|
|
357
|
+
- pane lifecycle cleanup
|
|
358
|
+
- user-kill graceful detection
|
|
359
|
+
- cleanup trap full parity
|
|
360
|
+
|
|
361
|
+
---
|
|
362
|
+
|
|
363
|
+
## E. Critical Files
|
|
364
|
+
|
|
365
|
+
```
|
|
366
|
+
src/scripts/run_ralph_desk.zsh # Phase 1: lift the deprecation gate
|
|
367
|
+
src/scripts/lib_ralph_desk.zsh # unchanged (zsh helpers as is)
|
|
368
|
+
src/scripts/init_ralph_desk.zsh # unchanged (v0.13.0 migration as is)
|
|
369
|
+
src/node/run.mjs # Phase 2: tmux mode router
|
|
370
|
+
src/node/runner/campaign-main-loop.mjs # Phase 2: tmux mode guard
|
|
371
|
+
scripts/postinstall.js # Phase 3: restore zsh sync
|
|
372
|
+
tests/node/us008-cli-entrypoint.test.mjs # Phase 3 — invert
|
|
373
|
+
src/commands/rlp-desk.md # Phase 4: update the mode table
|
|
374
|
+
README.md # Phase 4: stable/alpha labels
|
|
375
|
+
package.json # Phase 6 — 0.14.0
|
|
376
|
+
tests/sv-self-verify-0.14.sh # Phase 5: new SV gate
|
|
377
|
+
```
|
|
378
|
+
|
|
379
|
+
---
|
|
380
|
+
|
|
381
|
+
## F. Verification (E2E)
|
|
382
|
+
|
|
383
|
+
1. **BOS regression (CRITICAL)**: in the BOS project worktree, run `node ~/.claude/ralph-desk/node/run.mjs run bos-phase-1 --mode tmux --worker-model sonnet --max-iter 1 --iter-timeout 600` → tmux session created, leader/worker/verifier panes split in the user's current window, 1-iteration sentinel write succeeds, no permission prompt hang.
|
|
384
|
+
2. **Local sync**: after `npm install`, `~/.claude/ralph-desk/run_ralph_desk.zsh` exists and carries the banner.
|
|
385
|
+
3. **Backward compat**: for v0.13.x mid-campaign users, the `mv .claude/ralph-desk .rlp-desk` guidance is unchanged.
|
|
386
|
+
|
|
387
|
+
---
|
|
388
|
+
|
|
389
|
+
## G. Rejected alternatives
|
|
390
|
+
|
|
391
|
+
- **B (full Node port)**: 11 features × 3 hours on average + parity regression tests = 5-7 days + risk of new bugs. Deferred to the v0.15.0+ incremental port.
|
|
392
|
+
- **C (labeling only)**: an experimental label alone cannot immediately resolve the "uncontrollable" assessment. Absorbed into Path A.
|
|
393
|
+
|
|
394
|
+
---
|
|
395
|
+
|
|
396
|
+
# v0.13.0 (HISTORY): fix the Claude worker `.claude/` sensitive prompt hang
|
|
2
397
|
|
|
3
398
|
> **Source bug report**: `/Users/kyjin/dev/doul/bos/docs/exec-plans/active/2026-05-01-rlp-desk-bug-report.md`
|
|
4
399
|
> **Severity**: HIGH (every campaign blocks under the `--mode tmux` + `--worker-model sonnet/haiku/opus` combination)
|
|
5
|
-
> **Target version**: 0.13.0 (breaking: project-local sentinel path moved)
|
|
400
|
+
> **Target version**: 0.13.0 (breaking: project-local sentinel path moved). **SHIPPED**, but coverage was a sliver of the real failure surface (see v0.14.0 plan above).
|
|
6
401
|
|
|
7
402
|
---
|
|
8
403
|
|
package/package.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@ai-dev-methodologies/rlp-desk",
|
|
3
|
-
"version": "0.
|
|
4
|
-
"description": "Fresh-context iterative loops for Claude Code
|
|
3
|
+
"version": "0.14.1",
|
|
4
|
+
"description": "Fresh-context iterative loops for Claude Code — autonomous task completion with independent verification",
|
|
5
5
|
"scripts": {
|
|
6
6
|
"postinstall": "node scripts/postinstall.js",
|
|
7
7
|
"uninstall": "node scripts/uninstall.js",
|
|
@@ -46,4 +46,4 @@
|
|
|
46
46
|
"engines": {
|
|
47
47
|
"node": ">=16"
|
|
48
48
|
}
|
|
49
|
-
}
|
|
49
|
+
}
|
package/scripts/postinstall.js
CHANGED
|
@@ -19,6 +19,14 @@ const runtimeSources = [
|
|
|
19
19
|
["src/model-upgrade-table.md", path.join(deskDir, "model-upgrade-table.md")],
|
|
20
20
|
["README.md", path.join(deskDir, "README.md")],
|
|
21
21
|
["install.sh", path.join(deskDir, "install.sh")],
|
|
22
|
+
// v0.14.0: zsh runner is the canonical --mode tmux backend again.
|
|
23
|
+
// src/node/run.mjs spawns it as a subprocess for tmux mode invocations.
|
|
24
|
+
// injectBannerAndLock preserves the shebang and adds a `# DO NOT EDIT`
|
|
25
|
+
// line on line 2 so the verification scripts in CLAUDE.md still
|
|
26
|
+
// recognize the file as installed.
|
|
27
|
+
["src/scripts/init_ralph_desk.zsh", path.join(deskDir, "init_ralph_desk.zsh")],
|
|
28
|
+
["src/scripts/run_ralph_desk.zsh", path.join(deskDir, "run_ralph_desk.zsh")],
|
|
29
|
+
["src/scripts/lib_ralph_desk.zsh", path.join(deskDir, "lib_ralph_desk.zsh")],
|
|
22
30
|
// v5.7 §4.15: all rlp-desk docs (user-facing + dev meta) under docs/rlp-desk/.
|
|
23
31
|
["docs/rlp-desk/architecture.md", path.join(docsDir, "rlp-desk", "architecture.md")],
|
|
24
32
|
["docs/rlp-desk/getting-started.md", path.join(docsDir, "rlp-desk", "getting-started.md")],
|
|
@@ -26,11 +34,12 @@ const runtimeSources = [
|
|
|
26
34
|
["docs/rlp-desk/TODO-verification-next.md", path.join(docsDir, "rlp-desk", "TODO-verification-next.md")],
|
|
27
35
|
["docs/rlp-desk/multi-mission-orchestration.md", path.join(docsDir, "rlp-desk", "multi-mission-orchestration.md")],
|
|
28
36
|
];
|
|
29
|
-
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
|
|
33
|
-
|
|
37
|
+
// v0.14.0: legacy-deletion list cleared. The Node-canonical era (v5.7+)
|
|
38
|
+
// removed zsh after install; v0.14.0 reverts that — the zsh runner is the
|
|
39
|
+
// production --mode tmux path. Keep the empty array so the surrounding loop
|
|
40
|
+
// (`for (const legacyFile of legacyFiles)`) remains a no-op without churning
|
|
41
|
+
// the call site.
|
|
42
|
+
const legacyFiles = [];
|
|
34
43
|
|
|
35
44
|
function getNodeVersion() {
|
|
36
45
|
return process.env.RLP_DESK_NODE_VERSION_OVERRIDE || process.version;
|
package/scripts/uninstall.js
CHANGED
|
@@ -20,6 +20,11 @@ const files = [
|
|
|
20
20
|
path.join(deskDir, "model-upgrade-table.md"),
|
|
21
21
|
path.join(deskDir, "README.md"),
|
|
22
22
|
path.join(deskDir, "install.sh"),
|
|
23
|
+
// v0.14.0: zsh tmux runner is part of the install set again — clean it up
|
|
24
|
+
// on uninstall so users do not end up with orphaned 0o444 files.
|
|
25
|
+
path.join(deskDir, "init_ralph_desk.zsh"),
|
|
26
|
+
path.join(deskDir, "run_ralph_desk.zsh"),
|
|
27
|
+
path.join(deskDir, "lib_ralph_desk.zsh"),
|
|
23
28
|
];
|
|
24
29
|
|
|
25
30
|
for (const targetPath of files) {
|
package/src/commands/rlp-desk.md
CHANGED
|
@@ -278,9 +278,18 @@ Cross-project aggregation: scan `~/.claude/ralph-desk/analytics/` and read each
|
|
|
278
278
|
|
|
279
279
|
Parse the `--mode` flag. If absent or `agent`, use the Agent() path below. If `tmux`, use the Tmux path.
|
|
280
280
|
|
|
281
|
+
> **v0.14.0 stability tiers:**
|
|
282
|
+
> - `--mode tmux` is the **stable, production** path. The Node leader (`run.mjs`)
|
|
283
|
+
> now routes tmux invocations to `~/.claude/ralph-desk/run_ralph_desk.zsh`
|
|
284
|
+
> as a subprocess — that runner has the full safety net (heartbeat,
|
|
285
|
+
> copy-mode guard, prompt-stall, no-progress detection, claude model
|
|
286
|
+
> upgrade chain). Recommend this for autonomous campaigns.
|
|
287
|
+
> - `--mode agent` is **alpha** (Node-native LLM-driven Leader). The runner
|
|
288
|
+
> emits a stderr warning when this mode is invoked.
|
|
289
|
+
|
|
281
290
|
#### Tmux Mode (`--mode tmux`)
|
|
282
291
|
|
|
283
|
-
When `--mode tmux` is specified (v0.
|
|
292
|
+
When `--mode tmux` is specified (v0.14.0+: `run.mjs` accepts the same flags as before but spawns `run_ralph_desk.zsh` as a subprocess and inherits stdio. Flywheel and self-verification flags are not honored under tmux mode — they require `--mode agent`):
|
|
284
293
|
|
|
285
294
|
1. **Validate scaffold** — same as Agent() mode: check `.rlp-desk/prompts/<slug>.worker.prompt.md` etc.
|
|
286
295
|
2. **Check sentinels** — same as Agent() mode.
|
|
@@ -322,9 +331,8 @@ node ~/.claude/ralph-desk/node/run.mjs run '<slug>' \
|
|
|
322
331
|
- MUST launch with `run_in_background: true` so `/rlp-desk` returns control immediately while preserving live tmux visibility.
|
|
323
332
|
- Run-in-background is used so the shell can keep the command visible and keep the pane layout stable for status checks and completion flow.
|
|
324
333
|
- Do NOT kill panes after completion. Panes stay alive for inspection. User cleans up with `/rlp-desk clean <slug> --kill-session`.
|
|
325
|
-
- `--with-self-verification`
|
|
326
|
-
-
|
|
327
|
-
- Legacy `zsh ~/.claude/ralph-desk/run_ralph_desk.zsh` (deprecated in 0.12.0) still runs for non-flywheel/non-SV invocations but emits a deprecation `[notice]`. Calling it with `FLYWHEEL` or `WITH_SELF_VERIFICATION` env vars exits 2 with a migration banner pointing to the Node leader.
|
|
334
|
+
- v0.14.0: `--with-self-verification`, `--flywheel`, and `--flywheel-guard` are **not honored** under `--mode tmux` — the zsh runner has no SV/flywheel implementation. The Node leader emits a stderr WARNING listing the dropped flags. For SV/flywheel, use `--mode agent` (alpha).
|
|
335
|
+
- The slash command always invokes `node ~/.claude/ralph-desk/node/run.mjs run --mode tmux ...`. Do NOT invoke `~/.claude/ralph-desk/run_ralph_desk.zsh` directly — the Node router resolves the runner path, runs legacy detection, and surfaces actionable errors when the runner is missing.
|
|
328
336
|
|
|
329
337
|
**tmux UX model (5 items):**
|
|
330
338
|
- The session returns immediately after launch (`run_in_background: true`) so the command returns control to the parent CLI.
|
|
@@ -333,7 +341,13 @@ node ~/.claude/ralph-desk/node/run.mjs run '<slug>' \
|
|
|
333
341
|
- On completion, the command returns a completion notification before the loop ends.
|
|
334
342
|
- Agent mode remains unchanged, and no tmux-specific behavior is mixed into Agent mode.
|
|
335
343
|
|
|
336
|
-
#### Agent Mode (`--mode agent` or default)
|
|
344
|
+
#### Agent Mode (`--mode agent` or default — **alpha**)
|
|
345
|
+
|
|
346
|
+
> **v0.14.0:** Agent mode is the alpha LLM-driven path. The Node port shipped
|
|
347
|
+
> without zsh-equivalent safety nets (heartbeat, copy-mode guard, prompt-stall
|
|
348
|
+
> timeout, no-progress detection, claude model upgrade chain). The runner
|
|
349
|
+
> emits a stderr WARNING when agent mode is invoked. For production
|
|
350
|
+
> autonomous campaigns, prefer `--mode tmux`.
|
|
337
351
|
|
|
338
352
|
**Why Agent mode is structurally immune to Bug 4/5 (mid-execution prompt hang
|
|
339
353
|
& A4 premature dispatch):** Worker/Verifier are dispatched as `Agent(...,
|
package/src/node/polling/signal-poller.mjs
CHANGED
|
@@ -238,5 +238,19 @@ export async function pollForSignal(
|
|
|
238
238
|
await delay(pollIntervalMs);
|
|
239
239
|
}
|
|
240
240
|
|
|
241
|
+
// v0.14.1: last-chance verdict read before declaring timeout. Codex CLI
|
|
242
|
+
// can finish work + atomic-mv the verdict + return to its idle UI all
|
|
243
|
+
// within a single poll interval; if our previous readFile happened to
|
|
244
|
+
// race with the rename, we would have seen ENOENT/SyntaxError. Try once
|
|
245
|
+
// more synchronously before throwing — the file is now either present
|
|
246
|
+
// and parseable (success) or genuinely missing (real timeout).
|
|
247
|
+
// Bug Report #3 (BOS 2026-05-04).
|
|
248
|
+
try {
|
|
249
|
+
const rawContent = await readFile(signalFile);
|
|
250
|
+
return JSON.parse(rawContent);
|
|
251
|
+
} catch {
|
|
252
|
+
// fall through to TimeoutError
|
|
253
|
+
}
|
|
254
|
+
|
|
241
255
|
throw new TimeoutError(`Timed out waiting for valid JSON signal at ${signalFile}`);
|
|
242
256
|
}
|
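The last-chance read added above can be modeled in isolation. The following is a minimal sketch, not the package's code: `pollWithLastChance` and the injected `readRaw` are hypothetical names, and the loop is synchronous to keep the example short (the real `pollForSignal` awaits `readFile` and sleeps between polls).

```javascript
// Hypothetical, synchronous model of the polling loop.
// `readRaw` stands in for reading the signal file and may throw (e.g. ENOENT).
function pollWithLastChance(readRaw, maxPolls) {
  const tryParse = () => {
    try {
      return { ok: true, value: JSON.parse(readRaw()) };
    } catch {
      return { ok: false };
    }
  };

  for (let i = 0; i < maxPolls; i += 1) {
    const attempt = tryParse();
    if (attempt.ok) return attempt.value;
  }

  // Last-chance read: the writer may have renamed the verdict into place
  // after the final regular poll failed.
  const last = tryParse();
  if (last.ok) return last.value;
  throw new Error('Timed out waiting for valid JSON signal');
}

// Simulated race: the file only becomes readable on the fourth read.
let reads = 0;
const lateReader = () => {
  reads += 1;
  if (reads <= 3) throw new Error('ENOENT');
  return '{"verdict":"pass"}';
};
console.log(pollWithLastChance(lateReader, 3)); // { verdict: 'pass' }
```

With three regular polls the simulated reader fails each time, and only the extra read after the loop picks up the verdict, which is the situation the code comment describes.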
package/src/node/run.mjs
CHANGED
|
@@ -1,9 +1,15 @@
|
|
|
1
|
+
import fs from 'node:fs';
|
|
2
|
+
import os from 'node:os';
|
|
1
3
|
import path from 'node:path';
|
|
4
|
+
import { spawn } from 'node:child_process';
|
|
2
5
|
import { fileURLToPath } from 'node:url';
|
|
3
6
|
|
|
4
7
|
import { initCampaign } from './init/campaign-initializer.mjs';
|
|
5
8
|
import { readStatus } from './reporting/campaign-reporting.mjs';
|
|
6
|
-
import {
|
|
9
|
+
import {
|
|
10
|
+
run as runCampaignMain,
|
|
11
|
+
detectLegacyDeskInRunMode,
|
|
12
|
+
} from './runner/campaign-main-loop.mjs';
|
|
7
13
|
import { isClaudeEngine } from './cli/command-builder.mjs';
|
|
8
14
|
|
|
9
15
|
const RUN_DEFAULTS = {
|
|
@@ -211,6 +217,102 @@ async function runStatusCommand(args, deps) {
|
|
|
211
217
|
return 0;
|
|
212
218
|
}
|
|
213
219
|
|
|
220
|
+
// v0.14.0: Default location of the zsh runner installed by postinstall.js
|
|
221
|
+
// (Phase 3 of the v0.14.0 plan re-enables this sync). Overridable via
|
|
222
|
+
// RLP_DESK_ZSH_RUNNER for development checkouts that point to src/scripts.
|
|
223
|
+
function defaultZshRunnerPath() {
|
|
224
|
+
return (
|
|
225
|
+
process.env.RLP_DESK_ZSH_RUNNER
|
|
226
|
+
|| path.join(os.homedir(), '.claude', 'ralph-desk', 'run_ralph_desk.zsh')
|
|
227
|
+
);
|
|
228
|
+
}
|
|
229
|
+
|
|
230
|
+
// v0.14.0: convert parsed CLI options to env vars consumed by run_ralph_desk.zsh.
|
|
231
|
+
// Names mirror the variables declared in src/scripts/run_ralph_desk.zsh
|
|
232
|
+
// (LOOP_NAME, ROOT, WORKER_MODEL, VERIFIER_MODEL, FINAL_VERIFIER_MODEL,
|
|
233
|
+
// MAX_ITER, ITER_TIMEOUT, CB_THRESHOLD, VERIFY_MODE, CONSENSUS_MODE,
|
|
234
|
+
// CONSENSUS_MODEL, FINAL_CONSENSUS_MODEL, LOCK_WORKER_MODEL, AUTONOMOUS_MODE,
|
|
235
|
+
// LANE_MODE, TEST_DENSITY_MODE).
|
|
236
|
+
function buildZshEnv(slug, options, parentEnv) {
|
|
237
|
+
return {
|
|
238
|
+
...parentEnv,
|
|
239
|
+
LOOP_NAME: slug,
|
|
240
|
+
ROOT: options.rootDir,
|
|
241
|
+
WORKER_MODEL: options.workerModel,
|
|
242
|
+
VERIFIER_MODEL: options.verifierModel,
|
|
243
|
+
FINAL_VERIFIER_MODEL: options.finalVerifierModel,
|
|
244
|
+
MAX_ITER: String(options.maxIterations),
|
|
245
|
+
ITER_TIMEOUT: String(options.iterTimeout),
|
|
246
|
+
CB_THRESHOLD: String(options.cbThreshold),
|
|
247
|
+
VERIFY_MODE: options.verifyMode,
|
|
248
|
+
CONSENSUS_MODE: options.consensusMode,
|
|
249
|
+
CONSENSUS_MODEL: options.consensusModel,
|
|
250
|
+
FINAL_CONSENSUS_MODEL: options.finalConsensusModel,
|
|
251
|
+
LOCK_WORKER_MODEL: options.lockWorkerModel ? '1' : '0',
|
|
252
|
+
AUTONOMOUS_MODE: options.autonomous ? '1' : '0',
|
|
253
|
+
LANE_MODE: options.laneStrict ? 'strict' : 'warn',
|
|
254
|
+
TEST_DENSITY_MODE: options.testDensityStrict ? 'strict' : 'warn',
|
|
255
|
+
};
|
|
256
|
+
}
|
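One property of the mapping above worth keeping in mind: `String(undefined)` evaluates to the literal text `"undefined"`, so a numeric option that is unset would reach the zsh runner as that string. Whether this can happen depends on how the parsed options are defaulted, which this hunk does not show. The sketch below is a hypothetical hardening, not part of the package; `toEnvStrings` and the sample values are invented for illustration.

```javascript
// Hypothetical helper (not in the package): stringify only the options that are set.
function toEnvStrings(pairs) {
  const env = {};
  for (const [name, value] of Object.entries(pairs)) {
    if (value === undefined || value === null) continue; // leave the variable unset
    env[name] = typeof value === 'boolean' ? (value ? '1' : '0') : String(value);
  }
  return env;
}

// Sample values are invented for illustration.
const sample = toEnvStrings({
  LOOP_NAME: 'demo-slug',
  MAX_ITER: 5,
  LOCK_WORKER_MODEL: true,
  ITER_TIMEOUT: undefined,
});
console.log(sample); // ITER_TIMEOUT is absent rather than the string "undefined"
```

Leaving a variable unset lets the runner fall back to its own default, assuming it defines one for that name.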
|
257
|
+
|
|
258
|
+
// v0.14.0: default tmux-mode delegate. Spawns the zsh runner inheriting stdio
|
|
259
|
+
// so the operator sees pane orchestration in real time. Resolves with the
|
|
260
|
+
// child exit code (or 1 on spawn error) to keep the caller deterministic.
|
|
261
|
+
function defaultSpawnZsh(zshPath, env, cwd) {
|
|
262
|
+
return new Promise((resolve) => {
|
|
263
|
+
const child = spawn('zsh', [zshPath], { env, stdio: 'inherit', cwd });
|
|
264
|
+
child.on('error', (err) => {
|
|
265
|
+
process.stderr.write(`failed to spawn zsh runner: ${err.message}\n`);
|
|
266
|
+
resolve(1);
|
|
267
|
+
});
|
|
268
|
+
child.on('exit', (code, signal) => {
|
|
269
|
+
if (signal) {
|
|
270
|
+
resolve(128 + (typeof signal === 'string' ? 0 : signal));
|
|
271
|
+
return;
|
|
272
|
+
}
|
|
273
|
+
resolve(typeof code === 'number' ? code : 0);
|
|
274
|
+
});
|
|
275
|
+
});
|
|
276
|
+
}
|
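A note on the signal branch above: Node's `exit` event reports `signal` as a name string such as `'SIGTERM'` (or `null`), so the expression `typeof signal === 'string' ? 0 : signal` appears to resolve `128` for every signal. If the shell-style `128 + signo` code is wanted, the name can be looked up in a table such as `os.constants.signals`. The sketch below is a hypothetical alternative, with `exitCodeFor` and the small `table` invented for illustration.

```javascript
// Hypothetical helper: map a child's exit event to a shell-style exit code.
// `signalTable` is expected to look like os.constants.signals ({ SIGTERM: 15, ... }).
function exitCodeFor(code, signal, signalTable) {
  if (signal) {
    const signo = signalTable[signal];
    return 128 + (typeof signo === 'number' ? signo : 0);
  }
  return typeof code === 'number' ? code : 0;
}

const table = { SIGINT: 2, SIGTERM: 15 }; // subset of the usual POSIX numbering
console.log(exitCodeFor(null, 'SIGTERM', table)); // 143
console.log(exitCodeFor(3, null, table)); // 3
```

In the real handler the table argument would be `os.constants.signals`, which `run.mjs` can reach through its existing `node:os` import.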
|
277
|
+
|
|
278
|
+
async function runTmuxViaZsh(slug, options, deps) {
|
|
279
|
+
// v0.13.0 legacy detection still applies — the zsh runner shares the same
|
|
280
|
+
// .rlp-desk/ contract, so a stray .claude/ralph-desk/ left over from an
|
|
281
|
+
// older campaign must be migrated by the operator before we hand off.
|
|
282
|
+
const legacy = detectLegacyDeskInRunMode(options.rootDir, process.env);
|
|
283
|
+
if (legacy) {
|
|
284
|
+
write(deps.stderr, legacy.message);
|
|
285
|
+
return 1;
|
|
286
|
+
}
|
|
287
|
+
|
|
288
|
+
const zshPath = (deps.zshRunnerPath ?? defaultZshRunnerPath)();
|
|
289
|
+
if (!deps.fileExists(zshPath)) {
|
|
290
|
+
write(
|
|
291
|
+
deps.stderr,
|
|
292
|
+
`ERROR: zsh runner not found at ${zshPath}. Run \`npm install rlp-desk\` (or set RLP_DESK_ZSH_RUNNER) to sync.`,
|
|
293
|
+
);
|
|
294
|
+
return 1;
|
|
295
|
+
}
|
|
296
|
+
|
|
297
|
+
// Surface flags the zsh runner cannot honor. Flywheel and self-verification
|
|
298
|
+
// remain Node-leader features, available only in --mode agent. Warn loudly
|
|
299
|
+
// instead of silent no-op so the operator understands the trade-off.
|
|
300
|
+
const unsupported = [];
|
|
301
|
+
if (options.flywheel !== 'off') unsupported.push('--flywheel');
|
|
302
|
+
if (options.flywheelGuard !== 'off') unsupported.push('--flywheel-guard');
|
|
303
|
+
if (options.withSelfVerification) unsupported.push('--with-self-verification');
|
|
304
|
+
if (unsupported.length > 0) {
|
|
305
|
+
write(
|
|
306
|
+
deps.stderr,
|
|
307
|
+
`WARNING: ${unsupported.join(', ')} not honored in --mode tmux (zsh runner). Use --mode agent for those features.`,
|
|
308
|
+
);
|
|
309
|
+
}
|
|
310
|
+
|
|
311
|
+
const env = buildZshEnv(slug, options, process.env);
|
|
312
|
+
const spawnZsh = deps.spawnZsh ?? defaultSpawnZsh;
|
|
313
|
+
return spawnZsh(zshPath, env, options.rootDir);
|
|
314
|
+
}
|
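The routing step above takes its collaborators through `deps`, which is what lets a test observe the hand-off without starting zsh. The following is a simplified stand-in, assuming a reduced `deps` shape: `routeTmux`, `warn`, the `'on'` flywheel value, and the fake runner path are all invented for illustration and are not the package's API.

```javascript
// Simplified stand-in for the routing step; not the package's actual function.
function routeTmux(options, deps) {
  if (!deps.fileExists(deps.zshRunnerPath)) {
    deps.warn(`runner missing at ${deps.zshRunnerPath}`);
    return 1;
  }
  // Collect the flags the zsh runner cannot honor, mirroring the hunk above.
  const dropped = [];
  if (options.flywheel !== 'off') dropped.push('--flywheel');
  if (options.flywheelGuard !== 'off') dropped.push('--flywheel-guard');
  if (options.withSelfVerification) dropped.push('--with-self-verification');
  if (dropped.length > 0) deps.warn(`not honored in tmux mode: ${dropped.join(', ')}`);
  return deps.spawnZsh(deps.zshRunnerPath, { LOOP_NAME: options.slug });
}

// Fakes record what would have happened instead of touching the system.
const warnings = [];
const spawned = [];
const exitCode = routeTmux(
  { slug: 'demo', flywheel: 'on', flywheelGuard: 'off', withSelfVerification: true },
  {
    zshRunnerPath: '/tmp/fake-runner.zsh',
    fileExists: () => true,
    warn: (msg) => warnings.push(msg),
    spawnZsh: (runner, env) => {
      spawned.push({ runner, env });
      return 0;
    },
  },
);
console.log(exitCode, warnings, spawned);
```

The fake `spawnZsh` returns a plain number here; the real delegate returns a promise that resolves to the child's exit code.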
|
315
|
+
|
|
214
316
|
async function runRunCommand(args, deps) {
|
|
215
317
|
if (args.length === 0) {
|
|
216
318
|
throw new Error('run requires a slug');
|
|
@@ -250,6 +352,27 @@ async function runRunCommand(args, deps) {
|
|
|
250
352
|
);
|
|
251
353
|
}
|
|
252
354
|
|
|
355
|
+
// v0.14.0: --mode tmux delegates to the zsh runner. Node leader keeps
|
|
356
|
+
// ownership of --mode agent only (LLM-driven orchestration).
|
|
357
|
+
if (options.mode === 'tmux') {
|
|
358
|
+
return runTmuxViaZsh(slug, options, deps);
|
|
359
|
+
}
|
|
360
|
+
|
|
361
|
+
// v0.14.0: agent mode is the alpha LLM-driven path. The Node port shipped
|
|
362
|
+
// without zsh-equivalent safety nets (heartbeat, copy-mode guard,
|
|
363
|
+
// prompt-stall timeout, no-progress detection, claude model upgrade chain).
|
|
364
|
+
// Surface that explicitly so production users pick --mode tmux instead.
|
|
365
|
+
if (
|
|
366
|
+
options.mode === 'agent'
|
|
367
|
+
&& !process.env.RLP_DESK_QUIET_WARNINGS
|
|
368
|
+
&& process.env.NODE_ENV !== 'test'
|
|
369
|
+
) {
|
|
370
|
+
write(
|
|
371
|
+
deps.stderr,
|
|
372
|
+
'WARNING: --mode agent is alpha. For production tmux orchestration, prefer --mode tmux (zsh-backed, stable).',
|
|
373
|
+
);
|
|
374
|
+
}
|
|
375
|
+
|
|
253
376
|
const result = await deps.runCampaign(slug, options);
|
|
254
377
|
// governance §1f BLOCKED Surfacing: surface the blocked reason on stderr so
|
|
255
378
|
// the operator (or wrapper script) does not have to grep memo files.
|
|
@@ -273,6 +396,13 @@ export async function main(argv = process.argv.slice(2), overrides = {}) {
|
|
|
273
396
|
initCampaign: overrides.initCampaign ?? initCampaign,
|
|
274
397
|
readStatus: overrides.readStatus ?? readStatus,
|
|
275
398
|
runCampaign: overrides.runCampaign ?? runCampaignMain,
|
|
399
|
+
// v0.14.0: --mode tmux delegate. Tests inject `spawnZsh` to assert the
|
|
400
|
+
// env mapping without actually fork+exec'ing zsh. `fileExists` and
|
|
401
|
+
// `zshRunnerPath` are similarly injectable so a test can pretend the
|
|
402
|
+
// installed runner is or isn't present.
|
|
403
|
+
spawnZsh: overrides.spawnZsh,
|
|
404
|
+
zshRunnerPath: overrides.zshRunnerPath,
|
|
405
|
+
fileExists: overrides.fileExists ?? ((p) => fs.existsSync(p)),
|
|
276
406
|
};
|
|
277
407
|
|
|
278
408
|
try {
|
package/src/node/runner/campaign-main-loop.mjs
CHANGED
|
@@ -949,6 +949,12 @@ export function shouldRunGuard(flywheelGuard, state, usId) {
|
|
|
949
949
|
return true;
|
|
950
950
|
}
|
|
951
951
|
|
|
952
|
+
// v0.14.0: production --mode tmux is routed to the zsh runner by
|
|
953
|
+
// src/node/run.mjs (see runTmuxViaZsh). The Node leader below owns the
|
|
954
|
+
// --mode agent (LLM-driven) flow. In-tree tests still exercise this path
|
|
955
|
+
// with `mode: 'tmux'` as a label while injecting fake
|
|
956
|
+
// createSession/sendKeys/pollForSignal — that is intentional and is NOT a
|
|
957
|
+
// regression of the routing contract.
|
|
952
958
|
export async function run(slug, options = {}) {
|
|
953
959
|
const rootDir = path.resolve(options.rootDir ?? process.cwd());
|
|
954
960
|
const env = options.env ?? process.env;
|
package/src/node/runner/prompt-dismisser.mjs
CHANGED
|
@@ -35,6 +35,18 @@ const DEFAULT_NO_RE = /\[y\/N\]|\(yes\/no,\s*default\s+no\)|[Dd]efault[: ]+[Nn]o
|
|
|
35
35
|
// output that may legitimately contain "(y/n)"-shaped substrings.
|
|
36
36
|
const ACTIVE_TASK_RE =
|
|
37
37
|
/esc to interrupt|background terminal running|^\s*[·✻]\s+[A-Za-z]+(\.{3}|…)/m;
|
|
38
|
+
|
|
39
|
+
// v0.14.1: codex post-work idle UI markers. NOT a permission prompt — the
|
|
40
|
+
// codex CLI has finished its task and is waiting for the next user input.
|
|
41
|
+
// Pattern: a divider line "─ Worked for Xm Ys ─", a "› " input prompt, and
|
|
42
|
+
// a status bar "Context X% left". Sources: BOS Bug Report #3 (2026-05-04).
|
|
43
|
+
// Treat this as "task done, idle awaiting input"; callers should harvest
|
|
44
|
+
// the verdict file rather than escalate as `prompt_blocked`.
|
|
45
|
+
export const CODEX_IDLE_RE = /─\s*Worked for \d+m \d+s\s*─|Context \d+%\s*left/;
|
|
46
|
+
export function isCodexIdleUi(paneText) {
|
|
47
|
+
if (typeof paneText !== 'string' || paneText.length === 0) return false;
|
|
48
|
+
return CODEX_IDLE_RE.test(paneText);
|
|
49
|
+
}
|
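To see what the new pattern accepts, it can be run against a few pane snippets. The regex and function below are copied from the hunk above; the sample strings are invented and may not match real codex output exactly.

```javascript
// Pattern copied from the hunk above; the sample pane strings are invented.
const CODEX_IDLE_RE = /─\s*Worked for \d+m \d+s\s*─|Context \d+%\s*left/;

function isCodexIdleUi(paneText) {
  if (typeof paneText !== 'string' || paneText.length === 0) return false;
  return CODEX_IDLE_RE.test(paneText);
}

const samples = {
  divider: '─ Worked for 2m 13s ─',
  statusBar: '› \n  Context 87% left',
  secondsOnly: '─ Worked for 45s ─',
  permissionPrompt: 'Allow command? [y/N]',
};
for (const [name, text] of Object.entries(samples)) {
  console.log(name, isCodexIdleUi(text));
}
// divider true, statusBar true, secondsOnly false, permissionPrompt false
```

Two things follow from the pattern as written: either alternative is sufficient on its own, and the `› ` input prompt mentioned in the comment is not itself checked. A divider without a minutes component, if codex ever prints one, would only be caught through the `Context X% left` status bar.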
|
38
50
|
const DEBOUNCE_MS = 3000;
|
|
39
51
|
|
|
40
52
|
const lastApprovalAt = new Map();
|