@ai-dev-methodologies/rlp-desk 0.14.6 → 0.15.1
- package/docs/plans/bug-report-overhaul-backlog.md +49 -0
- package/docs/plans/bug-report-overhaul-v0.md +238 -0
- package/docs/plans/bug-report-overhaul-v1.md +319 -0
- package/docs/plans/native-agent-revert.md +184 -0
- package/docs/plans/polished-gliding-toucan.md +234 -0
- package/docs/plans/strategic-review/rlp-desk-strategic-review.md +125 -0
- package/docs/rlp-desk/signal-protocol.md +93 -0
- package/install.sh +2 -0
- package/package.json +1 -1
- package/scripts/postinstall.js +2 -0
- package/src/commands/rlp-desk.md +56 -46
- package/src/node/run.mjs +45 -7
- package/src/node/runner/campaign-main-loop.mjs +372 -15
- package/src/node/shared/fs.mjs +83 -0
- package/src/node/tmux/pane-manager.mjs +39 -0
- package/src/scripts/lib_ralph_desk.zsh +152 -0
- package/src/scripts/run_ralph_desk.zsh +218 -59
@@ -0,0 +1,184 @@
+# Native Agent() Revert Plan (P0+P1)
+
+Result of a 6-round ralplan consensus. Goal: revert to the pre-v0.13.x design in which the slash command (`src/commands/rlp-desk.md`) is the real leader and spawns the worker/verifier through Claude Code Agent() calls plus Bash `codex exec`. The `--mode tmux` (zsh runner) path is unchanged.
+
+## Scope
+
+- **P0**: Bug #7 fix as a standalone commit + an explicit invariant ADR
+- **P1**: restore the slash command's native prose + introduce `--mode native` + add a deprecation banner for the Node CLI `--mode agent`
+- **Out-of-scope**: P2 (Node CLI default flip, `--mode agent` hard-error), P3 (delete the Node leader, ~4.5k LOC, + repo-wide ghost-removal gate). Separate PRs after the 19th launch ends.
+
+## Principles
+
+1. Naming truth: resolve the state where the single flag `--mode agent` means different things in two places.
+2. Single leader per mode.
+3. Surgical revert: zero impact on the tmux path and the 19th launch.
+4. Bug #7 fix preservation: state in an ADR that the invariant is preserved on the zsh side.
+5. Reversibility: no silent reclaim. `--mode agent` (Node CLI) stays compatible and only gains a deprecation warning.
+
+## P0: Bug #7 fix commit + Invariant ADR
+
+### a. Bug #7 fix commit
+- Commit the current working tree (5 src + 4 untracked tests + 1 plan markdown) as a standalone commit
+- Local sync is already done (`~/.claude/ralph-desk/` chmod 0o444 + banner)
+
+### b. ADR `docs/adr/0001-bug7-invariant-zsh-only-by-structural-necessity.md`
+
+Explicit scope:
+- Scoped to the **slash-command Native Agent() / Bash codex exec path**, the Bug #7 invariant is **enforced on the zsh runner side**.
+- The Native Agent() path uses short-lived per-call subagents; with no long-lived TUI process, it does not have the same race.
+- Node CLI `--mode agent` (Node leader, deprecated alpha) uses long-lived tmux panes, so it has the same race. Until it is deleted in P3, keep its separate reaper/lock code (`src/node/runner/campaign-main-loop.mjs:1091`, `:1577`, `src/node/tmux/pane-manager.mjs`, `src/node/shared/fs.mjs`).
+- Citations of the zsh-side invariant (file:line verified by the codex critic):
+  - `src/scripts/lib_ralph_desk.zsh:248`: helpers (`_kill_pane_process`, `_lock_sentinel`, `_unlock_sentinel`)
+  - `src/scripts/run_ralph_desk.zsh:2179`: partial-write `jq -e .` validity gate
+  - `src/scripts/run_ralph_desk.zsh:2484`: verifier reap+lock (per-US main path)
+  - `src/scripts/run_ralph_desk.zsh:2551`: final-verify per-US reap+lock
+  - `src/scripts/run_ralph_desk.zsh:2969`: prep cleanup unlock
+  - `src/scripts/run_ralph_desk.zsh:3036`: worker reap+lock
+  - `src/scripts/run_ralph_desk.zsh:3247`: verifier reap+lock (consensus)
+
+### c. re-sync
+- `node scripts/postinstall.js`
+- banner-aware diff: src ⇆ `~/.claude/ralph-desk/`
+
+### Acceptance
+- AC0.1 `git log -1` shows the Bug #7 fix
+- AC0.2 the ADR file exists, cites the 7 file:line locations above, and states the scope explicitly
+- AC0.3 `bash tests/test-bug7-post-sentinel-race.sh` + `bash tests/test-bug7-poll-partial-write.sh` pass
+- AC0.4 Node 315/315 pass
+- AC0.5 banner-aware diff: src ⇆ install match
+
+## P1: Slash native prose + `--mode native` + Node CLI deprecation
+
+### a. `src/commands/rlp-desk.md` audit list (complete)
+
+| Line | Current | After |
+|---|---|---|
+| 192 | `--mode agent\|tmux` in the first "/rlp-desk run" Full options reference block emitted by init | `--mode native\|tmux (default: native)` (the recommended example keeps `--mode tmux`) |
+| 227 | Second init emission | Same treatment |
+| 255 | Options block: `- \`--mode agent\|tmux\` (default: \`agent\`)` | `- \`--mode native\|tmux\` (default: \`native\`)` (exact line form) |
+| 287 | Mode Selection: "If absent or `agent`, use the Agent() path below" | State both axes explicitly: `--mode native` (default, slash native Agent() leader) / `--mode tmux` (zsh runner). Legacy `--mode agent` gets deprecation+redirect prose. Direct Node CLI `node run.mjs --mode agent` is deprecated alpha and gets its own paragraph |
+| 334 | tmux fallback "suggest `--mode agent`" | "suggest `--mode native`" |
+| 342 | "SV/flywheel is supported in `--mode agent`" | "SV/flywheel is currently implemented only in Node-leader `--mode agent` (deprecated alpha, direct Node CLI). The Native Agent() path (`--mode native`) does not implement SV/flywheel; that is post-P3 work" |
+| 343 | Tmux IMPORTANT RULES "always invokes node ..." | Scope to `--mode tmux` only |
+| 360-410 | "Why Agent mode is structurally immune" + "PLATFORM CONSTRAINT", scattered | Absorb verbatim into a single box, `### Native Agent() Safety Contract`. 4 sentinels: turn-keepalive, no `subagent_type`, `mode="bypassPermissions"` mandatory, long-running→tmux |
+| 448, 460 | claude/codex worker dispatch code | No change (already native wired) |
+| 778 | "agent=LLM leader, tmux=shell leader" help | "native=Native Agent() leader (slash), tmux=zsh leader (production). Legacy `agent` redirects to `native`. Direct Node CLI `--mode agent` is deprecated alpha; see the Direct Node CLI invocation section" |
+| 784 | `--mode agent` in the run example fallback | `--mode native` |
+| 802 | "Agent Mode (default: --mode agent)" heading | "Native Agent() Mode (default: --mode native)" |
+
+### b. `src/node/run.mjs`
+
+- New `--mode native` handler:
+  - stderr: `ERROR: --mode native is slash-command-only. The Node CLI does not implement it. Use \`/rlp-desk run --mode native\` from a Claude Code session, or use \`--mode {tmux,agent}\` for direct CLI invocation.`
+  - exit 2
+- Strengthen the deprecation banner for `--mode agent` (Node CLI, line 366-374):
+```
+WARNING: --mode agent (Node-leader alpha) is deprecated.
+This is the direct Node-CLI alpha path — UNRELATED to the slash command's
+Native Agent() path (`/rlp-desk run --mode native`).
+For production tmux orchestration use `--mode tmux`.
+For Claude Code Native Agent() campaigns use `/rlp-desk run --mode native`
+from a Claude Code session.
+This mode will hard-error in the next major release.
+```
+- Default behavior unchanged (NO silent reclaim; backward compat)
+- No `--allow-deprecated` flag (avoids a ghost flag that P3 would have to delete)
+- Wrappers that want silence can use `2>/dev/null`
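
The handler rules in this subsection can be sketched as a small pure function. This is only an illustration under stated assumptions: `resolveMode` and its return shape are hypothetical names, not code from `src/node/run.mjs`, and the banner is abbreviated to its first and last lines.

```javascript
// Hypothetical sketch: `resolveMode` and its return shape are illustrative,
// not taken from src/node/run.mjs. Message text follows the plan's wording.
const NATIVE_ERROR =
  'ERROR: --mode native is slash-command-only. The Node CLI does not implement it. ' +
  'Use `/rlp-desk run --mode native` from a Claude Code session, ' +
  'or use `--mode {tmux,agent}` for direct CLI invocation.';

// Abbreviated banner: only the first and last lines of the planned text.
const AGENT_BANNER = [
  'WARNING: --mode agent (Node-leader alpha) is deprecated.',
  'This mode will hard-error in the next major release.',
].join('\n');

function resolveMode(mode) {
  if (mode === 'native') {
    // Slash-command-only: refuse and signal exit code 2.
    return { proceed: false, exitCode: 2, stderr: NATIVE_ERROR };
  }
  if (mode === 'agent') {
    // Behavior unchanged: warn on stderr, then continue on the existing path.
    return { proceed: true, exitCode: 0, stderr: AGENT_BANNER };
  }
  // Any other mode (for example tmux) passes through untouched in this sketch.
  return { proceed: true, exitCode: 0, stderr: '' };
}

// Example wiring in a CLI entry point (also hypothetical):
// const r = resolveMode(parsedMode);
// if (r.stderr) console.error(r.stderr);
// if (!r.proceed) process.exit(r.exitCode);
```

Keeping the decision separate from `process.exit` makes the three us008 cases checkable without spawning a child process, although the plan itself tests through `node run.mjs`.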
+
+### c. Tests
+
+#### us008: 3 new cases
+1. `node run.mjs run demo --mode native` → exit 2 + stderr ERROR message
+2. `node run.mjs run demo --mode agent` → stderr deprecation banner + exit 0 (default behavior preserved)
+3. `node run.mjs run demo --mode tmux`: regression check, unchanged
+
+#### SV grep/awk guards (new `tests/sv-gate-bug7-mode-prose.sh`, or merged into sv-gate-fast.sh)
+
+```bash
+# 1. count-aware: --mode native must appear at least 5 times (grep -c counts matching lines)
+[ "$(grep -c -e '--mode native' src/commands/rlp-desk.md)" -ge 5 ] || { echo "FAIL: --mode native must appear ≥5 times"; exit 1; }
+
+# 2. block-aware safety contract (flag-based: the start heading also matches /^### /, so an awk range would stop on its first line)
+WINDOW=$(awk '/^### Native Agent\(\) Safety Contract/{f=1; print; next} f && /^### /{exit} f' src/commands/rlp-desk.md)
+echo "$WINDOW" | grep -q 'Turn-keepalive: every status report uses' || { echo "FAIL: turn-keepalive sentinel"; exit 1; }
+echo "$WINDOW" | grep -q 'no `subagent_type` parameter' || { echo "FAIL: no-subagent_type sentinel"; exit 1; }
+echo "$WINDOW" | grep -q 'mode="bypassPermissions" mandatory' || { echo "FAIL: bypassPermissions sentinel"; exit 1; }
+echo "$WINDOW" | grep -qi 'long-running.*tmux' || { echo "FAIL: long-running tmux recommendation"; exit 1; }
+
+# 3. dispatch snippet preservation (AC1.5a static)
+awk '/^If claude engine \(default\):/,/^If codex engine:/' src/commands/rlp-desk.md > /tmp/_disp_claude
+grep -q 'Agent(' /tmp/_disp_claude || { echo "FAIL: claude dispatch missing Agent("; exit 1; }
+grep -q 'mode="bypassPermissions"' /tmp/_disp_claude || { echo "FAIL: claude dispatch missing bypassPermissions"; exit 1; }
+awk '/^If codex engine:/,/^\*\*⑥\*\*|^### /' src/commands/rlp-desk.md > /tmp/_disp_codex
+grep -q 'Bash("codex exec' /tmp/_disp_codex || { echo "FAIL: codex dispatch missing Bash codex exec"; exit 1; }
+
+# 4. Options block exact match
+WINDOW=$(awk '/^Options \(parse from/,/^- `--worker-model/' src/commands/rlp-desk.md)
+echo "$WINDOW" | grep -qE '^- `--mode native\|tmux` \(default: `native`\)$' || { echo "FAIL: Options block --mode line not exact"; exit 1; }
+echo "$WINDOW" | grep -qE -e '--mode .*agent' && { echo "FAIL: stale 'agent' in Options block --mode line"; exit 1; }
+
+# 5. Tmux IMPORTANT RULES contradiction removed
+! awk '/^\*\*IMPORTANT RULES:\*\*/,/^####/' src/commands/rlp-desk.md | grep -q "always invokes node" || { echo "FAIL: stale 'always invokes node'"; exit 1; }
+
+# 6. Legacy redirect prose present
+grep -q 'Legacy.*--mode agent.*redirect' src/commands/rlp-desk.md || { echo "FAIL: deprecation prose missing"; exit 1; }
+```
+
+#### AC1.5b: manual transcript artifact
+
+`docs/verifications/p1-native-mode-transcript.md` (git-tracked):
+
+1. After P1 lands, run 1 iteration of `/rlp-desk run sample --mode native` in a Claude Code session
+2. Capture the full transcript, including these observations:
+   - `Agent(model=…, mode="bypassPermissions", …)` worker dispatch line
+   - status reports wrapped in `Bash("echo '...'")`
+   - no use of `subagent_type=`
+3. The reviewer records name + date in the `## Reviewer Sign-off` section
+4. CI guard: verify that the file exists and the sign-off is not a placeholder (`grep -E "^- Name: \S+"`, `grep -E "^- Date: [0-9]{4}-[0-9]{2}-[0-9]{2}"`)
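
The sign-off check in item 4 can also be expressed outside of grep. A minimal sketch, assuming the transcript is available as a string; `hasRealSignoff` is a hypothetical helper name, and the two patterns are the ones quoted in the CI guard item.

```javascript
// Hypothetical helper: checks the transcript markdown for a filled-in sign-off.
// The two regular expressions are the ones quoted in the CI guard item.
function hasRealSignoff(markdown) {
  const lines = markdown.split('\n');
  const hasSection = lines.some((line) => line.trim() === '## Reviewer Sign-off');
  const hasName = lines.some((line) => /^- Name: \S+/.test(line));
  const hasDate = lines.some((line) => /^- Date: [0-9]{4}-[0-9]{2}-[0-9]{2}/.test(line));
  return hasSection && hasName && hasDate;
}
```

Note that a placeholder such as `- Name: TBD` still satisfies `\S+`; rejecting known placeholder strings explicitly would be a stricter variant, if that matters for the gate.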
+
+### d. Re-sync
+- `node scripts/postinstall.js`
+- banner-aware diff for `src/commands/rlp-desk.md` ⇆ `~/.claude/commands/rlp-desk.md`, `src/node/run.mjs` ⇆ `~/.claude/ralph-desk/node/run.mjs`
+
+### Acceptance
+- AC1.1 all 6 grep/awk guards return 0
+- AC1.2 all 3 new us008 cases green
+- AC1.3 zero regressions in the existing us008/us006 tests
+- AC1.4 banner-aware diff: src ⇆ install match
+- AC1.5a static dispatch grep (#3) passes
+- AC1.5b transcript artifact + signoff non-placeholder
+
+## Out-of-scope (deferred PR list)
+
+- **P2**: flip the `src/node/run.mjs:16` default from `'agent'` to `'tmux'` + hard-error for `--mode agent` (Node CLI). Separate PR after the 19th launch ends and after assessing the impact on external wrappers.
+- **P3**: delete the Node leader (`src/node/runner/campaign-main-loop.mjs`, `src/node/tmux/`, `src/node/polling/`, etc., ~4.5k LOC) + retire the Bug-7 Node integration tests + repo-wide ghost-removal gate (`rg -n "Node leader|node-leader|--mode agent" src docs scripts tests` = intended hits only). After P2.
+- **`--mode agent` reclaim to Native Agent()**: only in the next major version after P3. No silent reclaim in this PR.
+
+## Pre-mortem
+
+1. **Callers of `--mode agent` confuse the native and alpha meanings**: when invoked from the slash command, deprecation+redirect proceeds down the native path. When an external shell wrapper calls `node run.mjs run X --mode agent`, it gets the deprecation banner + the existing Node leader path. Both paths say so explicitly in their messages.
+2. **Native Agent() turn-end annoys users**: mitigated by spelling out turn-keepalive in the Safety Contract box. That is still not 100%, so the docs strongly recommend `--mode tmux` for long-running campaigns.
+3. **External wrappers depend on `--mode agent` (Node CLI)**: behavior unchanged, deprecation banner only. The hard-error arrives only in P3. Wrappers migrate in the meantime.
+
+## Verification end-to-end (after P0+P1 land)
+
+1. `git log --oneline HEAD~3..HEAD`: Bug #7 + ADR + P1 commits
+2. `node --test 'tests/node/*.test.mjs' 'tests/node/*.mjs'`: 315+3 = 318 pass
+3. `bash tests/test-bug7-post-sentinel-race.sh` + `bash tests/test-bug7-poll-partial-write.sh` pass
+4. `bash tests/sv-gate-bug7-mode-prose.sh` (or sv-gate-fast.sh): all 6 grep/awk guards exit 0
+5. banner-aware diff src ⇆ `~/.claude/`
+6. AC1.5b: the user runs `/rlp-desk run sample --mode native` in a Claude Code session, then reviews the transcript + signs off
+
+## Round-by-round resolution table
+
+| Round | Verdict | Findings closed |
+|---|---|---|
+| 1 (Architect) | shift to A-strict | option A → A-strict |
+| 2 (Critic codex) | ITERATE 7 | entrypoint, default flip, ADR scope, naming, reclaim, AC, re-sync |
+| 3 (Architect) | ITERATE 2 | synonym ghost, allow-deprecated ghost |
+| 4 (Critic codex) | ITERATE 4 | init blocks (192/227), fallback (334/342), grep guards, ADR scope |
+| 5 (Critic codex) | ITERATE 3 | label expansion (778/802), exact options match, AC1.5 runnable |
+| 6 (Critic codex) | ITERATE 3 (1 actionable, 2 cross-check false-positives) | signoff non-placeholder check |
+
+Net: all actionable findings from v0-v5 are closed. Round 6 finding 3 (signoff non-placeholder) is already reflected in v7 (the 4th item of the AC1.5b CI guard). Round 6 findings 1/2 were already part of the v6 base; the critic missed them in its cross-check.
@@ -0,0 +1,234 @@
+# Bug Report #7: Post-Sentinel Process Race Fix
+
+## Context
+
+Race window measured by a BOS user during the 19th launch:
+- the iter-1 verifier modified `verify-verdict.json` again **1m 43s** after verdict detection (file mtime evidence)
+- iter-1 verifier post-verdict follow-up activity lasted **2m 1s**
+- the iter-1 verifier and the iter-2 worker ran concurrently for about **2 minutes**
+
+Bug report:
+`/Users/kyjin/dev/doul/bos/docs/exec-plans/active/2026-05-06-rlp-desk-bug-report-7-post-sentinel-process-race.md`
+
+### Root cause
+
+The leader moves on to the next iter as soon as it finds `iter-signal.json` / `verify-verdict.json`, but the Worker/Verifier process (claude/codex TUI) that wrote the sentinel is **never explicitly terminated**. The tmux pane stays alive, and the TUI returns to its idle prompt and then runs its own self-review, which rewrites the sentinel, pollutes the working tree, and wastes tokens.
+
+### Mode impact scope (important)
+
+**Both** `--mode tmux` (zsh runner) and `--mode agent` (Node leader) are affected. The Node leader also runs the worker/verifier on real tmux panes through `defaultSendKeys`/`defaultCreatePane` (`src/node/tmux/pane-manager.mjs`) (`src/node/runner/campaign-main-loop.mjs:1077-1080`, `1116-1133`). The initial hypothesis that agent mode is immune was inaccurate.
+
+### Asymmetry (current state)
+
+| Path | Worker post-processing | Verifier post-processing |
+|---|---|---|
+| Node leader | none | none |
+| zsh runner | cleanup at the start of the next iter (`run_ralph_desk.zsh:2948-2956`); race window 5s+ | cleanup right before dispatch (`3160-3180`); protected within the same iter, but not after the final iter ends or against cross-iter races |
+
+---
+
+## Approach (Fix-Q + Fix-R, minimal surgical combination)
+
+| Fix | Effect | Adopted |
+|---|---|---|
+| **Q** On sentinel detect, immediately send Ctrl+C to the producing pane → process terminates | Blocks the race directly within ~1 second | **YES (primary)** |
+| **R** chmod 0444 the sentinel file to block re-modification | Freezes mtime even if Q is late or fails | **YES (defense-in-depth)** |
+| S Full refactor of the pane lifecycle | Effective, but the surface is too large. Partially covered by the existing prep cleanup (zsh 2948-2956). Violates Karpathy's "surgical changes" principle | NO |
+| T 30s post-sentinel safety-net timeout | Redundant: Q is fail-open and the next iter's prep cleanup is the backup | NO |
+
+Rationale:
+- Q kills the producer within ~1 second, cutting off the root cause. It mirrors an existing pattern exactly (zsh `run_ralph_desk.zsh:2384-2397`, double Ctrl+C send + `wait_for_pane_ready`).
+- R tolerates chmod failure (EPERM/ENOTSUP are ignored; precedent: `tryLockFile` at `scripts/postinstall.js:104`). It degrades gracefully even in environments where chmod is a no-op, such as WSL1/NTFS/tmpfs.
+- Dropping S/T minimizes the review surface.
+
+---
+
+## Concrete code changes
+
+### Node leader
+
+#### 1. `src/node/tmux/pane-manager.mjs`: add helpers (after line 77)
+
+New exports:
+- `sendRawKey(paneId, key)`: `runTmux(['send-keys', '-t', paneId, key])`. Kept separate from `sendKeys` (`-l --` literal text); it is for raw keys such as C-c.
+- `killPaneProcess(paneId, { sendRawKey, waitForExit, gracePeriodMs=800, exitTimeoutMs=5000, log })`:
+  1. `sendRawKey('C-c')` → `await sleep(gracePeriodMs)` → `sendRawKey('C-c')` (double press, mirrors zsh `375-376`).
+  2. `await waitForExit(paneId, { timeoutMs: exitTimeoutMs }).catch(log)` (fail-open).
+  3. A TmuxError from sending the raw key itself is also caught and logged (safe on an already-dead pane).
+
+Reuse the existing `waitForProcessExit` (line 55) as is.
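
The three steps described for `killPaneProcess` could look roughly like the following. This is a sketch under assumptions, not the package's implementation: both collaborators are injected, `sendRawKey` is called as `(paneId, key)` per its stated signature, and `log` defaults to a no-op.

```javascript
// Sketch only: follows the steps described above with injected dependencies.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function killPaneProcess(paneId, {
  sendRawKey,
  waitForExit,
  gracePeriodMs = 800,
  exitTimeoutMs = 5000,
  log = () => {},
} = {}) {
  // Sending to an already-dead pane may throw; log and continue (fail-open).
  const pressCtrlC = async () => {
    try {
      await sendRawKey(paneId, 'C-c');
    } catch (err) {
      log(err);
    }
  };

  await pressCtrlC();
  await sleep(gracePeriodMs);
  await pressCtrlC(); // double press

  try {
    await waitForExit(paneId, { timeoutMs: exitTimeoutMs });
  } catch (err) {
    log(err); // a timeout or tmux error must not abort the campaign loop
  }
}
```

Because every failure path is caught inside the helper, the caller can `await` it unconditionally, which is the property the fail-open and dead-pane unit tests are meant to pin down.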
+
+#### 2. `src/node/shared/fs.mjs`: add helpers (after line 61)
+
+- `lockSentinelFile(filePath, { log })`: `fs.chmod(filePath, 0o444)`; on error, log a warning only once. Mirrors the `tryLockFile` precedent (`scripts/postinstall.js:104`).
+- `unlockSentinelFile(filePath)`: `fs.chmod(filePath, 0o644)`; failures are ignored. Called right before iter cleanup.
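
A minimal sketch of the two helpers, assuming Node's `fs/promises` API and a module-level flag for the "warn only once" behavior. The log message format is an assumption, and the dynamic `import()` is used here only so the snippet runs under either module system.

```javascript
// Sketch only: the module-level flag implements "warn only once".
let warnedAboutChmod = false;

async function lockSentinelFile(filePath, { log = console.error } = {}) {
  const { chmod } = await import('node:fs/promises');
  try {
    await chmod(filePath, 0o444); // read-only for owner, group, and others
  } catch (err) {
    // Tolerate missing files and filesystems that reject chmod.
    if (!warnedAboutChmod) {
      warnedAboutChmod = true;
      log(`[bug7] could not lock ${filePath}: ${err.code ?? err.message}`);
    }
  }
}

async function unlockSentinelFile(filePath) {
  const { chmod } = await import('node:fs/promises');
  try {
    await chmod(filePath, 0o644);
  } catch {
    // Failures are ignored: the cleanup that follows is the real work.
  }
}
```

A caller would typically run `unlockSentinelFile(file)` immediately before removing the file during iter cleanup.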
+
+#### 3. `src/node/runner/campaign-main-loop.mjs`: wiring + call sites
+
+Add DI slots (line 1077-1080):
+```
+const sendRawKey = options.sendRawKey ?? defaultSendRawKey;
+const waitForProcessExit = options.waitForProcessExit ?? defaultWaitForProcessExit;
+const killPaneProcess = options.killPaneProcess ?? defaultKillPaneProcess;
+const lockSentinel = options.lockSentinelFile ?? lockSentinelFile;
+```
+
+Internal wrapper:
+```
+async function reapProducer(paneId, sentinelFile) {
+  await killPaneProcess(paneId, { sendRawKey, waitForExit: waitForProcessExit, log: console.error });
+  if (sentinelFile) await lockSentinel(sentinelFile, { log: console.error });
+}
+```
+
+Call sites (right after success + `validateArtifact` passing):
+
+| Site | Line | Call |
+|---|---|---|
+| Flywheel poll | after 1267-1277 (before 1285) | `reapProducer(state.flywheel_pane_id ?? state.verifier_pane_id, paths.flywheelSignalFile)` |
+| Guard poll | after 1305-1315 (before 1323) | `reapProducer(guardPaneId, paths.flywheelGuardVerdictFile)` |
+| Worker poll | after 1422-1432 (before 1456) | `reapProducer(state.worker_pane_id, paths.signalFile)` |
+| Verifier poll | after 1489-1513 (before 1522) | `reapProducer(state.verifier_pane_id, paths.verdictFile)` |
+| Final per-US verifier (`runFinalSequentialVerify`) | after 890-894 (before 896) | `reapProducer(verifierPaneId, paths.verdictFile)`; add `reapProducer` to the `runFinalSequentialVerify` signature and pass it from the call site (1185-1194) |
+
+iter cleanup unlock: call `unlockSentinelFile` right before each `fs.unlink(...)` call:
+- L1291 (`flywheelSignalFile`)
+- L1328 (`flywheelGuardVerdictFile`)
+- Top of the loop (right after 1145): defensive unlock of the Worker `signalFile` / Verifier `verdictFile` (in case the next iter's producer overwrites them via atomic rename)
+
+### zsh runner
+
+#### 4. `src/scripts/lib_ralph_desk.zsh`: add helpers (after `atomic_write`, after line 245)
+
+```
+_kill_pane_process() {
+  local pane_id="$1" role="${2:-producer}"
+  log_debug "[bug7] kill_pane_process pane=$pane_id role=$role"
+  tmux send-keys -t "$pane_id" C-c 2>/dev/null
+  sleep 0.5
+  tmux send-keys -t "$pane_id" C-c 2>/dev/null
+  sleep 1
+  wait_for_pane_ready "$pane_id" 5 2>/dev/null || true
+}
+
+_lock_sentinel() {
+  local file="$1"
+  [[ -f "$file" ]] || return 0
+  chmod 0444 "$file" 2>/dev/null || true
+}
+
+_unlock_sentinel() {
+  local file="$1"
+  [[ -f "$file" ]] || return 0
+  chmod 0644 "$file" 2>/dev/null || true
+}
+```
+
+#### 5. `src/scripts/run_ralph_desk.zsh`: call sites
+
+| Site | Line | Call |
+|---|---|---|
+| Right after Worker poll success | 3003 (inside the `worker_poll_done=1` branch, after `log_debug`) | `_kill_pane_process "$WORKER_PANE" "worker"; _lock_sentinel "$SIGNAL_FILE"` |
+| Right after Verifier poll success (main path) | after 3202 passes, before 3215 (`ITER_VERIFIER_END`) | `_kill_pane_process "$VERIFIER_PANE" "verifier"; _lock_sentinel "$VERDICT_FILE"` |
+| Final-verify per-US (`run_sequential_final_verify`) | after 2524 passes, before entering the next iter | `_kill_pane_process "$VERIFIER_PANE" "verifier-final"; _lock_sentinel "$VERDICT_FILE"` |
+| Codex grace path | `dispatch_verifier_per_us` (right after the grace period ends at 2420, before the `cp` at 2471) | `_kill_pane_process "$VERIFIER_PANE" "verifier-${suffix}"; _lock_sentinel "$VERDICT_FILE"` |
+| Consensus path | right after each successful `poll_for_signal` inside `run_consensus_verification` | Same pattern |
+
+prep cleanup unlock, right before the cleanup at line 2948-2956:
+```
+_unlock_sentinel "$SIGNAL_FILE"; _unlock_sentinel "$VERDICT_FILE"
+rm -f "$SIGNAL_FILE" "$DONE_CLAIM_FILE" "$VERDICT_FILE" 2>/dev/null
+```
+
+---
+
+## Files to modify
+
+| File | Change |
+|---|---|
+| `src/node/tmux/pane-manager.mjs` | add `sendRawKey`, `killPaneProcess` exports |
+| `src/node/shared/fs.mjs` | add `lockSentinelFile`, `unlockSentinelFile` exports |
+| `src/node/runner/campaign-main-loop.mjs` | DI + `reapProducer` + 5 call sites + iter cleanup unlock |
+| `src/scripts/lib_ralph_desk.zsh` | add `_kill_pane_process`, `_lock_sentinel`, `_unlock_sentinel` |
+| `src/scripts/run_ralph_desk.zsh` | 4-5 call sites + prep cleanup unlock |
+| `tests/node/us006-campaign-main-loop.test.mjs` | add `killPaneProcess`/`lockSentinelFile` recorders to `createTmuxFakes()` + 3 Bug-7 tests |
+| `tests/node/test-kill-pane-process.test.mjs` | NEW: helper unit tests |
+| `tests/node/test-lock-sentinel-file.test.mjs` | NEW: chmod unit tests |
+| `tests/test-bug7-post-sentinel-race.sh` | NEW: real-tmux integration test (mirrors the Bug #6 pattern) |
+
+Ship as a single PR (the helpers are no-ops without call sites, so the review surface is small).
+
+---
+
+## Reused functions (reference)
+
+- Node: `pane-manager.mjs:50` `sendKeys`, `pane-manager.mjs:55` `waitForProcessExit` (5s timeout, shell detection)
+- Node: `shared/fs.mjs:6-23` `writeFileAtomic`, `42-61` `writeSentinelExclusive`
+- Node: `scripts/postinstall.js:104` `tryLockFile` (chmod 0o444 precedent)
+- zsh: `lib_ralph_desk.zsh:240-245` `atomic_write`, `1075-1137` `wait_for_pane_ready`
+- zsh: `run_ralph_desk.zsh:2384-2397` proven verifier-cleanup pattern (Ctrl+C + /exit + wait), `375-376/529-530` double Ctrl+C pattern
+
+---
+
+## Testing strategy
+
+### Unit tests (Node)
+
+`tests/node/test-kill-pane-process.test.mjs` (NEW):
+- AC1 happy path: the order is C-c → sleep → C-c → waitForExit (verified with a fake recorder).
+- AC2 fail-open: the helper resolves when `waitForExit` throws a TmuxError.
+- AC3 dead-pane: the helper resolves when `sendRawKey` throws.
+- AC4 grace: gracePeriodMs is honored (verified with a fake clock or a tolerance).
+
+`tests/node/test-lock-sentinel-file.test.mjs` (NEW):
+- AC1: after lock, `(mode & 0o222) === 0` (skipped on filesystems that ignore chmod).
+- AC2: locking a nonexistent path does not throw.
+- AC3: the file is writable after unlock.
+
+### Integration tests (Node)
+
+Extend `tests/node/us006-campaign-main-loop.test.mjs`:
+1. **Bug-7-A**: after Worker pollForSignal succeeds and before the next dispatchVerifier, verify the call order `killPaneProcess('%worker')` + `lockSentinelFile(signalFile)`.
+2. **Bug-7-B**: after a Verifier verdict pass and before the next iter's dispatchWorker, `killPaneProcess('%verifier')` + `lockSentinelFile(verdictFile)`.
+3. **Bug-7-C**: run() completes normally even if `killPaneProcess` throws.
+
+Add fake `killPaneProcess`/`lockSentinelFile` recorders to `createTmuxFakes()` (line 83), keeping the existing 30+ tests compatible.
+
+### Integration tests (zsh)
+
+`tests/test-bug7-post-sentinel-race.sh` (NEW, mirrors the `test-bug6-worker-idle-false-positive.sh` pattern):
+- Scenario 1: start `sleep 600` in a tmux session and call `_kill_pane_process` → `pane_current_command` returns to zsh/bash within 2s.
+- Scenario 2: `_lock_sentinel` → verify mode 0444 → `_unlock_sentinel` → writable → `rm -f` succeeds.
+- Scenario 3 (REAL_E2E gated): 1-iter campaign + stub claude (writes the sentinel, then sleeps 120) → verdict file mtime delta == 0 after 10s.
+
+### Self-Verification scenarios (CLAUDE.md gate, 3 required)
+
+Modifying `src/scripts/run_ralph_desk.zsh` is MEDIUM-HIGH risk:
+- **LOW**: helper unit tests + existing Node/zsh regression tests pass.
+- **MEDIUM**: a real 1-iter campaign. Capture `pane_current_command` at the Worker → Verifier transition and verify the return to the shell within 2s. Verify that the verdict file mtime is frozen.
+- **CRITICAL**: a 2-iter campaign (verify→fail→verify→pass). The iter-N+1 worker dispatch happens only after confirming the iter-N verifier's `pane_current_command == zsh`; capture timestamped logs. Run with both `--mode agent` and `--mode tmux`.
+
+---
+
+## Verification end-to-end
+
+1. **Unit**: `node --test tests/node/test-kill-pane-process.test.mjs tests/node/test-lock-sentinel-file.test.mjs` passes.
+2. **Integration (Node)**: `node --test tests/node/us006-campaign-main-loop.test.mjs` passes; the call-order assertions are the regression guard.
+3. **Live tmux**: within 2s of calling `_kill_pane_process`, `tmux display-message -p -t "$pane" '#{pane_current_command}'` returns `zsh`/`bash`.
+4. **mtime freeze**: measure `stat -f %m verify-verdict.json` (BSD/macOS syntax; GNU coreutils uses `stat -c %Y`) at detect time and again at +10s; delta == 0. This directly counters the 1m43s evidence in the bug report.
+5. **Pane output**: the `tmux capture-pane -p` output shows no new `Worked for Xm Ys` / `esc to interrupt` markers.
+6. **Both modes**: run the smoke test under `--mode tmux` (zsh runner) and `--mode agent` (Node leader) separately; verify the return to the shell within 4 seconds in both.
+7. **Repro scenario**: run one campaign under the same conditions as the 19th launch (claude opus 1m worker + gpt-5.5:high codex verifier), then compare the leader log with the file mtimes; expect zero races.
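
The mtime-freeze check in step 4 can be scripted portably instead of relying on a platform-specific `stat` flag. A minimal sketch, assuming Node's `fs/promises`; `mtimeDeltaMs` is a hypothetical helper name, not part of the package.

```javascript
// Hypothetical helper: reports how far a file's mtime moved during a wait window.
async function mtimeDeltaMs(filePath, waitMs) {
  const { stat } = await import('node:fs/promises');
  const before = (await stat(filePath)).mtimeMs;
  await new Promise((resolve) => setTimeout(resolve, waitMs));
  const after = (await stat(filePath)).mtimeMs;
  return after - before; // 0 means nothing touched the file in the window
}
```

For the scenario above this would be called as `mtimeDeltaMs('verify-verdict.json', 10000)` right after the leader detects the verdict, with a result of `0` as the pass condition.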
+
+---
+
+## Risk / mitigation
+
+| Risk | Likelihood | Mitigation |
+|---|---|---|
+| C-c interrupts the producer in the middle of writing an artifact | LOW: the sentinel already exists on disk at detect time | The `MalformedArtifactError` path handles partial writes |
+| chmod 0444 blocks the `unlink` in the next iter's cleanup | LOW | `_unlock_sentinel` / `unlockSentinelFile` runs right before unlink. On most Unix filesystems unlink is governed by directory permissions, so even a 0444 file can be unlinked |
+| Producer rewrites the sentinel via atomic rename (bypassing chmod) | POSSIBLE | Q (kill) terminates the producer within ~1s, shrinking the rewrite window from 2 minutes to 1 second. Besides, the leader has already consumed the sentinel in-band |
+| `killPaneProcess` throws on a dead pane | POSSIBLE | Catch inside the helper + unit tests AC2/AC3 as regression guards |
+| chmod 0444 is a silent no-op (WSL1/NTFS/tmpfs) | OBSERVED (postinstall.js precedent) | Log a warning only once. Q (kill) is the primary defense, so degradation is graceful |
+| Regressions in the existing us006 tests | MEDIUM | Add fake helper recorders to `createTmuxFakes()`; existing callers receive them automatically |
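
Two rows of the risk table rest on the same POSIX rule: `rename` and `unlink` need write permission on the directory, not on the file. The sketch below demonstrates that on a scratch directory. It assumes a POSIX filesystem; Windows or chmod-ignoring mounts may behave differently, and `demoLockedSentinel` is an illustrative name.

```javascript
// Sketch for a POSIX filesystem: shows what chmod 0444 does and does not prevent.
async function demoLockedSentinel() {
  const fs = await import('node:fs/promises');
  const os = await import('node:os');
  const path = await import('node:path');

  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'sentinel-demo-'));
  const sentinel = path.join(dir, 'verify-verdict.json');
  await fs.writeFile(sentinel, '{"verdict":"pass"}');
  await fs.chmod(sentinel, 0o444);

  // A producer that writes a temp file and renames it replaces the locked file,
  // because rename needs write permission on the directory, not on the target.
  const tmp = path.join(dir, 'verify-verdict.json.tmp');
  await fs.writeFile(tmp, '{"verdict":"fail"}');
  await fs.rename(tmp, sentinel);
  const replaced = await fs.readFile(sentinel, 'utf8');

  // unlink is governed by the directory too; lock again to show it on a 0444 file.
  await fs.chmod(sentinel, 0o444);
  await fs.unlink(sentinel);
  const remaining = await fs.readdir(dir);
  await fs.rmdir(dir);
  return { replaced, remaining };
}
```

This is why the lock is only the secondary defense in the plan: it freezes in-place rewrites, while terminating the producer is what closes the rename path.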
@@ -0,0 +1,125 @@
+# rlp-desk strategic re-evaluation: autoplan input
+
+> **Goal**: decide whether to keep evolving rlp-desk (patch/redesign), scrap and rebuild it (rebuild), or pivot to an existing tool.
+> **Core KPI**: autonomous completion of a blueprint from start to finish, with minimal human intervention.
+> **Current state**: 10 bugs in 6 weeks, 1-2 new ones found every week, and even manual recovery is broken (#10).
+
+---
+
+## 1. Vision (BOS dev consensus, early 2026)
+
+1. Start every ralph-loop with a **fresh context** (no context pollution)
+2. idea → plan distillation
+3. PRD formalization
+4. Iterative improvement through the Worker/Verifier cycle
+5. **Full autonomy**: minimal human intervention
+
+→ Item 5 is the single success criterion. Items 1-4 are the mechanisms for achieving it.
+
+---
+
+## 2. Reality: facts measured over 6 weeks
+
+### Bug occurrence pattern
+
+| # | Date | Category | Recoverable | Next bug exposed by the fix |
+|---|---|---|---|---|
+| #1 | 2026-05-01 | LLM-runtime (`.claude/` self-modification gate) | partial (worked around with a codex worker) | #2 (tmux session) |
+| #2 | 2026-05-01 | tmux session lifecycle | yes | #3 |
+| #3 | 2026-05-04 | verifier no-progress | yes | #4 (regression of #3) |
+| #4 | 2026-05-05 | #3 regression | partial | #5 |
+| #5 | 2026-05-05 | worker dead on reuse (pane lifecycle) | yes | #6 |
+| #6 | 2026-05-06 | claude worker idle false-positive | yes | #7 |
+| #7 | 2026-05-06 | post-sentinel process race (1m43s drift) | yes | #8 + #10 |
+| #8 | 2026-05-06 | worker incomplete + leader A4 fallback | partial | #10 |
+| #9 | (separate) | verified_us persistence (status.json) | yes | none |
+| #10 | 2026-05-07 | leader ignores phase=verify on relaunch: **even recovery is broken** | NO (this is the recovery mechanism) | ? |
+
+**Pattern observed:**
+- 1-2 new bugs every week
+- Half are new failure modes **exposed by a previous fix** (regression style)
+- Bug #10 is the most serious: the recovery mechanism itself is broken, so the operator's manual recovery is nullified when a campaign is BLOCKED
+
+### Bug category breakdown
+
+| Category | Bug | Share | Architecturally inevitable? |
+|---|---|---|---|
+| (a) tmux/process lifecycle race | #2, #5, #6, #7 | 40% | **YES**: the tmux pane lifecycle is decoupled from the claude/codex TUI lifecycle, so the race window is inherent |
+| (b) artifact contract / schema | #3, #4, #8, #9 | 40% | partial: a stricter schema would reduce these, but the rate at which LLMs violate the schema is inherent |
+| (c) LLM-runtime constraint | #1 | 10% | **YES**: Claude Code's `.claude/` self-modification gate is an external variable |
+| (d) recovery hygiene | #10 | 10% | accidental: fixable |
+
+→ **80% is architectural inevitability** (a + c). Strict schemas + retries reduce (b), but LLM nondeterminism is the limit.
+
+### SV gate limits
+
+- Every sv-self-verify-*.sh **uses only the Worker/Verifier metaphor** → no real LLM agent runs
+- 5-category labeling of grep + unit test + regression
+- → All 10 bugs were found in production after passing the SV gate, which means **the framework does not cover production failure modes**
+- Production failure modes are **LLM/tmux/network/timing nondeterminism**, an area that unit tests cannot catch
+
+### In-flight unfinished branches
+
+- `feat/native-agent-revert`: P0 (Bug #7 fix) + P1 (restore slash native prose); unfinished. Plan: agreed after 5 ralplan rounds, codex critic APPROVED
+- `feat/bug10-relaunch-hygiene`: PR-A commit 95c0d4e + SV gate added; just completed, not pushed
+
+→ Neither branch has landed, so existing fixes risk being neutralized when a new bug is found
+
|
|
68
|
+
---
|
|
69
|
+
|
|
70
|
+
## 3. 4가지 옵션
|
|
71
|
+
|
|
72
|
+
### Option A — Continue patching (현재 path)
|
|
73
|
+
PR-A (Bug #10) → PR-B (bundler) → PR-C (patterns) → next bug → repeat.
|
|
74
|
+
|
|
75
|
+
### Option B: Fundamental redesign (keep the vision, redesign the architecture)
|
|
76
|
+
**Key change**: drop tmux pane orchestration → dispatch directly through Claude Code Native Agent() / subprocess. Sentinel file → in-memory channel. These two changes remove 80% of category (a).
|
|
77
|
+
|
|
78
|
+
### Option C: Scrap and rebuild (partially revise the vision)
|
|
79
|
+
Discard six weeks of code. Keep Vision 5 (full autonomy) but redefine Vision 1 (fresh-context) as a task-isolated subprocess. The ralph-loop itself could be dropped, leaving a plain plan→worker→verify cycle.
|
|
80
|
+
|
|
81
|
+
### Option D — Pivot to existing tool
|
|
82
|
+
- ralph plugin (cradle): lightweight, limited features
|
|
83
|
+
- omc (oh-my-claudecode): /ralph + /ralplan + /omc-teams + /autopilot; already provides multi-model orchestration
|
|
84
|
+
- superpowers: subagent-driven-development, executing-plans, brainstorming; the plan→exec cycle already exists
|
|
85
|
+
- claude-devfleet: dmux-based, multi-agent
|
|
86
|
+
|
|
87
|
+
---
|
|
88
|
+
|
|
89
|
+
## 4. Evaluation criteria (for each option)
|
|
90
|
+
|
|
91
|
+
| Criterion | Definition | Rationale |
|
|
92
|
+
|---|---|---|
|
|
93
|
+
| Vision preservation | Number of the 5 visions that survive | Whether "full autonomy" remains achievable |
|
|
94
|
+
| Time-to-first-successful-blueprint | Time until a blueprint is first completed autonomously, end to end | **The single core KPI** |
|
|
95
|
+
| Sunk cost write-off % | Share of code/SV that gets discarded | Reversibility of the decision |
|
|
96
|
+
| Bug regression risk | Probability that bugs like #1-#10 reappear in the new system | "Another 10 in the next six weeks?" |
|
|
97
|
+
| Personal capacity ROI | Deliverable per week invested | Sustainability |
|
|
98
|
+
|
|
99
|
+
---
|
|
100
|
+
|
|
101
|
+
## 5. Decision on in-flight branches
|
|
102
|
+
|
|
103
|
+
To be decided once this evaluation is complete:
|
|
104
|
+
- `feat/native-agent-revert`: land / abandon / re-scope?
|
|
105
|
+
- `feat/bug10-relaunch-hygiene`: already at commit 95c0d4e; push or hold?
|
|
106
|
+
|
|
107
|
+
Whichever option is chosen, both branches need a disposition.
|
|
108
|
+
|
|
109
|
+
---
|
|
110
|
+
|
|
111
|
+
## 6. Constraints
|
|
112
|
+
|
|
113
|
+
- The BOS dev has to run real campaigns → short-term (1-2 weeks) patching is unavoidable
|
|
114
|
+
- rlp-desk is published on npm, so a breaking change is user-facing (semver applies)
|
|
115
|
+
- This analysis is advisory only; actual code changes, commits, and pushes require separate user approval
|
|
116
|
+
|
|
117
|
+
---
|
|
118
|
+
|
|
119
|
+
## 7. Request to autoplan, with this document as input
|
|
120
|
+
|
|
121
|
+
CEO perspective: is rlp-desk solving the right problem? Can the same value already be obtained from another tool?
|
|
122
|
+
Eng perspective: looking at six weeks of architectural patterns, can the 80% in category (a) be resolved by patching, or is a redesign inevitable?
|
|
123
|
+
DX perspective: rlp-desk is a dev tool. Is it a DX failure that the operator (BOS dev) has to hand-write a recovery for 30 minutes every time?
|
|
124
|
+
|
|
125
|
+
In each phase, run dual voices (Codex + Claude subagent), produce a consensus table, and steelman all 4 options.
|
|
@@ -0,0 +1,93 @@
|
|
|
1
|
+
# Signal Protocol — current contract + alternatives
|
|
2
|
+
|
|
3
|
+
**Spec version:** `signal-protocol-v1`
|
|
4
|
+
**Source consensus:** ralplan iter 6 — Architect synthesis, Critic codex APPROVED (P0=0, P1=0)
|
|
5
|
+
**Audience:** maintainers evaluating whether to adopt mailbox-dir, daemon, or in-process IPC alternatives.
|
|
6
|
+
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
## 1. Current Contract
|
|
10
|
+
|
|
11
|
+
rlp-desk routes Worker → Verifier handoff through a **single sentinel file per role per iteration**. The contract has four invariants:
|
|
12
|
+
|
|
13
|
+
1. **Sentinel = artifact.** Every transition step (`verify`, `verdict`, `flywheel`, `flywheel-guard`) is encoded as a JSON file at a deterministic path under `.rlp-desk/memos/`. The Leader polls the path with `fs.access` + atomic JSON-parse; any partial write is rejected (`jq -e .` gate, see `tests/test-bug7-poll-partial-write.sh`).
|
|
14
|
+
2. **`reapProducer` = lifecycle.** Once the Leader accepts a sentinel (validateArtifact passes), it MUST kill the producing TUI pane and chmod-lock the file. Skipping the reap leaves a self-reviewing claude/codex pane that overwrites the artifact mid-poll (Bug #7).
|
|
15
|
+
3. **Strict ordering: detect → reap → wait shell → next dispatch.** The Leader does NOT dispatch the next role (Verifier after Worker, next-iter Worker after Verifier) until the producing pane's `pane_current_command` has returned to `zsh|bash|sh`. AC-H1 of PR-0b-narrow strengthens this with `waitForProcessExit`.
|
|
16
|
+
4. **First-writer-wins for terminal sentinels.** `blocked.md` and `complete.md` are written via `O_EXCL` (`writeSentinelExclusive`); concurrent error paths cannot trample the canonical exit reason.
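Invariants 1 and 4 above can be illustrated with a few lines of Node. This is a minimal sketch, not the runner's actual code: `readSentinel` and `writeExclusive` are hypothetical names, and the demo uses a throwaway temp directory rather than `.rlp-desk/memos/`.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: returns the parsed sentinel, or null when the file is
// missing or is not yet valid JSON (for example, a write still in progress).
function readSentinel(file) {
  let raw;
  try {
    raw = fs.readFileSync(file, "utf8");
  } catch (err) {
    if (err.code === "ENOENT") return null; // not written yet, keep polling
    throw err;
  }
  try {
    return JSON.parse(raw);
  } catch {
    return null; // truncated or partial JSON is rejected, never accepted
  }
}

// First-writer-wins: the "wx" flag maps to O_CREAT | O_EXCL, so a second
// writer gets EEXIST instead of overwriting the first exit reason.
function writeExclusive(file, body) {
  try {
    fs.writeFileSync(file, body, { flag: "wx" });
    return true;
  } catch (err) {
    if (err.code === "EEXIST") return false;
    throw err;
  }
}

// Demo in a throwaway directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "sentinel-sketch-"));
const signal = path.join(dir, "iter-signal.json");
fs.writeFileSync(signal, '{"status": "ver'); // simulated partial write
const partial = readSentinel(signal);
fs.writeFileSync(signal, '{"status": "verify"}');
const complete = readSentinel(signal);

const blocked = path.join(dir, "blocked.md");
const firstWrite = writeExclusive(blocked, "worker-dead\n");
const secondWrite = writeExclusive(blocked, "timeout\n");
const finalReason = fs.readFileSync(blocked, "utf8");
console.log({ partial, complete, firstWrite, secondWrite, finalReason });
```

The partial read yields `null` so the poll loop simply tries again, and the second exclusive write is refused, which keeps the first recorded exit reason canonical.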
|
|
17
|
+
|
|
18
|
+
The same contract is implemented twice (`src/node/runner/campaign-main-loop.mjs` for `--mode agent`, `src/scripts/run_ralph_desk.zsh` for `--mode tmux`) with bit-for-bit parity on `(reason_text, reason_category, failure_category)` — verified by `tests/test-bug8-refuse-synthesis.sh` Scenario 4.
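The detect → reap → wait shell → next dispatch ordering from invariant 3 can be sketched as a small synchronous loop. Everything here is an assumption made for illustration: `handoff` and the injected `deps` callbacks are hypothetical, and the probe is stubbed so the sketch runs without tmux.

```javascript
// Shell names taken from the contract text: zsh, bash, sh.
const SHELL_RE = /^(zsh|bash|sh)$/;

// Hypothetical leader step, called once a sentinel has been detected and
// validated. `deps` is injected so the sketch is runnable in isolation:
//   reap()     kill the producing TUI and lock the artifact
//   probe()    return the pane's current command as a string
//   sleep()    wait between probes
//   dispatch() start the next role
function handoff(deps, maxProbes = 50) {
  const log = ["detect"];
  deps.reap();
  log.push("reap");
  for (let i = 0; i < maxProbes; i++) {
    if (SHELL_RE.test(deps.probe())) {
      log.push("shell");
      deps.dispatch();
      log.push("dispatch");
      return log;
    }
    deps.sleep();
  }
  log.push("timeout"); // never dispatch while the producer is still alive
  return log;
}

// Demo: the pane reports "claude" twice before dropping back to zsh.
const commands = ["claude", "claude", "zsh"];
let dispatched = 0;
const noop = () => {};
const okRun = handoff({
  reap: noop,
  probe: () => commands.shift(),
  sleep: noop,
  dispatch: () => { dispatched += 1; },
});
// Demo: a pane that never returns to a shell must not trigger a dispatch.
const stuckRun = handoff({
  reap: noop,
  probe: () => "codex",
  sleep: noop,
  dispatch: () => { dispatched += 1; },
}, 3);
console.log(okRun, stuckRun, dispatched);
```

In a real runner the probe could wrap something like `tmux display-message -p -t <pane> '#{pane_current_command}'`; how the actual implementations obtain the value is not shown in this document.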
|
|
19
|
+
|
|
20
|
+
---
|
|
21
|
+
|
|
22
|
+
## 2. omc-teams Comparison (mailbox dir, daemon-backed CLI)
|
|
23
|
+
|
|
24
|
+
[omc-teams](https://github.com/oh-my-claudecode) delivers multi-agent coordination over a **daemon-backed CLI** (`omc team api ...`). Producers append to a per-team mailbox directory; consumers tail it. The reliability contract is enforced by the daemon process, not by file polling.
|
|
25
|
+
|
|
26
|
+
**What omc-teams gives you:**
|
|
27
|
+
|
|
28
|
+
- Crash-safe append-only message log (no truncated JSON window).
|
|
29
|
+
- Per-team subscription with backpressure.
|
|
30
|
+
- Cross-process delivery guarantees (daemon survives subprocess restart).
|
|
31
|
+
|
|
32
|
+
**What's load-bearing in the reliability gain — and what's not:**
|
|
33
|
+
|
|
34
|
+
The reliability gain is the **daemon**, not the mailbox dir. A bare file-mailbox (without daemon) inherits the same partial-write and self-review failure modes that rlp-desk's sentinel path already guards against, plus a new failure mode: a Worker prompt that misbehaves and dumps multiple JSON files into the mailbox (no single-writer invariant). Architect findings recorded in ralplan iter 6:
|
|
35
|
+
|
|
36
|
+
> Mailbox-dir without a daemon = same polling reliability as the sentinel approach + worker-prompt failure-mode increase. Adopting it as an intermediate step is strictly worse than the current contract.
|
|
37
|
+
|
|
38
|
+
So if rlp-desk wants the actual omc-teams reliability profile, it must adopt the **daemon**, not just the directory layout. That is the `Track B` work, not a sentinel rewrite.
|
|
39
|
+
|
|
40
|
+
---
|
|
41
|
+
|
|
42
|
+
## 3. Claude Code `/team` Comparison (in-process TeamCreate + SendMessage)
|
|
43
|
+
|
|
44
|
+
The Claude Code SDK exposes `TeamCreate` + `SendMessage` for in-process subagent coordination. This is fundamentally different:
|
|
45
|
+
|
|
46
|
+
| Property | rlp-desk sentinel | claude `/team` |
|
|
47
|
+
|---|---|---|
|
|
48
|
+
| Process model | Standalone tmux runner | Single-process subagent tree |
|
|
49
|
+
| IPC channel | Filesystem | In-memory message bus |
|
|
50
|
+
| Failure mode | Pane death, partial write | Subagent throw |
|
|
51
|
+
| Lifetime | Survives leader exit | Dies with parent |
|
|
52
|
+
|
|
53
|
+
`/team` is **not applicable** to a standalone tmux runner. rlp-desk explicitly supports the use case where the Leader can crash, the user can detach the tmux session, and a fresh Leader process can resume against the on-disk sentinel state. `/team` cannot be paused, snapshotted, or resumed across processes — by design.
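To make the resume property concrete, here is a minimal sketch of how a fresh Leader process might decide its next step purely from files on disk. The rule itself is an assumption: `resumeAction` is a hypothetical function, `verdict.json` and the flat directory layout are invented for the example, and only `iter-signal.json`, `blocked.md`, and `complete.md` are names that appear in this document.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical resume rule: terminal sentinels win, then the most advanced
// artifact present decides which role runs next.
function resumeAction(stateDir) {
  const has = (name) => fs.existsSync(path.join(stateDir, name));
  if (has("complete.md")) return "stop:complete";
  if (has("blocked.md")) return "stop:blocked";
  if (has("verdict.json")) return "start-next-iteration"; // assumed file name
  if (has("iter-signal.json")) return "dispatch-verifier";
  return "dispatch-worker";
}

// Demo: the same directory is inspected by three "fresh" leaders over time.
const stateDir = fs.mkdtempSync(path.join(os.tmpdir(), "resume-sketch-"));
const fresh = resumeAction(stateDir);
fs.writeFileSync(path.join(stateDir, "iter-signal.json"), '{"status":"verify"}');
const afterWorker = resumeAction(stateDir);
fs.writeFileSync(path.join(stateDir, "blocked.md"), "worker-dead\n");
const afterBlock = resumeAction(stateDir);
console.log({ fresh, afterWorker, afterBlock });
```

No state lives in the deciding process, so the Leader can crash between any two calls without changing the outcome. An in-memory message bus cannot offer that.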
|
|
54
|
+
|
|
55
|
+
---
|
|
56
|
+
|
|
57
|
+
## 4. Why rlp-desk does NOT adopt mailbox-dir
|
|
58
|
+
|
|
59
|
+
Architect/Critic codex consensus iter 6 rejected swapping the sentinel contract for a mailbox-dir for three concrete reasons:
|
|
60
|
+
|
|
61
|
+
1. **No reliability gain without the daemon.** Section 2 above. The daemon is the load-bearing piece; the directory is a side-effect of the daemon's protocol.
|
|
62
|
+
2. **Increased Worker-prompt failure surface.** Today the Worker is held to a single-writer contract: it MUST write `iter-signal.json` exactly once. A mailbox flips this to "append any number of messages and the daemon picks the latest" — a much weaker prompt-side invariant that empirically breaks under the kind of multi-pass self-review failures that Bug #7 was created to fix.
|
|
63
|
+
3. **Migration cost without commensurate benefit.** Two implementations (Node + zsh), Self-Verification Gate matrix (LOW/MEDIUM/CRITICAL × `--mode tmux/agent`), backwards compatibility for in-flight campaigns, and downstream wrapper tools (analytics, blueprints, Test Spec) all assume the sentinel contract. Replacing it is a multi-PR migration with no incremental win until the daemon ships.
|
|
64
|
+
|
|
65
|
+
The bug-fix track (Bug #6 worker-dead, Bug #7 post-sentinel-race, Bug #8 refuse-synthesize) closes the actual reliability gaps inside the sentinel contract and is strictly cheaper than the mailbox migration.
|
|
66
|
+
|
|
67
|
+
---
|
|
68
|
+
|
|
69
|
+
## 5. Track B Roadmap — daemon-backed `rlp-desk team api`
|
|
70
|
+
|
|
71
|
+
When the project is ready to adopt the omc-teams reliability profile, the migration looks like this:
|
|
72
|
+
|
|
73
|
+
**Track B — Phase 1 (PoC, separate ralplan):**
|
|
74
|
+
- New CLI: `rlp-desk team api start|stop|status|send|recv`
|
|
75
|
+
- Daemon process (`rlp-desk-teamd`) owns a per-campaign mailbox under `~/.rlp-desk/team/{slug}/`.
|
|
76
|
+
- Leader and Workers route through the CLI; no direct file polling.
|
|
77
|
+
- File-system fallback retained for the migration window — daemon down ⇒ degrade to sentinel mode.
|
|
78
|
+
|
|
79
|
+
**Track B — Phase 2 (cutover):**
|
|
80
|
+
- Sentinel reads behind a feature flag (`RLP_TEAM_API=1`).
|
|
81
|
+
- Self-Verification Gate matrix extended: each scenario runs once per backend (sentinel + team-api).
|
|
82
|
+
- Wrapper tools (analytics, blueprints) updated to consume the new event stream.
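One way to read the Phase 1 fallback together with the Phase 2 flag is as a single backend-selection rule. The sketch below is an interpretation, not a specification: `selectBackend` is hypothetical, `daemonAlive` stands in for whatever health check a real daemon would expose, and treating `RLP_TEAM_API=1` as "opt in to team-api" is an assumption.

```javascript
// Hypothetical backend selection for the migration window.
function selectBackend(env, daemonAlive) {
  if (env.RLP_TEAM_API !== "1") return "sentinel"; // flag off: current contract
  return daemonAlive ? "team-api" : "sentinel";    // daemon down: degrade
}

const choices = [
  selectBackend({}, true),                     // flag unset
  selectBackend({ RLP_TEAM_API: "1" }, true),  // flag set, daemon healthy
  selectBackend({ RLP_TEAM_API: "1" }, false), // flag set, daemon down
];
console.log(choices);
```

Keeping the rule a pure function of the environment and a health probe would let the Self-Verification Gate run each scenario once per backend by varying only these two inputs.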
|
|
83
|
+
|
|
84
|
+
**Track B — Phase 3 (deprecation):**
|
|
85
|
+
- Sentinel path removed from runtime once team-api has burned in for ≥1 release.
|
|
86
|
+
- Documentation rolled forward; `signal-protocol-v1` archived.
|
|
87
|
+
|
|
88
|
+
Dependencies:
|
|
89
|
+
- Daemon implementation (~600 LoC Node, drawing on Bun's IPC primitives or plain `node:net`).
|
|
90
|
+
- Integration test harness for daemon crash recovery.
|
|
91
|
+
- Self-Verification Gate parity matrix (Node × zsh × team-api).
|
|
92
|
+
|
|
93
|
+
This track is **explicitly out of scope** for the Bug #6/#7/#8 plan v6. It is captured here so future maintainers do not interpret "rlp-desk does not use a mailbox" as an oversight — it is a deliberate architectural decision with a known successor path.
|
package/install.sh
CHANGED
|
@@ -115,6 +115,8 @@ fetch "$REPO_URL/docs/rlp-desk/getting-started.md" "$DESK_DIR/docs/rlp-desk/gett
|
|
|
115
115
|
fetch "$REPO_URL/docs/rlp-desk/protocol-reference.md" "$DESK_DIR/docs/rlp-desk/protocol-reference.md"
|
|
116
116
|
fetch "$REPO_URL/docs/rlp-desk/TODO-verification-next.md" "$DESK_DIR/docs/rlp-desk/TODO-verification-next.md"
|
|
117
117
|
fetch "$REPO_URL/docs/rlp-desk/multi-mission-orchestration.md" "$DESK_DIR/docs/rlp-desk/multi-mission-orchestration.md"
|
|
118
|
+
# Plan v6 PR-0a: signal protocol documentation
|
|
119
|
+
fetch "$REPO_URL/docs/rlp-desk/signal-protocol.md" "$DESK_DIR/docs/rlp-desk/signal-protocol.md"
|
|
118
120
|
# Dev meta docs (v5.7 §4.15: under docs/rlp-desk/ to avoid mixing with user docs)
|
|
119
121
|
fetch "$REPO_URL/docs/rlp-desk/internal/verification-policy-gap-analysis.md" "$DESK_DIR/docs/rlp-desk/internal/verification-policy-gap-analysis.md"
|
|
120
122
|
fetch "$REPO_URL/docs/rlp-desk/internal/verification-strategy-research.md" "$DESK_DIR/docs/rlp-desk/internal/verification-strategy-research.md"
|