@ai-dev-methodologies/rlp-desk 0.14.5 → 0.15.0
- package/docs/plans/polished-gliding-toucan.md +234 -0
- package/docs/rlp-desk/signal-protocol.md +93 -0
- package/install.sh +2 -0
- package/package.json +1 -1
- package/scripts/postinstall.js +2 -0
- package/src/commands/rlp-desk.md +8 -0
- package/src/node/cli/command-builder.mjs +8 -6
- package/src/node/constants.mjs +13 -11
- package/src/node/runner/campaign-main-loop.mjs +222 -7
- package/src/node/shared/fs.mjs +83 -0
- package/src/node/tmux/pane-manager.mjs +39 -0
- package/src/scripts/lib_ralph_desk.zsh +75 -5
- package/src/scripts/run_ralph_desk.zsh +147 -25
|
@@ -0,0 +1,234 @@
|
|
|
1
|
+
# Bug Report #7 — Post-Sentinel Process Race Fix
|
|
2
|
+
|
|
3
|
+
## Context
|
|
4
|
+
|
|
5
|
+
Race window measured by a BOS user during the 19th launch:
|
|
6
|
+
- The iter-1 verifier rewrote `verify-verdict.json` **1m 43s** after the verdict was detected (evidence: file mtime).
|
|
7
|
+
- The iter-1 verifier remained active for **2m 1s** after its verdict.
|
|
8
|
+
- The iter-1 verifier and the iter-2 worker ran concurrently for about **2 minutes**.
|
|
9
|
+
|
|
10
|
+
Bug report:
|
|
11
|
+
`/Users/kyjin/dev/doul/bos/docs/exec-plans/active/2026-05-06-rlp-desk-bug-report-7-post-sentinel-process-race.md`
|
|
12
|
+
|
|
13
|
+
### Root cause
|
|
14
|
+
|
|
15
|
+
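The Leader advances to the next iteration as soon as it finds `iter-signal.json` / `verify-verdict.json`, but the Worker/Verifier process (claude/codex TUI) that wrote the sentinel is **never explicitly terminated**. The tmux pane stays alive, and the TUI returns to its idle prompt and then runs its own self-review, which rewrites the sentinel, pollutes the working tree, and wastes tokens.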
|
|
16
|
+
|
|
17
|
+
### Affected modes (important)
|
|
18
|
+
|
|
19
|
+
**Both** `--mode tmux` (zsh runner) and `--mode agent` (Node leader) are affected. The Node leader also runs the worker/verifier on real tmux panes via `defaultSendKeys`/`defaultCreatePane` (`src/node/tmux/pane-manager.mjs`); see `src/node/runner/campaign-main-loop.mjs:1077-1080`, `1116-1133`. The initial hypothesis that agent mode is immune was incorrect.
|
|
20
|
+
|
|
21
|
+
### Asymmetry (current state)
|
|
22
|
+
|
|
23
|
+
| Path | Worker post-processing | Verifier post-processing |
|
|
24
|
+
|---|---|---|
|
|
25
|
+
| Node leader | None | None |
|
|
26
|
+
| zsh runner | Cleanup at the start of the next iter (`run_ralph_desk.zsh:2948-2956`); race window 5s+ | Cleanup just before dispatch (`3160-3180`); protected within the same iter, but not after the final iter ends or against cross-iter races |
|
|
27
|
+
|
|
28
|
+
---
|
|
29
|
+
|
|
30
|
+
## Approach (Fix-Q + Fix-R, the minimal surgical combination)
|
|
31
|
+
|
|
32
|
+
| Fix | Effect | Adopted |
|
|
33
|
+
|---|---|---|
|
|
34
|
+
| **Q** Send Ctrl+C to the producing pane as soon as the sentinel is detected, terminating the process | Closes the race directly within ~1 second | **YES (primary)** |
|
|
35
|
+
| **R** chmod the sentinel file to 0444 to block rewrites | Freezes the mtime even if Q is late or fails | **YES (defense-in-depth)** |
|
|
36
|
+
| S Full refactor of the pane lifecycle | Effective, but the change surface is too large. Partly covered by the existing prep cleanup (zsh 2948-2956). Violates Karpathy's "surgical changes" principle | NO |
|
|
37
|
+
| T 30s post-sentinel safety-net timeout | Redundant: Q is fail-open and the next iter's prep cleanup acts as backup | NO |
|
|
38
|
+
|
|
39
|
+
Rationale:
|
|
40
|
+
- Q kills the producer within ~1 second, removing the root cause. It mirrors the existing pattern exactly (zsh `run_ralph_desk.zsh:2384-2397`: double Ctrl+C plus `wait_for_pane_ready`).
|
|
41
|
+
- R tolerates chmod failures (EPERM/ENOTSUP are ignored, following the `tryLockFile` precedent at `scripts/postinstall.js:104`). It degrades gracefully on filesystems where chmod is a no-op, such as WSL1/NTFS/tmpfs.
|
|
42
|
+
- Dropping S and T keeps the review surface minimal.
|
|
43
|
+
|
|
44
|
+
---
|
|
45
|
+
|
|
46
|
+
## Concrete code changes
|
|
47
|
+
|
|
48
|
+
### Node leader
|
|
49
|
+
|
|
50
|
+
#### 1. `src/node/tmux/pane-manager.mjs`: add helpers (after line 77)
|
|
51
|
+
|
|
52
|
+
New exports:
|
|
53
|
+
- `sendRawKey(paneId, key)`: runs `runTmux(['send-keys', '-t', paneId, key])`. Kept separate from `sendKeys` (which sends literal text via `-l --`); intended for raw keys such as C-c.
|
|
54
|
+
- `killPaneProcess(paneId, { sendRawKey, waitForExit, gracePeriodMs=800, exitTimeoutMs=5000, log })`:
|
|
55
|
+
1. `sendRawKey('C-c')` → `await sleep(gracePeriodMs)` → `sendRawKey('C-c')` (double press, mirroring zsh `375-376`).
|
|
56
|
+
2. `await waitForExit(paneId, { timeoutMs: exitTimeoutMs }).catch(log)` — fail-open.
|
|
57
|
+
3. A TmuxError thrown while sending the raw key is also caught and logged (safe against an already-dead pane).
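The three steps above can be sketched as follows. This is a minimal illustration, not the shipped helper: the `sleep` utility, the default no-op `log`, and the choice to still wait for exit after a failed key send are assumptions. The fakes at the bottom stand in for real tmux calls.

```javascript
// Hypothetical sketch of the double Ctrl+C sequence with fail-open handling.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function killPaneProcess(paneId, {
  sendRawKey,
  waitForExit,
  gracePeriodMs = 800,
  exitTimeoutMs = 5000,
  log = () => {},
} = {}) {
  try {
    await sendRawKey(paneId, 'C-c'); // first interrupt
    await sleep(gracePeriodMs);      // give the TUI time to react
    await sendRawKey(paneId, 'C-c'); // second press (double Ctrl+C)
  } catch (err) {
    log(err);                        // the pane may already be dead
  }
  try {
    await waitForExit(paneId, { timeoutMs: exitTimeoutMs });
  } catch (err) {
    log(err);                        // fail-open: never reject to the caller
  }
}

// Usage with recording fakes: waitForExit fails, yet the helper resolves.
const calls = [];
const done = killPaneProcess('%1', {
  sendRawKey: async (pane, key) => { calls.push(`key:${key}`); },
  waitForExit: async () => { calls.push('wait'); throw new Error('timeout'); },
  gracePeriodMs: 0,
  log: () => calls.push('log'),
});
```

Injecting `sendRawKey` and `waitForExit` keeps the helper testable without a tmux server, which is the same reason the plan routes them through DI slots.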
|
|
58
|
+
|
|
59
|
+
The existing `waitForProcessExit` (line 55) is reused as-is.
|
|
60
|
+
|
|
61
|
+
#### 2. `src/node/shared/fs.mjs`: add helpers (after line 61)
|
|
62
|
+
|
|
63
|
+
- `lockSentinelFile(filePath, { log })`: calls `fs.chmod(filePath, 0o444)` and logs a warning only once on error. Mirrors the `tryLockFile` precedent (`scripts/postinstall.js:104`).
|
|
64
|
+
- `unlockSentinelFile(filePath)`: calls `fs.chmod(filePath, 0o644)` and ignores failures. Called just before iter cleanup.
|
|
65
|
+
|
|
66
|
+
#### 3. `src/node/runner/campaign-main-loop.mjs` — wire + call sites
|
|
67
|
+
|
|
68
|
+
Add DI slots (lines 1077-1080):
|
|
69
|
+
```
|
|
70
|
+
const sendRawKey = options.sendRawKey ?? defaultSendRawKey;
|
|
71
|
+
const waitForProcessExit = options.waitForProcessExit ?? defaultWaitForProcessExit;
|
|
72
|
+
const killPaneProcess = options.killPaneProcess ?? defaultKillPaneProcess;
|
|
73
|
+
const lockSentinel = options.lockSentinelFile ?? lockSentinelFile;
|
|
74
|
+
```
|
|
75
|
+
|
|
76
|
+
Internal wrapper:
|
|
77
|
+
```
|
|
78
|
+
async function reapProducer(paneId, sentinelFile) {
|
|
79
|
+
await killPaneProcess(paneId, { sendRawKey, waitForExit: waitForProcessExit, log: console.error });
|
|
80
|
+
if (sentinelFile) await lockSentinel(sentinelFile, { log: console.error });
|
|
81
|
+
}
|
|
82
|
+
```
|
|
83
|
+
|
|
84
|
+
Call sites (immediately after a successful poll and a passing `validateArtifact`):
|
|
85
|
+
|
|
86
|
+
| Site | Line | Call |
|
|
87
|
+
|---|---|---|
|
|
88
|
+
| Flywheel poll | After 1267-1277 (before 1285) | `reapProducer(state.flywheel_pane_id ?? state.verifier_pane_id, paths.flywheelSignalFile)` |
|
|
89
|
+
| Guard poll | After 1305-1315 (before 1323) | `reapProducer(guardPaneId, paths.flywheelGuardVerdictFile)` |
|
|
90
|
+
| Worker poll | After 1422-1432 (before 1456) | `reapProducer(state.worker_pane_id, paths.signalFile)` |
|
|
91
|
+
| Verifier poll | After 1489-1513 (before 1522) | `reapProducer(state.verifier_pane_id, paths.verdictFile)` |
|
|
92
|
+
| Final per-US verifier (`runFinalSequentialVerify`) | After 890-894 (before 896) | `reapProducer(verifierPaneId, paths.verdictFile)`; add `reapProducer` to the `runFinalSequentialVerify` signature and pass it from the caller (1185-1194) |
|
|
93
|
+
|
|
94
|
+
Iter cleanup unlock: call `unlockSentinelFile` immediately before each `fs.unlink(...)` call:
|
|
95
|
+
- L1291 (`flywheelSignalFile`)
|
|
96
|
+
- L1328 (`flywheelGuardVerdictFile`)
|
|
97
|
+
- Top of the loop (right after 1145): defensively unlock the Worker `signalFile` / Verifier `verdictFile` (in case the next iter's producer overwrites them via atomic rename)
|
|
98
|
+
|
|
99
|
+
### zsh runner
|
|
100
|
+
|
|
101
|
+
#### 4. `src/scripts/lib_ralph_desk.zsh`: add helpers (following `atomic_write`, after line 245)
|
|
102
|
+
|
|
103
|
+
```
|
|
104
|
+
_kill_pane_process() {
|
|
105
|
+
local pane_id="$1" role="${2:-producer}"
|
|
106
|
+
log_debug "[bug7] kill_pane_process pane=$pane_id role=$role"
|
|
107
|
+
tmux send-keys -t "$pane_id" C-c 2>/dev/null
|
|
108
|
+
sleep 0.5
|
|
109
|
+
tmux send-keys -t "$pane_id" C-c 2>/dev/null
|
|
110
|
+
sleep 1
|
|
111
|
+
wait_for_pane_ready "$pane_id" 5 2>/dev/null || true
|
|
112
|
+
}
|
|
113
|
+
|
|
114
|
+
_lock_sentinel() {
|
|
115
|
+
local file="$1"
|
|
116
|
+
[[ -f "$file" ]] || return 0
|
|
117
|
+
chmod 0444 "$file" 2>/dev/null || true
|
|
118
|
+
}
|
|
119
|
+
|
|
120
|
+
_unlock_sentinel() {
|
|
121
|
+
local file="$1"
|
|
122
|
+
[[ -f "$file" ]] || return 0
|
|
123
|
+
chmod 0644 "$file" 2>/dev/null || true
|
|
124
|
+
}
|
|
125
|
+
```
|
|
126
|
+
|
|
127
|
+
#### 5. `src/scripts/run_ralph_desk.zsh` — call sites
|
|
128
|
+
|
|
129
|
+
| Site | Line | Call |
|
|
130
|
+
|---|---|---|
|
|
131
|
+
| Right after Worker poll success | 3003 (inside the `worker_poll_done=1` branch, after `log_debug`) | `_kill_pane_process "$WORKER_PANE" "worker"; _lock_sentinel "$SIGNAL_FILE"` |
|
|
132
|
+
| Right after Verifier poll success (main path) | After passing 3202, before 3215 (`ITER_VERIFIER_END`) | `_kill_pane_process "$VERIFIER_PANE" "verifier"; _lock_sentinel "$VERDICT_FILE"` |
|
|
133
|
+
| Final-verify per-US (`run_sequential_final_verify`) | After passing 2524, before entering the next iter | `_kill_pane_process "$VERIFIER_PANE" "verifier-final"; _lock_sentinel "$VERDICT_FILE"` |
|
|
134
|
+
| Codex grace path | `dispatch_verifier_per_us` (right after the grace period ends at 2420, before the `cp` at 2471) | `_kill_pane_process "$VERIFIER_PANE" "verifier-${suffix}"; _lock_sentinel "$VERDICT_FILE"` |
|
|
135
|
+
| Consensus path | Right after each successful `poll_for_signal` inside `run_consensus_verification` | Same pattern |
|
|
136
|
+
|
|
137
|
+
Prep cleanup unlock, immediately before the cleanup at lines 2948-2956:
|
|
138
|
+
```
|
|
139
|
+
_unlock_sentinel "$SIGNAL_FILE"; _unlock_sentinel "$VERDICT_FILE"
|
|
140
|
+
rm -f "$SIGNAL_FILE" "$DONE_CLAIM_FILE" "$VERDICT_FILE" 2>/dev/null
|
|
141
|
+
```
|
|
142
|
+
|
|
143
|
+
---
|
|
144
|
+
|
|
145
|
+
## Files to modify
|
|
146
|
+
|
|
147
|
+
| File | Change |
|
|
148
|
+
|---|---|
|
|
149
|
+
| `src/node/tmux/pane-manager.mjs` | Add `sendRawKey`, `killPaneProcess` exports |
|
|
150
|
+
| `src/node/shared/fs.mjs` | Add `lockSentinelFile`, `unlockSentinelFile` exports |
|
|
151
|
+
| `src/node/runner/campaign-main-loop.mjs` | DI + `reapProducer` + 5 call sites + iter cleanup unlock |
|
|
152
|
+
| `src/scripts/lib_ralph_desk.zsh` | Add `_kill_pane_process`, `_lock_sentinel`, `_unlock_sentinel` |
|
|
153
|
+
| `src/scripts/run_ralph_desk.zsh` | 4-5 call sites + prep cleanup unlock |
|
|
154
|
+
| `tests/node/us006-campaign-main-loop.test.mjs` | Add `killPaneProcess`/`lockSentinelFile` recorders to `createTmuxFakes()` + 3 Bug-7 tests |
|
|
155
|
+
| `tests/node/test-kill-pane-process.test.mjs` | NEW: helper unit tests |
|
|
156
|
+
| `tests/node/test-lock-sentinel-file.test.mjs` | NEW: chmod unit tests |
|
|
157
|
+
| `tests/test-bug7-post-sentinel-race.sh` | NEW: integration test against real tmux (mirrors the Bug #6 pattern) |
|
|
158
|
+
|
|
159
|
+
Ship as a single PR (the helpers are no-ops without call sites, so the review surface stays small).
|
|
160
|
+
|
|
161
|
+
---
|
|
162
|
+
|
|
163
|
+
## Reused functions (reference)
|
|
164
|
+
|
|
165
|
+
- Node: `pane-manager.mjs:50` `sendKeys`, `pane-manager.mjs:55` `waitForProcessExit` (5s timeout, shell detection)
|
|
166
|
+
- Node: `shared/fs.mjs:6-23` `writeFileAtomic`, `42-61` `writeSentinelExclusive`
|
|
167
|
+
- Node: `scripts/postinstall.js:104` `tryLockFile` (chmod 0o444 precedent)
|
|
168
|
+
- zsh: `lib_ralph_desk.zsh:240-245` `atomic_write`, `1075-1137` `wait_for_pane_ready`
|
|
169
|
+
- zsh: `run_ralph_desk.zsh:2384-2397` proven verifier-cleanup pattern (Ctrl+C + /exit + wait); `375-376/529-530` double Ctrl+C pattern
|
|
170
|
+
|
|
171
|
+
---
|
|
172
|
+
|
|
173
|
+
## Testing strategy
|
|
174
|
+
|
|
175
|
+
### Unit tests (Node)
|
|
176
|
+
|
|
177
|
+
`tests/node/test-kill-pane-process.test.mjs` (NEW):
|
|
178
|
+
- AC1 happy path: the order is C-c → sleep → C-c → waitForExit (verified with a fake recorder).
|
|
179
|
+
- AC2 fail-open: the helper still resolves when `waitForExit` throws a TmuxError.
|
|
180
|
+
- AC3 dead-pane: the helper resolves when `sendRawKey` throws.
|
|
181
|
+
- AC4 grace: `gracePeriodMs` is honored (verified with a fake clock or a tolerance check).
|
|
182
|
+
|
|
183
|
+
`tests/node/test-lock-sentinel-file.test.mjs` (NEW):
|
|
184
|
+
- AC1: after lock, `(mode & 0o222) === 0` (skipped on filesystems that ignore chmod).
|
|
185
|
+
- AC2: locking a nonexistent path does not throw.
|
|
186
|
+
- AC3: the file is writable after unlock.
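The mode assertion in AC1 can be sketched as a small predicate. The name `isWriteLocked` and the sample mode values are assumptions for illustration. One detail worth noting: in JavaScript `===` binds tighter than `&`, so the mask needs explicit parentheses.

```javascript
// Write bits for owner, group, and other: 0o200 | 0o020 | 0o002.
const WRITE_BITS = 0o222;

// Hypothetical predicate: true when no write bit is set in a stat mode.
function isWriteLocked(mode) {
  return (mode & WRITE_BITS) === 0;
}

const lockedMode = 0o100444;   // regular file, r--r--r--
const unlockedMode = 0o100644; // regular file, rw-r--r--

// Without parentheses this parses as `mode & (0o222 === 0)`, i.e. `mode & false`,
// which evaluates to 0 for every mode and therefore never passes.
const unparenthesized = lockedMode & 0o222 === 0;
```

A test would typically feed `(await fs.stat(path)).mode` into the predicate; that call is left out here to keep the sketch free of filesystem access.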
|
|
187
|
+
|
|
188
|
+
### Integration tests (Node)
|
|
189
|
+
|
|
190
|
+
Extend `tests/node/us006-campaign-main-loop.test.mjs`:
|
|
191
|
+
1. **Bug-7-A**: after Worker `pollForSignal` succeeds, verify that `killPaneProcess('%worker')` and `lockSentinelFile(signalFile)` are called, in that order, before the next `dispatchVerifier`.
|
|
192
|
+
2. **Bug-7-B**: after a Verifier verdict of pass, `killPaneProcess('%verifier')` and `lockSentinelFile(verdictFile)` are called before the next iter's `dispatchWorker`.
|
|
193
|
+
3. **Bug-7-C**: `run()` completes normally even if `killPaneProcess` throws.
|
|
194
|
+
|
|
195
|
+
Add fake `killPaneProcess`/`lockSentinelFile` recorders to `createTmuxFakes()` (line 83) so the 30+ existing tests stay compatible.
|
|
196
|
+
|
|
197
|
+
### Integration tests (zsh)
|
|
198
|
+
|
|
199
|
+
`tests/test-bug7-post-sentinel-race.sh` (NEW, mirrors the `test-bug6-worker-idle-false-positive.sh` pattern):
|
|
200
|
+
- Scenario 1: start `sleep 600` in a tmux session and call `_kill_pane_process`; `pane_current_command` returns to zsh/bash within 2s.
|
|
201
|
+
- Scenario 2: `_lock_sentinel` → verify mode 0444 → `_unlock_sentinel` → writable → `rm -f` succeeds.
|
|
202
|
+
- Scenario 3 (REAL_E2E gated): 1-iter campaign with a stub claude (writes the sentinel, then sleeps 120); after 10s the verdict file mtime delta == 0.
|
|
203
|
+
|
|
204
|
+
### Self-Verification scenarios (CLAUDE.md gate, 3 required)
|
|
205
|
+
|
|
206
|
+
Modifying `src/scripts/run_ralph_desk.zsh` is MEDIUM-HIGH risk:
|
|
207
|
+
- **LOW**: helper unit tests and the existing Node/zsh regression tests pass.
|
|
208
|
+
- **MEDIUM**: a real 1-iter campaign. Capture `pane_current_command` at the Worker → Verifier transition and verify the pane returns to a shell within 2s. Verify the verdict file mtime is frozen.
|
|
209
|
+
- **CRITICAL**: a 2-iter campaign (verify→fail→verify→pass). The iter-N+1 worker dispatch happens only after the iter-N verifier pane shows `pane_current_command == zsh`; capture timestamped logs. Run in both `--mode agent` and `--mode tmux`.
|
|
210
|
+
|
|
211
|
+
---
|
|
212
|
+
|
|
213
|
+
## Verification end-to-end
|
|
214
|
+
|
|
215
|
+
1. **Unit**: `node --test tests/node/test-kill-pane-process.test.mjs tests/node/test-lock-sentinel-file.test.mjs` passes.
|
|
216
|
+
2. **Integration (Node)**: `node --test tests/node/us006-campaign-main-loop.test.mjs` passes; the call-order assertions serve as the regression guard.
|
|
217
|
+
3. **Live tmux**: within 2s of calling `_kill_pane_process`, `tmux display-message -p -t $pane '#{pane_current_command}'` returns `zsh`/`bash`.
|
|
218
|
+
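4. **mtime freeze**: measure `stat -f %m verify-verdict.json` (BSD/macOS syntax; GNU coreutils uses `stat -c %Y`) at detection time and again at +10s; delta == 0. This directly counters the 1m43s rewrite evidence in the bug report.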
|
|
219
|
+
5. **Pane output**: the `tmux capture-pane -p` output shows no new `Worked for Xm Ys` / `esc to interrupt` markers.
|
|
220
|
+
6. **Both modes**: run the smoke test under `--mode tmux` (zsh runner) and `--mode agent` (Node leader) separately; both must return to a shell within 4 seconds.
|
|
221
|
+
7. **Repro scenario**: run one campaign under the same conditions as the 19th launch (claude opus 1m worker + gpt-5.5:high codex verifier), then compare the leader log against file mtimes; expect zero races.
|
|
222
|
+
|
|
223
|
+
---
|
|
224
|
+
|
|
225
|
+
## Risk / mitigation
|
|
226
|
+
|
|
227
|
+
| Risk | Likelihood | Mitigation |
|
|
228
|
+
|---|---|---|
|
|
229
|
+
| C-c interrupts the producer mid-write of an artifact | LOW: the sentinel is already on disk at detection time | The `MalformedArtifactError` path handles partial writes |
|
|
230
|
+
| chmod 0444 blocks the `unlink` in the next iter's cleanup | LOW | `_unlock_sentinel` / `unlockSentinelFile` run just before the unlink. On most Unix filesystems unlink is governed by directory permissions, so a 0444 file can still be unlinked |
|
|
231
|
+
| Producer rewrites the sentinel via atomic rename (bypassing chmod) | POSSIBLE | Q (kill) stops the producer within ~1s, shrinking the rewrite window from 2 minutes to 1 second. The leader has also already consumed the sentinel in-band |
|
|
232
|
+
| `killPaneProcess` throws on a dead pane | POSSIBLE | Caught inside the helper; unit tests AC2/AC3 guard against regressions |
|
|
233
|
+
| chmod 0444 is a silent no-op (WSL1/NTFS/tmpfs) | OBSERVED (postinstall.js precedent) | Warn once in the log. Degrades gracefully because Q (kill) is the primary defense |
|
|
234
|
+
| Regressions in existing us006 tests | MEDIUM | Add fake helper recorders to `createTmuxFakes()`; existing callers receive them automatically |
|
|
@@ -0,0 +1,93 @@
|
|
|
1
|
+
# Signal Protocol — current contract + alternatives
|
|
2
|
+
|
|
3
|
+
**Spec version:** `signal-protocol-v1`
|
|
4
|
+
**Source consensus:** ralplan iter 6 — Architect synthesis, Critic codex APPROVED (P0=0, P1=0)
|
|
5
|
+
**Audience:** maintainers evaluating whether to adopt mailbox-dir, daemon, or in-process IPC alternatives.
|
|
6
|
+
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
## 1. Current Contract
|
|
10
|
+
|
|
11
|
+
rlp-desk routes Worker → Verifier handoff through a **single sentinel file per role per iteration**. The contract has four invariants:
|
|
12
|
+
|
|
13
|
+
1. **Sentinel = artifact.** Every transition step (`verify`, `verdict`, `flywheel`, `flywheel-guard`) is encoded as a JSON file at a deterministic path under `.rlp-desk/memos/`. The Leader polls the path with `fs.access` + atomic JSON-parse; any partial write is rejected (`jq -e .` gate, see `tests/test-bug7-poll-partial-write.sh`).
|
|
14
|
+
2. **`reapProducer` = lifecycle.** Once the Leader accepts a sentinel (validateArtifact passes), it MUST kill the producing TUI pane and chmod-lock the file. Skipping the reap leaves a self-reviewing claude/codex pane that overwrites the artifact mid-poll (Bug #7).
|
|
15
|
+
3. **Strict ordering: detect → reap → wait shell → next dispatch.** The Leader does NOT dispatch the next role (Verifier after Worker, next-iter Worker after Verifier) until the producing pane's `pane_current_command` has returned to `zsh|bash|sh`. AC-H1 of PR-0b-narrow strengthens this with `waitForProcessExit`.
|
|
16
|
+
4. **First-writer-wins for terminal sentinels.** `blocked.md` and `complete.md` are written via `O_EXCL` (`writeSentinelExclusive`); concurrent error paths cannot trample the canonical exit reason.
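Invariant 3 (detect → reap → wait shell → next dispatch) can be read as a small gate. The sketch below is illustrative only: `paneIsBackAtShell`, `nextAction`, and the state field names are hypothetical, and the real Leader obtains `pane_current_command` from tmux rather than from a plain object.

```javascript
// Shell names taken from the `zsh|bash|sh` pattern in invariant 3.
const SHELLS = new Set(['zsh', 'bash', 'sh']);

// True when the pane's foreground command is a plain shell again.
function paneIsBackAtShell(paneCurrentCommand) {
  return SHELLS.has(String(paneCurrentCommand).trim());
}

// Hypothetical gate: returns the only step the Leader may take next.
function nextAction({ sentinelAccepted, producerReaped, paneCurrentCommand }) {
  if (!sentinelAccepted) return 'poll';
  if (!producerReaped) return 'reap';
  if (!paneIsBackAtShell(paneCurrentCommand)) return 'wait-shell';
  return 'dispatch-next';
}

const stillRunning = nextAction({
  sentinelAccepted: true, producerReaped: true, paneCurrentCommand: 'claude',
});
const readyToDispatch = nextAction({
  sentinelAccepted: true, producerReaped: true, paneCurrentCommand: 'zsh\n',
});
```

Expressing the ordering as a pure function makes it easy to assert that no dispatch can occur while the producing pane still runs a TUI.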
|
|
17
|
+
|
|
18
|
+
The same contract is implemented twice (`src/node/runner/campaign-main-loop.mjs` for `--mode agent`, `src/scripts/run_ralph_desk.zsh` for `--mode tmux`) with bit-for-bit parity on `(reason_text, reason_category, failure_category)` — verified by `tests/test-bug8-refuse-synthesis.sh` Scenario 4.
|
|
19
|
+
|
|
20
|
+
---
|
|
21
|
+
|
|
22
|
+
## 2. omc-teams Comparison (mailbox dir, daemon-backed CLI)
|
|
23
|
+
|
|
24
|
+
[omc-teams](https://github.com/oh-my-claudecode) delivers multi-agent coordination over a **daemon-backed CLI** (`omc team api ...`). Producers append to a per-team mailbox directory; consumers tail it. The reliability contract is enforced by the daemon process, not by file polling.
|
|
25
|
+
|
|
26
|
+
**What omc-teams gives you:**
|
|
27
|
+
|
|
28
|
+
- Crash-safe append-only message log (no truncated JSON window).
|
|
29
|
+
- Per-team subscription with backpressure.
|
|
30
|
+
- Cross-process delivery guarantees (daemon survives subprocess restart).
|
|
31
|
+
|
|
32
|
+
**What's load-bearing in the reliability gain — and what's not:**
|
|
33
|
+
|
|
34
|
+
The reliability gain is the **daemon**, not the mailbox dir. A bare file-mailbox (without daemon) inherits the same partial-write and self-review failure modes that rlp-desk's sentinel path already guards against, plus a new failure mode: a Worker prompt that misbehaves and dumps multiple JSON files into the mailbox (no single-writer invariant). Architect findings recorded in ralplan iter 6:
|
|
35
|
+
|
|
36
|
+
> Mailbox-dir without a daemon = same polling reliability as the sentinel approach + worker-prompt failure-mode increase. Adopting it as an intermediate step is strictly worse than the current contract.
|
|
37
|
+
|
|
38
|
+
So if rlp-desk wants the actual omc-teams reliability profile, it must adopt the **daemon**, not just the directory layout. That is the `Track B` work, not a sentinel rewrite.
|
|
39
|
+
|
|
40
|
+
---
|
|
41
|
+
|
|
42
|
+
## 3. Claude Code `/team` Comparison (in-process TeamCreate + SendMessage)
|
|
43
|
+
|
|
44
|
+
The Claude Code SDK exposes `TeamCreate` + `SendMessage` for in-process subagent coordination. This is fundamentally different:
|
|
45
|
+
|
|
46
|
+
| Property | rlp-desk sentinel | claude `/team` |
|
|
47
|
+
|---|---|---|
|
|
48
|
+
| Process model | Standalone tmux runner | Single-process subagent tree |
|
|
49
|
+
| IPC channel | Filesystem | In-memory message bus |
|
|
50
|
+
| Failure mode | Pane death, partial write | Subagent throw |
|
|
51
|
+
| Lifetime | Survives leader exit | Dies with parent |
|
|
52
|
+
|
|
53
|
+
`/team` is **not applicable** to a standalone tmux runner. rlp-desk explicitly supports the use case where the Leader can crash, the user can detach the tmux session, and a fresh Leader process can resume against the on-disk sentinel state. `/team` cannot be paused, snapshotted, or resumed across processes — by design.
|
|
54
|
+
|
|
55
|
+
---
|
|
56
|
+
|
|
57
|
+
## 4. Why rlp-desk does NOT adopt mailbox-dir
|
|
58
|
+
|
|
59
|
+
Architect/Critic codex consensus iter 6 rejected swapping the sentinel contract for a mailbox-dir for three concrete reasons:
|
|
60
|
+
|
|
61
|
+
1. **No reliability gain without the daemon.** Section 2 above. The daemon is the load-bearing piece; the directory is a side-effect of the daemon's protocol.
|
|
62
|
+
2. **Increased Worker-prompt failure surface.** Today the Worker is held to a single-writer contract: it MUST write `iter-signal.json` exactly once. A mailbox flips this to "append any number of messages and the daemon picks the latest" — a much weaker prompt-side invariant that empirically breaks under the kind of multi-pass self-review failures that Bug #7 was created to fix.
|
|
63
|
+
3. **Migration cost without commensurate benefit.** Two implementations (Node + zsh), Self-Verification Gate matrix (LOW/MEDIUM/CRITICAL × `--mode tmux/agent`), backwards compatibility for in-flight campaigns, and downstream wrapper tools (analytics, blueprints, Test Spec) all assume the sentinel contract. Replacing it is a multi-PR migration with no incremental win until the daemon ships.
|
|
64
|
+
|
|
65
|
+
The bug-fix track (Bug #6 worker-dead, Bug #7 post-sentinel-race, Bug #8 refuse-synthesize) closes the actual reliability gaps inside the sentinel contract and is strictly cheaper than the mailbox migration.
|
|
66
|
+
|
|
67
|
+
---
|
|
68
|
+
|
|
69
|
+
## 5. Track B Roadmap — daemon-backed `rlp-desk team api`
|
|
70
|
+
|
|
71
|
+
When the project is ready to adopt the omc-teams reliability profile, the migration looks like this:
|
|
72
|
+
|
|
73
|
+
**Track B — Phase 1 (PoC, separate ralplan):**
|
|
74
|
+
- New CLI: `rlp-desk team api start|stop|status|send|recv`
|
|
75
|
+
- Daemon process (`rlp-desk-teamd`) owns a per-campaign mailbox under `~/.rlp-desk/team/{slug}/`.
|
|
76
|
+
- Leader and Workers route through the CLI; no direct file polling.
|
|
77
|
+
- File-system fallback retained for the migration window — daemon down ⇒ degrade to sentinel mode.
|
|
78
|
+
|
|
79
|
+
**Track B — Phase 2 (cutover):**
|
|
80
|
+
- Sentinel reads behind a feature flag (`RLP_TEAM_API=1`).
|
|
81
|
+
- Self-Verification Gate matrix extended: each scenario runs once per backend (sentinel + team-api).
|
|
82
|
+
- Wrapper tools (analytics, blueprints) updated to consume the new event stream.
|
|
83
|
+
|
|
84
|
+
**Track B — Phase 3 (deprecation):**
|
|
85
|
+
- Sentinel path removed from runtime once team-api has burned in for ≥1 release.
|
|
86
|
+
- Documentation rolled forward; `signal-protocol-v1` archived.
|
|
87
|
+
|
|
88
|
+
Dependencies:
|
|
89
|
+
- Daemon implementation (~600 LoC Node, drawing on Bun's IPC primitives or plain `node:net`).
|
|
90
|
+
- Integration test harness for daemon crash recovery.
|
|
91
|
+
- Self-Verification Gate parity matrix (Node × zsh × team-api).
|
|
92
|
+
|
|
93
|
+
This track is **explicitly out of scope** for the Bug #6/#7/#8 plan v6. It is captured here so future maintainers do not interpret "rlp-desk does not use a mailbox" as an oversight — it is a deliberate architectural decision with a known successor path.
|
package/install.sh
CHANGED
|
@@ -115,6 +115,8 @@ fetch "$REPO_URL/docs/rlp-desk/getting-started.md" "$DESK_DIR/docs/rlp-desk/gett
|
|
|
115
115
|
fetch "$REPO_URL/docs/rlp-desk/protocol-reference.md" "$DESK_DIR/docs/rlp-desk/protocol-reference.md"
|
|
116
116
|
fetch "$REPO_URL/docs/rlp-desk/TODO-verification-next.md" "$DESK_DIR/docs/rlp-desk/TODO-verification-next.md"
|
|
117
117
|
fetch "$REPO_URL/docs/rlp-desk/multi-mission-orchestration.md" "$DESK_DIR/docs/rlp-desk/multi-mission-orchestration.md"
|
|
118
|
+
# Plan v6 PR-0a: signal protocol documentation
|
|
119
|
+
fetch "$REPO_URL/docs/rlp-desk/signal-protocol.md" "$DESK_DIR/docs/rlp-desk/signal-protocol.md"
|
|
118
120
|
# Dev meta docs (v5.7 §4.15: under docs/rlp-desk/ to avoid mixing with user docs)
|
|
119
121
|
fetch "$REPO_URL/docs/rlp-desk/internal/verification-policy-gap-analysis.md" "$DESK_DIR/docs/rlp-desk/internal/verification-policy-gap-analysis.md"
|
|
120
122
|
fetch "$REPO_URL/docs/rlp-desk/internal/verification-strategy-research.md" "$DESK_DIR/docs/rlp-desk/internal/verification-strategy-research.md"
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@ai-dev-methodologies/rlp-desk",
|
|
3
|
-
"version": "0.
|
|
3
|
+
"version": "0.15.0",
|
|
4
4
|
"description": "Fresh-context iterative loops for Claude Code — autonomous task completion with independent verification",
|
|
5
5
|
"scripts": {
|
|
6
6
|
"postinstall": "node scripts/postinstall.js",
|
package/scripts/postinstall.js
CHANGED
|
@@ -33,6 +33,8 @@ const runtimeSources = [
|
|
|
33
33
|
["docs/rlp-desk/protocol-reference.md", path.join(docsDir, "rlp-desk", "protocol-reference.md")],
|
|
34
34
|
["docs/rlp-desk/TODO-verification-next.md", path.join(docsDir, "rlp-desk", "TODO-verification-next.md")],
|
|
35
35
|
["docs/rlp-desk/multi-mission-orchestration.md", path.join(docsDir, "rlp-desk", "multi-mission-orchestration.md")],
|
|
36
|
+
// Plan v6 PR-0a: signal protocol documentation (Architect/Critic codex iter 6).
|
|
37
|
+
["docs/rlp-desk/signal-protocol.md", path.join(docsDir, "rlp-desk", "signal-protocol.md")],
|
|
36
38
|
];
|
|
37
39
|
// v0.14.0: legacy-deletion list cleared. The Node-canonical era (v5.7+)
|
|
38
40
|
// removed zsh after install; v0.14.0 reverts that — the zsh runner is the
|
package/src/commands/rlp-desk.md
CHANGED
|
@@ -89,6 +89,14 @@ Ask about these items one by one (or in small groups):
|
|
|
89
89
|
- **gpt-5.5:medium** — default recommendation (full context window, progressive upgrade handles harder US)
|
|
90
90
|
- **spark:high** — only when US is small enough for spark's 100k context (single-file, AC count <= 4, simple logic). Do NOT use as primary recommendation — spark context window is too small for most tasks
|
|
91
91
|
|
|
92
|
+
**Context window behavior (claude models — v0.14.6+)**:
|
|
93
|
+
- All claude models default to **200K**. `sonnet` and `opus` aliases both run at the standard window.
|
|
94
|
+
- To request 1M, append the explicit `[1m]` suffix on the full model id:
|
|
95
|
+
- `claude-opus-4-7[1m]` — 1M attempted via `ANTHROPIC_BETA=context-1m-2025-08-07`. Works on most Claude Max accounts.
|
|
96
|
+
- `claude-sonnet-4-6[1m]` — 1M attempted, **but** requires the Anthropic "Extra usage" toggle at https://claude.ai/settings/usage. Without that toggle the worker fails at the first API call with `Extra usage is required for 1M context`.
|
|
97
|
+
- rlp-desk does NOT pre-check entitlement — the explicit `[1m]` is honored as-is. If the API rejects it, you will see the error immediately and can re-run with the standard alias or the opus 1M form.
|
|
98
|
+
- **Default recommendation when 1M is genuinely needed:** prefer `claude-opus-4-7[1m]` over `claude-sonnet-4-6[1m]` because opus 1M does not require a separate entitlement toggle.
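The opt-in rule above can be sketched as follows. The suffix check and the beta header value follow this release's `constants.mjs`; `envPrefixFor` is a hypothetical helper added for illustration, and it omits the shell quoting that the real command builder applies.

```javascript
// Beta header value used for the 1M-token context window in this release.
const ONE_MILLION_BETA = 'context-1m-2025-08-07';

// 1M context is requested only when the model id ends with '[1m]'.
function wantsOneMillionContext(model) {
  if (!model) return false;
  return String(model).toLowerCase().endsWith('[1m]');
}

// Hypothetical helper: env assignments a launcher would prepend to the command.
function envPrefixFor(model) {
  const parts = ['DISABLE_OMC=1'];
  if (wantsOneMillionContext(model)) {
    parts.push(`ANTHROPIC_BETA=${ONE_MILLION_BETA}`);
  }
  return parts;
}

const standard = envPrefixFor('opus');               // alias: no beta header
const optedIn = envPrefixFor('claude-opus-4-7[1m]'); // explicit suffix: header added
```

Because the check is purely syntactic, an account without the required entitlement still gets the header and fails at the first API call, as described above.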
|
|
99
|
+
|
|
92
100
|
Present complexity score with evidence to the user, e.g.: "I rate this MEDIUM because: US count=4 (MEDIUM), file scope=2 (MEDIUM), logic=conditionals (MEDIUM), deps=none (LOW), impact=modify (MEDIUM). Highest=MEDIUM."
|
|
93
101
|
|
|
94
102
|
**If codex IS installed** — say: "Codex is installed. I recommend cross-engine Worker for cost savings (Pro token pool separation) and cross-engine blind-spot coverage (claude Verifier catches issues codex Worker misses)."
|
|
@@ -1,5 +1,5 @@
|
|
|
1
1
|
import { shellQuote } from '../util/shell-quote.mjs';
|
|
2
|
-
import {
|
|
2
|
+
import { ONE_MILLION_BETA, wantsOneMillionContext } from '../constants.mjs';
|
|
3
3
|
|
|
4
4
|
const CLAUDE_BIN = 'claude';
|
|
5
5
|
const CODEX_BIN = 'codex';
|
|
@@ -32,12 +32,14 @@ function assertTuiMode(mode, builderName) {
|
|
|
32
32
|
export function buildClaudeCmd(mode, model, options = {}) {
|
|
33
33
|
assertTuiMode(mode, 'buildClaudeCmd');
|
|
34
34
|
|
|
35
|
-
//
|
|
36
|
-
//
|
|
37
|
-
//
|
|
35
|
+
// v0.14.6: 1M context is opt-in only via the explicit '[1m]' suffix.
|
|
36
|
+
// opus / sonnet / claude-opus-4-7 (no suffix) all run at the standard
|
|
37
|
+
// 200K context. Adding '[1m]' on either opus or sonnet model id injects
|
|
38
|
+
// the ANTHROPIC_BETA header and attempts the 1M window — sonnet[1m] still
|
|
39
|
+
// requires Anthropic "Extra usage" entitlement at the API layer.
|
|
38
40
|
const parts = ['DISABLE_OMC=1'];
|
|
39
|
-
if (
|
|
40
|
-
parts.push(`ANTHROPIC_BETA=${shellQuote(
|
|
41
|
+
if (wantsOneMillionContext(model)) {
|
|
42
|
+
parts.push(`ANTHROPIC_BETA=${shellQuote(ONE_MILLION_BETA)}`);
|
|
41
43
|
}
|
|
42
44
|
parts.push(
|
|
43
45
|
CLAUDE_BIN,
|
package/src/node/constants.mjs
CHANGED
|
@@ -1,19 +1,21 @@
|
|
|
1
1
|
// Shared runtime constants. Single-source for cross-module values.
|
|
2
2
|
|
|
3
|
-
// Anthropic Claude API beta header
|
|
4
|
-
//
|
|
5
|
-
//
|
|
3
|
+
// Anthropic Claude API beta header for the 1M-token context window. Injected
|
|
4
|
+
// only when the user explicitly opts in via the '[1m]' suffix on the model
|
|
5
|
+
// id — see wantsOneMillionContext() below.
|
|
6
6
|
//
|
|
7
7
|
// Docs: https://docs.anthropic.com/en/docs/build-with-claude/context-windows
|
|
8
8
|
// (search "1M context") — header rotates with each beta phase.
|
|
9
|
-
export const
|
|
9
|
+
export const ONE_MILLION_BETA = 'context-1m-2025-08-07';
|
|
10
10
|
|
|
11
|
-
//
|
|
12
|
-
//
|
|
13
|
-
//
|
|
14
|
-
//
|
|
15
|
-
|
|
11
|
+
// v0.14.6: 1M context is opt-in only via the explicit '[1m]' suffix on the
|
|
12
|
+
// model id. Previously rlp-desk auto-injected ANTHROPIC_BETA for any opus
|
|
13
|
+
// model; in practice that produced surprising results (opus alias still
|
|
14
|
+
// reported a 200K window in real CLI calls, and sonnet[1m] requires a
|
|
15
|
+
// separate "Extra usage" entitlement). New rule: user is the source of
|
|
16
|
+
// truth. Type the suffix to opt in; otherwise both opus and sonnet run at
|
|
17
|
+
// the standard 200K context.
|
|
18
|
+
export function wantsOneMillionContext(model) {
|
|
16
19
|
if (!model) return false;
|
|
17
|
-
|
|
18
|
-
return m === 'opus' || m.startsWith('claude-opus-');
|
|
20
|
+
return String(model).toLowerCase().endsWith('[1m]');
|
|
19
21
|
}
|
|
@@ -7,10 +7,15 @@ import { promisify } from 'node:util';
|
|
|
7
7
|
|
|
8
8
|
import { buildClaudeCmd, buildCodexCmd, parseModelFlag } from '../cli/command-builder.mjs';
|
|
9
9
|
import { shellQuote } from '../util/shell-quote.mjs';
|
|
10
|
-
import {
|
|
10
|
+
import { ONE_MILLION_BETA, wantsOneMillionContext } from '../constants.mjs';
|
|
11
11
|
import { initCampaign } from '../init/campaign-initializer.mjs';
|
|
12
12
|
import { LEGACY_DESK_REL, resolveDeskRoot } from '../util/desk-root.mjs';
|
|
13
|
-
import {
|
|
13
|
+
import {
|
|
14
|
+
lockSentinelFile as defaultLockSentinelFile,
|
|
15
|
+
stampAckField as defaultStampAckField,
|
|
16
|
+
unlockSentinelFile,
|
|
17
|
+
writeSentinelExclusive,
|
|
18
|
+
} from '../shared/fs.mjs';
|
|
14
19
|
import {
|
|
15
20
|
TimeoutError,
|
|
16
21
|
WorkerExitedError,
|
|
@@ -29,7 +34,10 @@ import {
|
|
|
29
34
|
} from '../reporting/campaign-reporting.mjs';
|
|
30
35
|
import {
|
|
31
36
|
createPane as defaultCreatePane,
|
|
37
|
+
killPaneProcess as defaultKillPaneProcess,
|
|
32
38
|
sendKeys as defaultSendKeys,
|
|
39
|
+
sendRawKey as defaultSendRawKey,
|
|
40
|
+
waitForProcessExit as defaultWaitForProcessExit,
|
|
33
41
|
} from '../tmux/pane-manager.mjs';
|
|
34
42
|
|
|
35
43
|
const execFileAsync = promisify(execFile);
|
|
@@ -128,6 +136,39 @@ function buildPaths(rootDir, slug, env = process.env) {
|
|
|
128
136
|
};
|
|
129
137
|
}
|
|
130
138
|
|
|
139
|
+
// Bug #8 PR-B: default git working-tree probe. Inline (~20 LoC) — no new
|
|
140
|
+
// module per Architect/Critic codex iter 6 consensus. Tests inject a stub via
|
|
141
|
+
// run() option `checkWorkingTree`.
|
|
142
|
+
// - returns { ok: false, error } when git rev-parse fails (not a repo, etc).
|
|
143
|
+
// - returns { ok: true, dirty: bool, dirtyFiles[] } otherwise.
|
|
144
|
+
// - dirtyFiles are raw `git status --porcelain` lines (caller truncates).
|
|
145
|
+
async function _defaultCheckWorkingTree(rootDir) {
|
|
146
|
+
try {
|
|
147
|
+
const { stdout: top } = await execFileAsync('git', ['-C', rootDir, 'rev-parse', '--show-toplevel']);
|
|
148
|
+
const trimmed = top.trim();
|
|
149
|
+
// macOS `/var` resolves to `/private/var`; symlinks elsewhere too. Compare
|
|
150
|
+
// canonical realpaths via fs.realpath so the comparison does not fire on
|
|
151
|
+
// symlink-equivalent paths.
|
|
152
|
+
const [topCanon, rootCanon] = await Promise.all([
|
|
153
|
+
fs.realpath(trimmed).catch(() => trimmed),
|
|
154
|
+
fs.realpath(rootDir).catch(() => rootDir),
|
|
155
|
+
]);
|
|
156
|
+
if (topCanon !== rootCanon) {
|
|
157
|
+
// Worker is in a sub-tree, not the campaign root. Refuse to classify.
|
|
158
|
+
return { ok: false, error: `git toplevel ${trimmed} != ${rootDir}` };
|
|
159
|
+
}
|
|
160
|
+
} catch (err) {
|
|
161
|
+
return { ok: false, error: err?.message ?? String(err) };
|
|
162
|
+
}
|
|
163
|
+
try {
|
|
164
|
+
const { stdout } = await execFileAsync('git', ['-C', rootDir, 'status', '--porcelain']);
|
|
165
|
+
const lines = stdout.split('\n').filter(Boolean);
|
|
166
|
+
return { ok: true, dirty: lines.length > 0, dirtyFiles: lines };
|
|
167
|
+
} catch (err) {
|
|
168
|
+
return { ok: false, error: err?.message ?? String(err) };
|
|
169
|
+
}
|
|
170
|
+
}
|
|
171
|
+
|
|
131
172
|
async function exists(targetPath) {
|
|
132
173
|
try {
|
|
133
174
|
await fs.access(targetPath);
|
|
@@ -534,6 +575,12 @@ export const BLOCK_TAGS = Object.freeze({
|
|
|
534
575
|
MALFORMED_ARTIFACT: 'malformed_artifact',
|
|
535
576
|
// Backstop (run() try/finally)
|
|
536
577
|
LEADER_EXITED_WITHOUT_TERMINAL_STATE: 'leader_exited_without_terminal_state',
|
|
578
|
+
// Bug #8 (Plan v6 PR-B): refuse to synthesize verify signal when codex
|
|
579
|
+
// worker exited without committing. Three new tags route through
|
|
580
|
+
// _handlePollFailure with reasonOverride/categoryOverride.
|
|
581
|
+
CODEX_EXIT_NO_DONE_CLAIM: 'codex_exit_no_done_claim',
|
|
582
|
+
GIT_STATE_UNVERIFIABLE: 'git_state_unverifiable',
|
|
583
|
+
WORKER_INCOMPLETE_UNCOMMITTED: 'worker_incomplete_uncommitted',
|
|
537
584
|
});
|
|
538
585
|
|
|
539
586
|
// P1-D Failure Taxonomy classifier. governance §1f locks the reason_category
|
|
@@ -619,6 +666,32 @@ function _classifyBlock(source, { verdict, state, slug } = {}) {
|
|
|
619
666
|
action = 'investigate_leader_logs';
|
|
620
667
|
failureCategory = 'leader_exited_without_terminal_state';
|
|
621
668
|
break;
|
|
669
|
+
// Bug #8 PR-B — codex worker exited but did not write done-claim. Refuse
|
|
670
|
+
// to synthesize a verify signal; surface as infra_failure so wrapper does
|
|
671
|
+
// not retry blindly.
|
|
672
|
+
case BLOCK_TAGS.CODEX_EXIT_NO_DONE_CLAIM:
|
|
673
|
+
category = 'infra_failure';
|
|
674
|
+
recoverable = false;
|
|
675
|
+
action = 'investigate_pane_logs';
|
|
676
|
+
failureCategory = 'codex_exit_no_done_claim';
|
|
677
|
+
break;
|
|
678
|
+
// Bug #8 PR-B — git status could not be resolved (not a repo, git binary
|
|
679
|
+
// missing, etc). Without git we cannot prove the working tree is clean,
|
|
680
|
+
// so refuse to synthesize.
|
|
681
|
+
case BLOCK_TAGS.GIT_STATE_UNVERIFIABLE:
|
|
682
|
+
category = 'infra_failure';
|
|
683
|
+
recoverable = false;
|
|
684
|
+
action = 'investigate_git_state';
|
|
685
|
+
failureCategory = 'git_state_unverifiable';
|
|
686
|
+
break;
|
|
687
|
+
// Bug #8 PR-B — worker said it was done (done-claim present) but the tree
|
|
688
|
+
// is dirty. Recoverable: next iteration's worker can finish committing.
|
|
689
|
+
case BLOCK_TAGS.WORKER_INCOMPLETE_UNCOMMITTED:
|
|
690
|
+
category = 'metric_failure';
|
|
691
|
+
recoverable = true;
|
|
692
|
+
action = 'retry_after_fix';
|
|
693
|
+
failureCategory = 'worker_incomplete_uncommitted';
|
|
694
|
+
break;
|
|
622
695
|
default:
|
|
623
696
|
category = 'metric_failure';
|
|
624
697
|
recoverable = false;
|
|
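The three new switch arms above can be read as a small lookup table. The following is a minimal sketch, not the package's code: `BUG8_CLASSIFICATION` and `classifyBug8Tag` are hypothetical names, and only the values visible in this hunk are reproduced (the default arm's `action` is cut off in the diff, so it is omitted).

```javascript
// Hypothetical table form of the Bug #8 switch arms shown in this hunk.
// Values are copied from the diff; the Map shape is an illustration only.
const BUG8_CLASSIFICATION = new Map([
  ['codex_exit_no_done_claim', { category: 'infra_failure', recoverable: false, action: 'investigate_pane_logs' }],
  ['git_state_unverifiable', { category: 'infra_failure', recoverable: false, action: 'investigate_git_state' }],
  ['worker_incomplete_uncommitted', { category: 'metric_failure', recoverable: true, action: 'retry_after_fix' }],
]);

function classifyBug8Tag(tag) {
  const hit = BUG8_CLASSIFICATION.get(tag);
  if (!hit) {
    // Mirrors only the visible part of the default arm.
    return { category: 'metric_failure', recoverable: false };
  }
  // In all three arms failureCategory equals the tag string itself.
  return { ...hit, failureCategory: tag };
}
```

Only `worker_incomplete_uncommitted` is marked recoverable, which matches the comment that the next iteration's worker can finish committing.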
@@ -650,9 +723,41 @@ async function _handlePollFailure(error, ctx) {
|
|
|
650
723
|
options,
|
|
651
724
|
role, // 'worker' | 'verifier' | 'final_verifier' | 'flywheel' | 'guard'
|
|
652
725
|
usIdOverride,
|
|
726
|
+
// Bug #8 PR-B: when the caller has already classified the failure (e.g.
|
|
727
|
+
// codex done-claim/git gate), forward an explicit BLOCK_TAGS value as
|
|
728
|
+
// categoryOverride and a reason string. Named `categoryOverride` per
|
|
729
|
+
// Plan v6 PRD (it overrides the tag→reason_category mapping). Existing 5
|
|
730
|
+
// callers omit both and the legacy error→tag mapping below runs unchanged.
|
|
731
|
+
categoryOverride,
|
|
732
|
+
reasonOverride,
|
|
653
733
|
} = ctx;
|
|
654
734
|
const usId = usIdOverride ?? state.current_us;
|
|
655
735
|
|
|
736
|
+
if (categoryOverride) {
|
|
737
|
+
state.phase = 'blocked';
|
|
738
|
+
const classification = _classifyBlock(categoryOverride, { state, slug });
|
|
739
|
+
const reasonText = reasonOverride ?? `${role} blocked: ${categoryOverride}`;
|
|
740
|
+
await writeSentinel(paths.blockedSentinel, 'blocked', usId, reasonText, classification, paths);
|
|
741
|
+
await writeStatus(paths, state, options.onStatusChange, options.now);
|
|
742
|
+
await generateCampaignReport({
|
|
743
|
+
slug,
|
|
744
|
+
reportFile: paths.reportFile,
|
|
745
|
+
prdFile: paths.prdFile,
|
|
746
|
+
statusFile: paths.statusFile,
|
|
747
|
+
analyticsFile: paths.analyticsFile,
|
|
748
|
+
now: resolveNow(options.now),
|
|
749
|
+
blockedReason: reasonText,
|
|
750
|
+
blockedCategory: classification.reason_category,
|
|
751
|
+
});
|
|
752
|
+
return {
|
|
753
|
+
status: 'blocked',
|
|
754
|
+
usId,
|
|
755
|
+
reason: reasonText,
|
|
756
|
+
category: classification.reason_category,
|
|
757
|
+
statusFile: paths.statusFile,
|
|
758
|
+
};
|
|
759
|
+
}
|
|
760
|
+
|
|
656
761
|
let tag;
|
|
657
762
|
let reason;
|
|
658
763
|
if (error instanceof WorkerExitedError) {
|
|
@@ -872,6 +977,10 @@ async function runFinalSequentialVerify({
|
|
|
872
977
|
pollForSignal,
|
|
873
978
|
runIntegrationCheck,
|
|
874
979
|
iterTimeoutMs,
|
|
980
|
+
// Bug #7 Fix-Q/R: optional reaper. Passed from _runCampaignBody so each
|
|
981
|
+
// per-US verdict kills the verifier TUI before the next per-US dispatch
|
|
982
|
+
// reuses the same pane. No-op when undefined (legacy/test callers).
|
|
983
|
+
reapProducer,
|
|
875
984
|
}) {
|
|
876
985
|
const verifierModel = state.final_verifier_model;
|
|
877
986
|
|
|
@@ -893,6 +1002,10 @@ async function runFinalSequentialVerify({
|
|
|
893
1002
|
timeoutMs: iterTimeoutMs,
|
|
894
1003
|
});
|
|
895
1004
|
|
|
1005
|
+
if (typeof reapProducer === 'function') {
|
|
1006
|
+
await reapProducer(verifierPaneId, paths.verdictFile);
|
|
1007
|
+
}
|
|
1008
|
+
|
|
896
1009
|
if (verdict.verdict !== 'pass') {
|
|
897
1010
|
return {
|
|
898
1011
|
status: 'continue',
|
|
@@ -933,9 +1046,11 @@ async function runFinalSequentialVerify({
|
|
|
933
1046
|
const HOME_DESK_DIR = path.join(os.homedir(), '.claude', 'ralph-desk');
|
|
934
1047
|
|
|
935
1048
|
function buildAutonomousClaudeCmd({ promptFile, model, rootDir, homeDeskDir = HOME_DESK_DIR }) {
|
|
936
|
-
//
|
|
937
|
-
|
|
938
|
-
|
|
1049
|
+
// v0.14.6: ANTHROPIC_BETA prefix injected only when the model id ends
|
|
1050
|
+
// with explicit '[1m]' suffix. opus / sonnet / claude-opus-4-7 (no
|
|
1051
|
+
// suffix) all run at the standard 200K context.
|
|
1052
|
+
const betaPrefix = wantsOneMillionContext(model)
|
|
1053
|
+
? `ANTHROPIC_BETA=${shellQuote(ONE_MILLION_BETA)} `
|
|
939
1054
|
: '';
|
|
940
1055
|
// §4.11.a: --add-dir whitelist (home rlp-desk + campaign cwd) for true autonomy.
|
|
941
1056
|
const addDirParts = [];
|
|
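The suffix rule described in the comment above can be sketched as follows. This is an assumption-laden illustration: the real `wantsOneMillionContext` and `ONE_MILLION_BETA` live in `src/node/constants.mjs` and are not shown in this hunk, so the function names with a `Sketch` suffix are hypothetical, and the beta id is taken from the literal in the zsh `build_claude_cmd` hunk of this same diff.

```javascript
// Assumed value: copied from the zsh mirror in this diff, not from constants.mjs.
const ONE_MILLION_BETA_SKETCH = 'context-1m-2025-08-07';

// Hypothetical stand-in for wantsOneMillionContext(): true only for an
// explicit '[1m]' suffix on the model id.
function wantsOneMillionContextSketch(model) {
  return typeof model === 'string' && model.endsWith('[1m]');
}

// Builds the env prefix placed in front of the claude command, or '' when
// the model runs at the standard context size.
function betaPrefixSketch(model) {
  // Plain single quotes suffice here because the beta id has no quote characters.
  return wantsOneMillionContextSketch(model)
    ? `ANTHROPIC_BETA='${ONE_MILLION_BETA_SKETCH}' `
    : '';
}
```

Under this reading, `opus`, `sonnet`, and `claude-opus-4-7` yield an empty prefix, and only ids such as `claude-opus-4-7[1m]` receive the beta header.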
@@ -1076,6 +1191,46 @@ async function _runCampaignBody(slug, options, paths, rootDir) {
|
|
|
1076
1191
|
const createPane = options.createPane ?? defaultCreatePane;
|
|
1077
1192
|
const createSession = options.createSession ?? defaultCreateSession;
|
|
1078
1193
|
const pollForSignal = options.pollForSignal ?? defaultPollForSignal;
|
|
1194
|
+
// Bug #7 Fix-Q/R: post-sentinel reaper. Producer (claude/codex TUI) must be
|
|
1195
|
+
// interrupted the moment leader has consumed the sentinel; otherwise the
|
|
1196
|
+
// pane lingers in idle prompt and self-reviews for ~2min. lockSentinel
|
|
1197
|
+
// freezes the file mtime as defense-in-depth. All four are injectable so
|
|
1198
|
+
// existing tests with fake sendKeys keep working (us006 createTmuxFakes).
|
|
1199
|
+
const sendRawKey = options.sendRawKey ?? defaultSendRawKey;
|
|
1200
|
+
const waitForProcessExit = options.waitForProcessExit ?? defaultWaitForProcessExit;
|
|
1201
|
+
const killPaneProcess = options.killPaneProcess ?? defaultKillPaneProcess;
|
|
1202
|
+
const lockSentinel = options.lockSentinelFile ?? defaultLockSentinelFile;
|
|
1203
|
+
const stampAckField = options.stampAckField ?? defaultStampAckField;
|
|
1204
|
+
const reapProducer = async (paneId, sentinelFile) => {
|
|
1205
|
+
if (!paneId) return;
|
|
1206
|
+
await killPaneProcess(paneId, {
|
|
1207
|
+
sendRawKey,
|
|
1208
|
+
waitForExit: waitForProcessExit,
|
|
1209
|
+
log: (msg) => console.error(msg),
|
|
1210
|
+
});
|
|
1211
|
+
// PR-0b-narrow AC-H1: after killPaneProcess, wait for the producing
|
|
1212
|
+
// process to actually exit before continuing. waitForProcessExit returns
|
|
1213
|
+
// when pane_current_command resolves to a shell (zsh/bash/sh). Wrapped
|
|
1214
|
+
// in try/catch — failure here is non-fatal but emits a log entry.
|
|
1215
|
+
try {
|
|
1216
|
+
await waitForProcessExit(paneId, { timeoutMs: 5000 });
|
|
1217
|
+
} catch (err) {
|
|
1218
|
+
console.error(`[handshake] waitForProcessExit failed on ${paneId} (${err?.message ?? err}); continuing`);
|
|
1219
|
+
}
|
|
1220
|
+
if (sentinelFile) {
|
|
1221
|
+
await lockSentinel(sentinelFile, { log: (msg) => console.error(msg) });
|
|
1222
|
+
// PR-0b-narrow AC-H2: stamp the leader_ack audit field. Best-effort,
|
|
1223
|
+
// so a failed stamp never aborts subsequent dispatch.
|
|
1224
|
+
await stampAckField(sentinelFile, {
|
|
1225
|
+
acked_by: 'leader',
|
|
1226
|
+
acked_at: new Date(resolveNow(options.now)).toISOString(),
|
|
1227
|
+
ack_pane_state: 'shell',
|
|
1228
|
+
}, { log: (msg) => console.error(msg) });
|
|
1229
|
+
}
|
|
1230
|
+
};
|
|
1231
|
+
// Bug #8 PR-B: working-tree probe injected (or default execFile git).
|
|
1232
|
+
// Returns { ok: boolean, dirty?: boolean, dirtyFiles?: string[], error?: string }.
|
|
1233
|
+
const checkWorkingTree = options.checkWorkingTree ?? _defaultCheckWorkingTree;
|
|
1079
1234
|
const runIntegrationCheck = options.runIntegrationCheck ?? (async () => ({ exitCode: 0, summary: 'integration skipped' }));
|
|
1080
1235
|
const maxIterations = options.maxIterations ?? 100;
|
|
1081
1236
|
// v5.7 §4.19: campaign-level pollForSignal timeout (Node leader fix).
|
|
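The ordering inside `reapProducer` (kill, wait, lock, stamp) can be exercised with injected fakes, which is how the comment above says the existing tests keep working. The sketch below is a simplified restatement under assumptions: `reapInOrder` and the `steps` trace are hypothetical and exist only to make the sequence observable, and the ack payload is reduced to one field.

```javascript
// Hypothetical restatement of the reapProducer sequence with a step trace.
// deps carries the four injectable collaborators named in the diff.
async function reapInOrder(paneId, sentinelFile, deps) {
  if (!paneId) return []; // same early return as the original
  const steps = [];
  await deps.killPaneProcess(paneId);
  steps.push('kill');
  try {
    await deps.waitForProcessExit(paneId, { timeoutMs: 5000 });
    steps.push('wait');
  } catch {
    // Non-fatal in the original as well: log and continue.
    steps.push('wait-failed');
  }
  if (sentinelFile) {
    await deps.lockSentinel(sentinelFile);
    steps.push('lock');
    await deps.stampAckField(sentinelFile, { acked_by: 'leader' });
    steps.push('stamp');
  }
  return steps;
}
```

A timeout in the wait step does not stop the lock and stamp steps, so the sentinel is still frozen when the pane is slow to return to a shell.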
@@ -1141,6 +1296,11 @@ async function _runCampaignBody(slug, options, paths, rootDir) {
|
|
|
1141
1296
|
let _laneSnapshot = await _snapshotLaneMtimes(paths);
|
|
1142
1297
|
|
|
1143
1298
|
while (state.iteration <= maxIterations) {
|
|
1299
|
+
// Bug #7 Fix-R defensive unlock: a 0o444 sentinel left from the previous
|
|
1300
|
+
// iteration must not block the next producer's atomic-rename write.
|
|
1301
|
+
// Idempotent: missing-file calls are no-ops.
|
|
1302
|
+
await unlockSentinelFile(paths.signalFile);
|
|
1303
|
+
await unlockSentinelFile(paths.verdictFile);
|
|
1144
1304
|
// Audit drift from the prior iteration before doing anything new.
|
|
1145
1305
|
const _laneSnapshotAfter = await _snapshotLaneMtimes(paths);
|
|
1146
1306
|
const _laneViolations = await _checkLaneViolations(paths, _laneSnapshot, _laneSnapshotAfter, state, options);
|
|
@@ -1189,6 +1349,7 @@ async function _runCampaignBody(slug, options, paths, rootDir) {
|
|
|
1189
1349
|
pollForSignal,
|
|
1190
1350
|
runIntegrationCheck,
|
|
1191
1351
|
iterTimeoutMs,
|
|
1352
|
+
reapProducer,
|
|
1192
1353
|
});
|
|
1193
1354
|
} catch (error) {
|
|
1194
1355
|
// v5.7 §4.25 — uniform poll-failure handling for final verifier.
|
|
@@ -1280,12 +1441,17 @@ async function _runCampaignBody(slug, options, paths, rootDir) {
|
|
|
1280
1441
|
});
|
|
1281
1442
|
}
|
|
1282
1443
|
|
|
1444
|
+
// Bug #7 Fix-Q/R: reap flywheel pane before consuming the signal.
|
|
1445
|
+
await reapProducer(state.flywheel_pane_id ?? state.verifier_pane_id, paths.flywheelSignalFile);
|
|
1446
|
+
|
|
1283
1447
|
state.last_flywheel_decision = flywheelSignal.decision;
|
|
1284
1448
|
// P0-A multi-mission orchestration: optionally captured from flywheel signal.
|
|
1285
1449
|
// null when the flywheel did not suggest a next mission. Consumer wrappers
|
|
1286
1450
|
// poll status.next_mission_candidate to chain missions without code edits.
|
|
1287
1451
|
// See docs/multi-mission-orchestration.md.
|
|
1288
1452
|
state.next_mission_candidate = flywheelSignal.next_mission_candidate ?? null;
|
|
1453
|
+
// Bug #7 Fix-R cleanup: unlock before unlink so 0o444 doesn't block.
|
|
1454
|
+
await unlockSentinelFile(paths.flywheelSignalFile);
|
|
1289
1455
|
await fs.unlink(paths.flywheelSignalFile).catch(() => {});
|
|
1290
1456
|
|
|
1291
1457
|
// Flywheel Guard (independent validation of flywheel decision)
|
|
@@ -1318,11 +1484,15 @@ async function _runCampaignBody(slug, options, paths, rootDir) {
|
|
|
1318
1484
|
});
|
|
1319
1485
|
}
|
|
1320
1486
|
|
|
1487
|
+
// Bug #7 Fix-Q/R: reap guard pane before mutating state.
|
|
1488
|
+
await reapProducer(guardPaneId, paths.flywheelGuardVerdictFile);
|
|
1489
|
+
|
|
1321
1490
|
if (!state.flywheel_guard_count[state.current_us]) {
|
|
1322
1491
|
state.flywheel_guard_count[state.current_us] = 0;
|
|
1323
1492
|
}
|
|
1324
1493
|
state.flywheel_guard_count[state.current_us] += 1;
|
|
1325
1494
|
|
|
1495
|
+
await unlockSentinelFile(paths.flywheelGuardVerdictFile);
|
|
1326
1496
|
await fs.unlink(paths.flywheelGuardVerdictFile).catch(() => {});
|
|
1327
1497
|
|
|
1328
1498
|
if (guardVerdict.verdict === 'inconclusive') {
|
|
@@ -1430,8 +1600,43 @@ async function _runCampaignBody(slug, options, paths, rootDir) {
|
|
|
1430
1600
|
});
|
|
1431
1601
|
} catch (error) {
|
|
1432
1602
|
if (error instanceof TimeoutError && parseModelFlag(state.worker_model).engine === 'codex') {
|
|
1433
|
-
//
|
|
1434
|
-
//
|
|
1603
|
+
// Bug #8 PR-B 4-way gate: refuse to synthesize verify signal when
|
|
1604
|
+
// codex worker exited without committing real work.
|
|
1605
|
+
// 1. done-claim absent → BLOCKED infra_failure
|
|
1606
|
+
// 2. git unverifiable → BLOCKED infra_failure
|
|
1607
|
+
// 3. done-claim + dirty tree → BLOCKED metric_failure
|
|
1608
|
+
// 4. done-claim + clean tree → synthesize verify (legacy path)
|
|
1609
|
+
const doneClaimExists = await exists(paths.doneClaimFile);
|
|
1610
|
+
if (!doneClaimExists) {
|
|
1611
|
+
return _handlePollFailure(error, {
|
|
1612
|
+
paths, state, slug, options,
|
|
1613
|
+
role: 'worker',
|
|
1614
|
+
categoryOverride: BLOCK_TAGS.CODEX_EXIT_NO_DONE_CLAIM,
|
|
1615
|
+
reasonOverride:
|
|
1616
|
+
'codex worker exited (timeout) without writing done-claim; refusing to synthesize verify signal',
|
|
1617
|
+
});
|
|
1618
|
+
}
|
|
1619
|
+
const tree = await checkWorkingTree(rootDir);
|
|
1620
|
+
if (!tree.ok) {
|
|
1621
|
+
return _handlePollFailure(error, {
|
|
1622
|
+
paths, state, slug, options,
|
|
1623
|
+
role: 'worker',
|
|
1624
|
+
categoryOverride: BLOCK_TAGS.GIT_STATE_UNVERIFIABLE,
|
|
1625
|
+
reasonOverride:
|
|
1626
|
+
`git status unverifiable (${tree.error ?? 'unknown'}); refusing to synthesize verify signal`,
|
|
1627
|
+
});
|
|
1628
|
+
}
|
|
1629
|
+
if (tree.dirty) {
|
|
1630
|
+
const sample = (tree.dirtyFiles ?? []).slice(0, 5).join(', ');
|
|
1631
|
+
return _handlePollFailure(error, {
|
|
1632
|
+
paths, state, slug, options,
|
|
1633
|
+
role: 'worker',
|
|
1634
|
+
categoryOverride: BLOCK_TAGS.WORKER_INCOMPLETE_UNCOMMITTED,
|
|
1635
|
+
reasonOverride:
|
|
1636
|
+
`worker_incomplete_uncommitted: done-claim present but tree dirty (${sample || 'no file list'})`,
|
|
1637
|
+
});
|
|
1638
|
+
}
|
|
1639
|
+
// Clean tree — preserve the legacy synthesize behaviour.
|
|
1435
1640
|
signal = {
|
|
1436
1641
|
iteration: state.iteration,
|
|
1437
1642
|
status: 'verify',
|
|
@@ -1448,6 +1653,11 @@ async function _runCampaignBody(slug, options, paths, rootDir) {
|
|
|
1448
1653
|
}
|
|
1449
1654
|
}
|
|
1450
1655
|
|
|
1656
|
+
// Bug #7 Fix-Q/R: reap the worker pane the instant we accept the signal so
|
|
1657
|
+
// claude/codex cannot self-review and rewrite iter-signal.json. Runs even
|
|
1658
|
+
// for the codex-fallback synthesized signal (no-op on a dead pane).
|
|
1659
|
+
await reapProducer(state.worker_pane_id, paths.signalFile);
|
|
1660
|
+
|
|
1451
1661
|
// US-019 R7 P1-G: verify_partial malformed downgrade.
|
|
1452
1662
|
// verify_partial requires verified_acs[] to be a non-empty array. Otherwise the verifier
|
|
1453
1663
|
// has nothing to evaluate and we must treat the signal as broken contract → blocked.
|
|
@@ -1517,6 +1727,11 @@ async function _runCampaignBody(slug, options, paths, rootDir) {
|
|
|
1517
1727
|
});
|
|
1518
1728
|
}
|
|
1519
1729
|
|
|
1730
|
+
// Bug #7 Fix-Q/R: reap verifier pane immediately after accepting the
|
|
1731
|
+
// verdict — without this the codex/claude TUI keeps running for ~2min and
|
|
1732
|
+
// can rewrite verify-verdict.json (mtime drift observed in 19th launch).
|
|
1733
|
+
await reapProducer(state.verifier_pane_id, paths.verdictFile);
|
|
1734
|
+
|
|
1520
1735
|
if (verdict.verdict === 'pass') {
|
|
1521
1736
|
state.consecutive_failures = 0;
|
|
1522
1737
|
if (!state.verified_us.includes(usId)) {
|
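The 4-way gate added to the codex timeout path in this file can be summarized as a pure decision function. This is a minimal sketch under assumptions: `decideCodexSynthesis` is a hypothetical name, the tag strings are the `BLOCK_TAGS` values from this diff, and the real code routes the blocked cases through `_handlePollFailure` rather than returning an object.

```javascript
// Hypothetical pure form of the Bug #8 gate.
// tree has the shape returned by checkWorkingTree: { ok, dirty?, dirtyFiles?, error? }.
function decideCodexSynthesis({ doneClaimExists, tree }) {
  // 1. done-claim absent: blocked as infra_failure
  if (!doneClaimExists) {
    return { synthesize: false, tag: 'codex_exit_no_done_claim' };
  }
  // 2. git state unverifiable: blocked as infra_failure
  if (!tree || !tree.ok) {
    return { synthesize: false, tag: 'git_state_unverifiable' };
  }
  // 3. done-claim present but tree dirty: blocked as metric_failure
  if (tree.dirty) {
    const sample = (tree.dirtyFiles ?? []).slice(0, 5).join(', ');
    return { synthesize: false, tag: 'worker_incomplete_uncommitted', sample };
  }
  // 4. done-claim present and tree clean: legacy synthesize path
  return { synthesize: true, tag: null };
}
```

The order matters: the done-claim check runs before any git call, so a missing done-claim is reported even when the directory is not a repository.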
package/src/node/shared/fs.mjs
CHANGED
|
@@ -59,3 +59,86 @@ export async function writeSentinelExclusive(targetPath, content) {
|
|
|
59
59
|
}
|
|
60
60
|
return { wrote: true };
|
|
61
61
|
}
|
|
62
|
+
|
|
63
|
+
// Bug #7 Fix-R: best-effort chmod 0o444 to freeze a sentinel file once the
|
|
64
|
+
// leader has accepted it. Mirror of scripts/postinstall.js tryLockFile (L104).
|
|
65
|
+
// Some filesystems silently ignore chmod (WSL1/NTFS, tmpfs); we log once and
|
|
66
|
+
// continue. Q (process kill) is the primary defense; R is defense-in-depth.
|
|
67
|
+
let _sentinelLockWarningEmitted = false;
|
|
68
|
+
export async function lockSentinelFile(filePath, { log = (msg) => console.error(msg) } = {}) {
|
|
69
|
+
try {
|
|
70
|
+
await fs.chmod(filePath, 0o444);
|
|
71
|
+
} catch (err) {
|
|
72
|
+
if (err && err.code === 'ENOENT') {
|
|
73
|
+
// File missing is not an error — sentinel may have been consumed and
|
|
74
|
+
// unlinked by a concurrent path. Idempotent no-op.
|
|
75
|
+
return;
|
|
76
|
+
}
|
|
77
|
+
if (!_sentinelLockWarningEmitted) {
|
|
78
|
+
log(`[bug7] chmod 0444 on ${filePath} failed (${err?.code ?? 'unknown'}); post-sentinel write-protection unavailable on this FS.`);
|
|
79
|
+
_sentinelLockWarningEmitted = true;
|
|
80
|
+
}
|
|
81
|
+
}
|
|
82
|
+
}
|
|
83
|
+
|
|
84
|
+
// Pair to lockSentinelFile. Called before fs.unlink in iter-cleanup paths so
|
|
85
|
+
// subsequent atomic-rename writes never see EACCES on the destination mode.
|
|
86
|
+
// Idempotent — missing file or already-writable is fine.
|
|
87
|
+
export async function unlockSentinelFile(filePath) {
|
|
88
|
+
try {
|
|
89
|
+
await fs.chmod(filePath, 0o644);
|
|
90
|
+
} catch {
|
|
91
|
+
// best-effort; cleanup proceeds regardless.
|
|
92
|
+
}
|
|
93
|
+
}
|
|
94
|
+
|
|
95
|
+
// PR-0b-narrow (Plan v6) — stamp leader handshake ack onto an already-locked
|
|
96
|
+
// sentinel. Best-effort, audit-only: the contract is "if we can write, do; if
|
|
97
|
+
// not, swallow". Callers must NOT depend on the ack landing for hard ordering
|
|
98
|
+
// semantics (use waitForProcessExit + the chmod 0o444 lock for that). The
|
|
99
|
+
// resulting `content.leader_ack` is auxiliary metadata so post-mortem audits
|
|
100
|
+
// can prove which Leader iteration consumed which sentinel.
|
|
101
|
+
//
|
|
102
|
+
// Sequence (mirrored in src/scripts/lib_ralph_desk.zsh::_stamp_ack_field):
|
|
103
|
+
// 1. chmod 0o644 (so we can write — sentinel was locked by lockSentinelFile)
|
|
104
|
+
// 2. JSON.parse
|
|
105
|
+
// 3. merge ack as content.leader_ack
|
|
106
|
+
// 4. write back in place (fs.writeFile, not an atomic rename)
|
|
107
|
+
// 5. chmod 0o444 (re-lock)
|
|
108
|
+
//
|
|
109
|
+
// All steps wrapped in try/catch; any failure is silently dropped. Failure
|
|
110
|
+
// modes that we deliberately swallow:
|
|
111
|
+
// - File missing (sentinel was unlinked by a concurrent path).
|
|
112
|
+
// - Malformed JSON (race with a partial-write window — Bug #7 already gates
|
|
113
|
+
// this on the read side, but stampAckField may still observe it during
|
|
114
|
+
// transitional iterations).
|
|
115
|
+
// - chmod ENOTSUP / WSL1 / NTFS (recorded in Bug #7 fixes).
|
|
116
|
+
export async function stampAckField(filePath, ack, { log = (msg) => console.error(msg) } = {}) {
|
|
117
|
+
try {
|
|
118
|
+
await fs.chmod(filePath, 0o644);
|
|
119
|
+
} catch (err) {
|
|
120
|
+
if (err && err.code === 'ENOENT') return; // sentinel gone — nothing to stamp
|
|
121
|
+
// chmod failure is non-fatal — try the write anyway in case the FS already allows it
|
|
122
|
+
}
|
|
123
|
+
let content;
|
|
124
|
+
try {
|
|
125
|
+
const raw = await fs.readFile(filePath, 'utf8');
|
|
126
|
+
content = JSON.parse(raw);
|
|
127
|
+
} catch (err) {
|
|
128
|
+
log(`[stamp-ack] read/parse failed for ${filePath} (${err?.code ?? err?.message ?? 'unknown'}); ack dropped (audit-only)`);
|
|
129
|
+
// Re-lock if possible — best-effort.
|
|
130
|
+
try { await fs.chmod(filePath, 0o444); } catch {}
|
|
131
|
+
return;
|
|
132
|
+
}
|
|
133
|
+
if (!content || typeof content !== 'object') {
|
|
134
|
+
try { await fs.chmod(filePath, 0o444); } catch {}
|
|
135
|
+
return;
|
|
136
|
+
}
|
|
137
|
+
content.leader_ack = ack;
|
|
138
|
+
try {
|
|
139
|
+
await fs.writeFile(filePath, `${JSON.stringify(content, null, 2)}\n`, 'utf8');
|
|
140
|
+
} catch (err) {
|
|
141
|
+
log(`[stamp-ack] write failed for ${filePath} (${err?.code ?? err?.message ?? 'unknown'}); ack dropped`);
|
|
142
|
+
}
|
|
143
|
+
try { await fs.chmod(filePath, 0o444); } catch {}
|
|
144
|
+
}
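The parse-and-merge core of `stampAckField` (steps 2 and 3 of the sequence) can be isolated from the chmod and write steps. The helper below is a hypothetical extraction for illustration, not part of the package: it returns the serialized content, or `null` in the cases where the original drops the ack.

```javascript
// Hypothetical pure helper: merge a leader ack into raw sentinel JSON.
// Returns the new file body, or null when the ack should be dropped.
function mergeLeaderAck(raw, ack) {
  let content;
  try {
    content = JSON.parse(raw);
  } catch {
    return null; // malformed JSON, e.g. a partial-write window
  }
  if (!content || typeof content !== 'object') {
    return null; // scalars and null cannot carry a leader_ack field
  }
  content.leader_ack = ack;
  // Same serialization as the diff: 2-space indent plus trailing newline.
  return `${JSON.stringify(content, null, 2)}\n`;
}
```

Existing fields survive the merge, so the sentinel's original payload is still readable in a post-mortem alongside the ack.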
|
package/src/node/tmux/pane-manager.mjs
CHANGED
|
@@ -52,6 +52,12 @@ export async function sendKeys(paneId, command) {
|
|
|
52
52
|
await runTmux(['send-keys', '-t', paneId, 'Enter'], { paneId });
|
|
53
53
|
}
|
|
54
54
|
|
|
55
|
+
// Bug #7 Fix-Q: send a raw tmux key (e.g. C-c) without the `-l --` literal-text
|
|
56
|
+
// flag. Distinct from sendKeys() so callers can interrupt a running TUI.
|
|
57
|
+
export async function sendRawKey(paneId, key) {
|
|
58
|
+
await runTmux(['send-keys', '-t', paneId, key], { paneId });
|
|
59
|
+
}
|
|
60
|
+
|
|
55
61
|
export async function waitForProcessExit(
|
|
56
62
|
paneId,
|
|
57
63
|
{ pollIntervalMs = 100, timeoutMs = 5000 } = {},
|
|
@@ -75,3 +81,36 @@ export async function waitForProcessExit(
|
|
|
75
81
|
paneId,
|
|
76
82
|
});
|
|
77
83
|
}
|
|
84
|
+
|
|
85
|
+
// Bug #7 Fix-Q: terminate the TUI process producing a sentinel file the moment
|
|
86
|
+
// the leader has accepted it. Without this, claude/codex returns to its idle
|
|
87
|
+
// prompt and continues self-review for 1-2 minutes, racing the next iteration.
|
|
88
|
+
// Mirror of zsh pattern at run_ralph_desk.zsh:2384-2397, 375-376, 529-530.
|
|
89
|
+
// Fail-open: pane may already be dead from prior teardown, or waitForExit may
|
|
90
|
+
// time out — neither aborts the iteration.
|
|
91
|
+
export async function killPaneProcess(
|
|
92
|
+
paneId,
|
|
93
|
+
{
|
|
94
|
+
sendRawKey: sendRawKeyImpl = sendRawKey,
|
|
95
|
+
waitForExit = waitForProcessExit,
|
|
96
|
+
gracePeriodMs = 800,
|
|
97
|
+
exitTimeoutMs = 5000,
|
|
98
|
+
log = () => {},
|
|
99
|
+
} = {},
|
|
100
|
+
) {
|
|
101
|
+
const safeSend = async (key) => {
|
|
102
|
+
try {
|
|
103
|
+
await sendRawKeyImpl(paneId, key);
|
|
104
|
+
} catch (err) {
|
|
105
|
+
log(`[bug7] killPaneProcess sendRawKey ${key} failed for ${paneId}: ${err?.message ?? err}`);
|
|
106
|
+
}
|
|
107
|
+
};
|
|
108
|
+
await safeSend('C-c');
|
|
109
|
+
await new Promise((resolve) => setTimeout(resolve, gracePeriodMs));
|
|
110
|
+
await safeSend('C-c');
|
|
111
|
+
try {
|
|
112
|
+
await waitForExit(paneId, { timeoutMs: exitTimeoutMs });
|
|
113
|
+
} catch (err) {
|
|
114
|
+
log(`[bug7] killPaneProcess waitForExit failed for ${paneId}: ${err?.message ?? err}`);
|
|
115
|
+
}
|
|
116
|
+
}
|
package/src/scripts/lib_ralph_desk.zsh
CHANGED
|
@@ -46,17 +46,19 @@ build_claude_cmd() {
|
|
|
46
46
|
# Defends against bracketed model ids like 'claude-opus-4-7[1m]' (zsh char-class glob),
|
|
47
47
|
# spaces, embedded quotes, etc. Plain "$model" would let zsh expand brackets as glob.
|
|
48
48
|
#
|
|
49
|
-
#
|
|
50
|
-
#
|
|
51
|
-
|
|
49
|
+
# v0.14.6: ANTHROPIC_BETA injected only when the model id ends with the
|
|
50
|
+
# explicit '[1m]' suffix. opus / sonnet / claude-opus-4-7 (no suffix) all
|
|
51
|
+
# run at the standard 200K context. Mirror of src/node/constants.mjs
|
|
52
|
+
# ONE_MILLION_BETA + wantsOneMillionContext(). Update both on rotation.
|
|
53
|
+
local _onem_beta=""
|
|
52
54
|
case "$model" in
|
|
53
|
-
|
|
55
|
+
*\[1m\]) _onem_beta="ANTHROPIC_BETA='context-1m-2025-08-07' " ;;
|
|
54
56
|
esac
|
|
55
57
|
# v5.7 §4.11.a: --add-dir whitelist for autonomous mode. ROOT (campaign cwd)
|
|
56
58
|
# plus home rlp-desk tree authorized for read/write without TUI prompts.
|
|
57
59
|
local _home_desk="$HOME/.claude/ralph-desk"
|
|
58
60
|
local _add_dirs="--add-dir ${(qq)_home_desk} --add-dir ${(qq)ROOT}"
|
|
59
|
-
local base="DISABLE_OMC=1 ${
|
|
61
|
+
local base="DISABLE_OMC=1 ${_onem_beta}$CLAUDE_BIN --model ${(qq)model} --mcp-config '{\"mcpServers\":{}}' --strict-mcp-config --dangerously-skip-permissions ${_add_dirs}"
|
|
60
62
|
if [[ -n "$effort" ]]; then
|
|
61
63
|
base="$base --effort $effort"
|
|
62
64
|
fi
|
|
@@ -242,6 +244,74 @@ atomic_write() {
|
|
|
242
244
|
mv "$tmp" "$target"
|
|
243
245
|
}
|
|
244
246
|
|
|
247
|
+
# =============================================================================
|
|
248
|
+
# Bug #7 Fix-Q/R: Post-sentinel pane reaper + sentinel write-lock
|
|
249
|
+
# =============================================================================
|
|
250
|
+
# Without explicit teardown the claude/codex TUI returns to its idle prompt and
|
|
251
|
+
# self-reviews for ~2min after writing iter-signal.json or verify-verdict.json.
|
|
252
|
+
# Observed: verdict mtime drift 1m43s post-detect; iter-N verifier overlapped
|
|
253
|
+
# iter-N+1 worker for 2min. _kill_pane_process closes the race; _lock_sentinel
|
|
254
|
+
# is defense-in-depth that freezes the file mtime. Mirror of run_ralph_desk.zsh
|
|
255
|
+
# verifier-cleanup pattern at L2384-2397 (Ctrl+C + /exit + wait_for_pane_ready).
|
|
256
|
+
# Both helpers are fail-open: pane may already be dead, FS may ignore chmod.
|
|
257
|
+
_kill_pane_process() {
|
|
258
|
+
local pane_id="$1"
|
|
259
|
+
local role="${2:-producer}"
|
|
260
|
+
[[ -n "$pane_id" ]] || return 0
|
|
261
|
+
if typeset -f log_debug >/dev/null 2>&1; then
|
|
262
|
+
log_debug "[bug7] kill_pane_process pane=$pane_id role=$role"
|
|
263
|
+
fi
|
|
264
|
+
tmux send-keys -t "$pane_id" C-c 2>/dev/null
|
|
265
|
+
sleep 0.5
|
|
266
|
+
tmux send-keys -t "$pane_id" C-c 2>/dev/null
|
|
267
|
+
sleep 1
|
|
268
|
+
if typeset -f wait_for_pane_ready >/dev/null 2>&1; then
|
|
269
|
+
wait_for_pane_ready "$pane_id" 5 2>/dev/null || true
|
|
270
|
+
fi
|
|
271
|
+
return 0
|
|
272
|
+
}
|
|
273
|
+
|
|
274
|
+
_lock_sentinel() {
|
|
275
|
+
local file="$1"
|
|
276
|
+
[[ -n "$file" && -f "$file" ]] || return 0
|
|
277
|
+
chmod 0444 "$file" 2>/dev/null || true
|
|
278
|
+
return 0
|
|
279
|
+
}
|
|
280
|
+
|
|
281
|
+
_unlock_sentinel() {
|
|
282
|
+
local file="$1"
|
|
283
|
+
[[ -n "$file" && -f "$file" ]] || return 0
|
|
284
|
+
chmod 0644 "$file" 2>/dev/null || true
|
|
285
|
+
return 0
|
|
286
|
+
}
|
|
287
|
+
|
|
288
|
+
# PR-0b-narrow (Plan v6) — stamp leader handshake ack onto the sentinel.
|
|
289
|
+
# Mirror of src/node/shared/fs.mjs::stampAckField. Best-effort, audit-only:
|
|
290
|
+
# any failure is silently swallowed. Sequence:
|
|
291
|
+
# 1. chmod 0644 (so jq + mv can write)
|
|
292
|
+
# 2. jq merge .leader_ack
|
|
293
|
+
# 3. atomic rename via tmp file
|
|
294
|
+
# 4. chmod 0444 (re-lock)
|
|
295
|
+
# Tolerant of jq absence (graceful degrade — no stamp, no error).
|
|
296
|
+
_stamp_ack_field() {
|
|
297
|
+
local file="$1"
|
|
298
|
+
[[ -n "$file" && -f "$file" ]] || return 0
|
|
299
|
+
command -v jq >/dev/null 2>&1 || return 0
|
|
300
|
+
local now_iso
|
|
301
|
+
now_iso=$(date -u +%Y-%m-%dT%H:%M:%SZ 2>/dev/null || echo "")
|
|
302
|
+
local tmp="${file}.ack.tmp"
|
|
303
|
+
chmod 0644 "$file" 2>/dev/null || true
|
|
304
|
+
if jq --arg ts "$now_iso" \
|
|
305
|
+
'. + {leader_ack: {acked_by: "leader", acked_at: $ts, ack_pane_state: "shell"}}' \
|
|
306
|
+
"$file" > "$tmp" 2>/dev/null; then
|
|
307
|
+
mv "$tmp" "$file" 2>/dev/null || rm -f "$tmp" 2>/dev/null
|
|
308
|
+
else
|
|
309
|
+
rm -f "$tmp" 2>/dev/null
|
|
310
|
+
fi
|
|
311
|
+
chmod 0444 "$file" 2>/dev/null || true
|
|
312
|
+
return 0
|
|
313
|
+
}
|
|
314
|
+
|
|
245
315
|
# =============================================================================
|
|
246
316
|
# Scaffold Validation
|
|
247
317
|
# =============================================================================
|
package/src/scripts/run_ralph_desk.zsh
CHANGED
|
@@ -635,27 +635,82 @@ launch_verifier_claude() {
|
|
|
635
635
|
# On exit: check done-claim, auto-generate iter-signal.
|
|
636
636
|
# Args: $1=iteration $2=signal_file
|
|
637
637
|
# Returns: 0 (signal generated), 1 (error)
|
|
638
|
+
# Bug #8 PR-B (codex critic P1.2 fix): shared 4-way gate used by both
|
|
639
|
+
# handle_worker_exit_codex and the inline-polling A4 path. Returns:
|
|
640
|
+
# 0 = synthesize allowed (caller writes signal_file + emits audit)
|
|
641
|
+
# 1 = BLOCKED (this function already wrote sentinel + emitted audit)
|
|
642
|
+
# Args: $1=iter $2=us_id $3=audit_clean_code (e.g. codex_exit_with_done_claim
|
|
643
|
+
# or inline_polling_a4_clean)
|
|
644
|
+
_bug8_check_synth_allowed() {
|
|
645
|
+
local iter="$1"
|
|
646
|
+
local us_id="${2:-${CURRENT_US:-ALL}}"
|
|
647
|
+
local audit_clean="$3"
|
|
648
|
+
|
|
649
|
+
# Gate 1: done-claim must exist.
|
|
650
|
+
if [[ ! -f "$DONE_CLAIM_FILE" ]]; then
|
|
651
|
+
log_error " Bug #8: no done-claim. Refusing to synthesize verify signal."
|
|
652
|
+
log_debug "[GOV] iter=$iter bug8=block_codex_exit_no_done_claim"
|
|
653
|
+
write_blocked_sentinel \
|
|
654
|
+
"Codex worker exited without writing done-claim (refusing to synthesize verify signal)" \
|
|
655
|
+
"$us_id" \
|
|
656
|
+
"infra_failure"
|
|
657
|
+
_emit_a4_fallback_audit "$us_id" "$iter" "blocked_codex_exit_no_done_claim"
|
|
658
|
+
return 1
|
|
659
|
+
fi
|
|
660
|
+
|
|
661
|
+
# Gate 2: git toplevel must equal $ROOT (canonicalized — macOS resolves
|
|
662
|
+
# /var → /private/var, NTFS may have 8.3 short paths; compare realpaths).
|
|
663
|
+
local _bug8_top _bug8_top_canon _bug8_root_canon
|
|
664
|
+
_bug8_top=$(git -C "$ROOT" rev-parse --show-toplevel 2>/dev/null)
|
|
665
|
+
_bug8_top_canon=$(cd "$_bug8_top" 2>/dev/null && pwd -P 2>/dev/null)
|
|
666
|
+
_bug8_root_canon=$(cd "$ROOT" 2>/dev/null && pwd -P 2>/dev/null)
|
|
667
|
+
if [[ -z "$_bug8_top" || "$_bug8_top_canon" != "$_bug8_root_canon" ]]; then
|
|
668
|
+
log_error " Bug #8: git unverifiable at \$ROOT=$ROOT (toplevel='$_bug8_top'). Refusing synthesis."
|
|
669
|
+
log_debug "[GOV] iter=$iter bug8=block_git_unverifiable root=$ROOT toplevel=$_bug8_top"
|
|
670
|
+
write_blocked_sentinel \
|
|
671
|
+
"git status unverifiable at $ROOT (toplevel='$_bug8_top'); refusing to synthesize verify signal" \
|
|
672
|
+
"$us_id" \
|
|
673
|
+
"infra_failure"
|
|
674
|
+
_emit_a4_fallback_audit "$us_id" "$iter" "blocked_git_unverifiable"
|
|
675
|
+
return 1
|
|
676
|
+
fi
|
|
677
|
+
|
|
678
|
+
# Gate 3: tree must be clean.
|
|
679
|
+
local _bug8_dirty
|
|
680
|
+
_bug8_dirty=$(git -C "$ROOT" status --porcelain 2>/dev/null)
|
|
681
|
+
if [[ -n "$_bug8_dirty" ]]; then
|
|
682
|
+
local _bug8_first5
|
|
683
|
+
_bug8_first5=$(printf '%s\n' "$_bug8_dirty" | head -n 5 | tr '\n' '|' | sed 's/|$//')
|
|
684
|
+
log_error " Bug #8: done-claim present but tree dirty. Refusing synthesis. dirty: $_bug8_first5"
|
|
685
|
+
log_debug "[GOV] iter=$iter bug8=block_dirty_tree us_id=$us_id dirty='$_bug8_first5'"
|
|
686
|
+
write_blocked_sentinel \
|
|
687
|
+
"worker_incomplete_uncommitted: done-claim present but tree dirty ($_bug8_first5)" \
|
|
688
|
+
"$us_id" \
|
|
689
|
+
"metric_failure"
|
|
690
|
+
_emit_a4_fallback_audit "$us_id" "$iter" "blocked_dirty_tree"
|
|
691
|
+
return 1
|
|
692
|
+
fi
|
|
693
|
+
|
|
694
|
+
# All gates passed — synthesize allowed.
|
|
695
|
+
return 0
|
|
696
|
+
}
|
|
697
|
+
|
|
638
698
|
handle_worker_exit_codex() {
|
|
639
699
|
local iter="$1"
|
|
640
700
|
local signal_file="$2"
|
|
641
701
|
|
|
642
|
-
log " Codex worker process exited. Checking for done-claim..."
|
|
643
|
-
|
|
644
|
-
|
|
645
|
-
|
|
646
|
-
log " Codex worker completed with done-claim (us_id=$dc_us_id). Auto-generating signal."
|
|
647
|
-
echo '{"iteration":'"$iter"',"status":"verify","us_id":"'"$dc_us_id"'","summary":"auto-generated after codex exit","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'"}' > "$signal_file"
|
|
648
|
-
_emit_a4_fallback_audit "$dc_us_id" "$iter" "codex_exit_with_done_claim"
|
|
649
|
-
else
|
|
650
|
-
log " WARNING: Codex worker exited without done-claim. Generating verify signal for current US."
|
|
651
|
-
local current_us
|
|
652
|
-
current_us=$(jq -r '.us_id // "US-001"' "$DESK/memos/${SLUG}-iter-signal.json" 2>/dev/null || echo "US-001")
|
|
653
|
-
local mem_us
|
|
654
|
-
mem_us=$(sed -n 's/.*Next.*US-\([0-9]*\).*/US-\1/p' "$DESK/memos/${SLUG}-memory.md" 2>/dev/null | head -1)
|
|
655
|
-
[[ -n "$mem_us" ]] && current_us="$mem_us"
|
|
656
|
-
echo '{"iteration":'"$iter"',"status":"verify","us_id":"'"$current_us"'","summary":"auto-generated after codex exit (no done-claim)","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'"}' > "$signal_file"
|
|
657
|
-
_emit_a4_fallback_audit "$current_us" "$iter" "codex_exit_no_done_claim"
|
|
702
|
+
log " Codex worker process exited. Checking for done-claim + clean tree..."
|
|
703
|
+
|
|
704
|
+
if ! _bug8_check_synth_allowed "$iter" "${CURRENT_US:-ALL}" "codex_exit_with_done_claim"; then
|
|
705
|
+
return 1
|
|
658
706
|
fi
|
|
707
|
+
|
|
708
|
+
# All 3 gates passed: done-claim present, git OK, tree clean → synthesize.
|
|
709
|
+
local dc_us_id
|
|
710
|
+
dc_us_id=$(jq -r '.us_id // "unknown"' "$DONE_CLAIM_FILE" 2>/dev/null)
|
|
711
|
+
log " Codex worker completed with done-claim (us_id=$dc_us_id) and clean tree. Auto-generating signal."
|
|
712
|
+
echo '{"iteration":'"$iter"',"status":"verify","us_id":"'"$dc_us_id"'","summary":"auto-generated after codex exit (clean tree)","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'"}' > "$signal_file"
|
|
713
|
+
_emit_a4_fallback_audit "$dc_us_id" "$iter" "codex_exit_with_done_claim_clean"
|
|
659
714
|
return 0
|
|
660
715
|
}
|
|
661
716
|
|
|
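Gates 2 and 3 above rely on two small shell idioms: comparing physical paths with `cd ... && pwd -P`, and collapsing the first few `git status --porcelain` lines into a single log-friendly string. The following is a minimal sketch of both in isolation. The helper names `canon_dir`, `same_dir`, and `summarize_dirty` are hypothetical and do not appear in the package; only the idioms themselves are taken from the diff.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Resolve a directory to its physical path (symlinks expanded).
canon_dir() {
  (cd "$1" 2>/dev/null && pwd -P)
}

# Succeed only when both directories resolve to the same physical path.
same_dir() {
  sd_a=$(canon_dir "$1") || return 1
  sd_b=$(canon_dir "$2") || return 1
  [ -n "$sd_a" ] && [ "$sd_a" = "$sd_b" ]
}

# Join at most five porcelain lines with '|' and drop the trailing separator.
summarize_dirty() {
  printf '%s\n' "$1" | head -n 5 | tr '\n' '|' | sed 's/|$//'
}

# Demo: a symlink and its target compare equal after canonicalization.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/real"
ln -s "$demo_dir/real" "$demo_dir/link"
if same_dir "$demo_dir/link" "$demo_dir/real"; then
  echo "same directory"
fi

# Demo: two dirty entries become one line.
summarize_dirty " M src/a.zsh
?? notes.txt"
echo
rm -rf "$demo_dir"
```

Comparing the raw strings would report a mismatch for the symlinked path even though both names refer to the same directory, which is why the gate canonicalizes both sides before comparing.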
@@ -2176,8 +2231,22 @@ poll_for_signal() {
|
|
|
2176
2231
|
|
|
2177
2232
|
# Check if signal file appeared
|
|
2178
2233
|
if [[ -f "$signal_file" ]]; then
|
|
2179
|
-
|
|
2180
|
-
|
|
2234
|
+
# Bug #7-extra (BOS 2026-05-06): file existence is NOT enough. Worker
|
|
2235
|
+
# (claude opus) writes via Claude Code's Write tool, which is not
|
|
2236
|
+
# guaranteed atomic — the file can appear with empty / partial JSON
|
|
2237
|
+
# before the write completes. Verifier was being dispatched against a
|
|
2238
|
+
# half-written iter-signal.json. Validate that the file holds a single
|
|
2239
|
+
# parseable, non-null JSON value (`jq -e .`) before accepting; any
|
|
2240
|
+
# failure simply continues polling (next tick re-reads). Note: `jq
|
|
2241
|
+
# empty` was rejected because it accepts an EMPTY file as "zero
|
|
2242
|
+
# documents" — the exact race window we need to reject.
|
|
2243
|
+
if jq -e . "$signal_file" >/dev/null 2>&1; then
|
|
2244
|
+
log " Signal file detected: $signal_file"
|
|
2245
|
+
return 0 # success
|
|
2246
|
+
fi
|
|
2247
|
+
# Empty / truncated / mid-write JSON. Stay in the polling loop and let
|
|
2248
|
+
# the next tick re-read once the writer has finished.
|
|
2249
|
+
log_debug "[bug7-extra] $role signal file present but JSON not yet valid — continue polling"
|
|
2181
2250
|
fi
|
|
2182
2251
|
|
|
2183
2252
|
# A4 fallback: done-claim exists but no signal → Worker forgot iter-signal
|
|
@@ -2216,11 +2285,24 @@ poll_for_signal() {
|
|
|
2216
2285
|
local dc_us_id
|
|
2217
2286
|
dc_us_id=$(jq -r '.us_id // "unknown"' "$DONE_CLAIM_FILE" 2>/dev/null)
|
|
2218
2287
|
if [[ -n "$dc_us_id" && "$dc_us_id" != "null" ]]; then
|
|
2219
|
-
|
|
2220
|
-
|
|
2221
|
-
|
|
2222
|
-
|
|
2223
|
-
|
|
2288
|
+
# Bug #8 PR-B: defer to the shared three-gate check (codex critic P1.2).
|
|
2289
|
+
# _bug8_check_synth_allowed handles done-claim/git/dirty-tree gates
|
|
2290
|
+
# uniformly across handle_worker_exit_codex AND this inline path so
|
|
2291
|
+
# both codex-exit and inline-polling A4 enforce the same contract.
|
|
2292
|
+
if _bug8_check_synth_allowed "$ITERATION" "$dc_us_id" "inline_polling_a4_clean"; then
|
|
2293
|
+
log " WARNING: done-claim exists for $dc_us_id but no iter-signal. Tree clean — auto-generating signal (A4 fallback)."
|
|
2294
|
+
log_debug "[GOV] iter=$ITERATION done_claim_without_signal=true us_id=$dc_us_id action=auto_generate_signal"
|
|
2295
|
+
echo '{"iteration":'"$ITERATION"',"status":"verify","us_id":"'"$dc_us_id"'","summary":"auto-generated by A4 fallback (done-claim + clean tree)","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'"}' > "$signal_file"
|
|
2296
|
+
_emit_a4_fallback_audit "$dc_us_id" "$ITERATION" "inline_polling_a4_clean"
|
|
2297
|
+
return 0
|
|
2298
|
+
else
|
|
2299
|
+
# Bug #8 PR-B (codex critic round-2 P2): hard-stop rc=2 so the
|
|
2300
|
+
# main worker loop (L3119) treats this BLOCKED as terminal,
|
|
2301
|
+
# matching the handle_worker_exit_codex blocked path. rc=1 is
|
|
2302
|
+
# ambiguous — caller may interpret it as a recoverable poll
|
|
2303
|
+
# failure and re-loop while the BLOCKED sentinel is on disk.
|
|
2304
|
+
return 2
|
|
2305
|
+
fi
|
|
2224
2306
|
fi
|
|
2225
2307
|
fi
|
|
2226
2308
|
fi
|
|
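For reference, the synthesized signal written by the A4 fallback is a single-line JSON object with five fields. The sketch below prints the same shape with `printf` instead of string concatenation. The function name `make_verify_signal` and its optional timestamp argument are hypothetical, and the sketch assumes the `us_id` and summary contain no characters that need JSON escaping.

```shell
#!/bin/sh
# Hypothetical helper; prints the same fields as the echo-built payload in the diff.
# $1 = iteration (integer), $2 = us_id, $3 = summary, $4 = optional UTC timestamp
make_verify_signal() {
  mvs_ts=${4:-$(date -u +%Y-%m-%dT%H:%M:%SZ)}
  printf '{"iteration":%d,"status":"verify","us_id":"%s","summary":"%s","timestamp":"%s"}\n' \
    "$1" "$2" "$3" "$mvs_ts"
}

make_verify_signal 3 "US-002" "auto-generated by A4 fallback (done-claim + clean tree)" "2026-05-06T00:00:00Z"
```

Passing the timestamp explicitly keeps the output deterministic for inspection; omitting it falls back to the same `date -u` format the diff uses.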
@@ -2271,8 +2353,16 @@ poll_for_signal() {
|
|
|
2271
2353
|
fi
|
|
2272
2354
|
# Dispatch to engine-specific exit handler
|
|
2273
2355
|
if [[ "$WORKER_ENGINE" = "codex" && "$role" != *erifier* ]]; then
|
|
2274
|
-
handle_worker_exit_codex
|
|
2275
|
-
|
|
2356
|
+
# Bug #8 PR-B: handle_worker_exit_codex now returns 1 when it has
|
|
2357
|
+
# written a BLOCKED sentinel (no done-claim, dirty tree, git
|
|
2358
|
+
# unverifiable). Propagate the return so main loop stops, instead
|
|
2359
|
+
# of swallowing it with `return 0` and continuing as if the poll
|
|
2360
|
+
# had succeeded.
|
|
2361
|
+
if handle_worker_exit_codex "$ITERATION" "$signal_file"; then
|
|
2362
|
+
return 0
|
|
2363
|
+
else
|
|
2364
|
+
return 2
|
|
2365
|
+
fi
|
|
2276
2366
|
fi
|
|
2277
2367
|
# Claude path (or verifier of any engine)
|
|
2278
2368
|
if handle_worker_exit_claude "$pane_id" "$ITERATION" "$trigger_file"; then
|
|
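The two hunks above change how a blocked outcome travels up the call chain: the exit handler returns non-zero, the poller converts that to 2, and the caller reads the code from `$?` in its `else` branch. A minimal sketch of that convention follows. The functions `exit_handler`, `poll`, and `run_iteration` are hypothetical stand-ins, and what the real main loop does on code 2 is assumed from the comment text, not shown in this section.

```shell
#!/bin/sh
# Hypothetical stand-ins; only the return-code convention comes from the diff.

# Plays the role of handle_worker_exit_codex:
# 0 when synthesis is allowed, 1 when a BLOCKED sentinel was written.
exit_handler() {
  [ "$1" = "clean" ]
}

# Plays the role of poll_for_signal: escalate a blocked handler to rc=2.
poll() {
  if exit_handler "$1"; then
    return 0
  else
    return 2
  fi
}

# Plays the role of the main worker loop.
run_iteration() {
  if poll "$1"; then
    echo "signal accepted"
  else
    rc=$?   # exit status of the `if` condition, i.e. of poll
    if [ "$rc" -eq 2 ]; then
      echo "blocked: stop"
    else
      echo "poll failed (rc=$rc): recoverable"
    fi
  fi
}

run_iteration clean   # prints "signal accepted"
run_iteration dirty   # prints "blocked: stop"
```

Using a distinct code for the terminal case means the caller never has to inspect the filesystem for a BLOCKED sentinel to decide whether retrying is safe.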
@@ -2467,8 +2557,16 @@ run_single_verifier() {
|
|
|
2467
2557
|
fi
|
|
2468
2558
|
fi
|
|
2469
2559
|
|
|
2560
|
+
# Bug #7 Fix-Q/R: reap verifier pane the moment we accept the verdict so
|
|
2561
|
+
# codex/claude cannot keep self-reviewing and rewrite verify-verdict.json.
|
|
2562
|
+
# Lock applied AFTER cp so the archived snapshot is also frozen at intent.
|
|
2563
|
+
_kill_pane_process "$VERIFIER_PANE" "verifier-${suffix}"
|
|
2564
|
+
|
|
2470
2565
|
# Copy verdict to destination
|
|
2471
2566
|
cp "$VERDICT_FILE" "$verdict_dest"
|
|
2567
|
+
_lock_sentinel "$VERDICT_FILE"
|
|
2568
|
+
# PR-0b-narrow: stamp leader handshake ack on the verdict (audit-only).
|
|
2569
|
+
_stamp_ack_field "$VERDICT_FILE"
|
|
2472
2570
|
log " Verifier$suffix verdict saved to $verdict_dest"
|
|
2473
2571
|
return 0
|
|
2474
2572
|
}
|
|
@@ -2528,6 +2626,14 @@ run_sequential_final_verify() {
|
|
|
2528
2626
|
return 1
|
|
2529
2627
|
fi
|
|
2530
2628
|
|
|
2629
|
+
# Bug #7 Fix-Q/R: reap verifier pane between per-US final verifications so
|
|
2630
|
+
# the previous codex/claude TUI cannot continue running while the next per-
|
|
2631
|
+
# US verifier dispatch reuses the same pane.
|
|
2632
|
+
_kill_pane_process "$VERIFIER_PANE" "verifier-final"
|
|
2633
|
+
_lock_sentinel "$VERDICT_FILE"
|
|
2634
|
+
# PR-0b-narrow: stamp leader handshake ack on the verdict (audit-only).
|
|
2635
|
+
_stamp_ack_field "$VERDICT_FILE"
|
|
2636
|
+
|
|
2531
2637
|
# Check verdict
|
|
2532
2638
|
local verdict
|
|
2533
2639
|
verdict=$(jq -r '.verdict' "$VERDICT_FILE" 2>/dev/null)
|
|
@@ -2940,6 +3046,10 @@ main() {
|
|
|
2940
3046
|
fi
|
|
2941
3047
|
|
|
2942
3048
|
# --- governance.md s7 step 8 (cleanup): Clean previous iteration signals ---
|
|
3049
|
+
# Bug #7 Fix-R cleanup: unlock 0o444 sentinels written by the previous
|
|
3050
|
+
# iteration's reaper before rm so cleanup does not log permission noise.
|
|
3051
|
+
_unlock_sentinel "$SIGNAL_FILE"
|
|
3052
|
+
_unlock_sentinel "$VERDICT_FILE"
|
|
2943
3053
|
rm -f "$SIGNAL_FILE" "$DONE_CLAIM_FILE" "$VERDICT_FILE" 2>/dev/null
|
|
2944
3054
|
rm -f "$WORKER_HEARTBEAT" "$VERIFIER_HEARTBEAT" 2>/dev/null
|
|
2945
3055
|
|
|
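The bodies of `_lock_sentinel` and `_unlock_sentinel` are not part of this section. Going only by the `0o444` mentioned in the comment, the lock and unlock cycle could look roughly like the sketch below. The names `lock_file`, `unlock_file`, and `mode_of`, and the choice of `644` as the unlocked mode, are assumptions.

```shell
#!/bin/sh
# Hypothetical re-implementations; the real helpers may differ.

# Make an existing file read-only for everyone (mode 444).
lock_file() {
  if [ -f "$1" ]; then chmod 444 "$1"; fi
}

# Restore owner write permission (assumed mode 644) before cleanup.
unlock_file() {
  if [ -f "$1" ]; then chmod 644 "$1"; fi
}

# Print the ten-character permission string of a file.
mode_of() {
  ls -l "$1" | cut -c1-10
}

demo_file=$(mktemp)
printf '{"verdict":"pass"}\n' > "$demo_file"

lock_file "$demo_file"
echo "locked:   $(mode_of "$demo_file")"

unlock_file "$demo_file"
echo "unlocked: $(mode_of "$demo_file")"
rm -f "$demo_file"
```

Both helpers are no-ops when the file is missing, which matches how the cleanup step calls the unlock helper unconditionally at the start of every iteration.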
@@ -3003,6 +3113,12 @@ main() {
|
|
|
3003
3113
|
if poll_for_signal "$SIGNAL_FILE" "$WORKER_HEARTBEAT" "$WORKER_PANE" "$worker_launch" "Worker"; then
|
|
3004
3114
|
worker_poll_done=1
|
|
3005
3115
|
log_debug "[FLOW] iter=$ITERATION poll_signal_received=true"
|
|
3116
|
+
# Bug #7 Fix-Q/R: reap worker pane immediately so claude/codex cannot
|
|
3117
|
+
# self-review and rewrite iter-signal.json (1m43s drift observed).
|
|
3118
|
+
_kill_pane_process "$WORKER_PANE" "worker"
|
|
3119
|
+
_lock_sentinel "$SIGNAL_FILE"
|
|
3120
|
+
# PR-0b-narrow: stamp leader handshake ack on the iter-signal (audit-only).
|
|
3121
|
+
_stamp_ack_field "$SIGNAL_FILE"
|
|
3006
3122
|
else
|
|
3007
3123
|
worker_poll_rc=$?
|
|
3008
3124
|
if (( worker_poll_rc == 2 )); then
|
|
@@ -3210,6 +3326,12 @@ main() {
|
|
|
3210
3326
|
update_status "blocked" "verifier_dead"
|
|
3211
3327
|
return 1
|
|
3212
3328
|
fi
|
|
3329
|
+
# Bug #7 Fix-Q/R: reap verifier pane immediately so codex cannot
|
|
3330
|
+
# rewrite verify-verdict.json post-detect (mtime drift fix).
|
|
3331
|
+
_kill_pane_process "$VERIFIER_PANE" "verifier"
|
|
3332
|
+
_lock_sentinel "$VERDICT_FILE"
|
|
3333
|
+
# PR-0b-narrow: stamp leader handshake ack on the verdict (audit-only).
|
|
3334
|
+
_stamp_ack_field "$VERDICT_FILE"
|
|
3213
3335
|
fi
|
|
3214
3336
|
|
|
3215
3337
|
# AC1: capture verifier end timestamp
|