@ai-dev-methodologies/rlp-desk 0.11.1 → 0.13.0
- package/docs/plans/spicy-booping-galaxy.md +322 -0
- package/docs/rlp-desk/artifact-schema.md +99 -0
- package/docs/rlp-desk/ci-setup.md +100 -0
- package/docs/rlp-desk/e2e-scenarios.md +102 -0
- package/docs/rlp-desk/plans/rlp-desk-tmux-flywheel-routing.md +730 -0
- package/install.sh +93 -20
- package/package.json +9 -3
- package/scripts/build-node-manifest.js +52 -0
- package/scripts/postinstall.js +162 -8
- package/src/commands/rlp-desk.md +73 -50
- package/src/governance.md +56 -7
- package/src/node/MANIFEST.txt +15 -0
- package/src/node/cli/command-builder.mjs +43 -5
- package/src/node/constants.mjs +19 -0
- package/src/node/init/campaign-initializer.mjs +100 -10
- package/src/node/polling/signal-poller.mjs +139 -3
- package/src/node/reporting/campaign-reporting.mjs +5 -1
- package/src/node/run.mjs +31 -2
- package/src/node/runner/campaign-main-loop.mjs +521 -44
- package/src/node/runner/leader-registry.mjs +100 -0
- package/src/node/runner/prompt-detector.mjs +41 -0
- package/src/node/runner/prompt-dismisser.mjs +200 -0
- package/src/node/shared/fs.mjs +38 -0
- package/src/node/util/debug-log.mjs +56 -0
- package/src/node/util/desk-root.mjs +24 -0
- package/src/node/util/shell-quote.mjs +12 -0
- package/docs/superpowers/plans/2026-04-24-gpt-5-5-default.md +0 -517
- package/docs/superpowers/specs/2026-04-24-gpt-5-5-default.md +0 -107
- /package/docs/{TODO-verification-next.md → rlp-desk/TODO-verification-next.md} +0 -0
- /package/docs/{architecture.md → rlp-desk/architecture.md} +0 -0
- /package/docs/{blueprints → rlp-desk/blueprints}/blueprint-flywheel-enhancement.md +0 -0
- /package/docs/{blueprints → rlp-desk/blueprints}/blueprint-pivot-step.md +0 -0
- /package/docs/{blueprints → rlp-desk/blueprints}/plan-flywheel-enhancement.md +0 -0
- /package/docs/{blueprints → rlp-desk/blueprints}/sv-architecture-rethink.md +0 -0
- /package/docs/{getting-started.md → rlp-desk/getting-started.md} +0 -0
- /package/docs/{internal → rlp-desk/internal}/verification-policy-gap-analysis.md +0 -0
- /package/docs/{internal → rlp-desk/internal}/verification-strategy-research.md +0 -0
- /package/docs/{multi-mission-orchestration.md → rlp-desk/multi-mission-orchestration.md} +0 -0
- /package/docs/{plans → rlp-desk/plans}/cozy-gliding-trinket.md +0 -0
- /package/docs/{plans → rlp-desk/plans}/frolicking-churning-honey.md +0 -0
- /package/docs/{plans → rlp-desk/plans}/keen-sauteeing-snowflake.md +0 -0
- /package/docs/{plans → rlp-desk/plans}/mutable-booping-corbato.md +0 -0
- /package/docs/{plans → rlp-desk/plans}/rlp-desk-0.11-handoff-7fixes.md +0 -0
- /package/docs/{plans → rlp-desk/plans}/rlp-desk-0.11.1-tmux-pane-disappearance.md +0 -0
- /package/docs/{plans → rlp-desk/plans}/rlp-desk-elegant-papert-agent-a8cd695ffca2a3ad8.md +0 -0
- /package/docs/{plans → rlp-desk/plans}/rlp-desk-elegant-papert.md +0 -0
- /package/docs/{plans → rlp-desk/plans}/toasty-whistling-diffie-agent-a6814625642e956da.md +0 -0
- /package/docs/{plans → rlp-desk/plans}/toasty-whistling-diffie.md +0 -0
- /package/docs/{plans → rlp-desk/plans}/validated-snacking-crayon.md +0 -0
- /package/docs/{protocol-reference.md → rlp-desk/protocol-reference.md} +0 -0
@@ -0,0 +1,322 @@
# Plan: Fix Claude worker hang on the `.claude/` sensitive-path prompt

> **Source bug report**: `/Users/kyjin/dev/doul/bos/docs/exec-plans/active/2026-05-01-rlp-desk-bug-report.md`
> **Severity**: HIGH. Every campaign blocks when `--mode tmux` is combined with `--worker-model sonnet/haiku/opus`.
> **Target version**: 0.13.0 (breaking: the project-local sentinel path moves)

---

## 1. Context

### Problem

When a worker writes a sentinel such as `<project>/.claude/ralph-desk/memos/<slug>-done-claim.json`,
Claude Code treats the `.claude/` path as a hardcoded self-modification suspect and
raises a permission prompt. `--dangerously-skip-permissions` does not bypass it.
The Worker hangs, the Leader's `pollForSignal` times out after 30 minutes, and the campaign ends in BLOCKED (`infra_failure`).

This does not occur with Codex workers (`gpt-5.5:*` and similar), which sit outside Claude Code's sensitive-path policy.
In other words, **Claude-family workers are currently unusable in practice**.

### Key decision

Move the project-local runtime directory from `<project>/.claude/ralph-desk/` to
`<project>/.rlp-desk/`.

**Rationale**:
- The Claude Code sensitive check is triggered by the `.claude/` directory name itself.
- Renaming the directory is enough to avoid it (design-desk, the inspiration for this project, also keeps its sentinels outside `.claude/`).
- `~/.claude/ralph-desk/` (install location + cross-project analytics) is unchanged: the Leader
  never self-modifies its own install dir, so the sensitive check is not triggered there.

### Non-goals

- Changing the `~/.claude/ralph-desk/` install path (registry, analytics, and leader binaries stay where they are)
- Removing `--mode agent` (Fix-1 resolves it automatically, so it stays as is)

---

## 2. Approach (3 phases)

### Phase 1: Fix-1, move the project-local sentinel path (`.claude/ralph-desk/` → `.rlp-desk/`)

**Files to change** (from the Explore pass):

| File | Location | Change |
|------|----------|--------|
| `src/node/init/campaign-initializer.mjs` | L5, L13 | `GITIGNORE_RULE` + `deskRoot` constants |
| `src/scripts/init_ralph_desk.zsh` | L77, L1091, L1100, L1105-1137 | `DESK` variable + permission marker patterns |
| `src/scripts/run_ralph_desk.zsh` | L255 | `DESK` variable |
| `src/scripts/lib_ralph_desk.zsh` | L57 | Clarify the home-directory variable comment (comment only, no behavior change) |
| `src/node/runner/campaign-main-loop.mjs` | L44-80 | Path-building functions |
| `src/commands/rlp-desk.md` | 24 lines | Every `.claude/ralph-desk/` → `.rlp-desk/` reference |
| `src/governance.md` | 6 lines | Path documentation |

**Unchanged**:
- `src/node/runner/leader-registry.mjs` (home directory `~/.claude/ralph-desk/registry.jsonl`)
- `install.sh`, `scripts/postinstall.js` (home-directory install)

**Worker/Verifier `--add-dir` whitelist**:
- Existing: `--add-dir "$HOME/.claude/ralph-desk" "$ROOT"` (lib_ralph_desk.zsh:57-58).
- `$ROOT` is already whitelisted, so `$ROOT/.rlp-desk` is **included automatically**; no extra entry is needed.
- The point is that the rename itself avoids the sensitive-check trigger; no sandbox or permission change is involved.

**Runtime dir override (Synthesis: future escape hatch)**:

`deskRoot` is externalized to the `RLP_DESK_RUNTIME_DIR` environment variable, defaulting to `.rlp-desk/`. If the platform later extends its sensitive check again, users can switch immediately with, for example, `RLP_DESK_RUNTIME_DIR=.rlp-runtime/`. This makes P1 (don't fight the platform) permanent at the code level.
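
A minimal sketch of how the override might be resolved. The helper name `resolveDeskRoot` comes from the LOW test expectation in §3a; the default constant name, the whitespace trimming, and the omission of the trailing slash are assumptions for illustration, not the shipped implementation.

```javascript
// Hypothetical sketch: resolve the project-local runtime directory name.
// DEFAULT_DESK_ROOT and the trimming behavior are assumptions.
const DEFAULT_DESK_ROOT = '.rlp-desk';

function resolveDeskRoot(env = process.env) {
  const override = (env.RLP_DESK_RUNTIME_DIR ?? '').trim();
  // Fall back to the default when the variable is unset or blank.
  return override === '' ? DEFAULT_DESK_ROOT : override;
}

console.log(resolveDeskRoot({}));
console.log(resolveDeskRoot({ RLP_DESK_RUNTIME_DIR: '.rlp-runtime' }));
```

Passing `env` as a parameter keeps the function testable without mutating `process.env`.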

**Migration race-safety (atomic; incorporates Codex Critic feedback)**:

This procedure runs **only on entry to init mode**. Per the §2 Phase 3 policy, run mode never performs an automatic mv (warning + manual instructions only).

- Lock file location: `<project>/.rlp-desk-migration.lock` (outside the target dir; the target dir may not exist yet, so the lock lives in the parent `<project>/`).
- Lock acquisition: `fs.openSync(lockPath, 'wx')` (exclusive create, no TOCTOU). If the file already exists, abort immediately with an "another process is migrating" error.
- Init-mode migration procedure (while holding the lock):
  1. Check whether each side exists (legacy `.claude/ralph-desk/` and new `.rlp-desk/`).
  2. Both exist → **refuse** the automatic mv and tell the user to clean up (pre-mortem #1 binding).
  3. Only legacy exists → `fs.renameSync(legacy, new)` (atomic within the same filesystem).
  4. Neither exists → no-op (normal init).
- Run mode (legacy found): do not attempt the mv; exit non-zero and print the manual command. This protects in-progress campaigns.
- Lock release: guarantee `fs.unlinkSync(lockPath)` via `try/finally`. If the process crashes, the next run detects the stale lock (mtime older than N minutes), warns, and removes it.
- `fresh` mode (the `fs.rm({recursive:true})` at `campaign-initializer.mjs:20`) runs only on the new path, after migration completes.
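
The decision table above can be sketched as a pure function, kept separate from the filesystem calls so it is easy to unit-test. The function name `decideMigration`, its argument shape, and the returned `action` labels are hypothetical; only the branching rules and the message fragments are taken from the plan.

```javascript
// Hypothetical sketch of the migration decision rules; names are illustrative.
function decideMigration({ mode, legacyExists, targetExists }) {
  if (legacyExists && targetExists) {
    // Both directories present: never move automatically, regardless of mode.
    return { action: 'refuse', exitCode: 1, message: 'both directories exist' };
  }
  if (legacyExists) {
    return mode === 'init'
      ? { action: 'rename', exitCode: 0, message: 'migrating .claude/ralph-desk to .rlp-desk' }
      : { action: 'abort', exitCode: 1, message: 'run: mv .claude/ralph-desk .rlp-desk' };
  }
  // No legacy directory: nothing to migrate.
  return { action: 'noop', exitCode: 0, message: '' };
}

console.log(decideMigration({ mode: 'init', legacyExists: true, targetExists: false }).action);
console.log(decideMigration({ mode: 'run', legacyExists: true, targetExists: false }).action);
```

In this sketch the caller would acquire the lock first and perform `fs.renameSync` only when `action` is `'rename'`.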

### Phase 2: Fix-2, warn on the Claude worker + tmux combination

**Location**: `src/node/cli/command-builder.mjs` (`CLAUDE_MODELS = Set(['haiku','sonnet','opus'])` already exists there).

**Logic**: after `parseRunOptions()` (`src/node/run.mjs:101-180`) has parsed the options,
add the following check on entry to `runRunCommand`:

```js
// src/node/run.mjs (validation step after parsing)
if (mode === 'tmux' && isClaudeEngine(workerModel)) {
  console.warn(
    'WARNING: Claude worker in tmux mode may hang on .claude/ sentinel writes.\n' +
    'After v0.13.0, sentinels live in <project>/.rlp-desk/ which avoids this.\n' +
    'If hang persists, switch to --worker-model gpt-5.5:high (codex) or --mode agent.'
  );
}
```
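
The check above relies on an `isClaudeEngine()` helper. A minimal sketch of what it might look like follows; the `CLAUDE_MODELS` set is quoted from the plan, while stripping a `:suffix` (as in `gpt-5.5:high`) and lowercasing are assumptions about how worker-model strings are normalized.

```javascript
// Hypothetical sketch of the helper; the suffix handling is an assumption.
const CLAUDE_MODELS = new Set(['haiku', 'sonnet', 'opus']);

function isClaudeEngine(workerModel) {
  if (typeof workerModel !== 'string') return false;
  // Drop an optional ":effort"-style suffix before the lookup.
  const base = workerModel.split(':')[0].trim().toLowerCase();
  return CLAUDE_MODELS.has(base);
}

console.log(isClaudeEngine('sonnet'));
console.log(isClaudeEngine('gpt-5.5:high'));
```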

The PRD brainstorm flow (`src/commands/rlp-desk.md`) shows the same warning text.

**Observability: early-detect heuristic for sentinel hangs** (Architect synthesis):

Today the Leader's `pollForSignal` detects the hang only through the 30-minute timeout, so the failure is silent. To strengthen this:
- Grep the Worker pane stdout for prompt signatures such as `Do you want to ` / `❯ 1. Yes` → write BLOCKED immediately, labeled `category=permission_prompt`.
- Location: `src/node/runner/prompt-dismisser.mjs`, or a separate `prompt-detector.mjs`.
- Effect: the next platform change is discovered within seconds rather than after 30 minutes.
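
The signature grep described above could be sketched as follows. The two signatures and the `permission_prompt` category come from the plan; the function name `detectPermissionPrompt` and the returned object shape are hypothetical.

```javascript
// Hypothetical sketch: scan captured pane text for known prompt signatures.
const PROMPT_SIGNATURES = ['Do you want to ', '❯ 1. Yes'];

function detectPermissionPrompt(paneText) {
  const hit = PROMPT_SIGNATURES.find((signature) => paneText.includes(signature));
  return hit
    ? { blocked: true, category: 'permission_prompt', signature: hit }
    : { blocked: false };
}

const sample = 'Editing file\nDo you want to make this edit?\n❯ 1. Yes\n  2. No';
console.log(detectPermissionPrompt(sample).category);
```

A plain substring match keeps the heuristic cheap enough to run on every poll tick; the trade-off is that ordinary worker output containing the same phrases would also match.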

### Phase 3: Migration helper (detect legacy `.claude/ralph-desk/`)

**Location**: on entry to `src/node/init/campaign-initializer.mjs`, and immediately before `ensureScaffold()` in `src/node/runner/campaign-main-loop.mjs`.

**Logic**:
1. If `<project>/.claude/ralph-desk/` exists and `<project>/.rlp-desk/` does not,
   detect it and do one of the following:
   - **Automatic mv** (init mode): if the scaffold is being newly created, move automatically using the §2 Migration race-safety procedure.
   - **Warning + manual command** (run mode): non-zero exit + "An existing campaign was found. Run `mv .claude/ralph-desk .rlp-desk` and retry."
2. If both exist: regardless of mode, **refuse** the automatic mv, exit non-zero, and tell the user to clean up (stderr includes "both directories exist"). This matches §2 Migration race-safety and the §3a MEDIUM-B check.
3. In `.gitignore`, remove the `.claude/ralph-desk/` line and add a `.rlp-desk/` line (at init time, after a successful mv).

---

## 3. Verification

Per the CLAUDE.md mandate, all of the following must pass before committing:

### 3a. Self-Verification (6 scenarios; mandatory when `src/governance.md` or `init_ralph_desk.zsh` changes; executable commands)

Each scenario: Worker (execution_steps) → Verifier (reasoning, 5 categories) → PASS.

#### LOW (unit: `isClaudeEngine()` + env resolution)
```bash
node --test tests/node/test-claude-engine-detect.mjs
# expected: tests passed; isClaudeEngine('sonnet') === true; resolveDeskRoot(env={RLP_DESK_RUNTIME_DIR:'.x'}) === '.x'
```

#### MEDIUM-A (auto-mv happy path; part of pre-mortem #2)
```bash
TMP=$(mktemp -d); cd "$TMP"; git init -q
mkdir -p .claude/ralph-desk/memos && echo data > .claude/ralph-desk/memos/x.md
node ~/.claude/ralph-desk/node/run.mjs init testslug --autonomous
test ! -d .claude/ralph-desk && test -f .rlp-desk/memos/x.md && \
  grep -q '"Read(.rlp-desk/\*\*)"' .claude/settings.local.json
# expected: exit 0, all assertions PASS
```

#### MEDIUM-B (conflict refusal; pre-mortem #1 binding)
```bash
TMP=$(mktemp -d); cd "$TMP"; git init -q
mkdir -p .claude/ralph-desk .rlp-desk
node ~/.claude/ralph-desk/node/run.mjs init testslug --autonomous 2> stderr.log
test $? -ne 0 && grep -q 'both directories exist' stderr.log
# expected: non-zero exit, conflict message
```

#### HIGH-A (claude+tmux E2E; AC4 binding, verifies the primary fix)
```bash
TMP=$(mktemp -d); cd "$TMP"; git init -q
node ~/.claude/ralph-desk/node/run.mjs init testslug --autonomous
timeout 600 node ~/.claude/ralph-desk/node/run.mjs run testslug \
  --mode tmux --worker-model sonnet --max-iter 1 --iter-timeout 300
test $? -eq 0 && test -f .rlp-desk/memos/testslug-done-claim.json
# expected: exit 0, completes without a sentinel hang
```

#### HIGH-B (codex+tmux regression; AC5 binding, P3 first-class)
```bash
TMP=$(mktemp -d); cd "$TMP"; git init -q
node ~/.claude/ralph-desk/node/run.mjs init testslug --autonomous
timeout 600 node ~/.claude/ralph-desk/node/run.mjs run testslug \
  --mode tmux --worker-model gpt-5.5:high --max-iter 1 --iter-timeout 300
test $? -eq 0 && test -f .rlp-desk/memos/testslug-done-claim.json
# expected: exit 0, no regression for the codex worker
```

#### OBSERVABILITY (early prompt detection; AC6 binding)
```bash
# Inject a "❯ 1. Yes" line into mock worker stdout → prompt-detector writes BLOCKED within 5 seconds
node tests/node/test-prompt-detector-e2e.mjs
jq -r .category .rlp-desk/memos/testslug-blocked.json
# expected: "permission_prompt"
```

### 3b. Review

- **ralplan** (Planner→Architect→Critic): mandatory, because this changes governance/templates.
- **codex review**: repeat until it reports 0 issues (CLAUDE.md mandate).

### 3c. Local sync verification

Using the banner-aware verification procedure from the `Local File Sync` section of CLAUDE.md,
confirm that every `src/` change has been synced to `~/.claude/ralph-desk/`:

```bash
diff -rq src/node ~/.claude/ralph-desk/node | grep -v 'DO NOT EDIT'
# expected: empty output (all files identical apart from the banner)
```

### 3d. Manual reproduction (replaying the bug reporter's scenario)

```bash
# Simulate the legacy path
mkdir -p /tmp/test-rlp-desk/.claude/ralph-desk
cd /tmp/test-rlp-desk

# init → confirm migration or creation of a new .rlp-desk
node ~/.claude/ralph-desk/node/run.mjs init test-slug --autonomous

# Check: .rlp-desk/ exists + .gitignore updated
test -d .rlp-desk && echo PASS || echo FAIL
grep -q '^\.rlp-desk/$' .gitignore && echo PASS || echo FAIL

# 1-iter campaign with claude worker
node ~/.claude/ralph-desk/node/run.mjs run test-slug \
  --mode tmux --worker-model sonnet --max-iter 1 --iter-timeout 600
# expected: 1 iteration completes without a sentinel hang
```

---

## 4. Release Plan

- **Version**: `0.13.0` (npm minor bump; user impact is absorbed by automatic migration plus clear run-mode guidance).
- **Release notes** (user-facing only, per the CLAUDE.md mandate; BREAKING label emphasized at the top):
  - **BREAKING**: the project-local runtime moves from `.claude/ralph-desk/` to `.rlp-desk/`.
    Init mode migrates automatically; run mode warns and points to a manual `mv .claude/ralph-desk .rlp-desk`.
  - **NEW**: the runtime directory can be overridden with the `RLP_DESK_RUNTIME_DIR` environment variable (to sidestep future platform changes).
  - **FIX**: resolves the sentinel write hang with the Claude worker + tmux combination.
  - **NEW**: warning for the claude worker + tmux combination, plus early permission-prompt detection (BLOCKED `category=permission_prompt`).
- **Roadmap note**: the legacy detection logic is slated for deprecation in 1.0.0 (a deprecation-cycle commitment).

---

## 5. Critical files (summary of files modified when executing this plan)

```
src/node/init/campaign-initializer.mjs   # deskRoot constant + GITIGNORE_RULE + migration detection
src/node/runner/campaign-main-loop.mjs   # path-building functions + legacy check before ensureScaffold()
src/node/cli/command-builder.mjs         # isClaudeEngine() helper export
src/node/run.mjs                         # tmux+claude warning after parseRunOptions()
src/scripts/init_ralph_desk.zsh          # DESK variable + permission marker (.rlp-desk/**)
src/scripts/run_ralph_desk.zsh           # DESK variable
src/scripts/lib_ralph_desk.zsh           # no change ($ROOT already whitelisted, so .rlp-desk is included; comment clarified only)
src/commands/rlp-desk.md                 # 24 lines of path references updated
src/governance.md                        # 6 lines of path documentation
package.json                             # version 0.13.0
```

---

## 6. Resolved decisions (confirmed by the user; recommended options)

- **Migration policy**: init mode = automatic mv, run mode = warning + manual mv guidance. When user data (memos, plans)
  exists, init runs at the moment a new scaffold is created, so the move is safe; run may be an in-progress campaign,
  so explicit user confirmation is required.
- **Version level**: `0.13.0` (npm minor). The project-local path change is breaking, but the automatic migration
  helper and clear run-mode guidance make a minor bump sufficient. A major (1.0.0) is left for a later stabilization phase.
- **`--mode agent` handling**: resolved automatically by Fix-1 (path move). Agent-mode workers also write to `.rlp-desk/`,
  so the Claude Code sensitive trigger does not fire. No separate work is needed.

---

## 7. RALPLAN-DR Deliberation Summary

### Principles

1. **Don't fight platform-reserved namespaces**: Claude Code's hardcoded sensitive policy cannot be bypassed; avoidance is the only answer.
2. **Project-local runtime must stay in the git tree**: campaign memory/plans must persist across iterations.
3. **Cross-engine fallback is first-class**: the codex worker is an equal option, not a permanent workaround.
4. **Migration safety first; automate only when it is clearly safe**: init mode runs at fresh-scaffold time, so it moves automatically; run mode warns and requires a manual mv to protect in-progress campaigns. The choice is context-aware automatic/manual, not "automatic by default".

### Decision Drivers (top 3)

1. **Unblock HIGH severity blocker**: every claude-worker + tmux campaign is currently blocked.
2. **Minimize breaking surface**: prevent loss of in-progress campaigns.
3. **Reference parity**: design-desk (the inspiration) also keeps sentinels outside `.claude/`.

### Viable Options

| Option | Pros | Cons |
|---|---|---|
| **A. Move to `.rlp-desk/` (recommended)** | Fully avoids the `.claude/` trigger, matches the design-desk pattern, stays in the git tree | Every user must migrate (mitigated by automation) |
| **B. Keep `.claude/ralph-desk/` + permission escape** | No path change | The permission allowlist also failed to bypass the prompt in the user's repro. Depends on Claude Code internals → brittle |
| **C. `$TMPDIR/rlp-desk-<slug>/`** | Keeps the project tree clean | Loses git tracking, breaks campaign memory persistence, makes resume fragile |

### Invalidation rationale

- **B**: The table in §2 of the bug report shows that adding a `Read/Edit/Write(.claude/ralph-desk/**)` allowlist fails. Claude Code's sensitive gate operates separately from the general permission system, so it cannot be relied on.
- **C**: Persistence of campaign memory (`memos/<slug>-memory.md` and similar) across iterations is a core design property. A tmpfs-based location is lost on OS reboot or cleanup, which makes resume impossible.

→ **A is the only viable option**.

### Pre-mortem (3 scenarios; explicitly bound to verification §3a)

1. **Automatic mv overwrites user data**: an mv collision when both directories exist → mitigation: the atomic lock and conflict refusal described in §2 (Migration race-safety and Phase 3). **Verification**: §3a MEDIUM-B.
2. **Missing permission marker**: if the `.rlp-desk/**` permission is not added in `init_ralph_desk.zsh`, the worker is prompted on the new path as well → mitigation: update the permission marker patterns and assert on them. **Verification**: the `grep -q '"Read(.rlp-desk/\*\*)"' .claude/settings.local.json` assertion in §3a MEDIUM-A, plus a successful sentinel write in the §3a HIGH-A 1-iter campaign (end-to-end).
3. **Sandbox `--add-dir` does not cover the path**: `$ROOT` is already whitelisted, so the new path is included automatically, but if a change to the `--add-dir` arguments causes a regression, the sandbox rejects the new path → mitigation: assert on the built worker command in an integration test. **Verification**: explicit confirmation of `claude --add-dir "$ROOT" ...` at the worker spawn step of §3a HIGH-A, plus the manual 1-iter reproduction in §3d.

### Acceptance Criteria (automatically verifiable; pass signal stated)

- [ ] **AC1**: Only `<project>/.rlp-desk/` is used. Check: `find . -type d -path '*.claude/ralph-desk' -newer <campaign-start-marker> | wc -l` == 0.
- [ ] **AC2**: When legacy `.claude/ralph-desk/` exists, init moves it automatically and `.gitignore` then contains a `.rlp-desk/` line. Check: `test ! -d .claude/ralph-desk && test -d .rlp-desk && grep -q '^\.rlp-desk/$' .gitignore`.
- [ ] **AC3**: When run mode finds a legacy directory, it exits non-zero and stderr contains the `mv .claude/ralph-desk .rlp-desk` instruction. Check: `node run.mjs run ... 2> stderr.log; test $? -ne 0 && grep -q "mv .claude/ralph-desk" stderr.log`.
- [ ] **AC4**: A 1-iter campaign with `--mode tmux --worker-model sonnet` finishes within 600 seconds, `done-claim.json` exists, and the exit code is 0. Check: `timeout 600 node run.mjs run ... --max-iter 1; test $? -eq 0 && test -f .rlp-desk/memos/<slug>-done-claim.json`.
- [ ] **AC5**: A 1-iter campaign with `--worker-model gpt-5.5:high` passes the same assertions (AC4 pattern): no regression.
- [ ] **AC6**: Early permission-prompt detection: when a mock prompt is injected into the worker, BLOCKED is written within 5 seconds with sentinel `category=permission_prompt`. Check: `test "$(jq -r .category .rlp-desk/memos/<slug>-blocked.json)" = "permission_prompt"`.

---

## 8. ADR (Architectural Decision Record)

- **Decision**: Move the project-local sentinel/runtime from `.claude/ralph-desk/` to `.rlp-desk/`.
- **Drivers**: Claude Code's hardcoded sensitive policy hangs the worker. It cannot be bypassed → rename the directory.
- **Alternatives considered**: B (keep `.claude/` + escape): shown to fail in the user's repro. C (`$TMPDIR/`): damages campaign memory persistence.
- **Why chosen**: A matches the design-desk reference pattern and directly avoids the root cause of the sensitive trigger (the `.claude/` directory name). Automatic migration minimizes user impact.
- **Consequences**: 0.13.0 is a breaking minor release. Every user's `.gitignore` and directory need updating (automated). Documentation updates are broad (rlp-desk.md 24 lines, governance.md 6 lines).
- **Follow-ups**:
  1. **1.0.0 deprecation cycle**: remove the legacy `.claude/ralph-desk/` detection logic (this gives users one minor cycle to migrate).
  2. **Consider moving `~/.claude/ralph-desk/`**: it does not trigger the sensitive check today, but review it in 1.x as a hedge against platform changes.
  3. **Early permission-prompt detection**: after adding `prompt-detector.mjs`, reuse it for other platform-shaped silent failures (for example, future policy changes in the codex CLI).
  4. **Steelman response**: the Architect's point ("`.rlp-desk/` could also become sensitive in the future") is addressed technically by externalizing the `RLP_DESK_RUNTIME_DIR` env. Policy monitoring is an operations concern.
@@ -0,0 +1,99 @@
# rlp-desk Artifact Schema (v5.7 §4.25)

> Worker/Verifier write JSON artifacts that the Leader reads. The schema validator at the READ boundary enforces these contracts. **Violation → BLOCKED `contract_violation/malformed_artifact`** (recoverable).

## Validated artifacts

| File | Written by | Read by | `signal_type` |
|------|-----------|---------|---------------|
| `<slug>-iter-signal.json` | Worker | Leader (worker poll) | `signal` |
| `<slug>-verify-verdict.json` (per-US) | Verifier | Leader (verifier poll) | `verdict` |
| `<slug>-verify-verdict.json` (final ALL) | Verifier | Leader (final-verifier poll) | `verdict` |
| `<slug>-flywheel-signal.json` | Flywheel | Leader (flywheel poll) | `flywheel_signal` |
| `<slug>-flywheel-guard-verdict.json` | Guard | Leader (guard poll) | `flywheel_guard_verdict` |
| `<slug>-done-claim.json` | Worker | Leader (analytics, A4 fallback) | `done_claim` |

## Required structural fields (validated by `validateArtifact`)

| Field | Type | Constraint | Notes |
|-------|------|------------|-------|
| `slug` | string | === campaign slug | OPTIONAL for backward compat. If present, must match. |
| `iteration` | integer | ≥ `iteration_floor` (current state.iteration) | OPTIONAL for backward compat. Worker may advance, never regress. |
| `signal_type` | string | === expected per read context | OPTIONAL for backward compat. Discriminates artifacts at read time. |
| `us_id` | string | ∈ `usList ∪ {'ALL'}` | OPTIONAL for backward compat. Closed-set check. |

The validator is structural-minimum + semantic-anchor. It does NOT validate downstream business fields (e.g. `verdict.verdict`, `signal.status`); those are checked by their respective consumers.
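
A rough sketch of how such a validator could enforce the four constraints while skipping absent fields. `validateArtifact` and `MalformedArtifactError` are named in this document, but the signature, the context object, and the internals below are assumptions for illustration; only the error message format is taken from the examples in this document.

```javascript
// Hypothetical sketch; the real validateArtifact signature may differ.
class MalformedArtifactError extends Error {
  constructor(field, expected, got) {
    super(`Malformed artifact at ${field}: expected ${expected}, got ${got}`);
    this.name = 'MalformedArtifactError';
    this.field = field;
  }
}

function validateArtifact(artifact, { slug, iterationFloor, signalType, usList }) {
  const { slug: gotSlug, iteration, signal_type: gotType, us_id: gotUs } = artifact;
  // Every field is optional: undefined is skipped for backward compatibility.
  if (gotSlug !== undefined && gotSlug !== slug) {
    throw new MalformedArtifactError('slug', slug, gotSlug);
  }
  if (iteration !== undefined && !(Number.isInteger(iteration) && iteration >= iterationFloor)) {
    throw new MalformedArtifactError('iteration', `>= ${iterationFloor}`, iteration);
  }
  if (gotType !== undefined && gotType !== signalType) {
    throw new MalformedArtifactError('signal_type', signalType, gotType);
  }
  const allowed = [...usList, 'ALL'];
  if (gotUs !== undefined && !allowed.includes(gotUs)) {
    throw new MalformedArtifactError('us_id', `one of [${allowed.join(', ')}]`, gotUs);
  }
  return artifact;
}

const context = { slug: 'sum-fn', iterationFloor: 1, signalType: 'signal', usList: ['US-001'] };
validateArtifact({ slug: 'sum-fn', iteration: 1, signal_type: 'signal', us_id: 'US-001' }, context);
validateArtifact({}, context); // legacy artifact without the new fields still passes
```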

## Examples

### Valid worker signal
```json
{
  "slug": "sum-fn",
  "iteration": 1,
  "signal_type": "signal",
  "us_id": "US-001",
  "status": "verify",
  "summary": "implementation done; tests pass"
}
```

### Valid verifier verdict
```json
{
  "slug": "sum-fn",
  "iteration": 1,
  "signal_type": "verdict",
  "us_id": "US-001",
  "verdict": "pass",
  "criteria_results": [...]
}
```

### Violation: wrong slug
```json
{
  "slug": "wrong-campaign", // ← BLOCKED contract_violation
  "iteration": 1,
  ...
}
```
→ `Malformed artifact at slug: expected sum-fn, got wrong-campaign`

### Violation: us_id outside allowed set
```json
{
  "us_id": "US-999" // ← BLOCKED contract_violation (US-999 ∉ [US-001, ALL])
}
```
→ `Malformed artifact at us_id: expected one of [US-001, ALL], got US-999`

### Violation: iteration regress
```json
{
  "iteration": 0 // ← floor is 1; regress not allowed
}
```
→ `Malformed artifact at iteration: expected >= 1, got 0`

## Backward compatibility

Existing artifacts written before v5.7 §4.25 do not carry `slug`/`signal_type`/`iteration` fields. The validator skips any field not present (`undefined` is allowed). Workers/Verifiers SHOULD start emitting these fields for stronger contract enforcement, but legacy artifacts continue to work.

## Feedback loop closure

When `MalformedArtifactError` fires:
1. `_handlePollFailure` writes BLOCKED with `reason_category: contract_violation`, `failure_category: malformed_artifact`, `recoverable: true`.
2. `reason_detail` includes the structured error: `Malformed artifact at <field>: expected <expected>, got <got>`.
3. Operators reviewing `<slug>-blocked.json` see the precise contract violation and can update the Worker prompt template (`prompts/<slug>.worker.prompt.md`) to require the missing/correct field.
4. On re-run after fix, the Worker writes a compliant artifact and the campaign proceeds.
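
For illustration, the BLOCKED payload from steps 1 and 2 might be assembled like this. The four field names and their values are quoted from the list above; the helper name `buildBlockedPayload` is hypothetical, and the real `_handlePollFailure` may well record additional fields.

```javascript
// Hypothetical sketch: map a validation error onto the BLOCKED sentinel fields.
function buildBlockedPayload(error) {
  return {
    reason_category: 'contract_violation',
    failure_category: 'malformed_artifact',
    recoverable: true,
    reason_detail: error.message,
  };
}

const example = new Error('Malformed artifact at iteration: expected >= 1, got 0');
console.log(JSON.stringify(buildBlockedPayload(example), null, 2));
```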

## Authoring guidance

- Worker prompt templates SHOULD instruct the LLM to include `slug`, `iteration`, `signal_type`, and `us_id` in every JSON artifact.
- The fix-contract (`buildFixContract` in `campaign-main-loop.mjs`) already feeds verifier failures back to the next Worker; future enhancement: feed `MalformedArtifactError` details directly into the next Worker prompt without requiring user re-run.

## Audit

- Schema unit tests: `tests/node/test-artifact-schema.mjs` (7 violation scenarios)
- E2E: Schema violations are exercised in `tests/sv-gate-full.sh` (REAL campaign E2E asserts `complete.md` or `blocked.md` exists; schema violations route to the latter)
@@ -0,0 +1,100 @@
# rlp-desk CI Setup (v5.7 §4.25)

> SV gate is a mechanical contract: every PR touching `src/node/**`, `src/scripts/**`, `src/commands/rlp-desk.md`, or `src/governance.md` MUST pass `tests/sv-gate-full.sh` before merge.

## Local development

### Fast gate (~30s)

Run before every commit:

```sh
zsh tests/sv-gate-fast.sh
# or
npm run sv-gate:fast
```

Checks:
- 35+ code-pattern greps (each tracked v5.7 fix has the expected code)
- All Node unit tests (~50)
- 5 critical zsh unit tests

### Full gate (~5 min)

Run before merge / release:

```sh
zsh tests/sv-gate-full.sh
# or
npm run sv-gate:full
```

Adds:
- REAL tmux E2E (mocked tmux capture, 9 scenarios)
- REAL campaign E2E (haiku worker/verifier, max-iter 3, iter-timeout 300s)
- Asserts `<slug>-complete.md` OR `<slug>-blocked.md` exists post-run (file-guarantee invariant)

**Pre-conditions for full gate**:
- Inside a tmux session (`echo $TMUX` not empty)
- `claude` CLI in PATH
- `node` >= 16 in PATH
- `~/.claude/ralph-desk/` synced from latest `src/` (run `bash install.sh` or manual sync)

## GitHub Actions

The fast gate runs on every PR via `.github/workflows/sv-gate.yml`:

```yaml
name: SV Gate
on: [push, pull_request]
jobs:
  sv-gate-fast:
    runs-on: macos-latest # zsh + tmux available
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '22' }
      - run: bash install.sh # syncs to ~/.claude/ralph-desk
        env: { REPO_URL: file://${{ github.workspace }} }
      - run: zsh tests/sv-gate-fast.sh
```

The full gate (with REAL campaign E2E) is NOT run in CI because it requires:
- Anthropic API key (haiku worker/verifier)
- Live tmux session (CI runners are non-interactive)
- ~3-5 min wallclock per run

Operators MUST run `tests/sv-gate-full.sh` locally before merging to `main`.

## Branch protection (manual)

Required for the SV gate to be enforceable:

1. Go to `https://github.com/<owner>/rlp-desk/settings/branches`
2. Add rule for `main`:
   - ✅ Require a pull request before merging
   - ✅ Require status checks to pass before merging
   - ✅ Search and select: `sv-gate-fast`
   - ✅ Require branches to be up to date before merging
3. Document the manual step here. Branch protection cannot be enforced via committed YAML alone; it is a repo-admin setting.

## Forks / non-GitHub repos

`tests/sv-gate-fast.sh` and `tests/sv-gate-full.sh` are pure zsh + Node, with no GitHub-specific dependencies. Forks should:

1. Run `npm run sv-gate:fast` in their CI (Travis, GitLab CI, etc.) using the same OS-level prereqs (macOS or Linux + zsh + tmux + node + claude CLI).
2. Optionally run `npm run sv-gate:full` in a scheduled job (nightly) since it requires live API key.

## Gate failure interpretation

| Failure mode | Meaning | Action |
|--------------|---------|--------|
| Code-pattern grep failed | Tracked fix's expected code is missing | Restore the fix or update `tests/sv-gate-fast.sh` if the pattern legitimately changed |
| Node unit test failed | Behavioral regression | Fix the code; do NOT relax the test |
| zsh unit test failed | Behavioral regression in shell helpers | Fix the helper |
| REAL tmux E2E failed | Real tmux capture/send-keys broke | Investigate tmux version or pane state |
| REAL campaign E2E failed (no sentinel) | **FILE-GUARANTEE VIOLATED**: Worker/Verifier exited without artifact AND backstop did NOT catch | Critical bug; investigate `_ensureTerminalSentinel` and `_handlePollFailure` paths |

## Memo: SV gate is the contract

The SV gate exists because AI assistants (including the Leader itself) miss steps. Mechanical .sh verification is the only enforceable contract: code review, "I tested it locally", and unit-test-only verification are not sufficient. Plan v5.7 explicitly forbids commits that have not passed `tests/sv-gate-full.sh`.
@@ -0,0 +1,102 @@
|
|
|
1
|
+
# rlp-desk E2E Test Scenarios (v5.7 §4.25)
|
|
2
|
+
|
|
3
|
+
> Two-tier coverage: **Tier A** (deterministic injection, ~ms) runs in `sv-gate-fast`; **Tier B** (real-subprocess + real-tmux + real-claude, seconds–minutes) runs in `sv-gate-full`. Every fix path is covered by at least one tier.
|
|
4
|
+
|
|
5
|
+
## Tier A — Deterministic injection (sv-gate-fast)
|
|
6
|
+
|
|
7
|
+
Uses the `pollForSignal` injection seam (no subprocess spawn), so the tests are deterministic, fast, and CI-stable.
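As a rough illustration of what an injection seam of this kind can look like, the sketch below passes the poll function in as a parameter so a test can swap in a double that resolves or rejects immediately. Only the name `pollForSignal` comes from this document; `runIteration`, `fakePollOk`, `fakePollThrows`, and the result shape are hypothetical and are not the rlp-desk API.

```javascript
// Hypothetical sketch of an injection seam. Names other than pollForSignal
// are illustrative and not taken from the rlp-desk source.
async function runIteration({ pollForSignal }) {
  try {
    const signal = await pollForSignal();
    return { status: 'ok', signal };
  } catch (err) {
    // A poll failure becomes a blocked result instead of an unhandled throw.
    return { status: 'blocked', reason: err.message };
  }
}

// Test doubles: they settle immediately, so no subprocess or tmux is needed.
const fakePollOk = async () => ({ signal_type: 'done', iteration: 1 });
const fakePollThrows = async () => {
  throw new Error('worker_exited_without_artifacts');
};

runIteration({ pollForSignal: fakePollThrows }).then((r) => console.log(r.status));
```

Because the double never spawns anything, a test built this way finishes in milliseconds and cannot flake on tmux or network state.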
|
|
8
|
+
|
|
9
|
+
| Scenario | Test file | Asserts |
|
|
10
|
+
|----------|-----------|---------|
|
|
11
|
+
| writeSentinelExclusive O_EXCL race | `tests/node/test-sentinel-exclusive.mjs` | First-writer-wins, parent dir create, EEXIST returns no-op, parallel race |
|
|
12
|
+
| Backstop: missing scaffold | `tests/node/test-leader-exit-invariant.mjs` | `_ensureTerminalSentinel` writes `blocked.md` even on `ensureScaffold` throw |
|
|
13
|
+
| Backstop: pollForSignal throws | `tests/node/test-leader-exit-invariant.mjs` | `_handlePollFailure` writes BLOCKED + run() returns blocked status |
|
|
14
|
+
| Backstop: idempotent first-writer-wins | `tests/node/test-leader-exit-invariant.mjs` | Pre-existing BLOCKED is NOT overwritten by backstop |
|
|
15
|
+
| Lying worker (signal missing) | `tests/node/test-lying-worker.mjs` | BLOCKED `infra_failure/worker_exited_without_artifacts` |
|
|
16
|
+
| Lying verifier (per-US verdict missing) | `tests/node/test-lying-worker.mjs` + `tests/node/sv-e2e/test-lying-verifier.mjs` | BLOCKED `verifier_exited_without_artifacts` |
|
|
17
|
+
| Lying final verifier (US-ALL) | `tests/node/sv-e2e/test-lying-verifier.mjs` | BLOCKED `final_verifier_exited_without_artifacts` |
|
|
18
|
+
| Prompt-blocked (default-No worker) | `tests/node/sv-e2e/test-prompt-blocked.mjs` | BLOCKED `prompt_blocked` |
|
|
19
|
+
| Prompt-blocked (default-No verifier) | `tests/node/sv-e2e/test-prompt-blocked.mjs` | BLOCKED `prompt_blocked` (verifier role) |
|
|
20
|
+
| Schema: empty object | `tests/node/test-artifact-schema.mjs` | No crash |
|
|
21
|
+
| Schema: wrong slug | `tests/node/test-artifact-schema.mjs` | BLOCKED `contract_violation/malformed_artifact` |
|
|
22
|
+
| Schema: us_id outside set | `tests/node/test-artifact-schema.mjs` | BLOCKED `malformed_artifact` |
|
|
23
|
+
| Schema: iteration regress | `tests/node/test-artifact-schema.mjs` | BLOCKED `malformed_artifact` |
|
|
24
|
+
| Schema: iteration not integer | `tests/node/test-artifact-schema.mjs` | BLOCKED `malformed_artifact` |
|
|
25
|
+
| Schema: signal_type mismatch | `tests/node/test-artifact-schema.mjs` | BLOCKED `malformed_artifact` |
|
|
26
|
+
| Schema: valid signal (back-compat) | `tests/node/test-artifact-schema.mjs` | No false positive |
|
|
27
|
+
| Auto-dismiss prompt patterns (24+) | `tests/node/test-prompt-dismisser.mjs` | Each `(y/n)`/`[Y/n]`/`[y/N]` variant + scrollback + unknown-fast-fail + claude v2.x trust |
|
|
28
|
+
| Shell quote (Bug 1) | `tests/node/test-shell-quote.mjs` | POSIX single-quote escape for `[1m]` etc. |
|
|
29
|
+
| Opus 1M context | `tests/node/test-opus-1m-context.mjs` | `ANTHROPIC_BETA` prefix, isOpusModel detection |
|
|
30
|
+
|
|
31
|
+
**Tier A total**: 50+ tests across 11 files. Runtime: ~0.7s. Always runs in CI.
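The first row of the table relies on `O_EXCL` semantics for first-writer-wins. A minimal sketch of that idea, assuming Node's `'wx'` file flag, is shown below. `writeSentinelOnce` is a hypothetical helper written for illustration, not the package's `writeSentinelExclusive`, and the file path is made up.

```javascript
// CommonJS so the snippet runs standalone; in an .mjs module use `import` instead.
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper modeled on the behavior the table describes.
function writeSentinelOnce(filePath, body) {
  fs.mkdirSync(path.dirname(filePath), { recursive: true }); // create parent dir
  try {
    // The 'wx' flag fails with EEXIST if the file already exists.
    fs.writeFileSync(filePath, body, { flag: 'wx' });
    return true; // this caller won the race
  } catch (err) {
    if (err.code === 'EEXIST') return false; // someone else wrote first: no-op
    throw err;
  }
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'sentinel-'));
const sentinel = path.join(dir, 'memos', 'blocked.md');
const first = writeSentinelOnce(sentinel, 'BLOCKED: first\n');
const second = writeSentinelOnce(sentinel, 'BLOCKED: second\n');
console.log(first, second, fs.readFileSync(sentinel, 'utf8').trim());
// prints: true false BLOCKED: first
```

The existence check and the create happen in one system call, which is why a pre-existing BLOCKED sentinel is never overwritten by a later writer.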
|
|
32
|
+
|
|
33
|
+
## Tier B — Real-subprocess (sv-gate-full)
|
|
34
|
+
|
|
35
|
+
Uses a real tmux session with real `tmux send-keys` / `capture-pane` calls and the real claude haiku CLI. Slow (~5 min) but exercises actual production paths.
|
|
36
|
+
|
|
37
|
+
| Scenario | Test | Asserts |
|
|
38
|
+
|----------|------|---------|
|
|
39
|
+
| Real tmux: `[Y/n]` auto-dismiss | `tests/sv-gate-real-e2e.sh` | Real `tmux send-keys Enter` after `auto_dismiss_prompts` |
|
|
40
|
+
| Real tmux: `[y/N]` BLOCK | `tests/sv-gate-real-e2e.sh` | `infra_failure` sentinel written, NO Enter sent |
|
|
41
|
+
| Real tmux: 10s no-progress timeout | `tests/sv-gate-real-e2e.sh` | BLOCKED on freeze regardless of prompt |
|
|
42
|
+
| Real tmux: unknown text + no bracket | `tests/sv-gate-real-e2e.sh` | No false BLOCK, no false Enter |
|
|
43
|
+
| Real tmux: unknown phrasing + `[y/N]` | `tests/sv-gate-real-e2e.sh` | Fast-fail BLOCK (10min wait avoided) |
|
|
44
|
+
| Real tmux: unknown phrasing + `(y/n)` | `tests/sv-gate-real-e2e.sh` | Fast-fail BLOCK |
|
|
45
|
+
| Real tmux: codex `[Y/n]` | `tests/sv-gate-real-e2e.sh` | Auto-dismiss (codex CLI variant) |
|
|
46
|
+
| Real tmux: codex `[y/N]` | `tests/sv-gate-real-e2e.sh` | BLOCK |
|
|
47
|
+
| Real tmux: scrollback contamination | `tests/sv-gate-real-e2e.sh` | Old `[Y/n]` + active `[y/N]` → BLOCK (scan-all) |
|
|
48
|
+
| Real haiku campaign (happy path) | `tests/sv-gate-full.sh` (inline) | `complete.md` written; trust prompt auto-dismissed; tests pass; commit recorded |
|
|
49
|
+
|
|
50
|
+
**Tier B total**: 10+ scenarios. Runtime: ~5 min (1 min for tmux scenarios + ~4 min for haiku campaign). Run before merge / release.
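The decisions asserted in the tmux rows can be summarized in a small classifier. This is a simplified, hypothetical sketch: the function name and return values are invented here, and treating every `(y/n)` marker as a block is an assumption made for brevity. The real matcher recognizes many more patterns.

```javascript
// Simplified, hypothetical classifier for captured pane text.
function classifyPane(capturedText) {
  const hasDefaultNo = capturedText.includes('[y/N]');
  const hasNoDefault = /\(y\/n\)/i.test(capturedText);
  const hasDefaultYes = capturedText.includes('[Y/n]');
  // Scan-all rule: a default-No or no-default marker anywhere in the capture
  // blocks, even if an older default-Yes prompt is still in scrollback.
  if (hasDefaultNo || hasNoDefault) return 'block';
  if (hasDefaultYes) return 'dismiss'; // the caller would send Enter
  return 'none'; // no recognizable prompt: neither block nor Enter
}

console.log(classifyPane('Proceed? [Y/n]'));                  // dismiss
console.log(classifyPane('Overwrite? [y/N]'));                // block
console.log(classifyPane('Trust? [Y/n]\nDelete all? [y/N]')); // block
console.log(classifyPane('compiling...'));                    // none
```

Checking the blocking markers before the dismissable one is what makes the scrollback-contamination case resolve to a block rather than a stray Enter.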
|
|
51
|
+
|
|
52
|
+
## Coverage matrix (per fix)
|
|
53
|
+
|
|
54
|
+
| Fix | Tier A | Tier B | Bug ID |
|
|
55
|
+
|-----|--------|--------|--------|
|
|
56
|
+
| zsh `[1m]` glob | shell-quote | (haiku campaign launches Opus models when promoted) | Bug 1 |
|
|
57
|
+
| tmux silent SV/flywheel | us012 | (haiku campaign exercises tmux mode) | Bug 2/3 |
|
|
58
|
+
| auto_dismiss prompts | prompt-dismisser | real-e2e #1-9 | Bug 4 |
|
|
59
|
+
| A4 fallback prompt guard | a4_fallback | (haiku campaign) | Bug 5 |
|
|
60
|
+
| scrollback contamination | prompt-dismisser | real-e2e #9 | §4.17.b |
|
|
61
|
+
| unknown-prompt fast-fail | prompt-dismisser | real-e2e #5-6 | §4.18 |
|
|
62
|
+
| Node iterTimeout fwd | (verified by haiku campaign actually completing in ≤300s) | full | §4.19 |
|
|
63
|
+
| claude v2.x trust prompt | prompt-dismisser | full (haiku triggers it) | §4.20 |
|
|
64
|
+
| capture window -50 + whitespace norm | prompt-dismisser | full (haiku narrow-pane wrap) | §4.21 |
|
|
65
|
+
| WorkerExitedError | lying-worker | (full campaign covers happy path; injection covers exit) | §4.22 |
|
|
66
|
+
| tail-15 normalized matching | prompt-dismisser | real-e2e | §4.23 |
|
|
67
|
+
| writeSentinelExclusive O_EXCL | sentinel-exclusive | (full campaign uses it for complete.md) | §4.24 |
|
|
68
|
+
| run() try/finally backstop | leader-exit-invariant | (full campaign verifies success path) | §4.24 §1g |
|
|
69
|
+
| _handlePollFailure | lying-worker, lying-verifier, prompt-blocked | (full campaign success path) | §4.25 |
|
|
70
|
+
| validateArtifact schema | artifact-schema | full (haiku artifacts schema-compliant) | §4.25 P1 |
|
|
71
|
+
|
|
72
|
+
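Every fix except Node iterTimeout forwarding (§4.19) has at least one Tier A test; that fix is verified only by the full gate, where the haiku campaign must complete in ≤300s. Tier B exercises the production-realistic paths (real tmux, real subprocess, real claude haiku).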
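The Bug 1 row concerns zsh interpreting `[1m]` as a glob pattern, which POSIX single-quote escaping avoids. The sketch below shows the standard technique. `shellQuote` is a hypothetical name and the sample argument is illustrative; the exact API of `src/node/util/shell-quote.mjs` is not shown in this document.

```javascript
// Hypothetical helper demonstrating POSIX single-quote escaping.
function shellQuote(arg) {
  // Inside single quotes nothing is special to the shell, so glob characters
  // such as [ and ] stay literal. An embedded ' is closed, escaped, reopened.
  return "'" + String(arg).replace(/'/g, "'\\''") + "'";
}

console.log(shellQuote('opus[1m]')); // 'opus[1m]'
console.log(shellQuote("it's"));     // 'it'\''s'
```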
|
|
73
|
+
|
|
74
|
+
## Running the gates
|
|
75
|
+
|
|
76
|
+
```sh
|
|
77
|
+
# Fast gate (~0.7s, every commit)
|
|
78
|
+
zsh tests/sv-gate-fast.sh
|
|
79
|
+
# or
|
|
80
|
+
npm run sv-gate:fast
|
|
81
|
+
|
|
82
|
+
# Full gate (~5 min, before merge/release)
|
|
83
|
+
zsh tests/sv-gate-full.sh
|
|
84
|
+
# or
|
|
85
|
+
npm run sv-gate:full
|
|
86
|
+
```
|
|
87
|
+
|
|
88
|
+
`sv-gate-full` requires:
|
|
89
|
+
- Running inside a tmux session (`echo $TMUX` prints a non-empty value)
|
|
90
|
+
- `claude` CLI in PATH with valid auth
|
|
91
|
+
- `node >= 16` in PATH
|
|
92
|
+
- `~/.claude/ralph-desk/` synced from latest `src/` (run `bash install.sh`)
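Two of these prerequisites are easy to check mechanically before starting a five-minute run. The sketch below is a hypothetical preflight, not part of the package: it covers only the tmux and Node version requirements and does not check `claude` auth or whether `~/.claude/ralph-desk/` is in sync.

```javascript
// Hypothetical preflight; sv-gate-full.sh may perform its own checks differently.
function checkPrereqs(env, nodeVersion) {
  const problems = [];
  // tmux exports TMUX inside a session; empty or unset means we are outside one.
  if (!env.TMUX) problems.push('not inside a tmux session (TMUX is empty)');
  const major = Number(nodeVersion.split('.')[0]);
  if (!(major >= 16)) problems.push(`node ${nodeVersion} is older than 16`);
  return problems;
}

const problems = checkPrereqs(process.env, process.versions.node);
console.log(problems.length ? problems.join('\n') : 'prerequisites look fine');
```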
|
|
93
|
+
|
|
94
|
+
## Adding a new scenario
|
|
95
|
+
|
|
96
|
+
1. **Determine tier**:
|
|
97
|
+
- Deterministic, no subprocess → Tier A
|
|
98
|
+
- Requires real tmux/claude/network → Tier B
|
|
99
|
+
2. **Tier A**: add `tests/node/sv-e2e/test-<name>.mjs` (or extend an existing file). Use the `pollForSignal` injection seam. Add the file to the `NODE_TESTS` array in `tests/sv-gate-fast.sh`.
|
|
100
|
+
3. **Tier B**: add scenario to `tests/sv-gate-real-e2e.sh` with `reset_pane_state` between scenarios. The script auto-runs in `sv-gate-full.sh`.
|
|
101
|
+
4. **Document**: add row to the Coverage matrix in this file.
|
|
102
|
+
5. **Verify**: run `npm run sv-gate:fast` (Tier A only) or `npm run sv-gate:full` (both tiers); the gate you run must exit 0.
|