@ai-dev-methodologies/rlp-desk 0.9.0 → 0.9.1
- package/docs/plans/validated-snacking-crayon.md +156 -359
- package/docs/superpowers/plans/2026-04-12-flywheel-redesign.md +704 -0
- package/docs/superpowers/specs/2026-04-12-flywheel-redesign.md +161 -0
- package/package.json +1 -1
- package/src/commands/rlp-desk.md +15 -1
- package/src/node/reporting/campaign-reporting.mjs +364 -0
- package/src/node/run.mjs +12 -0
- package/src/node/runner/campaign-main-loop.mjs +104 -1
|
@@ -1,399 +1,188 @@
|
|
|
1
|
-
# Plan:
|
|
1
|
+
# Plan: Flywheel Phase 1 — SV Report Generation + Brainstorm Feedback Loop
|
|
2
2
|
|
|
3
3
|
## Context
|
|
4
4
|
|
|
5
|
-
rlp-desk's
|
|
6
|
-
|
|
5
|
+
The flywheel architecture of rlp-desk (governance §8½ + brainstorm step 0) has been designed, but its implementation is disconnected.
|
|
6
|
+
The `--with-self-verification` flag is parsed, but no code actually generates the SV report, and brainstorm step 0 has no logic for reading SV reports.
|
|
7
7
|
|
|
8
|
-
|
|
9
|
-
1. `print_run_presets()` is out of sync with the rlp-desk.md option interface (stale flags, wrong defaults)
|
|
10
|
-
2. The Worker jumps straight to TDD as soon as it has read the files (no planning step)
|
|
11
|
-
3. Brainstorm proposes user stories (US) without looking at the code
|
|
12
|
-
4. Brainstorm results are not persisted to campaign memory (the first Worker has to rediscover them)
|
|
13
|
-
5. The Worker and Verifier operate without coding-principle guidelines (the global CLAUDE.md cannot be relied on)
|
|
8
|
+
**Goal:** campaign A → SV report generation → campaign B's brainstorm references A's patterns, completing a minimal feedback loop.
|
|
14
9
|
|
|
15
|
-
**Branch:** `
|
|
10
|
+
**Branch:** `feature/flywheel-sv-report`
|
|
16
11
|
|
|
17
12
|
---
|
|
18
13
|
|
|
19
|
-
##
|
|
20
|
-
|
|
21
|
-
### Change 1: Fix Run Preset Desync
|
|
22
|
-
**File:** `src/scripts/init_ralph_desk.zsh` lines 197-238
|
|
23
|
-
|
|
24
|
-
Rewrite `print_run_presets()` to match `src/commands/rlp-desk.md` lines 142-200.
|
|
25
|
-
|
|
26
|
-
**Desync table:**
|
|
27
|
-
|
|
28
|
-
| current (init_ralph_desk.zsh) | canonical (rlp-desk.md) |
|
|
29
|
-
|---|---|
|
|
30
|
-
| `--final-consensus` (line 207) | `--consensus final-only` |
|
|
31
|
-
| `gpt-5.3-codex-spark:high` (line 210) | `spark:high` |
|
|
32
|
-
| `--verify-consensus` (line 232) | `--consensus off\|all\|final-only` |
|
|
33
|
-
| worker default `sonnet` (line 230) | `haiku` |
|
|
34
|
-
| verifier default `opus` (line 231) | per-US `sonnet`, final `opus` |
|
|
35
|
-
| Missing `--mode tmux` in recommended | Present |
|
|
36
|
-
| Missing 6 options | `--lock-worker-model`, `--consensus-model`, `--final-consensus-model`, `--cb-threshold`, `--iter-timeout`, `--final-verifier-model` |
|
|
14
|
+
## Current State (Gap Analysis)
|
|
37
15
|
|
|
38
|
-
|
|
16
|
+
| Component | Status | Location |
|
|
17
|
+
|----------|------|------|
|
|
18
|
+
| `--with-self-verification` flag parsing | ✅ | run.mjs:142-144 |
|
|
19
|
+
| 10-section SV report template definition | ✅ | rlp-desk.md:522-573 |
|
|
20
|
+
| §8½ feedback loop definition | ✅ | governance.md:629-635 |
|
|
21
|
+
| Brainstorm step 0 definition | ✅ | rlp-desk.md:115 |
|
|
22
|
+
| `generateSVReport()` function | ❌ | Does not exist |
|
|
23
|
+
| SV call from campaign-main-loop.mjs | ❌ | `svSummary` parameter is not passed (lines 465, 568, 590) |
|
|
24
|
+
| analytics directory creation | ❌ | No code |
|
|
25
|
+
| SV report tests | ❌ | Not in us007 |
|
|
39
26
|
|
|
40
|
-
|
|
41
|
-
**Files:**
|
|
42
|
-
- `src/scripts/init_ralph_desk.zsh` Worker prompt — insert between line 316 and line 318
|
|
43
|
-
- `src/governance.md` line 217 — add `plan` to step types
|
|
44
|
-
- `src/scripts/init_ralph_desk.zsh` Verifier prompt — add audit after line 478
|
|
45
|
-
|
|
46
|
-
**Insert after line 316 ("Execute the plan for $SLUG."), before line 318 ("## Before you start"):**
|
|
47
|
-
|
|
48
|
-
```
|
|
49
|
-
## Planning (before writing any code)
|
|
50
|
-
After reading all files, BEFORE writing any test or code:
|
|
51
|
-
1. List the specific files you will create or modify
|
|
52
|
-
2. For each AC in the contract, state your approach in 1 sentence
|
|
53
|
-
3. Identify ordering constraints (which AC depends on which)
|
|
54
|
-
4. Record as first execution_step: {"step": "plan", "ac_id": "all", "command": null, "exit_code": null, "summary": "Plan: [files], [approach], [order]"}
|
|
55
|
-
Keep planning lightweight — 1-2 sentences per AC, not a detailed analysis.
|
|
56
|
-
If the plan reveals the contract is unclear or infeasible, signal "blocked" immediately.
|
|
57
|
-
```
|
|
58
|
-
|
|
59
|
-
**governance.md line 217:** Change from:
|
|
60
|
-
```
|
|
61
|
-
- Step types: `write_test`, `verify_red`, `implement`, `verify_green`, `refactor`, `commit`, `verify`, `verify_existing`
|
|
62
|
-
```
|
|
63
|
-
to:
|
|
64
|
-
```
|
|
65
|
-
- Step types: `plan`, `write_test`, `verify_red`, `implement`, `verify_green`, `refactor`, `commit`, `verify`, `verify_existing`
|
|
66
|
-
```
|
|
67
|
-
|
|
68
|
-
**Verifier prompt after line 478 (Worker Process Audit):** Add:
|
|
69
|
-
```
|
|
70
|
-
- Planning step presence: done-claim execution_steps should include a `plan` step as the first entry. If missing, record in reasoning as {"check": "Planning Step", "decision": "info", "basis": "plan step present/absent"} — informational only (does not affect pass/fail verdict)
|
|
71
|
-
```
|
|
72
|
-
|
|
73
|
-
### Change 3: Brainstorm Exploration Phase
|
|
74
|
-
**File:** `src/commands/rlp-desk.md` — insert between line 25 and line 26
|
|
75
|
-
|
|
76
|
-
**Insert after line 25 ("2. **Objective**") and before line 26 ("3. **User Stories**"):**
|
|
77
|
-
|
|
78
|
-
```
|
|
79
|
-
2.5. **Codebase Exploration** — Before proposing user stories, examine the project:
|
|
80
|
-
- Read the project's entry points, key modules, and test structure
|
|
81
|
-
- Identify architectural patterns in use (frameworks, conventions, test setup)
|
|
82
|
-
- Note constraints the Worker will encounter (dependencies, build system, existing code style)
|
|
83
|
-
- Present findings: "I explored the codebase and found: [patterns], [constraints], [existing tests]. This informs the US breakdown below."
|
|
84
|
-
- If the project is new/empty, skip this step and note "greenfield project."
|
|
85
|
-
```
|
|
86
|
-
|
|
87
|
-
### Change 4: Memory Bridge
|
|
88
|
-
**Files:**
|
|
89
|
-
- `src/commands/rlp-desk.md` line 131
|
|
90
|
-
- `src/scripts/init_ralph_desk.zsh` lines 578-580 (campaign memory template)
|
|
91
|
-
- `src/scripts/init_ralph_desk.zsh` line 355 area (Worker prompt iteration rules)
|
|
27
|
+
---
|
|
92
28
|
|
|
93
|
-
|
|
94
|
-
```
|
|
95
|
-
If brainstorm was done, auto-fill PRD and test-spec with the results.
|
|
96
|
-
```
|
|
97
|
-
to:
|
|
98
|
-
```
|
|
99
|
-
If brainstorm was done, auto-fill:
|
|
100
|
-
- PRD and test-spec with the brainstorm results
|
|
101
|
-
- Campaign memory "Key Decisions" with architectural decisions from brainstorm
|
|
102
|
-
- Campaign memory "Patterns Discovered" with codebase exploration findings (from step 2.5)
|
|
103
|
-
```
|
|
29
|
+
## Changes
|
|
104
30
|
|
|
105
|
-
|
|
31
|
+
### Change 1: Implement the `generateSVReport()` function
|
|
32
|
+
**File:** `src/node/reporting/campaign-reporting.mjs` (extended)
|
|
33
|
+
|
|
34
|
+
Add `generateSVReport()` alongside the existing `generateCampaignReport()` (line 159).
|
|
35
|
+
|
|
36
|
+
**Input:**
|
|
37
|
+
- `slug` — campaign slug
|
|
38
|
+
- `logsDir`: `.claude/ralph-desk/logs/<slug>/` (location of the done-claim and verify-verdict files)
|
|
39
|
+
- `prdFile`: path to the PRD
|
|
40
|
+
- `testSpecFile`: path to the test-spec
|
|
41
|
+
- `analyticsFile`: path to campaign.jsonl
|
|
42
|
+
- `outputDir`: `~/.claude/ralph-desk/analytics/<slug>/` (SV report output)
|
|
43
|
+
|
|
44
|
+
**Logic:**
|
|
45
|
+
1. Collect the `iter-*-done-claim.json` and `iter-*-verify-verdict.json` files from `logsDir`
|
|
46
|
+
2. Parse `execution_steps` from each done-claim → aggregate Worker Process Quality
|
|
47
|
+
3. Parse `reasoning` from each verify-verdict → aggregate Verifier Judgment Quality
|
|
48
|
+
4. Summarize per-iteration data from campaign.jsonl → Automated Validation Summary
|
|
49
|
+
5. Track the AC lifecycle (first claimed, first verified, reopen count)
|
|
50
|
+
6. Generate the 10-section Markdown report
|
|
51
|
+
7. Write a versioned file to `outputDir/self-verification-report-NNN.md`
|
|
52
|
+
8. Write the structured data to `outputDir/self-verification-data.json`
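
Step 2 of the logic above could be sketched roughly as follows. This is an illustration only, not the package's implementation: the helper name `tddCompliance` is hypothetical, and the rule it applies (a done-claim counts as compliant when a `write_test` step exists and no `implement` step precedes it) is an assumption about what "TDD compliance" means here.

```javascript
// Hypothetical helper, not part of the package.
// Input: an array of parsed done-claim objects, each with an optional
// `execution_steps` array whose entries carry a `step` type string.
function tddCompliance(doneClaims) {
  let compliant = 0;
  for (const claim of doneClaims) {
    const steps = (claim.execution_steps ?? []).map((s) => s.step);
    const firstTest = steps.indexOf('write_test');
    const firstImpl = steps.indexOf('implement');
    // Assumed rule: a test must be written, and it must come before
    // the first implementation step (if there is one).
    if (firstTest !== -1 && (firstImpl === -1 || firstTest < firstImpl)) {
      compliant += 1;
    }
  }
  const total = doneClaims.length;
  // An empty logs directory yields 0% rather than NaN.
  const percent = total === 0 ? 0 : Math.round((compliant / total) * 100);
  return { total, compliant, percent };
}
```

Returning `0` for an empty input keeps the empty-logs case from producing `NaN` in the generated report.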
|
|
53
|
+
|
|
54
|
+
**Implementation priority for the 10 sections:**
|
|
55
|
+
- Required (core feedback): §1 Automated Validation, §3 Worker Process Quality, §7 Patterns, §8 Recommendations
|
|
56
|
+
- Important (diagnostics): §2 Failure Deep Dive, §4 Verifier Quality, §5 AC Lifecycle
|
|
57
|
+
- Supplementary (reference): §6 Test-Spec Adherence, §9 Cost, §10 Blind Spots
|
|
58
|
+
|
|
59
|
+
**Return:** `{ reportPath, version, summary }`; `summary` is passed to `generateCampaignReport()` as the `svSummary` parameter
|
|
60
|
+
|
|
61
|
+
### Change 2: Wire SV generation into campaign-main-loop.mjs
|
|
62
|
+
**File:** `src/node/runner/campaign-main-loop.mjs` lines 465, 568, 590
|
|
63
|
+
|
|
64
|
+
At each of the three current `generateCampaignReport()` call sites:
|
|
65
|
+
1. Check `options.withSelfVerification`
|
|
66
|
+
2. If true, call `generateSVReport()`
|
|
67
|
+
3. Pass the result's `summary` as the `svSummary` parameter
|
|
68
|
+
|
|
69
|
+
**Before (current):**
|
|
70
|
+
```javascript
|
|
71
|
+
await generateCampaignReport({
|
|
72
|
+
slug, reportFile, prdFile, statusFile, analyticsFile, now
|
|
73
|
+
});
|
|
74
|
+
```
|
|
75
|
+
|
|
76
|
+
**After:**
|
|
77
|
+
```javascript
|
|
78
|
+
let svSummary = 'N/A — --with-self-verification not enabled';
|
|
79
|
+
if (options.withSelfVerification) {
|
|
80
|
+
const sv = await generateSVReport({
|
|
81
|
+
slug, logsDir: paths.logsDir, prdFile: paths.prdFile,
|
|
82
|
+
testSpecFile: paths.testSpecFile, analyticsFile: paths.analyticsFile,
|
|
83
|
+
outputDir: paths.analyticsDir,
|
|
84
|
+
});
|
|
85
|
+
svSummary = sv.summary;
|
|
86
|
+
}
|
|
87
|
+
await generateCampaignReport({
|
|
88
|
+
slug, reportFile, prdFile, statusFile, analyticsFile, now, svSummary
|
|
89
|
+
});
|
|
106
90
|
```
|
|
107
|
-
## Key Decisions
|
|
108
91
|
|
|
109
|
-
|
|
110
|
-
|
|
111
|
-
to:
|
|
112
|
-
```
|
|
113
|
-
## Key Decisions
|
|
114
|
-
(seeded from brainstorm — do not erase, only append)
|
|
92
|
+
### Change 3: Create the analytics directory
|
|
93
|
+
**File:** `src/node/runner/campaign-main-loop.mjs` (initialization phase)
|
|
115
94
|
|
|
116
|
-
|
|
117
|
-
|
|
118
|
-
|
|
95
|
+
Create the `~/.claude/ralph-desk/analytics/<slug>/` directory when the campaign starts.
|
|
96
|
+
- Append a `--<root_hash>` suffix to the slug (prevents cross-project collisions, per the rlp-desk.md:248 spec)
|
|
97
|
+
- Write the initial metadata.json
|
|
119
98
|
|
|
120
|
-
**
|
|
121
|
-
```
|
|
122
|
-
|
|
99
|
+
**Add to the `paths` object:**
|
|
100
|
+
```javascript
|
|
101
|
+
analyticsDir: join(homeDir, '.claude/ralph-desk/analytics', `${slug}--${rootHash}`),
|
|
123
102
|
```
|
|
124
103
|
|
|
125
|
-
### Change
|
|
126
|
-
**
|
|
127
|
-
- `src/scripts/init_ralph_desk.zsh` Worker prompt — insert after line 316, before Change 2's Planning section
|
|
128
|
-
- `src/scripts/init_ralph_desk.zsh` Verifier prompt — insert after line 429
|
|
104
|
+
### Change 4: Implement Brainstorm Step 0 SV Report Feedback
|
|
105
|
+
**File:** `src/commands/rlp-desk.md` brainstorm section (line 115 area)
|
|
129
106
|
|
|
130
|
-
|
|
107
|
+
Step 0 currently has only a one-line description. Add a concrete execution procedure:
|
|
131
108
|
|
|
132
|
-
```
|
|
133
|
-
|
|
134
|
-
|
|
135
|
-
|
|
136
|
-
|
|
137
|
-
|
|
138
|
-
|
|
139
|
-
|
|
140
|
-
|
|
141
|
-
|
|
142
|
-
|
|
143
|
-
|
|
144
|
-
2. Simplicity First
|
|
145
|
-
Minimum code that solves the problem. Nothing speculative.
|
|
146
|
-
- No features beyond what was asked.
|
|
147
|
-
- No abstractions for single-use code.
|
|
148
|
-
- No configurability that was not specified.
|
|
149
|
-
- No defensive handling for implausible scenarios unless the context requires it.
|
|
150
|
-
- If 200 lines could be 50, rewrite it.
|
|
151
|
-
Ask: "Would a strong senior engineer call this overcomplicated?" If yes, simplify.
|
|
152
|
-
|
|
153
|
-
3. Surgical Changes
|
|
154
|
-
Touch only what you must. Clean up only your own mess.
|
|
155
|
-
- Do not improve adjacent code, comments, or formatting unless required by the task.
|
|
156
|
-
- Do not refactor unrelated code.
|
|
157
|
-
- Match the local style unless there is a compelling reason not to.
|
|
158
|
-
- If unrelated dead code is noticed, mention it in done-claim — do not delete it.
|
|
159
|
-
- Remove imports, variables, or functions that YOUR changes made unused.
|
|
160
|
-
- Do not remove pre-existing dead code.
|
|
161
|
-
Test: every changed line should trace directly to the contract.
|
|
162
|
-
|
|
163
|
-
4. Goal-Driven Execution
|
|
164
|
-
Define success criteria. Loop until verified.
|
|
165
|
-
These principles are enforced by the TDD Mandate and Planning step below.
|
|
166
|
-
If success criteria for any AC are unclear, signal blocked.
|
|
167
|
-
```
|
|
168
|
-
|
|
169
|
-
**Verifier prompt — insert after line 429 ("Independent verifier for Ralph Desk: $SLUG"), before line 431 ("## Iron Law"):**
|
|
170
|
-
|
|
171
|
-
```
|
|
172
|
-
## Verification Principles
|
|
173
|
-
|
|
174
|
-
1. Think Before Judging
|
|
175
|
-
Don't assume. Don't default to PASS or FAIL without evidence.
|
|
176
|
-
- State your assumptions about what PASS looks like for each AC before
|
|
177
|
-
checking evidence.
|
|
178
|
-
- If evidence is ambiguous or incomplete, say what is unclear and why —
|
|
179
|
-
do not default to either verdict.
|
|
180
|
-
- If multiple interpretations of an AC exist, flag it as a spec issue.
|
|
181
|
-
|
|
182
|
-
2. Goal-Driven Verification
|
|
183
|
-
Define the specific evidence required for PASS before you start checking.
|
|
184
|
-
- For each AC, state: "PASS requires [specific evidence]."
|
|
185
|
-
- Verify against that criteria, not against a general impression of code quality.
|
|
186
|
-
- If success criteria are unclear, note it in reasoning — do not invent criteria.
|
|
109
|
+
```markdown
|
|
110
|
+
0. **SV Report Feedback** — If a prior campaign's self-verification report exists:
|
|
111
|
+
a. Scan `~/.claude/ralph-desk/analytics/` for directories matching this project root
|
|
112
|
+
b. Read the latest `self-verification-report-*.md` from each matching directory
|
|
113
|
+
c. Extract from §7 (Patterns) and §8 (Recommendations):
|
|
114
|
+
- Which US types/sizes failed most frequently
|
|
115
|
+
- Which AC quality dimensions scored lowest
|
|
116
|
+
- Which model tiers underperformed for this project's complexity
|
|
117
|
+
- Specific brainstorm/PRD/test-spec recommendations from prior campaigns
|
|
118
|
+
d. Present findings to user: "Prior campaign analysis found: [patterns]. Recommendations: [suggestions]."
|
|
119
|
+
e. If no prior reports exist, skip and note "No prior campaign data available."
|
|
187
120
|
```
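
Sub-steps (a) and (b) of the procedure above might look like the following in code. This is a hedged sketch: step 0 is a prompt-level instruction in rlp-desk.md, so no such functions are confirmed to exist in the package. Both helper names are hypothetical, and matching directories by a `--<root_hash>` suffix is an assumption based on the `analyticsDir` naming in Change 3.

```javascript
// Hypothetical helpers, not part of the package.

// (a) Keep only analytics directories whose name ends with this
// project's root hash (assumed naming: `<slug>--<root_hash>`).
function matchingCampaignDirs(dirNames, rootHash) {
  return dirNames.filter((name) => name.endsWith(`--${rootHash}`));
}

// (b) Pick the highest-numbered self-verification report in one
// directory listing, or null when the directory has none.
function latestReport(fileNames) {
  const pattern = /^self-verification-report-(\d+)\.md$/;
  let best = null;
  let bestVersion = -1;
  for (const name of fileNames) {
    const match = pattern.exec(name);
    if (match && Number(match[1]) > bestVersion) {
      bestVersion = Number(match[1]);
      best = name;
    }
  }
  return best;
}
```

A `null` result from `latestReport` for every matching directory corresponds to sub-step (e), where brainstorm notes that no prior campaign data is available.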
|
|
188
121
|
|
|
189
122
|
---
|
|
190
123
|
|
|
191
124
|
## Implementation Sequence
|
|
192
125
|
|
|
193
|
-
| Wave | Changes | Files |
|
|
194
|
-
|
|
195
|
-
| 1 | Change 1 (
|
|
196
|
-
|
|
|
197
|
-
| 2 | Change 2 (
|
|
198
|
-
| 3 | Change
|
|
199
|
-
| 3 | Change 4 (memory bridge) | rlp-desk.md + init_ralph_desk.zsh | MEDIUM |
|
|
126
|
+
| Wave | Changes | Files | Dependency |
|
|
127
|
+
|------|---------|-------|------------|
|
|
128
|
+
| 1 | Change 1 (generateSVReport) | campaign-reporting.mjs | None |
|
|
129
|
+
| 1 | Change 3 (analytics dir) | campaign-main-loop.mjs + paths.mjs | None |
|
|
130
|
+
| 2 | Change 2 (wire up SV call) | campaign-main-loop.mjs | Change 1, 3 |
|
|
131
|
+
| 3 | Change 4 (brainstorm step 0) | rlp-desk.md | Change 1 (reports exist) |
|
|
200
132
|
|
|
201
|
-
|
|
202
|
-
|
|
203
|
-
|
|
204
|
-
- Wave 3: rlp-desk.md changes. Change 4 depends on Change 3 (exploration produces findings that get seeded).
|
|
133
|
+
The Wave 1 changes can run in parallel (they are independent of each other).
|
|
134
|
+
Wave 2 starts after Wave 1 completes.
|
|
135
|
+
Wave 3 is separate: it modifies only rlp-desk.md.
|
|
205
136
|
|
|
206
137
|
---
|
|
207
138
|
|
|
208
|
-
## TDD
|
|
209
|
-
|
|
210
|
-
Each change has tests written FIRST, verified to fail, then implementation, then re-verify.
|
|
211
|
-
|
|
212
|
-
### Test Script: `tests/test_template_generation.sh`
|
|
213
|
-
|
|
214
|
-
```bash
|
|
215
|
-
#!/bin/bash
|
|
216
|
-
# TDD tests for template generation changes
|
|
217
|
-
# Run: bash tests/test_template_generation.sh
|
|
218
|
-
set -euo pipefail
|
|
219
|
-
|
|
220
|
-
SCRIPT="src/scripts/init_ralph_desk.zsh"
|
|
221
|
-
CMD="src/commands/rlp-desk.md"
|
|
222
|
-
GOV="src/governance.md"
|
|
223
|
-
PASS=0; FAIL=0; TOTAL=0
|
|
224
|
-
|
|
225
|
-
assert_contains() {
|
|
226
|
-
local file="$1" pattern="$2" label="$3"
|
|
227
|
-
TOTAL=$((TOTAL+1))
|
|
228
|
-
if grep -q "$pattern" "$file" 2>/dev/null; then
|
|
229
|
-
echo " PASS: $label"; PASS=$((PASS+1))
|
|
230
|
-
else
|
|
231
|
-
echo " FAIL: $label (pattern not found: $pattern)"; FAIL=$((FAIL+1))
|
|
232
|
-
fi
|
|
233
|
-
}
|
|
234
|
-
|
|
235
|
-
assert_not_contains() {
|
|
236
|
-
local file="$1" pattern="$2" label="$3"
|
|
237
|
-
TOTAL=$((TOTAL+1))
|
|
238
|
-
if grep -q "$pattern" "$file" 2>/dev/null; then
|
|
239
|
-
echo " FAIL: $label (stale pattern still present: $pattern)"; FAIL=$((FAIL+1))
|
|
240
|
-
else
|
|
241
|
-
echo " PASS: $label"; PASS=$((PASS+1))
|
|
242
|
-
fi
|
|
243
|
-
}
|
|
244
|
-
|
|
245
|
-
echo "=== Change 1: Run Preset Desync ==="
|
|
246
|
-
assert_not_contains "$SCRIPT" "\-\-final-consensus" "C1: no --final-consensus"
|
|
247
|
-
assert_not_contains "$SCRIPT" "gpt-5.3-codex-spark" "C1: no gpt-5.3-codex-spark"
|
|
248
|
-
assert_not_contains "$SCRIPT" "\-\-verify-consensus" "C1: no --verify-consensus"
|
|
249
|
-
assert_contains "$SCRIPT" "\-\-consensus final-only" "C1: --consensus final-only present"
|
|
250
|
-
assert_contains "$SCRIPT" "spark:high" "C1: spark:high present"
|
|
251
|
-
assert_contains "$SCRIPT" "default: haiku" "C1: worker default haiku"
|
|
252
|
-
assert_contains "$SCRIPT" "\-\-lock-worker-model" "C1: --lock-worker-model in options"
|
|
253
|
-
assert_contains "$SCRIPT" "\-\-cb-threshold" "C1: --cb-threshold in options"
|
|
254
|
-
assert_contains "$SCRIPT" "\-\-iter-timeout" "C1: --iter-timeout in options"
|
|
255
|
-
assert_contains "$SCRIPT" "\-\-consensus-model" "C1: --consensus-model in options"
|
|
256
|
-
assert_contains "$SCRIPT" "\-\-mode tmux" "C1: --mode tmux in recommended"
|
|
257
|
-
|
|
258
|
-
echo ""
|
|
259
|
-
echo "=== Change 2: Worker Planning Step ==="
|
|
260
|
-
assert_contains "$SCRIPT" "## Planning" "C2: Planning section in Worker prompt"
|
|
261
|
-
assert_contains "$SCRIPT" "step.*plan.*ac_id.*all" "C2: plan execution_step format"
|
|
262
|
-
assert_contains "$SCRIPT" "Keep planning lightweight" "C2: lightweight constraint"
|
|
263
|
-
assert_contains "$GOV" "plan.*write_test.*verify_red" "C2: plan in §1f step types"
|
|
264
|
-
assert_contains "$SCRIPT" "Planning Step.*decision.*info" "C2: Verifier plan audit"
|
|
265
|
-
|
|
266
|
-
echo ""
|
|
267
|
-
echo "=== Change 3: Brainstorm Exploration ==="
|
|
268
|
-
assert_contains "$CMD" "Codebase Exploration" "C3: exploration step present"
|
|
269
|
-
assert_contains "$CMD" "greenfield project" "C3: greenfield skip path"
|
|
270
|
-
assert_contains "$CMD" "entry points.*key modules" "C3: exploration instructions"
|
|
271
|
-
|
|
272
|
-
echo ""
|
|
273
|
-
echo "=== Change 4: Memory Bridge ==="
|
|
274
|
-
assert_contains "$CMD" "Campaign memory.*Key Decisions" "C4: init seeds memory instruction"
|
|
275
|
-
assert_contains "$SCRIPT" "seeded from brainstorm" "C4: seed markers in template"
|
|
276
|
-
assert_contains "$SCRIPT" "PRESERVE the Key Decisions" "C4: Worker preservation instruction"
|
|
277
|
-
|
|
278
|
-
echo ""
|
|
279
|
-
echo "=== Change 5: Coding Principles ==="
|
|
280
|
-
assert_contains "$SCRIPT" "## Coding Principles" "C5: Worker coding principles section"
|
|
281
|
-
assert_contains "$SCRIPT" "Think Before Coding" "C5: principle 1 in Worker"
|
|
282
|
-
assert_contains "$SCRIPT" "Simplicity First" "C5: principle 2 in Worker"
|
|
283
|
-
assert_contains "$SCRIPT" "Surgical Changes" "C5: principle 3 in Worker"
|
|
284
|
-
assert_contains "$SCRIPT" "Goal-Driven Execution" "C5: principle 4 in Worker"
|
|
285
|
-
assert_contains "$SCRIPT" "## Verification Principles" "C5: Verifier principles section"
|
|
286
|
-
assert_contains "$SCRIPT" "Think Before Judging" "C5: Verifier principle 1"
|
|
287
|
-
assert_contains "$SCRIPT" "Goal-Driven Verification" "C5: Verifier principle 2"
|
|
288
|
-
|
|
289
|
-
echo ""
|
|
290
|
-
echo "=== RESULTS ==="
|
|
291
|
-
echo "PASS: $PASS / $TOTAL"
|
|
292
|
-
echo "FAIL: $FAIL / $TOTAL"
|
|
293
|
-
[ $FAIL -eq 0 ] && echo "ALL TESTS PASSED" || echo "SOME TESTS FAILED"
|
|
294
|
-
exit $FAIL
|
|
295
|
-
```
|
|
296
|
-
|
|
297
|
-
### TDD Flow Per Wave
|
|
139
|
+
## TDD Plan
|
|
298
140
|
|
|
299
|
-
|
|
300
|
-
1. Write test → run → expect 11 FAIL (stale patterns present, new patterns absent)
|
|
301
|
-
2. Implement Change 1
|
|
302
|
-
3. Run test → expect 11 PASS
|
|
303
|
-
4. `bash -n src/scripts/init_ralph_desk.zsh` (syntax check)
|
|
141
|
+
### Test file: `tests/node/test-sv-report.mjs` (new)
|
|
304
142
|
|
|
305
|
-
**
|
|
306
|
-
1
|
|
307
|
-
2
|
|
308
|
-
3
|
|
309
|
-
4
|
|
310
|
-
5
|
|
311
|
-
6
|
|
143
|
+
**Change 1 tests:**
|
|
144
|
+
- T1.1: Generate the 10-section report from done-claim + verify-verdict files
|
|
145
|
+
- T1.2: Empty logs directory → handled gracefully (empty report)
|
|
146
|
+
- T1.3: Worker Process Quality §3: accuracy of the TDD compliance %
|
|
147
|
+
- T1.4: Verifier Judgment Quality §4: accuracy of the reasoning completeness %
|
|
148
|
+
- T1.5: AC Lifecycle §5: reopen count tracking
|
|
149
|
+
- T1.6: Patterns §7 + Recommendations §8: pattern extraction
|
|
150
|
+
- T1.7: Versioned file writing (NNN increments)
|
|
151
|
+
- T1.8: Validate the structure of self-verification-data.json
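
As an illustration of what T1.7 is meant to check, the NNN increment could be computed as below. This is a sketch under assumptions: `nextReportName` is a hypothetical name, and the plan's intent is to reuse the existing `versionFile()` helper, whose actual behavior is not shown in this diff and may differ.

```javascript
// Hypothetical helper, not part of the package; the real code is
// expected to reuse versionFile() from campaign-reporting.mjs.
function nextReportName(existingNames) {
  const pattern = /^self-verification-report-(\d{3,})\.md$/;
  let max = 0;
  for (const name of existingNames) {
    const match = pattern.exec(name);
    if (match) {
      max = Math.max(max, Number(match[1]));
    }
  }
  // Zero-pad to three digits to match the NNN convention.
  const next = String(max + 1).padStart(3, '0');
  return `self-verification-report-${next}.md`;
}
```

Files that do not match the report pattern, such as `self-verification-data.json`, are ignored, so the numbering depends only on earlier reports.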
|
|
312
152
|
|
|
313
|
-
**
|
|
314
|
-
1
|
|
315
|
-
2
|
|
316
|
-
3. Run test → expect Change 3 PASS, Change 4 still FAIL
|
|
317
|
-
4. Implement Change 4 (memory bridge — rlp-desk.md + init)
|
|
318
|
-
5. Run test → expect all PASS
|
|
153
|
+
**Change 2 tests:**
|
|
154
|
+
- T2.1: withSelfVerification=false → `svSummary` keeps its default value
|
|
155
|
+
- T2.2: withSelfVerification=true → `generateSVReport` is called
|
|
319
156
|
|
|
320
|
-
|
|
157
|
+
**Change 3 tests:**
|
|
158
|
+
- T3.1: Confirm the analytics directory is created
|
|
159
|
+
- T3.2: Validate the structure of metadata.json
|
|
321
160
|
|
|
322
|
-
|
|
323
|
-
|
|
324
|
-
```bash
|
|
325
|
-
# E2E: generate artifacts and verify
|
|
326
|
-
TEST_SLUG="test-karpathy-e2e"
|
|
327
|
-
TEST_DIR=$(mktemp -d)
|
|
328
|
-
cd "$TEST_DIR" && git init && mkdir -p .claude/ralph-desk
|
|
329
|
-
|
|
330
|
-
bash /path/to/src/scripts/init_ralph_desk.zsh "$TEST_SLUG" "test objective"
|
|
331
|
-
|
|
332
|
-
# Check Worker prompt
|
|
333
|
-
grep -q "## Coding Principles" .claude/ralph-desk/prompts/$TEST_SLUG.worker.prompt.md
|
|
334
|
-
grep -q "## Planning" .claude/ralph-desk/prompts/$TEST_SLUG.worker.prompt.md
|
|
335
|
-
grep -q "Think Before Coding" .claude/ralph-desk/prompts/$TEST_SLUG.worker.prompt.md
|
|
336
|
-
grep -q "PRESERVE the Key Decisions" .claude/ralph-desk/prompts/$TEST_SLUG.worker.prompt.md
|
|
337
|
-
|
|
338
|
-
# Check Verifier prompt
|
|
339
|
-
grep -q "## Verification Principles" .claude/ralph-desk/prompts/$TEST_SLUG.verifier.prompt.md
|
|
340
|
-
grep -q "Think Before Judging" .claude/ralph-desk/prompts/$TEST_SLUG.verifier.prompt.md
|
|
341
|
-
|
|
342
|
-
# Check campaign memory
|
|
343
|
-
grep -q "seeded from brainstorm" .claude/ralph-desk/memos/$TEST_SLUG-memory.md
|
|
344
|
-
|
|
345
|
-
# Check run presets (capture init output)
|
|
346
|
-
# ... verify --consensus, spark:high, haiku defaults appear
|
|
347
|
-
|
|
348
|
-
rm -rf "$TEST_DIR"
|
|
349
|
-
```
|
|
161
|
+
**Change 4 tests:**
|
|
162
|
+
- T4.1: The step 0 execution procedure exists in rlp-desk.md (grep)
|
|
350
163
|
|
|
351
164
|
---
|
|
352
165
|
|
|
353
|
-
##
|
|
166
|
+
## Verification
|
|
354
167
|
|
|
355
|
-
|
|
168
|
+
### TDD Flow
|
|
169
|
+
1. Write the tests → RED (because `generateSVReport` does not exist yet)
|
|
170
|
+
2. Implement Change 1 → tests GREEN
|
|
171
|
+
3. Implement Change 3 → analytics dir tests GREEN
|
|
172
|
+
4. Implement Change 2 → wiring tests GREEN
|
|
173
|
+
5. Implement Change 4 → grep test GREEN
|
|
356
174
|
|
|
357
|
-
|
|
358
|
-
|
|
359
|
-
-
|
|
360
|
-
|
|
175
|
+
### E2E Verification
|
|
176
|
+
1. Run a campaign in a test project (with the `--with-self-verification` flag)
|
|
177
|
+
2. Confirm that `~/.claude/ralph-desk/analytics/<slug>/self-verification-report-001.md` is created
|
|
178
|
+
3. Confirm that the report contains all 10 sections
|
|
179
|
+
4. Confirm that the second campaign's brainstorm references the first campaign's patterns
|
|
361
180
|
|
|
362
|
-
|
|
363
|
-
|
|
364
|
-
|
|
365
|
-
-
|
|
366
|
-
|
|
367
|
-
**Scenario 3: CRITICAL risk — governance change verification**
|
|
368
|
-
- Verify governance §1f has `plan` in step types
|
|
369
|
-
- Simulate: Worker without plan step → Verifier records `info` (not fail)
|
|
370
|
-
- Simulate: Worker erases Key Decisions → next Worker loses context
|
|
371
|
-
- Layers: L1 + L2 + L3 + governance compliance
|
|
372
|
-
|
|
373
|
-
---
|
|
374
|
-
|
|
375
|
-
## Post-Commit Checklist
|
|
376
|
-
|
|
377
|
-
1. Local file sync (ALL distributable files):
|
|
378
|
-
```bash
|
|
379
|
-
cp src/commands/rlp-desk.md ~/.claude/commands/rlp-desk.md
|
|
380
|
-
cp src/governance.md ~/.claude/ralph-desk/governance.md
|
|
381
|
-
cp src/scripts/init_ralph_desk.zsh ~/.claude/ralph-desk/init_ralph_desk.zsh
|
|
382
|
-
cp src/scripts/run_ralph_desk.zsh ~/.claude/ralph-desk/run_ralph_desk.zsh
|
|
383
|
-
cp src/scripts/lib_ralph_desk.zsh ~/.claude/ralph-desk/lib_ralph_desk.zsh
|
|
384
|
-
cp README.md ~/.claude/ralph-desk/README.md
|
|
385
|
-
```
|
|
386
|
-
|
|
387
|
-
2. Verify sync:
|
|
388
|
-
```bash
|
|
389
|
-
diff -q src/commands/rlp-desk.md ~/.claude/commands/rlp-desk.md
|
|
390
|
-
diff -q src/governance.md ~/.claude/ralph-desk/governance.md
|
|
391
|
-
diff -q src/scripts/init_ralph_desk.zsh ~/.claude/ralph-desk/init_ralph_desk.zsh
|
|
392
|
-
diff -q src/scripts/run_ralph_desk.zsh ~/.claude/ralph-desk/run_ralph_desk.zsh
|
|
393
|
-
diff -q src/scripts/lib_ralph_desk.zsh ~/.claude/ralph-desk/lib_ralph_desk.zsh
|
|
394
|
-
diff -q README.md ~/.claude/ralph-desk/README.md
|
|
395
|
-
```
|
|
396
|
-
All must produce no output.
|
|
181
|
+
### Self-Verification Gate
|
|
182
|
+
No changes to governance.md (§8½ is already defined). Only rlp-desk.md changes.
|
|
183
|
+
If init_ralph_desk.zsh is unchanged, only two scenarios are needed:
|
|
184
|
+
- LOW: brainstorm with no SV report present → step is skipped with "No prior data"
|
|
185
|
+
- MEDIUM: brainstorm with an SV report present → patterns are referenced
|
|
397
186
|
|
|
398
187
|
---
|
|
399
188
|
|
|
@@ -401,7 +190,15 @@ All must produce no output.
|
|
|
401
190
|
|
|
402
191
|
| File | Changes |
|
|
403
192
|
|------|---------|
|
|
404
|
-
| `src/
|
|
405
|
-
| `src/
|
|
406
|
-
| `src/
|
|
407
|
-
| `
|
|
193
|
+
| `src/node/reporting/campaign-reporting.mjs` | Change 1: add `generateSVReport()` |
|
|
194
|
+
| `src/node/runner/campaign-main-loop.mjs` | Change 2: wire up SV call; Change 3: analytics dir |
|
|
195
|
+
| `src/node/shared/paths.mjs` | Change 3: add `analyticsDir` path |
|
|
196
|
+
| `src/commands/rlp-desk.md` | Change 4: expand the brainstorm step 0 procedure |
|
|
197
|
+
| `tests/node/test-sv-report.mjs` | New file: SV report tests |
|
|
198
|
+
|
|
199
|
+
### Existing code available for reuse
|
|
200
|
+
- `versionFile()` (campaign-reporting.mjs:47-60): versioned file writing
|
|
201
|
+
- `readAnalytics()` (campaign-reporting.mjs:70-80): campaign.jsonl parsing
|
|
202
|
+
- `readJsonIfExists()` (campaign-reporting.mjs:62-68): safe JSON reading
|
|
203
|
+
- `summarizeUsStatus()` (campaign-reporting.mjs:91-96): US status aggregation
|
|
204
|
+
- `summarizeVerificationResults()` (campaign-reporting.mjs:98-102): verification result aggregation
|