@ai-dev-methodologies/rlp-desk 0.15.4 → 0.15.5

Files changed (27)
  1. package/CHANGELOG.md +17 -1
  2. package/package.json +9 -3
  3. package/src/node/MANIFEST.txt +3 -0
  4. package/src/node/prompts/prompt-assembler.mjs +2 -2
  5. package/src/node/run.mjs +70 -3
  6. package/src/node/runner/campaign-main-loop.mjs +13 -2
  7. package/src/scripts/run_ralph_desk.zsh +5 -3
  8. package/docs/rlp-desk/internal/verification-policy-gap-analysis.md +0 -523
  9. package/docs/rlp-desk/internal/verification-strategy-research.md +0 -2097
  10. package/docs/rlp-desk/plans/cozy-gliding-trinket.md +0 -53
  11. package/docs/rlp-desk/plans/frolicking-churning-honey.md +0 -253
  12. package/docs/rlp-desk/plans/keen-sauteeing-snowflake.md +0 -245
  13. package/docs/rlp-desk/plans/mutable-booping-corbato.md +0 -163
  14. package/docs/rlp-desk/plans/rlp-desk-0.11-handoff-7fixes.md +0 -352
  15. package/docs/rlp-desk/plans/rlp-desk-0.11.1-tmux-pane-disappearance.md +0 -260
  16. package/docs/rlp-desk/plans/rlp-desk-elegant-papert-agent-a8cd695ffca2a3ad8.md +0 -84
  17. package/docs/rlp-desk/plans/rlp-desk-elegant-papert.md +0 -270
  18. package/docs/rlp-desk/plans/rlp-desk-tmux-flywheel-routing.md +0 -730
  19. package/docs/rlp-desk/plans/toasty-whistling-diffie-agent-a6814625642e956da.md +0 -201
  20. package/docs/rlp-desk/plans/toasty-whistling-diffie.md +0 -117
  21. package/docs/rlp-desk/plans/validated-snacking-crayon.md +0 -204
  22. package/examples/calculator/.claude/ralph-desk/logs/loop-test/iter-001.worker-output.log +0 -0
  23. package/examples/calculator/.claude/ralph-desk/logs/loop-test/iter-001.worker-prompt.md +0 -38
  24. package/examples/calculator/.claude/ralph-desk/logs/loop-test/iter-001.worker-trigger.sh +0 -28
  25. package/examples/calculator/.claude/ralph-desk/logs/loop-test/session-config.json +0 -25
  26. package/examples/calculator/.claude/ralph-desk/logs/loop-test/status.json +0 -10
  27. package/examples/calculator/.claude/ralph-desk/logs/loop-test/worker-heartbeat.json +0 -1
@@ -1,163 +0,0 @@
1
- # Plan: rlp-desk Batch Mode + Operational Context Improvements
2
-
3
- ## Context
4
-
5
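- A real campaign (`prod-local-parity`, spark:high) exposed two structural problems:
6
-
7
- 1. **Batch mode fails indefinitely**: with five or more US, the Worker completes only some of them, the Verifier checks the whole set and returns FAIL, the progress is ignored, and the run ends in CB BLOCKED. `VERIFIED_US` tracking exists only in per-us mode, not in batch mode.
8
-
9
- 2. **No support for server projects**: after modifying code the Worker does not restart the server, does not know the server port, and has no health check. This is not the fault of the spark model; it is a **design flaw in which rlp-desk does not carry operational context into the brainstorm and prompts**.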
10
-
11
- ---
12
-
13
- ## P0: Batch Mode Partial Progress Tracking
14
-
15
- ### Files to Modify
16
- - `src/scripts/run_ralph_desk.zsh`
17
- - `src/commands/rlp-desk.md` (agent mode ⑦c)
18
-
19
- ### Changes
20
-
21
- #### 1. Track VERIFIED_US in batch mode as well (run_ralph_desk.zsh)
22
- - PASS verdict handling (L2423): remove the `per-us` condition, so that in batch mode a `signal_us_id` naming an individual US is also added to `VERIFIED_US`
23
- - FAIL verdict handling (L2445): parse `per_us_results` from the verdict JSON and add each US with `met=true` to `VERIFIED_US`
24
- - status.json update: record the `verified_us` array in batch mode as well
25
-
26
- #### 2. Pass VERIFIED_US to the Verifier prompt (run_ralph_desk.zsh L1225-1232)
27
- - Change the `if [[ "$VERIFY_MODE" = "per-us"` condition to `if [[ -n "$VERIFIED_US"`
28
- - Instruct the batch-mode verifier as well to skip US that are already verified
29
-
30
- #### 3. Fix Contract Scope Narrowing (run_ralph_desk.zsh L2461-2473)
31
- - On FAIL: extract the passing US from the verdict and write "US-001~004 verified. Continue from US-005." into the fix contract
32
- - When assembling the Worker prompt, consult `VERIFIED_US` and pass the narrowed scope
33
-
34
- #### 4. Partial reset of consecutive_failures (run_ralph_desk.zsh L2447)
35
- - If any US newly passed (`VERIFIED_US` grew), reset `CONSECUTIVE_FAILURES=0`
36
- - If the state is unchanged with no progress, increment as before
37
-
38
- #### 5. Make per_us_results mandatory in the Verifier verdict
39
- - Add the output format to the Verifier prompt template (init_ralph_desk.zsh L384-474):
40
- ```json
41
- {
42
- "verdict": "fail",
43
- "per_us_results": { "US-001": "pass", "US-005": "fail" },
44
- "issues": [...]
45
- }
46
- ```
47
- - Instruct both batch and per-us modes to include per_us_results
48
-
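The partial-progress rule in P0 can be sketched in a few lines of shell. This is a minimal illustration, not the code in `run_ralph_desk.zsh`: the function names `passed_us_from_verdict` and `record_fail_verdict` are hypothetical, the verdict is assumed to use the `"US-001": "pass"` shape from the JSON example above, and IDs are pulled out with `grep` so the sketch needs no JSON tool.

```bash
# Hypothetical state variables mirroring the names used in the plan.
VERIFIED_US=""            # space-separated list of verified story IDs
CONSECUTIVE_FAILURES=0

# Print the story IDs marked "pass" in a verdict JSON read from stdin.
passed_us_from_verdict() {
  grep -oE '"US-[0-9]+"[[:space:]]*:[[:space:]]*"pass"' | grep -oE 'US-[0-9]+' | sort -u
}

# On a FAIL verdict: merge newly passed stories into VERIFIED_US,
# reset the failure counter if the list grew, otherwise increment it.
record_fail_verdict() {
  local verdict_json="$1" before="$VERIFIED_US" us
  for us in $(printf '%s' "$verdict_json" | passed_us_from_verdict); do
    case " $VERIFIED_US " in
      *" $us "*) ;;                                          # already verified
      *) VERIFIED_US="${VERIFIED_US:+$VERIFIED_US }$us" ;;   # newly verified
    esac
  done
  if [ "$VERIFIED_US" != "$before" ]; then
    CONSECUTIVE_FAILURES=0
  else
    CONSECUTIVE_FAILURES=$((CONSECUTIVE_FAILURES + 1))
  fi
}

record_fail_verdict '{"verdict":"fail","per_us_results":{"US-001":"pass","US-005":"fail"}}'
echo "verified=[$VERIFIED_US] consecutive_failures=$CONSECUTIVE_FAILURES"
```

A second identical FAIL verdict would leave `VERIFIED_US` unchanged and raise the counter to 1, which is the "no progress" branch described in step 4.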
49
- ---
50
-
51
- ## P1: Brainstorm Operational Context + Worker System Prompt
52
-
53
- ### Files to Modify
54
- - `src/commands/rlp-desk.md` (brainstorm section)
55
- - `src/scripts/init_ralph_desk.zsh` (Worker/Verifier prompt template)
56
-
57
- ### Changes
58
-
59
- #### 1. Brainstorm: Collect Operational Context (rlp-desk.md L24-93)
60
- The brainstorm currently collects 11 items; **add a 12th item**:
61
-
62
- ```
63
- 12. **Operational Context** (if applicable):
64
- - Does this project require a running server/service? (y/n)
65
- - Server start command (e.g., `npm run dev`, `python manage.py runserver`)
66
- - Server port (e.g., 7001)
67
- - Health check URL (e.g., `http://localhost:7001/health`)
68
- - Other runtime dependencies (e.g., database, Redis)
69
- ```
70
-
71
- The brainstorm auto-detects `scripts.dev`/`scripts.start` in `package.json`, `Makefile`, `docker-compose.yml`, and similar files in the project directory and recommends values from them.
72
-
73
- #### 2. Brainstorm: Guide for including operational steps when generating US
74
- Add to the US/AC authoring guide (rlp-desk.md L26-38):
75
-
76
- ```
77
- - If the project has operational context (server, DB, etc.):
78
- - Each US that modifies server code MUST include AC:
79
- "Given server is running, When code is modified, Then server is restarted and responds on health check URL"
80
- - Do NOT assume Worker will restart server on its own — spell it out in AC
81
- ```
82
-
83
- #### 3. Init: Inject Operational Rules into the Worker Prompt (init_ralph_desk.zsh L285-380)
84
- Inject the operational context collected during brainstorm into the Worker prompt template:
85
-
86
- ```markdown
87
- ## Operational Context
88
- - **Server Command**: `npm run dev`
89
- - **Server Port**: 7001
90
- - **Health Check**: `http://localhost:7001/health`
91
-
92
- ### Operational Rules (always apply)
93
- - After modifying server/application code, restart the server: `[server_cmd]`
94
- - Before signaling done, verify server responds: `curl -s [health_url] || fail`
95
- - Do NOT modify dependency files (package.json, requirements.txt, etc.) unless the AC explicitly requires it
96
- - Do NOT run package install commands (npm install, pip install, etc.) unless the AC explicitly requires it
97
- ```
98
-
99
- For projects with no operational context (code changes only), omit this section.
100
-
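The "omit the section when there is no operational context" rule can be sketched as a small renderer. This is an illustration only: `render_operational_context` is a hypothetical helper, not a function known to exist in `init_ralph_desk.zsh`, and it assumes an empty server command means a code-only project.

```bash
# Hypothetical helper: print the Operational Context block for the Worker
# prompt, or print nothing when the project has no server command.
render_operational_context() {
  local server_cmd="$1" server_port="$2" health_url="$3"
  # Assumption: no server command means a code-only project, so emit nothing.
  [ -n "$server_cmd" ] || return 0
  printf '## Operational Context\n'
  printf -- '- **Server Command**: `%s`\n' "$server_cmd"
  [ -n "$server_port" ] && printf -- '- **Server Port**: %s\n' "$server_port"
  [ -n "$health_url" ] && printf -- '- **Health Check**: `%s`\n' "$health_url"
  return 0
}

render_operational_context "npm run dev" 7001 "http://localhost:7001/health"
```

Calling it with three empty arguments produces no output, so the template can append the result unconditionally.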
101
- #### 4. Init: Add an Operational Check to the Verifier Prompt as Well
102
- In the Verifier prompt template (init_ralph_desk.zsh L384-474):
103
-
104
- ```markdown
105
- ## Operational Verification (if server context provided)
106
- - Verify server is running on expected port before checking ACs
107
- - If server is down, verdict=FAIL with issue: "server not running"
108
- ```
109
-
110
- #### 5. --server-cmd / --server-port CLI options (run_ralph_desk.zsh)
111
- Init writes the values collected during brainstorm into the prompt, but they can also be overridden at run time:
112
- - `--server-cmd "npm run dev"` → overrides the server command in the Worker prompt
113
- - `--server-port 7001` → overrides the port in the Worker prompt
114
- - Optional health check at the start of each iteration at run time (`--server-health-check` flag)
115
-
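A minimal sketch of the run-time override, assuming the three flags named above and nothing else about the real option parser. `parse_server_flags` is a hypothetical name, and the initial values stand in for whatever init stored from the brainstorm.

```bash
# Values assumed to have been written by init from the brainstorm answers.
SERVER_CMD="npm run dev"
SERVER_PORT="7001"
SERVER_HEALTH_CHECK=0

# Hypothetical parser: later flags override the init-time values.
parse_server_flags() {
  while [ $# -gt 0 ]; do
    case "$1" in
      --server-cmd|--server-port)
        # Guard against a flag given without its value.
        [ $# -ge 2 ] || { echo "missing value for $1" >&2; return 2; }
        if [ "$1" = "--server-cmd" ]; then SERVER_CMD="$2"; else SERVER_PORT="$2"; fi
        shift 2 ;;
      --server-health-check)
        SERVER_HEALTH_CHECK=1
        shift ;;
      *)
        shift ;;   # other flags are out of scope for this sketch
    esac
  done
}

parse_server_flags --server-port 8080 --server-health-check
echo "cmd=$SERVER_CMD port=$SERVER_PORT health_check=$SERVER_HEALTH_CHECK"
```

The explicit argument-count guard matters because `shift 2` with only one argument left does not shift in bash, which would otherwise loop forever.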
116
- ---
117
-
118
- ## Verification Plan
119
-
120
- ### P0 Tests
121
- ```bash
122
- # Batch partial progress unit tests
123
- zsh tests/test_batch_partial_progress.sh
124
- # Scenario: batch FAIL verdict includes per_us_results → confirm VERIFIED_US tracking
125
- # Scenario: confirm consecutive_failures resets when a new US passes
126
- # Scenario: confirm the verifier prompt includes VERIFIED_US (batch mode)
127
- ```
128
-
129
- ### P1 Tests
130
- ```bash
131
- # Operational context unit tests
132
- zsh tests/test_operational_context.sh
133
- # Scenario: confirm parsing of the --server-cmd option
134
- # Scenario: confirm operational rules are injected into the Worker prompt
135
- # Scenario: confirm the section is omitted for projects without operational context
136
- ```
137
-
138
- ### Self-Verification (required by CLAUDE.md)
139
- Run the three self-verification scenarios (LOW/MEDIUM/CRITICAL) against the changed src files.
140
-
141
- ### E2E
142
- Test with real campaigns:
143
- 1. batch mode + 10 US → confirm partial progress tracking
144
- 2. server project + spark:high → confirm the server restart is performed
145
-
146
- ---
147
-
148
- ## File Map
149
-
150
- | File | P0 | P1 |
151
- |------|----|----|
152
- | `src/scripts/run_ralph_desk.zsh` | VERIFIED_US batch tracking, fix contract narrowing, CF reset | --server-cmd/port options |
153
- | `src/scripts/lib_ralph_desk.zsh` | - | - |
154
- | `src/scripts/init_ralph_desk.zsh` | - | Inject operational context into Worker/Verifier prompts |
155
- | `src/commands/rlp-desk.md` | agent mode ⑦c batch logic | brainstorm item 12, US guide |
156
- | `src/governance.md` | - | - |
157
-
158
- ---
159
-
160
- ## Scope / Non-Goals
161
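- - Per-model guardrails (a spark-only prohibition list) → **not doing this**. Solved through the brainstorm/prompt structure instead
162
- - Removing batch mode entirely → **not doing this**. Fix it so that it is usable
163
- - auto-detect project type → the brainstorm only asks the user to confirm and makes file-based recommendations. Not full automation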
@@ -1,352 +0,0 @@
1
- # rlp-desk 0.11 — Handoff Final 7-fix bundle (ralplan v3)
2
-
3
- > v3 changes: absorbed the Architect executor follow-ups NEW-1 (bash→zsh fixture invocation) and NEW-2 (early-exit grep broadened).
4
- > v2 changes (Architect + Critic codex iteration): PR split A/B decision, R7 schema fallback, R8 helper-side guard, R9 reason canonicalization + edge cases, R10 normalized US extractor + quarantine (not rm), R11 early-exit grep inventory + trap, self-verification mechanical assertion patch.
5
-
6
- ## Context
7
-
8
- Seven defects, based on the timestamp evidence in the consumer Final Handoff (`coordination/handoffs/2026-04-25-rlp-desk-final-status-and-handoff.md`):
9
-
10
- | ID | Severity | Defect | Root file |
11
- |---|---|---|---|
12
- | P0-D | HIGH | A4 fallback fires frequently, 83% (worker iter-signal missing) | `run_ralph_desk.zsh:1587-1595`, `:526-546` |
13
- | P1-F | MEDIUM | test-spec ≥3 tests/AC IL-4 self-contradiction | `init_ralph_desk.zsh` test-spec gen + ingest |
14
- | P1-G | MEDIUM | no partial_verify signal vocabulary | `init_ralph_desk.zsh:448-454` Signal rules + verifier |
15
- | P1-H | MEDIUM | memory.md/latest.md not updated on blocked | worker prompt blocked exit hygiene |
16
- | P2-I | MEDIUM | block ≠ failure → contract defect silent 12-iter | `run_ralph_desk.zsh:2659` consecutive_blocks (new) |
17
- | P2-J | MEDIUM | final ALL verify cross-mission us_id leak | `run_ralph_desk.zsh:2198/2425-2429` US_LIST scope |
18
- | P2-K | LOW | cost-log is empty (tmux mode) | `lib_ralph_desk.zsh:367` write_cost_log call coverage |
19
-
20
- ## PR Split Decision (v2)
21
-
22
- Following the Architect's recommendation, adopt a **two-PR split: PR-A (protocol) + PR-B (runtime)**. Even if the user specified a "single PR", the split is needed because of the R7 schema collision (risk of silent fallback with R3).
23
-
24
- - **PR-A (protocol/contract)**: R5 + R6 + R7 + governance §1f/§7f/§7g + us017/us018/us019
25
- - **PR-B (runtime/state)**: R8 + R9 + R10 + R11 + governance §8/§7a + us020/us021/us022/us023
26
- - Both PRs include the self-verification mapping scenarios (each evaluates only its own fixes). The final self-verification (7/7) runs separately after PR-B merges.
27
-
28
- However, if the user directly re-requests a "single PR", proceed with a single PR but strengthen the self-verification scenarios (per-row mechanical assertions required).
29
-
30
- ## RALPLAN-DR
31
-
32
- **Principles** (4):
33
- 1. **Fail loud, not silent**: A4 fallback, block-as-success, cross-mission leak, and cost-log silence are all silent-failure patterns.
34
- 2. **Backward-compat first**: state explicitly how existing wrappers treat a malformed instance of the new verify_partial status. The test-spec lint evolves in stages from warn to strict.
35
- 3. **Minimal blast radius**: PR split plus a separate helper per fix. The regression for each fix is an independent us_test.
36
- 4. **Self-verification mechanical**: prove with grep + exit code that change X was actually triggered in self-verification scenario Y.
37
-
38
- **Decision Drivers**:
39
- 1. Prevent the consumer wrapper from hitting the same patterns again (83% A4 fallback, cross-mission leak, contract defect silent loop).
40
- 2. After a 7-mission autonomous run, the debug.log [FLOW] events carry a meaningful summary and the audit log is auditable.
41
- 3. An empty cost-log file can be classified as "broken logging", which keeps the audit pipeline trustworthy.
42
-
43
- **Viable Options Comparison**:
44
-
45
- (The option comparison below is the same as v1, with the patches absorbed from the Critic ITERATE round added)
46
-
47
- - **R7 verify_partial schema malformed handling (Architect issue #2)**: when the status is `verify_partial` but `verified_acs` is missing or an empty array, downgrade to `status='blocked'`, `reason='verify_partial_malformed'`. Prevents a Worker autonomy violation.
48
- - **R8 helper-side guard (Critic R8 + Architect issue #3)**: the Verifier's mtime check alone is not enough. Add a hygiene check to `write_blocked_sentinel` itself: if the mtime of memory.md/latest.md is older than the sentinel write time (that is, the worker did not perform the hygiene update), automatically attach `meta.blocked_hygiene_violated=true` to the sentinel JSON. Even if the Worker forgets, the verifier notices immediately.
49
- - **R9 reason canonicalization (Architect issue #3)**: `_canonical_block_reason()` helper that strips the hygiene wrapper prefixes ("hygiene_violated:", "wrapped:") before comparing. Stops R8's hygiene_violated from bypassing the R9 counter.
50
- - **R9 edge cases (Critic R9)**: a first-iter block or mission setup block that is classified with the `infra_failure` reason does not increment the counter (a mission abort would be inappropriate). Explicitly exempt.
51
- - **R10 normalized extractor + quarantine (Architect issue #4 + Critic R10)**: instead of `grep -qE "^## $stale_us[: ]"`, extract in normalized form with `awk '/^##[[:space:]]+(US-[0-9]+)([[:space:]:-]|$)/'` (handles PRD heading variations). Instead of `rm -f`, `mv` to `.sisyphus/quarantine/` (prevents silent destructive actions).
52
- - **R11 trap-based final write (Architect issue #6 + Critic R11)**: drop the init placeholder. Add zsh `trap 'write_cost_log "$ITERATION" || true' EXIT` and guarantee coverage with an early-exit path grep inventory regression.
53
- - **Self-verification per-row functions (Architect issue #5 + Critic Self-V)**: instead of a single monolithic script, seven functions (`test_r5_a4_audit_triggered`, …), each with pre/post counters and a grep that proves the changed function was called.
54
-
55
- ---
56
-
57
- ## Resolution Plan (v2 patches highlighted)
58
-
59
- ### Fix R5: P0-D - A4 fallback tracking + stronger worker prompt
60
-
61
- **Targets**:
62
- 1. `src/scripts/run_ralph_desk.zsh:1587-1595` + `:526-546`: write an audit log entry when A4 fallback fires (`a4-fallback-audit.jsonl`, append).
63
- 2. `src/scripts/init_ralph_desk.zsh` worker prompt: add "Step N+1 (mandatory)" and state the penalty for an auto-generated summary.
64
- 3. Verifier prompt: on detecting an A4 fallback summary, set verdict.meta.iter_signal_quality='auto_generated'.
65
- 4. governance §1f: A4 ratio recommendation (per-mission < 10%).
66
-
67
- **Verification (us017), absorbing the Critic R5 patch**:
68
- - AC1: a4-fallback-audit.jsonl entry is written (zsh fixture)
69
- - **AC1+ (Critic R5)**: pre_count=$(wc -l a4-fallback-audit.jsonl), trigger fixture, assert post_count > pre_count, and the ratio is computed correctly.
70
- - AC2: worker prompt grep finds "Step N+1" + "iter-signal.json with SPECIFIC summary"
71
- - AC3: governance §1f contains the "A4 ratio < 10%" recommendation text and states how it is measured
72
- - AC4 (new): Verifier prompt contains the "auto_generated" detection sentence and names the meta field
73
-
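The audit-and-ratio idea in R5 can be sketched as an append-only JSONL writer plus a percentage calculation. The function name `_emit_a4_fallback_audit` and the file name `a4-fallback-audit.jsonl` come from the plan; the record fields, the body of the function, and the `a4_fallback_percent` helper are assumptions made for illustration.

```bash
LOGS_DIR="${LOGS_DIR:-$(mktemp -d)}"

# Append one JSONL record per A4 fallback (field names are assumptions).
_emit_a4_fallback_audit() {
  local us_id="$1" iteration="$2"
  printf '{"event":"a4_fallback","us_id":"%s","iteration":%s}\n' "$us_id" "$iteration" \
    >> "$LOGS_DIR/a4-fallback-audit.jsonl"
}

# Hypothetical helper: print the integer percentage of iterations that
# fell back to A4, given the total number of iterations in the mission.
a4_fallback_percent() {
  local total_iterations="$1" count=0
  if [ -f "$LOGS_DIR/a4-fallback-audit.jsonl" ]; then
    count=$(grep -c 'a4_fallback' "$LOGS_DIR/a4-fallback-audit.jsonl")
  fi
  [ "$total_iterations" -gt 0 ] || { echo 0; return 0; }
  echo $(( count * 100 / total_iterations ))
}
```

With two fallbacks recorded over ten iterations the helper prints 20, which would exceed the per-mission 10% recommendation stated for governance §1f.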
74
- ### Fix R6: P1-F — test-spec ≥3/AC enforcement (warn default + strict opt-in)
75
-
76
- **Targets**:
77
- 1. `src/scripts/init_ralph_desk.zsh`: `_lint_test_density()` helper:
78
- - Extract the PRD AC count (per-US, `^- AC[0-9]+:` regex)
79
- - Extract the test-spec test count (per-US, counting `^### Test ` or `^\*\*T-` headers)
80
- - When ratio < 3: WARN (default) → log_warn + audit + **show a summary at the end of the init exit message (Critic R6 patch)**; STRICT (`--test-density-strict`) → exit 1.
81
- 2. `src/scripts/run_ralph_desk.zsh` + `src/node/run.mjs`: `--test-density-strict` flag stub.
82
- 3. governance §7f: Test Density Enforcement (WARN+STRICT decision tree).
83
- 4. Worker prompt: strengthen the "≥3 tests/AC (happy + negative + boundary) required" wording.
84
-
85
- **Verification (us018), absorbing the Critic R6 patch**:
86
- - AC1: `--test-density-strict` flag parsing (zsh + Node)
87
- - AC2: WARN default: on a ratio<3 fixture, init exit=0 + audit log entry **+ the last line of stderr/stdout contains the message "Test density warning: US-XXX has N tests for M ACs (ratio=N/M < 3)"**
88
- - AC3: STRICT: on a ratio<3 fixture, init exit=1 + the same message
89
- - AC4: governance §7f text is consistent (Decision tree, no downgrade)
90
-
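The WARN/STRICT behaviour of R6 can be sketched as follows. This is a simplified illustration, not the planned `_lint_test_density()`: the hypothetical `lint_test_density` below counts ACs and tests per file rather than per US, takes the US ID as an argument, and omits the audit log entry. The regexes and the warning text follow the plan.

```bash
# Hypothetical, simplified density lint: one PRD file and one test-spec
# file per call. Arguments: us_id, prd_file, spec_file, strict (0 or 1).
lint_test_density() {
  local us_id="$1" prd_file="$2" spec_file="$3" strict="${4:-0}" acs tests
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  acs=$(grep -cE '^- AC[0-9]+:' "$prd_file" || true)
  tests=$(grep -cE '^(### Test |\*\*T-)' "$spec_file" || true)
  if [ "$acs" -gt 0 ] && [ "$tests" -lt $((acs * 3)) ]; then
    echo "Test density warning: $us_id has $tests tests for $acs ACs (ratio=$tests/$acs < 3)" >&2
    # STRICT mode turns the warning into a failure.
    [ "$strict" = "1" ] && return 1
  fi
  return 0
}
```

Comparing `tests` against `acs * 3` avoids fractional arithmetic, which plain shell integer math cannot express.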
91
- ### Fix R7: P1-G — verify_partial signal vocabulary
92
-
93
- **Targets (Critic R7 + Architect issue #2 patches)**:
94
- 1. `src/scripts/init_ralph_desk.zsh:448` Signal rules: specify verify_partial and its required fields.
95
- 2. `src/scripts/init_ralph_desk.zsh build_verifier_prompt` function (or equivalent prompt heredoc): add this exact sentence:
96
- ```
97
- If signal status=verify_partial, evaluate ONLY verified_acs. Treat deferred_acs as out-of-scope (not fail).
98
- ```
99
- 3. `src/node/runner/campaign-main-loop.mjs` signal parsing: when the status is verify_partial and verified_acs is missing or an empty array:
100
- ```js
101
- if (signalStatus === 'verify_partial' && (!Array.isArray(signal.verified_acs) || signal.verified_acs.length === 0)) {
102
- // Downgrade to blocked
103
- await writeSentinel(blockedSentinel, 'blocked', usId, 'verify_partial_malformed', { reason_category: 'mission_abort', recoverable: true, suggested_action: 'retry_after_fix' });
104
- continue;
105
- }
106
- ```
107
- 4. `src/scripts/run_ralph_desk.zsh:1313+`: equivalent verify_partial handling (zsh-side fallback).
108
- 5. governance §7g (new): Signal Vocabulary Extension, with the malformed downgrade stated explicitly.
109
-
110
- **Verification (us019)**:
111
- - AC1: Signal rules grep verify_partial + verified_acs/deferred_acs/defer_reason
112
- - AC2: governance §7g is consistent and states the malformed downgrade
113
- - AC3: Node parser: on verify_partial, only verified_acs is passed to the verifier prompt (behavioural fixture)
114
- - AC4: zsh parser recognizes verify_partial
115
- - **AC5 (Architect issue #2)**: malformed fixture (verify_partial + verified_acs=[]) → blocked sentinel written + reason='verify_partial_malformed' + reason_category='mission_abort'
116
- - **AC6 (Critic R7)**: the exact sentence is present in the Verifier prompt (grep)
117
-
118
- ### Fix R8: P1-H — Blocked exit hygiene + helper-side guard
119
-
120
- **Targets (Critic R8 + Architect issue #3 patches)**:
121
- 1. `src/scripts/init_ralph_desk.zsh` worker prompt: Blocked exit hygiene section:
122
- > "On blocked exit (status=blocked): BEFORE writing iter-signal.json, ALWAYS append to memory.md § Blocking History `{iter, us, reason, suggested_repair}` AND update latest.md § Known Issues."
123
- 2. **`src/scripts/lib_ralph_desk.zsh:write_blocked_sentinel` (Critic R8 patch)**: hygiene check immediately before the sentinel write:
124
- ```zsh
125
- local hygiene_violated=false
126
- local mem_file="$DESK/memos/$SLUG-memory.md"
127
- local lat_file="$DESK/context/$SLUG-latest.md"
128
- local now_ts=$(date +%s)
129
- for f in "$mem_file" "$lat_file"; do
130
- if [[ -f "$f" ]]; then
131
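- local f_mtime=$(stat -c %Y "$f" 2>/dev/null || stat -f %m "$f" 2>/dev/null || echo 0)  # GNU form first: on GNU coreutils `stat -f` means --file-system and prints unrelated output before failing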
132
- if (( now_ts - f_mtime > 300 )); then
133
- hygiene_violated=true
134
- break
135
- fi
136
- fi
137
- done
138
- ```
139
- Automatically attach `meta.blocked_hygiene_violated=$hygiene_violated` to the JSON sidecar.
140
- 3. `src/node/runner/campaign-main-loop.mjs` `_checkBlockedHygiene()` helper: the equivalent check on blocked write, plus an analytics event.
141
- 4. governance §1f: add "5th channel: memory.md/latest.md hygiene update" (4 channels → 5 channels).
142
-
143
- **Verification (us020)**:
144
- - AC1: Worker prompt grep "Blocked exit hygiene" + "memory.md" + "latest.md"
145
- - AC2: governance §1f grep "5th channel" + "memory.md/latest.md hygiene"
146
- - AC3: Node helper `_checkBlockedHygiene` is defined (grep)
147
- - AC4: behavioural fixture: stale memory.md (mtime > 5min ago) → meta.blocked_hygiene_violated=true in the blocked sentinel JSON sidecar
148
- - **AC5 (Critic R8)**: grep confirms that write_blocked_sentinel in lib_ralph_desk.zsh attaches hygiene_violated automatically, plus a behavioural fixture
149
-
150
- ### Fix R9: P2-I — consecutive_blocks counter + canonicalization + edge cases
151
-
152
- **Targets (Critic R9 + Architect issue #3 patches)**:
153
- 1. `src/scripts/run_ralph_desk.zsh` variables:
154
- ```zsh
155
- CONSECUTIVE_BLOCKS=0
156
- LAST_BLOCK_REASON=""
157
- BLOCK_CB_THRESHOLD="${BLOCK_CB_THRESHOLD:-3}"
158
- ```
159
- 2. **`_canonical_block_reason()` helper (Architect issue #3)**:
160
- ```zsh
161
- _canonical_block_reason() {
162
- local raw="$1"
163
- # Strip wrapper prefixes
164
- echo "$raw" | sed -E 's/^(hygiene_violated:|wrapped:)//' | head -c 80
165
- }
166
- ```
167
- 3. **Edge case exemption (Critic R9)**: an `infra_failure` category or a first-iter block does not increment the counter:
168
- ```zsh
169
- if [[ "$reason_category" == "infra_failure" ]] || (( ITERATION <= 1 )); then
170
- # Exempt from consecutive_blocks
171
- LAST_BLOCK_REASON=""
172
- CONSECUTIVE_BLOCKS=0
173
- else
174
- local canonical=$(_canonical_block_reason "$reason")
175
- if [[ "$canonical" == "$LAST_BLOCK_REASON" ]]; then
176
- CONSECUTIVE_BLOCKS=$((CONSECUTIVE_BLOCKS + 1))
177
- else
178
- CONSECUTIVE_BLOCKS=1
179
- LAST_BLOCK_REASON="$canonical"
180
- fi
181
- if (( CONSECUTIVE_BLOCKS >= BLOCK_CB_THRESHOLD )); then
182
- echo '{"reason":"consecutive_blocks","count":'"$CONSECUTIVE_BLOCKS"',"last_reason":"'"$LAST_BLOCK_REASON"'"}' | atomic_write "$DESK/.sisyphus/mission-abort.json"
183
- exit 1
184
- fi
185
- fi
186
- ```
187
- 4. `src/node/runner/campaign-main-loop.mjs` equivalent (state.consecutive_blocks + last_block_reason + canonicalReason).
188
- 5. governance §8: consecutive_blocks + canonicalization + exemption stated explicitly.
189
-
190
- **Verification (us021)**:
191
- - AC1: BLOCK_CB_THRESHOLD variable is defined (default 3)
192
- - AC2: zsh same-reason counter logic
193
- - AC3: governance §8 text is consistent
194
- - AC4: behavioural: mission-abort.json is created after three BLOCKs with the same reason
195
- - **AC5 (Architect issue #3)**: `_canonical_block_reason` helper is defined, and hygiene_violated prefix stripping is verified
196
- - **AC6 (Critic R9)**: first-iter block exempt fixture (ITERATION=1, reason="setup_fail") → CONSECUTIVE_BLOCKS stays 0
197
- - **AC7 (Critic R9)**: infra_failure category exempt fixture → CONSECUTIVE_BLOCKS stays 0
198
-
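To see how canonicalization and the exemptions in R9 interact, the counter can be exercised in isolation. The sketch below repackages the logic of the snippets above into a hypothetical `record_block` function that returns non-zero instead of writing `mission-abort.json` and exiting; the `canonical_block_reason` name (without the leading underscore) and the return-code convention are assumptions for this illustration.

```bash
BLOCK_CB_THRESHOLD="${BLOCK_CB_THRESHOLD:-3}"
CONSECUTIVE_BLOCKS=0
LAST_BLOCK_REASON=""

# Strip one wrapper prefix so wrapped and unwrapped reasons compare equal.
canonical_block_reason() {
  printf '%s\n' "$1" | sed -E 's/^(hygiene_violated:|wrapped:)//' | head -c 80
}

# Arguments: iteration, reason_category, reason.
# Returns 1 once the threshold is reached (the caller would abort the mission).
record_block() {
  local iteration="$1" category="$2" reason="$3" canonical
  if [ "$category" = "infra_failure" ] || [ "$iteration" -le 1 ]; then
    # Exempt: infra failures and first-iteration blocks reset the counter.
    CONSECUTIVE_BLOCKS=0
    LAST_BLOCK_REASON=""
    return 0
  fi
  canonical=$(canonical_block_reason "$reason")
  if [ "$canonical" = "$LAST_BLOCK_REASON" ]; then
    CONSECUTIVE_BLOCKS=$((CONSECUTIVE_BLOCKS + 1))
  else
    CONSECUTIVE_BLOCKS=1
    LAST_BLOCK_REASON="$canonical"
  fi
  [ "$CONSECUTIVE_BLOCKS" -lt "$BLOCK_CB_THRESHOLD" ]
}
```

With the default threshold, the reasons `contract_defect`, `hygiene_violated:contract_defect`, and `wrapped:contract_defect` on iterations 2, 3, and 4 count as three consecutive blocks of the same canonical reason, so the third call returns 1.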
199
- ### Fix R10: P2-J — Cross-mission us_id leak + normalized extractor + quarantine
200
-
201
- **Targets (Critic R10 + Architect issue #4 patches)**:
202
- 1. `src/scripts/init_ralph_desk.zsh` mission init — stale us_id detect + scrub:
203
- ```zsh
204
- if [[ -f "$SIGNAL_FILE" ]]; then
205
- stale_us=$(jq -r '.us_id // empty' "$SIGNAL_FILE" 2>/dev/null)
206
- if [[ -n "$stale_us" && "$stale_us" != "ALL" ]]; then
207
- # Critic R10: normalized US extractor
208
- prd_us_list=$(awk 'match($0, /^##[[:space:]]+(US-[0-9]+)([[:space:]:-]|$)/, m) { print m[1] }' "$PRD_FILE" 2>/dev/null | sort -u)
209
- if ! echo "$prd_us_list" | grep -qx "$stale_us"; then
210
- # Architect issue #4: quarantine, not rm
211
- mkdir -p "$DESK/.sisyphus/quarantine"
212
- mv "$SIGNAL_FILE" "$DESK/.sisyphus/quarantine/iter-signal.$(date +%s).json"
213
- log " Cross-mission stale us_id ($stale_us) — quarantined to .sisyphus/quarantine/"
214
- fi
215
- fi
216
- fi
217
- ```
218
- Note, however, that BSD awk does not support the three-argument match(), so use the `match() + RSTART/RLENGTH + substr()` pattern or `grep -oE` with post-processing:
219
- ```zsh
220
- prd_us_list=$(grep -oE '^##[[:space:]]+US-[0-9]+([[:space:]:-]|$)' "$PRD_FILE" 2>/dev/null | grep -oE 'US-[0-9]+' | sort -u)
221
- ```
222
- 2. `src/scripts/run_ralph_desk.zsh:2425-2429` final ALL verify scope: trust only US_LIST (if signal_us_id is not in US_LIST, ignore it and warn).
223
- 3. `src/node/runner/campaign-main-loop.mjs`: equivalent handling.
224
- 4. governance §7a: cross-mission us_id leak defense, with the quarantine path stated.
225
-
226
- **Verification (us022)**:
227
- - AC1: init-stage stale us_id detect + quarantine helper (grep + behavioural)
228
- - AC2: zsh runner final ALL verify trusts US_LIST
229
- - AC3: governance §7a text is consistent and states the quarantine path
230
- - AC4: behavioural: fixture mission PRD (US-001~003) + stale signal us_id=US-005 → SIGNAL_FILE is moved to quarantine, and a file exists in .sisyphus/quarantine/
231
- - **AC5 (Architect issue #4)**: rm -f is not used (`grep -n "rm -f.*SIGNAL_FILE" src/scripts/init_ralph_desk.zsh` = 0)
232
- - **AC6 (Critic R10)**: PRD heading variation fixture (`## US-005 -`, `## US-005:`, `## US-005`) → all recognized correctly (0 false positives)
233
-
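The `match() + RSTART/RLENGTH + substr()` alternative mentioned for R10 is named in the plan but not shown. A minimal sketch follows, assuming the same heading shapes as AC6. `extract_prd_us_ids` and `is_stale_us_id` are hypothetical names, and `[ \t]` is used in place of `[[:space:]]` as a conservative choice for awk builds that may lack POSIX character classes.

```bash
# Print the unique US IDs that appear as level-2 headings in a PRD file.
# Uses only the two-argument match(), which BSD awk and gawk both support.
extract_prd_us_ids() {
  awk '
    /^##[ \t]+US-[0-9]+([ \t:-]|$)/ {
      if (match($0, /US-[0-9]+/)) print substr($0, RSTART, RLENGTH)
    }
  ' "$1" | sort -u
}

# Succeeds (exit 0) when the given ID is absent from the PRD, i.e. stale.
# Arguments: us_id, prd_file.
is_stale_us_id() {
  ! extract_prd_us_ids "$2" | grep -qx "$1"
}
```

A caller would quarantine the signal file only when `is_stale_us_id "$stale_us" "$PRD_FILE"` succeeds, which keeps the destructive step behind an exact, whole-line ID comparison.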
234
- ### Fix R11: P2-K — Cost log non-empty + trap-based final write + early-exit inventory
235
-
236
- **Targets (Critic R11 + Architect issue #6 patches)**:
237
- 1. `src/scripts/lib_ralph_desk.zsh:367` write_cost_log: note field ('no_actual_usage_recorded' when bytes=0).
238
- 2. **`src/scripts/run_ralph_desk.zsh` (Architect issue #6)**: register the trap right after entering the main loop:
239
- ```zsh
240
- trap '_emit_final_cost_log' EXIT
241
- _emit_final_cost_log() {
242
- [[ -n "${ITERATION:-}" ]] && [[ "${COST_LOG_FINAL_WRITTEN:-0}" -eq 0 ]] && {
243
- write_cost_log "$ITERATION" 2>/dev/null || true
244
- COST_LOG_FINAL_WRITTEN=1
245
- }
246
- }
247
- ```
248
- 3. **Early-exit path inventory (Critic R11 + Architect NEW-2)**: the us023 regression verifies that every path in the following broadened grep result is within trap coverage:
249
- ```bash
250
- grep -nE '^[[:space:]]*(exit\b|return\b|die\b)' src/scripts/run_ralph_desk.zsh src/scripts/lib_ralph_desk.zsh | grep -v '^[^:]*:[^:]*:.*\${' > early_exits.txt
251
- ```
252
- If a `die` wrapper function is defined in `lib_ralph_desk.zsh`, explicitly analyze whether it bypasses the trap and include it in the regression.
253
- 4. (Delete the init placeholder, Architect issue #6): reinforce only the normal path so that an empty cost-log is detected as "broken logging".
254
- 5. governance §7 Cost Tracking: state the tmux estimated path and the trap.
255
-
256
- **Verification (us023)**:
257
- - AC1: write_cost_log has a note field ('no_actual_usage_recorded' when bytes=0)
258
- - AC2: `trap '_emit_final_cost_log' EXIT` is present in the zsh runner (grep)
259
- - AC3: behavioural: cost-log.jsonl is not empty after write_cost_log is called
260
- - **AC4 (Critic R11)**: early-exit grep inventory + verification that every path is under trap coverage (grep every `exit N` or `return N` location in the scripts and compare against when the trap fires)
261
- - **AC5 (Architect issue #6)**: init placeholder code is absent (grep `placeholder.*cost-log` = 0)
262
-
263
- ---
264
-
265
- ## Self-Verification Scenarios - Mechanical per-row (v2)
266
-
267
- `tests/test_self_verification_0_11_handoff.sh`: seven functions, each with pre/post capture and grep proof:
268
-
269
- ```bash
270
- test_r5_a4_audit_triggered() {
271
- local audit="$LOGS_DIR/a4-fallback-audit.jsonl"
272
- local pre=$(wc -l < "$audit" 2>/dev/null || echo 0)
273
- # Trigger: simulate done-claim without iter-signal
274
- echo '{"us_id":"US-001","status":"complete"}' > "$DESK/memos/${SLUG}-done-claim.json"
275
- rm -f "$DESK/memos/${SLUG}-iter-signal.json"
276
- # NEW-1 (Architect): zsh fixture invocation (run_ralph_desk.zsh is zsh, NOT bash)
277
- # us017 implementation MUST extract A4 fallback into a callable helper in lib_ralph_desk.zsh
278
- # so it can be sourced cleanly. Until then, use zsh -c with explicit DESK/SLUG/ITERATION exports.
279
- zsh -c "export DESK='$DESK' SLUG='$SLUG' ITERATION=1 LOGS_DIR='$LOGS_DIR'; source src/scripts/lib_ralph_desk.zsh; _emit_a4_fallback_audit US-001 1" 2>/dev/null
280
- local post=$(wc -l < "$audit" 2>/dev/null || echo 0)
281
- [[ "$post" -gt "$pre" ]] || { fail "R5 A4 audit not triggered (pre=$pre post=$post)"; return 1; }
282
- # Mechanical: grep that the patched code path was exercised
283
- grep -q "a4_fallback" "$audit" || { fail "R5 audit entry missing"; return 1; }
284
- pass "R5 A4 fallback audit triggered ($pre→$post)"
285
- }
286
-
287
- test_r6_test_density_warn() {
288
- # Fixture: PRD with 3 ACs, test-spec with 1 test
289
- local stderr_capture=$(./init_ralph_desk.zsh --slug test-r6 --prd fixtures/r6-bad-prd.md 2>&1)
290
- echo "$stderr_capture" | grep -q "Test density warning" || { fail "R6 init exit message missing warning"; return 1; }
291
- pass "R6 test density warning emitted to stderr"
292
- }
293
-
294
- # ... R7~R11 follow the same pattern: each function (1) captures pre-state, (2) invokes the changed code directly, (3) verifies post-state with grep
295
- ```
296
-
297
- | Fix | Scenario | Mechanical proof |
298
- |---|---|---|
299
- | R5 P0-D | done-claim written + iter-signal missing → A4 fallback fires | `wc -l a4-fallback-audit.jsonl` pre/post comparison + entry grep |
300
- | R6 P1-F | fixture with 3 test-spec ACs + 1 test | grep for the "Test density warning" line in stderr |
301
- | R7 P1-G | iter-signal status=verify_partial fixture (normal + malformed) | verifier prompt grep `verified_acs only` + malformed → blocked sentinel meta.reason='verify_partial_malformed' |
302
- | R8 P1-H | blocked sentinel + memory.md unchanged 5min+ | extract `meta.blocked_hygiene_violated=true` from the sentinel JSON sidecar with jq |
303
- | R9 P2-I | 3 BLOCKs with the same reason + canonicalization + edge cases | mission-abort.json exists + jq `.count==3` + first-iter exempt fixture verifies CONSECUTIVE_BLOCKS=0 |
304
- | R10 P2-J | PRD US-001~003 + stale signal us_id=US-005 + heading variation | `.sisyphus/quarantine/iter-signal.*.json` exists + original SIGNAL_FILE absent + 3 variation fixtures recognized correctly |
305
- | R11 P2-K | tmux mode 5 iter run + early-exit fixture | `cost-log.jsonl` row count ≥ 5 + every row has a note field + trap fire verified |
306
-
307
- **Pass criterion**: 7/7 mechanical proofs + grep confirmation that each fix actually called the changed function/file (prevents tautology).
308
-
309
- ---
310
-
311
- ## Table of Files to Change
312
-
313
- ```
314
- src/scripts/init_ralph_desk.zsh # R5(worker prompt), R6(test density lint + flag), R7(Signal rules + verifier prompt), R8(blocked exit hygiene), R10(stale us_id quarantine)
315
- src/scripts/run_ralph_desk.zsh # R5(A4 audit), R6(--test-density-strict), R7(verify_partial parsing), R9(consecutive_blocks + canonical + exempt), R10(US_LIST scope), R11(trap)
316
- src/scripts/lib_ralph_desk.zsh # R8(write_blocked_sentinel hygiene_violated), R11(write_cost_log note + bytes=0 path)
317
- src/node/run.mjs # R6(--test-density-strict stub)
318
- src/node/runner/campaign-main-loop.mjs # R7(verify_partial parser + malformed downgrade), R8(_checkBlockedHygiene), R9(consecutive_blocks state), R10(stale us_id scrub)
319
- src/governance.md # R5(§1f A4 metric), R6(§7f Test Density), R7(§7g Signal Vocabulary + malformed), R8(§1f 5th channel), R9(§8 cb + canonicalization + exempt), R10(§7a quarantine)
320
-
321
- [Tests]
322
- tests/test_us017_a4_fallback_audit.sh
323
- tests/test_us018_test_density.sh
324
- tests/test_us019_verify_partial.sh
325
- tests/test_us020_blocked_hygiene.sh
326
- tests/test_us021_consecutive_blocks.sh
327
- tests/test_us022_cross_mission_us_leak.sh
328
- tests/test_us023_cost_log_nonempty.sh
329
- tests/test_self_verification_0_11_handoff.sh # mechanical per-row
330
- ```
331
-
332
- ## Verification (Self-Verification Gate)
333
-
334
- 1. **LOW**: `zsh -n` + `node --check` (~10s)
335
- 2. **MEDIUM**: 7 new regressions, us017~us023 (~3min)
336
- 3. **CRITICAL**: existing regressions us001/us007/us012/us013/us014/us015/us016 pass with no losses (~3min)
337
- 4. **Self-verification mapping scenarios**: `test_self_verification_0_11_handoff.sh` 7/7 mechanical proofs
338
-
339
- ## Single-PR Decision (when the user specifies it)
340
-
341
- If the user rejects the PR split and specifies a single PR:
342
- - R5+R6+R7 (protocol) + R8+R9+R10+R11 (runtime) in a single PR
343
- - The self-verification scenarios cover both areas, so the guarantee holds
344
- - However, reaching codex review iteration 5+ automatically triggers the split fallback
345
-
346
- ## ADR (concise)
347
-
348
- - **Decision**: Seven fixes. v2 patches: PR split recommended (single PR if the user specifies it), R7 schema fallback (verify_partial_malformed downgrade), R8 helper-side hygiene check, R9 canonical reason + edge exempt, R10 normalized extractor + quarantine, R11 trap-based final write + early-exit inventory, self-verification mechanical per-row.
349
- - **Drivers**: making silent failures visible + backward-compat + minimal blast radius + mechanical self-verification.
350
- - **Alternatives considered (the v1 table for each R + the new v2 patches)**.
351
- - **Consequences**: merge PR-A first and soak, then PR-B (recommended). A single PR is also possible. Worker prompt length grows slightly. Many test-spec WARNs may appear (moving to strict gradually).
352
- - **Follow-ups**: make test-density STRICT the default (v0.12+), automatically retry verify_partial deferred_acs first, hard fail once A4 fallback is at 0%.