okstra 0.52.0 → 0.53.0

Files changed (40)
  1. package/README.kr.md +1 -1
  2. package/README.md +1 -1
  3. package/docs/kr/architecture.md +1 -0
  4. package/docs/kr/cli.md +2 -1
  5. package/docs/superpowers/plans/2026-06-06-final-verification-whole-task-gate.md +993 -0
  6. package/docs/superpowers/plans/2026-06-06-stage-parallel-and-pending-fixes.md +93 -0
  7. package/docs/superpowers/plans/2026-06-06-stage-worktree-isolation-p1.md +447 -0
  8. package/docs/superpowers/plans/2026-06-06-stage-worktree-isolation-p2.md +289 -0
  9. package/docs/superpowers/plans/2026-06-06-stage-worktree-isolation-p3.md +774 -0
  10. package/docs/superpowers/plans/2026-06-06-stage-worktree-isolation-p4.md +303 -0
  11. package/docs/superpowers/plans/2026-06-06-stage-worktree-isolation-p5-multidep-base.md +387 -0
  12. package/docs/superpowers/specs/2026-06-06-final-verification-whole-task-gate-design.md +126 -0
  13. package/docs/superpowers/specs/2026-06-06-stage-worktree-isolation-design.md +180 -0
  14. package/package.json +1 -1
  15. package/runtime/BUILD.json +2 -2
  16. package/runtime/agents/workers/report-writer-worker.md +1 -0
  17. package/runtime/bin/lib/okstra/cli.sh +5 -1
  18. package/runtime/bin/okstra.sh +1 -0
  19. package/runtime/prompts/launch.template.md +1 -0
  20. package/runtime/prompts/profiles/_implementation-deliverable.md +1 -1
  21. package/runtime/prompts/profiles/_implementation-executor.md +16 -9
  22. package/runtime/prompts/profiles/_implementation-verifier.md +4 -1
  23. package/runtime/prompts/profiles/final-verification.md +7 -7
  24. package/runtime/prompts/profiles/implementation-planning.md +8 -4
  25. package/runtime/prompts/wizard/prompts.ko.json +3 -2
  26. package/runtime/python/okstra_ctl/analysis_packet.py +14 -2
  27. package/runtime/python/okstra_ctl/render.py +3 -0
  28. package/runtime/python/okstra_ctl/run.py +541 -41
  29. package/runtime/python/okstra_ctl/wizard.py +25 -7
  30. package/runtime/python/okstra_ctl/worktree.py +126 -9
  31. package/runtime/python/okstra_ctl/worktree_registry.py +88 -17
  32. package/runtime/schemas/final-report-v1.0.schema.json +36 -0
  33. package/runtime/skills/okstra-convergence/SKILL.md +14 -3
  34. package/runtime/skills/okstra-run/SKILL.md +1 -1
  35. package/runtime/templates/reports/final-report.template.md +12 -0
  36. package/runtime/templates/reports/final-verification-input.template.md +8 -5
  37. package/runtime/templates/reports/i18n/en.json +3 -1
  38. package/runtime/templates/reports/i18n/ko.json +3 -1
  39. package/runtime/validators/validate-run.py +143 -1
  40. package/runtime/validators/validate-workflow.sh +6 -1
package/docs/superpowers/specs/2026-06-06-stage-worktree-isolation-design.md ADDED
@@ -0,0 +1,180 @@
1
+ # Per-stage worktree isolation + concurrent parallel execution: design
2
+
3
+ - Date written: 2026-06-06
4
+ - Scope: in the `implementation` phase, isolate each stage in **its own git worktree** so that multiple `implementation` runs the user starts manually at the same time can safely work on different stages. `started-exclusion` (A2) is folded into this same design.
5
+ - Out of scope
6
+ - **No automatic fan-out**: okstra does not automatically branch the ready stages into multiple processes. The only supported parallel trigger is **manual concurrency**, where the user launches one run per stage.
7
+ - **No automatic merge by okstra**: joining stage branches is a manual merge by the user (or collection by release-handoff).
8
+ - The worktree model for phases other than `implementation` (`requirements-discovery` / `error-analysis` / `implementation-planning` / `final-verification` / `release-handoff`) is unchanged: the existing single task-key worktree is kept.
9
+ - ADR↔gitignore (C1), which has a separate plan, and multilingual support / i18n.
10
+ - Relationship: this design builds on the stage concept and carry-in model of [`2026-05-20-implementation-planning-multi-stage-design.md`](2026-05-20-implementation-planning-multi-stage-design.md). It resolves the **unimplemented started-exclusion** that [`2026-06-04-stage-run-batching.md`](../plans/2026-06-04-stage-run-batching.md) left as a known gap. It is compatible with the "parallelism = side effect, run batch = cost unit" principle of [`2026-06-04-stage-splitting-cost-aware-design.md`](2026-06-04-stage-splitting-cost-aware-design.md) (this design only makes the side-effect parallelism **safe**; it does not change the splitting criteria).
11
+
12
+ ## 1. Motivation
13
+
14
+ In `fontradar-v2-api:dev-9184`, the user launched stage2 and stage3 concurrently as two `implementation` runs. Both runs **shared the same task-key worktree and branch**, so their commits were interleaved on a single branch. There are two root causes:
15
+
16
+ 1. **There is only one worktree per task-key.** [`compute_worktree_path`](../../../scripts/okstra_ctl/worktree.py:489)·[`compute_branch_name`](../../../scripts/okstra_ctl/worktree.py:511) take neither a run-seq nor a stage, and every phase reuses the same worktree ([worktree.py:5-9](../../../scripts/okstra_ctl/worktree.py:5)). Two stage runs of the same task edit files in the same directory and commit to the same branch.
17
+ 2. **started-exclusion is not implemented.** [`_resolve_effective_stages`](../../../scripts/okstra_ctl/run.py:264) looks only at `done` in `consumers.jsonl` and ignores `started`, so while a stage is `started` (not yet finished) another run can pick up the same ready-set again via `auto` (known gap: [stage-run-batching.md:13](../plans/2026-06-04-stage-run-batching.md); `tests/test_e2e_multi_stage_q1_q9.py::test_q7` pins the current behavior).
18
+
19
+ What the user wants is "true concurrent parallelism". This design makes that safe by isolating stages in worktrees and making stage claims atomic.
20
+
21
+ ## 2. Core principles
22
+
23
+ ### 2.1 Isolation model: only `implementation` gets stage worktrees
24
+
25
+ The stage concept exists only in `implementation`. Stage worktree isolation therefore happens only in the `implementation` phase.
26
+
27
+ - `requirements-discovery` through `implementation-planning`: the existing single task-key worktree, unchanged (a single flow with no branching).
28
+ - `implementation`: each run executes the stage it owns (or its ready batch) in a **dedicated stage worktree**.
29
+
30
+ **Non-git / nested-worktree degradation:** when `project_root` is not a git repo (`skipped-not-git`) or is already inside another worktree (`skipped-in-worktree`), [`provision_task_worktree`](../../../scripts/okstra_ctl/worktree.py) in the surrounding flow lets ordinary phases degrade gracefully (no task worktree is issued, `EXECUTOR_WORKTREE_PATH=project_root`). `implementation` stage isolation degrades on the same signals: it skips issuing a stage worktree, leaves the ctx's `EXECUTOR_WORKTREE_*` values untouched, and records only the `started` row for the selected stage in `consumers.jsonl`. (This path is a fallback for render-only tests and non-git projects. Real stage isolation and concurrent parallelism work only in git projects.)
31
+
32
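+ ### 2.2 Choosing the base: a git-native implementation of carry-in
33
+
34
+ | Stage kind | Worktree base | Concurrency |
35
+ |---|---|---|
36
+ | **Independent** (`depends-on (none)`) | **Common base** = the task-key worktree HEAD at first entry into `implementation` (= the state at the end of planning) | Can run **concurrently** with each other |
37
+ | **Single dependency** (`depends-on X`) | The **done commit** of predecessor stage X (stacked on top of that stage's branch) | Starts after X is done |
38
+ | **Multiple dependencies** (`depends-on X,Y…`) | Task-key worktree HEAD (candidate), only when every predecessor's `done.head_commit` is an **ancestor of the candidate** (= the user has merged the predecessors); see §9 #1 | After all predecessors are done and merged |
39
+
40
+ - The common base is **decided and pinned once, at first stage entry** (§3.2 below). In dev-9184 this is the point where the stages started from `8a18f99` (main with another task's merge applied).
41
+ - The base of a **single-dependency** stage Y is the done commit of its predecessor stage X. X's done is already recorded in `consumers.jsonl` as `head_commit` (the `stage:1 done head_commit:b3971782…` row in the dev-9184 consumers file is evidence of this). Y is stacked on X's branch, so merging Y brings X along.
42
+ - For a **multi-dependency** stage the predecessors are scattered across different stage branches, so there is no obvious single base commit. Option A auto-detection (§9 #1): take the task-key worktree HEAD as the candidate; if every predecessor stage's `done.head_commit` is an ancestor of the candidate (`git merge-base --is-ancestor`), treat the predecessors as merged by the user and issue the candidate as the base. If any one is not an ancestor, raise `PrepareError` with the guidance "merge the predecessor stage branches, then retry".
43
+ - Conclusion: **true concurrent execution holds only between independent stages**, and the dev-9184 problem of "a dependent stage starting from a base without its predecessor's changes" disappears structurally.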
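
The three rows of the base table above can be read as one decision function. The following is a minimal sketch, not okstra's actual code: `select_stage_base`, its parameters, and the local `PrepareError` stand-in are hypothetical names chosen for illustration, and the ancestor check is injected as a callable so the logic can be exercised without a git repository.

```python
from typing import Callable, Mapping, Sequence


class PrepareError(RuntimeError):
    """Local stand-in for okstra's PrepareError (the real class is not shown in this diff)."""


def select_stage_base(
    depends_on: Sequence[int],
    common_base: str,
    done_commits: Mapping[int, str],
    task_head: str,
    is_ancestor: Callable[[str, str], bool],
) -> str:
    """Pick the commit a stage worktree should start from (hypothetical helper)."""
    missing = [d for d in depends_on if d not in done_commits]
    if missing:
        raise PrepareError(f"predecessor stage(s) not done: {missing}")
    if not depends_on:
        # Independent stage: start from the pinned common base.
        return common_base
    if len(depends_on) == 1:
        # Single dependency: stack on the predecessor's done commit.
        return done_commits[depends_on[0]]
    # Multiple dependencies: the task worktree HEAD qualifies only if the
    # user has already merged every predecessor into it.
    unmerged = [d for d in depends_on if not is_ancestor(done_commits[d], task_head)]
    if unmerged:
        raise PrepareError(f"merge stage branch(es) {unmerged} first, then retry")
    return task_head
```

Injecting `is_ancestor` keeps the branching rules testable on their own; a real caller would presumably back it with `git merge-base --is-ancestor`.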
44
+
45
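+ ### 2.3 Claiming / concurrency: the registry flock reservation is the SSOT (A2 folded in)
46
+
47
+ - **Extend the worktree registry to stage-key granularity** and make reservations **atomic** with flock (via [worktree_registry](../../../scripts/okstra_ctl/worktree.py:47)).
48
+ - **Ready set** = every `depends-on` is `done` **AND** the stage has no `started`/`done` row in `consumers.jsonl` **AND** the registry holds no active reservation for it.
49
+ - If two concurrent runs try to take the same stage, **the flock reservation serializes them**: only one reservation succeeds, and the other run either takes the next ready stage or exits normally with "no stage to take".
50
+ - **This is the SSOT for claims.** The `started` row in `consumers.jsonl` is for recording and observation; the actual concurrency backstop is the registry flock reservation. A2 (started-exclusion) is satisfied by adding the `started`/reservation exclusions to the ready-set computation.
51
+ - **One run = one stage** (single-stage execution). The "ready-set batch" in [`cost-aware-design.md §2.3`](2026-06-04-stage-splitting-cost-aware-design.md) assumed that stages execute sequentially in one worktree on one branch. Because this design requires an isolated worktree and an isolated branch per stage, batches lose their meaning (trying to reserve two stage-keys on the same branch violates the branch-uniqueness invariant of the P1 registry). `_resolve_effective_stages` still returns a batch list for backward compatibility, but the implementation integration path executes **only the first ready stage**. To work on several stages at once, the user launches separate runs (this is the concurrent parallelism of this design); to proceed sequentially, the user launches the next run as each stage becomes done. Either way the cost is equivalent.
52
+
53
+ > **declaration ↔ enforcement** (okstra `CLAUDE.md` rule 3): the enforcement point for the claim rule is the **runtime registry reservation**. If `prepare_task_bundle` tries to take a stage-key that already has an active reservation, it aborts with `PrepareError`. A profile statement (MUST) alone is not treated as sufficient.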
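
To make the "flock reservation serializes them" rule concrete, here is a minimal sketch of an atomic stage reservation against a JSON registry file. It is an illustration under stated assumptions: `reserve_stage`, the file layout, and the local `PrepareError` are hypothetical, the real `worktree_registry.py` is not shown in this section, and `fcntl.flock` is POSIX-only.

```python
import fcntl
import json
import os


class PrepareError(RuntimeError):
    """Local stand-in for okstra's PrepareError."""


def reserve_stage(registry_path: str, task_key: str, stage: int, branch: str) -> str:
    """Reserve <task_key>#stage-<N> under an exclusive lock (hypothetical helper)."""
    stage_key = f"{task_key}#stage-{stage}"
    fd = os.open(registry_path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as fh:
        # Blocks until any other run holding the lock releases it, so the
        # read-check-write sequence below never interleaves between runs.
        fcntl.flock(fh, fcntl.LOCK_EX)
        raw = fh.read()
        registry = json.loads(raw) if raw.strip() else {"tasks": {}}
        entry = registry["tasks"].get(stage_key)
        if entry and entry.get("status") == "active":
            raise PrepareError(f"{stage_key} is already reserved")
        registry["tasks"][stage_key] = {
            "branch": branch,
            "stage": stage,
            "status": "active",
        }
        fh.seek(0)
        fh.truncate()
        json.dump(registry, fh, indent=2)
        fh.flush()
    # Closing the file releases the lock.
    return stage_key
```

The design choice illustrated here is that the check for an existing reservation and the write of the new one happen inside a single lock hold; a second run that loses the race sees the active entry and fails fast instead of sharing the stage.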
54
+
55
+ ## 3. Data model
56
+
57
+ ### 3.1 Worktree key / branch
58
+
59
+ ```
60
+ Path: ~/.okstra/worktrees/<project>/<group>/<task-id>/stage-<N>/
61
+ Branch: <work-category-prefix>-<task-id>-s<N>
62
+ ```
63
+
64
+ - Add an optional `stage_number` to [`compute_worktree_path`](../../../scripts/okstra_ctl/worktree.py:489). When it is None, the existing task-key path is used (compatible with the other phases).
65
+ - Add an optional `stage_number` to [`compute_branch_name`](../../../scripts/okstra_ctl/worktree.py:511). When it is None, the existing `<prefix>-<task-id>` is used.
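
The naming scheme above could look roughly like this. The function names come from the spec, but the parameter lists are assumptions made for this sketch; the real signatures in `worktree.py` are not shown here.

```python
from pathlib import Path
from typing import Optional


def compute_worktree_path(
    root: Path,
    project: str,
    group: str,
    task_id: str,
    stage_number: Optional[int] = None,
) -> Path:
    """Task-key path, with a stage-<N> suffix only when a stage is given."""
    base = root / project / group / task_id
    return base if stage_number is None else base / f"stage-{stage_number}"


def compute_branch_name(
    prefix: str,
    task_id: str,
    stage_number: Optional[int] = None,
) -> str:
    """<prefix>-<task-id>, with a -s<N> suffix only when a stage is given."""
    name = f"{prefix}-{task_id}"
    return name if stage_number is None else f"{name}-s{stage_number}"
```

Keeping `stage_number` optional means callers from other phases need no change: omitting it yields the existing task-key path and branch.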
66
+
67
+ ### 3.2 Registry entries: task-key and stage-key coexist
68
+
69
+ The existing task-key entry (used by the other phases) is kept, and a stage-key entry is added for each `implementation` stage. The **pinned common base** is recorded once on the task-key entry.
70
+
71
+ ```jsonc
72
+ "tasks": {
73
+ "<proj>/<group>/<task-id>": { // existing (other phases)
74
+ "branch": "<prefix>-<task-id>",
75
+ "implementationBaseCommit": "8a18f99…", // new: pinned once at first stage entry
76
+ ...
77
+ },
78
+ "<proj>/<group>/<task-id>#stage-2": { // new (implementation stage)
79
+ "branch": "<prefix>-<task-id>-s2",
80
+ "worktree_path": ".../stage-2",
81
+ "base_ref": "8a18f99…", // independent: common base
82
+ "stage": 2, "status": "active"
83
+ },
84
+ "<proj>/<group>/<task-id>#stage-3": {
85
+ "branch": "<prefix>-<task-id>-s3",
86
+ "base_ref": "b3971782…", // dependent: predecessor stage's done commit
87
+ "stage": 3, "status": "active"
88
+ }
89
+ }
90
+ ```
91
+
92
+ - `implementationBaseCommit` is written once, inside the flock, when the first stage is reserved. Even concurrent first entries are serialized, so the second run reads the recorded value (no race).
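
The write-once behavior of `implementationBaseCommit` can be sketched as a set-if-absent on the task-key entry. `pin_implementation_base` is a hypothetical helper name, and the sketch assumes the caller already holds the registry lock, since the dictionary update itself provides no cross-process protection.

```python
def pin_implementation_base(registry: dict, task_key: str, head_commit: str) -> str:
    """Record the common base on first call; later calls return the pinned value.

    Assumes the caller holds the registry lock for the whole read-modify-write.
    """
    entry = registry.setdefault("tasks", {}).setdefault(task_key, {})
    # setdefault writes head_commit only when the key is absent.
    return entry.setdefault("implementationBaseCommit", head_commit)
```

A second run that enters with a different HEAD (for example after an unrelated merge) still receives the first run's value, which is what keeps independent stages on one shared base.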
93
+
94
+ ### 3.3 Ready-set computation (A2)
95
+
96
+ Extend the ready check in `_resolve_effective_stages` (the `requested == "auto"` path):
97
+
98
+ ```
99
+ ready = [s for s in stages
100
+ if s.number not in done_stages
101
+ and s.number not in started_stages # ← new for A2
102
+ and s.number not in reserved_stages # ← new: registry reservation
103
+ and all(d in done_stages for d in s.depends_on)]
104
+ ```
105
+
106
+ The explicit `--stage N` path tries to reserve that single stage directly, and raises `PrepareError` if it is already reserved or done.
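
For readers who want to run the ready-set filter, here is a self-contained version of the comprehension from §3.3. The `Stage` dataclass and the `ready_stages` wrapper are assumptions for illustration; the spec only shows the comprehension itself.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Stage:
    """Minimal stand-in for a Stage Map row."""
    number: int
    depends_on: Tuple[int, ...] = ()


def ready_stages(
    stages: Iterable[Stage],
    done_stages: Set[int],
    started_stages: Set[int],
    reserved_stages: Set[int],
) -> List[Stage]:
    """Stages whose dependencies are done and that nobody has claimed yet."""
    return [
        s for s in stages
        if s.number not in done_stages
        and s.number not in started_stages      # started-exclusion (A2)
        and s.number not in reserved_stages     # active registry reservation
        and all(d in done_stages for d in s.depends_on)
    ]
```

With stage 1 done, stage 2 started by another run, and nothing reserved, only stage 3 is ready in a plan where 2 and 3 depend on 1 and stage 4 depends on both.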
107
+
108
+ ## 4. Flow
109
+
110
+ ```
111
+ implementation-planning done (task-key worktree HEAD = common base candidate)
112
+
113
+ User launches two runs at once: okstra-run --stage 2 │ okstra-run --stage 3
114
+ ▼ ▼
115
+ prepare(stage 2) prepare(stage 3)
116
+ flock: flock:
117
+ - pin implementationBaseCommit (once) - (already pinned, read)
118
+ - check ready/reservations → reserve stage-2 - check ready/reservations → reserve stage-3
119
+ - base=common (independent) - base=common (independent; depends-on is only the same Stage1)
120
+ - worktree add stage-2 @ base - worktree add stage-3 @ base
121
+ ▼ ▼
122
+ executor(stage-2) → commit → consumers executor(stage-3) → commit → consumers
123
+ stage-2 done + carry/stage-2.json stage-3 done + carry/stage-3.json
124
+ ▼ ▼
125
+ └──────── manual merge by the user (in dependency order) ────────┘
126
+ release-handoff collects the stage PR list
127
+ ```
128
+
129
+ A dependent stage (e.g. Stage 4 `depends-on 2,3`) is not ready until 2 and 3 are done, so it is serialized automatically.
130
+
131
+ ## 5. Files to change (seed)
132
+
133
+ | File | Change |
134
+ |---|---|
135
+ | [`worktree.py`](../../../scripts/okstra_ctl/worktree.py:489) | Optional `stage_number` on `compute_worktree_path`/`compute_branch_name`; stage awareness and base computation in `provision_task_worktree` (independent = common `implementationBaseCommit`, dependent = predecessor done commit) |
136
+ | [`worktree_registry.py`](../../../scripts/okstra_ctl/worktree.py:47) | stage-key (`<task-key>#stage-<N>`) reservation/lookup; pin `implementationBaseCommit` once (inside the flock) |
137
+ | [`run.py`](../../../scripts/okstra_ctl/run.py:264) | Exclude `started`/reserved stages in `_resolve_effective_stages` (A2); wire `_reserve_implementation_stages` ([run.py:890](../../../scripts/okstra_ctl/run.py:890)) to stage worktree issuance; `PrepareError` on a duplicate `--stage N` reservation |
138
+ | [`_implementation-executor.md`](../../../prompts/profiles/_implementation-executor.md:30) | Recast "owns a ready-set batch" in the stage worktree context: the executor works only in its own stage worktree and must not access other stage worktrees |
139
+ | [`okstra.sh`](../../../scripts/okstra.sh) · [`cli.sh`](../../../scripts/lib/okstra/cli.sh) | `--stage` passthrough (B2): the CLI entry point for manual concurrent parallelism |
140
+ | [`release-handoff.md`](../../../prompts/profiles/release-handoff.md) | Collect the PR list for stage branches (`-s<N>`) from `consumers.jsonl` |
141
+ | `validators/` (next to `validate-run.py`) | Reject duplicate stage-key reservations at prepare time (states the runtime enforcement point explicitly) |
142
+
143
+ Seed rule (`feedback_okstra_fixes_target_end_users`): changes are made only to the source files above, not to `runtime/` or a personal `.claude/`.
144
+
145
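+ ## 6. Teardown / cleanup
146
+
147
+ Once all stages are done and merged, the stage worktrees are cleaned up. The existing manual procedure (`git worktree remove <path>` → `git branch -D <branch>` → delete the registry key) is **extended to N stage-keys**. Automatic teardown is out of scope for this design (manual by the user, or a follow-up spec).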
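
Since teardown stays manual, a small helper that only prints the git commands for N stages may be handy. This is a sketch under assumptions: `teardown_commands` and its parameters are hypothetical, it does not execute anything, and it does not cover the third step (deleting the registry keys), whose file format is not fully specified here.

```python
from typing import Iterable, List


def teardown_commands(
    task_worktree_root: str,
    branch_prefix: str,
    task_id: str,
    stages: Iterable[int],
) -> List[List[str]]:
    """Build (not run) the per-stage cleanup commands, in ascending stage order."""
    commands: List[List[str]] = []
    for n in sorted(stages):
        commands.append(["git", "worktree", "remove", f"{task_worktree_root}/stage-{n}"])
        commands.append(["git", "branch", "-D", f"{branch_prefix}-{task_id}-s{n}"])
    return commands


if __name__ == "__main__":
    for argv in teardown_commands("~/.okstra/worktrees/proj/grp/t1", "feat", "t1", [3, 2]):
        print(" ".join(argv))
```

Returning argument lists rather than running them leaves the destructive step to the user, which matches the manual-teardown stance of this section.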
148
+
149
+ ## 7. Compatibility
150
+
151
+ okstra is pre-1.0 (`feedback_pre_v1_no_compat`). The existing single-worktree `implementation` flow is **replaced by the stage worktree flow**: even an N=1 (single-stage) plan is issued a `stage-1` worktree. There is no compat shim. Other phases are unaffected.
152
+
153
+ ## 8. Verification scenarios (manual QA)
154
+
155
+ | # | Scenario | Expected |
156
+ |---|----------|------|
157
+ | W1 | `implementation` of a 1-stage plan | `stage-1` worktree issued, 1 PR |
158
+ | W2 | Independent stages 2 and 3 run concurrently as two runs, `--stage 2` / `--stage 3` | Each gets its own stage worktree and branch, common base, no conflicts, 2 consumers rows |
159
+ | W3 | W2, but both runs use `auto` concurrently | flock serialization → one run takes 2, the other takes 3 (no double claim) |
160
+ | W4 | Multi-dependency stage 4 (`depends-on 2,3`) run with `--stage 4` before 2 and 3 are done | Not ready → `PrepareError` (wait for predecessors to be done) |
161
+ | W5 | Single-dependency stage (`depends-on 2`) run after 2 is done | base = stage2 done commit, stacked on branch 2, carry-in injected |
162
+ | W6 | Multi-dependency stage 4 run after 2 and 3 are done and integrated | base = §9 rule (integration commit), carry-in injected |
163
+ | W7 | `--stage N` again on a stage that already has an active reservation | `PrepareError` (duplicate reservation rejected): confirms runtime enforcement |
164
+ | W8 | Any phase other than `implementation` | Existing task-key worktree as is (no stage branching) |
165
+
166
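+ ## 9. Open issues / to settle before writing-plans
167
+
168
+ 1. **Base strategy for multi-dependency stages** (§2.2):
169
+ - **(Option A, tentative)** After the predecessor stages have been integrated onto one line by the user's manual merge, use that integration commit as the base. → A multi-dependency stage waits for the predecessor merge (natural serialization). Simple, and consistent with the "manual merge" decision in §5. Drawback: if the user delays the merge, the multi-dependency stage is blocked.
170
+ - **(Option B)** okstra creates a base by temporarily octopus-merging the predecessor done commits. → The stage can start before the user merges, but responsibility for resolving merge conflicts moves into okstra (which conflicts with automatic merging being out of scope).
171
+ - **Decided: Option A** (approved by the user on 2026-06-06). A multi-dependency stage becomes ready after its predecessor stages have been integrated by a user merge. Option B is deferred as a candidate for a follow-up spec.
172
+ - **Auto-detection implementation** (2026-06-06 P5): the base of a multi-dependency stage (`len(depends_on) >= 2`) is determined as follows:
173
+ 1. candidate = task-key worktree HEAD (`git rev-parse HEAD` in `project_root`: the worktree shared by all phases, and the place where the user lands the predecessor stage merges).
174
+ 2. Collect each predecessor stage's `done.head_commit` (consumers.jsonl). If any predecessor lacks a done row or a head_commit, raise `PrepareError` (predecessor incomplete).
175
+ 3. Verify that each predecessor done commit is an **ancestor** of the candidate with `git merge-base --is-ancestor <done> <candidate>` (returncode 0).
176
+ - All are ancestors → the user has merged the predecessors → return the candidate as the base.
177
+ - Any one is not → raise `PrepareError` with the guidance "merge the predecessor stage branches (`-s<X>`/`-s<Y>`) into the task worktree (or merge into main and refresh the worktree), then retry".
178
+ - User workflow: stages X and Y are done → merge each stage branch into the task-key worktree (or main) → the task worktree HEAD has the X and Y done commits as ancestors → the multi-dependency stage automatically picks its base on top of that and starts.
179
+ - Option B (temporary octopus merge by okstra) remains deferred.
180
+ 2. **Automatic teardown of stage worktrees**: currently manual (§6). Candidate for a follow-up spec.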
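
The ancestor check in step 3 maps onto a thin wrapper around `git merge-base --is-ancestor`, which exits 0 when the first commit is an ancestor of the second and 1 when it is not; other exit codes indicate an error such as an unknown commit. The wrapper below is a sketch: `is_ancestor` and its injectable `run` parameter are illustrative choices, not okstra's actual helper.

```python
import subprocess
from typing import Callable


def is_ancestor(
    done_commit: str,
    candidate: str,
    repo_dir: str,
    run: Callable[..., "subprocess.CompletedProcess"] = subprocess.run,
) -> bool:
    """True when done_commit is reachable from candidate in repo_dir."""
    result = run(
        ["git", "merge-base", "--is-ancestor", done_commit, candidate],
        cwd=repo_dir,
        check=False,
    )
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    # Anything else (e.g. 128 for a bad object name) is a real failure,
    # not a "not merged yet" answer.
    raise RuntimeError(
        f"git merge-base failed with exit code {result.returncode}"
    )
```

Distinguishing exit code 1 from other non-zero codes matters here: treating a git error as "not an ancestor" would tell the user to merge branches when the actual problem is, for instance, a missing commit.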
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "okstra",
3
- "version": "0.52.0",
3
+ "version": "0.53.0",
4
4
  "description": "Multi-agent cross-verification orchestrator runtime + Claude Code skills.",
5
5
  "license": "MIT",
6
6
  "author": "devonshin",
package/runtime/BUILD.json CHANGED
@@ -1,5 +1,5 @@
1
1
  {
2
- "package": "0.52.0",
3
- "builtAt": "2026-06-05T15:37:23.478Z",
2
+ "package": "0.53.0",
3
+ "builtAt": "2026-06-06T14:12:31.583Z",
4
4
  "repoRoot": "/home/runner/work/okstra/okstra"
5
5
  }
package/runtime/agents/workers/report-writer-worker.md CHANGED
@@ -100,6 +100,7 @@ Rules (the schema enforces most of these — they are listed here so you know *w
100
100
  - If evidence is missing, write `"I don't know"` in the relevant statement field rather than fabricating confidence.
101
101
  - Cite file paths and line numbers in every `evidence.primary[].source` / `consensus[].evidence` cell.
102
102
  - Preserve every analysis worker's ticket tagging — every row's `ticketId` field carries the ticket key or the task-fallback. For single-ticket runs, set `ticketCoverage` to `{"singleTicket": "<ticket>"}`. For runs that do not require ticket tagging (`release-handoff`, `final-verification`), set `ticketCoverage` to `{"omit": true}`.
103
+ - For `implementation-planning`, populate `implementationPlanning.requirementCoverage` with one row per concrete requirement from the brief / packet, using IDs `R-001`, `R-002`, ... in source order. `coveredBy` MUST name the specific Option Candidate plus Stage/Step that satisfies the requirement. Use `status: "covered"` only when the report's plan actually covers it; otherwise use `gap` or `blocked C-NNN` and ensure the corresponding `Clarification Items` row blocks approval. Do not collapse this into `ticketCoverage`; ticket coverage is not requirement coverage.
103
104
  - When the `Task Type` is `improvement-discovery`, populate `## 5.9 Improvement Candidates` with the 10-column schema enforced by `validators/validate-improvement-report.py`. Source the row IDs (`I-NNN`), lens whitelist, and Source workers patterns from `scripts/okstra_ctl/improvement_lenses.py` — do NOT introduce new lens names or worker prefixes.
104
105
 
105
106
  Write the data.json with your `Write` tool against the absolute `Result Path`. Then invoke the renderer (`Bash`): `python3 scripts/okstra-render-final-report.py <data.json path>`. Confirm both files exist and respond with a short status line: `data.json written to <abs path>; markdown rendered to <abs path>. Sections populated: <count>.`
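
The `requirementCoverage` rule added in this hunk (sequential `R-001`, `R-002`, ... IDs, a `coveredBy` reference, and a status of `covered`, `gap`, or `blocked C-NNN`) can be checked mechanically. The sketch below is illustrative only: the row key `id`, the three-digit reading of `C-NNN`, and the function name are assumptions, and the authoritative shape is whatever `final-report-v1.0.schema.json` defines.

```python
import re
from typing import Dict, List

# Assumed status grammar: "covered", "gap", or "blocked C-NNN" with three digits.
_STATUS = re.compile(r"^(covered|gap|blocked C-\d{3})$")


def check_requirement_coverage(rows: List[Dict[str, str]]) -> List[str]:
    """Return human-readable problems; an empty list means the rows look consistent."""
    problems: List[str] = []
    for index, row in enumerate(rows, start=1):
        expected_id = f"R-{index:03d}"
        if row.get("id") != expected_id:
            problems.append(f"row {index}: expected id {expected_id}, got {row.get('id')!r}")
        status = row.get("status", "")
        if not _STATUS.match(status):
            problems.append(f"row {index}: unrecognized status {status!r}")
        if status == "covered" and not row.get("coveredBy"):
            problems.append(f"row {index}: 'covered' requires a coveredBy reference")
    return problems
```

A row such as `{"id": "R-001", "coveredBy": "Option A / Stage 2 / Step 3", "status": "covered"}` passes, while a `covered` row with an empty `coveredBy` is reported.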
package/runtime/bin/lib/okstra/cli.sh CHANGED
@@ -95,6 +95,10 @@ while [[ $# -gt 0 ]]; do
95
95
  BASE_REF="$(require_option_value --base-ref "${2-}")"
96
96
  shift 2
97
97
  ;;
98
+ --stage)
99
+ STAGE="$(require_option_value --stage "${2-}")"
100
+ shift 2
101
+ ;;
98
102
  --task-type)
99
103
  TASK_TYPE="$(require_option_value --task-type "${2-}")"
100
104
  shift 2
@@ -185,7 +189,7 @@ while [[ $# -gt 0 ]]; do
185
189
  printf ' hint: did you mean --task-id?\n' >&2
186
190
  ;;
187
191
  esac
188
- printf ' valid options: --render-only --resume-clarification --yes --workers --lead-model --claude-model --codex-model --gemini-model --report-writer-model --related-tasks --task-type --project-id --project-root --task-group --task-id --task-brief --directive --clarification-response --approved-plan --approve --implementation-option --no-plan-verification -h|--help\n' >&2
192
+ printf ' valid options: --render-only --resume-clarification --yes --workers --lead-model --claude-model --codex-model --gemini-model --report-writer-model --related-tasks --task-type --project-id --project-root --task-group --task-id --task-brief --directive --clarification-response --approved-plan --approve --implementation-option --stage --no-plan-verification -h|--help\n' >&2
189
193
  usage
190
194
  exit 1
191
195
  ;;
package/runtime/bin/okstra.sh CHANGED
@@ -122,6 +122,7 @@ PY_ARGS=(
122
122
  [[ -n "${CLARIFICATION_RESPONSE_PATH-}" ]] && PY_ARGS+=(--clarification-response "$CLARIFICATION_RESPONSE_PATH")
123
123
  [[ -n "${WORK_CATEGORY-}" ]] && PY_ARGS+=(--work-category "$WORK_CATEGORY")
124
124
  [[ -n "${BASE_REF-}" ]] && PY_ARGS+=(--base-ref "$BASE_REF")
125
+ [[ -n "${STAGE-}" ]] && PY_ARGS+=(--stage "$STAGE")
125
126
  [[ "$RENDER_ONLY" == "true" ]] && PY_ARGS+=(--render-only)
126
127
  [[ "$PLAN_VERIFICATION_ENABLED" == "false" ]] && PY_ARGS+=(--no-plan-verification)
127
128
 
package/runtime/prompts/launch.template.md CHANGED
@@ -16,6 +16,7 @@ Emit one `PROGRESS: <phase-id> <verb-phrase>` line as plain user-facing text at
16
16
  {{PHASE_FORBIDDEN_ACTIONS}}
17
17
  - This run executes `{{WORKFLOW_CURRENT_PHASE}}` only. Do not start `{{WORKFLOW_NEXT_RECOMMENDED_PHASE}}` or any later phase inside this run, even if the user says "다음 단계 진행해" or similar.
18
18
  {{STAGE_BATCH_DIRECTIVE}}
19
+ {{VERIFICATION_TARGET}}
19
20
  - Phase advancement requires a new okstra invocation launched with `--task-type {{WORKFLOW_NEXT_RECOMMENDED_PHASE}}` after this run's final report is written and approved. The lead must not write source code, run builds/migrations/deployments, or otherwise produce artifacts of a different phase from inside this run.
20
21
  - See `Lifecycle Phase Boundaries` in the okstra skill (`agents/SKILL.md`) for the canonical rules and the phase-transition checklist.
21
22
 
package/runtime/prompts/profiles/_implementation-deliverable.md CHANGED
@@ -49,6 +49,6 @@ are collected and convergence finished. Phase 1-5 do not need it.
49
49
 
50
50
  - Parse the executor's `### Stage Carry Evidence` JSON block. If absent or unparsable, end with status `contract-violated` and route to a follow-up `error-analysis`.
51
51
  - For EACH stage in this run's batch: write its JSON verbatim to `runs/<impl-task-key>/carry/stage-<N>.json`. Refuse to overwrite an existing file (one stage = one sidecar; re-runs are out of scope for this version).
52
- - For EACH stage in this run's batch: append a `status:"done"` row to `runs/<plan-task-key>/consumers.jsonl` with `completed_at`, `carry_path`, and the SHA of HEAD. Use the okstra runtime's `consumers_mutex` helper (NOT a raw filesystem write) to honour the lock.
52
+ - For EACH stage in this run's batch: append a `status:"done"` row to `runs/<plan-task-key>/consumers.jsonl` with `completed_at`, `carry_path`, `report_path` (this run's final-report path relative to the run root), and the SHA of HEAD. Use the okstra runtime's `consumers_mutex` helper (NOT a raw filesystem write) to honour the lock. `report_path` lets `final-verification` cite each stage's originating report when assembling its Source Implementation Report list.
53
53
  - The verifier round, Phase 5.5 convergence, and this Phase 6 report run **once per run** over the batch's combined diff — NOT per stage. The single final report covers every batched stage, with a per-stage subsection.
54
54
  - Quote every batched stage's new contents (each sidecar JSON in full, each new consumers row by itself) in the final report's `Stage sidecar evidence` deliverable section.
package/runtime/prompts/profiles/_implementation-executor.md CHANGED
@@ -20,6 +20,7 @@ until Phase 5 ends, then drop from active context for Phase 6/7.
20
20
 
21
21
  - **Coding-conventions preflight (BLOCKING — runs before the first `Edit` / `Write`, and binds the TDD loop below):** load the applicable coding conventions for every language the diff will touch, then state in ONE line which conventions apply (e.g. `Applying TS + hexagonal overlay; domain at src/domains/*/domain/`). Lint/test green is necessary but NOT sufficient — self-mocked tests, interaction-only assertions, and untruthful names all pass a green pipeline; this gate is what keeps them out of the diff.
22
22
  - **Language-specific rules load per situation — never inline them here.** Detect each touched file's language (extension / project manifest) and load the matching reference by reading okstra's installed coding-conventions files directly at `~/.claude/skills/okstra-coding-preflight/` (placed there by `okstra install`): read `languages/<lang>.md` (mock/spy API, idioms, test framework) + `clean-code.md` + any `architecture/*` overlay via the Read tool by absolute path. The skill is `user-invocable: false`, so do NOT rely on Skill-tool auto-invocation — read the files directly. For a ports-and-adapters / NestJS-hex layout (`domain/` + `ports/` + `adapters/`, `*.port.*`), load the hexagonal overlay too. This per-language split is the skill's job — the executor does not carry a multi-language block in context.
23
+ - **Project review rule packs:** also look for project-local review skills in `<PROJECT_ROOT>/skills/*review*`, `<PROJECT_ROOT>/.claude/skills/*review*`, and up to two parent directories' `skills/*review*/SKILL.md`. Read the relevant `SKILL.md` plus referenced `references/*.md` files and apply their rules during implementation. This is a prevention pass, not a PR-comment generation workflow: do not dispatch reviewer subagents from the executor. For Fonts Ninja-style PR review packs, the executor must avoid newly introduced duplicate helper stacks, tautological tests that merely re-call the delegated helper, self-mocking, domain rules in adapters/ports, domain objects outside `domain/`, dead APIs, weak public names, and functions that fail the plain-English read.
23
24
  - **Language-agnostic principles that ALWAYS bind (the TDD loop below MUST satisfy them):** (1) no self-mocking of the SUT — stub/spy only injected collaborators, never the subject's own methods; (2) behavioral assertions on outcomes (return value, state, persisted rows, events, boundary calls) — never `toHaveBeenCalled*` on an internal helper as the only/primary assertion; (3) truthful names — a `get*` / `find*` that writes/inserts, or a name encoding the caller's use-case (`*ForInit`) or hiding a domain rule (`findValid*`), is a defect; (4) single-purpose functions ≤50 effective lines, plain-English readability.
24
25
  - **Graceful degradation (codex / gemini executor runtimes, or any runtime where the `~/.claude/skills/okstra-coding-preflight/` files are absent or unreadable):** do NOT skip the gate — apply the agnostic principles above plus the project's own `CLAUDE.md` / `CONTRIBUTING` / formatter+lint config, and record `coding-conventions: skill-unavailable → applied <project rules + agnostic principles>` in the final report. Never claim a skill read that did not happen.
25
26
  - **Mandatory TDD loop**: BEFORE the first `Edit` or `Write` call, the executor MUST apply a red-green-refactor loop for every code change in this run. This is required; skipping it is a `contract-violated` outcome. This governs HOW each step is executed (failing test first → minimal implementation → refactor); it does not override the approved plan's WHAT/file scope.
@@ -27,23 +28,29 @@ until Phase 5 ends, then drop from active context for Phase 6/7.
27
28
  - Doc-only / config-only / pure-rename steps that have no observable runtime behaviour are exempt from the failing-test requirement, but the executor MUST cite the exemption per step in the final report (`TDD exemption: <reason>`).
28
29
  - When the touched area has no existing test harness, the executor MUST stand up the minimum harness needed to host one regression test for this run rather than skipping TDD entirely. Record the harness-bootstrap step as an `Out-of-plan edit` if it is not in the plan.
29
30
  - **DB / IO / SQL changes require real execution — mock-only is NOT validation evidence:** when this run's diff touches DB/IO/SQL (ORM / query-builder code — sequelize / typeorm / prisma / knex / raw SQL — `*.repository.*`, model/entity files, `migrations/**`, `*.sql`, or any changed query string), a mocked unit test cannot observe the SQL the query builder actually emits — a mocked suite once passed while `count({ col: 'FontFamily.fontFamily' })` threw `Unknown column` on the real DB. The executor MUST run the change against a real (or faithful-replica) datastore — the `db-test` validation step (plan `validation` db step, else `project.json.qaCommands.db-test`), targeting a **local / replica** DB — and cite its exact command + exit code in the final report's `Validation evidence`. If no real DB / `db-test` command is reachable, do NOT claim the change verified: label the DB portion `정적 분석상 …, 미검증(실행 안 함)` in the report, surface it in the routing recommendation, and never downplay the real run as "too heavy". `git push` stays forbidden (universal list); the unverified DB state is carried forward so `final-verification` cannot accept it and `release-handoff` cannot push.
30
- - re-read the approved plan end-to-end and parse the `## 5.5 Stage Map`. Read the **Stage batch** injected in the launch prompt (`Stage batch for this implementation run`): it lists the stage numbers this run owns, ascending. The runtime already selected and reserved this batch — do NOT recompute the start stage from `consumers.jsonl`.
31
- - for each stage in the batch, load every `runs/<plan-key>/carry/stage-<i>.json` for `i ∈ depends-on(stage)` and inject them into the executor's working context as "runtime carry-in". For `depends-on (none)` stages, no sidecar load — task-brief only.
32
- - the batch's stages are mutually independent (each one's `depends-on` are all already `status:done`, never another batch member), so execute them in ascending order; each stage's file list, step order, Stage Validation commands, Stage Exit Contract, and rollback path are the authoritative scope for that stage.
31
+ - re-read the approved plan end-to-end and parse the `## 5.5 Stage Map`. Read the **Stage** injected in the launch prompt (`Stage for this implementation run`): the single stage number this run owns. The runtime already selected and reserved this stage (one run = one stage) — do NOT recompute the start stage from `consumers.jsonl`.
32
+ - load every `runs/<plan-key>/carry/stage-<i>.json` for `i ∈ depends-on(this stage)` and inject them into the executor's working context as "runtime carry-in". For a `depends-on (none)` stage, no sidecar load — task-brief only.
33
+ - this stage's `depends-on` are all already `status:done`. Its file list, step order, Stage Validation commands, Stage Exit Contract, and rollback path are the authoritative scope.
33
34
  - inspect the current state of every file the plan names; if any file has changed materially since the plan was written, stop and route to a new `implementation-planning` run instead of editing speculatively
34
35
  - "materially changed" means: the function, class, section, or behaviour the plan targets has been edited, renamed, moved, removed, or otherwise altered in a way that invalidates the plan's reasoning. Cosmetic edits (whitespace, comment-only changes, unrelated function modifications elsewhere in the same file) do NOT trigger a re-plan; cite the diff (`git log --oneline <plan-created-at>..HEAD -- <file>`) in the final report and proceed.
35
36
  - distinguish the two file-scope rules (they are not in conflict):
36
37
  - **drift rule** (this section): if a file *named in the plan* has materially drifted, refuse to edit and route back to planning. This protects trust in the approved scope.
37
38
  - **out-of-plan rule** (Allowed actions section below): if a step *requires touching a file NOT in the plan list*, that is permitted with `Out-of-plan edits` justification. This handles honest scope discovery during execution.
38
39
  - confirm the test/build commands referenced in the plan still exist and run from a clean state
40
+ - **Pre-commit review-rule sweep (BLOCKING before the executor's final commit):** inspect the run diff (`git diff <stage-base>..HEAD`) against the loaded coding conventions and project review rule packs. Fix in-place before handing to verifiers when the issue is inside this stage's scope. Minimum sweep:
41
+ - no new byte-identical or semantically equivalent helper stack appears in two touched services; extract a shared helper/domain module unless the approved plan explicitly justified duplication,
42
+ - no test asserts equality to a direct re-invocation of the collaborator/helper being delegated to; keep literal-value or observable-state assertions,
43
+ - no public method/repository function introduced by this run is left with zero in-scope callers unless it is part of a declared interface contract,
44
+ - no exported/public name hides side effects or omits the entity it acts on,
45
+ - no newly introduced function requires a reader to mentally name several phases that should be helper calls.
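
The second sweep item ("no test asserts equality to a direct re-invocation of the collaborator/helper being delegated to") is easier to recognize with a concrete pair. The functions below are invented purely for illustration and do not come from any okstra or project code.

```python
def normalize_family(name: str) -> str:
    """Helper: collapse whitespace and title-case a font family name."""
    return " ".join(name.split()).title()


def display_label(name: str) -> str:
    """Function under test: delegates to normalize_family."""
    return f"{normalize_family(name)} (family)"


def test_display_label_tautological() -> None:
    # Weak: the expected value is computed by re-calling the delegated helper,
    # so a bug inside normalize_family would pass unnoticed.
    assert display_label("  open   sans ") == f"{normalize_family('  open   sans ')} (family)"


def test_display_label_literal() -> None:
    # Preferred: a literal expected value pins the observable behavior.
    assert display_label("  open   sans ") == "Open Sans (family)"
```

Both tests pass today, but only the literal-value test would fail if `normalize_family` stopped collapsing whitespace, which is the difference the sweep is meant to catch.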
39
46
 
40
- ## Stage execution contract (this run owns the injected stage batch)
47
+ ## Stage execution contract (this run owns one stage)
41
48
 
42
- - **Sidecar evidence writer (BLOCKING, per stage).** For each stage in the batch, when that stage's Stage Validation `post` commands all succeed, the Executor MUST emit a JSON object matching the schema in `docs/superpowers/specs/2026-05-20-implementation-planning-multi-stage-design.md` §3.2 and the lead MUST persist it to `runs/<impl-task-key>/carry/stage-<N>.json`. Each file MUST NOT exist before the run starts (overwrite is refused — see `--force-stage` non-goal).
43
- - **Reverse link (BLOCKING, per stage).** The runtime already appended a `status:"started"` row per batch stage before this run began. On each stage's completion, append a `status:"done"` row with `carry_path` populated for that stage number.
44
- - **One-PR-per-run.** This run creates exactly one PR titled `Stages <first>–<last>: <run summary>` (or `Stage <N>: <title>` when the batch is a single stage). The PR body MUST include:
45
- - `## Stage <N>` — one section per batched stage: number, title (from Stage Map row), touched files, and validation result.
46
- - `## Carry-In summary` — per stage, depends-on list + cited identifiers/SHAs from each loaded sidecar (omit when depends-on is empty).
49
+ - **Sidecar evidence writer (BLOCKING).** When this stage's Stage Validation `post` commands all succeed, the Executor MUST emit a JSON object matching the schema in `docs/superpowers/specs/2026-05-20-implementation-planning-multi-stage-design.md` §3.2 and the lead MUST persist it to `runs/<impl-task-key>/carry/stage-<N>.json`. The file MUST NOT exist before the run starts (overwrite is refused — see `--force-stage` non-goal).
50
+ - **Reverse link (BLOCKING).** The runtime already appended a `status:"started"` row for this stage before the run began. On completion, append a `status:"done"` row with `carry_path` populated for this stage number.
51
+ - **One-PR-per-run.** This run creates exactly one PR titled `Stage <N>: <title>`. The PR body MUST include:
52
+ - `## Stage <N>` — number, title (from Stage Map row), touched files, and validation result.
53
+ - `## Carry-In summary` — depends-on list + cited identifiers/SHAs from each loaded sidecar (omit when depends-on is empty).
47
54
  - `## Previous run` / `## Next run` — links so a reviewer can navigate the run chain.
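
A minimal sketch of the "overwrite is refused" rule for the sidecar file, assuming a plain filesystem write. The function name `persist_stage_sidecar` and the evidence fields are hypothetical; the real schema lives in the cited spec §3.2 and is not reproduced here.

```python
import json
import tempfile
from pathlib import Path


def persist_stage_sidecar(carry_dir, stage, evidence):
    """Write stage-<N>.json under carry_dir, refusing to overwrite."""
    carry_dir.mkdir(parents=True, exist_ok=True)
    target = carry_dir / f"stage-{stage}.json"
    # Mode "x" is exclusive creation: open() raises FileExistsError when the
    # file already exists, so a second write for the same stage is refused.
    with target.open("x", encoding="utf-8") as handle:
        json.dump(evidence, handle, indent=2)
    return target


# Usage with a throwaway directory (hypothetical evidence payload).
demo_root = Path(tempfile.mkdtemp())
written = persist_stage_sidecar(demo_root / "carry", 2, {"stage": 2})
```

Exclusive-create mode leaves the existence check to the operating system, which avoids a race between a separate "does it exist" test and the write.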
48
55
 
49
56
  ## Allowed actions during the run
@@ -67,12 +67,15 @@ Re-running commands proves the diff *builds and passes*; it does NOT prove the d
67
67
 
68
68
  - **Scope (no silent sampling).** Enumerate every changed source/test file via `git diff --name-only <base>...HEAD` and review each one. Skipping a changed file silently is a `contract-violated` outcome. If a file's language has no reference and is not covered by the agnostic checks below, record `design-review skipped: <file> (language=<x> no reference)` — never pass it silently.
69
69
  - **Load the same conventions the executor used, per language.** For each touched language load the coding-conventions reference by reading `~/.claude/skills/okstra-coding-preflight/languages/<lang>.md` + `clean-code.md` + the `architecture/hexagonal.md` overlay when the layout matches; degrade to the agnostic checks below when those files are not readable. The verifier does NOT inline language rules — it loads them per situation, identical to the executor preflight.
70
+ - **Load project review rule packs when present.** Search the project root, `.claude/skills`, and up to two parent `skills/` directories for `*review*/SKILL.md` rule packs. Read their referenced `references/*.md` files and apply them as an overlay on this static review. If a premium review skill exists, use its coverage philosophy (recall-first enumeration followed by verify-only confirmation) as the verifier's mental model, but do NOT dispatch extra reviewer agents unless the task explicitly configured them. Record `project-review-rules: <paths read>` or `project-review-rules: none found` in the worker result.
70
71
  - **Blocking checks (any hit → verdict `FAIL`, cited `path:line` + rule name, recommended fix recorded — the verifier does NOT apply it):**
72
+ - **New duplication / DRY:** two or more newly added or meaningfully modified blocks implement the same helper stack, transform, or domain rule. Literal copy-paste is always blocking; semantically equivalent transforms across services are blocking unless the approved plan explicitly justified keeping them separate. Recommend the shared module location.
71
73
  - **Self-mocking:** a test for `Foo` stubs/spies a method on the `Foo` instance under test (`jest.spyOn(sut, ...)`, `spyOn(FooService.prototype, ...)` in `foo.*.spec.*`, `vi.mocked(sut)` + stub). Mocking injected collaborators is fine.
72
74
  - **Interaction-only assertion:** a test whose only/primary assertion is `toHaveBeenCalled*` / `toHaveBeenCalledTimes` on an internal helper or a non-side-effecting collaborator, with no assertion on the returned value / resulting state / persisted row / emitted event.
75
+ - **Tautological delegation assertion:** a test asserts the SUT result equals a direct call to the same pure helper/collaborator that the SUT delegates to, instead of asserting an independent literal value or observable state.
73
76
  - **Untruthful name:** a read-named function (`get*` / `find*` / `load*`) that writes/inserts/mutates; an adapter or repository name encoding the caller's use-case (`*ForInit`) or hiding a domain rule (`findValid*` / `findActive*`).
74
77
  - **Hexagonal (only when the overlay is loaded):** business logic inside a port body; an adapter method that is not pure I/O (post-fetch JS filtering on domain state, domain-rule evaluation); a domain object declared outside the `domain/` boundary.
75
- - **Advisory findings (recorded as recommendations; verdict MAY still PASS):** function >50 effective lines, a single body mixing read+write stages, weak readability, a missing-but-non-critical outcome assertion. These land in the verifier result as `should-fix` / `nit` recommendations, not as a `FAIL`.
78
+ - **Advisory findings (recorded as recommendations; verdict MAY still PASS):** function >50 effective lines, a single body mixing read+write stages, weak readability, a missing-but-non-critical outcome assertion, newly orphaned private/public code that is safe to remove but not on a critical path, or weak-but-not-misleading names. These land in the verifier result as `should-fix` / `nit` recommendations, not as a `FAIL`.
76
79
  - **Output.** Every finding — blocking or advisory — is a structured item in the verifier's worker result (`path:line`, rule, severity, suggested fix) so it carries into Phase 5.5 convergence and the final report. A blocking hit sets the verifier verdict to `FAIL` with the rule cited, using the same verdict machinery as the Discrepancy rule above. `Claude lead` MUST NOT silently downgrade a cited blocking finding to advisory during synthesis; an override requires a concrete cited reason, exactly as for the Discrepancy rule.
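
To illustrate the "Tautological delegation assertion" check above, here is a small sketch contrasting the two assertion styles. Both functions and the discount rule are hypothetical and exist only for the example.

```python
def compute_discount(total_cents):
    """Hypothetical pure helper: 10% off orders of 10_000 cents or more."""
    return total_cents // 10 if total_cents >= 10_000 else 0


def price_after_discount(total_cents):
    """Hypothetical SUT that delegates to compute_discount."""
    return total_cents - compute_discount(total_cents)


def tautological_check():
    # Re-invokes the delegated helper, so it still passes when the helper
    # itself is wrong. This is the shape the blocking check targets.
    return price_after_discount(20_000) == 20_000 - compute_discount(20_000)


def literal_check():
    # Independent literal: 10% of 20_000 is 2_000, so 18_000 is expected.
    return price_after_discount(20_000) == 18_000
```

Only `literal_check` would fail if `compute_discount` regressed, which is why the rule asks for a literal value or observable state.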
77
80
 
78
81
  ### DB / IO / SQL change — real-execution gate (mock-only acceptance forbidden)
@@ -22,20 +22,20 @@
22
22
  - regression risk in adjacent code paths not directly changed
23
23
  - documentation or rollout gaps
24
24
  - production-specific failure modes not caught by tests (env/config drift across stages, secrets & permission/auth changes, migration ordering & rollback executability, observability gaps)
25
- - Pre-verification entry gate (mandatory; refuse to start if any item fails):
26
- - the task brief MUST cite the originating `implementation` final-report path under `## Source Implementation Report`. The lead opens that file and confirms it includes the approved-plan reference (heading `Approved Plan Reference` or `Plan link & approval evidence`), `Commit list`, `Diff summary`, `Validation evidence`, and `Routing recommendation for final-verification`.
27
- - the task brief MUST identify the worktree / checkout under verification and the implementation base ref. If the implementation report names a task worktree, final-verification MUST inspect that same worktree rather than the caller's original checkout.
28
- - the lead MUST capture `git rev-parse HEAD`, `git status --short`, and `git diff --stat <implementation-base>..HEAD` from the verification worktree before dispatching workers. These values are the verification target and must be cited in the final report.
29
- - if the cited implementation report is missing, lacks commits for delivered code changes, or the current checkout does not match the implementation report's commit list / diff summary, the run MUST end with status `blocked` and route back to `implementation` or `implementation-planning` rather than verifying an ambiguous target.
25
+ - Pre-verification entry gate (resolved & enforced by `okstra render-bundle` prep; the lead does NOT recompute it):
26
+ - the verification target (scope / worktree / base / head / stages / source reports / diff stat) is injected as the `VERIFICATION_TARGET` block. The lead MUST treat it as authoritative and MUST NOT re-pick a target from the brief.
27
+ - **whole-task scope** (`--stage auto`, default): prep has already verified every Stage Map stage is `status:done` in `consumers.jsonl`, every done stage's `head_commit` is an ancestor of the task worktree HEAD (all stage branches merged), and the worktree is clean outside `.okstra/`. If any check failed the run never started (PrepareError); a started whole-task run is therefore a fully-merged, clean target.
28
+ - **single-stage scope** (`--stage N`): prep verified stage N is `status:done` and its isolated stage worktree exists and is clean. Other stages' state is irrelevant. A single-stage run is a partial verification and MUST NOT recommend `release-handoff`.
29
+ - the lead still captures `git rev-parse HEAD` / `git status --short` from the injected worktree to confirm the analysis ran against the injected head; a mismatch is a `tool-failure`, not a silent proceed.
30
30
  - Required deliverable shape (final report, in addition to the standard sections):
31
- - **Source Implementation Report**: relative path of the originating `implementation` final-report file, the quoted commit list / diff summary used as the verification target, the worktree path inspected, and the base/head SHAs captured at run start. The lead injects this same target snapshot into every analyser prompt (`**Worktree:** / **Verification base ref:** / **Verification head SHA:** / **Verification diff stat:**`); a worker that cannot confirm its analysis ran against that exact head MUST record a `tool-failure` rather than verify an ambiguous target.
31
+ - **Source Implementation Report(s)**: the `VERIFICATION_TARGET` snapshot verbatim: verification scope, worktree path, base/head SHAs, the list of stages under verification, and one row per stage citing its originating implementation final-report (`report_path` from `consumers.jsonl`; render `(report_path unrecorded)` when absent). The lead injects this same snapshot into every analyser prompt (`**Verification scope:** / **Worktree:** / **Verification base ref:** / **Verification head SHA:** / **Verification diff stat:**`); a worker that cannot confirm its analysis ran against that exact head MUST record a `tool-failure`.
32
32
  - **Verdict vocabulary**: Section 7 (`Final Verdict`) MUST include a `Verdict Token` field whose value is exactly one of `accepted`, `conditional-accept`, or `blocked`. `conditional-accept` requires an explicit, exhaustive list of conditions; ambiguous verdicts ("looks good", "mostly ready") are not allowed. Each condition MUST be recorded as a row in the **Conditional Acceptance Conditions** deliverable (`id` `CA-NNN`, `condition`, `evidenceRequired`, `blocksReleaseHandoff`). The validator enforces verdict↔deliverable consistency: `accepted` ⇒ zero acceptance blockers, `blocked` ⇒ at least one, `conditional-accept` ⇒ at least one condition, and a `release-handoff` routing recommendation is allowed only when the verdict is `accepted`.
33
33
  - **Acceptance Blockers block** (under section 4): one row per blocker with `id`, `severity` (`critical` / `major` / `minor`), evidence (file path, log excerpt, or test output), and the recommended follow-up phase (`error-analysis` or `implementation-planning`). Empty block is acceptable and preferred — render the single line `- No acceptance blockers found.`
34
34
  - **Residual Risk block** (under section 4): risks that are not blockers but should be tracked, each with mitigation owner and a trigger that would escalate them to a blocker.
35
35
  - **Validation Evidence**: for every requirement in the originating plan or task brief, cite the artifact (commit SHA, test output, log line, MCP SELECT result) that demonstrates coverage. Paraphrased "verified" claims without an artifact are rejected.
36
36
  - **Read-only command log**: any pre-existing test/validation command executed during this run MUST be listed with its exact command line and exit code. No mutating commands may appear here.
37
37
  - **Two-tier command lookup (shared with `implementation`):** when this phase performs its own independent re-validation, the command source is exactly the same two tiers `implementation` verifiers use — Tier 1 is the originating task brief / approved plan's `validation` set, Tier 2 is `<PROJECT_ROOT>/.okstra/project.json` under `qaCommands`. Auto-detecting tools from manifest files is forbidden; missing tiers are recorded as `qa-command not configured: <category>` and do NOT trigger a guess. The `cmd` deny-list (`--fix`, `--write`, ` -w`, ` -u`, `--snapshot-update`, `INSTA_UPDATE=<not-no>`, `cargo update`, `npm install` without `ci`, etc.) is enforced identically. NOTE: runtime fail-fast validation (`okstra_ctl.qa_commands.validate_qa_commands`) only fires at `--task-type implementation` run-prep, so this phase MUST self-check each `qaCommands` entry against the deny-list before executing it — if a denied token is present, skip the command and record it as a `Read-only command log` line `qa-command rejected (denied token: <token>): <label>`.
38
- - **Routing recommendation**: the next safe phase — one of `release-handoff`, `done`, `error-analysis`, `implementation-planning` — tied to the verdict and blocker list. `release-handoff` is allowed ONLY when the Verdict Token is `accepted`.
38
+ - **Routing recommendation**: the next safe phase — one of `release-handoff`, `done`, `error-analysis`, `implementation-planning` — tied to the verdict and blocker list. `release-handoff` is allowed ONLY when the Verdict Token is `accepted`. `release-handoff` is additionally allowed ONLY when the verification scope (the `Verification scope:` line of the injected `VERIFICATION_TARGET` block, recorded as the report's `verificationScope` field) is `whole-task`; a `single-stage` run is partial and routes to `implementation` / `done` even on an `accepted` verdict.
39
39
  - Clarification request policy (phase-specific addendum — shared policy is in `_common-contract.md`):
40
40
  - populate `## 1. Clarification Items` only when a blocker hinges on information only the user can supply (deployment intent, intended target environment, business-rule interpretation); use `Blocks=next-phase` for items that gate continuing to release-handoff
41
41
  - Self-review pass before finalising the report (`Claude lead` runs this; do not delegate to a generic subagent):
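
The verdict and routing consistency rules stated in this hunk (Verdict vocabulary and Routing recommendation) could be sketched as a pure check like the one below. This is an illustration under assumptions, not the package's validator; the function name and argument shapes are hypothetical.

```python
VERDICTS = ("accepted", "conditional-accept", "blocked")


def verdict_violations(verdict, blocker_count, condition_count, routing, scope):
    """Return a list of rule violations; an empty list means consistent."""
    violations = []
    if verdict not in VERDICTS:
        violations.append("unknown verdict token")
    if verdict == "accepted" and blocker_count != 0:
        violations.append("accepted requires zero acceptance blockers")
    if verdict == "blocked" and blocker_count < 1:
        violations.append("blocked requires at least one acceptance blocker")
    if verdict == "conditional-accept" and condition_count < 1:
        violations.append("conditional-accept requires at least one condition")
    if routing == "release-handoff":
        if verdict != "accepted":
            violations.append("release-handoff requires an accepted verdict")
        if scope != "whole-task":
            violations.append("release-handoff requires whole-task scope")
    return violations
```

Returning every violation, rather than stopping at the first, lets a report list all inconsistencies in one pass.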
@@ -31,6 +31,7 @@
31
31
  - **Files that change together live together**: split by responsibility, not by technical layer. Penalize options that scatter one logical change across unrelated layers.
32
32
  - **Follow established patterns**: in existing codebases, conform to current conventions. Targeted cleanup of a file you are already modifying is acceptable; unrelated refactors are not.
33
33
  - **YAGNI ruthlessly**: drop features, abstractions, and configuration knobs that do not serve the stated requirement.
34
+ - **Project review-rule preflight**: before choosing the recommended option, look for project-local review rule packs such as `<PROJECT_ROOT>/skills/*review*`, `<PROJECT_ROOT>/.claude/skills/*review*`, and up to two parent directories' `skills/*review*/SKILL.md`. If present, read the relevant `SKILL.md` plus referenced `references/*.md` files and treat their rules as planning constraints. Do not run the PR-review workflow here; extract only the rules. For Fonts Ninja-style TS/NestJS review packs, this means planning away known review findings before code exists: shared transforms instead of duplicate helper stacks, behavioral tests instead of collaborator-tautology assertions, domain rules in domain modules rather than repositories/adapters, domain objects under `domain/`, plain-English functions, truthful/specific names, and no dead APIs introduced by the plan.
34
35
  - Expected output emphasis:
35
36
  - feasible plan options
36
37
  - dependency and risk visibility
@@ -52,7 +53,7 @@
52
53
  - every clarification row carries a recommended answer + one-line rationale inside the `Expected form` cell; rows that lack a recommendation are rejected as half-formed.
53
54
  - **Evidence note required inside `Statement`**: every clarification row includes `Evidence checked: <path:line>` or `Evidence checked: none — <human-only reason>` in the `Statement` cell. `none` is allowed ONLY when the row's nature is "only a human can answer this" (reporter intent, business priority, organisational decision). A row with `none` that *could* have been answered by the codebase is a defect of this phase, restated from the pre-planning rule above.
54
55
  - Section heading contract (BLOCKING — validator scans for these literal English substrings):
55
- - The final report MUST include section headings containing each of the following exact strings: `Option Candidates`, `Trade-off`, `Recommended Option`, `Stage Map`, `Stage Exit Contract`, `Stage Validation`, `Dependency`, `Validation Checklist`, `Rollback`. (Approval is no longer a body section — it is the YAML frontmatter `approved` field.)
56
+ - The final report MUST include section headings containing each of the following exact strings: `Option Candidates`, `Trade-off`, `Recommended Option`, `Stage Map`, `Stage Exit Contract`, `Stage Validation`, `Dependency`, `Validation Checklist`, `Rollback`, `Requirement Coverage`. (Approval is no longer a body section — it is the YAML frontmatter `approved` field.)
56
57
  - Korean translations are allowed in parentheses (e.g. `### Recommended Option (권장 옵션)`), but the English keyword must be present verbatim in the heading line.
57
58
  - The shape and ordering follow `final-report-template.md` section 4.5 (`Implementation Plan Deliverables`). Do NOT translate the heading keywords — `validators/validate-run.py` does substring matching on the raw report text and 7-of-8 missing strings is a real, repeatedly observed failure mode (root cause: writer translated the headings to Korean).
58
59
  - Beyond substring matching, when the Plan Body Verification gate result is `passed` / `passed-with-dissent`, `validators/validate-run.py` runs the **structural** Stage Map validator (`validators/validate-implementation-plan-stages.py`) at the planning boundary — the exact `## 5.5 Stage Map` heading, each `## 5.5.<i> Stage <i>:` section with its four required subsections, the per-stage effective step count (≤6), the `depends-on` DAG, and the per-stage vertical-slice contract (S10) are all enforced here, not deferred to the `implementation` entry gate. S10 scans for the literal in-section strings `Slice value:`, `Acceptance:`, and the Stepwise `action`-cell prefixes `RED:` / `GREEN:` (or a `TDD exemption:` line) — keep these tokens verbatim for the same reason as the heading keywords above.
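
A rough sketch of the section heading contract above, assuming the check only looks at Markdown heading lines. The text says the real `validators/validate-run.py` matches substrings on the raw report text, so treat `missing_heading_keywords` as a hypothetical approximation.

```python
REQUIRED_HEADING_KEYWORDS = (
    "Option Candidates",
    "Trade-off",
    "Recommended Option",
    "Stage Map",
    "Stage Exit Contract",
    "Stage Validation",
    "Dependency",
    "Validation Checklist",
    "Rollback",
    "Requirement Coverage",
)


def missing_heading_keywords(report_text):
    """Return the required keywords that appear in no heading line."""
    headings = [
        line for line in report_text.splitlines()
        if line.lstrip().startswith("#")
    ]
    return [
        keyword for keyword in REQUIRED_HEADING_KEYWORDS
        if not any(keyword in heading for heading in headings)
    ]
```

A heading such as `### Recommended Option (권장 옵션)` satisfies the check, while a fully translated heading does not, which matches the failure mode the text describes.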
@@ -78,6 +79,8 @@
78
79
  - dependency / migration risk assessment (ordering constraints, data backfills, feature-flag prerequisites, repo-internal sequencing)
79
80
  - validation checklist (pre / mid / post) — each item is an exact command or observable outcome
80
81
  - rollback strategy — exact revert path (commits, flags, migrations) and the signal that triggers rollback
82
+ - **Requirement Coverage (mandatory, §5.5.8):** one row per concrete requirement from the task brief / packet. Assign stable IDs `R-001`, `R-002`, ... in source order. Columns: `ID | Source | Requirement | Covered by option / stage / step | Status`. `Source` cites the brief heading or file/line where the requirement came from. `Covered by` must name the specific Option Candidate and Stage/Step that satisfies it, not just "recommended option". `Status` is one of `covered`, `gap`, or `blocked C-NNN`. If any row is `gap` or `blocked C-NNN`, the Plan Body Verification gate MUST NOT be `passed` / `passed-with-dissent`; add a matching `Blocks=approval` row for the blocker and keep `approved: false`.
83
+ - **Review-rule compliance plan:** when a project-local review rule pack is found, each Option Candidate MUST include the design implication of those rules in its File Structure / interfaces / blast-radius notes. For any helper or data transform used by more than one changed service, the plan must either place it in a shared module or explicitly justify why duplication is intentional. For any test step, the plan must state the observable behavior being asserted, not the internal collaborator call being pinned. For any exported/public method added or renamed, the step must carry the intended noun/side-effect semantics so implementation names can be reviewed before code is written.
81
84
  - the YAML frontmatter MUST include the line `approved: false` (report-writer always emits the unflipped value). The user authorises the next `implementation` run by flipping it to `approved: true` (manual edit or `--approve` CLI). Do NOT recreate any `User Approval Request` body block — the validator fails reports that contain one (see `validators/validate-run.py` deprecated patterns).
82
85
  - the YAML frontmatter MUST include the line `implementation-option:` directly under `approved:` (report-writer always emits it with an **empty value**). The user selects which Option Candidate the next `implementation` run executes by filling this line with that option's name (manual edit or `--implementation-option <name>` CLI). When left empty, the `implementation` run falls back to the `Recommended Option`.
83
86
  - **the frontmatter `approved: false` line is rendered unconditionally; if the plan-body verification gate (§5.5.9) returns `blocked-by-disagreement` or `aborted-non-result`, the writer MUST keep `approved: false` and the validator refuses any report that ships with `approved: true` under such a gate result.**
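
The Requirement Coverage rule in this hunk (any `gap` or `blocked C-NNN` row keeps the gate from `passed` / `passed-with-dissent`) could be approximated as follows. The row shape (a dict with `id` and `status` keys) is an assumption made for the sketch.

```python
def coverage_problems(rows):
    """Return problems that should keep the plan gate from passing."""
    problems = []
    for position, row in enumerate(rows, start=1):
        expected_id = f"R-{position:03d}"
        # IDs are expected to be stable and assigned in source order.
        if row["id"] != expected_id:
            problems.append(f"expected {expected_id}, found {row['id']}")
        if row["status"] != "covered":
            problems.append(f"{row['id']} is {row['status']}")
    return problems


def gate_may_pass(rows):
    """True only when every requirement row is in order and covered."""
    return not coverage_problems(rows)
```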
@@ -96,10 +99,11 @@
96
99
  - references to types, functions, flags, or files that no other step or option defines
97
100
  - steps that describe *what* to do without showing *how* (commands, code, or exact diffs are required for any code-touching step)
98
101
  - Self-review pass before finalising the report (`Claude lead` runs this; do not delegate to a generic subagent):
99
- 1. **Spec coverage** — for every requirement in the task brief, point to the option(s) and step(s) that satisfy it. List gaps explicitly.
102
+ 1. **Spec coverage** — for every requirement in the task brief, point to the option(s) and step(s) that satisfy it in `### 5.5.8 Requirement Coverage`. Every row must name the Option Candidate and Stage/Step. List gaps explicitly as `gap` or `blocked C-NNN`; a publishable gate with a non-`covered` row is a validator failure.
100
103
  2. **Placeholder scan** — search the report for the patterns in the No-placeholder rule above and fix inline.
101
104
  3. **Internal consistency** — option file lists, trade-off matrix, and recommended step list must agree on file paths, names, and signatures. A symbol called `clearLayers()` in the matrix and `clearFullLayers()` in the steps is a bug.
102
105
  4. **Ambiguity check** — any requirement that could be read two ways must be made explicit or moved to the `## 1. Clarification Items` table as a `Blocks=approval` row.
103
106
  5. **Scope check** — if the recommended plan now spans multiple independent subsystems, recommend splitting into separate planning runs rather than shipping an oversized plan.
104
- 6. **Plan-body verification reconciliation (BLOCKING for implementation-planning).** Inspect the `### 5.5.9 Plan Body Verification` verdict table. For every plan-item row classified as `majority-disagree C-<N>`, the corresponding `C-<N>` row MUST exist in `## 1. Clarification Items` with `Kind` chosen per the standard policy and `Blocks=approval`. Do NOT create a parallel `### 5.5.x Open Questions` block; the unified table is the single home. Conversely, the `Classification` column's `C-<N>` reference and the `## 1. Clarification Items` `ID` column MUST match 1:1; an orphan on either side is a contract violation. For `partial-consensus` and `worker-unique` plan-items, the dissenting opinion lives in §5.5.9 `Dissent log` and is NOT promoted to §5.
105
- 7. **Stage Map self-check**: for every stage, count the effective rows of its `Stepwise Execution Order` table by hand; reject the draft if any stage exceeds 6. Confirm each stage declares a non-empty `Slice value:` and `Acceptance:` line, and that its first step `action` starts with `RED:` with a later `GREEN:` (or carries a `TDD exemption:` line); this is what validator S10 enforces. Walk the `depends-on` graph and confirm it is a DAG (no cycle, no self-reference). For each `depends-on` link, confirm it encodes a real data/contract dependency; do NOT add links to serialise unrelated work, and do NOT split a stage merely to create more parallel stages. **Parallel-safety:** for every pair of `depends-on (none)` stages, confirm their `Stage Exit Contract` predicted file sets are disjoint; if they share a file, merge them or add a `depends-on` link (validator S9 rejects overlap).
107
+ 6. **Review-rule preflight check**: if a project review rule pack exists, map each relevant rule to the recommended option. Reject the draft if it knowingly creates a violation that the later PR reviewer would flag, unless the plan records a specific rationale and follow-up. In particular, scan for repeated helper stacks across planned files, tests that assert delegation to the same calculator/helper they exercise, public names that hide side effects, domain rules placed in repositories/adapters, and APIs made dead by this change.
108
+ 7. **Plan-body verification reconciliation (BLOCKING for implementation-planning).** Inspect the `### 5.5.9 Plan Body Verification` verdict table. For every plan-item row classified as `majority-disagree C-<N>`, the corresponding `C-<N>` row MUST exist in `## 1. Clarification Items` with `Kind` chosen per the standard policy and `Blocks=approval`. Do NOT create a parallel `### 5.5.x Open Questions` block; the unified table is the single home. Conversely, the `Classification` column's `C-<N>` reference and the `## 1. Clarification Items` `ID` column MUST match 1:1; an orphan on either side is a contract violation. For `partial-consensus` and `worker-unique` plan-items, the dissenting opinion lives in §5.5.9 `Dissent log` and is NOT promoted to §5.
109
+ 8. **Stage Map self-check** — for every stage, count the effective rows of its `Stepwise Execution Order` table by hand; reject the draft if any stage exceeds 6. Confirm each stage declares a non-empty `Slice value:` and `Acceptance:` line, and that its first step `action` starts with `RED:` with a later `GREEN:` (or carries a `TDD exemption:` line) — this is what validator S10 enforces. Walk the `depends-on` graph and confirm it is a DAG (no cycle, no self-reference). For each `depends-on` link, confirm it encodes a real data/contract dependency — do NOT add links to serialise unrelated work, and do NOT split a stage merely to create more parallel stages. **Parallel-safety:** for every pair of `depends-on (none)` stages, confirm their `Stage Exit Contract` predicted file sets are disjoint; if they share a file, merge them or add a `depends-on` link (validator S9 rejects overlap).
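
The two graph checks in the Stage Map self-check (the `depends-on` DAG and parallel-safety of `depends-on (none)` stages) could be sketched as below. The data shapes are assumptions: `depends_on` maps a stage number to the set of stages it depends on, and `files` maps a stage number to its predicted file set. Validators S9 and S10 are not reproduced here.

```python
from itertools import combinations


def has_cycle(depends_on):
    """Depth-first search; a self-reference counts as a cycle."""
    state = {}

    def visit(stage):
        if state.get(stage) == "active":
            return True  # reached a stage that is still on the current path
        if state.get(stage) == "done":
            return False
        state[stage] = "active"
        if any(visit(dep) for dep in depends_on.get(stage, ())):
            return True
        state[stage] = "done"
        return False

    return any(visit(stage) for stage in depends_on)


def overlapping_root_stages(depends_on, files):
    """Pairs of dependency-free stages whose predicted file sets intersect."""
    roots = sorted(stage for stage, deps in depends_on.items() if not deps)
    return [(a, b) for a, b in combinations(roots, 2) if files[a] & files[b]]
```

Any pair returned by `overlapping_root_stages` would need to be merged or linked with a `depends-on` edge, per the parallel-safety rule.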
@@ -155,10 +155,11 @@
155
155
  "echo_template": "approved-plan: {value}"
156
156
  },
157
157
  "stage_pick": {
158
- "label": "실행할 stage를 선택하세요. auto 는 의존성이 만족된 가장 빠른 미완료 stage를 자동으로 잡습니다.",
158
+ "label": "stage 범위를 선택하세요. auto 는 전체 task(모든 stage)를, 특정 번호는 해당 stage 대상으로 합니다.",
159
159
  "echo_template": "stage: {value}",
160
160
  "options": {
161
- "auto": "auto (다음 미완료 stage)"
161
+ "auto": "auto (다음 미완료 stage)",
162
+ "auto_final_verification": "auto (전체 task — 모든 stage 머지 후 한 번)"
162
163
  }
163
164
  },
164
165
  "directive_pick": {
@@ -26,6 +26,14 @@ PROFILE_SECTIONS = (
26
26
  "Non-goals",
27
27
  "Clarification request policy",
28
28
  )
29
+ PROFILE_SECTIONS_BY_TASK_TYPE = {
30
+ "implementation-planning": (
31
+ "Section heading contract",
32
+ "Required deliverable shape",
33
+ "No-placeholder rule",
34
+ "Self-review pass before finalising the report",
35
+ ),
36
+ }
29
37
  CLARIFICATION_SECTIONS = (
30
38
  "Clarification Items",
31
39
  "Clarification Response Carried In From Previous Run",
@@ -123,8 +131,12 @@ def _brief_block(brief_text: str) -> list[str]:
123
131
 
124
132
 
125
133
  def _profile_block(task_type: str, profile_text: str) -> list[str]:
126
- profile_sections = _extract_sections(profile_text, PROFILE_SECTIONS)
127
- profile_bullets = _extract_bullet_sections(profile_text, PROFILE_SECTIONS)
134
+ section_names = (
135
+ PROFILE_SECTIONS
136
+ + PROFILE_SECTIONS_BY_TASK_TYPE.get(task_type, ())
137
+ )
138
+ profile_sections = _extract_sections(profile_text, section_names)
139
+ profile_bullets = _extract_bullet_sections(profile_text, section_names)
128
140
  profile_focus = "\n\n".join(
129
141
  part for part in (profile_sections, profile_bullets)
130
142
  if part and not part.startswith("- No matching")
@@ -1716,6 +1716,9 @@ def apply_lead_prompt_defaults(ctx: dict) -> None:
1716
1716
  # Empty for non-implementation runs; the implementation prepare path
1717
1717
  # overwrites it with the resolved stage-batch directive.
1718
1718
  ctx.setdefault("STAGE_BATCH_DIRECTIVE", "")
1719
+ # Empty for non-final-verification runs; the final-verification prepare
1720
+ # path overwrites it with the resolved verification target block.
1721
+ ctx.setdefault("VERIFICATION_TARGET", "")
1719
1722
  ctx.setdefault(
1720
1723
  "WORKER_PROMPT_PREAMBLE_PATH",
1721
1724
  str(Path.home() / ".okstra" / "templates" / "worker-prompt-preamble.md"),