okstra 0.48.0 → 0.50.0

Files changed (60)

package/docs/superpowers/specs/2026-06-05-wizard-batch-prompts-design.md ADDED

@@ -0,0 +1,121 @@
+# okstra wizard multi-tab batch prompt design
+- Date written: 2026-06-05
+- Targets: `scripts/okstra_ctl/wizard.py`, `skills/okstra-run/SKILL.md`
+- Status: awaiting design approval
+## 1. Background / Problem
+The `okstra-run` skill renders the prompts emitted by the wizard state machine one at a time
+with `AskUserQuestion` and fetches the next prompt only after the user answers: an **answer-and-wait round-trip**
+structure ([wizard.py:2238](../../../scripts/okstra_ctl/wizard.py), [SKILL.md:96](../../../skills/okstra-run/SKILL.md)).
+In the `use_defaults=False` (customize) branch, picks that do not depend on one another are emitted
+one after another as individual prompts:
+- Models: `lead_model` → (impl) `executor_model` / (otherwise) per-roster `claude_model`·`codex_model`·`gemini_model` → `report_writer_model` (up to 5 prompts)
+- Options: `directive_pick` → `related_tasks_pick` → `clarification_pick` → (release-handoff) `pr_template_pick` (up to 4 prompts)
+Because each pick is a separate round trip, entering everything takes a long time. The goal is to **group independent questions of a similar kind into tabs
+and collect them in a single `AskUserQuestion` call**, reducing the number of round trips.
+## 2. Core principle
+**Leave the existing `Step` untouched and add the batch concept only to the emission (presentation) layer.**
+- `answered` / `owns` / edit-rewind (`_reset_from`) / the `custom_ids` check in `_ready_for_confirm`
+  all remain keyed **per individual step id** ([wizard.py:2149](../../../scripts/okstra_ctl/wizard.py)).
+- A group is only a wrapper that gathers mutually independent pick steps onto one screen; it does not
+  replace the steps themselves. The validation and rewind logic is therefore unaffected.
+## 3. Changes
+### 3.1 New prompt kind `pick_group`
+Add a multi-question shape to `Prompt` serialization ([wizard.py:299](../../../scripts/okstra_ctl/wizard.py)).
+```json
+{
+  "step": "<group-id>",
+  "kind": "pick_group",
+  "questions": [
+    { "step": "lead_model", "label": "...", "options": [ {"value":"..","label":".."} ], "multi": false },
+    { "step": "claude_model", "label": "...", "options": [...], "multi": false }
+  ]
+}
+```
+- One `questions[]` entry = one `AskUserQuestion` tab.
+- Each entry reuses the `label` / `options` produced by the existing step's `build()` as-is (no duplicate definitions).
+- The existing `kind: "pick"` / `"text"` / `"done"` are unchanged.
+### 3.2 Group definitions
+Only the **pick steps** of the customize branch are grouped. An explicit, ordered definition lives in wizard.py.
+| Group id | Member steps (only when applicable) |
+|---|---|
+| `models` | `lead_model`, (impl) `executor_model` / (otherwise) `claude_model`·`codex_model`·`gemini_model`, `report_writer_model` |
+| `options` | `directive_pick`, `related_tasks_pick`, `clarification_pick`, (release-handoff) `pr_template_pick` |
+**Kept individual (must not be grouped)**:
+- `*_TEXT` follow-ups (`directive`, `related_tasks`, `clarification`, `pr_template`): conditional text inputs
+  that appear only when "Enter manually" is selected. `AskUserQuestion` does not support text input
+  ([SKILL.md:50](../../../skills/okstra-run/SKILL.md)), so they remain individual text prompts.
+- `workers_override`: it determines which model tabs (claude/codex/gemini) apply, so it **must come before**
+  the `models` group. Kept individual.
+- `pr_template_scope`: applies only after `pr_template_path` has been set, so it is kept individual.
+### 3.3 Engine `next_prompt` changes
+Next-prompt decision logic at [wizard.py:2238](../../../scripts/okstra_ctl/wizard.py):
+1. As before, find the first "applicable + unanswered" step.
+2. If that step is not a group member, return the existing `Prompt` unchanged (behavior unchanged).
+3. If it is a group member, collect up to **4** **applicable + unanswered pick members** of the same group, in definition order,
+   and return them as a `pick_group`.
+   - The cap exists because `AskUserQuestion` allows at most 4 questions.
+   - The non-implementation full roster (lead+claude+codex+gemini+report = 5) sends only the first 4 in one batch;
+     once they are answered, the 5th splits off automatically into the next batch. **No chunking logic is needed in the skill**; the worst case is 2 screens.
+### 3.4 Submission path changes
+advance at [wizard.py:2233](../../../scripts/okstra_ctl/wizard.py), and CLI `--answer` parsing ([wizard.py:2363](../../../scripts/okstra_ctl/wizard.py)):
+- If the currently active prompt is a `pick_group`, `--answer` is received as a **JSON object string**.
+  ```
+  okstra wizard step --state-file <path> --answer '{"lead_model":"opus","claude_model":"default","report_writer_model":""}'
+  ```
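+- The engine routes each key to the matching member step's `submit()` and adds each member
+  to `answered` individually.
+- A missing key or an empty value ("" or "default") is treated as the "phase default" under the existing `_validate_model` rules
+  ([wizard.py:418](../../../scripts/okstra_ctl/wizard.py)). If any member
+  fails validation, the engine returns `ok:false` and re-prompts the same group (no partial application: every member must pass validation
+  before anything is marked `answered`).
+### 3.5 SKILL.md render rules added
+[SKILL.md:41-50](../../../skills/okstra-run/SKILL.md), [SKILL.md:96](../../../skills/okstra-run/SKILL.md):
+- `kind: "pick_group"` → **one** `AskUserQuestion` call with `questions[]` as its tabs (questions).
+  Each tab has a `label` + `options`; `multiSelect` comes from `questions[].multi`.
+- Build a JSON string that maps each `questions[].step` key to the user's per-tab selection (`options[].value`),
+  and submit it with a single `--answer '<json>'`.
+- The literal-token permission rule still applies: pass the `--answer` value as a literal JSON string, with no shell variables or `$(...)`.
+- On validation failure (`ok:false`), re-prompt the same group.
+## 4. Impact / Non-impact
+- **Behavior change**: model and option picks in the customize branch are grouped into multi-tab prompts, reducing round trips
+  (models: 3 to 5 picks → 1 to 2 round trips; options: 3 to 4 picks → 1 round trip).
+- **Unchanged**: the `use_defaults=True` path, the identity stage (task pick/type/base-ref/plan/executor),
+  confirm/branch_confirm/edit-target, and the render-bundle argument mapping ([wizard.py:2249](../../../scripts/okstra_ctl/wizard.py)).
+- Only items where **"Enter manually"** was chosen still get a text follow-up as an individual prompt (unavoidable).
+## 5. Tests
+- The existing wizard unit tests (`tests/`) must pass the individual step contracts unchanged (groups change only the emission layer).
+- New: add tests covering `pick_group` emission (more than 4 members → split into 2 batches), JSON `--answer` routing (re-prompt on
+  partial validation failure), and `workers_override` running first → roster reflected in the `models` group.
+- Regression check with `node bin/okstra --version` / `bash validators/validate-workflow.sh`.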
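
The batching rule in section 3.3 and the all-or-nothing submission rule in section 3.4 of the design above can be sketched in a few lines of Python. This is a minimal illustration and not the code in `wizard.py`: `WizardState`, `GROUPS`, `group_of`, `submit_group`, and the `validate` callback are hypothetical names, the group table is simplified, and ungrouped steps are all treated as plain picks.

```python
import json
from dataclasses import dataclass, field

MAX_QUESTIONS = 4  # AskUserQuestion question limit cited in section 3.3

# Hypothetical, simplified group table; the real definition lives in wizard.py.
GROUPS = {
    "models": ["lead_model", "claude_model", "codex_model",
               "gemini_model", "report_writer_model"],
    "options": ["directive_pick", "related_tasks_pick", "clarification_pick"],
}


@dataclass
class WizardState:
    applicable: list                      # applicable step ids, in emission order
    answered: dict = field(default_factory=dict)


def group_of(step_id):
    """Return the group id that owns step_id, or None for ungrouped steps."""
    for group_id, members in GROUPS.items():
        if step_id in members:
            return group_id
    return None


def next_prompt(state):
    """Emit the next prompt, batching group members into one pick_group."""
    pending = [s for s in state.applicable if s not in state.answered]
    if not pending:
        return {"kind": "done"}
    first = pending[0]
    group_id = group_of(first)
    if group_id is None:
        return {"step": first, "kind": "pick"}  # ungrouped: emitted on its own
    # Applicable + unanswered members, in definition order, capped at 4.
    members = [s for s in GROUPS[group_id] if s in pending][:MAX_QUESTIONS]
    return {"step": group_id, "kind": "pick_group",
            "questions": [{"step": s} for s in members]}


def submit_group(state, prompt, answer_json, validate):
    """Apply a JSON --answer all-or-nothing; return True when it was accepted."""
    raw = json.loads(answer_json)
    if not isinstance(raw, dict):
        return False
    staged = {}
    for question in prompt["questions"]:
        step_id = question["step"]
        value = raw.get(step_id, "")      # missing key falls back to the default
        if not validate(step_id, value):
            return False                  # nothing is marked answered
        staged[step_id] = value
    state.answered.update(staged)         # commit only after every member passed
    return True
```

Staging the values in a local dict and committing them in one `update` call is what keeps a failed member from leaving the group half-answered, so a re-prompt of the same group starts from the same state.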

package/docs/task-process/error-analysis.md CHANGED

@@ -87,7 +87,7 @@ The final report must contain the following.
 - evidence-backed cause analysis
 - uncertainty boundary
 - practical next diagnostic steps
-- If there is blocking uncertainty, `## 5. Clarification Items`, usually with `Blocks=next-phase`
+- If there is blocking uncertainty, `## 1. Clarification Items`, usually with `Blocks=next-phase`
 Forbidden: source edits, refactors, fix attempts, implementation design artifacts, and running build/migration/deploy. Passing an ambiguity that code or logs could answer on to the user as a question is also a defect under the profile.

package/docs/task-process/final-verification.md CHANGED

@@ -108,7 +108,7 @@ flowchart TD
     Blockers --> Followup
 ```
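-`## 2. Final Verdict` must contain exactly one `Verdict Token` field, whose value is one of the following three.
+`## 7. Final Verdict` must contain exactly one `Verdict Token` field, whose value is one of the following three.
 - `accepted`: a state that may be handed over to release-handoff
 - `conditional-accept`: every condition must be stated explicitly, and a condition that is a gate blocks the next phase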

package/docs/task-process/release-handoff.md CHANGED

@@ -83,7 +83,7 @@ flowchart TD
 Before asking the user whether to push or open a PR, the lead confirms the following.
 - The task brief points to `## Source Verification Report`.
-- That report's `## 2. Final Verdict` contains exactly `Verdict Token = accepted`.
+- That report's `## 7. Final Verdict` contains exactly `Verdict Token = accepted`.
 - The working tree is clean.
 - The current branch is not a base branch such as `main`, `master`, `prod`, `preprod`, `staging`, or `dev`.
 - The `<base>..HEAD` commit range is not empty.

package/docs/task-process/requirements-discovery.md CHANGED

@@ -100,7 +100,7 @@ The final report especially emphasizes the following.
 - missing input and the uncertainty boundary
 - the next phase and safe resume guidance
 - canonical term resolution for `terminology:*` brief items
-- If there is blocking input, a `Blocks=next-phase` row in the `## 5. Clarification Items` unified table
+- If there is blocking input, a `Blocks=next-phase` row in the `## 1. Clarification Items` unified table
 Non-goals are source edits, plan authoring, build, and deployment.

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "okstra",
-  "version": "0.48.0",
+  "version": "0.50.0",
   "description": "Multi-agent cross-verification orchestrator runtime + Claude Code skills.",
   "license": "MIT",
   "author": "devonshin",

package/runtime/BUILD.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "package": "0.48.0",
-  "builtAt": "2026-06-04T17:18:36.379Z",
+  "package": "0.50.0",
+  "builtAt": "2026-06-05T10:53:30.670Z",
   "repoRoot": "/home/runner/work/okstra/okstra"
 }

package/runtime/agents/SKILL.md CHANGED

@@ -310,12 +310,12 @@ Distinct from Phase 5.5 finding convergence:
 Lead's responsibilities in this sub-step (in order):
-1. Extract `P-*` plan items from the draft report's `## 4.5 Implementation Plan Deliverables` per the prefix → source-section mapping in the convergence skill.
+1. Extract `P-*` plan items from the draft report's `## 5.5 Implementation Plan Deliverables` per the prefix → source-section mapping in the convergence skill.
 2. Dispatch a single plan-body reverify round to every analyser worker in the roster (`claude`, `codex`, and `gemini` when opted in). `Report writer worker` is NOT a participant in this round.
 3. Aggregate verdicts and resolve the gate result to one of `passed` / `passed-with-dissent` / `blocked-by-disagreement` / `aborted-non-result`.
 4. Write `runs/<task-type>/state/plan-body-verification.json` (schema in the convergence skill).
-5. Populate `### 4.5.9 Plan Body Verification` in the final-report file (template at `templates/reports/final-report.template.md` §4.5.9 — Round count, Gate result, Verdict table, Dissent log).
-6. For every `majority-disagree` plan item, append a row to `## 5. Clarification Items` with `Blocks=approval` and the 1:1 ID match in the verdict table's `Classification` column (`majority-disagree → C-<N>`). Do NOT create a parallel `Open Questions` block — see `prompts/profiles/implementation-planning.md` self-review step 6 for the orphan-on-either-side contract.
+5. Populate `### 5.5.9 Plan Body Verification` in the final-report file (template at `templates/reports/final-report.template.md` §5.5.9 — Round count, Gate result, Verdict table, Dissent log).
+6. For every `majority-disagree` plan item, append a row to `## 1. Clarification Items` with `Blocks=approval` and the 1:1 ID match in the verdict table's `Classification` column (`majority-disagree → C-<N>`). Do NOT create a parallel `Open Questions` block — see `prompts/profiles/implementation-planning.md` self-review step 6 for the orphan-on-either-side contract.
 7. Conditionally render the top-of-report `- [ ] Approved` marker line: present iff gate ∈ {passed, passed-with-dissent}, absent iff gate ∈ {blocked-by-disagreement, aborted-non-result}. `validators/validate-run.py` `validate_phase_boundary` enforces this correspondence. Manually flipping a blocked gate to passing in order to render the marker is a contract violation.
 If `convergence.planBodyVerification.enabled == false` (set by `--no-plan-verification` or by `okstra config set plan-verification off`), the entire sub-step is skipped and the top-of-report Approval marker is rendered unconditionally (legacy behaviour). This opt-out is intended for fast iteration only and is not recommended for handoff-ready plans.
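
The marker rule in step 7 and the opt-out paragraph in the hunk above reduce to a small decision function. The sketch below is an illustration only and is not the logic of `validators/validate-run.py`: the function names and the exact-line match on `- [ ] Approved` are assumptions made for this example.

```python
PASSING_GATES = {"passed", "passed-with-dissent"}
BLOCKING_GATES = {"blocked-by-disagreement", "aborted-non-result"}


def should_render_approval_marker(gate, verification_enabled=True):
    """Decide whether the top-of-report `- [ ] Approved` line is rendered."""
    if not verification_enabled:
        return True   # opt-out: marker is rendered unconditionally (legacy)
    if gate in PASSING_GATES:
        return True
    if gate in BLOCKING_GATES:
        return False
    raise ValueError(f"unknown gate result: {gate!r}")


def marker_matches_gate(report_text, gate, verification_enabled=True):
    """Hypothetical check: marker presence must correspond to the gate result."""
    has_marker = any(line.strip() == "- [ ] Approved"
                     for line in report_text.splitlines())
    return has_marker == should_render_approval_marker(gate, verification_enabled)
```

Raising on an unrecognised gate value, rather than defaulting to either outcome, keeps a typo in the gate result from silently rendering or hiding the marker.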

package/runtime/agents/workers/claude-worker.md CHANGED

@@ -78,7 +78,7 @@ When returning results, start the file with a YAML frontmatter block, then organ
 Include file paths and line numbers when discussing code evidence.
-**Item IDs (mandatory).** Every row in sections 1–5 (and any optional section 6) MUST carry a worker-internal item ID unique within this file. Use the leading column for table-form items (`F-001`, `M-001`, `S-001`, `U-001`, `R-001` per section) or a `[<ID>]` prefix for bullet/numbered items. The ID shape is your choice but it MUST appear — the lead's §1.1 / §1.2 / §3.1 synthesis preserves these IDs in its `Source items (worker:item)` column to keep cross-worker traceability intact. See `prompts/profiles/_common-contract.md` "Cross-worker traceability" SSOT.
+**Item IDs (mandatory).** Every row in sections 1–5 (and any optional section 6) MUST carry a worker-internal item ID unique within this file. Use the leading column for table-form items (`F-001`, `M-001`, `S-001`, `U-001`, `R-001` per section) or a `[<ID>]` prefix for bullet/numbered items. The ID shape is your choice but it MUST appear — the lead's §6.1 / §6.2 / §2.1 synthesis preserves these IDs in its `Source items (worker:item)` column to keep cross-worker traceability intact. See `prompts/profiles/_common-contract.md` "Cross-worker traceability" SSOT.
 **Ticket tagging:** For runs whose task type is `requirements-discovery`, `error-analysis`, `implementation-planning`, or `implementation`, every item in sections 1–5 MUST carry a ticket identifier. Use the `Ticket ID` column in table-form items and the `[TICKETID: <id>]` prefix in bullet/numbered items. Fill priority: `Issue / Ticket` from the input → `Task ID` (no prefix, e.g. `8852`) → `unknown`. Multiple tickets are comma-separated. Full rules live in the `okstra-team-contract` skill's Ticket Tagging section.

package/runtime/agents/workers/codex-worker.md CHANGED

@@ -161,7 +161,7 @@ When returning results, start the file with a YAML frontmatter block, then organ
 Include file paths and line numbers when discussing code evidence.
-**Item IDs (mandatory).** Every row in sections 1–5 (and any optional section 6) MUST carry a worker-internal item ID unique within this file. Codex tends to use hierarchical numbering (`1.1`, `1.2`, `1.3`, ...); that shape is fine — keep what's natural. What matters is that each item is addressable. The lead's §1.1 / §1.2 / §3.1 synthesis preserves these IDs as `codex:<your-id>` entries in its `Source items (worker:item)` column. See `prompts/profiles/_common-contract.md` "Cross-worker traceability" SSOT.
+**Item IDs (mandatory).** Every row in sections 1–5 (and any optional section 6) MUST carry a worker-internal item ID unique within this file. Codex tends to use hierarchical numbering (`1.1`, `1.2`, `1.3`, ...); that shape is fine — keep what's natural. What matters is that each item is addressable. The lead's §6.1 / §6.2 / §2.1 synthesis preserves these IDs as `codex:<your-id>` entries in its `Source items (worker:item)` column. See `prompts/profiles/_common-contract.md` "Cross-worker traceability" SSOT.
 **Ticket tagging:** For runs whose task type is `requirements-discovery`, `error-analysis`, `implementation-planning`, or `implementation`, every item in sections 1–5 MUST carry a ticket identifier. Use the `Ticket ID` column in table-form items and the `[TICKETID: <id>]` prefix in bullet/numbered items. Fill priority: `Issue / Ticket` from the input → `Task ID` (no prefix, e.g. `8852`) → `unknown`. Multiple tickets are comma-separated. Full rules live in the `okstra-team-contract` skill's Ticket Tagging section.

package/runtime/agents/workers/gemini-worker.md CHANGED

@@ -161,7 +161,7 @@ When returning results, start the file with a YAML frontmatter block, then organ
 Include file paths and line numbers when discussing code evidence.
-**Item IDs (mandatory).** Every row in sections 1–5 (and any optional section 6) MUST carry a worker-internal item ID unique within this file. Gemini may use `F-1`, `F-2`, ... or numbered hierarchical IDs — either is fine. What matters is that each item is addressable. The lead's §1.1 / §1.2 / §3.1 synthesis preserves these IDs as `gemini:<your-id>` entries in its `Source items (worker:item)` column. See `prompts/profiles/_common-contract.md` "Cross-worker traceability" SSOT.
+**Item IDs (mandatory).** Every row in sections 1–5 (and any optional section 6) MUST carry a worker-internal item ID unique within this file. Gemini may use `F-1`, `F-2`, ... or numbered hierarchical IDs — either is fine. What matters is that each item is addressable. The lead's §6.1 / §6.2 / §2.1 synthesis preserves these IDs as `gemini:<your-id>` entries in its `Source items (worker:item)` column. See `prompts/profiles/_common-contract.md` "Cross-worker traceability" SSOT.
 **Ticket tagging:** For runs whose task type is `requirements-discovery`, `error-analysis`, `implementation-planning`, or `implementation`, every item in sections 1–5 MUST carry a ticket identifier. Use the `Ticket ID` column in table-form items and the `[TICKETID: <id>]` prefix in bullet/numbered items. Fill priority: `Issue / Ticket` from the input → `Task ID` (no prefix, e.g. `8852`) → `unknown`. Multiple tickets are comma-separated. Full rules live in the `okstra-team-contract` skill's Ticket Tagging section.

package/runtime/agents/workers/report-writer-worker.md CHANGED

@@ -71,9 +71,9 @@ For the report writer specifically, the `## Inputs` list always includes:
 - `<instruction-set>/final-report-template.md` — the **phase-stripped** Jinja2 template the renderer uses (only this run's §4.x deliverable block remains). Read it to understand which data.json fields appear where in the rendered markdown; do NOT edit it, and do NOT pull the full `templates/reports/final-report.template.md` source.
 - `templates/reports/i18n/en.json` and `templates/reports/i18n/ko.json`.
 - Every analysis worker's result file under `worker-results/`.
-- `state/convergence-<task-type>-<seq>.json` (if present). When present, reproduce its `roundHistory[]`, `round2SkippedReason`, and `finalClassificationCounts` verbatim into the final report's Section 1 Round History sub-table — do not recompute from worker results.
+- `state/convergence-<task-type>-<seq>.json` (if present). When present, reproduce its `roundHistory[]`, `round2SkippedReason`, and `finalClassificationCounts` verbatim into the final report's Section 6 Round History sub-table — do not recompute from worker results.
-For the carry-in `clarification-response.md` (if present), walk every row of `## 5. Clarification Items` including rows whose `User input` cell is blank — a blank cell with `Status=open` is a signal you must surface in the conditional `## 0. Clarification Response Carried In From Previous Run` section (the template's `RENDER_IF` guard activates it when the carry-in path is non-empty). When no carry-in path was provided, OMIT the `## 0.` heading entirely — do NOT write an empty-state stub.
+For the carry-in `clarification-response.md` (if present), walk every row of `## 1. Clarification Items` including rows whose `User input` cell is blank — a blank cell with `Status=open` is a signal you must surface in the conditional `## 0. Clarification Response Carried In From Previous Run` section (the template's `RENDER_IF` guard activates it when the carry-in path is non-empty). When no carry-in path was provided, OMIT the `## 0.` heading entirely — do NOT write an empty-state stub.
 Write a Reading Confirmation block to your **audit sidecar** at `runs/<task-type>/worker-results/report-writer-worker-audit-<task-type>-<seq>.md`. The main final-report and the main worker-results file MUST NOT contain a `## 0. Reading Confirmation` heading. If you cannot truthfully confirm a file end-to-end, record a `tool-failure` in the errors sidecar instead of fabricating the report.
@@ -100,7 +100,7 @@ Rules (the schema enforces most of these — they are listed here so you know *w
 - If evidence is missing, write `"I don't know"` in the relevant statement field rather than fabricating confidence.
 - Cite file paths and line numbers in every `evidence.primary[].source` / `consensus[].evidence` cell.
 - Preserve every analysis worker's ticket tagging — every row's `ticketId` field carries the ticket key or the task-fallback. For single-ticket runs, set `ticketCoverage` to `{"singleTicket": "<ticket>"}`. For runs that do not require ticket tagging (`release-handoff`, `final-verification`), set `ticketCoverage` to `{"omit": true}`.
-- When the `Task Type` is `improvement-discovery`, populate `## 4.9 Improvement Candidates` with the 10-column schema enforced by `validators/validate-improvement-report.py`. Source the row IDs (`I-NNN`), lens whitelist, and Source workers patterns from `scripts/okstra_ctl/improvement_lenses.py` — do NOT introduce new lens names or worker prefixes.
+- When the `Task Type` is `improvement-discovery`, populate `## 5.9 Improvement Candidates` with the 10-column schema enforced by `validators/validate-improvement-report.py`. Source the row IDs (`I-NNN`), lens whitelist, and Source workers patterns from `scripts/okstra_ctl/improvement_lenses.py` — do NOT introduce new lens names or worker prefixes.
 Write the data.json with your `Write` tool against the absolute `Result Path`. Then invoke the renderer (`Bash`): `python3 scripts/okstra-render-final-report.py <data.json path>`. Confirm both files exist and respond with a short status line: `data.json written to <abs path>; markdown rendered to <abs path>. Sections populated: <count>.`

package/runtime/bin/lib/okstra/tmux-pane.sh ADDED

@@ -0,0 +1,40 @@
+#!/usr/bin/env bash
+# Resolve the tmux pane that the CURRENT process actually runs in.
+#
+# Why this exists: Claude Code's Bash tool strips $TMUX and $TMUX_PANE, so a
+# bare `tmux display-message -p '#{pane_id}'` does NOT return the caller's pane
+# — it returns the pane of the most-recently-active tmux *client*, which (when
+# the user has several attached sessions) is frequently a DIFFERENT session
+# than the one the okstra run lives in. Earlier trace-pane fixes all trusted
+# `display-message` and therefore mis-placed (or dropped) the tail pane.
+#
+# This resolver instead walks the process's own ancestor PIDs and matches them
+# against the tmux server's pane_pids. That is deterministic and correct
+# regardless of $TMUX/$TMUX_PANE or which client is active: when the process is
+# a descendant of a tmux pane's shell it finds exactly that pane; when it is not
+# inside any tmux pane (e.g. Claude launched from the macOS GUI app) no ancestor
+# matches and the function prints nothing.
+#
+# Usage: pane="$(okstra_resolve_caller_pane)"   # empty => not in a tmux pane
+# Optional arg: a starting PID (defaults to $$) — used by the regression test.
+# bash 3.2 safe (no associative arrays).
+okstra_resolve_caller_pane() {
+  command -v tmux >/dev/null 2>&1 || return 0
+  local panes
+  panes="$(tmux list-panes -a -F '#{pane_pid} #{pane_id}' 2>/dev/null)" || return 0
+  [ -n "$panes" ] || return 0
+  local pid="${1:-$$}"
+  local depth=0
+  local hit
+  while [ -n "$pid" ] && [ "$pid" != "0" ] && [ "$depth" -lt 16 ]; do
+    hit="$(printf '%s\n' "$panes" | awk -v p="$pid" '$1==p {print $2; exit}')"
+    if [ -n "$hit" ]; then
+      printf '%s\n' "$hit"
+      return 0
+    fi
+    pid="$(ps -o ppid= -p "$pid" 2>/dev/null | tr -d ' ')"
+    depth=$((depth + 1))
+  done
+  return 0
+}
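
The ancestor-PID walk in `okstra_resolve_caller_pane` can be modelled as a pure function, which makes the matching rule easy to exercise without a tmux server. The Python sketch below is a model of the shell logic above under stated assumptions, not part of the package: `resolve_caller_pane` and `parse_list_panes` are hypothetical names, and the parent-PID table is passed in rather than read from `ps`.

```python
def parse_list_panes(output):
    """Parse '<pane_pid> <pane_id>' lines into a {pid: pane_id} mapping."""
    panes = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].isdigit():
            # Keep the first pane seen for a PID, like awk's `exit` on first hit.
            panes.setdefault(int(parts[0]), parts[1])
    return panes


def resolve_caller_pane(start_pid, pane_by_pid, parent_of, max_depth=16):
    """Walk start_pid's ancestors and return the first matching tmux pane id.

    pane_by_pid maps a pane's root PID to its pane id; parent_of maps a PID
    to its parent PID. Returns "" when no ancestor is a pane root, mirroring
    the shell function printing nothing.
    """
    pid = start_pid
    for _ in range(max_depth):
        if pid is None or pid == 0:
            break
        if pid in pane_by_pid:
            return pane_by_pid[pid]
        pid = parent_of.get(pid)  # None ends the walk, like an empty ps result
    return ""
```

For example, with panes parsed from `"100 %1\n200 %7\n"` and a parent chain `4242 → 4000 → 200`, the walk starting at `4242` resolves to `%7`, while a process whose ancestors never reach a pane root resolves to the empty string.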

package/runtime/bin/okstra-codex-exec.sh CHANGED

@@ -183,36 +183,32 @@ status_path="${prompt_path%.md}.status.json"
 [[ "$status_path" == "$prompt_path" ]] && status_path="${prompt_path}.status.json"
 started_ts=$(date +%s)
 script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)"
+# Trace-pane caller resolution helper (okstra_resolve_caller_pane). The lib dir
+# is a bin-sibling in both repo (scripts/lib/...) and installed
+# (~/.okstra/bin/lib/...) layouts; degrade silently if absent.
+[ -r "$script_dir/lib/okstra/tmux-pane.sh" ] && . "$script_dir/lib/okstra/tmux-pane.sh"
 python3 "$script_dir/okstra-wrapper-status.py" \
   init "$status_path" "$(basename "$0")" "$role" "$$" "$started_ts" "$log_path" \
   >>"$log_path" 2>&1 || true
 # Derive the okstra run dir from the prompt path. paths.py is the SSOT:
 # dispatched prompts live at `<RUN_DIR>/prompts/<cli>-worker-prompt<NNN>.md`,
-# so the run dir is two levels up. Used to (a) read the lead pane the lead
-# recorded in its own foreground pane and (b) tag the trace pane so cleanup
-# can find exactly this run's panes without any tmux env var. Empty if the
+# so the run dir is two levels up. Used to tag the trace pane so cleanup can
+# find exactly this run's panes without any tmux env var. Empty if the
 # derivation fails — every dependent step below then degrades to a no-op.
 run_dir="$(cd "$(dirname "$prompt_path")/.." 2>/dev/null && pwd -P || true)"
-lead_pane_file="${run_dir:+$run_dir/state/lead-pane.id}"
-# Resolve the pane to anchor the trace split to. Claude Code's Bash tool now
-# strips BOTH `$TMUX` and `$TMUX_PANE`, and this wrapper frequently runs
-# backgrounded — so the bare active-pane probe can land on whatever pane the
-# user happens to be looking at now, not Claude's. Prefer the lead pane the
-# lead captured ONCE in its own foreground pane (reliable, see
-# `_common-contract.md`); fall back to `$TMUX_PANE`, then the active-pane
-# probe. A stale recorded id (pane since closed) is rejected via a liveness
-# check so we never anchor the split to a dead pane.
-caller_pane="${TMUX_PANE:-}"
-if [[ -z "$caller_pane" && -n "$lead_pane_file" && -r "$lead_pane_file" ]]; then
-  cand="$(head -n1 "$lead_pane_file" 2>/dev/null || true)"
-  if [[ -n "$cand" ]] && tmux display-message -p -t "$cand" '#{pane_id}' >/dev/null 2>&1; then
-    caller_pane="$cand"
-  fi
-fi
-if [[ -z "$caller_pane" ]]; then
-  caller_pane=$(tmux display-message -p '#{pane_id}' 2>/dev/null || true)
+# Resolve the pane THIS wrapper actually runs in by walking our ancestor PIDs
+# and matching tmux pane_pids (see lib/okstra/tmux-pane.sh). Reliable
+# regardless of $TMUX/$TMUX_PANE (stripped by Claude Code's Bash tool) and of
+# which tmux client is currently active — a bare `tmux display-message` would
+# instead return the most-recently-active client's pane, frequently a DIFFERENT
+# session than the okstra run, which is why earlier approaches mis-placed or
+# dropped the trace pane. Empty = not inside a tmux pane (e.g. Claude launched
+# from the GUI app) → the trace split below is skipped.
+caller_pane=""
+if type okstra_resolve_caller_pane >/dev/null 2>&1; then
+  caller_pane="$(okstra_resolve_caller_pane)"
 fi
 # Pane titles: worker (caller) pane gets `codex-<role>-<pid>`; the sibling

package/runtime/bin/okstra-gemini-exec.sh CHANGED

@@ -132,28 +132,25 @@ status_path="${prompt_path%.md}.status.json"
 [[ "$status_path" == "$prompt_path" ]] && status_path="${prompt_path}.status.json"
 started_ts=$(date +%s)
 script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)"
+# Trace-pane caller resolution helper (okstra_resolve_caller_pane). The lib dir
+# is a bin-sibling in both repo (scripts/lib/...) and installed
+# (~/.okstra/bin/lib/...) layouts; degrade silently if absent.
+[ -r "$script_dir/lib/okstra/tmux-pane.sh" ] && . "$script_dir/lib/okstra/tmux-pane.sh"
 python3 "$script_dir/okstra-wrapper-status.py" \
   init "$status_path" "$(basename "$0")" "$role" "$$" "$started_ts" "$log_path" \
   >>"$log_path" 2>&1 || true
 # Resolve the run dir and the trace-split anchor pane. See
-# `okstra-codex-exec.sh` for the full rationale — kept in lock-step: derive
-# `<RUN_DIR>` from the prompt path (paths.py SSOT) to read the lead-recorded
-# pane and to tag the trace pane; prefer that lead pane over the unreliable
-# active-pane probe (this wrapper runs backgrounded and `$TMUX`/`$TMUX_PANE`
-# are stripped).
+# `okstra-codex-exec.sh` / `lib/okstra/tmux-pane.sh` for the full rationale —
+# kept in lock-step: derive `<RUN_DIR>` from the prompt path (paths.py SSOT) to
+# tag the trace pane, and resolve the caller pane by walking our ancestor PIDs
+# against tmux pane_pids (reliable even though `$TMUX`/`$TMUX_PANE` are stripped
+# and the wrapper runs backgrounded). Empty = not inside a tmux pane → skip.
 run_dir="$(cd "$(dirname "$prompt_path")/.." 2>/dev/null && pwd -P || true)"
-lead_pane_file="${run_dir:+$run_dir/state/lead-pane.id}"
-caller_pane="${TMUX_PANE:-}"
-if [[ -z "$caller_pane" && -n "$lead_pane_file" && -r "$lead_pane_file" ]]; then
-  cand="$(head -n1 "$lead_pane_file" 2>/dev/null || true)"
-  if [[ -n "$cand" ]] && tmux display-message -p -t "$cand" '#{pane_id}' >/dev/null 2>&1; then
-    caller_pane="$cand"
-  fi
-fi
-if [[ -z "$caller_pane" ]]; then
-  caller_pane=$(tmux display-message -p '#{pane_id}' 2>/dev/null || true)
+caller_pane=""
+if type okstra_resolve_caller_pane >/dev/null 2>&1; then
+  caller_pane="$(okstra_resolve_caller_pane)"
 fi
 # Pane titles: worker (caller) pane gets `gemini-<role>-<pid>`; the sibling

package/runtime/bin/okstra-render-report-views.py CHANGED

@@ -129,7 +129,7 @@ def main(argv: list[str] | None = None) -> int:
     meta = RunMeta(task_key=task_key, task_type=task_type, seq=seq, source_report=source_report)
     html_path = render_html_view(report_path, run_meta=meta, css=css, js=js)
     if html_path is None:
-        print("html: skipped (no §5 clarification rows — html view carries no interactive forms for this report)")
+        print("html: skipped (no §1 clarification rows — html view carries no interactive forms for this report)")
     else:
         print(f"html: {html_path}")
     return 0

package/runtime/bin/okstra-trace-cleanup.sh CHANGED

@@ -37,6 +37,14 @@
 set -u
+# Trace-pane caller resolution helper (okstra_resolve_caller_pane) — see
+# lib/okstra/tmux-pane.sh. Used as the lead-pane fallback below so a missing /
+# stale lead-pane.id resolves to the pane THIS process actually runs in (via
+# ancestor-PID ↔ tmux pane_pid matching), never a foreign active-client pane.
+# Bin-sibling path in repo + installed layouts; degrade silently if absent.
+_clean_script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)"
+[ -r "$_clean_script_dir/lib/okstra/tmux-pane.sh" ] && . "$_clean_script_dir/lib/okstra/tmux-pane.sh"
 MODE="kill"   # kill | list
 REAP=0
 run_dir=""
@@ -88,7 +96,11 @@ if [[ "$REAP" -eq 0 ]]; then
   [[ -r "$lead_pane_file" ]] && lead_pane="$(head -n1 "$lead_pane_file" 2>/dev/null || true)"
 fi
 if [[ -z "$lead_pane" ]] || ! tmux display-message -p -t "$lead_pane" '#{pane_id}' >/dev/null 2>&1; then
-  lead_pane="$(tmux display-message -p '#{pane_id}' 2>/dev/null || true)"
+  if type okstra_resolve_caller_pane >/dev/null 2>&1; then
+    lead_pane="$(okstra_resolve_caller_pane 2>/dev/null || true)"
+  else
+    lead_pane="$(tmux display-message -p '#{pane_id}' 2>/dev/null || true)"
+  fi
 fi
 # Does a trace pane's tag belong to the set we are closing?

package/runtime/prompts/launch.template.md CHANGED

@@ -91,4 +91,4 @@ Emit one `PROGRESS: <phase-id> <verb-phrase>` line as plain user-facing text at
 - Source path: `{{CLARIFICATION_RESPONSE_RELATIVE_PATH}}`
 - If the source path above is empty, no prior clarification response was attached to this run.
-- If the source path is set, a copy is staged at `{{INSTRUCTION_SET_RELATIVE_PATH}}/clarification-response.md`. Read it before running workers; reconcile each `C-*` row in section 5 (`## 5. Clarification Items`) of the prior report against new evidence and record the outcome in the conditional `## 0. Clarification Response Carried In From Previous Run` section of this run's final report (render that heading only when carry-in is non-empty — the validator fails empty Section 0 stubs).
+- If the source path is set, a copy is staged at `{{INSTRUCTION_SET_RELATIVE_PATH}}/clarification-response.md`. Read it before running workers; reconcile each `C-*` row in section 1 (`## 1. Clarification Items`) of the prior report against new evidence and record the outcome in the conditional `## 0. Clarification Response Carried In From Previous Run` section of this run's final report (render that heading only when carry-in is non-empty — the validator fails empty Section 0 stubs).

package/runtime/prompts/profiles/_common-contract.md CHANGED

@@ -24,14 +24,14 @@ profile document.
   - Create, modify, or delete only inside `<PROJECT_ROOT>/.okstra/**` unless the brief verbatim requests a specific non-okstra edit. The phase performing that edit must quote the user instruction in its report. Implementation source edits also require the approved implementation plan.
 - Authority & permissions assumption (applies to every okstra task-type):
   - **Assume the user (and their team) holds full authority and every permission required for the anticipated, in-flight, or follow-up work.** Treat external approvals, third-party access grants, role/IAM permissions, organisational sign-off, legal/compliance review, vendor coordination, and "verify access exists" steps as already satisfied unless the task brief explicitly states otherwise.
-  - Do NOT add such items to routing decisions, missing-materials lists, clarification questions, option trade-offs, dependency/migration risk, validation checklists, rollout plans, acceptance blockers, residual risks, release recommendations, the `## 5. Clarification Items` table, or any day/effort estimate. They are not legitimate sources of schedule extension.
+  - Do NOT add such items to routing decisions, missing-materials lists, clarification questions, option trade-offs, dependency/migration risk, validation checklists, rollout plans, acceptance blockers, residual risks, release recommendations, the `## 1. Clarification Items` table, or any day/effort estimate. They are not legitimate sources of schedule extension.
   - Internal okstra phase handoffs (e.g. the `User Approval Request` block in `implementation-planning`) are unaffected — those are the user themselves approving and proceed without external coordination.
   - This rule does NOT relax any phase-specific Forbidden actions list; safety rules in the per-profile document remain in force regardless of the user's authority.
 - Anti-escalation rule (shared):
   - treating "다음 단계 진행해" or equivalent user phrases as authorisation to start a *different* lifecycle phase is forbidden. The next phase begins only in a separate okstra run launched with the new `--task-type`. Per-profile documents may further restrict this within their own scope.
 - Run-start pane recording (shared — runs ONCE at run start, before the FIRST worker dispatch):
-  - The wrappers anchor each trace pane to the lead's pane and the cleanup scopes the worker-agent scan to it, but Claude Code's Bash tool strips `$TMUX`/`$TMUX_PANE`, so the lead MUST record its own pane explicitly. Because the lead runs this in its OWN foreground pane, the active pane IS the lead's — reliable, unlike a backgrounded wrapper's later probe.
-  - The lead MUST run once, at run start: `mkdir -p "<RUN_DIR>/state" && tmux display-message -p '#{pane_id}' > "<RUN_DIR>/state/lead-pane.id" 2>/dev/null || true` (substitute the run's absolute `RUN_DIR`). Outside tmux this writes nothing and every pane step below silently no-ops — that empty/absent file is the single signal that the lead is not in tmux.
+  - The codex/gemini wrappers now self-anchor their trace pane by walking their own ancestor PIDs against tmux `pane_pid`s (see `lib/okstra/tmux-pane.sh`), so they no longer depend on this file. The lead still records its own pane id here for the cleanup steps below (which-pane-to-never-kill) and as the "am I in tmux" gate. A bare `tmux display-message -p '#{pane_id}'` is NOT reliable for this — Claude Code's Bash tool strips `$TMUX`/`$TMUX_PANE`, so that command returns the most-recently-active *client's* pane (often a different session, or a foreign pane when the lead is launched outside tmux entirely). The lead therefore records via the same ancestry resolver.
+  - The lead MUST run once, at run start: `mkdir -p "<RUN_DIR>/state" && { . "$HOME/.okstra/bin/lib/okstra/tmux-pane.sh" 2>/dev/null && okstra_resolve_caller_pane; } > "<RUN_DIR>/state/lead-pane.id" 2>/dev/null || true` (substitute the run's absolute `RUN_DIR`). When the lead is not inside a tmux pane (e.g. Claude launched from the GUI app) no ancestor matches a pane, the file is empty, and every pane step below silently no-ops — that empty/absent file is the single signal that the lead is not in tmux.
 - Phase-start pane reset (shared — runs BEFORE dispatching each new worker batch):
   - okstra creates two kinds of tmux pane per run: (a) **worker-agent panes** the harness gives to dispatched subagents (titled `claude-worker` / `codex-worker` / `gemini-worker` / `report-writer-worker`), and (b) **trace panes** the codex/gemini wrappers spawn (`<cli>-<role>-<pid>-tail`). Both accumulate across internal phases because each new phase dispatches a fresh worker batch and the prior panes are never reclaimed.
   - When `<RUN_DIR>/state/lead-pane.id` is non-empty (the lead is in tmux), the lead MUST run `$HOME/.okstra/bin/okstra-trace-cleanup.sh --run-dir "<RUN_DIR>"` **immediately before** dispatching the next phase's workers — i.e. just before emitting each `PROGRESS: phase-5.5-convergence round=<N>` marker and just before `PROGRESS: phase-6-synthesis dispatching report-writer-worker`. This closes every prior-phase okstra pane (worker-agent + trace) for this run, while NEVER killing the lead's own pane.
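
The ancestry-based pane resolution described in this hunk can be sketched as follows. This is a minimal illustration, not the contents of `lib/okstra/tmux-pane.sh`, which the diff does not show. The function name `resolve_caller_pane` and both lookup tables are hypothetical; in a real shell they might be filled from `ps -o ppid=` and `tmux list-panes -a -F '#{pane_pid} #{pane_id}'`, which is an assumption about the approach rather than a description of the shipped script.

```python
from typing import Mapping, Optional


def resolve_caller_pane(
    start_pid: int,
    parent_of: Mapping[int, int],
    pane_by_pid: Mapping[int, str],
) -> Optional[str]:
    """Return the id of the tmux pane that owns start_pid, or None.

    parent_of maps a pid to its parent pid (hypothetical input).
    pane_by_pid maps each tmux pane_pid to its pane id (hypothetical input).
    """
    seen = set()
    pid = start_pid
    # Walk up the process tree until a pane's root process is found.
    while pid > 1 and pid not in seen:
        if pid in pane_by_pid:
            return pane_by_pid[pid]
        seen.add(pid)  # guard against a malformed, cyclic table
        parent = parent_of.get(pid)
        if parent is None:
            break
        pid = parent
    # No ancestor is a pane root: the caller is not inside tmux.
    return None
```

Returning `None` corresponds to the empty `lead-pane.id` case, which the contract treats as the single signal that the lead is not in tmux. Unlike `tmux display-message -p`, the walk never consults the most recently active client.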
@@ -41,8 +41,8 @@ profile document.
   - This step is **automatic and silent** — NO user prompt (workers are idle sessions that have already delivered their results; there is nothing for the user to preserve). It runs only when team-state's `teamCreate.status == "ok"` (Teams mode was actually used); in the no-`team_name` fallback there is no team to delete, so silent-skip.
   - Sequence (token-usage collection MUST already be complete — `TeamDelete` removes `~/.claude/teams/<team>/` + `~/.claude/tasks/<team>/` but NOT the `~/.claude/projects/` jsonls Phase 7 reads, yet the read MUST precede teardown):
     1. Read `~/.claude/teams/okstra-<task-key>/config.json` and, for every `members` entry whose name is not the lead, `SendMessage(to: <name>, message: { type: "shutdown_request" })` to terminate it gracefully.
-    2. Wait for the shutdown confirmations / idle notifications from all addressed teammates.
-    3. Call `TeamDelete()`. If it errors with an active-members message, a teammate has not finished shutting down — wait briefly and retry `TeamDelete()` once.
+    2. These workers already delivered their results and terminated when their `Agent()` dispatch returned (the lead's completion evidence is the returned output + the existing result/final-report file, not a teardown ack) — a terminated session emits NO shutdown confirmation. Treat `shutdown_request` as best-effort (fire-and-forget); the lead MUST NOT block waiting for acks from addressed teammates. Proceed immediately to step 3.
+    3. Call `TeamDelete()` — the single synchronization point for teardown. If it errors with an active-members message, one teammate is genuinely still shutting down: wait briefly, retry `TeamDelete()` once, then proceed regardless of the result. NEVER loop or re-send `shutdown_request`; teardown must never block run completion once the work and final report already exist.
   - Report it in one short line (e.g. `worker 6명 종료 + 팀 해제`) and proceed. Emit `PROGRESS: phase-7-teardown disbanding team` immediately before step 1.
 - Phase wrap-up — okstra pane disposition (shared, MUST be the *last* step before returning control to the user):
   - At run end the only residual okstra panes are the LAST phase's (e.g. the `report-writer-worker` agent pane and any codex/gemini trace pane). `okstra-trace-cleanup.sh --list --run-dir "<RUN_DIR>"` returns one tab-separated `<pane_id>\t<pane_title>` line per residual okstra pane (worker-agent + trace) for this run.
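
The revised teardown sequence (fire-and-forget `shutdown_request`, then `TeamDelete()` with a single retry) can be sketched as below. `send_shutdown`, `team_delete`, and `ActiveMembersError` are hypothetical stand-ins for the harness's `SendMessage` and `TeamDelete` tools; their real signatures and error type are not shown in this diff.

```python
import time
from typing import Callable, Iterable


class ActiveMembersError(Exception):
    """Hypothetical stand-in for TeamDelete's active-members error."""


def teardown_team(
    members: Iterable[str],
    lead: str,
    send_shutdown: Callable[[str], None],
    team_delete: Callable[[], None],
    retry_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Best-effort teardown; returns True if the team was deleted."""
    # Steps 1 and 2: request shutdown once per non-lead member, never wait for acks.
    for name in members:
        if name == lead:
            continue
        try:
            send_shutdown(name)
        except Exception:
            pass  # a terminated worker cannot answer; ignore and move on
    # Step 3: TeamDelete is the only synchronization point; retry exactly once.
    for attempt in range(2):
        try:
            team_delete()
            return True
        except ActiveMembersError:
            if attempt == 0:
                sleep(retry_delay)
    # Proceed regardless: teardown must not block run completion.
    return False
```

The function never loops beyond the second attempt and never re-sends `shutdown_request`, which matches the "retry once, then proceed regardless" wording of the new step 3.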
@@ -61,14 +61,14 @@ profile document.
   - **Reporter confirmation precondition (BLOCKING)**: the brief's frontmatter carries `reporter-confirmations: <complete | partial | pending | skipped>` set by `okstra-brief` Step 6.5. Every phase that consumes the brief MUST read this field before doing analysis. The handling matrix is:
     - `complete` → proceed normally.
     - `partial` → proceed; treat still-unmarked `intent-check:` / `conversion-block:` rows as the `skipped` branch.
-    - `skipped` → do NOT silently infer the missing answers. Promote each unmarked `intent-check:` / `conversion-block:` row into this run's `## 5. Clarification Items` as `Kind=decision`. Use `Blocks=approval` in `implementation-planning`, where the row gates the User Approval Request; otherwise use `Blocks=next-phase`. The recommended answer is drawn from the brief's matching content and clearly labelled `보고자 직접 확인 권장`.
-    - `pending` (or field missing) → ABORT analysis; render the Verdict Card with `Verdict Token = blocked` + `Direction = hold` and write a single `## Reporter Confirmation Required` block (no leading number) summarising which rows are pending. The `## 5. Clarification Items` table carries one row per pending item with `Blocks=approval` in `implementation-planning`, otherwise `Blocks=next-phase`. The operator must rerun `okstra-brief` Step 6.5. Do NOT emit `## 0.` for this case — Section 0 is reserved for clarification-response carry-in only.
+    - `skipped` → do NOT silently infer the missing answers. Promote each unmarked `intent-check:` / `conversion-block:` row into this run's `## 1. Clarification Items` as `Kind=decision`. Use `Blocks=approval` in `implementation-planning`, where the row gates the User Approval Request; otherwise use `Blocks=next-phase`. The recommended answer is drawn from the brief's matching content and clearly labelled `보고자 직접 확인 권장`.
+    - `pending` (or field missing) → ABORT analysis; render the Verdict Card with `Verdict Token = blocked` + `Direction = hold` and write a single `## Reporter Confirmation Required` block (no leading number) summarising which rows are pending. The `## 1. Clarification Items` table carries one row per pending item with `Blocks=approval` in `implementation-planning`, otherwise `Blocks=next-phase`. The operator must rerun `okstra-brief` Step 6.5. Do NOT emit `## 0.` for this case — Section 0 is reserved for clarification-response carry-in only.
     `[CONFIRMED <YYYY-MM-DD> → RC-N]` markers on `Open Questions` rows are the per-row signal that the reporter has answered; their answers live verbatim under `## Reporter Confirmations` in the brief.
   - `Source Material` is reporter-verbatim. Do NOT paraphrase, summarize, reorder, or restructure it. Quote it directly when needed.
   - `Augmentation` entries carry one of four labels — `evidence-link`, `format-conversion`, `terminology-mapping`, `intent-inference`. Treat them as follows:
     - `evidence-link` / `format-conversion` → trust without re-verification.
     - `terminology-mapping` → verify against `<PROJECT_ROOT>/.okstra/glossary.md` (authoritative); raise a `Clarification Items` row if the mapping is missing or contradicts the glossary.
-    - `intent-inference` → treat as an **unverified hypothesis**. Every `intent-inference` augmentation MUST be paired in the brief with an `Open Questions` row prefixed `intent-check:`. Promote that row into the run's `## 5. Clarification Items` table as `Kind=decision, Blocks=next-phase` (or `Blocks=approval` for `implementation-planning`) with the recommended answer set to "보고자에게 직접 확인 후 응답" unless the codebase can be inspected to confirm or refute the inference.
+    - `intent-inference` → treat as an **unverified hypothesis**. Every `intent-inference` augmentation MUST be paired in the brief with an `Open Questions` row prefixed `intent-check:`. Promote that row into the run's `## 1. Clarification Items` table as `Kind=decision, Blocks=next-phase` (or `Blocks=approval` for `implementation-planning`) with the recommended answer set to "보고자에게 직접 확인 후 응답" unless the codebase can be inspected to confirm or refute the inference.
   - `Open Questions` row prefixes are signals — do not strip them when promoting:
     - `intent-check:` → `Kind=decision`, recommended answer = reporter confirmation. NEVER silently resolve an `intent-check:` by inference at this layer.
     - `terminology:` → `Kind=decision`, recommended answer = canonical term from `<PROJECT_ROOT>/.okstra/glossary.md` (or "extend okstra glossary via brief Step 4.5").
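
The `reporter-confirmations` handling matrix in this hunk can be read as a small decision function. The sketch below is illustrative only: the function name and the action labels are hypothetical, and treating an unrecognised value like a missing field is an assumption that the contract does not state.

```python
from typing import Optional, Tuple


def plan_reporter_confirmation(
    status: Optional[str], task_type: str
) -> Tuple[str, Optional[str]]:
    """Map the brief's reporter-confirmations value to (action, Blocks value)."""
    # Rows gate the User Approval Request only in implementation-planning.
    blocks = "approval" if task_type == "implementation-planning" else "next-phase"
    if status == "complete":
        return ("proceed", None)
    if status == "partial":
        # Proceed, but unmarked rows follow the skipped branch.
        return ("proceed-and-promote-unmarked", blocks)
    if status == "skipped":
        # Promote each unmarked row as Kind=decision; never infer answers.
        return ("promote-unmarked", blocks)
    # pending or missing field: abort with Verdict Token = blocked, Direction = hold.
    # Assumption: unrecognised values are handled the same way.
    return ("abort", blocks)
```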
@@ -77,22 +77,22 @@ profile document.
     - `general:` → free-form; classify per the standard `Clarification Items` rules.
   - Any decision in this run that contradicts the brief's `Source Material` must be raised back to the reporter via a `Clarification Items` row; it must NOT be silently overridden. Disagreement with the reporter is allowed only after the row is resolved.
   - This contract is the single authority on brief consumption. Phase-specific addenda may *tighten* these rules but may not relax them.
-- Clarification request policy (shared — applies whenever a profile uses `## 5. Clarification Items`):
-  - **Canonical column schema (SSOT — must match `templates/reports/final-report.template.md` §5.1 exactly):** every `## 5. Clarification Items` table has exactly these 8 columns, in this order:
+- Clarification request policy (shared — applies whenever a profile uses `## 1. Clarification Items`):
+  - **Canonical column schema (SSOT — must match `templates/reports/final-report.template.md` §1 exactly):** every `## 1. Clarification Items` table has exactly these 8 columns, in this order:
     `| ID | Ticket ID | Kind | Statement | Expected form | Blocks | Status | User input |`.
     Profile-specific addenda may tighten cell content but MUST NOT add, remove, rename, or reorder columns. The `ID` cell uses `C-NNN` (3-digit zero-padded), the `Status` cell ∈ `{open, answered, resolved, obsolete}`, and the `Kind` / `Blocks` legal values are listed below.
-  - section 5 is a **single unified table** per `final-report-template.md`. Every clarification item — whether the user must attach a file, choose between options, or supply a single number/path — is one row of that table. Do not split it into sub-sections (`5.1 추가 자료 요청` / `5.2 사용자 확인 질문` / `4.5.9 Open Questions` are removed and the validator fails reports that reintroduce them), do not create a parallel table elsewhere in the report, and do not duplicate the same item into the top-of-report `User Approval Request (사용자 승인 게이트)` block or any other section.
+  - section 1 is a **single unified table** per `final-report-template.md`. Every clarification item — whether the user must attach a file, choose between options, or supply a single number/path — is one row of that table. Do not split it into sub-sections (`1.1 추가 자료 요청` / `1.2 사용자 확인 질문` / `5.5.9 Open Questions` are removed and the validator fails reports that reintroduce them), do not create a parallel table elsewhere in the report, and do not duplicate the same item into the top-of-report `User Approval Request (사용자 승인 게이트)` block or any other section.
   - each row's `Kind` column picks one of `{material, decision, data-point}`: `material` for files / snapshots / logs / screenshots the user must attach (the `User input` cell will hold a path or URL); `decision` for choices and yes/no confirmations only the user can make; `data-point` for a single number, ID, date, or short string the user can answer inline. Items that mix "yes/no + file path if yes" are one row of `Kind=material` with the combined expectation written into `Expected form`.
   - each row's `Blocks` column picks one of `{approval, next-phase, none}`. `approval` is reserved for items that gate an approval action, especially the `implementation-planning` User Approval Request; outside `implementation-planning`, unresolved brief reporter-confirmation rows use `next-phase` instead. `next-phase` blocks the next run from starting cleanly. `none` is informational/audit-only.
   - write every entry in full, descriptive sentences that a non-developer can act on without further context. Avoid abbreviations and internal jargon. The `Statement` cell must state *what* is needed, *why* the answer / attachment changes the next step, and (for `material`) *where* the user can find it and *where* to place it. The `Expected form` cell must state the answer shape (예/아니오, 보기 중 하나, 숫자/날짜, 파일 경로, 짧은 서술 등); supply concrete option choices when applicable.
   - if a phase requires a recommended answer, alternatives, or an evidence-check note, encode it inside the existing 8-column schema: put evidence notes in `Statement` as `Evidence checked: <path:line>` or `Evidence checked: none — <human-only reason>`, and put recommendations/options in `Expected form` as `Recommended: <answer> — <rationale>; Alternatives: <options>`. Do not add `Recommended`, `Evidence`, `Alternatives`, or `evidence-checked` columns.
   - the same `final-report.md` file is the canonical artifact carried into the next run; the user appends answers inline before rerunning. The preferred turn-around is `scripts/okstra.sh --resume-clarification --task-key <project-id>:<task-group>:<task-id>` (opens the latest report in `$EDITOR`, then auto-reruns the same phase with `--clarification-response` carry-in). The lower-level form `--clarification-response <path>` remains available for scripted runs.
-  - if a clarification response was carried in for this run, render the conditional `## 0. Clarification Response Carried In From Previous Run` section (the template's `RENDER_IF` guard activates it), walk every `C-*` row of the prior report's `## 5. Clarification Items` table, reconcile each one against new evidence, and update its `Status` to `resolved` or `obsolete` before issuing the next decision/verdict. When no carry-in path was provided, omit the `## 0.` heading entirely — the validator fails reports that emit an empty Section 0 stub (e.g. "No prior clarification response was provided for this run.").
+  - if a clarification response was carried in for this run, render the conditional `## 0. Clarification Response Carried In From Previous Run` section (the template's `RENDER_IF` guard activates it), walk every `C-*` row of the prior report's `## 1. Clarification Items` table, reconcile each one against new evidence, and update its `Status` to `resolved` or `obsolete` before issuing the next decision/verdict. When no carry-in path was provided, omit the `## 0.` heading entirely — the validator fails reports that emit an empty Section 0 stub (e.g. "No prior clarification response was provided for this run.").
 - Verdict Card (shared — applies to every final-report regardless of profile):
-  - The top-of-report `## Verdict Card` block is mandatory in every final-report. Its `Verdict Token`, `Direction`, and `Next Step` cells MUST byte-match the corresponding cells in `## 2. Final Verdict` and the first item of `## 6. Recommended Next Steps`. The validator treats the card as a non-authoritative index — when card values diverge from the authoritative sections, the run is `contract-violated`.
-- Cross-worker traceability (shared — applies to every analysis worker output and to the lead's `## 1.` / `## 3.` tables in the final-report):
+  - The top-of-report `## Verdict Card` block is mandatory in every final-report. Its `Verdict Token`, `Direction`, and `Next Step` cells MUST byte-match the corresponding cells in `## 7. Final Verdict` and the first item of `## 3. Recommended Next Steps`. The validator treats the card as a non-authoritative index — when card values diverge from the authoritative sections, the run is `contract-violated`.
+- Cross-worker traceability (shared — applies to every analysis worker output and to the lead's `## 6.` / `## 2.` tables in the final-report):
   - **Worker-side item IDs (free-form but unique within the worker).** Every row item in sections 1–5 (and any optional section 6) of an analysis worker's output MUST carry an item ID that is unique within that one worker's result file. The ID convention is the worker's choice — `F-001` / `F-002` per the suggested schema, `1.1` / `1.2` / `1.3` as Codex tends to use, or any other shape — but it MUST appear as the leading column of the row (for table-form items) or as a `[<ID>]` prefix (for bullet/numbered items). Workers that emit findings without IDs make cross-worker reconciliation impossible.
-  - **Lead-side ID assignment + source preservation.** When the lead (or `report-writer-worker`) synthesises `## 1.1 Consensus` / `## 1.2 Differences` / `## 3.1 Primary Evidence` rows from worker outputs, the lead assigns a fresh `C-NNN` / `D-NNN` / `E-NNN` row ID. The `Source items` column (or, where the template still calls it `Supporting workers` / `Workers (position)` / `Source`, that same column) MUST list every contributing worker:item pair (e.g. `claude:F-001, codex:1.1, gemini:F-3`) so a reviewer can trace the synthesised row back to each worker's original wording without re-reading every worker-results file. Bare worker names without item IDs (e.g. `claude, codex, gemini`) are deprecated for these tables; the validator does not yet fail on them but the readability pass treats it as a contract violation.
+  - **Lead-side ID assignment + source preservation.** When the lead (or `report-writer-worker`) synthesises `## 6.1 Consensus` / `## 6.2 Differences` / `## 2.1 Primary Evidence` rows from worker outputs, the lead assigns a fresh `C-NNN` / `D-NNN` / `E-NNN` row ID. The `Source items` column (or, where the template still calls it `Supporting workers` / `Workers (position)` / `Source`, that same column) MUST list every contributing worker:item pair (e.g. `claude:F-001, codex:1.1, gemini:F-3`) so a reviewer can trace the synthesised row back to each worker's original wording without re-reading every worker-results file. Bare worker names without item IDs (e.g. `claude, codex, gemini`) are deprecated for these tables; the validator does not yet fail on them but the readability pass treats it as a contract violation.
   - **Why this matters.** A real run had `claude=F-1..F-11`, `codex=1.1..1.8`, `gemini=F-3..F-9` — three incompatible ID schemes. When the lead synthesised `C-1..C-8`, the link from `C-3` back to "which sentence in which worker file" was lost. Source-item preservation restores that link without forcing every worker to adopt a single ID prefix, which would over-constrain worker output style.
 - Audit sidecar (shared — applies to every analysis-worker output and every final-report):
   - Reading Confirmation lines (one short line per input file confirming end-to-end reading) live in the **worker audit sidecar** at `runs/<task-type>/worker-results/<worker>-audit-<task-type>-<seq>.md`, NOT in the worker's main worker-results file. The worker-results body starts at section 1 (Findings). The validator fails worker-results files that contain a `## 0. Reading Confirmation` heading.
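
The canonical 8-column schema for the `## 1. Clarification Items` table lends itself to a mechanical check. The following is a minimal sketch of such a check, not the package's validator, whose implementation is not part of this diff. The function names are hypothetical, and the parser assumes that cells contain no escaped `|` characters.

```python
import re
from typing import List

EXPECTED_COLUMNS = [
    "ID", "Ticket ID", "Kind", "Statement",
    "Expected form", "Blocks", "Status", "User input",
]
KINDS = {"material", "decision", "data-point"}
BLOCKS = {"approval", "next-phase", "none"}
STATUSES = {"open", "answered", "resolved", "obsolete"}
ID_PATTERN = re.compile(r"C-\d{3}")


def split_row(line: str) -> List[str]:
    """Split one markdown table row into stripped cell strings."""
    text = line.strip()
    if text.startswith("|"):
        text = text[1:]
    if text.endswith("|"):
        text = text[:-1]
    return [cell.strip() for cell in text.split("|")]


def check_clarification_table(lines: List[str]) -> List[str]:
    """Return a list of schema problems; an empty list means the table passes."""
    rows = [split_row(l) for l in lines if l.strip().startswith("|")]
    if not rows:
        return ["no table rows found"]
    if rows[0] != EXPECTED_COLUMNS:
        return ["header does not match the canonical 8 columns"]
    errors = []
    for cells in rows[1:]:
        if all(set(c) <= set("-: ") for c in cells):
            continue  # header separator row such as |---|---|
        if len(cells) != len(EXPECTED_COLUMNS):
            errors.append(f"row has {len(cells)} cells, expected 8")
            continue
        row_id, _, kind, _, _, blocks, status, _ = cells
        if not ID_PATTERN.fullmatch(row_id):
            errors.append(f"{row_id}: ID must be C-NNN")
        if kind not in KINDS:
            errors.append(f"{row_id}: illegal Kind {kind!r}")
        if blocks not in BLOCKS:
            errors.append(f"{row_id}: illegal Blocks {blocks!r}")
        if status not in STATUSES:
            errors.append(f"{row_id}: illegal Status {status!r}")
    return errors
```

Because the header comparison is exact, the same check rejects tables that add, remove, rename, or reorder columns, which the policy forbids for profile-specific addenda.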

package/runtime/prompts/profiles/_implementation-deliverable.md CHANGED

@@ -30,7 +30,7 @@ are collected and convergence finished. Phase 1-5 do not need it.
   - **Feature-flag-gated changes**: confirm the off-switch path was exercised in this run's validation evidence (i.e. one of the validation commands ran with the flag off and succeeded). A plan that ships a flag without exercising the off-path does NOT satisfy this requirement.
   - **Schema migrations, config-format changes, or any change with persisted state**: a **dry-run of the rollback step is mandatory**, not preferred. Record the exact rollback command and its captured exit code / stdout. If the migration tool offers no dry-run mode (`--dry-run`, `--plan`, equivalent), the executor MUST refuse to claim rollback verification and instead end the run with a routing recommendation back to `implementation-planning` for a safer rollback strategy. Skipping this step on a stateful change is treated as a `contract-violated` outcome by `final-verification`.
 - **Routing recommendation for `final-verification`**: brief note on whether the changes are ready for final-verification phase or need a new error-analysis / planning loop first.
-- **Follow-up tasks (Section 7 of the final report)**: every item discovered during this run that was *not* delivered MUST appear in the final report's `## 7. Follow-up Tasks (후속 작업)` table with a concrete `Origin`, `New Task ID`, `Suggested task-type`, `Scope`, and `Reason / Why deferred`. Sources include: out-of-scope discoveries that the executor consciously chose not to fold into this run, verifier concerns the executor declined to fix in-place, scope-boundary items from the approved plan that turned out to need their own ticket, and any unresolved `## 5. Clarification Items` row carried over from the approved plan (`Status` ∈ `{open, answered}` at approval time). An empty section is acceptable but only when expressed as the single line `- 후속 작업 없음.` — silence is treated as a contract violation. Rows with `Auto-spawn? = yes` will be materialised by `scripts/okstra-spawn-followups.py` in Phase 7; rows with `Auto-spawn? = no` MUST also appear in `Section 6. Recommended Next Steps` so the user knows to act manually.
+- **Follow-up tasks (Section 4 of the final report)**: every item discovered during this run that was *not* delivered MUST appear in the final report's `## 4. Follow-up Tasks (후속 작업)` table with a concrete `Origin`, `New Task ID`, `Suggested task-type`, `Scope`, and `Reason / Why deferred`. Sources include: out-of-scope discoveries that the executor consciously chose not to fold into this run, verifier concerns the executor declined to fix in-place, scope-boundary items from the approved plan that turned out to need their own ticket, and any unresolved `## 1. Clarification Items` row carried over from the approved plan (`Status` ∈ `{open, answered}` at approval time). An empty section is acceptable but only when expressed as the single line `- 후속 작업 없음.` — silence is treated as a contract violation. Rows with `Auto-spawn? = yes` will be materialised by `scripts/okstra-spawn-followups.py` in Phase 7; rows with `Auto-spawn? = no` MUST also appear in `Section 3. Recommended Next Steps` so the user knows to act manually.
 ## Self-review pass before finalising the report (`Claude lead` runs this; do not delegate to a generic subagent)

package/runtime/prompts/profiles/_implementation-executor.md CHANGED

@@ -27,7 +27,7 @@ until Phase 5 ends, then drop from active context for Phase 6/7.
   - Doc-only / config-only / pure-rename steps that have no observable runtime behaviour are exempt from the failing-test requirement, but the executor MUST cite the exemption per step in the final report (`TDD exemption: <reason>`).
   - When the touched area has no existing test harness, the executor MUST stand up the minimum harness needed to host one regression test for this run rather than skipping TDD entirely. Record the harness-bootstrap step as an `Out-of-plan edit` if it is not in the plan.
 - **DB / IO / SQL changes require real execution — mock-only is NOT validation evidence:** when this run's diff touches DB/IO/SQL (ORM / query-builder code — sequelize / typeorm / prisma / knex / raw SQL — `*.repository.*`, model/entity files, `migrations/**`, `*.sql`, or any changed query string), a mocked unit test cannot observe the SQL the query builder actually emits — a mocked suite once passed while `count({ col: 'FontFamily.fontFamily' })` threw `Unknown column` on the real DB. The executor MUST run the change against a real (or faithful-replica) datastore — the `db-test` validation step (plan `validation` db step, else `project.json.qaCommands.db-test`), targeting a **local / replica** DB — and cite its exact command + exit code in the final report's `Validation evidence`. If no real DB / `db-test` command is reachable, do NOT claim the change verified: label the DB portion `정적 분석상 …, 미검증(실행 안 함)` in the report, surface it in the routing recommendation, and never downplay the real run as "too heavy". `git push` stays forbidden (universal list); the unverified DB state is carried forward so `final-verification` cannot accept it and `release-handoff` cannot push.
-- re-read the approved plan end-to-end and parse the `## 4.5 Stage Map`. Read the **Stage batch** injected in the launch prompt (`Stage batch for this implementation run`): it lists the stage numbers this run owns, ascending. The runtime already selected and reserved this batch — do NOT recompute the start stage from `consumers.jsonl`.
+- re-read the approved plan end-to-end and parse the `## 5.5 Stage Map`. Read the **Stage batch** injected in the launch prompt (`Stage batch for this implementation run`): it lists the stage numbers this run owns, ascending. The runtime already selected and reserved this batch — do NOT recompute the start stage from `consumers.jsonl`.
   - for each stage in the batch, load every `runs/<plan-key>/carry/stage-<i>.json` for `i ∈ depends-on(stage)` and inject them into the executor's working context as "runtime carry-in". For `depends-on (none)` stages, no sidecar load — task-brief only.
   - the batch's stages are mutually independent (each one's `depends-on` are all already `status:done`, never another batch member), so execute them in ascending order; each stage's file list, step order, Stage Validation commands, Stage Exit Contract, and rollback path are the authoritative scope for that stage.
 - inspect the current state of every file the plan names; if any file has changed materially since the plan was written, stop and route to a new `implementation-planning` run instead of editing speculatively