npm - okstra - Versions diffs - 0.47.0 → 0.49.0 - Mend

okstra 0.47.0 → 0.49.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (32)

package/docs/superpowers/specs/2026-06-04-adversarial-implementation-planning-design.md ADDED

@@ -0,0 +1,90 @@
+# Extending adversarial verification to implementation-planning: design (sub-project A)
+- Written: 2026-06-04
+- Scope: make the two cross-verification rounds of the `implementation-planning` phase adversarial. (1) Phase 5.5 **finding convergence** *reuses as-is* the adversarial machinery already applied to `requirements-discovery` / `error-analysis`; (2) **plan-body verification** makes the verification *posture* adversarial (burden of proof on the plan, active tracing of cited paths) while the gate threshold **stays at majority**.
+- Out of scope
+  - No change to the plan-body gate threshold: the `majority-disagree` blocking rule, the four-value gate-result enum, and `validate_phase_boundary` in `validators/validate-run.py` are all unchanged. (The user explicitly rejected single-DISAGREE blocking.)
+  - The adversarial algorithm of finding convergence itself is not redefined. It is already defined in [`skills/okstra-convergence/SKILL.md`](../../../skills/okstra-convergence/SKILL.md) §"Adversarial Verification Mode", and implementation-planning inherits it just by turning the flag on.
+  - Making `final-verification` adversarial and the coverage critic worker belong to **sub-project B** and are not covered in this document.
+  - No new verdict kinds, classification values, or worker roles.
+- Relation: extends the adversarial convergence machinery built by the earlier work [`2026-06-04-adversarial-verification-design.md`](2026-06-04-adversarial-verification-design.md) to one more phase (`implementation-planning`). It reuses that document's mode branch (`convergence.adversarial`), verdict mapping, `disagreeBasis`, and aggregation rules.
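+## 1. Motivation
+`implementation-planning` is one of the points where a false consensus costs the most: a plan that passes here immediately becomes the execution target of the `implementation` run. Yet both verification rounds of this phase are currently cooperative:
+1. **finding convergence** is lightweight and cooperative because `convergence.adversarial=false`, so weak requirement-gap/risk claims remain `full-consensus` without active rebuttal ([`scripts/okstra_ctl/render.py:917`](../../../scripts/okstra_ctl/render.py)).
+2. **plan-body verification** asks "is this item broken?" but places the burden of proof on the rebutter, and `dissent-isolated` (only one DISAGREE) passes unchallenged ([`skills/okstra-convergence/SKILL.md`](../../../skills/okstra-convergence/SKILL.md) §"Plan-body verification mode"). There is little pressure to actively trace and verify broken plan items.
+→ Flip the verification posture of both rounds from "agree" to "attempt to refute", but keep the plan-body *gate arithmetic* at majority so that a single verifier's false positive does not block approval.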
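+## 2. Core design
+### 2.1 finding convergence: reuse the existing adversarial machinery
+Add `implementation-planning` to the `adversarial_phases` set used by `_build_convergence_block` in `scripts/okstra_ctl/render.py` ([`render.py:917`](../../../scripts/okstra_ctl/render.py)). With this one line, finding convergence for implementation-planning:
+- is injected with `config.adversarial=true` and `config.verificationMode="full-reanalysis"`, and
+- inherits *as-is* the adversarial prompt, aggregation (1 counter-evidence → downgrade, burden-not-met), `disagreeBasis`, and cited-evidence-only re-investigation of §"Adversarial Verification Mode".
+`maxRounds` stays as it is (implementation-planning=2). Adversarial full-reanalysis raises the per-round cost, but the re-investigation scope is limited to the evidence cited by each finding, so cost does not explode, and the high cost justification of the planning stage absorbs it.
+### 2.2 plan-body verification: adversarial posture, majority gate kept
+The plan-body verification round reads the same run's `config.adversarial` flag and applies the adversarial posture (there is no separate flag; a run's adversarial behaviour is a single switch).
+**What changes (prompt/posture):**
+- The verifier *directly opens* the paths/symbols/commands cited by each `P-*` item, checks that they exist and are executable, and tries to break the item. (The "no re-analysis of the original sources" rule in the current plan-body prompt is kept, but narrowly relaxed to allow *checking that the paths/commands cited by an item exist*.)
+- The burden of proof is on the plan: if a cited path/command/verification signal cannot be confirmed, the verifier responds with the corresponding `DISAGREE(<kind>)` (whichever of a–e applies).
+- plan-body remains lightweight-only (full-reanalysis is not forced); "checking that cited paths exist" is the scope of the adversarial re-investigation.
+**What does not change (gate arithmetic):**
+- The classification enum (`full-consensus | partial-consensus | dissent-isolated | majority-disagree | contested`) is unchanged.
+- The gate rule is unchanged: approval is blocked only on `majority-disagree` (a **majority** of analysers DISAGREE). `dissent-isolated` (one DISAGREE) is handled as partial-consensus as before and does not block.
+- The four gate-result values (`passed` / `passed-with-dissent` / `blocked-by-disagreement` / `aborted-non-result`) and `validate_phase_boundary` ([`validators/validate-run.py:1180`](../../../validators/validate-run.py)) are unchanged.
+In other words, making plan-body adversarial raises only the *quality* of verification (active tracing, burden of proof on the plan) and leaves the *decision threshold* as it is. This is a change at the LLM prompt-instruction layer and is not enforced at runtime (§4).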
+### 2.3 Consistency sweep (Rule 2: every place that states "which phases are adversarial")
+The `adversarial_phases` set is the behavioural SSOT. Update every documentation point that *enumerates* the adversarial phases from two entries to three:
+- the default description in the docstring at [`render.py:906`](../../../scripts/okstra_ctl/render.py)
+- the `adversarial` row of the Configuration table at [`skills/okstra-convergence/SKILL.md:49`](../../../skills/okstra-convergence/SKILL.md)
+- the premise sentence of §"Adversarial Verification Mode" in the same file (currently around line 198)
+- the `config.adversarial` bullet in the Schema rules of the same file ("default for requirements-discovery / error-analysis")
+- the Phase 5.5 sentence at [`prompts/profiles/_common-contract.md:17`](../../../prompts/profiles/_common-contract.md)
+- the `convergence.adversarial` knob row in [`agents/SKILL.md`](../../../agents/SKILL.md)
+The statement in the §"Scoped full-reanalysis" cost-rationale prose (currently lines 204 and 302) that "requirements-discovery and error-analysis carry the largest avoidable cost" is an *explanation of where the cost comes from*, so it is updated by adding implementation-planning in the same context.
+## 3. Files changed
+1. [`scripts/okstra_ctl/render.py`](../../../scripts/okstra_ctl/render.py) `_build_convergence_block`: add `implementation-planning` to the `adversarial_phases` set and update the docstring.
+2. [`tests/test_render_convergence_adversarial.py`](../../../tests/test_render_convergence_adversarial.py): **move** `implementation-planning` from the non-adversarial parameter list to the adversarial parameter list.
+3. [`tests/test_plan_body_verification.py`](../../../tests/test_plan_body_verification.py): update the expected values of `test_convergence_block_default_for_implementation_planning` to `adversarial: True`, `verificationMode: "full-reanalysis"`.
+4. [`skills/okstra-convergence/SKILL.md`](../../../skills/okstra-convergence/SKILL.md): (a) the three phase-list updates from §2.3, (b) add an "Adversarial plan-body posture" subsection under §"Plan-body verification mode" (posture/prompt change, with the majority gate explicitly kept).
+5. [`prompts/profiles/implementation-planning.md`](../../../prompts/profiles/implementation-planning.md): declare Phase 5.5 finding convergence adversarial and declare the adversarial plan-body posture in §4.5.9.
+6. [`prompts/profiles/_common-contract.md`](../../../prompts/profiles/_common-contract.md): update the Phase 5.5 sentence from §2.3.
+7. [`agents/SKILL.md`](../../../agents/SKILL.md): update the knob row from §2.3.
+8. [`CHANGES.md`](../../../CHANGES.md): a `사용자 영향:` (user impact) entry.
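+## 4. Enforcement: declared versus enforced
+- **machine-enforced (changed):** `render.py` injects `adversarial=true` + `full-reanalysis` for implementation-planning; the existing contract tests ([`tests/test_convergence_state_contract.py`](../../../tests/test_convergence_state_contract.py)) plus the render tests enforce the shape. This sub-project updates the expected values of the render tests and the plan-body block tests to reflect this.
+- **machine-enforced (unchanged):** the plan-body gate↔approval correspondence is enforced by `validate_phase_boundary` as before. The gate enum and classification enum are unchanged, so no validator code changes.
+- **prompt-only (not enforceable):** the adversarial rebuttal behaviour for findings, and the active path tracing and plan-side burden of proof for plan-body, are only prompt instructions to the lead/workers (LLMs), not runtime enforcement. They are induced solely through skill and profile declarations.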
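+## 5. Cost and risks
+- **Cost:** finding convergence goes from lightweight to full-reanalysis (scoped) plus adversarial, for at most 2 rounds. Limiting the re-investigation scope keeps cost from exploding. plan-body stays lightweight, so the added cost is roughly that of checking cited paths.
+- **Risk: false downgrade of plan findings.** Adversarial finding convergence may downgrade a true requirement-gap claim to `contested`. Mitigation: `contested` is not a rejection but a "disputed" classification that stays in the report, and counter-evidence rebuttals must cite `file:line`, so the reason is traceable (a property of the existing machinery).
+- **Risk: plan-body false positives.** Because the gate stays at majority, one verifier's false positive does not block approval (user decision). Even if active tracing produces more of them, blocking still requires majority agreement.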
+## 6. Acceptance criteria
+1. `adversarial: true` and `verificationMode: "full-reanalysis"` are injected into the manifest `convergence` block for `implementation-planning`. (Three adversarial phases: requirements-discovery / error-analysis / implementation-planning.)
+2. The convergence skill defines the adversarial plan-body posture and states that the gate stays at majority.
+3. Every documentation point that lists the adversarial phases (§2.3) agrees on three (grep consistency).
+4. `python3 -m pytest tests/` and `bash validators/validate-workflow.sh` pass.
+5. The implementation-planning profile declares Phase 5.5 as adversarial and declares the adversarial plan-body posture.
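
Reviewer note (not part of the package): the phase-aware defaults and the majority gate described in the spec above can be sketched as follows. This is a minimal illustration under stated assumptions, not the real `render.py` code: the function names `build_convergence_block` and `plan_body_blocks_approval` are hypothetical, and "majority" is assumed to mean strictly more than half of the analysers.

```python
# Hypothetical sketch of the phase-aware convergence defaults; the real
# _build_convergence_block in scripts/okstra_ctl/render.py may differ.
ADVERSARIAL_PHASES = frozenset(
    {"requirements-discovery", "error-analysis", "implementation-planning"}
)


def build_convergence_block(task_type: str) -> dict:
    """Return the convergence defaults for one task type."""
    adversarial = task_type in ADVERSARIAL_PHASES
    return {
        "enabled": True,
        # Phase-aware default: 1 round for requirements-discovery, 2 otherwise.
        "maxRounds": 1 if task_type == "requirements-discovery" else 2,
        "adversarial": adversarial,
        # Adversarial phases force full-reanalysis; others stay lightweight.
        "verificationMode": "full-reanalysis" if adversarial else "lightweight",
    }


def plan_body_blocks_approval(disagree_votes: int, analyser_count: int) -> bool:
    """Assumed gate arithmetic: block only when a strict majority disagrees."""
    return disagree_votes * 2 > analyser_count
```

With three analysers, a single DISAGREE leaves approval unblocked, while two block it, which matches the "majority gate kept" decision in §2.2.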

package/docs/superpowers/specs/2026-06-04-coverage-critic-design.md ADDED

@@ -0,0 +1,99 @@
+# Coverage critic: design (sub-project B1)
+- Written: 2026-06-04
+- Scope: add a **coverage critic pass** to the three phases `requirements-discovery` / `error-analysis` / `implementation-planning`, running right after Phase 5.5 convergence and right before the Phase 6 report-writer. The critic is dispatched by *reusing* an existing worker subagent ("report the angles/files/modalities nobody examined and the requirements that have no finding"), and the coverage gaps it reports go through **one round of adversarial reverify**; only the survivors are merged into the final findings/clarifications. Whether to use it and which backing provider to use are chosen in an **initial okstra-run select box** (recommendations plus "직접 입력", i.e. "enter manually"); if CLI parameters are passed, that selection step is skipped. The default is off (opt-in).
+- Out of scope
+  - The verdict devil's advocate for `final-verification` is **sub-project B2** and lies outside this document. (B2 reuses the critic primitive.)
+  - No new installable agent (`critic-worker.md` / `installed-agents.json`): the existing `claude-worker`/`codex-worker`/`gemini-worker` subagent_type values are reused with a critic prompt.
+  - No change to the convergence finding/plan-body classification or gate enum: critic gaps go through the existing adversarial finding classifier as-is.
+  - No change to the *number* of workers (roster size): the critic is a separate pass and does not grow the Phase 4 analyser roster.
+- Relation: reuses the adversarial convergence machinery ([`skills/okstra-convergence/SKILL.md`](../../../skills/okstra-convergence/SKILL.md) §"Adversarial Verification Mode") to verify critic gaps. The wizard select pattern follows the same picker convention as the existing `okstra-run` steps (`defaults_or_custom` and others).
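+## 1. Motivation: coverage gaps are structural
+By design, okstra's analyser workers answer the *same* section 1–5 questions; this is triangulation, not partition ([`skills/okstra-team-contract/SKILL.md:204`](../../../skills/okstra-team-contract/SKILL.md)). Adding workers therefore only yields more findings of the *same kind*, and "the angle nobody looked at" stays structurally empty. Consensus quality (b) was raised by adversarial verification, but coverage (c), meaning missed findings, needs a separate mechanism. The coverage critic takes the consolidated findings as input and focuses solely on "what is missing" to fill this gap.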
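+## 2. Core design
+### 2.1 critic pass primitive
+Right after Phase 5.5 convergence finishes and the findings are classified, and **before** the Phase 6 report-writer dispatch, the lead runs the critic pass once.
+- **Reused dispatch:** dispatch to the selected provider's existing subagent_type (`claude-worker` / `codex-worker` / `gemini-worker`) with a critic prompt. No new agent definition.
+  - Input: task brief + summary of the consolidated Phase 5.5 findings + read access to the codebase.
+  - Prompt outline: "The following findings have already been agreed. Find files/directories/execution paths that nobody inspected, requirements/acceptance criteria that have no finding at all, and claims that were raised but that nobody verified. Report each coverage gap as a new finding with supporting evidence (`file:line` or a requirement citation). Do not repeat an existing finding."
+  - Result file: `runs/<task-type>/worker-results/<provider>-critic-<task-type>-<seq>.md`.
+- **One round of adversarial reverify (decision on question 3):** the gaps reported by the critic are put on the verification queue as new findings with `originWorker=<provider>-critic`, and the **Phase 4 analysers** (excluding the critic itself) reverify them adversarially for one round (reusing the existing §"Adversarial Verification Mode" classifier: one evidence-based rebuttal → downgrade). Only gaps that survive as `full-consensus`/`partial-consensus` are merged into the final report as findings; downgraded gaps (`contested`/`worker-unique`) are treated as hallucinations and either discarded or recorded only as dissent.
+- **Model (decision on question 2):** the backing provider is chosen with `--critic <claude|codex|gemini>`, and the model is that provider's existing `--<provider>-model` value (mirroring the executor binding pattern). There is no separate critic model flag (YAGNI).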
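+### 2.2 Selection UX: wizard select step + CLI bypass
+Whether to use the critic, and with which provider, is not exclusively a run parameter; it is chosen in an **initial okstra-run select box**.
+- New wizard step `S_CRITIC_PICK`: `_build_*`/`_submit_*` in [`scripts/okstra_ctl/wizard.py`](../../../scripts/okstra_ctl/wizard.py) plus the [`prompts/wizard/prompts.ko.json`](../../../prompts/wizard/prompts.ko.json) SOT. Picker convention (1–2 recommendations, with "직접 입력", i.e. "enter manually", last):
+  - `critic 사용 안 함 (기본·추천)` ("do not use critic (default, recommended)") → disabled
+  - `claude critic (opus)` *(recommended)* → provider=claude, model=the claude model for that phase
+  - `직접 입력` ("enter manually") → the user specifies the provider (and optionally the model) directly
+  - (Depending on the current analyser roster, codex/gemini options may be added to the recommendations, but the convention of at most 2 recommendations + manual entry = 3 options is kept.)
+- **CLI takes precedence and skips the step:** if `--critic <provider>` or `--no-critic` is passed, `S_CRITIC_PICK` is **skipped**. The non-interactive `okstra.sh` / `node bin/okstra` paths use flags; the interactive `okstra-run` uses the select box. With no flag on a non-interactive path, the default is off.
+- All three entry points (okstra-run skill / okstra.sh / node CLI) converge on `prepare_task_bundle()`, so the critic selection is serialized into the manifest there (preserving a single reference point).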
+### 2.3 manifest `convergence.critic` block + render resolve
+[`scripts/okstra_ctl/render.py`](../../../scripts/okstra_ctl/render.py) emits a `critic` block under `convergence`:
+```json
+"critic": {
+  "enabled": false,
+  "provider": null,
+  "modelExecutionValue": null
+}
+```
+- `enabled`: `true` if the wizard selection or the `--critic` flag sets a provider, otherwise `false` (default).
+- `provider`: `claude` | `codex` | `gemini` | `null`.
+- `modelExecutionValue`: the selected provider's model (from that provider's `--<provider>-model` seed). `null` when `enabled=false`.
+At the end of Phase 5.5 the lead reads this block to decide whether to run the critic pass and against which provider.
+### 2.4 Applicable phases
+requirements-discovery / error-analysis / implementation-planning. (final-verification is B2.) All three phases produce findings, so a coverage critic is meaningful. It does not apply to release-handoff/implementation.
+## 3. Data model
+- **manifest:** the `convergence.critic` block from §2.3.
+- **convergence state artifact:** critic gaps enter `findings[]` with `originWorker: "<provider>-critic"` and use the existing rounds/votes/classification schema as-is. For tracing, a finding gets an optional field `source: "critic"` (absent means `null`, i.e. found by a regular worker). schemaVersion stays at `1.2` (an optional field is added and readers treat a missing value as null); no enum changes.
+- **convergence state `config`:** when the critic runs, a summary `{ provider, modelExecutionValue, gapsProposed, gapsMerged }` is recorded in `config.critic` (for auditing).
+## 4. Files changed
+1. [`scripts/okstra_ctl/render.py`](../../../scripts/okstra_ctl/render.py) `_build_convergence_block`: emit the `critic` block and resolve the `--critic`/wizard selection.
+2. [`scripts/okstra_ctl/run.py`](../../../scripts/okstra_ctl/run.py): `--critic <provider>` / `--no-critic` argparse, passing through ctx, and restriction to the three phases.
+3. [`scripts/okstra_ctl/wizard.py`](../../../scripts/okstra_ctl/wizard.py): `S_CRITIC_PICK` step `_build`/`_submit` and integration into the flow (shown only when no CLI flag is given).
+4. [`prompts/wizard/prompts.ko.json`](../../../prompts/wizard/prompts.ko.json) (synced with the English SOT, if one exists): `critic_pick` step label/options.
+5. [`skills/okstra-convergence/SKILL.md`](../../../skills/okstra-convergence/SKILL.md): new "Coverage critic pass" section (timing, prompt, one-round adversarial reverify, merge, `convergence.critic` schema).
+6. [`agents/SKILL.md`](../../../agents/SKILL.md): critic pass in the Phase 5.5→6 flow, the `PROGRESS: phase-5.5-critic provider=<p>` line, and (where applicable) model/knob references.
+7. [`prompts/profiles/requirements-discovery.md`](../../../prompts/profiles/requirements-discovery.md) / [`error-analysis.md`](../../../prompts/profiles/error-analysis.md) / [`implementation-planning.md`](../../../prompts/profiles/implementation-planning.md): a one-line coverage critic opt-in declaration.
+8. Tests: render `critic` block resolve (`--critic` / unspecified / `--no-critic`), wizard `S_CRITIC_PICK` build + submit, CLI bypass.
+9. [`CHANGES.md`](../../../CHANGES.md): a user-impact entry.
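+## 5. Enforcement: declared versus enforced
+- **machine-enforced:** the shape of the `convergence.critic` block and the render resolve (`--critic`/`--no-critic`/unspecified) → unit tests. The picker options and CLI bypass of wizard `S_CRITIC_PICK` → wizard tests.
+- **prompt-only (not enforceable):** whether the critic actually finds meaningful gaps, and whether the one-round adversarial reverify filters out hallucinations, depends only on prompt instructions to the lead/workers (LLMs) and is not enforced at runtime; it is induced through skill/profile declarations.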
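+## 6. Cost and risks
+- **Cost:** opt-in (default off). When enabled: 1 critic dispatch + 1 reverify round (one per analyser). Because the default is off, runs that do not select it incur no cost.
+- **Risk: hallucinated gaps.** The critic may report fake gaps. Mitigation: the one-round adversarial reverify downgrades gaps without evidence. Only surviving gaps become findings.
+- **Risk: duplicate findings.** The critic may restate existing findings. Mitigation: the prompt states "do not repeat an existing finding", and the semantic grouping in the reverify step absorbs duplicates.
+- **Risk: too few reverify voters.** Critic gaps are verified by the Phase 4 analysers minus the critic itself. The default roster has ≥2 analysers, so at least one can always vote. In an abnormal configuration with only one analyser, critic gaps are only surfaced as unverifiable (not merged), and that fact is recorded.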
+## 7. Acceptance criteria
+1. The wizard gains an `S_CRITIC_PICK` select box (recommendations + "직접 입력", i.e. "enter manually"), shown only on interactive runs where neither `--critic` nor `--no-critic` is given. It is skipped when a flag is given.
+2. The manifest `convergence.critic` resolves correctly from the wizard selection/flags (enabled/provider/model). Default off.
+3. The convergence skill defines the critic pass (timing, prompt, one-round adversarial reverify, merge).
+4. The profiles of the three applicable phases declare the coverage critic opt-in.
+5. `python3 -m pytest tests/` and `bash validators/validate-workflow.sh` pass.
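
Reviewer note (not part of the package): the resolve rules for the `convergence.critic` block in the spec above can be sketched roughly as below. The function `resolve_critic_block`, its parameters, and the `provider_models` mapping are hypothetical names introduced for illustration; the actual resolve logic in `render.py` is not shown in this diff.

```python
from typing import Optional

# Phases where the coverage critic applies, per the B1 spec above.
CRITIC_PHASES = frozenset(
    {"requirements-discovery", "error-analysis", "implementation-planning"}
)
PROVIDERS = ("claude", "codex", "gemini")


def resolve_critic_block(
    task_type: str,
    critic: Optional[str],
    no_critic: bool,
    provider_models: dict,
) -> dict:
    """Hypothetical resolve of the manifest `convergence.critic` block."""
    disabled = {"enabled": False, "provider": None, "modelExecutionValue": None}
    # Default off: explicit --no-critic, no selection, or a non-applicable phase.
    if no_critic or critic is None or task_type not in CRITIC_PHASES:
        return disabled
    if critic not in PROVIDERS:
        raise ValueError(f"unknown critic provider: {critic!r}")
    return {
        "enabled": True,
        "provider": critic,
        # The model comes from the provider's existing model seed (assumed dict).
        "modelExecutionValue": provider_models.get(critic),
    }
```

Keeping the disabled shape identical to the enabled one (same three keys) means a reader of the manifest never has to branch on missing fields.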

package/docs/superpowers/specs/2026-06-05-acceptance-critic-design.md ADDED

@@ -0,0 +1,90 @@
+# Acceptance devil's-advocate critic: design (sub-project B2)
+- Written: 2026-06-05
+- Scope: extend sub-project B1's critic dispatch primitive (reuse of existing workers + provider/model selection + opt-in) to `final-verification`. In final-verification the critic operates in **devil's-advocate** mode ("dig up the reasons this should not be accepted and the acceptance blockers that were missed"), and candidate blockers are verified by **confirm-or-downgrade** (confirmed → Acceptance Blocker, unconfirmed → Residual Risk, never dropped) and reflected in the verdict. The selection UX, `--critic`/`S_CRITIC_PICK`, and the `convergence.critic` block are reused from B1 as-is, and `final-verification` is added to the applicable phases.
+- Out of scope
+  - No new dispatch mechanism, no new installable agent, no new selection flag: the B1 primitive is reused in full.
+  - No change to the convergence finding/plan-body classification, gate enum, or verdict↔blocker validator: a confirmed critic blocker enters the *existing* Acceptance Blockers path, and the existing verdict rule (`accepted` ⇒ 0 blockers) works as before.
+  - No change to B1's coverage critic behaviour (3 finding phases, adversarial-drop verification): B2 only adds a final-verification-specific *mode*.
+- Relation: reuses the critic primitive, the `convergence.critic` block, and the `S_CRITIC_PICK` selection built by B1 [`2026-06-04-coverage-critic-design.md`](2026-06-04-coverage-critic-design.md). Output is wired into the Acceptance Blockers/Residual Risk/Verdict Token structure of the final-verification profile ([`prompts/profiles/final-verification.md`](../../../prompts/profiles/final-verification.md)).
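+## 1. Motivation
+final-verification is the last gate immediately before the acceptance decision. If a false consensus leaks through here, defects ship as-is. However, applying adversarial verification (sub-project 0) to final-verification unchanged is counterproductive: here the findings *are* defects/blockers, so adversarially rebutting them downgrades or drops "real defects that could not be reproduced" (lower defect sensitivity). The final-verification critic therefore must not rebut findings; it must be a **devil's advocate against the verdict**, an extra pass that "actively looks for reasons this work should not be accepted". This increases coverage (uncovering missed blockers) while *raising* defect sensitivity.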
+## 2. Core design
+### 2.1 Reuse the critic primitive + extend the applicable phases
+B1's critic dispatch (the selected provider's existing subagent + provider model, opt-in) is used as-is. Only the set of applicable phases changes:
+- Add `final-verification` to `critic_phases` in `_build_convergence_block` of [`scripts/okstra_ctl/render.py`](../../../scripts/okstra_ctl/render.py) → `convergence.critic.enabled` can be true for final-verification as well.
+- Add `final-verification` to the `applies` phase tuple of `S_CRITIC_PICK` in [`scripts/okstra_ctl/wizard.py`](../../../scripts/okstra_ctl/wizard.py) and to the phase conditions of the summary/confirmation.
+- The selection UX (`--critic <provider>` / okstra-run select box), the `convergence.critic {enabled,provider,modelExecutionValue}` block, and model resolution are **unchanged** (as in B1).
+### 2.2 critic behaviour branches by phase (the new part of B2)
+The convergence skill branches critic behaviour by phase:
+- `requirements-discovery` / `error-analysis` / `implementation-planning` → **coverage critic** (B1: "what is missing", adversarial-drop verification). Unchanged.
+- `final-verification` → **acceptance devil's-advocate critic** (new).
+**Devil's-advocate prompt (final-verification):**
+```
+You are the acceptance devil's advocate for <task-key>. The delivered work is about
+to be judged for acceptance. Your ONLY job is to find reasons it should NOT be
+accepted — surface candidate acceptance BLOCKERS the verifiers may have missed:
+- requirements / acceptance points with no covering evidence,
+- DB / IO / SQL changes lacking real-execution evidence,
+- regressions or broken error paths,
+- scope/contract violations.
+For each, emit a candidate blocker with a one-line statement, evidence (file:line /
+log / test output), and a severity (critical / major / minor). Do NOT restate an
+existing Acceptance Blocker. If you find none, say so explicitly.
+```
+**Verification = confirm-or-downgrade (differs from B1's adversarial-drop, BLOCKING):**
+Each candidate blocker is verified by the Phase 4 analysers (excluding the critic).
+- **Confirmed** (reproduced, or evidence successfully cited) → promoted to a row in `## 4 Acceptance Blockers` (severity kept, follow-up phase included).
+- **Unconfirmed** (not reproducible, or weak evidence) → **downgraded to Residual Risk (never dropped)**; it stays as a tracked item and its trigger is recorded.
+- Applying the adversarial finding classifier's "reject when uncertain" rule here is prohibited (it suppresses real defects).
+**Output and verdict wiring:**
+- When a confirmed candidate enters Acceptance Blockers, `accepted` requires 0 blockers ([`final-verification.md:32`](../../../prompts/profiles/final-verification.md)), so the verdict is automatically pushed to `conditional-accept` / `blocked`. This is the purpose of the devil's advocate.
+- Unconfirmed candidates go to Residual Risk: they do not block the verdict but are tracked.
+- The existing verdict↔blocker consistency validator (`validators/validate-run.py` `_validate_final_verification_consistency`) enforces this as before. No new enum/validator.
+### 2.3 Recording critic results in state
+- Candidate blockers from the critic are recorded in `runs/final-verification/worker-results/<provider>-critic-final-verification-<seq>.md`.
+- `mode: "acceptance-devils-advocate"`, `candidatesProposed`, `confirmedBlockers`, and `downgradedToResidual` are recorded in the `config.critic` summary (defined in B1) of the convergence state artifact (optional v1.2 fields; readers treat missing values as null). No enum changes.
+## 3. Files changed
+1. [`scripts/okstra_ctl/render.py`](../../../scripts/okstra_ctl/render.py): add `final-verification` to `critic_phases`.
+2. [`tests/test_render_critic_block.py`](../../../tests/test_render_critic_block.py): move `final-verification` from the non-applicable to the applicable parameters (verify `enabled=True`).
+3. [`scripts/okstra_ctl/wizard.py`](../../../scripts/okstra_ctl/wizard.py): add `final-verification` to `S_CRITIC_PICK.applies` and to the summary/confirmation phase conditions.
+4. [`tests/test_wizard_critic_pick.py`](../../../tests/test_wizard_critic_pick.py): flip `final-verification` from "skipped" to "applies".
+5. [`skills/okstra-convergence/SKILL.md`](../../../skills/okstra-convergence/SKILL.md): add phase branching to the "Coverage critic pass" section, plus an "Acceptance devil's-advocate critic (final-verification)" sub-mode (prompt, confirm-or-downgrade, blocker/residual-risk output, state summary).
+6. [`prompts/profiles/final-verification.md`](../../../prompts/profiles/final-verification.md): declare the devil's-advocate critic opt-in, that its output goes into Acceptance Blockers/Residual Risk, and its effect on the verdict.
+7. [`agents/SKILL.md`](../../../agents/SKILL.md): reflect that the critic pass (Phase 5.6) also applies to final-verification (phase row/PROGRESS comment).
+8. (Optional) [`prompts/wizard/prompts.ko.json`](../../../prompts/wizard/prompts.ko.json): generalize the `critic_pick` label to be phase-neutral (e.g. "추가 critic 패스(놓친 finding/blocker 발굴) — opt-in", roughly "additional critic pass (uncover missed findings/blockers), opt-in").
+9. [`CHANGES.md`](../../../CHANGES.md): a user-impact entry.
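+## 4. Enforcement: declared versus enforced
+- **machine-enforced:** render `critic_phases` includes final-verification and the wizard `applies` is extended → unit/wizard tests. Verdict consistency (`accepted` ⇒ 0 blockers) when a confirmed critic blocker enters Acceptance Blockers is enforced as before by the *existing* `_validate_final_verification_consistency`.
+- **prompt-only (not enforceable):** whether the devil's advocate actually finds meaningful candidates, and whether confirm-or-downgrade classifies them correctly, depends on prompt instructions to the lead/workers (LLMs); it is induced through skill/profile declarations.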
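+## 5. Cost and risks
+- **Cost:** opt-in (default off, same as B1). When enabled: 1 critic dispatch + 1 candidate-verification round (one per analyser). A final-verification run that does not select it incurs no cost.
+- **Risk: candidate explosion.** The critic may report many weak candidates. Mitigation: confirm-or-downgrade downgrades unconfirmed candidates to Residual Risk, so only *confirmed* blockers block the verdict. Severity and evidence are mandatory.
+- **Risk: the goal is to prevent false passes (suppression), yet confirm-or-downgrade downgrades unconfirmed candidates.** Unconfirmed candidates are not dropped but kept as Residual Risk, so traceability is preserved. The bar for "confirmed" is reproduction or cited evidence, and high-severity candidates whose reproduction is uncertain are recorded as an escalation trigger in Residual Risk so that the user can decide.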
+## 6. Acceptance criteria
+1. The manifest `convergence.critic` for `final-verification` resolves from `--critic`/the wizard selection and can be `enabled=true` (B1's 3 phases + final-verification = 4 applicable phases).
+2. okstra-run `S_CRITIC_PICK` is also shown for final-verification.
+3. The convergence skill defines the devil's-advocate mode for final-verification (prompt, confirm-or-downgrade, blocker/residual-risk output) and clearly distinguishes it from the B1 coverage mode.
+4. The final-verification profile declares the critic opt-in and its effect on the verdict.
+5. `python3 -m pytest tests/` and `bash validators/validate-workflow.sh` pass.
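
Reviewer note (not part of the package): the confirm-or-downgrade rule in the spec above is enforced only through prompts, but its bookkeeping can be sketched in code. Everything here is an assumption for illustration: the `Candidate` dataclass, `confirm_or_downgrade`, and `allowed_verdicts` are hypothetical names, and only the verdict tokens `accepted` / `conditional-accept` / `blocked` come from the spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate acceptance blocker proposed by the devil's-advocate critic."""
    statement: str
    severity: str  # "critical" | "major" | "minor"
    confirmed: bool  # True when analysers reproduced it or cited evidence


def confirm_or_downgrade(candidates):
    """Split candidates into blockers and residual risks; never drop any."""
    blockers = [c for c in candidates if c.confirmed]
    residual_risks = [c for c in candidates if not c.confirmed]
    return blockers, residual_risks


def allowed_verdicts(blockers):
    """`accepted` requires zero Acceptance Blockers (existing verdict rule)."""
    if blockers:
        return {"conditional-accept", "blocked"}
    return {"accepted", "conditional-accept", "blocked"}
```

The invariant worth checking is that the two output lists together always contain every input candidate, which is what distinguishes this mode from B1's adversarial-drop verification.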

package/docs/superpowers/specs/2026-06-05-compact-markdown-report-tables-design.md ADDED

@@ -0,0 +1,87 @@
+# Making the tables of the canonical final-report `.md` compact (option X): design
+- Written: 2026-06-05
+- Scope: restructure the narrative tables of the canonical final-report `.md` (§1 Summary, §1.1 Consensus, §1.2 Differences, §3.1 Primary Evidence, §3.2 Secondary, §4 Risks, Execution Status by Agent) into **a meta column that stacks the short code columns into one cell with `<br>`, with the long prose columns kept as separate columns**, so that the key body text (summary, evidence, disagreement, risk) reads wide even in a markdown editor. This changes only the jinja layout of [`templates/reports/final-report.template.md`](../../../templates/reports/final-report.template.md); the `data.json` schema and the report-writer contract are unchanged.
+- Out of scope
+  - **§5 Clarification Items stays a flat 8-column table**: [`scripts/okstra_ctl/clarification_items.py`](../../../scripts/okstra_ctl/clarification_items.py) parses it as `|`-delimited 8 columns for the `--resume-clarification` carry-in, and the validator enforces the 8-column schema as BLOCKING. Compacting §5 is handled by the existing grouping in the HTML view (already corrected, including Expected form wide).
+  - No change to the `data.json` schema, the report-writer worker's output contract, or the convergence state (the template merely lays out the same fields differently).
+  - The implementation-planning §4.5 deliverable tables (Stage Map / Stepwise, etc.) are not targets: the validator does substring checks on their columns/headings, so they are left untouched.
+- Relation: the immediately preceding work (branch `fix/report-table-grouping`, commit `2343e30`) made §1/§3/§4 grouped **only in the HTML view**. This design (X) makes §1/§3/§4 compact in the `.md` itself and thereby **replaces** that HTML-only approach: these tables no longer have a separate `Ticket ID` column, so the generic grouping branch in report_views never fires, and that branch is cleaned up. The §5 grouping (+ Expected form wide) is kept.
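+## 1. Motivation
+The user reads the final-report **as a `.md` file in a markdown editor.** Markdown tables have no colspan, so when short code columns such as `ID·Ticket ID·Source·Kind·Status` each take a column of their own, the long prose columns (summary, Statement, Evidence, Disagreement, Item, Risk) become narrow and collapse vertically to one character per line (measured: the "한 줄 요약" (one-line summary) column of §1 Summary renders at 1 character per line). The self-contained HTML view solves this with grouping, but that does not reach users who read the `.md`. The `.md` itself therefore gathers the short columns into one cell with `<br>`, reducing the column count and giving the prose room.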
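+## 2. Core design
+### 2.1 Target tables and the meta/wide split
+Each table is restructured as `[meta column] + [prose columns]` (meta = short code fields stacked with `<br>`):
+| Table | meta column (stacked in one cell with `<br>`) | separate prose columns |
+|---|---|---|
+| §1 Summary | ID, Ticket ID, Source (출처) | One-line summary (한 줄 요약) |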
+| §1.1 Consensus | ID, Ticket ID, Source items | Statement, Evidence |
+| §1.2 Differences | ID, Ticket ID, Workers(position+item) | Disagreement, Evidence |
+| §3.1 Primary Evidence | ID, Ticket ID, Source items, Source(path:line) | Evidence |
+| §3.2 Secondary | ID, Ticket ID | Hypothesis or supporting evidence, Source / confidence |
+| §4 Risks | ID, Ticket ID | Item, Risk if ignored, Mitigation Owner |
+| Execution Status | Agent, Role, Model, Status, raw/billable tokens, cost, Duration | Summary of Key Findings |
+The empty-state branches are kept as they are (`emptyState.*`).
+### 2.2 meta cell format (`<br>` stack, i18n)
+A meta cell consists of a headline (the primary identifier) followed by `label: value` pairs joined with `<br>`. Example of one §1.1 row:
+```
+**C-1**<br>{ticketLabel}: `DEV-9184`<br>{sourceLabel}: claude:F-001, codex:1.1
+```
+- The headline is the ID (or Agent/Step) in `**bold**`.
+- Labels reuse existing i18n keys (`columns.ticketId`, `columns.source`, …); where none exists a new key is added (e.g. `columns.recordMeta` = the meta column header "항목"/"Record").
+- Code-like values (ticket, etc.) keep their backticks as before.
+- The meta column header is a short label (e.g. "항목" / "Record"), added as a new i18n key.
+### 2.3 report_views alignment
+1. **`_inline` preserves `<br>`**: currently `html.escape` breaks `<br>` into `&lt;br&gt;`. After escaping, restore `&lt;br&gt;` / `&lt;br/&gt;` / `&lt;br /&gt;` to `<br>` (the same pattern as the bold/code/link restoration). With this the HTML view also shows meta cells with line breaks.
+2. **Remove the generic Ticket-ID grouping branch**: §1/§3/§4 no longer have a standalone `Ticket ID` column, so that branch never fires; the generic branch added by commit `2343e30` (plus related helpers and tests) is cleaned up. Execution Status is also merged in the `.md`, so its explicit branch is removed. **Only the §5 Clarification grouping (+ Expected form wide) remains** (§5 stays flat in the `.md` → grouped in HTML).
+3. **Reinforce plain-table widths**: check the column-width handling of the plain path so that the merged meta column comes out narrow and the prose columns wide (if needed, narrow on the meta column and min-width on prose).
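+### 2.4 Contract impact
+- `data.json`, the report-writer contract, and the convergence state: **unchanged**. The template lays out the same data differently.
+- The §5 parser (`clarification_items.py`) and the §5 8-column validator: **unchanged** (§5 stays flat).
+- No validator/test does substring checks on the §1/§3/§4 column headers (confirmed); this is re-checked at implementation time with grep + the full test suite.
+- The §1/§3/§4 sections of a `.md` carried into the next run are only read by the LLM as context and are not column-parsed by code, so the `<br>` stack is safe there too.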
+## 3. Files changed
+1. [`templates/reports/final-report.template.md`](../../../templates/reports/final-report.template.md): rewrite the §1/§1.1/§1.2/§3.1/§3.2/§4 and Execution Status tables in meta (`<br>` stack) + prose form.
+2. [`templates/reports/report.i18n.*`](../../../templates/reports/) (or the i18n SOT): add the meta column header and any required label keys (ko/en).
+3. [`scripts/okstra_ctl/report_views.py`](../../../scripts/okstra_ctl/report_views.py): `_inline` preserves `<br>`; remove the generic Ticket-ID grouping branch and the Execution Status branch (keep the §5 grouping); reinforce plain widths.
+4. [`tests/test_report_views.py`](../../../tests/test_report_views.py): a `<br>` preservation test; that §1/§3/§4 no longer take the grouped branch and that `<br>` is preserved; that §5 is still grouped + Expected form wide.
+5. (If needed) [`tests/test_render_*`](../../../tests/): snapshot/smoke tests of the table structure in the rendered template output.
+6. [`CHANGES.md`](../../../CHANGES.md): a user-impact entry.
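+## 4. Enforcement / verification
+- Unit: `test_report_views.py` locks in `_inline` `<br>` preservation and the retained §5 grouping.
+- Template render: render with a real `data.json` (or a fixture) and check that the §1/§3/§4/Exec tables come out in meta+prose form and that §5 is a flat 8-column table.
+- **Real re-render verification (BLOCKING)**: render an existing user report (or a fixture) and confirm by execution that both the `.md` and the HTML view come out compact and that the §5 carry-in parser still parses 8 columns.
+- `python3 -m pytest tests/`, `bash validators/validate-workflow.sh`, and `npm run build` pass.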
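+## 5. Trade-offs / risks
+- **Trade-off:** the HTML view of §1/§3/§4 takes the `<br>` stack form instead of the grp-meta "key: value" polish (slightly less decorated). In return the `.md`↔HTML layout is consistent and the `.md` itself is easy to read in any editor (the user's goal).
+- **Risk: impact on every task type.** The template is shared, so the structure of the final-report `.md` changes for every phase. Only §5 is machine-parsed, so this is safe, but regressions are checked at implementation time with the full test suite + a real render.
+- **Risk: editors without `<br>` support.** A few renderers may show `<br>` in a table cell literally. The mainstream ones (GitHub/Obsidian/Typora/VS Code) support it. Accepted given the goal of a readable canonical file.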
+## 6. Acceptance criteria
+1. §1/§1.1/§1.2/§3.1/§3.2/§4 and Execution Status in the final-report `.md` render as a meta (`<br>` stack) column + prose columns.
+2. §5 Clarification stays a flat 8-column table and passes the carry-in parser and validator.
+3. The HTML view shows `<br>` in meta cells as line breaks (`_inline` preservation).
+4. The generic grouping for §1/§3/§4 and the Execution Status branch in report_views are removed, and the §5 grouping (+ Expected form wide) is kept.
+5. `python3 -m pytest tests/` + validator + build pass, plus a visual check of a real re-render.
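
Reviewer note (not part of the package): the `<br>` handling described in §2.2 and §2.3 of the spec above can be sketched as below. This is a minimal illustration, assuming the escape-then-restore approach the spec describes; `meta_cell` and `inline_preserving_br` are hypothetical names, and the real `_inline` in `report_views.py` also restores bold/code/link markup, which is omitted here.

```python
import html
import re

# Matches the escaped forms of <br>, <br/> and <br /> produced by html.escape.
_ESCAPED_BR = re.compile(r"&lt;br\s*/?&gt;")


def meta_cell(headline: str, pairs) -> str:
    """Build a markdown meta cell: bold headline, then `label: value` lines."""
    return f"**{headline}**" + "".join(
        f"<br>{label}: {value}" for label, value in pairs
    )


def inline_preserving_br(text: str) -> str:
    """Escape HTML, then restore only the <br> line breaks."""
    return _ESCAPED_BR.sub("<br>", html.escape(text))
```

Escaping first and restoring afterwards keeps every other tag inert, so a stray `<script>` in report data still renders as text while the stacked meta lines break as intended.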

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "okstra",
-  "version": "0.47.0",
+  "version": "0.49.0",
   "description": "Multi-agent cross-verification orchestrator runtime + Claude Code skills.",
   "license": "MIT",
   "author": "devonshin",

package/runtime/BUILD.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "package": "0.47.0",
-  "builtAt": "2026-06-04T12:46:31.759Z",
+  "package": "0.49.0",
+  "builtAt": "2026-06-05T09:11:13.878Z",
   "repoRoot": "/home/runner/work/okstra/okstra"
 }

package/runtime/agents/SKILL.md CHANGED

@@ -42,6 +42,7 @@ This SKILL.md is the operating contract and phase index. Detailed procedures liv
 | 4. Execution | Spawn analysis workers (Teams preferred) | `okstra-team-contract` |
 | 5. Fallback | Sequential/background dispatch when Teams unavailable | `okstra-team-contract` |
 | 5.5 Convergence | Cross-verify findings across workers | `okstra-convergence` |
+| 5.6 Critic pass | (opt-in) reused-worker critic pass: coverage gaps (discovery/error-analysis/impl-planning) or acceptance devil's-advocate (final-verification), each verified one round | `okstra-convergence` "Coverage critic pass" / "Acceptance critic pass" |
 | 6. Synthesis | Dispatch Report writer worker, review draft. **For `implementation-planning`: then run the Phase 6 plan-body verification sub-step (see Phase 6 section below).** | `okstra-report-writer` + `okstra-convergence` (sub-step) |
 | 7. Persist | Run token-usage collector, update manifests, then disband the worker team (shutdown teammates + `TeamDelete`, after collection) | `okstra-report-writer` + `_common-contract.md` "Run-end team teardown" |
@@ -92,6 +93,7 @@ Required checkpoints:
 - `PROGRESS: phase-4-dispatch worker=<role> model=<model>` — once per worker, immediately before the `Agent` / wrapper call.
 - `PROGRESS: phase-5-collect worker=<role> status=<terminal-status>` — once per worker, immediately after the result file is verified.
 - `PROGRESS: phase-5.5-convergence round=<N> queue=<count>` — at the start of each convergence round (Phase 5.5).
+- `PROGRESS: phase-5.6-critic provider=<provider> gaps=<n>` — when the coverage critic pass runs (Phase 5.6, opt-in). Omitted when `convergence.critic.enabled == false`.
 - `PROGRESS: phase-6-synthesis dispatching report-writer-worker` — at the start of Phase 6.
 - `PROGRESS: phase-7-persist updating manifests` — at the start of Phase 7.
 - `PROGRESS: phase-7-teardown disbanding team` — after token-usage collection, immediately before shutting down worker teammates + `TeamDelete` (Teams mode only; see `_common-contract.md` "Run-end team teardown"). Skipped in the no-`team_name` fallback.
@@ -251,7 +253,7 @@ Convergence is enabled by default. Configure via task-manifest.json:
 - `convergence.enabled`: true/false (default: true)
 - `convergence.maxRounds`: 1–3 — **phase-aware default**: `1` for `requirements-discovery`, `2` for all other task types
 - `convergence.verificationMode`: `"lightweight"` | `"full-reanalysis"` (default: `"lightweight"`; the adversarial phases below force `"full-reanalysis"`)
-- `convergence.adversarial`: true/false — **phase-aware default**: `true` for `requirements-discovery` / `error-analysis`, `false` otherwise. When `true`, Phase 5.5 runs in adversarial mode (verifiers refute findings; burden of proof on the claim). See [okstra-convergence](./skills/okstra-convergence/SKILL.md) "Adversarial Verification Mode".
+- `convergence.adversarial`: true/false — **phase-aware default**: `true` for `requirements-discovery` / `error-analysis` / `implementation-planning`, `false` otherwise. When `true`, Phase 5.5 runs in adversarial mode (verifiers refute findings; burden of proof on the claim). See [okstra-convergence](./skills/okstra-convergence/SKILL.md) "Adversarial Verification Mode".
 When `task-manifest.json` does not set `convergence.maxRounds`, lead MUST resolve the effective value via the phase-aware default above before entering Phase 5.5, and record the resolved value in the convergence state artifact at `config.effectiveMaxRounds`.
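
The phase-aware defaults in this hunk can be sketched in a few lines of bash. This is only an illustration of the rule as the updated (`+`) line states it, not okstra's actual resolver: the function names `effective_max_rounds` and `adversarial_default` are hypothetical, and the sketch assumes the caller has already read the task type and any configured `convergence.maxRounds` out of `task-manifest.json`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; these names are NOT part of okstra.

# Print the effective maxRounds: the configured value when one is given,
# otherwise the phase-aware default (1 for requirements-discovery, 2 for
# every other task type).
effective_max_rounds() {
  local task_type="$1"
  local configured="${2:-}"
  if [ -n "$configured" ]; then
    printf '%s\n' "$configured"
    return 0
  fi
  case "$task_type" in
    requirements-discovery) printf '1\n' ;;
    *)                      printf '2\n' ;;
  esac
}

# Print the phase-aware default for convergence.adversarial as of 0.49.0:
# true for the three phases listed in the updated line, false otherwise.
adversarial_default() {
  case "$1" in
    requirements-discovery|error-analysis|implementation-planning)
      printf 'true\n' ;;
    *)
      printf 'false\n' ;;
  esac
}
```

Under these assumptions, `effective_max_rounds implementation-planning` prints `2`, and the value printed is what the lead would record at `config.effectiveMaxRounds`.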

package/runtime/bin/lib/okstra/tmux-pane.sh ADDED

@@ -0,0 +1,40 @@
+#!/usr/bin/env bash
+# Resolve the tmux pane that the CURRENT process actually runs in.
+#
+# Why this exists: Claude Code's Bash tool strips $TMUX and $TMUX_PANE, so a
+# bare `tmux display-message -p '#{pane_id}'` does NOT return the caller's pane
+# — it returns the pane of the most-recently-active tmux *client*, which (when
+# the user has several attached sessions) is frequently a DIFFERENT session
+# than the one the okstra run lives in. Earlier trace-pane fixes all trusted
+# `display-message` and therefore mis-placed (or dropped) the tail pane.
+#
+# This resolver instead walks the process's own ancestor PIDs and matches them
+# against the tmux server's pane_pids. That is deterministic and correct
+# regardless of $TMUX/$TMUX_PANE or which client is active: when the process is
+# a descendant of a tmux pane's shell it finds exactly that pane; when it is not
+# inside any tmux pane (e.g. Claude launched from the macOS GUI app) no ancestor
+# matches and the function prints nothing.
+#
+# Usage: pane="$(okstra_resolve_caller_pane)"   # empty => not in a tmux pane
+# Optional arg: a starting PID (defaults to $$) — used by the regression test.
+# bash 3.2 safe (no associative arrays).
+okstra_resolve_caller_pane() {
+  command -v tmux >/dev/null 2>&1 || return 0
+  local panes
+  panes="$(tmux list-panes -a -F '#{pane_pid} #{pane_id}' 2>/dev/null)" || return 0
+  [ -n "$panes" ] || return 0
+  local pid="${1:-$$}"
+  local depth=0
+  local hit
+  while [ -n "$pid" ] && [ "$pid" != "0" ] && [ "$depth" -lt 16 ]; do
+    hit="$(printf '%s\n' "$panes" | awk -v p="$pid" '$1==p {print $2; exit}')"
+    if [ -n "$hit" ]; then
+      printf '%s\n' "$hit"
+      return 0
+    fi
+    pid="$(ps -o ppid= -p "$pid" 2>/dev/null | tr -d ' ')"
+    depth=$((depth + 1))
+  done
+  return 0
+}
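
The matching step of the resolver above can be tried without a tmux server. The sketch below is a simplified stand-in, not the shipped function: the pane listing is canned instead of coming from `tmux list-panes`, the ancestor chain is passed in as arguments instead of being walked with `ps`, and the names `lookup_pane` and `resolve_from_chain` are hypothetical.

```shell
#!/usr/bin/env bash
# Illustration only: canned data replaces the live tmux server and `ps`.

# Stand-in for `tmux list-panes -a -F '#{pane_pid} #{pane_id}'` output.
panes='4101 %0
4388 %3
5120 %7'

# Hypothetical helper: print the pane id whose pane_pid equals $1, if any.
lookup_pane() {
  printf '%s\n' "$panes" | awk -v p="$1" '$1==p {print $2; exit}'
}

# Hypothetical helper: walk a canned ancestor chain (child PID first) and
# print the first pane id that matches. Prints nothing when no ancestor is
# a pane's shell, mirroring the "not inside a tmux pane" case.
resolve_from_chain() {
  local pid hit
  for pid in "$@"; do
    hit="$(lookup_pane "$pid")"
    if [ -n "$hit" ]; then
      printf '%s\n' "$hit"
      return 0
    fi
  done
  return 0
}
```

With this canned data, `resolve_from_chain 9001 8700 4388 1` prints `%3`, because PID 4388 is the first ancestor that owns a pane, while a chain with no matching PID prints nothing. The result does not depend on which tmux client was most recently active, which is the property the header comment relies on.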

package/runtime/bin/okstra-codex-exec.sh CHANGED

@@ -183,36 +183,32 @@ status_path="${prompt_path%.md}.status.json"
 [[ "$status_path" == "$prompt_path" ]] && status_path="${prompt_path}.status.json"
 started_ts=$(date +%s)
 script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)"
+# Trace-pane caller resolution helper (okstra_resolve_caller_pane). The lib dir
+# is a bin-sibling in both repo (scripts/lib/...) and installed
+# (~/.okstra/bin/lib/...) layouts; degrade silently if absent.
+[ -r "$script_dir/lib/okstra/tmux-pane.sh" ] && . "$script_dir/lib/okstra/tmux-pane.sh"
 python3 "$script_dir/okstra-wrapper-status.py" \
   init "$status_path" "$(basename "$0")" "$role" "$$" "$started_ts" "$log_path" \
   >>"$log_path" 2>&1 || true
 # Derive the okstra run dir from the prompt path. paths.py is the SSOT:
 # dispatched prompts live at `<RUN_DIR>/prompts/<cli>-worker-prompt<NNN>.md`,
-# so the run dir is two levels up. Used to (a) read the lead pane the lead
-# recorded in its own foreground pane and (b) tag the trace pane so cleanup
-# can find exactly this run's panes without any tmux env var. Empty if the
+# so the run dir is two levels up. Used to tag the trace pane so cleanup can
+# find exactly this run's panes without any tmux env var. Empty if the
 # derivation fails — every dependent step below then degrades to a no-op.
 run_dir="$(cd "$(dirname "$prompt_path")/.." 2>/dev/null && pwd -P || true)"
-lead_pane_file="${run_dir:+$run_dir/state/lead-pane.id}"
-# Resolve the pane to anchor the trace split to. Claude Code's Bash tool now
-# strips BOTH `$TMUX` and `$TMUX_PANE`, and this wrapper frequently runs
-# backgrounded — so the bare active-pane probe can land on whatever pane the
-# user happens to be looking at now, not Claude's. Prefer the lead pane the
-# lead captured ONCE in its own foreground pane (reliable, see
-# `_common-contract.md`); fall back to `$TMUX_PANE`, then the active-pane
-# probe. A stale recorded id (pane since closed) is rejected via a liveness
-# check so we never anchor the split to a dead pane.
-caller_pane="${TMUX_PANE:-}"
-if [[ -z "$caller_pane" && -n "$lead_pane_file" && -r "$lead_pane_file" ]]; then
-  cand="$(head -n1 "$lead_pane_file" 2>/dev/null || true)"
-  if [[ -n "$cand" ]] && tmux display-message -p -t "$cand" '#{pane_id}' >/dev/null 2>&1; then
-    caller_pane="$cand"
-  fi
-fi
-if [[ -z "$caller_pane" ]]; then
-  caller_pane=$(tmux display-message -p '#{pane_id}' 2>/dev/null || true)
+# Resolve the pane THIS wrapper actually runs in by walking our ancestor PIDs
+# and matching tmux pane_pids (see lib/okstra/tmux-pane.sh). Reliable
+# regardless of $TMUX/$TMUX_PANE (stripped by Claude Code's Bash tool) and of
+# which tmux client is currently active — a bare `tmux display-message` would
+# instead return the most-recently-active client's pane, frequently a DIFFERENT
+# session than the okstra run, which is why earlier approaches mis-placed or
+# dropped the trace pane. Empty = not inside a tmux pane (e.g. Claude launched
+# from the GUI app) → the trace split below is skipped.
+caller_pane=""
+if type okstra_resolve_caller_pane >/dev/null 2>&1; then
+  caller_pane="$(okstra_resolve_caller_pane)"
 fi
 # Pane titles: worker (caller) pane gets `codex-<role>-<pid>`; the sibling

package/runtime/bin/okstra-gemini-exec.sh CHANGED

@@ -132,28 +132,25 @@ status_path="${prompt_path%.md}.status.json"
 [[ "$status_path" == "$prompt_path" ]] && status_path="${prompt_path}.status.json"
 started_ts=$(date +%s)
 script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)"
+# Trace-pane caller resolution helper (okstra_resolve_caller_pane). The lib dir
+# is a bin-sibling in both repo (scripts/lib/...) and installed
+# (~/.okstra/bin/lib/...) layouts; degrade silently if absent.
+[ -r "$script_dir/lib/okstra/tmux-pane.sh" ] && . "$script_dir/lib/okstra/tmux-pane.sh"
 python3 "$script_dir/okstra-wrapper-status.py" \
   init "$status_path" "$(basename "$0")" "$role" "$$" "$started_ts" "$log_path" \
   >>"$log_path" 2>&1 || true
 # Resolve the run dir and the trace-split anchor pane. See
-# `okstra-codex-exec.sh` for the full rationale — kept in lock-step: derive
-# `<RUN_DIR>` from the prompt path (paths.py SSOT) to read the lead-recorded
-# pane and to tag the trace pane; prefer that lead pane over the unreliable
-# active-pane probe (this wrapper runs backgrounded and `$TMUX`/`$TMUX_PANE`
-# are stripped).
+# `okstra-codex-exec.sh` / `lib/okstra/tmux-pane.sh` for the full rationale —
+# kept in lock-step: derive `<RUN_DIR>` from the prompt path (paths.py SSOT) to
+# tag the trace pane, and resolve the caller pane by walking our ancestor PIDs
+# against tmux pane_pids (reliable even though `$TMUX`/`$TMUX_PANE` are stripped
+# and the wrapper runs backgrounded). Empty = not inside a tmux pane → skip.
 run_dir="$(cd "$(dirname "$prompt_path")/.." 2>/dev/null && pwd -P || true)"
-lead_pane_file="${run_dir:+$run_dir/state/lead-pane.id}"
-caller_pane="${TMUX_PANE:-}"
-if [[ -z "$caller_pane" && -n "$lead_pane_file" && -r "$lead_pane_file" ]]; then
-  cand="$(head -n1 "$lead_pane_file" 2>/dev/null || true)"
-  if [[ -n "$cand" ]] && tmux display-message -p -t "$cand" '#{pane_id}' >/dev/null 2>&1; then
-    caller_pane="$cand"
-  fi
-fi
-if [[ -z "$caller_pane" ]]; then
-  caller_pane=$(tmux display-message -p '#{pane_id}' 2>/dev/null || true)
+caller_pane=""
+if type okstra_resolve_caller_pane >/dev/null 2>&1; then
+  caller_pane="$(okstra_resolve_caller_pane)"
 fi
 # Pane titles: worker (caller) pane gets `gemini-<role>-<pid>`; the sibling

package/runtime/bin/okstra-trace-cleanup.sh CHANGED

@@ -37,6 +37,14 @@
 set -u
+# Trace-pane caller resolution helper (okstra_resolve_caller_pane) — see
+# lib/okstra/tmux-pane.sh. Used as the lead-pane fallback below so a missing /
+# stale lead-pane.id resolves to the pane THIS process actually runs in (via
+# ancestor-PID ↔ tmux pane_pid matching), never a foreign active-client pane.
+# Bin-sibling path in repo + installed layouts; degrade silently if absent.
+_clean_script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)"
+[ -r "$_clean_script_dir/lib/okstra/tmux-pane.sh" ] && . "$_clean_script_dir/lib/okstra/tmux-pane.sh"
 MODE="kill"   # kill | list
 REAP=0
 run_dir=""
@@ -88,7 +96,11 @@ if [[ "$REAP" -eq 0 ]]; then
   [[ -r "$lead_pane_file" ]] && lead_pane="$(head -n1 "$lead_pane_file" 2>/dev/null || true)"
 fi
 if [[ -z "$lead_pane" ]] || ! tmux display-message -p -t "$lead_pane" '#{pane_id}' >/dev/null 2>&1; then
-  lead_pane="$(tmux display-message -p '#{pane_id}' 2>/dev/null || true)"
+  if type okstra_resolve_caller_pane >/dev/null 2>&1; then
+    lead_pane="$(okstra_resolve_caller_pane 2>/dev/null || true)"
+  else
+    lead_pane="$(tmux display-message -p '#{pane_id}' 2>/dev/null || true)"
+  fi
 fi
 # Does a trace pane's tag belong to the set we are closing?