okstra (npm) 0.46.0 → 0.48.0


Files changed (22)

package/docs/superpowers/specs/2026-06-04-coverage-critic-design.md ADDED

@@ -0,0 +1,99 @@
+# Coverage critic: design (sub-project B1)
+- Date written: 2026-06-04
+- Scope: add a **coverage critic pass** to the three phases `requirements-discovery` / `error-analysis` / `implementation-planning`, running right after Phase 5.5 convergence and right before the Phase 6 report-writer. The critic is dispatched by *reusing* an existing worker subagent ("report the angles/files/modalities nobody looked at, and the requirements that have no finding"), and the coverage gaps it reports go through **one round of adversarial reverify**; only the survivors are merged into the final findings/clarifications. Whether to use it and which backing provider to use are chosen in an **initial okstra-run select box** (recommendations + "직접 입력" (manual entry)); when a CLI parameter is passed, that selection step is skipped. The default is off (opt-in).
+- Out of scope
+  - The verdict devil's advocate for `final-verification` is **sub-project B2** and lies outside this document. (B2 reuses the critic primitive.)
+  - No new installable agent (`critic-worker.md` / `installed-agents.json`) is added; the existing `claude-worker`/`codex-worker`/`gemini-worker` subagent_type is reused with a critic prompt.
+  - No change to convergence's finding/plan-body classification or gate enums; critic gaps go through the existing adversarial finding classifier as-is.
+  - No change to the *number* of workers (roster size); the critic is a separate pass and does not grow the Phase 4 analyser roster.
+- Relationship: reuses the adversarial convergence machinery ([`skills/okstra-convergence/SKILL.md`](../../../skills/okstra-convergence/SKILL.md) §"Adversarial Verification Mode") to verify critic gaps. The wizard select pattern follows the same picker convention as the existing `okstra-run` steps (`defaults_or_custom` and others).
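+## 1. Motivation: coverage gaps are structural
+By design, okstra's analyser workers all answer the *same* section 1–5 questions: this is triangulation, not partition ([`skills/okstra-team-contract/SKILL.md:204`](../../../skills/okstra-team-contract/SKILL.md)). Adding workers therefore only finds more findings of the *same kind*, and "the angle nobody looked at" stays structurally empty. Consensus quality (b) was raised through adversarial verification, but coverage (c), meaning missed findings, needs a separate mechanism. The coverage critic takes the consolidated findings as input and is dedicated solely to "what is missing", which closes this gap.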
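+## 2. Core design
+### 2.1 critic pass primitive
+Right after Phase 5.5 convergence finishes and the findings are classified, and **before** the Phase 6 report-writer dispatch, the lead runs the critic pass once.
+- **Reused dispatch:** dispatch a critic prompt to the selected provider's existing subagent_type (`claude-worker` / `codex-worker` / `gemini-worker`). No new agent definition.
+  - Input: task brief + summary of the Phase 5.5 consolidated findings + read access to the codebase.
+  - Prompt outline: "The following findings have already been agreed on. Find files/directories/execution paths that nobody inspected, requirements/acceptance criteria that have no finding at all, and claims that were raised but that nobody verified. Report each coverage gap as a new finding, with evidence (`file:line` or a quoted requirement). Do not repeat findings that already exist."
+  - Result file: `runs/<task-type>/worker-results/<provider>-critic-<task-type>-<seq>.md`.
+- **One round of adversarial reverify (decision on question 3):** the gaps reported by the critic are put on the verification queue as new findings with `originWorker=<provider>-critic`, and the **Phase 4 analysers** (excluding the critic itself) adversarially reverify them for one round (reusing the existing §"Adversarial Verification Mode" classifier: one evidence-based refutation → demotion). Only gaps that survive as `full-consensus`/`partial-consensus` are merged into the final report as findings; demoted gaps (`contested`/`worker-unique`) are treated as hallucinations and either discarded or recorded only as dissent.
+- **Model (decision on question 2):** the backing provider is selected with `--critic <claude|codex|gemini>`, and the model is that provider's existing `--<provider>-model` value (mirroring the executor binding pattern). There is no separate critic model flag (YAGNI).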
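+### 2.2 Selection UX: wizard select step + CLI bypass
+Whether to use the critic, and with which provider, is not an execution-parameter-only setting; it is chosen in an **initial okstra-run select box**.
+- New wizard step `S_CRITIC_PICK`: `_build_*`/`_submit_*` in [`scripts/okstra_ctl/wizard.py`](../../../scripts/okstra_ctl/wizard.py) + the [`prompts/wizard/prompts.ko.json`](../../../prompts/wizard/prompts.ko.json) SOT. Picker convention (1–2 recommendations + a final "직접 입력" (manual entry)):
+  - `critic 사용 안 함 (기본·추천)` ("do not use critic", default and recommended) → disabled
+  - `claude critic (opus)` *(recommended)* → provider=claude, model=the claude model for that phase
+  - `직접 입력` (manual entry) → the user specifies the provider (+ optional model) directly
+  - (The list may add codex/gemini options to the recommendations to match the current analyser roster, but it keeps the convention of at most 2 recommendations + manual entry = 3 options.)
+- **CLI first, skip the step:** when `--critic <provider>` or `--no-critic` is passed, `S_CRITIC_PICK` is **skipped**. The non-interactive `okstra.sh` / `node bin/okstra` paths use the flags; the interactive `okstra-run` uses the select box. With no flag on a non-interactive path, the default is off.
+- All three entry points (okstra-run skill / okstra.sh / node CLI) converge on `prepare_task_bundle()`, so the critic selection is serialized into the manifest there (preserving a single reference point).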
+### 2.3 manifest `convergence.critic` block + render resolve
+[`scripts/okstra_ctl/render.py`](../../../scripts/okstra_ctl/render.py) emits a `critic` block under `convergence`:
+```json
+"critic": {
+  "enabled": false,
+  "provider": null,
+  "modelExecutionValue": null
+}
+```
+- `enabled`: `true` when the wizard selection or the `--critic` flag determines a provider, otherwise `false` (default).
+- `provider`: `claude` | `codex` | `gemini` | `null`.
+- `modelExecutionValue`: the selected provider's model (from that provider's `--<provider>-model` seed). `null` when `enabled=false`.
+At the end of Phase 5.5 the lead reads this block to decide whether to run the critic pass and against which provider.
+### 2.4 Applicable phases
+requirements-discovery / error-analysis / implementation-planning. (final-verification is B2.) All three phases produce findings, so a coverage critic is meaningful for them. It does not apply to release-handoff/implementation.
+## 3. Data model
+- **manifest:** the `convergence.critic` block from §2.3.
+- **convergence state artifact:** critic gaps enter `findings[]` with `originWorker: "<provider>-critic"` and use the existing rounds/votes/classification schema as-is. For traceability, a finding carries an optional field `source: "critic"` (absent means `null` = found by a regular worker). schemaVersion stays at `1.2` (an optional field is added; readers treat a missing field as null), with no enum change.
+- **convergence state `config`:** when the critic runs, a summary `{ provider, modelExecutionValue, gapsProposed, gapsMerged }` is recorded in `config.critic` (for auditing).
+## 4. Changed files
+1. [`scripts/okstra_ctl/render.py`](../../../scripts/okstra_ctl/render.py) `_build_convergence_block`: emit the `critic` block + resolve the `--critic`/wizard selection.
+2. [`scripts/okstra_ctl/run.py`](../../../scripts/okstra_ctl/run.py): `--critic <provider>` / `--no-critic` argparse + pass-through to ctx + applied to the three phases only.
+3. [`scripts/okstra_ctl/wizard.py`](../../../scripts/okstra_ctl/wizard.py): `S_CRITIC_PICK` step `_build`/`_submit` + integration into the flow (shown only when not specified on the CLI).
+4. [`prompts/wizard/prompts.ko.json`](../../../prompts/wizard/prompts.ko.json) (+ sync with the English SOT if one exists): `critic_pick` step label/options.
+5. [`skills/okstra-convergence/SKILL.md`](../../../skills/okstra-convergence/SKILL.md): new "Coverage critic pass" section (timing, prompt, one-round adversarial reverify, merge, `convergence.critic` schema).
+6. [`agents/SKILL.md`](../../../agents/SKILL.md): critic pass in the Phase 5.5→6 flow + the `PROGRESS: phase-5.5-critic provider=<p>` line + (where applicable) model/knob references.
+7. [`prompts/profiles/requirements-discovery.md`](../../../prompts/profiles/requirements-discovery.md) / [`error-analysis.md`](../../../prompts/profiles/error-analysis.md) / [`implementation-planning.md`](../../../prompts/profiles/implementation-planning.md): one-line coverage critic opt-in declaration.
+8. Tests: render `critic` block resolve (`--critic`/unspecified/`--no-critic`), wizard `S_CRITIC_PICK` build + submit, CLI bypass.
+9. [`CHANGES.md`](../../../CHANGES.md): user-facing impact entry.
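+## 5. Enforcement: declaration vs. enforcement
+- **Machine-enforced:** the shape of the `convergence.critic` block + render resolve (`--critic`/`--no-critic`/unspecified) → unit tests. The picker options and CLI bypass of wizard `S_CRITIC_PICK` → wizard tests.
+- **Prompt-only (not enforceable):** whether the critic actually finds meaningful gaps, and whether the one-round adversarial reverify filters out hallucinations, are only prompt instructions to the lead/workers (LLMs), not runtime enforcement; they are steered through skill/profile declarations.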
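+## 6. Cost and risks
+- **Cost:** opt-in (default off). When enabled: 1 critic dispatch + 1 reverify round (one per analyser). Because the default is off, runs that do not select it incur zero cost.
+- **Risk (hallucinated gaps):** the critic may report fake gaps. Mitigation: the one-round adversarial reverify demotes gaps that lack evidence. Only surviving gaps become findings.
+- **Risk (duplicate findings):** the critic restates an existing finding. Mitigation: the prompt explicitly says "do not repeat findings that already exist" + semantic grouping in the reverify step absorbs duplicates.
+- **Risk (too few reverify voters):** critic gaps are verified by the Phase 4 analysers minus the critic itself. The default roster has ≥2 analysers, so at least one voter is always available. In an abnormal configuration with only one analyser, critic gaps are only surfaced as unverifiable (not merged), and that fact is recorded.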
+## 7. Acceptance criteria
+1. An `S_CRITIC_PICK` select box is added to the wizard (recommendations + "직접 입력" (manual entry)) and is shown only in interactive runs where neither `--critic` nor `--no-critic` is specified. It is skipped when a flag is given.
+2. The manifest `convergence.critic` resolves correctly from the wizard selection/flags (enabled/provider/model). Default off.
+3. The convergence skill defines the critic pass (timing, prompt, one-round adversarial reverify, merge).
+4. The three applicable phase profiles declare the coverage critic opt-in.
+5. `python3 -m pytest tests/` + `bash validators/validate-workflow.sh` pass.
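
Editor's sketch (not part of the package): the resolve rule that §2.3 of the spec above describes for the `convergence.critic` block could look roughly like the following. The function name `resolve_critic_block`, its parameters, and the `provider_models` mapping are hypothetical; the real logic is said to live in `_build_convergence_block` in `render.py`, whose signature is not shown in this diff.

```python
# Assumption: the three applicable phases listed in section 2.4 of the B1 spec.
CRITIC_PHASES = ("requirements-discovery", "error-analysis", "implementation-planning")
PROVIDERS = ("claude", "codex", "gemini")


def resolve_critic_block(phase, critic=None, no_critic=False, provider_models=None):
    """Hypothetical helper: build the `convergence.critic` manifest block.

    critic          -- value of `--critic` or the wizard pick, or None when unset
    no_critic       -- True when `--no-critic` was passed
    provider_models -- mapping provider -> model seed (the `--<provider>-model` values)
    """
    disabled = {"enabled": False, "provider": None, "modelExecutionValue": None}
    # Default is off: no selection, an explicit opt-out, or a non-applicable phase.
    if no_critic or critic is None or phase not in CRITIC_PHASES:
        return disabled
    if critic not in PROVIDERS:
        raise ValueError(f"unknown critic provider: {critic!r}")
    model = (provider_models or {}).get(critic)
    return {"enabled": True, "provider": critic, "modelExecutionValue": model}
```

With no flag the block stays disabled; with `critic="codex"` on an applicable phase it carries the codex model seed. The choice to let `--no-critic` win over `--critic` when both are present is this sketch's own assumption.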

package/docs/superpowers/specs/2026-06-05-acceptance-critic-design.md ADDED

@@ -0,0 +1,90 @@
+# Acceptance devil's-advocate critic: design (sub-project B2)
+- Date written: 2026-06-05
+- Scope: extend sub-project B1's critic dispatch primitive (reuse of existing workers + provider/model selection + opt-in) to `final-verification`. In final-verification the critic operates in **devil's advocate** mode ("dig up the reasons this should not be accepted and the acceptance blockers that were missed"), and candidate blockers are verified by **confirm-or-downgrade** (confirmed → Acceptance Blocker, unconfirmed → Residual Risk, never dropped) and reflected in the verdict. The selection UX, `--critic`/`S_CRITIC_PICK`, and the `convergence.critic` block are reused from B1 as-is, and `final-verification` is added to the applicable phases.
+- Out of scope
+  - No new dispatch mechanism, no new installable agent, no new selection flag; the B1 primitive is reused in full.
+  - No change to convergence's finding/plan-body classification, gate enums, or the verdict↔blocker validator; confirmed critic blockers enter the *existing* Acceptance Blockers path and the existing verdict rule (`accepted` ⇒ 0 blockers) works as-is.
+  - No change to B1's coverage critic behaviour (3 finding phases, adversarial-drop verification); B2 only adds a final-verification-specific *mode*.
+- Relationship: reuses the critic primitive, the `convergence.critic` block, and the `S_CRITIC_PICK` selection created by B1 [`2026-06-04-coverage-critic-design.md`](2026-06-04-coverage-critic-design.md). The output is wired into the Acceptance Blockers/Residual Risk/Verdict Token structure of the final-verification profile ([`prompts/profiles/final-verification.md`](../../../prompts/profiles/final-verification.md)).
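+## 1. Motivation
+final-verification is the last gate right before the acceptance decision. If a false consensus leaks through here, the defect ships as-is. Yet applying adversarial verification (sub-project 0) to final-verification unchanged is counterproductive: the findings *are* defects/blockers, so adversarially refuting findings would demote or drop "real defects that could not be reproduced" (lower defect sensitivity). The final-verification critic therefore must not refute findings; it has to be a **devil's advocate against the verdict**, an extra pass that "actively looks for reasons this work should not be accepted". This increases coverage (surfacing missed blockers) while moving defect sensitivity *upward*.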
+## 2. Core design
+### 2.1 Reuse of the critic primitive + extension of applicable phases
+B1's critic dispatch (the selected provider's existing subagent + provider model, opt-in) is used as-is. Only the set of applicable phases changes:
+- Add `final-verification` to `critic_phases` in [`scripts/okstra_ctl/render.py`](../../../scripts/okstra_ctl/render.py) `_build_convergence_block` → `convergence.critic.enabled` can be true in final-verification as well.
+- Add `final-verification` to the `applies` phase tuple of `S_CRITIC_PICK` in [`scripts/okstra_ctl/wizard.py`](../../../scripts/okstra_ctl/wizard.py) and to the phase conditions of the summary/confirmation.
+- The selection UX (`--critic <provider>` / okstra-run select box), the `convergence.critic {enabled,provider,modelExecutionValue}` block, and model resolution are **unchanged** (as in B1).
+### 2.2 Critic behaviour branches by phase (the new part of B2)
+The convergence skill branches critic behaviour by phase:
+- `requirements-discovery` / `error-analysis` / `implementation-planning` → **coverage critic** (B1: "what is missing", adversarial-drop verification). Unchanged.
+- `final-verification` → **acceptance devil's-advocate critic** (new).
+**Devil's advocate prompt (final-verification):**
+```
+You are the acceptance devil's advocate for <task-key>. The delivered work is about
+to be judged for acceptance. Your ONLY job is to find reasons it should NOT be
+accepted — surface candidate acceptance BLOCKERS the verifiers may have missed:
+- requirements / acceptance points with no covering evidence,
+- DB / IO / SQL changes lacking real-execution evidence,
+- regressions or broken error paths,
+- scope/contract violations.
+For each, emit a candidate blocker with a one-line statement, evidence (file:line /
+log / test output), and a severity (critical / major / minor). Do NOT restate an
+existing Acceptance Blocker. If you find none, say so explicitly.
+```
+**Verification = confirm-or-downgrade (differs from B1's adversarial-drop, BLOCKING):**
+The Phase 4 analysers (excluding the critic) verify each candidate blocker.
+- **Confirmed** (reproduced, or evidence cited successfully) → promoted to a `## 4 Acceptance Blockers` row (severity kept, follow-up phase included).
+- **Unconfirmed** (cannot be reproduced, or weak evidence) → **downgraded to Residual Risk (never dropped)**; it stays tracked and its trigger is recorded.
+- Applying the adversarial finding classifier's "reject when uncertain" rule here is forbidden (it suppresses real defects).
+**Output and verdict wiring:**
+- Once a confirmed candidate enters Acceptance Blockers, `accepted` requires 0 blockers ([`final-verification.md:32`](../../../prompts/profiles/final-verification.md)), so the verdict is automatically pushed to `conditional-accept` / `blocked`. This is the purpose of the devil's advocate.
+- Unconfirmed candidates go to Residual Risk; they do not block the verdict but are tracked.
+- The existing verdict↔blocker consistency validator (`validators/validate-run.py` `_validate_final_verification_consistency`) keeps enforcing this. No new enum/validator.
+### 2.3 Recording critic results in state
+- Critic candidate blockers are recorded in `runs/final-verification/worker-results/<provider>-critic-final-verification-<seq>.md`.
+- The `config.critic` summary (defined in B1) of the convergence state artifact records `mode: "acceptance-devils-advocate"`, `candidatesProposed`, `confirmedBlockers`, and `downgradedToResidual` (optional v1.2 fields; readers treat missing ones as null). No enum change.
+## 3. Changed files
+1. [`scripts/okstra_ctl/render.py`](../../../scripts/okstra_ctl/render.py): add `final-verification` to `critic_phases`.
+2. [`tests/test_render_critic_block.py`](../../../tests/test_render_critic_block.py): move `final-verification` from the non-applicable to the applicable parameters (verifies `enabled=True`).
+3. [`scripts/okstra_ctl/wizard.py`](../../../scripts/okstra_ctl/wizard.py): add `final-verification` to `S_CRITIC_PICK.applies` + the summary/confirmation phase conditions.
+4. [`tests/test_wizard_critic_pick.py`](../../../tests/test_wizard_critic_pick.py): flip `final-verification` from "skipped" to "applies".
+5. [`skills/okstra-convergence/SKILL.md`](../../../skills/okstra-convergence/SKILL.md): add phase branching to the "Coverage critic pass" section + an "Acceptance devil's-advocate critic (final-verification)" sub-mode (prompt, confirm-or-downgrade, blocker/residual-risk output, state summary).
+6. [`prompts/profiles/final-verification.md`](../../../prompts/profiles/final-verification.md): devil's advocate critic opt-in declaration + how the output enters Acceptance Blockers/Residual Risk and affects the verdict.
+7. [`agents/SKILL.md`](../../../agents/SKILL.md): reflect that the critic pass (Phase 5.6) also applies to final-verification (phase row/PROGRESS comment).
+8. (Optional) [`prompts/wizard/prompts.ko.json`](../../../prompts/wizard/prompts.ko.json): generalize the `critic_pick` label to be phase-neutral (e.g. "additional critic pass (surfaces missed findings/blockers), opt-in").
+9. [`CHANGES.md`](../../../CHANGES.md): user-facing impact entry.
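+## 4. Enforcement: declaration vs. enforcement
+- **Machine-enforced:** final-verification included in render `critic_phases` + wizard `applies` extended → unit/wizard tests. Verdict consistency (`accepted` ⇒ 0 blockers) once a confirmed critic blocker has entered Acceptance Blockers is still enforced by the *existing* `_validate_final_verification_consistency`.
+- **Prompt-only (not enforceable):** whether the devil's advocate actually finds meaningful candidates, and whether confirm-or-downgrade classifies them accurately, are prompt instructions to the lead/workers (LLMs), steered through skill/profile declarations.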
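+## 5. Cost and risks
+- **Cost:** opt-in (default off, same as B1). When enabled: 1 critic dispatch + 1 candidate verification round (one per analyser). final-verification runs that do not select it incur zero cost.
+- **Risk (candidate explosion):** the critic may report many weak candidates. Mitigation: confirm-or-downgrade demotes unconfirmed ones to Residual Risk, so only *confirmed* blockers block the verdict. Severity and evidence are mandatory.
+- **Risk (the goal is to prevent false passes, i.e. suppression, yet confirm-or-downgrade demotes unconfirmed candidates):** unconfirmed candidates are not dropped but kept as Residual Risk, so tracking is preserved. The bar for "confirmed" is reproduction or cited evidence, and high-severity candidates whose reproduction is uncertain are recorded as a Residual Risk escalation trigger so that the user can decide.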
+## 6. Acceptance criteria
+1. For `final-verification`, the manifest `convergence.critic` resolves from the `--critic`/wizard selection and can be `enabled=true` (B1's 3 phases + final-verification = 4 applicable phases).
+2. okstra-run `S_CRITIC_PICK` is shown for final-verification as well.
+3. The convergence skill defines the devil's advocate mode for final-verification (prompt, confirm-or-downgrade, blocker/residual-risk output) and clearly distinguishes it from the B1 coverage mode.
+4. The final-verification profile declares the critic opt-in and its verdict impact.
+5. `python3 -m pytest tests/` + `bash validators/validate-workflow.sh` pass.
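
Editor's sketch (not part of the package): the confirm-or-downgrade rule from §2.2 of the B2 spec above can be pictured as a routing step in which every candidate lands in exactly one of two buckets and none is dropped. The `Candidate` type and both function names are hypothetical; in the package the classification is described as prompt-driven, with only the `accepted` ⇒ zero blockers rule enforced by a validator.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    statement: str
    severity: str    # critical / major / minor, as in the devil's advocate prompt
    confirmed: bool  # True when verifiers reproduced it or cited evidence


def route_candidates(candidates):
    """Split candidates into (acceptance_blockers, residual_risks); nothing is dropped."""
    blockers, residual = [], []
    for candidate in candidates:
        # Confirmed -> Acceptance Blocker; unconfirmed -> Residual Risk.
        (blockers if candidate.confirmed else residual).append(candidate)
    return blockers, residual


def accepted_is_allowed(blockers):
    """The `accepted` verdict requires zero Acceptance Blockers."""
    return len(blockers) == 0
```

The invariant worth noticing is that `len(blockers) + len(residual)` always equals the number of candidates, which is what separates this mode from B1's adversarial-drop verification.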

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "okstra",
-  "version": "0.46.0",
+  "version": "0.48.0",
   "description": "Multi-agent cross-verification orchestrator runtime + Claude Code skills.",
   "license": "MIT",
   "author": "devonshin",

package/runtime/BUILD.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "package": "0.46.0",
-  "builtAt": "2026-06-04T11:19:42.641Z",
+  "package": "0.48.0",
+  "builtAt": "2026-06-04T17:18:36.379Z",
   "repoRoot": "/home/runner/work/okstra/okstra"
 }

package/runtime/agents/SKILL.md CHANGED

@@ -42,6 +42,7 @@ This SKILL.md is the operating contract and phase index. Detailed procedures liv
 | 4. Execution | Spawn analysis workers (Teams preferred) | `okstra-team-contract` |
 | 5. Fallback | Sequential/background dispatch when Teams unavailable | `okstra-team-contract` |
 | 5.5 Convergence | Cross-verify findings across workers | `okstra-convergence` |
+| 5.6 Critic pass | (opt-in) reused-worker critic pass: coverage gaps (discovery/error-analysis/impl-planning) or acceptance devil's-advocate (final-verification), each verified one round | `okstra-convergence` "Coverage critic pass" / "Acceptance critic pass" |
 | 6. Synthesis | Dispatch Report writer worker, review draft. **For `implementation-planning`: then run the Phase 6 plan-body verification sub-step (see Phase 6 section below).** | `okstra-report-writer` + `okstra-convergence` (sub-step) |
 | 7. Persist | Run token-usage collector, update manifests, then disband the worker team (shutdown teammates + `TeamDelete`, after collection) | `okstra-report-writer` + `_common-contract.md` "Run-end team teardown" |
@@ -92,6 +93,7 @@ Required checkpoints:
 - `PROGRESS: phase-4-dispatch worker=<role> model=<model>` — once per worker, immediately before the `Agent` / wrapper call.
 - `PROGRESS: phase-5-collect worker=<role> status=<terminal-status>` — once per worker, immediately after the result file is verified.
 - `PROGRESS: phase-5.5-convergence round=<N> queue=<count>` — at the start of each convergence round (Phase 5.5).
+- `PROGRESS: phase-5.6-critic provider=<provider> gaps=<n>` — when the coverage critic pass runs (Phase 5.6, opt-in). Omitted when `convergence.critic.enabled == false`.
 - `PROGRESS: phase-6-synthesis dispatching report-writer-worker` — at the start of Phase 6.
 - `PROGRESS: phase-7-persist updating manifests` — at the start of Phase 7.
 - `PROGRESS: phase-7-teardown disbanding team` — after token-usage collection, immediately before shutting down worker teammates + `TeamDelete` (Teams mode only; see `_common-contract.md` "Run-end team teardown"). Skipped in the no-`team_name` fallback.
@@ -250,7 +252,8 @@ Convergence is enabled by default. Configure via task-manifest.json:
 - `convergence.enabled`: true/false (default: true)
 - `convergence.maxRounds`: 1–3 — **phase-aware default**: `1` for `requirements-discovery`, `2` for all other task types
-- `convergence.verificationMode`: `"lightweight"` | `"full-reanalysis"` (default: `"lightweight"`)
+- `convergence.verificationMode`: `"lightweight"` | `"full-reanalysis"` (default: `"lightweight"`; the adversarial phases below force `"full-reanalysis"`)
+- `convergence.adversarial`: true/false — **phase-aware default**: `true` for `requirements-discovery` / `error-analysis` / `implementation-planning`, `false` otherwise. When `true`, Phase 5.5 runs in adversarial mode (verifiers refute findings; burden of proof on the claim). See [okstra-convergence](./skills/okstra-convergence/SKILL.md) "Adversarial Verification Mode".
 When `task-manifest.json` does not set `convergence.maxRounds`, lead MUST resolve the effective value via the phase-aware default above before entering Phase 5.5, and record the resolved value in the convergence state artifact at `config.effectiveMaxRounds`.
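
Editor's sketch (not part of the package): the phase-aware defaults listed in the hunk above could be resolved along these lines. The function name `resolve_convergence` and the returned dictionary shape are hypothetical; only the key names `maxRounds`, `adversarial`, `verificationMode`, and `effectiveMaxRounds` come from the text. Treating any `adversarial=true` run as forcing `full-reanalysis` is this sketch's reading of "the adversarial phases below force `full-reanalysis`".

```python
# Phases whose `convergence.adversarial` default is true, per the SKILL.md text.
ADVERSARIAL_PHASES = frozenset(
    {"requirements-discovery", "error-analysis", "implementation-planning"}
)


def resolve_convergence(task_type, manifest_convergence=None):
    """Hypothetical resolver for the effective convergence settings of one run."""
    cfg = manifest_convergence or {}
    # Phase-aware default: 1 round for requirements-discovery, 2 otherwise.
    default_rounds = 1 if task_type == "requirements-discovery" else 2
    max_rounds = cfg.get("maxRounds", default_rounds)
    if not 1 <= max_rounds <= 3:
        raise ValueError("convergence.maxRounds must be between 1 and 3")
    adversarial = cfg.get("adversarial", task_type in ADVERSARIAL_PHASES)
    # Assumption: adversarial mode overrides any configured verification mode.
    mode = "full-reanalysis" if adversarial else cfg.get("verificationMode", "lightweight")
    return {
        "enabled": cfg.get("enabled", True),
        "effectiveMaxRounds": max_rounds,
        "adversarial": adversarial,
        "verificationMode": mode,
    }
```

The resolved round count is returned under `effectiveMaxRounds` to mirror the `config.effectiveMaxRounds` field the lead is required to record.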

package/runtime/prompts/profiles/_common-contract.md CHANGED

@@ -14,7 +14,7 @@ profile document.
 - Worker interaction model (shared — read before inferring behaviour from the roster):
   - the per-profile `Required workers:` block is a **roster**, not a behaviour contract. Each role's interaction mode changes across operating phases of the same run.
   - **Phase 4 / 5 (independent analysis)**: analyser workers (`claude`, `codex`, `gemini` when opted in) produce findings independently and have no access to one another's outputs. `report-writer` does not analyse.
-  - **Phase 5.5 (convergence — peer review by workers)**: the lead replays each analyser's findings to the *other* analysers and collects `AGREE` / `DISAGREE` / `SUPPLEMENT` verdicts across up to `effectiveMaxRounds` rounds. Workers act as peer reviewers of each other's findings in this phase; the lead mediates but does not vote. See `skills/okstra-convergence/SKILL.md` for the round protocol, queue invariants, and final classification (`full-consensus` / `partial-consensus` / `contested` / `worker-unique`).
+  - **Phase 5.5 (convergence — peer review by workers)**: the lead replays each analyser's findings to the *other* analysers and collects `AGREE` / `DISAGREE` / `SUPPLEMENT` verdicts across up to `effectiveMaxRounds` rounds. Workers act as peer reviewers of each other's findings in this phase; the lead mediates but does not vote. See `skills/okstra-convergence/SKILL.md` for the round protocol, queue invariants, and final classification (`full-consensus` / `partial-consensus` / `contested` / `worker-unique`). For `requirements-discovery`, `error-analysis`, and `implementation-planning` this phase runs in **adversarial mode** (`convergence.adversarial=true`): verifiers try to refute each finding against its cited evidence and the burden of proof sits on the claim — see that skill's §"Adversarial Verification Mode".
   - Do NOT conclude "no peer review happens" from the roster alone — every profile that lists ≥2 analyser workers runs convergence by default (`convergence.enabled=true` in `task-manifest.json`).
 - Tooling — read-only MCP availability (shared):
   - MCP is not implicit okstra context. Query an MCP server only when the task brief explicitly lists it as source material for this run. Any MCP-derived finding MUST cite server, table, and the SELECT used. MCP MUST NEVER be used as a write path — schema/data mutations go through repository migration files reviewed by humans.

package/runtime/prompts/profiles/error-analysis.md CHANGED

@@ -30,6 +30,9 @@
   - every clarification row carries a recommended answer + one-line rationale inside the `Expected form` cell; rows that lack a recommendation are rejected as half-formed.
   - **Codebase-first ambiguity resolution (defect rule)**: any ambiguity about repro, file behavior, or symbol semantics that can be answered by `Read` / `Grep` / log inspection MUST be resolved that way and recorded with file:line (or log-line) evidence. Writing a clarification row for something the codebase or shipped logs already answer is a defect of this phase.
   - **Evidence note required inside `Statement`**: every clarification row includes `Evidence checked: <path:line>` or `Evidence checked: none — <reporter-only reason>` in the `Statement` cell. `none` is allowed ONLY when the row's nature is "only the reporter can answer this" (reporter-side data, business priority, environment they observed). A row with `none` that *could* have been answered by code or logs is a defect.
+- Cross-verification mode:
+  - Phase 5.5 convergence runs in **adversarial mode** for this phase (`convergence.adversarial=true`). Verifiers actively try to refute each root-cause / reproduction claim by directly re-inspecting the cited code, logs, or config; the burden of proof sits on the claim. See `skills/okstra-convergence/SKILL.md` §"Adversarial Verification Mode". A single evidence-backed refutation prevents a finding from reaching consensus.
+  - **Coverage critic (opt-in)**: when `convergence.critic.enabled=true` (chosen via the okstra-run picker or `--critic`), a reused-worker critic pass runs after convergence to surface missed findings; its gaps are merged only after a 1-round adversarial reverify. See `skills/okstra-convergence/SKILL.md` "Coverage critic pass".
 - Non-goals:
   - implementation details unless they are necessary to validate the cause
   - **source code edits, builds, migrations, or deployments** — this run produces evidence and cause analysis only; the fix belongs to a later `implementation-planning` run followed by an `implementation` run

package/runtime/prompts/profiles/final-verification.md CHANGED

@@ -44,6 +44,8 @@
   3. **Coverage check** — every requirement in the originating plan/task brief is either marked covered (with artifact) or listed as a blocker. No silent omissions.
   4. **Verifier dissent preserved** — if workers reach different verdicts, the disagreement is visible in section 1.2; synthesis hides nothing.
   5. **No source-mutation audit** — scan the run's session transcripts for Edit / Write or state-mutating Bash commands that touch paths OUTSIDE `<PROJECT_ROOT>/.okstra/**` and outside the assigned run-artifact paths. Writes to worker prompts, audit sidecars, team-state, the final-report `data.json`, and rendered reports under the run directory are allowed okstra artifacts. Any source/schema/deployment mutation means the run has crossed into implementation and MUST be re-routed; do NOT silently strip the evidence.
+- Cross-verification mode:
+  - **Acceptance critic (opt-in)**: when `convergence.critic.enabled=true` (chosen via the okstra-run picker or `--critic`), a reused-worker **acceptance devil's-advocate** pass runs after convergence to surface candidate acceptance blockers the verifiers may have missed. Each candidate is verified **confirm-or-downgrade**: confirmed → an `Acceptance Blockers` row (which, since `accepted` requires zero blockers, moves the verdict to `conditional-accept` / `blocked`); unconfirmed → a `Residual Risk` row (never dropped). See `skills/okstra-convergence/SKILL.md` "Acceptance critic pass (final-verification)".
 - Non-goals:
   - proposing unrelated refactors beyond the delivered scope
   - **source code edits, follow-up bug fixes, or scope expansion** — this run renders a verdict only; defects detected here become inputs to a new `error-analysis` or `implementation-planning` run

package/runtime/prompts/profiles/implementation-planning.md CHANGED

@@ -37,6 +37,10 @@
   - recommended execution order
 - Approval gate (phase-specific addendum to shared authority rule):
   - The YAML frontmatter `approved: true|false` field is the only authorised approval gate. report-writer always emits `approved: false`. The user clears it either by (a) editing the frontmatter line to `approved: true` directly, or (b) invoking the next phase with `--approve` so the CLI flips the frontmatter on the user's behalf. `okstra_ctl.run._validate_approved_plan` reads this field and refuses entry until it is `true`.
+- Cross-verification mode:
+  - Phase 5.5 finding convergence runs in **adversarial mode** for this phase (`convergence.adversarial=true`). Verifiers actively try to refute each worker finding (requirement gap / risk / option) by re-inspecting its cited evidence; the burden of proof sits on the claim. See `skills/okstra-convergence/SKILL.md` §"Adversarial Verification Mode".
+  - §4.5.9 plan-body verification runs with an **adversarial posture** (`skills/okstra-convergence/SKILL.md` §"Adversarial plan-body posture"): verifiers open and confirm every cited path / command and put the burden of proof on the plan. The gate threshold is unchanged — a *majority* `DISAGREE` (`majority-disagree`) is still required to block approval; a single dissent does not.
+  - **Coverage critic (opt-in)**: when `convergence.critic.enabled=true` (chosen via the okstra-run picker or `--critic`), a reused-worker critic pass runs after convergence to surface missed findings; its gaps are merged only after a 1-round adversarial reverify. See `skills/okstra-convergence/SKILL.md` "Coverage critic pass".
 - Non-goals:
   - code-level micro-optimization unless it changes the implementation approach
   - **source code edits of any kind** — this run produces a plan document only; Edit/Write on project source files is forbidden until the plan is approved and a separate `implementation` run starts
@@ -74,7 +78,7 @@
   - the YAML frontmatter MUST include the line `approved: false` (report-writer always emits the unflipped value). The user authorises the next `implementation` run by flipping it to `approved: true` (manual edit or `--approve` CLI). Do NOT recreate any `User Approval Request` body block — the validator fails reports that contain one (see `validators/validate-run.py` deprecated patterns).
   - **the frontmatter `approved: false` line is rendered unconditionally; if the plan-body verification gate (§4.5.9) returns `blocked-by-disagreement` or `aborted-non-result`, the writer MUST keep `approved: false` and the validator refuses any report that ships with `approved: true` under such a gate result.**
   - every ambiguity flagged during pre-planning that the user must resolve before approval registered as a `Blocks=approval` row in the `## 5. Clarification Items` table (do NOT create a separate `Open Questions` block under `4.5.x` — the unified table is the single home)
-  - **§4.5.9 Plan Body Verification (BLOCKING).** After report-writer finishes the draft, the lead MUST run a worker peer-review round on the consolidated plan body (sections 4.5.1 – 4.5.7) and populate `### 4.5.9 Plan Body Verification` in the final report. The round protocol, plan-item ID scheme (`P-Opt-*` / `P-Step-*` / `P-Dep-*` / `P-Val-*` / `P-Rb-*`), verdict semantics, gate-result classification, and dissent log format are defined in `skills/okstra-convergence/SKILL.md` "Plan-body verification mode". The four gate-result values are `passed`, `passed-with-dissent`, `blocked-by-disagreement`, `aborted-non-result`. When the gate would have been `blocked-by-disagreement` or `aborted-non-result`, the lead MUST NOT silently flip it to one of the passing values to "unblock" the run — that is a contract violation.
+  - **§4.5.9 Plan Body Verification (BLOCKING).** After report-writer finishes the draft, the lead MUST run a worker peer-review round on the consolidated plan body (sections 4.5.1 – 4.5.7) and populate `### 4.5.9 Plan Body Verification` in the final report. The round protocol, plan-item ID scheme (`P-Opt-*` / `P-Step-*` / `P-Dep-*` / `P-Val-*` / `P-Rb-*`), verdict semantics, gate-result classification, and dissent log format are defined in `skills/okstra-convergence/SKILL.md` "Plan-body verification mode". The four gate-result values are `passed`, `passed-with-dissent`, `blocked-by-disagreement`, `aborted-non-result`. When the gate would have been `blocked-by-disagreement` or `aborted-non-result`, the lead MUST NOT silently flip it to one of the passing values to "unblock" the run — that is a contract violation. When `convergence.adversarial=true` (the default for this phase), this round uses the adversarial posture — verifiers confirm cited paths/commands and the burden of proof is on the plan — but the gate threshold stays `majority-disagree` (see that skill's §"Adversarial plan-body posture").
   - **Decision-record evaluation (sole owner)**: this phase is the **single owner** of decision-record evaluation in the okstra lifecycle. The brief never evaluates or drafts decision records — it only forwards `adr-candidate:*` signals. Every `adr-candidate:*` entry inherited from the brief's `Open Questions` is a mandatory evaluation target. In addition, evaluate every decision the recommended option introduces against the three criteria:
     1. **Hard to reverse** — would changing the decision later cost meaningfully more than deciding now?
     2. **Surprising without context** — would a future reader, seeing only the code, wonder "why was it built this way?"?

package/runtime/prompts/profiles/requirements-discovery.md CHANGED

@@ -51,6 +51,9 @@
   - every clarification row carries a recommended answer + one-line rationale inside the `Expected form` cell; rows that lack a recommendation are rejected as half-formed.
   - **Codebase-first ambiguity resolution (defect rule)**: any ambiguity that can be answered by `Read` / `Grep` / file inspection MUST be resolved that way and recorded with file:line evidence. Writing a clarification row for something the codebase already answers is a defect of this phase.
   - **Evidence note required inside `Statement`**: every clarification row includes `Evidence checked: <path:line>` or `Evidence checked: none — <human-only reason>` in the `Statement` cell. `none` is allowed ONLY when the row's nature is "only a human can answer this" (reporter intent, business priority, external authority). A row with `none` that *could* have been answered by the codebase is a defect.
+- Cross-verification mode:
+  - Phase 5.5 convergence runs in **adversarial mode** for this phase (`convergence.adversarial=true`). Verifiers actively try to refute each worker's finding by directly re-inspecting the cited evidence; the burden of proof sits on the claim. See `skills/okstra-convergence/SKILL.md` §"Adversarial Verification Mode". A single evidence-backed refutation prevents a finding from reaching consensus.
+  - **Coverage critic (opt-in)**: when `convergence.critic.enabled=true` (chosen via the okstra-run picker or `--critic`), a reused-worker critic pass runs after convergence to surface missed findings; its gaps are merged only after a 1-round adversarial reverify. See `skills/okstra-convergence/SKILL.md` "Coverage critic pass".
 - Non-goals:
   - full implementation design unless it is required to decide the next phase
   - **source code edits, plan authoring, builds, or deployments** — this run only classifies the work and routes it; deeper analysis and planning belong to subsequent phases

package/runtime/prompts/wizard/prompts.ko.json CHANGED

@@ -228,6 +228,19 @@
         "_DEFAULT_SUFFIX": " (default)"
       }
     },
+    "critic_pick": {
+      "label": "추가 critic 패스를 돌릴까요? (놓친 finding/blocker 를 캐는 검증 패스 — opt-in)",
+      "echo_template": "critic: {value}",
+      "options": {
+        "off": "사용 안 함 (기본·추천)",
+        "claude": "claude critic (추천)",
+        "__free_input__": "직접 입력 (codex / gemini)"
+      }
+    },
+    "critic_text": {
+      "label": "critic provider 를 직접 입력하세요 (codex / gemini)",
+      "echo_template": "critic: {value}"
+    },
     "defaults_or_custom": {
       "label": "역할별로 어떤 모델을 쓸지 정하는 단계입니다 (참여 워커 구성을 바꾸는 게 아닙니다).\n· 기본값으로 진행 — lead·실행자/워커·report-writer 를 모두 추천 모델로 두고 바로 진행합니다.\n· 커스터마이즈 — 역할별 모델을 직접 고르고, 추가 directive·관련 task 도 지정합니다.",
       "echo_template": "customize: {value}",

package/runtime/python/okstra_ctl/render.py CHANGED

@@ -903,21 +903,47 @@ def _build_convergence_block(ctx: dict) -> dict:
     - `enabled` default True
     - `maxRounds` default 1 for `requirements-discovery`, 2 otherwise
     - `verificationMode` default "lightweight"
+    - `adversarial` default True for `requirements-discovery` / `error-analysis` /
+      `implementation-planning` (forces `verificationMode` to "full-reanalysis"),
+      False otherwise
     - `planBodyVerification` is implementation-planning specific; the key is
       always emitted (dead-letter on other phases) so the schema stays stable.
     ctx knobs honoured:
     - `OKSTRA_PLAN_VERIFICATION`: "true" | "false" | "" (empty → default True).
       Wired from CLI `--no-plan-verification` (sets "false").
+    - `CRITIC_CHOICE`: "" | "off" | "claude" | "codex" | "gemini" — critic
+      backing provider (enabled only for requirements-discovery / error-analysis /
+      implementation-planning / final-verification); model taken from that
+      provider's execution value.
     """
     task_type = ctx.get("TASK_TYPE", "")
     default_max_rounds = 1 if task_type == "requirements-discovery" else 2
+    adversarial_phases = {"requirements-discovery", "error-analysis", "implementation-planning"}
+    is_adversarial = task_type in adversarial_phases
     raw_plan_verify = (ctx.get("OKSTRA_PLAN_VERIFICATION", "") or "").strip().lower()
     plan_verify_enabled = raw_plan_verify != "false"
+    critic_choice = (ctx.get("CRITIC_CHOICE", "") or "").strip().lower()
+    # Independent of `adversarial_phases` above (they answer different questions and
+    # may diverge): the coverage critic is opt-in for the finding-producing phases.
+    critic_phases = {"requirements-discovery", "error-analysis", "implementation-planning", "final-verification"}
+    critic_exec_key = {
+        "claude": "CLAUDE_WORKER_MODEL_EXECUTION_VALUE",
+        "codex": "CODEX_WORKER_MODEL_EXECUTION_VALUE",
+        "gemini": "GEMINI_WORKER_MODEL_EXECUTION_VALUE",
+    }
+    critic_enabled = critic_choice in critic_exec_key and task_type in critic_phases
+    critic_block = {
+        "enabled": critic_enabled,
+        "provider": critic_choice if critic_enabled else None,
+        "modelExecutionValue": (ctx.get(critic_exec_key[critic_choice]) or None) if critic_enabled else None,
+    }
     return {
         "enabled": True,
+        "adversarial": is_adversarial,
         "maxRounds": default_max_rounds,
-        "verificationMode": "lightweight",
+        "verificationMode": "full-reanalysis" if is_adversarial else "lightweight",
+        "critic": critic_block,
         "planBodyVerification": {
             "enabled": plan_verify_enabled,
             "maxRounds": 1,

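The critic gating in the `render.py` hunk above can be read in isolation as the sketch below. It is a minimal reconstruction for illustration, not the package's code: the function name `build_critic_block` and the model value `"model-x"` are hypothetical, while the phase set, the provider-to-key mapping, and the three output keys are copied from the hunk.

```python
# Phases where the hunk allows the critic to be enabled.
CRITIC_PHASES = {
    "requirements-discovery",
    "error-analysis",
    "implementation-planning",
    "final-verification",
}

# Provider -> ctx key holding that provider's execution model value.
CRITIC_EXEC_KEY = {
    "claude": "CLAUDE_WORKER_MODEL_EXECUTION_VALUE",
    "codex": "CODEX_WORKER_MODEL_EXECUTION_VALUE",
    "gemini": "GEMINI_WORKER_MODEL_EXECUTION_VALUE",
}


def build_critic_block(ctx: dict) -> dict:
    """Hypothetical helper: derive the `critic` block from a render ctx."""
    task_type = ctx.get("TASK_TYPE", "")
    choice = (ctx.get("CRITIC_CHOICE", "") or "").strip().lower()
    # "" and "off" are not keys of CRITIC_EXEC_KEY, so both disable the critic.
    enabled = choice in CRITIC_EXEC_KEY and task_type in CRITIC_PHASES
    return {
        "enabled": enabled,
        "provider": choice if enabled else None,
        # The key lookup only runs when `enabled` is True, so no KeyError on "off".
        "modelExecutionValue": (ctx.get(CRITIC_EXEC_KEY[choice]) or None) if enabled else None,
    }


if __name__ == "__main__":
    print(build_critic_block({
        "TASK_TYPE": "error-analysis",
        "CRITIC_CHOICE": "codex",
        "CODEX_WORKER_MODEL_EXECUTION_VALUE": "model-x",  # placeholder value
    }))
```

Under this reading, `off`, an empty choice, and any phase outside the four listed all collapse to the same disabled block, which keeps the emitted schema stable regardless of the choice.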
package/runtime/python/okstra_ctl/run.py CHANGED

@@ -120,6 +120,7 @@ class PrepareInputs:
     gemini_model: str = ""
     report_writer_model: str = ""
     executor: str = ""
+    critic: str = ""
     related_tasks_raw: str = ""
     work_category: str = ""
     base_ref: str = ""
@@ -499,6 +500,7 @@ def _canonical_argv(inp: PrepareInputs, ctx: dict) -> list[str]:
         ("--gemini-model", inp.gemini_model or ctx.get("GEMINI_WORKER_MODEL", "")),
         ("--report-writer-model", inp.report_writer_model or ctx.get("REPORT_WRITER_MODEL", "")),
         ("--executor", inp.executor or ctx.get("EXECUTOR_PROVIDER", "")),
+        ("--critic", inp.critic or ctx.get("CRITIC_CHOICE", "")),
         ("--related-tasks", inp.related_tasks_raw),
         ("--work-category", inp.work_category),
     ]
@@ -707,6 +709,13 @@ def prepare_task_bundle(inp: PrepareInputs) -> PrepareOutputs:
         default_display=report_writer_default, default_execution=report_writer_default,
     )
+    # ---- coverage critic choice (validated; phase-gating happens in render) ----
+    critic_choice = (inp.critic or "").strip().lower()
+    if critic_choice not in ("", "off", "claude", "codex", "gemini"):
+        raise PrepareError(
+            f"--critic must be one of: off, claude, codex, gemini (got: {critic_choice!r})"
+        )
     # ---- executor binding (implementation phase only; recorded universally for manifest consistency) ----
     executor_default = _default("OKSTRA_DEFAULT_EXECUTOR", "claude")
     executor_provider = (inp.executor or executor_default).strip().lower()
@@ -842,6 +851,7 @@ def prepare_task_bundle(inp: PrepareInputs) -> PrepareOutputs:
         "EXECUTOR_WORKER_AGENT": executor_worker_agent,
         "EXECUTOR_MODEL_DISPLAY": executor_model_meta.display,
         "EXECUTOR_MODEL_EXECUTION_VALUE": executor_model_meta.execution,
+        "CRITIC_CHOICE": critic_choice,
         "RELATED_TASKS_JSON": related_tasks_json_str,
         "RELATED_TASKS_BULLETS": bullets,
         "RELATED_TASKS_INLINE": inline,
@@ -1098,6 +1108,7 @@ def main(argv: list[str]) -> int:
     p.add_argument("--gemini-model", default="")
     p.add_argument("--report-writer-model", default="")
     p.add_argument("--executor", default="")
+    p.add_argument("--critic", default="")
     p.add_argument("--related-tasks", default="", dest="related_tasks_raw")
     p.add_argument("--approved-plan", default="", dest="approved_plan_path")
     p.add_argument(
@@ -1198,6 +1209,7 @@ def main(argv: list[str]) -> int:
         gemini_model=args.gemini_model,
         report_writer_model=args.report_writer_model,
         executor=args.executor,
+        critic=args.critic,
         related_tasks_raw=args.related_tasks_raw,
         work_category=args.work_category,
         base_ref=args.base_ref,
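
The `--critic` validation added to `prepare_task_bundle` in the `run.py` hunks above amounts to a normalize-then-check step. The sketch below isolates it under stated assumptions: `normalize_critic` is a hypothetical helper name, and `PrepareError` is a local stand-in whose base class is assumed, since the diff does not show the package's own definition.

```python
from typing import Optional

# Accepted values, as listed in the hunk ("" means the flag was not passed).
ALLOWED_CRITIC = ("", "off", "claude", "codex", "gemini")


class PrepareError(ValueError):
    """Stand-in for the package's PrepareError (base class is an assumption)."""


def normalize_critic(raw: Optional[str]) -> str:
    """Hypothetical helper: trim and lowercase, then reject unknown providers."""
    choice = (raw or "").strip().lower()
    if choice not in ALLOWED_CRITIC:
        raise PrepareError(
            f"--critic must be one of: off, claude, codex, gemini (got: {choice!r})"
        )
    return choice


if __name__ == "__main__":
    print(repr(normalize_critic(" Claude ")))
    try:
        normalize_critic("gpt")
    except PrepareError as exc:
        print(exc)
```

Phase gating is deliberately absent here: per the comment in the hunk, this step only validates the value, and whether the critic actually runs is decided later in render.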

package/runtime/python/okstra_ctl/wizard.py CHANGED

@@ -181,6 +181,8 @@ S_APPROVED_PLAN_PICK = "approved_plan_pick"
 S_APPROVED_PLAN = "approved_plan"
 S_STAGE_PICK = "stage_pick"
 S_EXECUTOR = "executor"
+S_CRITIC_PICK = "critic_pick"
+S_CRITIC_TEXT = "critic_text"
 S_DEFAULTS_OR_CUSTOM = "defaults_or_custom"
 S_WORKERS_OVERRIDE = "workers_override"
 S_LEAD_MODEL = "lead_model"
@@ -246,6 +248,8 @@ class WizardState:
     approved_plan_pending_text: bool = False
     selected_stage: str = "auto"
     executor: str = ""
+    critic: str = ""
+    critic_pending_text: bool = False
     # customize
     use_defaults: Optional[bool] = None
@@ -1459,6 +1463,55 @@ def _submit_pr_template_pick(state: WizardState, value: str) -> Optional[str]:
     )
+CRITIC_CHOICES = ["off", "claude", "codex", "gemini"]
+def _build_critic_pick(state: WizardState) -> Prompt:
+    t = _p(state.workspace_root, "critic_pick")
+    options: list[Option] = []
+    for k, v in t["options"].items():
+        if not k.startswith("_"):
+            options.append(_opt(k, v))
+    custom_label = t["options"].get(PICK_TYPE_CUSTOM, PICK_TYPE_CUSTOM)
+    options.append(_opt(PICK_TYPE_CUSTOM, custom_label))
+    return Prompt(
+        step=S_CRITIC_PICK, kind="pick",
+        label=t["label"],
+        options=options,
+        echo_template=t["echo_template"],
+    )
+def _submit_critic_pick(state: WizardState, value: str) -> Optional[str]:
+    if value == PICK_TYPE_CUSTOM:
+        state.critic_pending_text = True
+        return None
+    choice = (value or "").strip().lower()
+    if choice not in CRITIC_CHOICES:
+        raise WizardError(f"critic must be one of {CRITIC_CHOICES}, got: {value!r}")
+    state.critic = choice
+    state.critic_pending_text = False
+    return f"critic: {choice}"
+def _build_critic_text(state: WizardState) -> Prompt:
+    t = _p(state.workspace_root, "critic_text")
+    return Prompt(
+        step=S_CRITIC_TEXT, kind="text",
+        label=t["label"],
+        echo_template=t["echo_template"],
+    )
+def _submit_critic_text(state: WizardState, value: str) -> Optional[str]:
+    choice = (value or "").strip().lower()
+    if choice not in CRITIC_CHOICES:
+        raise WizardError(f"critic must be one of {CRITIC_CHOICES}, got: {value!r}")
+    state.critic = choice
+    state.critic_pending_text = False
+    return f"critic: {choice}"
 def _build_executor(state: WizardState) -> Prompt:
     t = _p(state.workspace_root, "executor")
     default_suffix = t["options"].get("_DEFAULT_SUFFIX", "")
@@ -1922,6 +1975,17 @@ STEPS: list[Step] = [
                             and not s.executor),
          build=_build_executor, submit=_submit_executor,
          owns=("executor",)),
+    Step(S_CRITIC_PICK,
+         applies=lambda s: (s.task_type in ("requirements-discovery", "error-analysis", "implementation-planning", "final-verification")
+                            and not s.critic
+                            and not s.critic_pending_text
+                            and S_CRITIC_PICK not in s.answered),
+         build=_build_critic_pick, submit=_submit_critic_pick,
+         owns=("critic", "critic_pending_text")),
+    Step(S_CRITIC_TEXT,
+         applies=lambda s: (s.critic_pending_text and S_CRITIC_TEXT not in s.answered),
+         build=_build_critic_text, submit=_submit_critic_text,
+         owns=("critic", "critic_pending_text")),
     Step(S_DEFAULTS_OR_CUSTOM,
          applies=lambda s: (_identity_ready(s)
                             and s.use_defaults is None),
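
The `applies` predicates of the two new steps in the hunk above decide when the picker and the free-text prompt are shown. The sketch below restates them against a reduced state object so the skip behaviour is easier to see. `MiniState`, `pick_applies`, and `text_applies` are hypothetical names, and `MiniState` carries only the four fields the predicates read, not the full `WizardState`.

```python
from dataclasses import dataclass, field

CRITIC_PHASES = (
    "requirements-discovery",
    "error-analysis",
    "implementation-planning",
    "final-verification",
)


@dataclass
class MiniState:
    """Reduced stand-in for WizardState (only the fields used below)."""
    task_type: str = ""
    critic: str = ""
    critic_pending_text: bool = False
    answered: set = field(default_factory=set)


def pick_applies(s: MiniState) -> bool:
    # Shown only for eligible phases, and only while no choice exists yet.
    return (s.task_type in CRITIC_PHASES
            and not s.critic
            and not s.critic_pending_text
            and "critic_pick" not in s.answered)


def text_applies(s: MiniState) -> bool:
    # Shown only after the picker handed off to free-text input.
    return s.critic_pending_text and "critic_text" not in s.answered


if __name__ == "__main__":
    fresh = MiniState(task_type="error-analysis")
    preseeded = MiniState(task_type="error-analysis", critic="codex")
    print(pick_applies(fresh), pick_applies(preseeded))
```

A state pre-seeded with a critic value, as the `init --critic` path in this diff does, never satisfies `pick_applies`, which matches the design note that a CLI parameter skips the selection step.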
@@ -2118,7 +2182,8 @@ _FIELD_DEFAULTS: dict[str, Any] = {
     "base_ref_pending_text": False, "approved_plan_path": "",
     "approved_plan_pending_text": False,
     "selected_stage": "auto",
-    "executor": "", "use_defaults": None, "workers_override": "",
+    "executor": "", "critic": "", "critic_pending_text": False,
+    "use_defaults": None, "workers_override": "",
     "lead_model": "", "claude_model": "", "codex_model": "",
     "gemini_model": "", "report_writer_model": "", "directive": "",
     "directive_pending_text": False,
@@ -2200,6 +2265,7 @@ def render_args(state: WizardState) -> dict[str, str]:
         "task-type": state.task_type,
         "task-brief": state.brief_path,
         "executor": state.executor,
+        "critic": state.critic,
         "approved-plan": state.approved_plan_path,
         "stage": (state.selected_stage or "auto") if state.task_type == "implementation" else "",
         "base-ref": base_ref,
@@ -2244,6 +2310,8 @@ def confirmation_block(state: WizardState) -> str:
     if state.report_writer_model:
         lines.append(f"  report-writer : {state.report_writer_model}")
     lines.append(f"  directive     : {state.directive or '(none)'}")
+    if state.task_type in ("requirements-discovery", "error-analysis", "implementation-planning", "final-verification"):
+        lines.append(f"  critic        : {state.critic or '(off)'}")
     if state.task_type == "implementation":
         lines.append(f"  approved-plan : {state.approved_plan_path}")
     if state.clarification_response_path:
@@ -2288,6 +2356,7 @@ def _cli(argv: list[str]) -> int:
     p_init.add_argument("--workspace-root", required=True)
     p_init.add_argument("--project-root", required=True)
     p_init.add_argument("--project-id", required=True)
+    p_init.add_argument("--critic", default="")
     p_step = sub.add_parser("step")
     p_step.add_argument("--state-file", required=True)
@@ -2313,6 +2382,8 @@ def _cli(argv: list[str]) -> int:
             project_root=args.project_root,
             project_id=args.project_id,
         )
+        if args.critic:
+            state.critic = args.critic
         save_state_file(state_path, state)
         first = next_prompt(state)
         print(json.dumps({"ok": True, "next": first.to_json()},