npm - okstra - Versions diffs - 0.30.3 → 0.31.0 - Mend

okstra 0.30.3 → 0.31.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (27)

package/docs/kr/architecture.md +2 -2
package/docs/kr/cli.md +2 -2
package/package.json +1 -1
package/runtime/BUILD.json +2 -2
package/runtime/agents/SKILL.md +4 -2
package/runtime/agents/workers/claude-worker.md +1 -1
package/runtime/agents/workers/codex-worker.md +23 -6
package/runtime/agents/workers/gemini-worker.md +23 -6
package/runtime/agents/workers/report-writer-worker.md +2 -1
package/runtime/bin/okstra-codex-exec.sh +31 -0
package/runtime/bin/okstra-gemini-exec.sh +26 -0
package/runtime/python/lib/okstra/globals.sh +1 -1
package/runtime/python/lib/okstra/usage.sh +2 -2
package/runtime/python/okstra_ctl/models.py +2 -0
package/runtime/python/okstra_ctl/report_views.py +186 -10
package/runtime/python/okstra_ctl/run.py +1 -1
package/runtime/python/okstra_ctl/wizard.py +53 -14
package/runtime/python/okstra_ctl/workers.py +45 -11
package/runtime/python/okstra_token_usage/pricing.py +1 -0
package/runtime/skills/okstra-report-writer/SKILL.md +2 -2
package/runtime/skills/okstra-run/SKILL.md +6 -4
package/runtime/skills/okstra-team-contract/SKILL.md +27 -3
package/runtime/templates/reports/final-report.template.md +14 -8
package/runtime/templates/reports/report.css +51 -4
package/runtime/templates/reports/report.js +63 -7
package/runtime/templates/reports/settings.template.json +1 -0
package/src/install.mjs +1 -0

package/docs/kr/architecture.md CHANGED

@@ -257,7 +257,7 @@ Claude launch prompt 본문은 항상 `prompts/launch.template.md` 템플릿에
 - The main Claude is always `Claude lead` and operates synthesis-only.
 - The default required worker roles are `Claude worker`, `Codex worker`, and `Report writer worker`. `Gemini worker` is an optional worker and is included as required only when it is named in `--workers` or in the profile's `- Workers:` section.
 - `Report writer worker` focuses on structuring the report and organizing evidence, but the final synthesis owner is still `Claude lead`.
-- The default model contract is computed from the central defaults. The default fallbacks are `Claude lead`=`opus`, `Claude worker`=`sonnet`, `Codex worker`=`gpt-5.5`, and `Gemini worker`=`auto` (applied on opt-in); `Report writer worker` follows the `Claude lead` model unless it is overridden separately (that is, `opus` by default).
+- The default model contract is computed from the central defaults. The default fallbacks are `Claude lead`=`opus-4-6`, `Claude worker`=`sonnet`, `Codex worker`=`gpt-5.5`, and `Gemini worker`=`auto` (applied on opt-in); `Report writer worker` follows the `Claude lead` model unless it is overridden separately (that is, `opus-4-6` by default).
 - `Gemini worker` is optional, so it is attempted only in runs that explicitly include it.
 - Before the final judgment, each required role in the current run's worker roster needs either a result or an explicit terminal status (`completed`, `timeout`, `error`, `not-run`).
 - An attempted worker (`completed`, `timeout`, `error`) must have an assigned worker prompt history file under the current run's `prompts/`.
@@ -929,7 +929,7 @@ Entry points that enforce whether a phase's deliverables are shippable:
 - Inside the run directory, content is organized into per-type subfolders such as `manifests/`, `state/`, `prompts/`, `reports/`, `status/`, `sessions/`, and `worker-results/`, and prompt snapshots are prepared under `prompts/` first.
 - Claude performs worker creation and result collection.
 - The standard workflow uses `Claude lead` plus the default workers `Claude worker`, `Codex worker`, and `Report writer worker`; `Gemini worker` is an option that is included only when specified.
-- Worker models can be overridden with `--lead-model`, `--claude-model`, `--codex-model`, `--gemini-model`, and `--report-writer-model`, and the defaults are managed centrally through the `OKSTRA_DEFAULT_*` environment variables. The fallback defaults are `Claude lead`/`Report writer worker`=`opus`, `Claude worker`=`sonnet`, `Codex worker`=`gpt-5.5`, and `Gemini worker`=`auto`.
+- Worker models can be overridden with `--lead-model`, `--claude-model`, `--codex-model`, `--gemini-model`, and `--report-writer-model`, and the defaults are managed centrally through the `OKSTRA_DEFAULT_*` environment variables. The fallback defaults are `Claude lead`/`Report writer worker`=`opus-4-6`, `Claude worker`=`sonnet`, `Codex worker`=`gpt-5.5`, and `Gemini worker`=`auto`.
 - With `--task-type implementation`, the provider that takes the Executor role is selected with `--executor <claude|codex|gemini>` (or `OKSTRA_DEFAULT_EXECUTOR`, fallback `claude`). Only the Executor may mutate project files; the other two providers and the Executor's own provider are all dispatched as verifiers in separate CLI sessions (session separation alone preserves the self-review safeguard). The Executor's model reuses the selected provider's worker model flag (`--claude-model` / `--codex-model` / `--gemini-model`) as-is, and the run-manifest's `teamContract.executor` block records provider / displayName / workerAgent / model.
 - Per-Executor worktree cwd injection: for the codex / gemini executors, the wrapper (`okstra-codex-exec.sh -C` / `okstra-gemini-exec.sh --include-directories`) pins cwd to the worktree at the CLI layer. The Claude executor has no per-call cwd argument on the Bash tool, so calls to cwd-sensitive toolchains (`cargo`/`npm`/`pnpm`/`bun`/`pytest`/`make`/`go`) are prefixed with `cd {{EXECUTOR_WORKTREE_PATH}} && <cmd>` inside the same Bash invocation. Wrapping in `bash -lc`/`bash -c` is forbidden (the leading `cd` token is obscured, so the permission auto-allow bypass fails), and a working-directory flag (`git -C`, `cargo --manifest-path`, etc.) takes precedence when one exists. For the detailed rules, see the *Executor Worktree* block in `prompts/profiles/implementation.md` and the Executor exception item in `agents/workers/claude-worker.md`.
 - The project-level current-task convenience pointer is `.project-docs/okstra/discovery/latest-task.json`.
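
Reviewer note: the practical effect of this hunk is that an unset lead model now resolves to `opus-4-6` instead of `opus`. The following is a minimal sketch of that precedence (explicit flag value, then `OKSTRA_DEFAULT_LEAD_MODEL`, then the fallback), checked against the Claude alias list that appears in the `docs/kr/cli.md` hunk of this diff. The function name `resolve_lead_model` is hypothetical; per the docs, the package's own validation lives in `okstra_ctl/models.py` and raises `UnknownModelError`, which this sketch only approximates with a non-zero return.

```shell
# Hypothetical helper (not part of okstra): resolve the lead model as
# explicit value > OKSTRA_DEFAULT_LEAD_MODEL > fallback `opus-4-6`,
# then accept it only if it is one of the documented Claude aliases.
resolve_lead_model() {
  model=${1:-${OKSTRA_DEFAULT_LEAD_MODEL:-opus-4-6}}
  case $model in
    opus|opus-4-7|claude-opus-4-7|opus-4-6|claude-opus-4-6|\
    sonnet|sonnet-4-6|claude-sonnet-4-6|\
    haiku|haiku-4-5|claude-haiku-4-5|claude-haiku-4-5-20251001)
      printf '%s\n' "$model" ;;
    *)
      # Stand-in for the documented UnknownModelError rejection.
      printf 'unknown model alias: %s\n' "$model" >&2
      return 1 ;;
  esac
}
```

Called with no argument and no environment override, the sketch prints `opus-4-6`; passing an alias such as `haiku` overrides both the environment variable and the fallback.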

package/docs/kr/cli.md CHANGED

@@ -277,7 +277,7 @@ scripts/okstra.sh --task-type implementation-planning --workers claude,codex --p
 ```
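 > Every `--*-model` flag accepts only aliases registered in the per-provider mapping in `scripts/okstra_ctl/models.py`. Unregistered values are rejected immediately with `UnknownModelError` (this blocks, ahead of time, a contract-violation caused by a mismatch between the manifest's `modelExecutionValue` and the value actually executed). Allowed values: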
-> - Claude (`--lead-model` / `--claude-model` / `--report-writer-model`): `opus`, `opus-4-7`, `claude-opus-4-7`, `sonnet`, `sonnet-4-6`, `claude-sonnet-4-6`, `haiku`, `haiku-4-5`, `claude-haiku-4-5`, `claude-haiku-4-5-20251001`
+> - Claude (`--lead-model` / `--claude-model` / `--report-writer-model`): `opus`, `opus-4-7`, `claude-opus-4-7`, `opus-4-6`, `claude-opus-4-6`, `sonnet`, `sonnet-4-6`, `claude-sonnet-4-6`, `haiku`, `haiku-4-5`, `claude-haiku-4-5`, `claude-haiku-4-5-20251001`
 > - Codex (`--codex-model`): `gpt-5.5`, `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.3-codex`, `gpt-5.2`, `codex-auto-review`
 > - Gemini (`--gemini-model`): `auto`, `pro`, `gemini-3-flash-preview`, `gemini-3-pro-preview` (plus the `gemini auto` / `gemini pro` aliases)
@@ -289,7 +289,7 @@ scripts/okstra.sh --task-type implementation-planning --workers claude,codex --p
 ### `--lead-model`
 Specifies the model to use for `Claude lead`.
-If not specified, the central default `OKSTRA_DEFAULT_LEAD_MODEL` or the fallback `opus` is used.
+If not specified, the central default `OKSTRA_DEFAULT_LEAD_MODEL` or the fallback `opus-4-6` is used.
 ### `--codex-model`

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "okstra",
-  "version": "0.30.3",
+  "version": "0.31.0",
   "description": "Multi-agent cross-verification orchestrator runtime + Claude Code skills.",
   "license": "MIT",
   "author": "devonshin",

package/runtime/BUILD.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "package": "0.30.3",
-  "builtAt": "2026-05-19T08:41:16.438Z",
+  "package": "0.31.0",
+  "builtAt": "2026-05-19T11:50:21.853Z",
   "repoRoot": "/home/runner/work/okstra/okstra"
 }

package/runtime/agents/SKILL.md CHANGED

@@ -88,8 +88,8 @@ Unless the task bundle overrides:
 | Role | Model | subagent_type | Notes |
 |------|-------|---------------|-------|
-| Claude lead | opus | -- | orchestration + synthesis |
-| Report writer worker | opus | report-writer-worker | authors the final report file (`agents/report-writer-worker.md`) |
+| Claude lead | opus-4-6 | -- | orchestration + synthesis |
+| Report writer worker | opus-4-6 | report-writer-worker | authors the final report file (`agents/report-writer-worker.md`) |
 | Claude worker | sonnet | claude-worker | defined in `agents/claude-worker.md` |
 | Codex worker | gpt-5.5 | codex-worker | defined in `agents/codex-worker.md` |
 | Gemini worker | auto | gemini-worker | defined in `agents/gemini-worker.md` |
@@ -184,6 +184,8 @@ The launch prompt's `## Run Logs (error-log wiring)` section gives Lead the reso
 Workers are contractually required to extract these two lines and abort with `<WORKER>_ERRORS_PATH_MISSING` if either is absent (see each worker definition's "Path extraction (BLOCKING)" block). Omitting these headers reproduces the historical bug class where every run's `errors-<task-type>-<seq>.jsonl` stayed empty because workers had only template placeholders to work from.
+After each worker terminates, BEFORE classifying its terminal status, verify the canonical result file exists at the absolute path resolved from the `**Result Path:**` header. If it is absent — or the wrapper sub-agent returned `CODEX_RESULT_MISSING` / `GEMINI_RESULT_MISSING` — re-dispatch the SAME worker once with the byte-identical prompt. Only after the second attempt also misses may the role be classified `error` with `--message "result-missing after 1 retry"`. Full rules: [okstra-team-contract](./skills/okstra-team-contract/SKILL.md) "Lead Redispatch Policy on Result-Missing".
 After each worker terminates (any terminal status), if its errors sidecar exists, dump it to the run error log using the same resolved paths from the launch prompt:
 ```bash

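Reviewer note: the added paragraph in this hunk describes a "verify, then re-dispatch once" rule. A minimal sketch of that control flow follows, assuming the worker dispatch can be modelled as a plain command and that only the result-file check is of interest. The name `dispatch_with_one_retry` is hypothetical, and the sketch does not model the `CODEX_RESULT_MISSING` / `GEMINI_RESULT_MISSING` sentinels or the byte-identical prompt requirement beyond re-running the same arguments.

```shell
# Hypothetical sketch (not okstra code): run the same dispatch command,
# check for the result file, and re-dispatch exactly once if it is absent.
# $1 = absolute Result Path, remaining args = the dispatch command.
dispatch_with_one_retry() {
  result_path=$1; shift
  attempt=0
  while [ "$attempt" -lt 2 ]; do
    "$@" || true                 # the command's own exit code is classified elsewhere
    if [ -f "$result_path" ]; then
      return 0                   # artifact exists, normal classification can proceed
    fi
    attempt=$((attempt + 1))
  done
  # Both the first attempt and the single retry missed.
  printf 'error: result-missing after 1 retry\n' >&2
  return 1
}
```

The loop bound of two attempts is what keeps the retry to a single re-dispatch: only after the second miss does the role fall through to `error`.
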
package/runtime/agents/workers/claude-worker.md CHANGED

@@ -85,7 +85,7 @@ This contract mirrors the `okstra-team-contract` skill's Worker Output Contract
 You are an in-process Claude subagent — Lead's `Agent()` call blocks until you return your final assistant message. Lingering after your worker-results file is on disk extends Phase 4 wall-clock time for the entire run and delays convergence. Be deliberate about stopping.
-After your `Write` to the assigned worker-results file (path provided by Lead as `**Worker Result Path:**` or derived under `runs/<task-type>/worker-results/claude-worker-<task-type>-<seq>.md`) succeeds:
+After your `Write` to the assigned worker-results file (path provided by Lead as `**Result Path:**` — the canonical anchor header defined in `okstra-team-contract` "Worker Prompt Composition" — or derived under `runs/<task-type>/worker-results/claude-worker-<task-type>-<seq>.md`) succeeds:
 1. Return your final assistant message **immediately**, in this format:
    `Worker results written to <abs path>. Sections 1–5 complete. Findings: <n>.`

package/runtime/agents/workers/codex-worker.md CHANGED

@@ -84,13 +84,29 @@ The wrapper exists because Claude Code's Bash permission matcher rejects simple-
    - Repeat:
      1. Call `BashOutput(bash_id: <shell_id>)`. Inspect `status`. The harness's `BashOutput` primitive already waits internally for new output before returning; back-to-back calls are the canonical wait mechanism for a background shell.
      2. If `status == "completed"`: break out of the loop and proceed to step 8.
-     3. If wall-clock elapsed (`current_ts - start_ts`) exceeds `1800` seconds: call `KillShell(shell_id: <shell_id>)`, then record a `cli-failure` event with `--error-type cli-failure`, `--exit-code 124`, `--duration-ms 1800000`, `--message "okstra-codex-exec.sh exceeded 30m polling cap"`, and return `CODEX_CLI_TIMEOUT: codex exec exceeded 30-minute polling cap`.
+     3. If wall-clock elapsed (`current_ts - start_ts`) exceeds the current cap (initially `1800` seconds), the cap is reached. **Before** calling `KillShell`, perform a one-shot **mtime-grace check** to distinguish "CLI is stuck" from "CLI is still actively writing." Single `Bash` call (mtime portable across BSD/GNU `stat`):
+        `log="${prompt_path%.md}.log"; mtime=$(stat -f '%m' "$log" 2>/dev/null || stat -c '%Y' "$log" 2>/dev/null); date +%s`
+        (output captured — first line is `mtime`, second is `current_ts`).
+        - If `current_ts - mtime <= 90` AND grace has NOT yet been applied this polling loop: extend the cap to `2100` seconds (one-shot +5min grace), record the grace internally so it does not re-trigger, and continue polling.
+        - Otherwise (mtime stale `> 90s`, OR grace already applied): call `KillShell(shell_id: <shell_id>)`, then record a `cli-failure` event with `--error-type cli-failure`, `--exit-code 124`, `--duration-ms <observed_ms>`, `--message "okstra-codex-exec.sh exceeded polling cap (grace=<applied|not-applied>, last_mtime_age=<n>s)"`, and return `CODEX_CLI_TIMEOUT: codex exec exceeded polling cap`.
      4. Otherwise continue polling. Read `current_ts` cheaply via another `Bash` call (`date +%s`) at most once per poll iteration.
-   - Do NOT abort the loop on transient `running` status. Only `completed` or the 30-minute cap end it.
+   - Do NOT abort the loop on transient `running` status. Only `completed` or the polling cap (initially 30min, optionally extended once to 35min by mtime grace) end it.
+   - **No external timeout from Lead.** This polling loop is the SINGLE timeout authority for this dispatch. Lead MUST NOT impose a separate Agent-call timeout that would terminate this subagent before the polling cap is reached (see okstra-team-contract "No external timeout on wrapper subagents").
    - Do NOT issue parallel `BashOutput` calls or speculate about progress between polls.
    - **No standalone `sleep` between polls.** The harness blocks `sleep` calls of 5 seconds or longer as a circumvention vector and explicitly forbids chaining shorter sleeps to work around it. `BashOutput` itself is the wait primitive — calling it again immediately after a `running` status is correct. If you find yourself wanting to "slow down" the loop, that desire is a leftover from the retired 60-second-cadence rule and should be ignored.
-8. Concatenate the wrapper's accumulated stdout from `BashOutput` and return it as-is without modification. If the final `BashOutput` reports a non-zero `exit_code`, follow the **CLI failure** rule in §"Error reporting" before returning.
+8. After the polling loop exits with `completed`, perform terminal-status determination BEFORE returning:
+   a. **Extract Result Path.** Read the `**Result Path:** <abs-path>` header line from the lead's dispatch prompt body. If the header is absent, return `CODEX_RESULT_PATH_MISSING: lead prompt did not include **Result Path:** header` without proceeding. Resolve to absolute against `Project Root` if relative.
+   b. **CLI failure first.** If the final `BashOutput` reports a non-zero `exit_code`, follow the **CLI failure** rule in §"Error reporting" before returning. Do NOT perform the result-file check on a failed exit — `cli-failure` already covers it.
+   c. **Result-file existence check (exit 0 only).** If `exit_code == 0` BUT no file exists at the extracted Result Path, the Codex CLI returned 0 without producing the analysis artifact. Observed failure mode: the CLI streams analysis prose on stdout, hits its token budget or a sandbox EPERM mid-`Write`, and exits 0 with the artifact never persisted. Forwarding the partial stdout silently degrades lead synthesis (the case that motivated this rule), so this path is required.
+      1. Capture the final ~10 lines of the wrapper's live log for diagnostics — single Bash call: `tail -n 10 "${prompt_path%.md}.log"` (substitute the literal absolute prompt-history path; the wrapper writes the log next to it per the §"trace pane" comment in `okstra-codex-exec.sh`). Write the captured lines to a temp file (e.g. `<errors-sidecar-dir>/codex-result-missing-tail.txt`) so `--stderr-excerpt-file` can reference it.
+      2. Record a `cli-failure` event directly to the run-level error log via the exact `okstra-error-log.py append-observed` template in §"Error reporting" — substitute `--exit-code 0`, `--duration-ms <observed-ms>`, `--message "okstra-codex-exec.sh exited 0 but no result file at <abs-path>"`, and `--stderr-excerpt-file <temp-tail-path>`.
+      3. Return `CODEX_RESULT_MISSING: codex exited 0 but result file absent at <abs-path>` instead of the raw stdout. The lead is responsible for deciding redispatch per `okstra-team-contract` "Lead Redispatch Policy on Result-Missing".
+   d. **Normal return.** Otherwise (`exit_code == 0` AND result file exists), concatenate the wrapper's accumulated stdout from `BashOutput` and return it as-is without modification.
 ## Stop Condition
@@ -98,7 +114,7 @@ This wrapper is a thin Bash-execution shell over the Codex CLI (via `okstra-code
 - Return immediately after the polling loop exits with `completed` (or after recording any required `cli-failure` event for a non-zero exit / 30-minute cap / rate-limit).
 - The only tool calls permitted during the polling loop are `BashOutput`, a single `Bash` call per iteration for `date +%s` (timeout bookkeeping only — no `sleep`), and — on the timeout path only — `KillShell`. Do NOT perform additional `Read`, `Grep`, `Glob` calls between polls; do NOT inspect intermediate wrapper output mid-run.
-- Outside the polling loop, no `Read`, `Grep`, or `Glob` beyond what is strictly required by steps 1–7 (prompt persistence, Project Root extraction, model resolution).
+- Outside the polling loop, no `Read`, `Grep`, or `Glob` beyond what is strictly required by steps 1–8 (prompt persistence, Project Root extraction, model resolution, and the step 8 result-file check). The step 8 result-file existence check is explicitly permitted: at most one `Bash` call for `tail -n 10 <log-path>` and one `Read`/test of the Result Path.
 - Do NOT re-invoke `okstra-codex-exec.sh` to "double-check" or "rerun for safety" — convergence (Phase 5.5) handles cross-worker reconciliation. A single CLI dispatch per dispatched-prompt is the contract.
 The Codex CLI's own exit terminates the underlying analysis; this wrapper terminates by returning its captured output (or sentinel).
@@ -175,10 +191,11 @@ and the run-level error log staying empty.
    will dump it to the run error log after this subagent terminates.
 2. **CLI failure (lead-observed)** — if the wrapper's final `BashOutput`
-   reports a non-zero `exit_code`, the 30-minute polling cap is hit, or the
+   reports a non-zero `exit_code`, the polling cap (30min, optionally
+   extended once to 35min via mtime grace; see step 7) is hit, or the
    captured stdout/stderr carries a rate-limit/auth message, immediately
    append a `cli-failure` event directly to the run error log. The
-   30-minute-cap path additionally requires a prior `KillShell` call against
+   polling-cap path additionally requires a prior `KillShell` call against
    the dispatched `bash_id`:
    ```bash

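Reviewer note: the new step 7.3 replaces the hard 30-minute kill with a one-shot grace decision based on the log file's mtime. The sketch below illustrates that decision under stated assumptions: the helper names `file_mtime` and `cap_decision` are hypothetical, the thresholds (90 seconds of staleness, one extension) come from the hunk, and treating an unreadable log as "kill" is an assumption of this sketch, since the hunk does not specify that case. On GNU coreutils, `stat -f` selects file-system status rather than a format string, so this sketch probes `stat -c` first and falls back to the BSD form.

```shell
# Print a file's mtime as epoch seconds. GNU `stat -c` is tried first;
# BSD/macOS `stat -f` is the fallback.
file_mtime() {
  stat -c '%Y' "$1" 2>/dev/null || stat -f '%m' "$1" 2>/dev/null
}

# Decide what to do once the polling cap has been reached.
# $1 = log path, $2 = current epoch seconds, $3 = "yes" if grace was already applied.
# Prints "extend" (raise the cap once, 1800 s -> 2100 s) or "kill".
cap_decision() {
  # Assumption of this sketch: an unreadable log counts as stale.
  mtime=$(file_mtime "$1") || { echo kill; return 0; }
  age=$(( $2 - $mtime ))
  if [ "$age" -le 90 ] && [ "$3" != "yes" ]; then
    echo extend
  else
    echo kill
  fi
}
```

A caller would record that grace was applied after the first `extend`, so a second cap hit always yields `kill` regardless of how fresh the log is.
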
package/runtime/agents/workers/gemini-worker.md CHANGED

@@ -84,13 +84,29 @@ The wrapper exists because Claude Code's Bash permission matcher rejects simple-
    - Repeat:
      1. Call `BashOutput(bash_id: <shell_id>)`. Inspect `status`. The harness's `BashOutput` primitive already waits internally for new output before returning; back-to-back calls are the canonical wait mechanism for a background shell.
      2. If `status == "completed"`: break out of the loop and proceed to step 8.
-     3. If wall-clock elapsed (`current_ts - start_ts`) exceeds `1800` seconds: call `KillShell(shell_id: <shell_id>)`, then record a `cli-failure` event with `--error-type cli-failure`, `--exit-code 124`, `--duration-ms 1800000`, `--message "okstra-gemini-exec.sh exceeded 30m polling cap"`, and return `GEMINI_CLI_TIMEOUT: gemini exec exceeded 30-minute polling cap`.
+     3. If wall-clock elapsed (`current_ts - start_ts`) exceeds the current cap (initially `1800` seconds), the cap is reached. **Before** calling `KillShell`, perform a one-shot **mtime-grace check** to distinguish "CLI is stuck" from "CLI is still actively writing." Single `Bash` call (mtime portable across BSD/GNU `stat`):
+        `log="${prompt_path%.md}.log"; mtime=$(stat -f '%m' "$log" 2>/dev/null || stat -c '%Y' "$log" 2>/dev/null); date +%s`
+        (output captured — first line is `mtime`, second is `current_ts`).
+        - If `current_ts - mtime <= 90` AND grace has NOT yet been applied this polling loop: extend the cap to `2100` seconds (one-shot +5min grace), record the grace internally so it does not re-trigger, and continue polling.
+        - Otherwise (mtime stale `> 90s`, OR grace already applied): call `KillShell(shell_id: <shell_id>)`, then record a `cli-failure` event with `--error-type cli-failure`, `--exit-code 124`, `--duration-ms <observed_ms>`, `--message "okstra-gemini-exec.sh exceeded polling cap (grace=<applied|not-applied>, last_mtime_age=<n>s)"`, and return `GEMINI_CLI_TIMEOUT: gemini exec exceeded polling cap`.
      4. Otherwise continue polling. Read `current_ts` cheaply via another `Bash` call (`date +%s`) at most once per poll iteration.
-   - Do NOT abort the loop on transient `running` status. Only `completed` or the 30-minute cap end it.
+   - Do NOT abort the loop on transient `running` status. Only `completed` or the polling cap (initially 30min, optionally extended once to 35min by mtime grace) end it.
+   - **No external timeout from Lead.** This polling loop is the SINGLE timeout authority for this dispatch. Lead MUST NOT impose a separate Agent-call timeout that would terminate this subagent before the polling cap is reached (see okstra-team-contract "No external timeout on wrapper subagents").
    - Do NOT issue parallel `BashOutput` calls or speculate about progress between polls.
    - **No standalone `sleep` between polls.** The harness blocks `sleep` calls of 5 seconds or longer as a circumvention vector and explicitly forbids chaining shorter sleeps to work around it. `BashOutput` itself is the wait primitive — calling it again immediately after a `running` status is correct. If you find yourself wanting to "slow down" the loop, that desire is a leftover from the retired 60-second-cadence rule and should be ignored.
-8. Concatenate the wrapper's accumulated stdout from `BashOutput` and return it as-is without modification. If the final `BashOutput` reports a non-zero `exit_code`, follow the **CLI failure** rule in §"Error reporting" before returning.
+8. After the polling loop exits with `completed`, perform terminal-status determination BEFORE returning:
+   a. **Extract Result Path.** Read the `**Result Path:** <abs-path>` header line from the lead's dispatch prompt body. If the header is absent, return `GEMINI_RESULT_PATH_MISSING: lead prompt did not include **Result Path:** header` without proceeding. Resolve to absolute against `Project Root` if relative.
+   b. **CLI failure first.** If the final `BashOutput` reports a non-zero `exit_code`, follow the **CLI failure** rule in §"Error reporting" before returning. Do NOT perform the result-file check on a failed exit — `cli-failure` already covers it.
+   c. **Result-file existence check (exit 0 only).** If `exit_code == 0` BUT no file exists at the extracted Result Path, the Gemini CLI returned 0 without producing the analysis artifact. Observed failure mode: the CLI streams analysis prose on stdout, hits its token budget or a sandbox EPERM mid-`Write`, and exits 0 with the artifact never persisted. Forwarding the partial stdout silently degrades lead synthesis (the case that motivated this rule), so this path is required.
+      1. Capture the final ~10 lines of the wrapper's live log for diagnostics — single Bash call: `tail -n 10 "${prompt_path%.md}.log"` (substitute the literal absolute prompt-history path; the wrapper writes the log next to it per the §"trace pane" comment in `okstra-gemini-exec.sh`). Write the captured lines to a temp file (e.g. `<errors-sidecar-dir>/gemini-result-missing-tail.txt`) so `--stderr-excerpt-file` can reference it.
+      2. Record a `cli-failure` event directly to the run-level error log via the exact `okstra-error-log.py append-observed` template in §"Error reporting" — substitute `--exit-code 0`, `--duration-ms <observed-ms>`, `--message "okstra-gemini-exec.sh exited 0 but no result file at <abs-path>"`, and `--stderr-excerpt-file <temp-tail-path>`.
+      3. Return `GEMINI_RESULT_MISSING: gemini exited 0 but result file absent at <abs-path>` instead of the raw stdout. The lead is responsible for deciding redispatch per `okstra-team-contract` "Lead Redispatch Policy on Result-Missing".
+   d. **Normal return.** Otherwise (`exit_code == 0` AND result file exists), concatenate the wrapper's accumulated stdout from `BashOutput` and return it as-is without modification.
 ## Stop Condition
@@ -98,7 +114,7 @@ This wrapper is a thin Bash-execution shell over the Gemini CLI (via `okstra-gem
 - Return immediately after the polling loop exits with `completed` (or after recording any required `cli-failure` event for a non-zero exit / 30-minute cap / rate-limit).
 - The only tool calls permitted during the polling loop are `BashOutput`, a single `Bash` call per iteration for `date +%s` (timeout bookkeeping only — no `sleep`), and — on the timeout path only — `KillShell`. Do NOT perform additional `Read`, `Grep`, `Glob` calls between polls; do NOT inspect intermediate wrapper output mid-run.
-- Outside the polling loop, no `Read`, `Grep`, or `Glob` beyond what is strictly required by steps 1–7 (prompt persistence, Project Root extraction, model resolution).
+- Outside the polling loop, no `Read`, `Grep`, or `Glob` beyond what is strictly required by steps 1–8 (prompt persistence, Project Root extraction, model resolution, and the step 8 result-file check). The step 8 result-file existence check is explicitly permitted: at most one `Bash` call for `tail -n 10 <log-path>` and one `Read`/test of the Result Path.
 - Do NOT re-invoke `okstra-gemini-exec.sh` to "double-check" or "rerun for safety" — convergence (Phase 5.5) handles cross-worker reconciliation. A single CLI dispatch per dispatched-prompt is the contract.
 The Gemini CLI's own exit terminates the underlying analysis; this wrapper terminates by returning its captured output (or sentinel).
@@ -175,10 +191,11 @@ and the run-level error log staying empty.
    will dump it to the run error log after this subagent terminates.
 2. **CLI failure (lead-observed)** — if the wrapper's final `BashOutput`
-   reports a non-zero `exit_code`, the 30-minute polling cap is hit, or the
+   reports a non-zero `exit_code`, the polling cap (30min, optionally
+   extended once to 35min via mtime grace; see step 7) is hit, or the
    captured stdout/stderr carries a rate-limit/auth message, immediately
    append a `cli-failure` event directly to the run error log. The
-   30-minute-cap path additionally requires a prior `KillShell` call against
+   polling-cap path additionally requires a prior `KillShell` call against
    the dispatched `bash_id`:
    ```bash

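Reviewer note: the rewritten step 8 is an ordered decision: missing header first, then non-zero exit, then the exit-0-without-artifact case, then the normal return. A minimal sketch of that ordering follows. The function name `classify_gemini_exit` and the plain `cli-failure` / `ok` output tokens are hypothetical labels for illustration; the two `GEMINI_*` sentinels are taken from the hunk. The sketch leaves out the log-tail capture and the error-log append that step 8c also requires.

```shell
# Hypothetical classifier (not okstra code) mirroring the order of step 8.
# $1 = Result Path ("" if the header was absent), $2 = wrapper exit code.
classify_gemini_exit() {
  if [ -z "$1" ]; then
    echo GEMINI_RESULT_PATH_MISSING   # 8a: header absent, stop here
  elif [ "$2" -ne 0 ]; then
    echo cli-failure                  # 8b: failed exit, skip the file check
  elif [ ! -f "$1" ]; then
    echo GEMINI_RESULT_MISSING        # 8c: exit 0 but no artifact on disk
  else
    echo ok                           # 8d: forward stdout unchanged
  fi
}
```

The ordering matters: checking the exit code before the file keeps a failed run from being reported as a missing result, which is the case the hunk calls out explicitly.
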
package/runtime/agents/workers/report-writer-worker.md CHANGED

@@ -62,6 +62,7 @@ Hard rules:
 - The file's `Report Author:` header line is `Report writer worker` (your role); `Report Owner:` remains `Claude lead`. Do NOT set `Report Author:` to `Claude lead` unless this run is `release-handoff` (which is single-lead by design) or a recorded report-writer dispatch failure forced the fallback.
 - **Source items (worker:item) preservation.** When synthesising `## 1.1 Consensus` / `## 1.2 Differences` / `## 3.1 Primary Evidence` rows from worker outputs, the `Source items` / `Supporting workers` / `Workers (position)` / `Source` column MUST list each contributing worker's item ID as `worker:item-id` (e.g. `claude:F-001, codex:1.1, gemini:F-3`). Bare worker-name lists (e.g. `claude, codex, gemini`) are deprecated — they break traceability back to the original worker-results files. See `prompts/profiles/_common-contract.md` "Cross-worker traceability" SSOT.
 - **Verdict Card (top)** is mandatory in every final-report. Its `Verdict Token` / `Direction` / `Next Step` cells MUST byte-match the corresponding cells in `## 2. Final Verdict` and the first item of `## 6. Recommended Next Steps`. The validator treats the card as a non-authoritative index — divergence is `contract-violated`.
+- **§7 phase-continuation row (mandatory for non-terminal task-types).** When `task-type` is one of `requirements-discovery` / `implementation-planning` / `error-analysis` / `implementation` / `final-verification`, you MUST emit one `## 7. Follow-up Tasks` row whose `Origin` is `phase-continuation`, `Suggested task-type` equals the next phase named in `## 2. Final Verdict` (byte-identical), `New Task ID` reuses the current task-id (no new slug — same task-key carries forward), `Auto-spawn?` is `no` (next phase advances via `/okstra-run`, not via the spawn script), and `Priority` is `P0`. This row stands in addition to any scope-boundary rows. For `release-handoff` runs the next phase is empty, so omit the phase-continuation row entirely. The §7 empty-state placeholder `- 후속 작업 없음. 본 run 의 다음 phase 는 §6 참고.` is only valid when the task-type has no mandatory phase-continuation AND there are no scope-boundary rows. See `templates/reports/final-report.template.md` §7 contract for the full table schema.
 - **No deprecated sections.** Do NOT emit `4.5.8 User Approval Request` (the body stub is deleted; the top-of-report Approval block is the only one), `4.5.9 Open Questions`, `5.1 추가 자료 요청`, or `5.2 사용자 확인 질문`. The validator fails reports that contain any of these headings.
 - **Conditional Section 0.** Render `## 0. Clarification Response Carried In From Previous Run` ONLY when the carry-in path is non-empty. Never write an empty-state stub (`"No prior clarification response was provided."`). The validator fails empty Section 0.
 - **Reading Confirmation** lives in the audit sidecar (`runs/<task-type>/worker-results/report-writer-worker-audit-<task-type>-<seq>.md`), never in the final-report or main worker-results file.
@@ -109,7 +110,7 @@ date: <YYYY-MM-DD>
 **Task:** <task-type>
 **Target:** final-report assembly
 **Date:** <YYYY-MM-DD>
-**Model:** Report writer worker, opus
+**Model:** Report writer worker, opus-4-6
 ```
 The same frontmatter (with `workerId: "report-writer"`) MUST also appear on the final-report file you assemble — the `final-report.template.md` already encodes it, so simply preserve the template's frontmatter block when filling sections.

package/runtime/bin/okstra-codex-exec.sh CHANGED

@@ -150,6 +150,37 @@ log_path="${prompt_path%.md}.log"
 [[ "$log_path" == "$prompt_path" ]] && log_path="${prompt_path}.log"
 : > "$log_path"
+# Heartbeat sidecar (`<prompt>.status.json`). The codex CLI streams progress
+# over stdout/stderr but the only structured signal the caller subagent
+# polls is `BashOutput`'s binary `running`/`completed` state. The sidecar
+# records `started_ts`, `pid`, `log_path` up-front and — via the EXIT trap
+# below — `exit_code`, `ended_ts`, `duration_ms` on termination. Two
+# consumers:
+#   1. `codex-worker` step 8c reads `log_path` to capture a diagnostic
+#      tail when `exit_code == 0` but no Result file was produced.
+#   2. Lead's redispatch policy distinguishes "wrapper never started"
+#      from "CLI ran but produced no artifact" via `stage`/`started_ts`.
+# Writes are best-effort; sidecar failures must NOT break the dispatch
+# (the underlying CLI run is the wrapper's primary job).
+status_path="${prompt_path%.md}.status.json"
+[[ "$status_path" == "$prompt_path" ]] && status_path="${prompt_path}.status.json"
+started_ts=$(date +%s)
+script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)"
+python3 "$script_dir/okstra-wrapper-status.py" \
+  init "$status_path" "$(basename "$0")" "$role" "$$" "$started_ts" "$log_path" \
+  >>"$log_path" 2>&1 || true
+_okstra_status_finish() {
+  local exit_code=$?
+  local ended_ts
+  ended_ts=$(date +%s)
+  local duration_ms=$(( (ended_ts - started_ts) * 1000 ))
+  python3 "$script_dir/okstra-wrapper-status.py" \
+    finish "$status_path" "$exit_code" "$ended_ts" "$duration_ms" \
+    >>"$log_path" 2>&1 || true
+}
+trap _okstra_status_finish EXIT
 # When a tmux session is reachable, split a sibling pane that tails the live
 # log so the operator can watch codex's progress in real time without waiting
 # for the wrapper to exit. This fires in every phase the wrapper is invoked
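
The `okstra-wrapper-status.py` helper invoked by this hunk is not included in the diff, so its real behavior is unknown. The following is only a minimal sketch of what an `init`/`finish` pair could look like, assuming the sidecar is a flat JSON object carrying the fields named in the comment (`started_ts`, `pid`, `log_path`, `exit_code`, `ended_ts`, `duration_ms`). The function names, the `wrapper` and `role` keys, and the `stage` values are hypothetical.

```python
import json
import os
import tempfile


def write_json_atomic(path, payload):
    """Write payload via a temp file so a polling reader never sees a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)
    os.replace(tmp_path, path)


def init_status(path, wrapper, role, pid, started_ts, log_path):
    """Record the up-front fields; arguments arrive as strings from the shell."""
    write_json_atomic(path, {
        "stage": "started",  # assumed value; the diff only names the `stage` key
        "wrapper": wrapper,  # assumed key name
        "role": role,        # assumed key name
        "pid": int(pid),
        "started_ts": int(started_ts),
        "log_path": log_path,
    })


def finish_status(path, exit_code, ended_ts, duration_ms):
    """Merge the termination fields into whatever init_status wrote."""
    try:
        with open(path, encoding="utf-8") as handle:
            payload = json.load(handle)
    except (OSError, ValueError):
        payload = {}  # init may have failed; still record the exit
    payload.update({
        "stage": "finished",  # assumed value
        "exit_code": int(exit_code),
        "ended_ts": int(ended_ts),
        "duration_ms": int(duration_ms),
    })
    write_json_atomic(path, payload)
```

Because the wrapper derives `duration_ms` from two `date +%s` readings, the value is always a multiple of 1000; a consumer should treat it as second-resolution timing.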

package/runtime/bin/okstra-gemini-exec.sh CHANGED

@@ -106,6 +106,32 @@ log_path="${prompt_path%.md}.log"
 [[ "$log_path" == "$prompt_path" ]] && log_path="${prompt_path}.log"
 : > "$log_path"
+# Heartbeat sidecar (`<prompt>.status.json`). See `okstra-codex-exec.sh` for
+# the full design comment — kept in lock-step. Two consumers:
+#   1. `gemini-worker` step 8c reads `log_path` to capture a diagnostic
+#      tail when `exit_code == 0` but no Result file was produced.
+#   2. Lead's redispatch policy distinguishes "wrapper never started"
+#      from "CLI ran but produced no artifact" via `stage`/`started_ts`.
+# Writes are best-effort.
+status_path="${prompt_path%.md}.status.json"
+[[ "$status_path" == "$prompt_path" ]] && status_path="${prompt_path}.status.json"
+started_ts=$(date +%s)
+script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)"
+python3 "$script_dir/okstra-wrapper-status.py" \
+  init "$status_path" "$(basename "$0")" "$role" "$$" "$started_ts" "$log_path" \
+  >>"$log_path" 2>&1 || true
+_okstra_status_finish() {
+  local exit_code=$?
+  local ended_ts
+  ended_ts=$(date +%s)
+  local duration_ms=$(( (ended_ts - started_ts) * 1000 ))
+  python3 "$script_dir/okstra-wrapper-status.py" \
+    finish "$status_path" "$exit_code" "$ended_ts" "$duration_ms" \
+    >>"$log_path" 2>&1 || true
+}
+trap _okstra_status_finish EXIT
 # When a tmux session is reachable, split a sibling pane tailing the log so
 # the operator can watch progress live. This fires in every phase the
 # wrapper is invoked from — long-running gemini dispatches are not
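
The two consumers described in the wrapper comments can be illustrated from the reading side. This is a hedged sketch, not the actual worker or lead logic: it assumes the sidecar is JSON with `started_ts` and `exit_code` keys as the comment suggests, and the function names and outcome labels are hypothetical.

```python
import json
import os


def classify_dispatch(status_path, result_path):
    """Return a coarse outcome label for one worker dispatch."""
    try:
        with open(status_path, encoding="utf-8") as handle:
            status = json.load(handle)
    except (OSError, ValueError):
        # No readable sidecar: the wrapper never reached its init step.
        return "wrapper-never-started"
    if "started_ts" not in status:
        return "wrapper-never-started"
    if "exit_code" not in status:
        # The EXIT trap has not run yet, or the wrapper was killed with SIGKILL.
        return "still-running-or-killed"
    if status["exit_code"] != 0:
        return "cli-failed"
    if not os.path.exists(result_path):
        # Clean exit without a Result file: the case step 8c diagnoses.
        return "cli-ran-no-artifact"
    return "ok"


def log_tail(log_path, max_lines=20):
    """Return the last max_lines lines of the log as a diagnostic excerpt."""
    try:
        with open(log_path, encoding="utf-8", errors="replace") as handle:
            lines = handle.readlines()
    except OSError:
        return ""
    return "".join(lines[-max_lines:])
```

A caller would typically attach `log_tail(status["log_path"])` to its report only for the `cli-ran-no-artifact` outcome, since that is the case where the exit code alone gives no hint about what went wrong.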

package/runtime/python/lib/okstra/globals.sh CHANGED

@@ -154,7 +154,7 @@ GEMINI_WORKER_MODEL_EXECUTION_VALUE=""
 REPORT_WRITER_MODEL_DISPLAY=""
 REPORT_WRITER_MODEL_EXECUTION_VALUE=""
 DEFAULT_WORKERS="claude,codex,report-writer"
-DEFAULT_LEAD_MODEL_NAME="${OKSTRA_DEFAULT_LEAD_MODEL:-opus}"
+DEFAULT_LEAD_MODEL_NAME="${OKSTRA_DEFAULT_LEAD_MODEL:-opus-4-6}"
 DEFAULT_CLAUDE_WORKER_MODEL_NAME="${OKSTRA_DEFAULT_CLAUDE_MODEL:-sonnet}"
 DEFAULT_CODEX_WORKER_MODEL_NAME="${OKSTRA_DEFAULT_CODEX_MODEL:-gpt-5.5}"
 DEFAULT_GEMINI_WORKER_MODEL_NAME="${OKSTRA_DEFAULT_GEMINI_MODEL:-auto}"

package/runtime/python/lib/okstra/usage.sh CHANGED

@@ -71,7 +71,7 @@ options:
   --yes                Skip interactive prompting and confirmation. Requires all required arguments.
   --workers            Comma-separated worker list for this run. Default: claude,codex,report-writer
                       (Gemini worker is optional; add `gemini` explicitly, e.g. --workers claude,codex,gemini,report-writer)
-  --lead-model         Model for Claude lead. Default: OKSTRA_DEFAULT_LEAD_MODEL or opus
+  --lead-model         Model for Claude lead. Default: OKSTRA_DEFAULT_LEAD_MODEL or opus-4-6
   --claude-model       Model for Claude worker. Default: OKSTRA_DEFAULT_CLAUDE_MODEL or sonnet
   --codex-model        Model for Codex worker. Default: OKSTRA_DEFAULT_CODEX_MODEL or gpt-5.5
   --gemini-model       Model for Gemini worker. Default: OKSTRA_DEFAULT_GEMINI_MODEL or auto
@@ -98,7 +98,7 @@ options:
   -h, --help           Show this help.
 model defaults:
-  Claude lead: OKSTRA_DEFAULT_LEAD_MODEL or opus
+  Claude lead: OKSTRA_DEFAULT_LEAD_MODEL or opus-4-6
   Report writer worker: OKSTRA_DEFAULT_REPORT_WRITER_MODEL or Claude lead default
   Claude worker: OKSTRA_DEFAULT_CLAUDE_MODEL or sonnet
   Codex worker: OKSTRA_DEFAULT_CODEX_MODEL or gpt-5.5
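
The default chain in this help text can be restated as a short sketch. It mirrors the shell `${VAR:-default}` form used in `globals.sh`, where an unset or empty variable falls back to the built-in value. The function names are hypothetical, and reading "Claude lead default" as the lead's resolved default (rather than a per-run `--lead-model` value) is an assumption.

```python
import os


def env_or(name, fallback, env=None):
    """Mimic ${NAME:-fallback}: unset and empty values both fall back."""
    env = os.environ if env is None else env
    return env.get(name) or fallback


def resolve_default_models(env=None):
    """Resolve per-role default model names from the environment."""
    lead = env_or("OKSTRA_DEFAULT_LEAD_MODEL", "opus-4-6", env)
    return {
        "lead": lead,
        # Assumed reading: the report writer inherits the lead default.
        "report-writer": env_or("OKSTRA_DEFAULT_REPORT_WRITER_MODEL", lead, env),
        "claude": env_or("OKSTRA_DEFAULT_CLAUDE_MODEL", "sonnet", env),
        "codex": env_or("OKSTRA_DEFAULT_CODEX_MODEL", "gpt-5.5", env),
        "gemini": env_or("OKSTRA_DEFAULT_GEMINI_MODEL", "auto", env),
    }
```

Under this sketch, anyone who relied on the bare `opus` alias before 0.31.0 could keep the old behavior by exporting `OKSTRA_DEFAULT_LEAD_MODEL=opus`.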

package/runtime/python/okstra_ctl/models.py CHANGED

@@ -11,6 +11,8 @@ CLAUDE_MAPPING = {
     "opus": ("opus", "opus"),
     "opus-4-7": ("opus-4-7", "claude-opus-4-7"),
     "claude-opus-4-7": ("opus-4-7", "claude-opus-4-7"),
+    "opus-4-6": ("opus-4-6", "claude-opus-4-6"),
+    "claude-opus-4-6": ("opus-4-6", "claude-opus-4-6"),
     "sonnet": ("sonnet", "sonnet"),
     "sonnet-4-6": ("sonnet-4-6", "claude-sonnet-4-6"),
     "claude-sonnet-4-6": ("sonnet-4-6", "claude-sonnet-4-6"),