npm - okstra - Versions diffs - 0.30.3 → 0.32.0 - Mend

okstra 0.30.3 → 0.32.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (69)

package/docs/kr/architecture.md +2 -2
package/docs/kr/cli.md +2 -2
package/package.json +1 -1
package/runtime/BUILD.json +2 -2
package/runtime/agents/SKILL.md +7 -5
package/runtime/agents/workers/claude-worker.md +1 -1
package/runtime/agents/workers/codex-worker.md +23 -6
package/runtime/agents/workers/gemini-worker.md +23 -6
package/runtime/agents/workers/report-writer-worker.md +45 -66
package/runtime/bin/okstra-codex-exec.sh +31 -0
package/runtime/bin/okstra-gemini-exec.sh +26 -0
package/runtime/bin/okstra-render-final-report.py +101 -0
package/runtime/bin/okstra-render-report-views.py +17 -10
package/runtime/bin/okstra-token-usage.py +3 -1
package/runtime/python/lib/okstra/globals.sh +1 -1
package/runtime/python/lib/okstra/usage.sh +2 -2
package/runtime/python/okstra_ctl/final_report_schema.py +253 -0
package/runtime/python/okstra_ctl/models.py +2 -0
package/runtime/python/okstra_ctl/render_final_report.py +201 -0
package/runtime/python/okstra_ctl/report_views.py +276 -297
package/runtime/python/okstra_ctl/run.py +1 -1
package/runtime/python/okstra_ctl/wizard.py +53 -14
package/runtime/python/okstra_ctl/workers.py +45 -11
package/runtime/python/okstra_token_usage/__init__.py +5 -1
package/runtime/python/okstra_token_usage/cli.py +66 -36
package/runtime/python/okstra_token_usage/pricing.py +1 -0
package/runtime/python/okstra_token_usage/report.py +148 -65
package/runtime/python/okstra_vendor/__init__.py +37 -0
package/runtime/python/okstra_vendor/jinja2/__init__.py +38 -0
package/runtime/python/okstra_vendor/jinja2/_identifier.py +6 -0
package/runtime/python/okstra_vendor/jinja2/async_utils.py +99 -0
package/runtime/python/okstra_vendor/jinja2/bccache.py +408 -0
package/runtime/python/okstra_vendor/jinja2/compiler.py +1998 -0
package/runtime/python/okstra_vendor/jinja2/constants.py +20 -0
package/runtime/python/okstra_vendor/jinja2/debug.py +191 -0
package/runtime/python/okstra_vendor/jinja2/defaults.py +48 -0
package/runtime/python/okstra_vendor/jinja2/environment.py +1672 -0
package/runtime/python/okstra_vendor/jinja2/exceptions.py +166 -0
package/runtime/python/okstra_vendor/jinja2/ext.py +870 -0
package/runtime/python/okstra_vendor/jinja2/filters.py +1873 -0
package/runtime/python/okstra_vendor/jinja2/idtracking.py +318 -0
package/runtime/python/okstra_vendor/jinja2/lexer.py +868 -0
package/runtime/python/okstra_vendor/jinja2/loaders.py +693 -0
package/runtime/python/okstra_vendor/jinja2/meta.py +112 -0
package/runtime/python/okstra_vendor/jinja2/nativetypes.py +130 -0
package/runtime/python/okstra_vendor/jinja2/nodes.py +1206 -0
package/runtime/python/okstra_vendor/jinja2/optimizer.py +48 -0
package/runtime/python/okstra_vendor/jinja2/parser.py +1049 -0
package/runtime/python/okstra_vendor/jinja2/py.typed +0 -0
package/runtime/python/okstra_vendor/jinja2/runtime.py +1062 -0
package/runtime/python/okstra_vendor/jinja2/sandbox.py +436 -0
package/runtime/python/okstra_vendor/jinja2/tests.py +256 -0
package/runtime/python/okstra_vendor/jinja2/utils.py +766 -0
package/runtime/python/okstra_vendor/jinja2/visitor.py +92 -0
package/runtime/python/okstra_vendor/markupsafe/__init__.py +396 -0
package/runtime/python/okstra_vendor/markupsafe/_native.py +8 -0
package/runtime/python/okstra_vendor/markupsafe/py.typed +0 -0
package/runtime/schemas/final-report-v1.0.schema.json +1391 -0
package/runtime/skills/okstra-report-writer/SKILL.md +31 -30
package/runtime/skills/okstra-run/SKILL.md +6 -4
package/runtime/skills/okstra-team-contract/SKILL.md +27 -3
package/runtime/templates/reports/final-report.template.md +370 -405
package/runtime/templates/reports/report.css +57 -4
package/runtime/templates/reports/report.js +63 -7
package/runtime/templates/reports/settings.template.json +1 -0
package/runtime/validators/lib/fixtures.sh +7 -7
package/runtime/validators/validate-report-views.py +24 -153
package/runtime/validators/validate-run.py +102 -19
package/src/install.mjs +21 -1

package/runtime/skills/okstra-report-writer/SKILL.md CHANGED

@@ -8,11 +8,13 @@ user-invocable: false
 ## File-author ownership (BLOCKING)
-The final-report file at `runs/<task-type>/reports/final-report-<task-type>-<seq>.md` is authored by the `Report writer worker` subagent when that worker is in the run's roster. Claude lead reviews the file but does NOT write it itself in that case. Lead-authored fallback is permitted only after a real Report writer worker dispatch attempt with a recorded non-`completed` terminal status (`error` / `timeout` / `not-run`) and a logged reason (`okstra-error-log.py`). **Except for `release-handoff`**, which has no worker roster — the Claude lead authors the final-report directly by design (see "Release-handoff section contract" below), and the fallback rules in this section do not apply.
+The final-report **data.json** (JSON SSOT) at `runs/<task-type>/reports/final-report-<task-type>-<seq>.data.json` is authored by the `Report writer worker` subagent when that worker is in the run's roster. The user-facing **markdown** at `runs/<task-type>/reports/final-report-<task-type>-<seq>.md` is then produced by `scripts/okstra-render-final-report.py` from the data.json — the worker invokes the renderer as part of its own turn so both files land on disk before it returns. Claude lead reviews both files but does NOT write them itself in that case. Lead-authored fallback is permitted only after a real Report writer worker dispatch attempt with a recorded non-`completed` terminal status (`error` / `timeout` / `not-run`) and a logged reason (`okstra-error-log.py`). **Except for `release-handoff`**, which has no worker roster — the Claude lead authors the data.json directly by design (see "Release-handoff section contract" below), and the fallback rules in this section do not apply.
-If you are reading this skill **as the report-writer-worker subagent**, YOU are the one calling the `Write` tool against the result path. Do not return the report inline — the file on disk is the canonical artifact.
+The data.json schema is `schemas/final-report-v1.0.schema.json`. The renderer + the run-validator both consume that schema, so a data.json that validates is guaranteed to render into a markdown that passes the contract checks.
-If you are reading this skill **as Claude lead**, your job in Phase 6 is to (a) prepare the report-writer prompt, (b) dispatch the Report writer worker per the Phase 6 dispatch template in SKILL.md, (c) review the produced file in Phase 7. Do not call `Write` against the final-report path yourself when Report writer worker is in the roster.
+If you are reading this skill **as the report-writer-worker subagent**, YOU are the one calling the `Write` tool against the data.json path AND invoking the renderer via `Bash`. Do not return either artifact inline — the files on disk are the canonical record.
+If you are reading this skill **as Claude lead**, your job in Phase 6 is to (a) prepare the report-writer prompt, (b) dispatch the Report writer worker per the Phase 6 dispatch template in SKILL.md, (c) review both files in Phase 7. Do not call `Write` against either path yourself when Report writer worker is in the roster.
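The write-then-render ownership rule added in this hunk can be sketched as follows. This is a minimal illustration, not the package's implementation: the function name `write_and_render` and the injectable `run` parameter are hypothetical, and resolving the renderer path relative to the current working directory is an assumption. Only the renderer command itself is taken from the skill text.

```python
import json
import subprocess
from pathlib import Path

# Renderer command as quoted in the skill text. Running it relative to the
# current working directory is an assumption of this sketch.
RENDERER = "scripts/okstra-render-final-report.py"
DATA_SUFFIX = ".data.json"


def write_and_render(data: dict, data_json_path: Path, run=subprocess.run) -> Path:
    """Write the data.json file, then invoke the renderer for the sibling .md.

    `run` is injectable (hypothetical convenience) so the flow can be
    exercised without the real renderer. Returns the expected markdown path.
    """
    if not data_json_path.name.endswith(DATA_SUFFIX):
        raise ValueError(f"expected a {DATA_SUFFIX} path, got {data_json_path.name}")
    data_json_path.parent.mkdir(parents=True, exist_ok=True)
    data_json_path.write_text(
        json.dumps(data, ensure_ascii=False, indent=2) + "\n", encoding="utf-8"
    )
    # check=True makes a non-zero renderer exit raise instead of passing silently.
    run(["python3", RENDERER, str(data_json_path)], check=True)
    # final-report-<task-type>-<seq>.data.json -> final-report-<task-type>-<seq>.md
    stem = data_json_path.name[: -len(DATA_SUFFIX)]
    return data_json_path.with_name(stem + ".md")
```

Both files are on disk before the function returns, which mirrors the requirement that the worker renders within its own turn.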
 ## When to Use
@@ -40,15 +42,15 @@ The prompt MUST include, in this order at the top:
 1. `**Project Root:** <absolute-path>`
 2. `**Prompt History Path:** <project-relative-path>` (under current run `prompts/`)
-3. `**Result Path:** runs/<task-type>/reports/final-report-<task-type>-<seq>.md` — canonical final-report file (user-facing)
+3. `**Result Path:** runs/<task-type>/reports/final-report-<task-type>-<seq>.data.json` — canonical JSON SSOT. The renderer produces the sibling `.md` automatically.
 4. `**Worker Result Path:** runs/<task-type>/worker-results/report-writer-worker-<task-type>-<seq>.md` — mandatory validator-checked worker-results audit file
 5. `Assigned worker prompt history path: <absolute-path>`
 6. `**Model:** Report writer worker, <modelExecutionValue>` (resolved per Phase 5.5 anchor-header rules)
-7. The full `[Required reading]` clause (see [okstra-team-contract](../okstra-team-contract/SKILL.md)) including `final-report-template.md`.
+7. The full `[Required reading]` clause (see [okstra-team-contract](../okstra-team-contract/SKILL.md)) including `schemas/final-report-v1.0.schema.json` and `templates/reports/final-report.template.md` (template is read-only — the worker writes the data.json that drives it).
 8. The verbatim `## Available MCP Servers` block from the task brief, if present.
 9. The convergence classifications (Full/Partial/Contested/Worker-Unique), the round history table (`roundHistory[]`), the `round2SkippedReason` value, and pointers to all worker result files under `worker-results/`. The report-writer worker must reproduce a Round History sub-table in Section 1 of the final report so the reader can see which rounds executed, queue sizes, and why Round 2 was (or was not) skipped.
 10. For implementation-planning runs: a literal block listing the 8 required English section headings the validator scans for (`Option Candidates`, `Trade-off`, `Recommended Option`, `Stepwise Execution Order`, `Dependency`, `Validation Checklist`, `Rollback`, `User Approval Request`). The writer must use these exact substrings as section headings (Korean translation in parentheses is allowed).
-11. An explicit instruction: `You are the author of TWO files: (a) the final-report file at <Result Path>, (b) the worker-results file at <Worker Result Path>. Write both directly using your Write tool. Do not return the report inline. The validator fails the run when (b) is missing.`
+11. An explicit instruction: `You are the author of TWO files: (a) the final-report data.json at <Result Path>, (b) the worker-results audit file at <Worker Result Path>. After writing the data.json, invoke "python3 scripts/okstra-render-final-report.py <Result Path>" via Bash so the markdown sibling is rendered before you return. Do not return the report inline. The validator fails the run when (a)'s schema validation fails, when the rendered markdown is absent, or when (b) is missing.`
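As a reading aid for items 1 to 6 above, the anchor headers could be assembled roughly like this. The helper name `report_writer_anchor_headers` and its parameters are hypothetical, the `seq` value is treated as an opaque string because its format is not shown in this diff, and the absolute prompt-history path is assumed to be the project-relative path joined onto the project root.

```python
from pathlib import PurePosixPath


def report_writer_anchor_headers(
    project_root: str,
    prompt_history_rel: str,
    task_type: str,
    seq: str,
    model: str,
) -> list:
    """Return the six leading prompt lines in the order the skill lists them."""
    result_path = f"runs/{task_type}/reports/final-report-{task_type}-{seq}.data.json"
    worker_result_path = (
        f"runs/{task_type}/worker-results/report-writer-worker-{task_type}-{seq}.md"
    )
    # Assumption: the absolute variant is the relative path resolved against the root.
    prompt_history_abs = PurePosixPath(project_root) / prompt_history_rel
    return [
        f"**Project Root:** {project_root}",
        f"**Prompt History Path:** {prompt_history_rel}",
        f"**Result Path:** {result_path}",
        f"**Worker Result Path:** {worker_result_path}",
        f"Assigned worker prompt history path: {prompt_history_abs}",
        f"**Model:** Report writer worker, {model}",
    ]
```

Items 7 to 11 (required reading, MCP servers block, convergence data, section headings, and the two-file instruction) would follow these lines in the prompt body.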
 ### Resume-safe dispatch
@@ -70,36 +72,35 @@ Speculative reasons such as "session resume constraint", "team object no longer
 ## Phase 6 → Phase 7 execution sequence (BLOCKING order)
-The four steps below MUST execute in this exact order. Reordering them is the recurring root cause of reports shipping with unsubstituted `{{LEAD_TOTAL_TOKENS}}` placeholders, Section 6 missing the follow-up entries, or Section 7 rows never spawning.
+The four steps below MUST execute in this exact order. Reordering them is the recurring root cause of reports shipping with `--` token cells (Phase 7 not run yet), Section 6 missing follow-up entries, or Section 7 rows never spawning.
-1. **Phase 6 — Report writer worker drafts the final-report file** at `runs/<task-type>/reports/final-report-<task-type>-<seq>.md`. Token placeholders are left verbatim; Section 6 lists prioritized actions but does NOT yet include auto-spawned follow-ups (they don't exist yet).
-2. **Phase 7 step 1 — Token-usage collector with `--substitute-final-report`** (BLOCKING). One invocation aggregates `leadUsage` / `workers[].usage` / `usageSummary` into team-state AND substitutes the 10 placeholders in the final-report file. Skipping the flag ships literal `{{...}}` in the Token Usage Summary table.
+1. **Phase 6 — Report writer worker drafts the final-report data.json** at `runs/<task-type>/reports/final-report-<task-type>-<seq>.data.json`, then invokes `scripts/okstra-render-final-report.py` to produce the sibling markdown. Token Usage cells in the data.json are `null` at this point (renderer emits `--` for nulls); Section 6 lists prioritized actions but does NOT yet include auto-spawned follow-ups (they don't exist yet).
+2. **Phase 7 step 1 — Token-usage collector with `--substitute-data`** (BLOCKING). One invocation aggregates `leadUsage` / `workers[].usage` / `usageSummary` into team-state AND populates `tokenUsage` + `executionStatus[].totalTokens` etc. in the data.json AND re-invokes the renderer so the sibling markdown carries the real numbers. Skipping the flag ships a markdown full of `--` cells.
    ```bash
    python3 scripts/okstra-token-usage.py \
      <runDirectoryPath>/state/team-state-<task-type>-<seq>.json \
      --write --summary \
-     --substitute-final-report <runDirectoryPath>/reports/final-report-<task-type>-<seq>.md
+     --substitute-data <runDirectoryPath>/reports/final-report-<task-type>-<seq>.data.json
    ```
-   The 10 substituted placeholders: `{{LEAD_TOTAL_TOKENS}}`, `{{LEAD_BILLABLE_TOKENS}}`, `{{LEAD_COST_USD}}`, `{{WORKER_TOTAL_TOKENS}}`, `{{WORKER_BILLABLE_TOKENS}}`, `{{WORKER_COST_USD}}`, `{{GRAND_TOTAL_TOKENS}}`, `{{GRAND_BILLABLE_TOKENS}}`, `{{GRAND_COST_USD}}`, `{{CLI_COST_USD}}`. The final-report file MUST already exist (Phase 6 output).
-3. **Phase 7 step 1.5 — Render report views** (BLOCKING). Produces two derived artifacts from the now-substituted final-report MD:
+   The data.json paths populated: `tokenUsage.lead.{totalTokens,billableTokens,costUsd}`, the `worker` / `grand` rows, `tokenUsage.cli.costUsd`, and each `executionStatus[].{totalTokens,billableTokens,costUsd,durationMs,cliTotalTokens,cliCostUsd}` for rows whose role matches a team-state worker. The data.json MUST already exist (Phase 6 output).
+3. **Phase 7 step 1.5 — Render report views** (BLOCKING). Produces the self-contained HTML view from the now-substituted final-report MD:
    ```bash
    python3 scripts/okstra-render-report-views.py \
      <runDirectoryPath>/reports/final-report-<task-type>-<seq>.md
    ```
-   Outputs (idempotent — re-running overwrites):
-   - `runs/<task-type>/reports/final-report-<task-type>-<seq>.slim.md` — token-saving AI-consumption copy.
-   - `runs/<task-type>/reports/final-report-<task-type>-<seq>.html` — single-file self-contained human view. Section 5 `C-*` clarification rows with `Status` ∈ {`open`, `answered`} embed `<textarea>` controls; an `Export user response` button serialises form values to a markdown sidecar (schema in [`templates/reports/user-response.template.md`](../../templates/reports/user-response.template.md)) that the user pastes to `runs/<task-type>/user-responses/user-response-<task-type>-<seq>.md`. The original final-report MD is **never** mutated by user input — the sidecar is the single write target.
+   Output (idempotent — re-running overwrites):
+   - `runs/<task-type>/reports/final-report-<task-type>-<seq>.html` — single-file self-contained human view. Section 5 `C-*` clarification rows with `Status` ∈ {`open`, `answered`} embed form widgets (`<select>` for enum-style decisions, `<input>` for material / data-point kinds, `<textarea>` fallback); an `Export user response` button serialises form values to a markdown sidecar (schema in [`templates/reports/user-response.template.md`](../../templates/reports/user-response.template.md)) that the user pastes to `runs/<task-type>/user-responses/user-response-<task-type>-<seq>.md`. The original final-report MD is **never** mutated by user input — the sidecar is the single write target.
-   Must run AFTER step 1 (so token placeholders are substituted in both derived views) and BEFORE step 2 (so the slim/html artifacts exist for any validator step that checks them).
+   Must run AFTER step 1 (so token placeholders are substituted in the rendered html) and BEFORE step 2 (so the html artifact exists for any validator step that checks it).
 4. **Phase 7 step 2 — Follow-up task spawner** (BLOCKING when Section 7 is non-empty). Turns the report's `## 7. Follow-up Tasks (후속 작업)` rows into `tasks/<task-group>/<new-task-id>/` stubs.
    ```bash
    python3 scripts/okstra-spawn-followups.py \
-     <runDirectoryPath>/reports/final-report-<task-type>-<seq>.md \
+     <runDirectoryPath>/reports/final-report-<task-type>-<seq>.data.json \
      --project-root <project_root> \
      --task-group <task-group> \
      --parent-task-key <task-key>
@@ -107,9 +108,10 @@ The four steps below MUST execute in this exact order. Reordering them is the re
    Behaviour contract:
    - Idempotent: rows whose target dir exists are reported as `existing` and skipped. Reruns of the same parent task are safe.
-   - Rows with `Auto-spawn? != yes` are reported as `skipped` and never written; surface them in Section 6 if manual action is still needed.
-   - An invalid `Origin`, `Suggested task-type`, missing `Title`, or missing `Reason / Why deferred` exits `1`. The report-writer MUST refuse to ship a Section 7 with such rows.
-   - **Canonical spawn rule (single source of truth):** the spawner runs when `task-type` ∈ {`implementation`, `final-verification`, `release-handoff`}, OR when Section 7 is non-empty for any other task-type. For the listed task-types Section 7 must be present in the report; an empty section renders as `- 후속 작업 없음.`. Missing / empty sections are no-ops (exit `0`). All other references to this rule (including the Persistence Checklist) defer to this statement.
+   - Rows with `autoSpawn != "yes"` are reported as `skipped` and never written; surface them in Section 6 if manual action is still needed.
+   - Rows whose `origin` is `phase-continuation` are reported as `skipped (no new task dir)` and never spawn — they advance the same task-key via `/okstra-run` instead.
+   - An invalid `origin`, `suggestedTaskType`, missing `title`, missing `reason`, or missing `newTaskId` exits `1`. (Schema validation in Phase 6 catches most of these before the spawner runs.)
+   - **Canonical spawn rule (single source of truth):** the spawner runs when `task-type` ∈ {`implementation`, `final-verification`, `release-handoff`}, OR when `followUpTasks` is non-empty for any other task-type. For the listed task-types `followUpTasks` must be present (schema enforces the phase-continuation row for non-terminal task-types); an empty array is permitted only for `release-handoff`. Missing arrays are no-ops (exit `0`). All other references to this rule (including the Persistence Checklist) defer to this statement.
 5. **Phase 7 step 3 — Update Section 6** after the spawner. The report-writer MUST append one row per newly spawned task-key with its entry command:
    ```
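The spawner's behaviour contract in this hunk can be sketched as two small functions. This is an illustration under stated assumptions, not the logic of `okstra-spawn-followups.py`: the function names, the `"spawn"` label, and the order in which the rules are checked are all hypothetical, and the sketch only checks for missing fields because the valid enum values for `origin` and `suggestedTaskType` are not shown in this diff.

```python
from typing import Optional

# Task-types for which the spawner always runs, per the canonical spawn rule.
ALWAYS_SPAWN_TASK_TYPES = {"implementation", "final-verification", "release-handoff"}
REQUIRED_FIELDS = ("origin", "suggestedTaskType", "title", "reason", "newTaskId")


def spawner_runs(task_type: str, follow_up_tasks: Optional[list]) -> bool:
    """Canonical spawn rule: listed task-types, or any non-empty followUpTasks."""
    return task_type in ALWAYS_SPAWN_TASK_TYPES or bool(follow_up_tasks)


def classify_row(row: dict, existing_task_ids: set) -> str:
    """Classify one followUpTasks row.

    Assumption: skip rules are evaluated before the required-field check.
    The real tool exits 1 on invalid rows; this sketch raises ValueError.
    """
    if row.get("autoSpawn") != "yes":
        return "skipped"
    if row.get("origin") == "phase-continuation":
        return "skipped (no new task dir)"
    missing = [name for name in REQUIRED_FIELDS if not row.get(name)]
    if missing:
        raise ValueError("invalid follow-up row, missing: " + ", ".join(missing))
    if row["newTaskId"] in existing_task_ids:
        return "existing"  # idempotent rerun: the target dir is already present
    return "spawn"  # hypothetical label for a row that gets a new task stub
```

Passing the set of already-created task ids keeps the sketch free of filesystem access; the real script checks whether the target directory exists.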
@@ -120,7 +122,7 @@ The status file is written after step 3 completes.
 ## Final Report Structure
-The final report follows the structure below. If `instruction-set/final-report-template.md` exists, that format takes precedence.
+The final report follows the structure encoded in `schemas/final-report-v1.0.schema.json`. The schema is the single source of truth for section names, row shapes, enum values, and task-type-conditional blocks. The Jinja2 template `templates/reports/final-report.template.md` produces the human-readable form from any data.json that validates against the schema. The structure description below is a reading guide for writers; the schema is the binding contract.
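One detail of the data.json contract described in this skill is that token cells are JSON `null` when the Phase 6 draft is written, and the renderer shows `--` for them. The sketch below checks a draft for that rule. It is a hedged illustration: the function names are hypothetical, the key names `tokenUsage.worker` and `tokenUsage.grand` are inferred from the phrase "the `worker` / `grand` rows", and how the real renderer formats non-null numbers is not shown in this diff.

```python
TOKEN_ROW_FIELDS = ("totalTokens", "billableTokens", "costUsd")
EXECUTION_FIELDS = (
    "totalTokens", "billableTokens", "costUsd",
    "durationMs", "cliTotalTokens", "cliCostUsd",
)


def render_cell(value) -> str:
    """Null renders as `--`; str() for other values is a placeholder assumption."""
    return "--" if value is None else str(value)


def non_null_token_cells(data: dict) -> list:
    """Return dotted paths of token cells that are not null in a Phase 6 draft.

    An empty list means every placeholder is still null. Sentinels such as
    0 or "--" are reported, because only null is a valid placeholder.
    """
    offending = []
    token_usage = data.get("tokenUsage") or {}
    for row in ("lead", "worker", "grand"):  # worker/grand key names are assumed
        cells = token_usage.get(row) or {}
        for field in TOKEN_ROW_FIELDS:
            if cells.get(field) is not None:
                offending.append(f"tokenUsage.{row}.{field}")
    if (token_usage.get("cli") or {}).get("costUsd") is not None:
        offending.append("tokenUsage.cli.costUsd")
    for index, row in enumerate(data.get("executionStatus") or []):
        for field in EXECUTION_FIELDS:
            if row.get(field) is not None:
                offending.append(f"executionStatus[{index}].{field}")
    return offending
```

A check like this would run before Phase 7, since the substitution step depends on the cells still being null when it runs.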
 ### Report Header
@@ -140,11 +142,11 @@ The final report follows the structure below. If `instruction-set/final-report-t
 ```markdown
 | Agent | Role | Model | Status | 처리 토큰 | 환산 토큰 | 비용 (USD) | Duration | Summary of Key Findings |
 |-------|------|-------|--------|-----------|-----------|------------|----------|------------------------|
-| Claude Code | Claude lead | opus | completed | 10,479,327 | 1,769,798 | $26.55 | 59m 12s | Final synthesis status |
+| Claude Code | Claude lead | opus-4-6 | completed | 10,479,327 | 1,769,798 | $26.55 | 59m 12s | Final synthesis status |
 | Claude Code | Claude worker | sonnet | completed | 1,941,396 | 475,136 | $1.43 | 13m 33s | Key findings summary |
 | Codex | Codex worker | gpt-5.5 | completed | 2,274,011 (CLI: 5,261,833) | 586,223 | $8.79 (+ CLI $4.20) | 22m 06s | Key findings summary |
 | Gemini | Gemini worker | auto | completed | 3,107,795 | 746,623 | $11.20 | 22m 06s | Key findings summary |
-| Claude Code | Report writer | opus | completed | 665,497 | 267,210 | $4.01 | 4m 20s | Report organization |
+| Claude Code | Report writer | opus-4-6 | completed | 665,497 | 267,210 | $4.01 | 4m 20s | Report organization |
 ```
 Table Generation Rules:
@@ -175,7 +177,7 @@ Place this section immediately after the execution status table.
 ```
 Token Summary Generation Rules:
-- **You author this section in Phase 6, BEFORE Phase 7 runs the collector.** Therefore you MUST leave the 10 placeholders (`{{LEAD_TOTAL_TOKENS}}`, `{{LEAD_BILLABLE_TOKENS}}`, `{{LEAD_COST_USD}}`, `{{WORKER_TOTAL_TOKENS}}`, `{{WORKER_BILLABLE_TOKENS}}`, `{{WORKER_COST_USD}}`, `{{GRAND_TOTAL_TOKENS}}`, `{{GRAND_BILLABLE_TOKENS}}`, `{{GRAND_COST_USD}}`, `{{CLI_COST_USD}}`) verbatim in the table cells — `okstra-token-usage.py --substitute-final-report` will fill them in Phase 7. Never replace any of these cells with a literal number, `not-collected`, `N/A`, `--`, `0`, or any other sentinel: that erases the substitution target, and the report ships with no token numbers. Also do not insert a note like "Phase 7 has not run yet" — the report is read AFTER Phase 7, so that statement is wrong on arrival.
+- **You populate the data.json in Phase 6, BEFORE Phase 7 runs the collector.** Set `tokenUsage.lead.totalTokens` / `.billableTokens` / `.costUsd`, the `worker` and `grand` rows, `tokenUsage.cli.costUsd`, and each `executionStatus[].{totalTokens,billableTokens,costUsd,durationMs,cliTotalTokens,cliCostUsd}` to JSON `null`. The renderer emits `--` for nulls; `okstra-token-usage.py --substitute-data` populates them in Phase 7 and re-renders the markdown. Never set these cells to `0`, `"not-collected"`, `"--"`, `"N/A"`, or any other sentinel: nulls are the only valid placeholder, and the substitution step depends on them being null when it runs.
 - All values come from `usageSummary` (populated by `scripts/okstra-token-usage.py` at the start of Phase 7). Do not estimate or invent.
 - **Lead** row: `usageSummary.leadTotalTokens` / `usageSummary.leadBillableEquivalentTokens` / `usageSummary.estimatedCostUsd.lead`.
 - **Worker 합계** row: `usageSummary.workerTotalTokens` / `usageSummary.workerBillableEquivalentTokens` / `usageSummary.estimatedCostUsd.claudeWorkers`.
@@ -199,11 +201,11 @@ When the run's `task-type` is `implementation-planning`, the final report MUST c
 | 6 | `Validation Checklist` | `### Validation Checklist (검증 체크리스트)` |
 | 7 | `Rollback` | `### Rollback Strategy (롤백 전략)` |
 | 8 | `User Approval Request` | Satisfied by the top-of-report `## User Approval Request (사용자 승인 게이트)` block. Do NOT recreate a `### 4.5.8 User Approval Request` body stub — the validator now fails reports that contain one. |
-| 9 | `Plan Body Verification` + `Gate result:` | `### Plan Body Verification (계획 본문 검증)` containing a `Gate result:` line — copy `okstra-final-report.template.md §4.5.9` verbatim. Validator checks both substrings. |
+| 9 | `Plan Body Verification` + `Gate result:` | `### Plan Body Verification (계획 본문 검증)` containing a `Gate result:` line — copy `templates/reports/final-report.template.md §4.5.9` verbatim. Validator checks both substrings. |
 The Korean translation in parentheses is optional but the English keyword is mandatory. The body of each section is written in Korean per the writing rules below. For non-`implementation-planning` runs, omit this entire block — these headings are NOT validator-checked for other task-types.
-The final-report template `okstra-final-report.template.md` Section 4.5 already encodes this contract — copy that block verbatim and fill in.
+The final-report template `templates/reports/final-report.template.md` Section 4.5 already encodes this contract — copy that block verbatim and fill in.
 ### Final-verification verdict token contract (BLOCKING)
@@ -217,7 +219,7 @@ When the run's `task-type` is `final-verification`, the report's `## 2. Final Ve
 For every other task-type, set the `Verdict Token` cell to `not-applicable`. Do NOT omit the row — the template renders it for all task-types and downstream tooling expects the field to exist.
-The final-report template `okstra-final-report.template.md` Section 2 already encodes this contract — copy that block verbatim and fill in.
+The final-report template `templates/reports/final-report.template.md` Section 2 already encodes this contract — copy that block verbatim and fill in.
 ### Release-handoff section contract (release-handoff runs only)
@@ -225,7 +227,7 @@ When the run's `task-type` is `release-handoff`, the final report MUST include S
 **Single-lead authorship (release-handoff only):** release-handoff has no worker roster (no `Report writer worker`, no `Claude worker` drafter). The Claude lead authors the final-report file directly — there is no `Report writer worker` dispatch to perform in Phase 6, no resume-safe dispatch concern, and no mandatory worker-results file for a report-writer role. The rest of this skill's dispatch / resume / fallback machinery applies ONLY when `Report writer worker` is in the roster (i.e. every task-type other than `release-handoff`).
-The final-report template `okstra-final-report.template.md` Section 4.6 already encodes this contract — copy that block verbatim and fill in. For non-`release-handoff` runs, omit Section 4.6 entirely.
+The final-report template `templates/reports/final-report.template.md` Section 4.6 already encodes this contract — copy that block verbatim and fill in. For non-`release-handoff` runs, omit Section 4.6 entirely.
 ### Mandatory worker-results file (BLOCKING)
@@ -291,8 +293,7 @@ Persistence steps that must be performed in Phase 7:
 - [ ] 6. **Generate final status file**: `runs/<task-type>/status/final-<task-type>-<seq>.status` (if necessary)
 - [ ] 7. **Save convergence state**: `runs/<task-type>/state/convergence-<task-type>-<seq>.json` (when convergence is enabled)
 - [ ] 8. **Spawn follow-up task stubs**: run `scripts/okstra-spawn-followups.py` against the final-report per the canonical spawn rule defined in "Phase 7 follow-up task spawner" above. Do not restate the trigger condition here — that section is the single source of truth. The script is idempotent across reruns.
-- [ ] 9. **Slim AI report**: `runs/<task-type>/reports/final-report-<task-type>-<seq>.slim.md` (produced by Phase 7 step 1.5 — see "Phase 6 → Phase 7 execution sequence" above)
-- [ ] 10. **Human HTML report**: `runs/<task-type>/reports/final-report-<task-type>-<seq>.html` (same step 1.5; self-contained, embeds `Export user response` button)
+- [ ] 9. **Human HTML report**: `runs/<task-type>/reports/final-report-<task-type>-<seq>.html` (produced by Phase 7 step 1.5 — self-contained, embeds `Export user response` button)
 ### Response after Persistence

package/runtime/skills/okstra-run/SKILL.md CHANGED

@@ -38,9 +38,10 @@ Every wizard call returns JSON. The two shapes you'll see:
 On `ok: false`, re-prompt with the same `current.step` using the error message. The wizard never advances on validation failure; the user retries the same step.
-The wizard tells you *which UI to use* via `kind`:
+The wizard tells you *which UI to use* via `kind` (and the optional `multi` flag on `pick`):
-- `kind: "pick"` → render `AskUserQuestion` with `label` and `options[].label` (use `options[].value` to call `--answer`).
+- `kind: "pick"` + `multi: false` (default) → render `AskUserQuestion` with `label`, `options[].label`, and `multiSelect: false`. Use the chosen `options[].value` (single string) as the answer.
+- `kind: "pick"` + `multi: true` → render `AskUserQuestion` with `label`, `options[].label`, and `multiSelect: true`. Join the chosen `options[].value` entries with `,` into a single CSV string and submit that as `--answer "csv,values"`. If the user selects nothing, still submit `--answer ""` — the wizard will reply `ok: false` and re-prompt the same step (do not skip the call).
 - `kind: "text"` → write `label` as a plain text message and consume the user's NEXT message as the answer.
 - `kind: "done"` → input collection finished; move to Step 5.
@@ -90,8 +91,9 @@ Output: the same `{ok, next}` JSON described above. The first `next` is always `
 Repeat until `next.kind == "done"`:
-1. **Render** the prompt according to `kind`:
-   - `pick` → `AskUserQuestion` with `label` and `options`. The user's chosen option's `value` is the answer string.
+1. **Render** the prompt according to `kind` (and `multi` for pick):
+   - `pick` + `multi: false` → `AskUserQuestion` with `multiSelect: false`, `label`, and `options`. The user's chosen option's `value` is the answer string.
+   - `pick` + `multi: true` → `AskUserQuestion` with `multiSelect: true`, `label`, and `options`. Join the selected `value`s with `,` into a single literal CSV string (e.g. `"claude,codex,gemini"`) and submit it as a single `--answer "claude,codex,gemini"`. Empty selection submits `--answer ""` and the wizard re-prompts.
    - `text` → plain text message containing `label`. Consume the user's next reply verbatim as the answer string (empty reply = empty string).
 2. **Submit** the answer — call `okstra wizard step` with the literal state-file path from Step 2 and the literal user answer (no shell variables, no `$(...)`):
    ```bash
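The `pick` answer rules added in this file (single value versus comma-joined values) can be illustrated with a small helper. The name `answer_for_pick` is hypothetical, and the sketch only builds the answer string; it does not call the wizard, because the full `okstra wizard step` argument list is not shown in this diff.

```python
def answer_for_pick(step: dict, chosen_values: list) -> str:
    """Build the `--answer` string for a `kind: "pick"` wizard step.

    `step` is the wizard's `next` object; `multi` defaults to False.
    """
    if step.get("kind") != "pick":
        raise ValueError("answer_for_pick only handles kind == 'pick'")
    if step.get("multi", False):
        # Multi-select: join values into one CSV string. An empty selection
        # yields "", which is still submitted so the wizard can re-prompt.
        return ",".join(chosen_values)
    if len(chosen_values) != 1:
        raise ValueError("single-select pick expects exactly one chosen value")
    return chosen_values[0]
```

For example, a multi-select step with `claude`, `codex`, and `gemini` chosen produces the single string `claude,codex,gemini`.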

package/runtime/skills/okstra-team-contract/SKILL.md CHANGED

@@ -70,7 +70,7 @@ Every worker prompt MUST start with the following anchor headers, in this exact
 1. `**Project Root:** <absolute-path>` — absolute target project root (from `{{PROJECT_ROOT}}` in the lead's prompt). Required so the worker can self-anchor without relying on inherited cwd.
 2. `**Prompt History Path:** <project-relative-path>`
-3. `**Result Path:** <project-relative-path>`
+3. `**Result Path:** <project-relative-path>` — canonical destination for the worker's result file. Workers resolve it to absolute against `**Project Root:**` and use it for the post-completion existence check (see codex-worker / gemini-worker step 8c, and Lead's redispatch policy below). The path identifies a single file; do NOT deliver a directory.
 4. `Assigned worker prompt history path: <absolute-path>` — same as the prompt-history path but resolved against `Project Root`. Codex/Gemini wrapper subagents extract this exact line.
 The body must include: role name, task type, task key, required bundle paths, assigned model, output contract, evidence handling rules, and any relevant config/deployment expectations from `reference-expectations.md`.
@@ -209,6 +209,29 @@ Terminal statuses that can be recorded for a worker:
 | `error` | Execution error, reason recorded; prompt history file must exist |
 | `not-run` | Not executed, reason recorded |
+## Lead Redispatch Policy on Result-Missing
+After each worker subagent returns (regardless of role), Lead MUST verify the canonical result file exists at the absolute path resolved from the `**Result Path:**` anchor header (against `**Project Root:**`). The check is identical for in-process workers (claude-worker) and CLI-wrapper workers (codex-worker / gemini-worker).
+**Triggers (any of):**
+- The wrapper subagent returned an explicit `*_RESULT_MISSING` sentinel (codex-worker / gemini-worker step 8c — `CODEX_RESULT_MISSING` / `GEMINI_RESULT_MISSING`).
+- The result file is absent at the resolved absolute path even though the worker returned without a `*_RESULT_MISSING` sentinel — for example, claude-worker returned its final assistant message but never persisted the artifact, or the wrapper exited 0 and the codex/gemini sub-agent forwarded raw stdout despite the contract.
+- The result file exists but cannot be parsed (frontmatter unreadable, sections 1–5 entirely missing). A truncated file in the middle of section 5 is NOT covered here — it goes to the validator's regular `error` path, not the retry path.
+**One-retry policy:**
+1. On the FIRST result-missing trigger for a given role within a single run, Lead MUST re-dispatch the same worker with the byte-identical prompt — same `**Result Path:**`, same `**Prompt History Path:**`, same model assignment, same `team_name`. The redispatch counts as a second attempt against the existing role slot; do NOT create a new role-id, do NOT change the result file path, do NOT switch to a different model as a "workaround".
+2. If the SECOND attempt also fails the same check, Lead records the role's terminal status as `error` with `--message "result-missing after 1 retry"` and proceeds to Phase 5.5 / Phase 6 with the remaining workers' results. Lead MUST NOT dispatch a third attempt: convergence and the report writer are designed to operate in a reduced-confidence one- or two-analyser mode when one role is absent (`agents/SKILL.md` "If only one worker result is usable: reduced-confidence synthesis").
+3. The retry counter is **per-run, per-role** and is NOT preserved across runs. A subsequent okstra run for the same task-key starts each role's counter fresh.
+4. Convergence reverify rounds (Phase 5.5) inherit the same one-retry budget — a reverify dispatch that triggers result-missing may be re-dispatched once.
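The one-retry policy above can be sketched in Python as follows. This is an illustrative model, not okstra's implementation: `run_role_with_one_retry`, the `dispatch` callable, the `attempts` dict, and the `"ok"` status label are all hypothetical names. Only the file-existence trigger is modelled; the sentinel and unparseable-file triggers are left out.

```python
from pathlib import Path
from typing import Callable, Dict, Tuple

MAX_ATTEMPTS = 2  # first dispatch plus exactly one retry
RESULT_MISSING_MESSAGE = "result-missing after 1 retry"


def run_role_with_one_retry(
    role: str,
    prompt: str,
    result_path: Path,
    dispatch: Callable[[str, str], None],  # hypothetical: runs one worker attempt
    attempts: Dict[str, int],              # per-run, per-role counter; new dict per run
) -> Tuple[str, str]:
    """Dispatch a role, retrying once if the canonical result file is missing."""
    while attempts.get(role, 0) < MAX_ATTEMPTS:
        attempts[role] = attempts.get(role, 0) + 1
        # Same role slot, same prompt, same result path on every attempt.
        dispatch(role, prompt)
        if result_path.is_file():
            return ("ok", "")  # "ok" is a placeholder status label
    # Second attempt also failed: terminal status `error`, no third attempt.
    return ("error", RESULT_MISSING_MESSAGE)
```

Keeping the counter in a dict that the caller creates fresh for each run mirrors the rule that the retry budget is not preserved across runs.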
+**Logging.** Lead records the first attempt's `cli-failure` (already emitted by the wrapper sub-agent) as-is. The retry, on success, is logged via the normal worker-completion path; on failure (second `*_RESULT_MISSING`), Lead records a single `contract-violation` entry with `--message "result-missing after 1 retry"` referencing both attempts' bash_ids / prompt-history paths.
+**Diagnostic sidecar (advisory).** Both the codex and gemini wrappers also write a heartbeat sidecar at `<prompt-path>.status.json` recording `started_ts`, `ended_ts`, `exit_code`, `duration_ms`, and the canonical `log_path` (see `scripts/okstra-wrapper-status.py` for the schema). Lead MAY read this sidecar when deciding whether the first attempt actually launched the CLI (`stage=exited`, `exit_code=0`, non-zero `duration_ms`) versus failed before reaching it (sidecar absent, or `stage=started` with no exit fields). The sidecar is best-effort: its absence is NOT by itself a reason to skip the retry; the canonical trigger remains the missing result file.
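A possible way to read the sidecar for this advisory check is sketched below in Python. It assumes the sidecar is a flat JSON object whose keys match the names used above (`stage`, `exit_code`, `duration_ms`, `ended_ts`); the authoritative schema lives in `scripts/okstra-wrapper-status.py` and may differ. The function name and the three return labels are hypothetical.

```python
import json
from pathlib import Path


def classify_first_attempt(prompt_path: str) -> str:
    """Advisory only: returns 'launched', 'not-launched', or 'inconclusive'."""
    sidecar = Path(prompt_path + ".status.json")
    if not sidecar.is_file():
        return "not-launched"  # sidecar absent
    try:
        status = json.loads(sidecar.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return "inconclusive"  # unreadable or malformed sidecar
    if not isinstance(status, dict):
        return "inconclusive"
    stage = status.get("stage")
    exit_code = status.get("exit_code")
    duration_ms = status.get("duration_ms")
    if (
        stage == "exited"
        and exit_code == 0
        and isinstance(duration_ms, (int, float))
        and duration_ms > 0
    ):
        return "launched"
    if stage == "started" and exit_code is None and status.get("ended_ts") is None:
        return "not-launched"  # started, but no exit fields were written
    return "inconclusive"
```

Whatever label comes back, the retry decision itself still depends only on whether the result file is missing.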
+**Rationale.** Observed failure mode: the CLI (codex/gemini) streams its full analysis to stdout but hits its token budget or a sandbox EPERM mid-`Write` of the result file, exiting 0 with no artifact. Forwarding the partial stdout silently degrades synthesis; classifying the role as `error` without retrying gives up a recoverable signal. A single retry catches the transient class of this failure (re-dispatch with the same prompt typically succeeds when the underlying cause was an intermittent sandbox lock or a token-budget spike) while capping the retry cost at ~2× the original wrapper budget per role.
 ## Worker Output Contract
 **Authoritative source.** If other documents (SKILL.md, worker agent definitions) disagree with this section, this section wins.
@@ -355,8 +378,9 @@ empty run-level error logs in production.
 - **Background dispatch + polling contract (Codex / Gemini wrappers).** Both wrapper subagents MUST dispatch `okstra-codex-exec.sh` / `okstra-gemini-exec.sh` via `Bash(run_in_background: true)` and poll with `BashOutput(bash_id)` until the shell reports `status == "completed"`, capped at 30 minutes (1800s) of wall-clock elapsed time. `BashOutput` itself is the wait primitive — call it back-to-back; do NOT insert a standalone `sleep` between polls. The Claude Code harness blocks `sleep` calls of 5 seconds or longer as a circumvention vector and explicitly forbids chaining shorter sleeps inside until-loops to work around the block. Workers that hit the contract bug must NOT self-recover with `until ...; do sleep 2; done` wrappers — that path violates the harness anti-circumvention rule, even though it superficially "works". The legacy "single foreground `Bash` with 120000ms timeout" rule, and the subsequent "60-second cadence with `sleep 60` between polls" rule, are both retired. The current rule applies in **every phase** (analysis runs typically complete in 1–2 `BashOutput` calls, so there is no regression for short jobs). Recording responsibilities:
   - Successful completion: return the wrapper's accumulated stdout from the final `BashOutput`. No log entry.
   - Non-zero `exit_code` reported by `BashOutput`: record a `cli-failure` to the run-level error log with the real `exit_code` and observed `duration-ms`.
-  - 30-minute polling cap exceeded: call `KillShell(shell_id)` first, then record `cli-failure` with `--exit-code 124 --duration-ms 1800000 --message "<wrapper> exceeded 30m polling cap"`, then return the language-specific `*_CLI_TIMEOUT` sentinel.
+  - Polling cap reached: before `KillShell`, perform a one-shot **mtime-grace check** on the wrapper's live log (`<prompt>.log`). If the log was written within the last 90 seconds AND grace has not yet been applied this loop, extend the cap from 1800s → 2100s (one-shot +5min) and continue polling. Otherwise (log stale, OR grace already applied), call `KillShell(shell_id)`, record `cli-failure` with `--exit-code 124 --duration-ms <observed_ms> --message "<wrapper> exceeded polling cap (grace=<applied|not-applied>, last_mtime_age=<n>s)"`, then return the language-specific `*_CLI_TIMEOUT` sentinel. The grace exists to absorb token-budget spikes where the CLI is genuinely still producing output past the 30-minute mark; it is a one-shot soft extension, NOT a loop.
   - Token-usage matching is unaffected: the wrapper subagent stays alive throughout polling, so the wrapper's jsonl timestamp window continues to cover the underlying CLI rollout's full duration (see §"Token-usage accounting" below).
+- **No external timeout on wrapper subagents.** The codex/gemini wrapper subagent's polling loop (with optional mtime grace) is the SINGLE timeout authority for its dispatch. Lead MUST NOT impose a separate `Agent()` call timeout, an outer `Bash` wall-clock deadline, or any other mechanism that terminates the subagent before its own polling cap is reached. Doing so reproduces the historical failure mode that motivated this rule: Lead aborts the subagent at e.g. 18 minutes, the subagent returns nothing, and Lead classifies the role as "no response" while the underlying CLI was actively working. The wrapper's polling cap (30min + optional 5min grace) is calibrated so that, combined with Lead's redispatch policy (see "Lead Redispatch Policy on Result-Missing"), a recoverable single-run failure costs at most ~70 minutes of wall-clock — predictable enough to plan around. If a specific run requires a tighter cap, lower it in the wrapper subagent's polling contract (single source of truth), NOT by layering Lead-side timeouts.
 - `contract-violation` events (C) are recorded by Lead via `okstra-error-log.py append-observed --error-type contract-violation ...` after inspecting worker outputs.
 - Lead's responsibility regarding the sidecar is to dump it to the run-level error log via `okstra-error-log.py append-from-worker` after each worker terminates; Lead does not write into the sidecar.
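The one-shot mtime-grace decision at the polling cap, described in the wrapper polling contract above, can be modelled roughly as follows. This Python sketch is an assumption-laden illustration: `CapState`, `on_cap_reached`, and the `"continue"`/`"kill"` labels are hypothetical, and a missing log file is treated as stale, which the contract does not spell out.

```python
import os
import time
from dataclasses import dataclass
from typing import Optional

BASE_CAP_S = 1800   # 30-minute polling cap
GRACE_S = 300       # one-shot +5 minutes
FRESH_LOG_S = 90    # log counts as live if written within the last 90 seconds


@dataclass
class CapState:
    cap_s: int = BASE_CAP_S
    grace_applied: bool = False


def on_cap_reached(state: CapState, log_path: str, now: Optional[float] = None) -> str:
    """Return 'continue' (cap extended once) or 'kill' (KillShell, then cli-failure)."""
    now = time.time() if now is None else now
    try:
        age_s = now - os.path.getmtime(log_path)
    except OSError:
        age_s = None  # assumption: a missing or unreadable log counts as stale
    if not state.grace_applied and age_s is not None and age_s <= FRESH_LOG_S:
        state.grace_applied = True
        state.cap_s = BASE_CAP_S + GRACE_S  # 1800s -> 2100s, applied once only
        return "continue"
    return "kill"
```

Because `grace_applied` flips on first use, a second call at the extended cap always returns `"kill"`, so the extension cannot turn into a loop.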
@@ -419,7 +443,7 @@ Examples:
 **Task:** error-analysis
 **Target:** server/auth.ts
 **Date:** 2026-04-06
-**Model:** Report writer worker, opus
+**Model:** Report writer worker, opus-4-6
 ```
 Use the actual model identifier recorded in team-state (never invent a model ID — read it from `resultContract.requiredWorkerRoles[*].modelExecutionValue` or the tool response metadata).