npm - okstra - Versions diffs - 0.50.0 → 0.52.0 - Mend

okstra 0.50.0 → 0.52.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (61)

package/README.kr.md +8 -7
package/README.md +8 -7
package/bin/okstra +2 -0
package/docs/kr/architecture.md +15 -16
package/docs/kr/cli.md +5 -5
package/docs/project-structure-overview.md +10 -6
package/docs/superpowers/specs/2026-06-06-vertical-slice-tdd-planning-design.md +179 -0
package/package.json +1 -1
package/runtime/BUILD.json +2 -2
package/runtime/agents/SKILL.md +15 -11
package/runtime/agents/workers/claude-worker.md +3 -3
package/runtime/agents/workers/codex-worker.md +2 -2
package/runtime/agents/workers/gemini-worker.md +2 -2
package/runtime/bin/lib/okstra/cli.sh +8 -1
package/runtime/bin/lib/okstra/globals.sh +3 -0
package/runtime/bin/lib/okstra/interactive.sh +14 -12
package/runtime/bin/lib/okstra/usage.sh +6 -0
package/runtime/bin/okstra-team-reconcile.sh +28 -0
package/runtime/bin/okstra.sh +2 -0
package/runtime/prompts/launch.template.md +3 -1
package/runtime/prompts/profiles/_common-contract.md +4 -4
package/runtime/prompts/profiles/_implementation-executor.md +2 -2
package/runtime/prompts/profiles/_implementation-verifier.md +1 -1
package/runtime/prompts/profiles/implementation-planning.md +8 -4
package/runtime/prompts/profiles/implementation.md +1 -1
package/runtime/python/okstra_ctl/analysis_packet.py +259 -0
package/runtime/python/okstra_ctl/context_cost.py +308 -0
package/runtime/python/okstra_ctl/migrate.py +2 -12
package/runtime/python/okstra_ctl/paths.py +22 -0
package/runtime/python/okstra_ctl/render.py +284 -125
package/runtime/python/okstra_ctl/render_final_report.py +31 -0
package/runtime/python/okstra_ctl/run.py +507 -245
package/runtime/python/okstra_ctl/sequence.py +2 -5
package/runtime/python/okstra_ctl/team_reconcile.py +131 -0
package/runtime/python/okstra_ctl/wizard.py +129 -133
package/runtime/python/okstra_ctl/worktree.py +13 -5
package/runtime/schemas/final-report-v1.0.schema.json +4 -0
package/runtime/skills/okstra-coding-preflight/SKILL.md +69 -0
package/runtime/skills/okstra-coding-preflight/architecture/hexagonal.md +116 -0
package/runtime/skills/okstra-coding-preflight/clean-code.md +254 -0
package/runtime/skills/okstra-coding-preflight/languages/java.md +64 -0
package/runtime/skills/okstra-coding-preflight/languages/javascript-typescript.md +87 -0
package/runtime/skills/okstra-coding-preflight/languages/kotlin.md +69 -0
package/runtime/skills/okstra-coding-preflight/languages/nodejs.md +66 -0
package/runtime/skills/okstra-coding-preflight/languages/python.md +179 -0
package/runtime/skills/okstra-coding-preflight/languages/rust.md +105 -0
package/runtime/skills/okstra-coding-preflight/languages/sql.md +68 -0
package/runtime/skills/okstra-context-loader/SKILL.md +12 -6
package/runtime/skills/okstra-inspect/SKILL.md +100 -1
package/runtime/skills/okstra-memory/SKILL.md +28 -5
package/runtime/skills/okstra-report-writer/SKILL.md +5 -1
package/runtime/skills/okstra-run/SKILL.md +1 -1
package/runtime/skills/okstra-team-contract/SKILL.md +7 -4
package/runtime/templates/reports/final-report.template.md +1 -0
package/runtime/templates/worker-prompt-preamble.md +3 -3
package/runtime/validators/validate-implementation-plan-stages.py +57 -11
package/src/_python-helper.mjs +3 -3
package/src/context-cost.mjs +27 -0
package/src/install.mjs +1 -0
package/src/memory.mjs +50 -11
package/src/uninstall.mjs +1 -0

package/runtime/agents/SKILL.md CHANGED

@@ -44,7 +44,7 @@ This SKILL.md is the operating contract and phase index. Detailed procedures liv
 | 5.5 Convergence | Cross-verify findings across workers | `okstra-convergence` |
 | 5.6 Critic pass | (opt-in) reused-worker critic pass: coverage gaps (discovery/error-analysis/impl-planning) or acceptance devil's-advocate (final-verification), each verified one round | `okstra-convergence` "Coverage critic pass" / "Acceptance critic pass" |
 | 6. Synthesis | Dispatch Report writer worker, review draft. **For `implementation-planning`: then run the Phase 6 plan-body verification sub-step (see Phase 6 section below).** | `okstra-report-writer` + `okstra-convergence` (sub-step) |
-| 7. Persist | Run token-usage collector, update manifests, then disband the worker team (shutdown teammates + `TeamDelete`, after collection) | `okstra-report-writer` + `_common-contract.md` "Run-end team teardown" |
+| 7. Persist | Run token-usage collector, update manifests, then disband the worker team (reconcile stale members + `TeamDelete`, after collection) | `okstra-report-writer` + `_common-contract.md` "Run-end team teardown" |
 ## Core operating contract
@@ -96,7 +96,7 @@ Required checkpoints:
 - `PROGRESS: phase-5.6-critic provider=<provider> gaps=<n>` — when the coverage critic pass runs (Phase 5.6, opt-in). Omitted when `convergence.critic.enabled == false`.
 - `PROGRESS: phase-6-synthesis dispatching report-writer-worker` — at the start of Phase 6.
 - `PROGRESS: phase-7-persist updating manifests` — at the start of Phase 7.
-- `PROGRESS: phase-7-teardown disbanding team` — after token-usage collection, immediately before shutting down worker teammates + `TeamDelete` (Teams mode only; see `_common-contract.md` "Run-end team teardown"). Skipped in the no-`team_name` fallback.
+- `PROGRESS: phase-7-teardown disbanding team` — after token-usage collection, immediately before reconciling stale members + `TeamDelete` (Teams mode only; see `_common-contract.md` "Run-end team teardown"). Skipped in the no-`team_name` fallback.
 - `PROGRESS: complete final-report=<relative-path>` — final summary line, after all persistence.
 These lines are the only structured signal the user has during a long run. Do NOT replace them with prose ("Now I'm starting Phase 2..."), do NOT skip a checkpoint because "the previous message already said that", and do NOT batch multiple checkpoints into one. Each line stands alone so the user (or any operator scraping stdout) can timestamp it externally.
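Because each checkpoint stands alone on its own line, an operator can timestamp them externally. The following is a minimal sketch of such a scraper, not part of okstra: the `PROGRESS: <checkpoint> [detail]` shape is inferred from the checkpoints listed above, and `parse_progress` / `stamp_progress` are hypothetical helper names.

```python
import re
from datetime import datetime, timezone

# Assumed shape, inferred from the checkpoint list: PROGRESS: <checkpoint> [free-form detail]
PROGRESS_RE = re.compile(r"^PROGRESS: (?P<checkpoint>\S+)(?: (?P<detail>.*))?$")


def parse_progress(line):
    """Return (checkpoint, detail) for a PROGRESS line, or None for any other line."""
    match = PROGRESS_RE.match(line.strip())
    if match is None:
        return None
    return match.group("checkpoint"), match.group("detail") or ""


def stamp_progress(lines, now=None):
    """Attach an external UTC timestamp to every PROGRESS line found in `lines`."""
    now = now or (lambda: datetime.now(timezone.utc))
    stamped = []
    for line in lines:
        parsed = parse_progress(line)
        if parsed is not None:
            stamped.append((now().isoformat(), *parsed))
    return stamped
```

Prose such as "Now I'm starting Phase 2..." does not match the pattern, which is one reason the contract forbids replacing checkpoints with free text.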
@@ -107,6 +107,8 @@ These lines are the only structured signal the user has during a long run. Do NO
 **The lead never invents a model.** Every role's model is read from `task-manifest.json` → `resultContract.requiredWorkerRoles[*].modelExecutionValue` (and the lead model metadata). A missing assignment is a manifest defect, not a license to fall back — see [okstra-team-contract](./skills/okstra-team-contract/SKILL.md) "Model Assignment Rules". The manifest is always populated at run-prep time by the CLI, which seeds these values from `OKSTRA_DEFAULT_*_MODEL` (`scripts/okstra_ctl/run.py`).
+**Reading the assignment is not enough — it must be applied at spawn.** For the in-process Claude subagents (`Claude worker`, `Report writer worker`), the manifest value is bound only when lead passes it as the `Agent(...)` `model:` parameter; their `model: inherit` frontmatter otherwise follows the lead's own model. See "Model Assignment Rules" #3–#5 in [okstra-team-contract](./skills/okstra-team-contract/SKILL.md) for the spawn rule, the family-token mapping, and the Codex/Gemini exemption.
 The table below documents those prep-time seed values **for reference only** — it is NOT a lead-applied fallback:
 | Role | Seed model | subagent_type | Source definition |
@@ -152,21 +154,21 @@ Executor is chosen at run-prep time via `--executor <claude|codex|gemini>` (or `
 Treat cross verify input as a task bundle, not as a single file. If the user did not specify an explicit task key or task path, use `.okstra/discovery/latest-task.json` as the current-task convenience pointer. If task browsing, task-id disambiguation, or project-level task inventory is needed, inspect `.okstra/discovery/task-catalog.json` first.
-After context-loader completes, read **only the five mandatory files below** in a single parallel-Read message at the start of Phase 1. The other instruction-set files are loaded lazily at the phase that actually needs them — see "Lazy reading discipline" below. This split came from observed lead-token bloat: in `fontsninja-classifier-v2:dev-9461:dev-9495` RD-001 the lead burned 71 M tokens (97 % cache_read) largely because every phase entry re-absorbed a 93 KB instruction-set baseline that included files only one downstream phase ever actually used.
+After context-loader completes, read **only the compact intake files below** in a single parallel-Read message at the start of Phase 1. The other instruction-set files are loaded lazily at the phase that actually needs them — see "Lazy reading discipline" below. This split came from observed lead-token bloat: in `fontsninja-classifier-v2:dev-9461:dev-9495` RD-001 the lead burned 71 M tokens (97 % cache_read) largely because every phase entry re-absorbed a 93 KB instruction-set baseline that included files only one downstream phase ever actually used.
 **Mandatory at Phase 1 start (parallel Read, one message):**
 1. `task-manifest.json` (found by context-loader)
-2. `instruction-set/task-brief.md` — needed to compose every worker prompt
-3. `instruction-set/analysis-profile.md` — needed to compose worker prompts and pick the right `Required workers:` block
-4. the current run manifest under `runs/<task-type>/manifests/`
-5. the current run team-state artifact
+2. `runs/<task-type>/state/active-run-context-<task-type>-<seq>.json` — compact current-run intake; if absent, fall back to the current run manifest + team-state artifact
+3. `instruction-set/analysis-profile.md` — needed to pick the right `Required workers:` block and phase rules
+4. `instruction-set/analysis-packet.md` — primary compact input for analysis worker dispatch
 **Lazy reading discipline (do NOT read at Phase 1):**
 - `task-index.md` — only when the user explicitly asks for a human summary or when history disambiguation is required.
-- `instruction-set/analysis-material.md` — read at Phase 2 only if it is referenced by `analysis-profile.md` or by the brief. Many task bundles have no material file (the placeholder `> 자료가 제공되지 않았습니다` is canonical); in that case skip.
-- `instruction-set/reference-expectations.md` — read at Phase 6 synthesis (or whenever the report-writer worker is dispatched) — it informs the match/gap assessment, not worker dispatch.
+- `instruction-set/task-brief.md` — read only if the packet is insufficient, a cited source needs verification, or report-writer synthesis requires the full source.
+- `instruction-set/analysis-material.md` — read only if the packet is insufficient or a source citation needs verification. Many task bundles have no meaningful material file beyond a duplicate brief wrapper.
+- `instruction-set/reference-expectations.md` — read at Phase 6 synthesis (or whenever the report-writer worker is dispatched) — it informs the match/gap assessment. Analysis workers use the packet excerpt unless they need source verification.
 - `instruction-set/final-report-template.md` — never read by Lead. The Report writer worker reads it as part of its own [Required reading]; Lead only references its path when dispatching.
 - `history/timeline.json` — read only on user request or when carry-in resolution requires it.
@@ -186,7 +188,7 @@ The guard is not satisfied by remembering content from a prior run — each impl
 This pattern is implementation-only. Other profiles (`requirements-discovery`, `error-analysis`, `implementation-planning`, `final-verification`, `release-handoff`) load their whole profile body at Phase 1 as before — they are short enough not to benefit from a split.
-Extract from the five mandatory files: task key, task type, work category, workflow lifecycle snapshot, selected worker roster, assigned models, worker result paths, worker prompt history paths, current run prompt directory, final report path, final status path, validator path, resume helper path, config-file references, deployment-manifest references, and their expected values or invariants.
+Extract from the compact intake files: task key, task type, work category, workflow lifecycle snapshot, selected worker roster, assigned models, worker result paths, worker prompt history paths, current run prompt directory, final report path, final status path, validator path, resume helper path, config-file references, deployment-manifest references, and their expected values or invariants.
 If previous run reports exist, use as historical context only. If discovery metadata or current artifacts conflict with a newer user instruction, prefer the user instruction. If `reference-expectations.md` explicitly says expectations were not provided (you can confirm this without reading the file if the brief's "Expected state" section is empty), treat that as missing information and say `I don't know` rather than inventing expected states.
@@ -195,7 +197,7 @@ If previous run reports exist, use as historical context only. If discovery meta
 These phases are governed by [okstra-team-contract](./skills/okstra-team-contract/SKILL.md). It is the canonical source for:
 - Worker prompt anchor headers and body composition rules.
-- The `[Required reading]` clause (audience-scoped enumeration: analysis workers vs report-writer vs reverify dispatches).
+- The `[Required reading]` clause (analysis-packet primary input for analysis workers, full source files for report-writer, scoped inputs for reverify dispatches).
 - The `[Error reporting]` clause and the asymmetry between claude-worker and codex/gemini-worker prompts.
 - Worker output contract (sections 0–5), header standard, terminal statuses, errors-sidecar schema.
 - Token-usage tracking conventions.
@@ -220,6 +222,8 @@ Spawn **analysis workers only** in the same turn (Phase 4 in Teams mode; Phase 5
 **Agent `name` on dispatch (BLOCKING — token-usage attribution depends on it).** Every analysis-worker `Agent(...)` call MUST set `name: "<workerId>-worker"` — `name: "claude-worker"` / `name: "codex-worker"` / `name: "gemini-worker"` — exactly as the report-writer dispatch sets `name: "report-writer"` ([okstra-report-writer](./skills/okstra-report-writer/SKILL.md)). The Agent harness records this `name` as `agentName` in the subagent session jsonl, and the Phase 7 token collector matches each worker's session by that `agentName` (`okstra_token_usage/collect.py`). A worker dispatched **without** `name` produces a session with no `agentName`; the collector cannot attribute it and records the worker as `source: "unavailable"` even though the session exists and is team-tagged (observed in `dev-9692` error-analysis: `claude`/`codex` workers dispatched without `name` → both `unavailable`, while the named `report-writer` collected normally). Convergence reverify dispatches keep the prefix (`<workerId>-worker-reverify-r<N>`); implementation executor/verifier variants keep `<workerId>-worker` / `<workerId>-executor`.
+**Agent `model:` on dispatch (BLOCKING — assignment is otherwise ignored).** The `Claude worker` `Agent(...)` call MUST set `model: "<family token of that role's modelExecutionValue>"` (`opus` / `sonnet` / `haiku`), per [okstra-team-contract](./skills/okstra-team-contract/SKILL.md) "Model Assignment Rules" #3–#4. The claude-worker definition is `model: inherit`, so omitting this parameter makes the worker silently run on the lead's model instead of its manifest assignment — the assigned-vs-actual deviation. `Codex worker` / `Gemini worker` are exempt: their CLI model is applied via the wrapper's own `--model` argument, so leave their Agent `model:` at `inherit` (rule #5).
 The no-`team_name` fallback (Phase 5) is only legal when team-state's `teamCreate.status` is `"error"` for this run. If `teamCreate` is missing or `attempted: false`, the correct action when an Agent dispatch is rejected for a missing team is to GO BACK to Phase 3 and call `TeamCreate` — never to strip `team_name` and continue.
 ### Errors log path wiring (BLOCKING)

package/runtime/agents/workers/claude-worker.md CHANGED

@@ -45,7 +45,7 @@ Unlike the Codex / Gemini workers, you are an in-process Claude subagent — you
 4. Anchor all file operations to the absolute `Project Root` from the lead prompt. Use absolute paths — do NOT rely on inherited cwd. Never use `cd` to change directory.
    - **Executor exception (implementation phase only):** when this worker is dispatched as the `Executor` and the lead prompt provides an `EXECUTOR_WORKTREE_PATH` that differs from the session's inherited cwd, cwd-sensitive Bash commands (`cargo *`, `npm *`, `pnpm *`, `bun *`, `pytest`, `make *`, `go *`, language-toolchain test/build commands) MUST be prefixed with `cd <EXECUTOR_WORKTREE_PATH> && ` in the same Bash invocation — e.g. `cd /Users/.../worktrees/foo && cargo test -p bar`. Do NOT wrap the whole thing in `bash -lc "..."` or `bash -c "..."`; pass the chained command directly to the Bash tool so the leading `cd` token remains visible to the permission layer. The `cd` is scoped to the single Bash subshell and does not mutate the session's shell state, so this does not conflict with the "never use cd" rule above (which prevents the worker from drifting the session cwd across calls).
-   - **Executor coding-conventions preflight (BLOCKING, before your first `Edit` / `Write`):** when dispatched as the `Executor`, you MUST run the coding-conventions preflight defined in the executor sidecar (`prompts/profiles/_implementation-executor.md` → "Pre-implementation context exploration") before writing any code — detect each touched file's language and invoke the project's coding-conventions skill (`coding-preflight` when installed; it routes the matching `languages/<lang>.md` + `clean-code.md` + any hexagonal overlay), then state in one line which conventions apply. Subagents do NOT auto-trigger skills, so this is an explicit step you must perform; if no such skill is reachable in your runtime, degrade per that sidecar section (agnostic principles + project lint/convention files) — never skip the gate.
+   - **Executor coding-conventions preflight (BLOCKING, before your first `Edit` / `Write`):** when dispatched as the `Executor`, you MUST run the coding-conventions preflight defined in the executor sidecar (`prompts/profiles/_implementation-executor.md` → "Pre-implementation context exploration") before writing any code — detect each touched file's language and read okstra's installed coding-conventions files directly — `~/.claude/skills/okstra-coding-preflight/languages/<lang>.md` + `clean-code.md` + any `architecture/hexagonal.md` overlay (placed by `okstra install`), then state in one line which conventions apply. The skill is `user-invocable: false` and subagents do NOT auto-trigger skills, so read the files via the Read tool by absolute path rather than relying on Skill invocation; if those files are not readable in your runtime, degrade per that sidecar section (agnostic principles + project lint/convention files) — never skip the gate.
    - **Verifier QA-gate exception:** verifier roles MAY use the same `cd <WORKTREE> && <cmd>` shape when executing project-declared `qaCommands` (lint / format / typecheck / test) from `project.json`, since those commands are cwd-sensitive by nature. Outside the QA gate, verifiers still read with absolute paths only — do NOT use `cd` for file inspection.
    - **No extra chaining beyond `cd && cmd`:** the permission matcher only allows the exact two-segment shape `cd <PATH> && <single-command>`. Do NOT append additional pipes, semicolons, redirects, or `&&` chains — e.g. `cd ... && cargo test ... 2>&1 | tail -20; echo "exit:$?"` will trigger a permission prompt every dispatch because the trailing `| tail`, `; echo`, and `2>&1` tokens disqualify the prefix match against `Bash(cargo:*)`. Let Claude Code capture the full stdout/stderr and exit code natively — do not post-process with `tail`, `head`, or `echo "exit:$?"`. If output truncation is genuinely needed, run the command first and read the result in a separate tool call.
@@ -60,7 +60,7 @@ Unlike the Codex / Gemini workers, you are an in-process Claude subagent — you
 Before producing any output, you MUST:
 1. Extract the absolute path from the lead's `**Worker Preamble Path:**` anchor header and Read that file end-to-end with a single `Read` call (no `offset`, no `limit`). This is the canonical SSOT for the Required Reading + Error Reporting + Output sections contract.
-2. Read every input file the lead enumerated under `## Inputs` (or equivalent heading) in the dispatch prompt body, end-to-end, following the rules stated in the preamble. For analysis workers this is task-brief + analysis-profile + analysis-material (if present) + reference-expectations + clarification-response (if carry-in). Analysis workers do NOT read `final-report-template.md` — that file is for the report writer only.
+2. Read every primary input file the lead enumerated under `## Inputs` (or equivalent heading) in the dispatch prompt body, end-to-end, following the rules stated in the preamble. For analysis workers this is normally `analysis-packet.md`; the source files named inside that packet are fallback/evidence paths to open when needed. Analysis workers do NOT read `final-report-template.md` — that file is for the report writer only.
 **Heartbeat — write the audit sidecar EARLY and APPEND per stage (BLOCKING).** Because this worker runs as an in-process Agent or a fresh-session tmux pane, the lead has no `BashOutput`-style liveness signal while waiting for your return. The audit sidecar is the only signal that survives a silent hang. Write the sidecar at `runs/<task-type>/worker-results/claude-worker-audit-<task-type>-<seq>.md` immediately after extracting `Project Root` and the assigned paths — BEFORE the per-file end-to-end reads — with just the heading line (`# Claude Worker Audit — <task-key>`) and one `- PROGRESS: started <ISO-8601-UTC>` line. Then APPEND one short progress line per stage as you advance: `read-<filename>`, `analysis-start`, `findings-draft-start`, `findings-draft-complete`, `write-result-start`. The append cadence MUST NOT exceed 5 minutes — if a single analysis stage is taking longer, emit a `- PROGRESS: in-stage:<stage> <ISO-8601-UTC>` heartbeat. A 5-minute stale sidecar mtime is the canonical "this worker has hung" signal for the operator. Sidecar write/append uses `Write` (initial) and `Edit` / heredoc `>>` (per-stage append).
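The stale-sidecar rule above can be checked from the operator side with a file mtime comparison. This is a rough sketch under stated assumptions: `sidecar_is_stale` is a hypothetical helper, and treating a missing sidecar as stale is my assumption, since the contract only defines staleness for an existing file.

```python
import os
import time

STALE_AFTER_SECONDS = 5 * 60  # the 5-minute cadence named in the heartbeat rule


def sidecar_is_stale(sidecar_path, now=None):
    """True when the audit sidecar was last modified more than 5 minutes ago.

    Assumption: a sidecar that does not exist (or cannot be stat'ed) counts as stale.
    """
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(sidecar_path)
    except OSError:
        return True
    return (now - mtime) > STALE_AFTER_SECONDS
```

An operator could poll this for `runs/<task-type>/worker-results/claude-worker-audit-<task-type>-<seq>.md` while waiting on the worker.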
@@ -68,7 +68,7 @@ Before producing any output, you MUST:
 When returning results, start the file with a YAML frontmatter block, then organize the body into the following sections in this exact order.
-**Frontmatter (mandatory)** — set `workerId: "claude"`. Copy `id`, `aliases`, `taskType`, `task-id`, `task-group`, `project-id`, `date` verbatim from the input files (`analysis-material.md` is canonical; if it lacks any field, record a `tool-failure` and stop). Full schema and a concrete example live in the `okstra-team-contract` skill's "Result Frontmatter" subsection.
+**Frontmatter (mandatory)** — set `workerId: "claude"`. Copy `id`, `aliases`, `taskType`, `task-id`, `task-group`, `project-id`, `date` verbatim from the primary input (`analysis-packet.md`; fall back to `analysis-material.md` only if the packet is missing a field). Full schema and a concrete example live in the `okstra-team-contract` skill's "Result Frontmatter" subsection.
 1. **Findings** - what you identified
 2. **Missing Information or Assumptions** - gaps in the analysis

package/runtime/agents/workers/codex-worker.md CHANGED

@@ -143,7 +143,7 @@ This wrapper does NOT invoke MCP tools directly. MCP availability inside the Cod
 Before invoking the Codex CLI, you MUST:
 1. Extract the absolute path from the lead's `**Worker Preamble Path:**` anchor header and verify the CLI run will Read that file end-to-end (canonical SSOT for the Required Reading + Error Reporting + Output sections contract). The lead's prompt body — which you persist verbatim and feed into Codex via stdin — already contains this anchor; do not strip it.
-2. Verify the lead's prompt body lists the per-run input files under `## Inputs` (task-brief, analysis-profile, analysis-material if present, reference-expectations, clarification-response if carry-in). Analysis workers do NOT read `final-report-template.md` — that file is for the report writer only.
+2. Verify the lead's prompt body lists the per-run primary input files under `## Inputs` (normally `analysis-packet.md` for analysis workers). The source files named inside that packet are fallback/evidence paths to open when needed. Analysis workers do NOT read `final-report-template.md` — that file is for the report writer only.
 The CLI writes a Reading Confirmation block to the **audit sidecar** at `runs/<task-type>/worker-results/codex-worker-audit-<task-type>-<seq>.md`. The sidecar's body begins with `# Codex Worker Audit — <task-key>` followed by one short line per input file confirming end-to-end reading. The main Codex output MUST NOT contain a `## 0. Reading Confirmation` heading — the validator fails worker-results that contain one. If any file was skipped, record a `tool-failure` in the errors sidecar instead of fabricating Findings.
@@ -151,7 +151,7 @@ The CLI writes a Reading Confirmation block to the **audit sidecar** at `runs/<t
 When returning results, start the file with a YAML frontmatter block, then organize the body into the following sections in this exact order.
-**Frontmatter (mandatory)** — set `workerId: "codex"`. Copy `id`, `aliases`, `taskType`, `task-id`, `task-group`, `project-id`, `date` verbatim from the input files (`analysis-material.md` is canonical; if it lacks any field, record a `tool-failure` and stop). Full schema and a concrete example live in the `okstra-team-contract` skill's "Result Frontmatter" subsection.
+**Frontmatter (mandatory)** — set `workerId: "codex"`. Copy `id`, `aliases`, `taskType`, `task-id`, `task-group`, `project-id`, `date` verbatim from the primary input (`analysis-packet.md`; fall back to `analysis-material.md` only if the packet is missing a field). Full schema and a concrete example live in the `okstra-team-contract` skill's "Result Frontmatter" subsection.
 1. **Findings** - what Codex identified
 2. **Missing Information or Assumptions** - gaps in the analysis

package/runtime/agents/workers/gemini-worker.md CHANGED

@@ -143,7 +143,7 @@ This wrapper does NOT invoke MCP tools directly. MCP availability inside the Gem
 Before invoking the Gemini CLI, you MUST:
 1. Extract the absolute path from the lead's `**Worker Preamble Path:**` anchor header and verify the CLI run will Read that file end-to-end (canonical SSOT for the Required Reading + Error Reporting + Output sections contract). The lead's prompt body — which you persist verbatim and feed into Gemini via stdin — already contains this anchor; do not strip it.
-2. Verify the lead's prompt body lists the per-run input files under `## Inputs` (task-brief, analysis-profile, analysis-material if present, reference-expectations, clarification-response if carry-in). Analysis workers do NOT read `final-report-template.md` — that file is for the report writer only.
+2. Verify the lead's prompt body lists the per-run primary input files under `## Inputs` (normally `analysis-packet.md` for analysis workers). The source files named inside that packet are fallback/evidence paths to open when needed. Analysis workers do NOT read `final-report-template.md` — that file is for the report writer only.
 The CLI writes a Reading Confirmation block to the **audit sidecar** at `runs/<task-type>/worker-results/gemini-worker-audit-<task-type>-<seq>.md`. The sidecar's body begins with `# Gemini Worker Audit — <task-key>` followed by one short line per input file confirming end-to-end reading. The main Gemini output MUST NOT contain a `## 0. Reading Confirmation` heading — the validator fails worker-results that contain one. If any file was skipped, record a `tool-failure` in the errors sidecar instead of fabricating Findings.
@@ -151,7 +151,7 @@ The CLI writes a Reading Confirmation block to the **audit sidecar** at `runs/<t
 When returning results, start the file with a YAML frontmatter block, then organize the body into the following sections in this exact order.
-**Frontmatter (mandatory)** — set `workerId: "gemini"`. Copy `id`, `aliases`, `taskType`, `task-id`, `task-group`, `project-id`, `date` verbatim from the input files (`analysis-material.md` is canonical; if it lacks any field, record a `tool-failure` and stop). Full schema and a concrete example live in the `okstra-team-contract` skill's "Result Frontmatter" subsection.
+**Frontmatter (mandatory)** — set `workerId: "gemini"`. Copy `id`, `aliases`, `taskType`, `task-id`, `task-group`, `project-id`, `date` verbatim from the primary input (`analysis-packet.md`; fall back to `analysis-material.md` only if the packet is missing a field). Full schema and a concrete example live in the `okstra-team-contract` skill's "Result Frontmatter" subsection.
 1. **Findings** - what Gemini identified
 2. **Missing Information or Assumptions** - gaps in the analysis
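The new frontmatter rule (packet first, `analysis-material.md` only for fields the packet lacks) can be pictured as a per-field merge. The sketch below assumes both files have already been parsed into plain dicts; `build_frontmatter` is a hypothetical name, and returning the still-missing fields to the caller is my choice, since the updated text does not say what happens when neither source has a field.

```python
REQUIRED_FIELDS = ("id", "aliases", "taskType", "task-id", "task-group", "project-id", "date")


def build_frontmatter(worker_id, packet_fields, material_fields=None):
    """Copy required fields verbatim, preferring the packet over the material file.

    Returns (frontmatter, missing) where `missing` lists fields found in neither source.
    """
    material_fields = material_fields or {}
    frontmatter = {"workerId": worker_id}
    missing = []
    for field in REQUIRED_FIELDS:
        if field in packet_fields:
            frontmatter[field] = packet_fields[field]
        elif field in material_fields:
            frontmatter[field] = material_fields[field]  # fallback source
        else:
            missing.append(field)
    return frontmatter, missing
```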

package/runtime/bin/lib/okstra/cli.sh CHANGED

@@ -142,6 +142,13 @@ while [[ $# -gt 0 ]]; do
       APPROVE_PLAN_ACK="true"
       shift
       ;;
+    --implementation-option)
+      # Name of the Option Candidate the user chose from the implementation-planning
+      # final-report. The runtime fills the approved-plan frontmatter's
+      # implementation-option line with this value. If empty, implementation falls back to the Recommended Option.
+      IMPLEMENTATION_OPTION="$(require_option_value --implementation-option "${2-}")"
+      shift 2
+      ;;
     --no-plan-verification)
       # Turns off the Phase 6 plan-body verification round of implementation-planning.
       # Enabled by default. When disabled, the User Approval at the top of the final-report
@@ -178,7 +185,7 @@ while [[ $# -gt 0 ]]; do
           printf '  hint: did you mean --task-id?\n' >&2
           ;;
       esac
-      printf '  valid options: --render-only --resume-clarification --yes --workers --lead-model --claude-model --codex-model --gemini-model --report-writer-model --related-tasks --task-type --project-id --project-root --task-group --task-id --task-brief --directive --clarification-response --approved-plan --approve --no-plan-verification -h|--help\n' >&2
+      printf '  valid options: --render-only --resume-clarification --yes --workers --lead-model --claude-model --codex-model --gemini-model --report-writer-model --related-tasks --task-type --project-id --project-root --task-group --task-id --task-brief --directive --clarification-response --approved-plan --approve --implementation-option --no-plan-verification -h|--help\n' >&2
       usage
       exit 1
       ;;
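The `--implementation-option` behaviour described in this hunk (fill the approved-plan frontmatter line, do nothing when the value is empty) might look roughly like the sketch below. It is not the runtime's code: `fill_implementation_option` is a hypothetical name, and the `---` delimited frontmatter with an existing `implementation-option:` line is an assumed layout.

```python
def fill_implementation_option(plan_text, option_name):
    """Set the `implementation-option:` frontmatter line to `option_name`.

    Assumptions: frontmatter is delimited by `---` lines and already contains an
    `implementation-option:` line. An empty name leaves the plan untouched, which
    corresponds to falling back to the plan's Recommended Option.
    """
    name = (option_name or "").strip()
    if not name:
        return plan_text
    lines = plan_text.split("\n")
    if not lines or lines[0] != "---":
        return plan_text  # no frontmatter block to edit
    for i in range(1, len(lines)):
        if lines[i] == "---":
            break  # end of frontmatter; never touch the plan body
        if lines[i].startswith("implementation-option:"):
            lines[i] = f"implementation-option: {name}"
            return "\n".join(lines)
    return plan_text
```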

package/runtime/bin/lib/okstra/globals.sh CHANGED

@@ -39,6 +39,9 @@ DIRECTIVE=""
 CLARIFICATION_RESPONSE_PATH=""
 APPROVED_PLAN_PATH=""
 APPROVE_PLAN_ACK="false"
+# implementation only: name of the Option Candidate the user chose. If empty, implementation
+# falls back to the plan's Recommended Option. Set via --implementation-option.
+IMPLEMENTATION_OPTION=""
 # Phase 6 plan-body verification toggle. Default "true" (round runs).
 # Flipped to "false" by --no-plan-verification on the CLI.
 PLAN_VERIFICATION_ENABLED="true"

package/runtime/bin/lib/okstra/interactive.sh CHANGED

@@ -59,23 +59,25 @@ resolve_task_root_for_shortcut() {
   local task_id="$4"
   local resolved=""
-  resolved="$(python3 - "$project_root" "$project_id" "$task_group" "$task_id" <<'PY'
-import json, os, re, sys
+  resolved="$(python3 - "$WORKSPACE_ROOT/scripts" "$project_root" "$project_id" "$task_group" "$task_id" <<'PY'
+import json, os, sys
 from pathlib import Path
-project_root = Path(sys.argv[1])
-project_id = sys.argv[2]
-task_group = sys.argv[3]
-task_id = sys.argv[4]
+# Building the task root's slug path is delegated to okstra_ctl.paths.task_dir (the SSOT).
+# This heredoc used to reimplement slugify and the `.okstra/tasks/<slug>/<slug>` layout
+# on its own, which risked silent drift whenever the rules changed. As in
+# project-resolver.sh, put $WORKSPACE_ROOT/scripts on sys.path and import the package.
+sys.path.insert(0, sys.argv[1])
+from okstra_ctl.paths import task_dir
+project_root = Path(sys.argv[2])
+project_id = sys.argv[3]
+task_group = sys.argv[4]
+task_id = sys.argv[5]
 requested_key = f"{project_id}:{task_group}:{task_id}"
 requested_key_ci = requested_key.lower()
-def slugify(value: str) -> str:
-    value = value.lower()
-    value = re.sub(r"[^a-z0-9]+", "-", value).strip("-")
-    return value
 candidates = []
 catalog_path = project_root / ".okstra" / "discovery" / "task-catalog.json"
@@ -111,7 +113,7 @@ if catalog_path.is_file():
                         sys.exit(0)
                     candidates.append(str(abs_path))
-slug_path = project_root / ".okstra" / "tasks" / slugify(task_group) / slugify(task_id)
+slug_path = task_dir(project_root, task_group, task_id)
 if slug_path.is_dir():
     print(f"OK\t{slug_path}")
     sys.exit(0)
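
The helper this hunk delegates to is not shown in the diff. As a minimal sketch, assuming `okstra_ctl.paths.task_dir` keeps the behaviour of the removed inline `slugify` and the `.okstra/tasks/<slug>/<slug>` layout, it could look roughly like the following. The function bodies are reconstructed from the removed lines for illustration and are not the package's actual code.

```python
import re
from pathlib import Path


def slugify(value: str) -> str:
    """Lowercase, collapse runs of non-alphanumerics to '-', trim edge dashes."""
    value = value.lower()
    return re.sub(r"[^a-z0-9]+", "-", value).strip("-")


def task_dir(project_root: Path, task_group: str, task_id: str) -> Path:
    """Assumed shape: <project_root>/.okstra/tasks/<group-slug>/<id-slug>."""
    return project_root / ".okstra" / "tasks" / slugify(task_group) / slugify(task_id)
```

Keeping the slug rule in one importable function means the shell heredoc and any other caller resolve the same directory when the rule changes.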

package/runtime/bin/lib/okstra/usage.sh CHANGED

@@ -46,6 +46,12 @@ optional arguments:
                        \`- [ ] Approved\` to \`- [x] Approved\` and appends an approval audit line
                        (timestamp + "CLI --approve"). Use this for scripted/CI flows or when you want a
                        single command to both approve and launch the next phase.
+  --implementation-option <name>
+                       Name of the Option Candidate the user chose from the implementation-planning
+                       final-report. Only meaningful together with --approved-plan and
+                       --task-type=implementation. The runtime fills the approved-plan frontmatter
+                       \`implementation-option:\` line with <name>. When omitted, the implementation run
+                       falls back to the plan's \`Recommended Option\`.
   --no-plan-verification
                        Disable the Phase 6 plan-body verification round that runs after the report-writer
                        authors the implementation-planning draft. Default: enabled. Only meaningful with

package/runtime/bin/okstra-team-reconcile.sh ADDED

@@ -0,0 +1,28 @@
+#!/usr/bin/env bash
+#
+# okstra-team-reconcile.sh — flip dead-pane stale-active team members to
+# inactive so the lead's `TeamDelete()` can disband the team in one shot.
+#
+# A Claude Code team member clears its own `isActive` flag in
+# `~/.claude/teams/<team>/config.json` when its `Agent()` dispatch returns. A
+# member whose tmux pane died WITHOUT that flip stays `isActive: true`, and
+# `TeamDelete` then refuses the whole team ("active members remain") — an error
+# no re-sent `shutdown_request` can clear, since the addressee is already gone.
+# This reconciles exactly that case; it never touches a live-pane member, the
+# lead, or a member with no recorded pane (those are left for graceful
+# shutdown). It no-ops when tmux is unavailable or nothing is stale.
+#
+# Usage: okstra-team-reconcile.sh [--list] <team-name>
+#   --list   report what WOULD be deactivated; do not write (alias --dry-run).
+#
+# Failures are non-fatal to the run — teardown must never block on this.
+set -u
+_dir="$(cd -P "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+# Logic lives in okstra_ctl.team_reconcile. In the repo layout the package is a
+# bin-sibling (scripts/okstra_ctl); in the installed layout it is under
+# $OKSTRA_HOME/lib/python. Put both on PYTHONPATH so either resolves.
+home="${OKSTRA_HOME:-$HOME/.okstra}"
+export PYTHONPATH="${_dir}:${home}/lib/python${PYTHONPATH:+:$PYTHONPATH}"
+exec python3 -c 'import sys; from okstra_ctl.team_reconcile import main; sys.exit(main(sys.argv[1:]))' "$@"
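
The `okstra_ctl.team_reconcile` module itself is not part of this excerpt. The rule described in the header comment can be sketched as below, under stated assumptions: the config is taken to be a dict with a `members` list, `paneId` is a hypothetical field name for the recorded tmux pane, and `pane_alive` is a caller-supplied probe standing in for whatever tmux check the real module performs.

```python
from typing import Callable


def reconcile(config: dict, lead_name: str, pane_alive: Callable[[str], bool]) -> list[str]:
    """Flip dead-pane, still-active, non-lead members to inactive.

    Returns the names of the members that were flipped.
    """
    flipped = []
    for member in config.get("members", []):
        if member.get("name") == lead_name:
            continue  # never touch the lead
        if not member.get("isActive"):
            continue  # already inactive, nothing to do
        pane_id = member.get("paneId")  # hypothetical field name
        if not pane_id:
            continue  # no recorded pane: left for graceful shutdown
        if pane_alive(pane_id):
            continue  # live pane: left for graceful shutdown
        member["isActive"] = False
        flipped.append(member["name"])
    return flipped
```

A `--list` style dry run would call the same function on a copy of the config and report the returned names without writing anything back.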

package/runtime/bin/okstra.sh CHANGED

@@ -80,6 +80,7 @@ okstra execution summary:
   executor (implementation only): ${EXECUTOR_OVERRIDE:-default(claude)}
   approved plan: ${APPROVED_PLAN_PATH:-None}
   approve ack (CLI 승인 의사): ${APPROVE_PLAN_ACK}
+  implementation option (선택 옵션): ${IMPLEMENTATION_OPTION:-None(fallback to Recommended Option)}
   related tasks: ${RELATED_TASKS_RAW:-None}
 CONFIRM_EOF
   printf 'Continue? [y/yes]: ' >&2
@@ -117,6 +118,7 @@ PY_ARGS=(
 [[ -n "${RELATED_TASKS_RAW-}" ]] && PY_ARGS+=(--related-tasks "$RELATED_TASKS_RAW")
 [[ -n "${APPROVED_PLAN_PATH-}" ]] && PY_ARGS+=(--approved-plan "$APPROVED_PLAN_PATH")
 [[ "$APPROVE_PLAN_ACK" == "true" ]] && PY_ARGS+=(--approve)
+[[ -n "${IMPLEMENTATION_OPTION-}" ]] && PY_ARGS+=(--implementation-option "$IMPLEMENTATION_OPTION")
 [[ -n "${CLARIFICATION_RESPONSE_PATH-}" ]] && PY_ARGS+=(--clarification-response "$CLARIFICATION_RESPONSE_PATH")
 [[ -n "${WORK_CATEGORY-}" ]] && PY_ARGS+=(--work-category "$WORK_CATEGORY")
 [[ -n "${BASE_REF-}" ]] && PY_ARGS+=(--base-ref "$BASE_REF")
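
How the Python side consumes `--implementation-option` is not visible in this excerpt. The fallback described in the usage text can be sketched as follows, assuming a simple line-based read of the frontmatter. The function names are hypothetical, and quoted YAML values are deliberately not handled.

```python
from typing import Optional


def read_implementation_option(report_text: str) -> Optional[str]:
    """Return the frontmatter `implementation-option:` value, or None when absent or empty."""
    lines = report_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # no YAML frontmatter block
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        key, sep, value = line.partition(":")
        if sep and key.strip() == "implementation-option":
            return value.strip() or None
    return None


def effective_option(cli_option: str, report_text: str, recommended: str) -> str:
    """CLI value first, then the frontmatter line, then the Recommended Option."""
    return cli_option or read_implementation_option(report_text) or recommended
```

An empty `IMPLEMENTATION_OPTION` in the shell wrapper means the flag is never forwarded, so `cli_option` is empty here and the frontmatter or the recommended option decides.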

package/runtime/prompts/launch.template.md CHANGED

@@ -39,6 +39,8 @@ Emit one `PROGRESS: <phase-id> <verb-phrase>` line as plain user-facing text at
 - Task manifest: `{{TASK_MANIFEST_RELATIVE_PATH}}`
 - Run manifest: `{{RUN_MANIFEST_RELATIVE_PATH}}`
+- Active run context: `{{ACTIVE_RUN_CONTEXT_RELATIVE_PATH}}`
+- Analysis packet: `{{ANALYSIS_PACKET_RELATIVE_PATH}}`
 ## Session
@@ -82,7 +84,7 @@ Emit one `PROGRESS: <phase-id> <verb-phrase>` line as plain user-facing text at
 ## Available MCP Servers
 {{AVAILABLE_MCP_SERVERS}}
-- The full usage policy and per-phase rules live in the task brief's `## Available MCP Servers` section. Read them there before dispatching workers and inject only the one-line pointer below into each worker prompt (the brief is already in every worker's [Required reading], so verbatim copy is redundant): `**MCP servers:** follow the task brief's "## Available MCP Servers" section (already in your Required reading).`
+- The full usage policy and per-phase rules live in the analysis packet's `Available MCP Servers` extract. Inject only the one-line pointer below into each analysis-worker prompt: `**MCP servers:** follow the analysis packet's "Available MCP Servers" section (already in your Required reading).`
 - **Invocation rule (forward to every worker prompt)**: MCP tools are addressed by their tool name through the host's tool interface — **never via `Bash`**. Claude-side workers call the tool directly (e.g. `mcp__<server>__<tool>`). Codex/Gemini workers call through their CLI's own MCP transport (e.g. `codex mcp call ...`). Running the tool name as a shell command is a contract violation and will always fail regardless of permission grants.
 - Codex worker and Gemini worker run external CLIs; they can only use these MCP servers if their own CLI configs mirror them. If not, instruct the worker to record `MCP not available in this CLI` in its `Missing Information or Assumptions` block rather than guessing or shell-falling-back.
 - MCP queries are evidence-grade. Cite server, table, and the SELECT used in worker output. MCP must NOT be used as a write path in any phase, including `implementation`.

package/runtime/prompts/profiles/_common-contract.md CHANGED

@@ -39,11 +39,11 @@ profile document.
 - Run-end team teardown (shared — runs AFTER Phase 7 persistence/token collection, BEFORE the pane disposition step below):
   - The lead created the worker team in Phase 3 (`TeamCreate(team_name: "okstra-<task-key>")`). Worker teammates are NOT reclaimed on their own — without an explicit teardown they linger in the FleetView roster across this and later runs in the session. The lead MUST release them once the run's work is done.
   - This step is **automatic and silent** — NO user prompt (workers are idle sessions that have already delivered their results; there is nothing for the user to preserve). It runs only when team-state's `teamCreate.status == "ok"` (Teams mode was actually used); in the no-`team_name` fallback there is no team to delete, so silent-skip.
+  - Why a reconcile step exists: each worker clears its own `isActive` flag in `~/.claude/teams/<team>/config.json` when its `Agent()` dispatch returns, so by Phase 7 every worker is normally already inactive and `TeamDelete()` succeeds immediately. The one failure mode is a worker whose tmux pane died WITHOUT clearing the flag (e.g. killed mid-turn): it stays `isActive: true`, and `TeamDelete` then refuses the entire team with an "active members" error that no amount of re-sending `shutdown_request` can clear — the addressee is already gone. `okstra-team-reconcile.sh` deterministically flips exactly those dead-pane stale-active members to inactive (never a live-pane member, never the lead).
   - Sequence (token-usage collection MUST already be complete — `TeamDelete` removes `~/.claude/teams/<team>/` + `~/.claude/tasks/<team>/` but NOT the `~/.claude/projects/` jsonls Phase 7 reads, yet the read MUST precede teardown):
-    1. Read `~/.claude/teams/okstra-<task-key>/config.json` and, for every `members` entry whose name is not the lead, `SendMessage(to: <name>, message: { type: "shutdown_request" })` to terminate it gracefully.
-    2. These workers already delivered their results and terminated when their `Agent()` dispatch returned (the lead's completion evidence is the returned output + the existing result/final-report file, not a teardown ack) — a terminated session emits NO shutdown confirmation. Treat `shutdown_request` as best-effort (fire-and-forget); the lead MUST NOT block waiting for acks from addressed teammates. Proceed immediately to step 3.
-    3. Call `TeamDelete()` — the single synchronization point for teardown. If it errors with an active-members message, one teammate is genuinely still shutting down: wait briefly, retry `TeamDelete()` once, then proceed regardless of the result. NEVER loop or re-send `shutdown_request`; teardown must never block run completion once the work and final report already exist.
-  - Report it in one short line (e.g. `worker 6명 종료 + 팀 해제`) and proceed. Emit `PROGRESS: phase-7-teardown disbanding team` immediately before step 1.
+    1. Run `$HOME/.okstra/bin/okstra-team-reconcile.sh "okstra-<task-key>"` exactly once. It flips dead-pane stale-active members to inactive, and no-ops when tmux is unavailable or nothing is stale. Do NOT loop it.
+    2. Call `TeamDelete()` — the single synchronization point for teardown. If it STILL errors with an active-members message, one worker pane is genuinely still live (rare at Phase 7, since every `Agent()` dispatch has already returned): send that one member a structured `SendMessage(to: <name>, message: { type: "shutdown_request" })` — the `message` MUST be the object literal shown, NEVER a JSON string stuffed into a text field (a stringified payload is delivered as a plain message and the shutdown protocol never fires) — wait briefly, then retry `TeamDelete()` once and proceed regardless of the result. NEVER loop, never use `TaskStop` (teammates are not background tasks — `TaskStop` 404s on a member address), and never let teardown block run completion once the work and final report already exist.
+  - Report it in one short line (e.g. `stale 멤버 1명 정리 + 팀 해제`, or just `worker 팀 해제` when nothing was stale) and proceed. Emit `PROGRESS: phase-7-teardown disbanding team` immediately before step 1.
 - Phase wrap-up — okstra pane disposition (shared, MUST be the *last* step before returning control to the user):
   - At run end the only residual okstra panes are the LAST phase's (e.g. the `report-writer-worker` agent pane and any codex/gemini trace pane). `okstra-trace-cleanup.sh --list --run-dir "<RUN_DIR>"` returns one tab-separated `<pane_id>\t<pane_title>` line per residual okstra pane (worker-agent + trace) for this run.
   - When `<RUN_DIR>/state/lead-pane.id` is non-empty, after the final-report file has been written and the routing recommendation has been issued, the lead MUST run `$HOME/.okstra/bin/okstra-trace-cleanup.sh --list --run-dir "<RUN_DIR>"` exactly once. The output lists every residual okstra pane (worker-agent + trace) for this run, never the lead's own pane.

package/runtime/prompts/profiles/_implementation-executor.md CHANGED

@@ -19,9 +19,9 @@ until Phase 5 ends, then drop from active context for Phase 6/7.
 ## Pre-implementation context exploration (executor before first edit)
 - **Coding-conventions preflight (BLOCKING — runs before the first `Edit` / `Write`, and binds the TDD loop below):** load the applicable coding conventions for every language the diff will touch, then state in ONE line which conventions apply (e.g. `Applying TS + hexagonal overlay; domain at src/domains/*/domain/`). Lint/test green is necessary but NOT sufficient — self-mocked tests, interaction-only assertions, and untruthful names all pass a green pipeline; this gate is what keeps them out of the diff.
-  - **Language-specific rules load per situation — never inline them here.** Detect each touched file's language (extension / project manifest) and load the matching reference from the project's coding-conventions skill: `coding-preflight`, when installed, routes `languages/<lang>.md` (mock/spy API, idioms, test framework) + `clean-code.md` + any `architecture/*` overlay. For a ports-and-adapters / NestJS-hex layout (`domain/` + `ports/` + `adapters/`, `*.port.*`), load the hexagonal overlay too. This per-language split is the skill's job — the executor does not carry a multi-language block in context.
+  - **Language-specific rules load per situation — never inline them here.** Detect each touched file's language (extension / project manifest) and load the matching reference by reading okstra's installed coding-conventions files directly at `~/.claude/skills/okstra-coding-preflight/` (placed there by `okstra install`): read `languages/<lang>.md` (mock/spy API, idioms, test framework) + `clean-code.md` + any `architecture/*` overlay via the Read tool by absolute path. The skill is `user-invocable: false`, so do NOT rely on Skill-tool auto-invocation — read the files directly. For a ports-and-adapters / NestJS-hex layout (`domain/` + `ports/` + `adapters/`, `*.port.*`), load the hexagonal overlay too. This per-language split is the skill's job — the executor does not carry a multi-language block in context.
   - **Language-agnostic principles that ALWAYS bind (the TDD loop below MUST satisfy them):** (1) no self-mocking of the SUT — stub/spy only injected collaborators, never the subject's own methods; (2) behavioral assertions on outcomes (return value, state, persisted rows, events, boundary calls) — never `toHaveBeenCalled*` on an internal helper as the only/primary assertion; (3) truthful names — a `get*` / `find*` that writes/inserts, or a name encoding the caller's use-case (`*ForInit`) or hiding a domain rule (`findValid*`), is a defect; (4) single-purpose functions ≤50 effective lines, plain-English readability.
-  - **Graceful degradation (end-user, or codex / gemini executor runtimes where no coding-conventions skill is reachable):** do NOT skip the gate — apply the agnostic principles above plus the project's own `CLAUDE.md` / `CONTRIBUTING` / formatter+lint config, and record `coding-conventions: skill-unavailable → applied <project rules + agnostic principles>` in the final report. Never claim a skill read that did not happen.
+  - **Graceful degradation (codex / gemini executor runtimes, or any runtime where the `~/.claude/skills/okstra-coding-preflight/` files are absent or unreadable):** do NOT skip the gate — apply the agnostic principles above plus the project's own `CLAUDE.md` / `CONTRIBUTING` / formatter+lint config, and record `coding-conventions: skill-unavailable → applied <project rules + agnostic principles>` in the final report. Never claim a skill read that did not happen.
 - **Mandatory TDD loop**: BEFORE the first `Edit` or `Write` call, the executor MUST apply a red-green-refactor loop for every code change in this run. This is required; skipping it is a `contract-violated` outcome. This governs HOW each step is executed (failing test first → minimal implementation → refactor); it does not override the approved plan's WHAT/file scope.
   - Order of operations per plan step: (1) write/extend the test that captures the step's acceptance criterion and confirm it fails for the right reason, (2) commit the failing test (`test(<scope>): ...`), (3) implement the minimum change to make it pass, (4) commit the implementation (`feat|fix(<scope>): ...`), (5) refactor without changing behaviour and commit separately if any cleanup is made (`refactor(<scope>): ...`). The failing-then-passing transition between steps (2) and (4) is the `TDD evidence` required by the final report.
   - Doc-only / config-only / pure-rename steps that have no observable runtime behaviour are exempt from the failing-test requirement, but the executor MUST cite the exemption per step in the final report (`TDD exemption: <reason>`).

package/runtime/prompts/profiles/_implementation-verifier.md CHANGED

@@ -66,7 +66,7 @@ The final report keeps both — executor's `Validation evidence` AND each verifi
 Re-running commands proves the diff *builds and passes*; it does NOT prove the diff is *well-designed*. Lint/test green is necessary but not sufficient — self-mocked tests, interaction-only assertions, and untruthful names all survive a green pipeline. This gate is the filter for exactly those defects, so the executor's design errors are caught here instead of in post-merge PR review. It is a real gate, not a checklist: it enumerates the full diff and a blocking hit forces `FAIL`.
 - **Scope (no silent sampling).** Enumerate every changed source/test file via `git diff --name-only <base>...HEAD` and review each one. Skipping a changed file silently is a `contract-violated` outcome. If a file's language has no reference and is not covered by the agnostic checks below, record `design-review skipped: <file> (language=<x> no reference)` — never pass it silently.
-- **Load the same conventions the executor used, per language.** For each touched language load the coding-conventions reference (`coding-preflight` `languages/<lang>.md` + `clean-code.md` + the hexagonal overlay when the layout matches); degrade to the agnostic checks below when no skill is reachable. The verifier does NOT inline language rules — it loads them per situation, identical to the executor preflight.
+- **Load the same conventions the executor used, per language.** For each touched language load the coding-conventions reference by reading `~/.claude/skills/okstra-coding-preflight/languages/<lang>.md` + `clean-code.md` + the `architecture/hexagonal.md` overlay when the layout matches; degrade to the agnostic checks below when those files are not readable. The verifier does NOT inline language rules — it loads them per situation, identical to the executor preflight.
 - **Blocking checks (any hit → verdict `FAIL`, cited `path:line` + rule name, recommended fix recorded — the verifier does NOT apply it):**
   - **Self-mocking:** a test for `Foo` stubs/spies a method on the `Foo` instance under test (`jest.spyOn(sut, ...)`, `spyOn(FooService.prototype, ...)` in `foo.*.spec.*`, `vi.mocked(sut)` + stub). Mocking injected collaborators is fine.
   - **Interaction-only assertion:** a test whose only/primary assertion is `toHaveBeenCalled*` / `toHaveBeenCalledTimes` on an internal helper or a non-side-effecting collaborator, with no assertion on the returned value / resulting state / persisted row / emitted event.

package/runtime/prompts/profiles/implementation-planning.md CHANGED

@@ -55,7 +55,7 @@
   - The final report MUST include section headings containing each of the following exact strings: `Option Candidates`, `Trade-off`, `Recommended Option`, `Stage Map`, `Stage Exit Contract`, `Stage Validation`, `Dependency`, `Validation Checklist`, `Rollback`. (Approval is no longer a body section — it is the YAML frontmatter `approved` field.)
   - Korean translations are allowed in parentheses (e.g. `### Recommended Option (권장 옵션)`), but the English keyword must be present verbatim in the heading line.
   - The shape and ordering follow `final-report-template.md` section 4.5 (`Implementation Plan Deliverables`). Do NOT translate the heading keywords — `validators/validate-run.py` does substring matching on the raw report text and 7-of-8 missing strings is a real, repeatedly observed failure mode (root cause: writer translated the headings to Korean).
-  - Beyond substring matching, when the Plan Body Verification gate result is `passed` / `passed-with-dissent`, `validators/validate-run.py` runs the **structural** Stage Map validator (`validators/validate-implementation-plan-stages.py`) at the planning boundary — the exact `## 5.5 Stage Map` heading, each `## 5.5.<i> Stage <i>:` section with its four required subsections, the per-stage effective step count (≤6), and the `depends-on` DAG are all enforced here, not deferred to the `implementation` entry gate.
+  - Beyond substring matching, when the Plan Body Verification gate result is `passed` / `passed-with-dissent`, `validators/validate-run.py` runs the **structural** Stage Map validator (`validators/validate-implementation-plan-stages.py`) at the planning boundary — the exact `## 5.5 Stage Map` heading, each `## 5.5.<i> Stage <i>:` section with its four required subsections, the per-stage effective step count (≤6), the `depends-on` DAG, and the per-stage vertical-slice contract (S10) are all enforced here, not deferred to the `implementation` entry gate. S10 scans for the literal in-section strings `Slice value:`, `Acceptance:`, and the Stepwise `action`-cell prefixes `RED:` / `GREEN:` (or a `TDD exemption:` line) — keep these tokens verbatim for the same reason as the heading keywords above.
 - Required deliverable shape (final report, in addition to the standard sections):
   - at least two implementation options. **Each option must include**:
     - **File Structure**: an explicit list of files to create / modify / delete with each file's responsibility (one-line each). Use the form `Create: path — responsibility` / `Modify: path:line-range — change summary` / `Delete: path — reason`.
@@ -64,18 +64,22 @@
   - trade-off matrix across options (rows = options, columns at minimum: complexity, risk, reversibility, test coverage cost, rollout cost)
   - recommended option with rationale tied to the design principles above
   - **Stage Map (mandatory — always emitted, even when N=1):** a table of all stages with `stage | title | depends-on | step-count | exit-contract-summary`. `depends-on` is `(none)` or a comma-separated stage number list. Stages with `depends-on (none)` can be implemented in parallel by two simultaneous `implementation` runs.
+  - **Per-stage slice declaration (mandatory two lines, directly under the `## 5.5.<i> Stage <i>:` heading, before `### Carry-In`):**
+    - `Slice value: <the one user-observable increment this stage delivers, end-to-end>` — describe WHAT starts working from the consumer's view (e.g. "X 를 조회하면 Y 가 반환된다"), NOT a layer name ("repository 추가"). Validator S10a rejects a missing/empty value.
+    - `Acceptance: <the observable pass condition or the exact command>` — the signal that proves the slice is done; normally the same test command that the `RED:` step below flips to PASS. Validator S10b rejects a missing/empty value.
   - **Per-stage subsections** (`## 5.5.<i> Stage <i>: <title>` for each `i`), each containing the four required subsections:
     - `### Carry-In` — for `depends-on (none)`: task-brief only. Otherwise: each depended-on stage's static exit contract + runtime sidecar path `runs/<impl-key>/carry/stage-<i>.json` placeholder.
-    - `### Stepwise Execution Order` — bite-sized table with `step | action | files | command | expected`. **Effective row count ≤ 6** (excluding header / divider / blank). Each step is one action completable in 2–5 minutes; for code steps include actual code or diff sketch; prefer TDD ordering (failing test → implementation → green → commit).
+    - `### Stepwise Execution Order` — bite-sized table with `step | action | files | command | expected`. **Effective row count ≤ 6** (excluding header / divider / blank). Each step is one action completable in 2–5 minutes; for code steps include actual code or diff sketch. **TDD ordering is MUST, not a preference:** the **first** effective step's `action` cell MUST start with the literal `RED:` and describe the failing test that captures this stage's `Acceptance` (`expected` = FAIL); at least one later `action` cell MUST start with the literal `GREEN:` and describe the minimal implementation that makes it pass (`expected` = PASS); an optional refactor step starts with `REFACTOR:`. **Exemption:** doc-only / config-only / pure-rename stages with no observable runtime behaviour may omit RED/GREEN by declaring one line `TDD exemption: <reason>` in the stage section (mirrors the executor's per-step exemption in `_implementation-executor.md`). Validator S10c enforces RED-first + GREEN, or the exemption line.
     - `### Stage Exit Contract` — predicted added/modified files, newly exposed identifiers/types/endpoints, downstream-usable resources.
     - `### Stage Validation` — pre / mid / post exact commands or observable outcomes for this stage only.
-  - **Cohesion-first partition rule (1st-class):** the grouping anchor is **shared file/module proximity** — steps touching the same file/directory/module go in the same stage so the diff, PR, and rollback unit are semantically cohesive. A stage is split ONLY when (a) a real `depends-on` data/contract dependency exists, (b) effective steps would exceed 6, or (c) the file sets are disjoint (unrelated work touching no shared file is not crammed together). Maximising the number of parallel stages is NOT a reason to split — parallelism is an emergent property of independent stages, never a partitioning goal.
+  - **Vertical-slice-first partition rule (1st-class):** the grouping anchor is a **thin end-to-end vertical slice** — one stage delivers a single user-observable increment, crossing whatever layers are needed (data → service → API → UI) to make that one increment work. File/module proximity is demoted to the **intra-slice grouping rule**: within a slice, keep steps touching the same file/directory/module together so the diff, PR, and rollback unit stay cohesive. **Horizontal layer-splitting is forbidden** — never carve "the DB layer" into one stage and "the service layer" into the next; that produces stages that ship no standalone user value. A stage is split ONLY when (a) a real `depends-on` data/contract dependency exists, (b) effective steps would exceed 6, or (c) it is a distinct vertical slice (a different user-value increment). Maximising the number of parallel stages is NOT a reason to split — parallelism is an emergent property of independent stages, never a partitioning goal.
   - **Parallel-safety invariant (BLOCKING):** any two stages that are both `depends-on (none)` MUST predict disjoint file sets in their `Stage Exit Contract`. Two parallel `implementation` runs would otherwise edit the same file concurrently. Work touching a shared file must either go in one stage or be ordered with `depends-on`. Enforced by `validators/validate-implementation-plan-stages.py` check S9.
   - **Stage exit contract is the carry surface:** keep it as narrow as possible. Wider surface = more downstream coupling.
   - dependency / migration risk assessment (ordering constraints, data backfills, feature-flag prerequisites, repo-internal sequencing)
   - validation checklist (pre / mid / post) — each item is an exact command or observable outcome
   - rollback strategy — exact revert path (commits, flags, migrations) and the signal that triggers rollback
   - the YAML frontmatter MUST include the line `approved: false` (report-writer always emits the unflipped value). The user authorises the next `implementation` run by flipping it to `approved: true` (manual edit or `--approve` CLI). Do NOT recreate any `User Approval Request` body block — the validator fails reports that contain one (see `validators/validate-run.py` deprecated patterns).
+  - the YAML frontmatter MUST include the line `implementation-option:` directly under `approved:` (report-writer always emits it with an **empty value**). The user selects which Option Candidate the next `implementation` run executes by filling this line with that option's name (manual edit or `--implementation-option <name>` CLI). When left empty, the `implementation` run falls back to the `Recommended Option`.
   - **the frontmatter `approved: false` line is rendered unconditionally; if the plan-body verification gate (§5.5.9) returns `blocked-by-disagreement` or `aborted-non-result`, the writer MUST keep `approved: false` and the validator refuses any report that ships with `approved: true` under such a gate result.**
   - every ambiguity flagged during pre-planning that the user must resolve before approval registered as a `Blocks=approval` row in the `## 1. Clarification Items` table (do NOT create a separate `Open Questions` block under `4.5.x` — the unified table is the single home)
   - **§5.5.9 Plan Body Verification (BLOCKING).** After report-writer finishes the draft, the lead MUST run a worker peer-review round on the consolidated plan body (sections 4.5.1 – 4.5.7) and populate `### 5.5.9 Plan Body Verification` in the final report. The round protocol, plan-item ID scheme (`P-Opt-*` / `P-Step-*` / `P-Dep-*` / `P-Val-*` / `P-Rb-*`), verdict semantics, gate-result classification, and dissent log format are defined in `skills/okstra-convergence/SKILL.md` "Plan-body verification mode". The four gate-result values are `passed`, `passed-with-dissent`, `blocked-by-disagreement`, `aborted-non-result`. When the gate would have been `blocked-by-disagreement` or `aborted-non-result`, the lead MUST NOT silently flip it to one of the passing values to "unblock" the run — that is a contract violation. When `convergence.adversarial=true` (the default for this phase), this round uses the adversarial posture — verifiers confirm cited paths/commands and the burden of proof is on the plan — but the gate threshold stays `majority-disagree` (see that skill's §"Adversarial plan-body posture").
@@ -98,4 +102,4 @@
   4. **Ambiguity check** — any requirement that could be read two ways must be made explicit or moved to the `## 1. Clarification Items` table as a `Blocks=approval` row.
   5. **Scope check** — if the recommended plan now spans multiple independent subsystems, recommend splitting into separate planning runs rather than shipping an oversized plan.
   6. **Plan-body verification reconciliation (BLOCKING for implementation-planning).** Inspect the `### 5.5.9 Plan Body Verification` verdict table. For every plan-item row classified as `majority-disagree → C-<N>`, the corresponding `C-<N>` row MUST exist in `## 1. Clarification Items` with `Kind` chosen per the standard policy and `Blocks=approval`. Do NOT create a parallel `### 5.5.x Open Questions` block — the unified table is the single home. Conversely, the `Classification` column's `C-<N>` reference and the `## 1. Clarification Items` `ID` column MUST match 1:1; an orphan on either side is a contract violation. For `partial-consensus` and `worker-unique` plan-items, the dissenting opinion lives in §5.5.9 `Dissent log` and is NOT promoted to §5.
-  7. **Stage Map self-check** — for every stage, count the effective rows of its `Stepwise Execution Order` table by hand; reject the draft if any stage exceeds 6. Walk the `depends-on` graph and confirm it is a DAG (no cycle, no self-reference). For each `depends-on` link, confirm it encodes a real data/contract dependency — do NOT add links to serialise unrelated work, and do NOT split a stage merely to create more parallel stages. **Parallel-safety:** for every pair of `depends-on (none)` stages, confirm their `Stage Exit Contract` predicted file sets are disjoint; if they share a file, merge them or add a `depends-on` link (validator S9 rejects overlap).
+  7. **Stage Map self-check** — for every stage, count the effective rows of its `Stepwise Execution Order` table by hand; reject the draft if any stage exceeds 6. Confirm each stage declares a non-empty `Slice value:` and `Acceptance:` line, and that its first step `action` starts with `RED:` with a later `GREEN:` (or carries a `TDD exemption:` line) — this is what validator S10 enforces. Walk the `depends-on` graph and confirm it is a DAG (no cycle, no self-reference). For each `depends-on` link, confirm it encodes a real data/contract dependency — do NOT add links to serialise unrelated work, and do NOT split a stage merely to create more parallel stages. **Parallel-safety:** for every pair of `depends-on (none)` stages, confirm their `Stage Exit Contract` predicted file sets are disjoint; if they share a file, merge them or add a `depends-on` link (validator S9 rejects overlap).
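The updated Stage Map self-check in item 7 can be sketched as a few mechanical checks. This is a minimal illustration, not the package's validator: the dict layout (`steps`, `depends_on`, `files`, `tdd_exempt`) and the function name `check_stage_map` are hypothetical, and the actual S9/S10 validators may parse and report differently. It covers the step-count limit, the RED-first / later-GREEN rule, the DAG requirement, and the disjoint-file rule for `depends-on (none)` stages.

```python
from typing import Dict, List

MAX_STEPS_PER_STAGE = 6  # limit stated in the self-check text


def check_stage_map(stages: Dict[str, dict]) -> List[str]:
    """Return a list of problems; an empty list means the checks pass.

    Each stage is a hypothetical dict with keys:
      steps      -> list of step `action` strings, in order
      depends_on -> list of stage names
      files      -> set of predicted file paths (Stage Exit Contract)
      tdd_exempt -> True when a `TDD exemption:` line is present
    """
    problems: List[str] = []

    for name, stage in stages.items():
        steps = stage["steps"]
        if len(steps) > MAX_STEPS_PER_STAGE:
            problems.append(f"{name}: more than {MAX_STEPS_PER_STAGE} steps")
        if not stage.get("tdd_exempt", False):
            red_first = bool(steps) and steps[0].startswith("RED:")
            green_later = any(s.startswith("GREEN:") for s in steps[1:])
            if not (red_first and green_later):
                problems.append(f"{name}: needs RED first and a later GREEN")
        for dep in stage["depends_on"]:
            if dep == name:
                problems.append(f"{name}: depends on itself")
            elif dep not in stages:
                problems.append(f"{name}: unknown dependency {dep}")

    # Depth-first search; 1 = on the current path, 2 = fully explored.
    state: Dict[str, int] = {}

    def has_cycle(node: str) -> bool:
        if state.get(node) == 1:
            return True
        if state.get(node) == 2:
            return False
        state[node] = 1
        for dep in stages[node]["depends_on"]:
            if dep in stages and dep != node and has_cycle(dep):
                return True
        state[node] = 2
        return False

    if any(has_cycle(n) for n in stages):
        problems.append("depends-on graph contains a cycle")

    # Parallel-safety: stages without dependencies must not share files.
    roots = sorted(n for n, s in stages.items() if not s["depends_on"])
    for i, a in enumerate(roots):
        for b in roots[i + 1:]:
            shared = stages[a]["files"] & stages[b]["files"]
            if shared:
                problems.append(f"{a} and {b} share files: {sorted(shared)}")

    return problems
```

Returning a list of problems rather than raising on the first one mirrors the "reject the draft" wording: the reviewer sees every violation in one pass and can decide whether to merge stages or add a `depends-on` link.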

package/runtime/prompts/profiles/implementation.md CHANGED

@@ -19,7 +19,7 @@
   - the run brief MUST cite `--approved-plan <path>` pointing to a `final-report.md` produced by a prior `implementation-planning` run located under `runs/implementation-planning/.../reports/final-report.md`
   - that file's YAML frontmatter MUST carry `approved: true`. report-writer emits `approved: false` by default; the user flips it to `true` to authorise this run. Free-form approvals such as "lgtm" / "go ahead" / paraphrased confirmations are NOT accepted; re-edit the plan file's frontmatter to `approved: true` before invoking implementation, or pass `--approve` so the CLI flips it on the user's behalf (`okstra_ctl.run._apply_cli_approval`).
   - The `--approve` flag is meaningful ONLY with `--task-type implementation` and `--approved-plan <path>`; any other use raises `PrepareError`. Idempotent — re-running with `approved: true` already set appends an audit line but does NOT re-toggle.
-  - the file's `Recommended option` and its bite-sized step list become the authoritative scope for this run; deviations must be justified in the final report and routed back to a new `implementation-planning` run rather than silently expanded.
+  - the authoritative scope for this run is the Option Candidate named by the YAML frontmatter `implementation-option:` field. **If `implementation-option:` is empty, fall back to the plan's `Recommended Option`** (this is a soft fallback, not a hard block). The chosen option's bite-sized step list becomes the authoritative scope; deviations must be justified in the final report and routed back to a new `implementation-planning` run rather than silently expanded. If the chosen option name does not match any heading under `Option Candidates`, record it as a deviation.
 - Task worktree (provisioned by `okstra-ctl` at the first phase's run-prep time, reused for every subsequent phase of this task-key):
   - Status: `{{EXECUTOR_WORKTREE_STATUS}}` (one of: `created` | `reused` | `skipped-in-worktree` | `skipped-not-git`)
   - Working tree path: `{{EXECUTOR_WORKTREE_PATH}}` — when status is `created` or `reused`, this is the task's `git worktree` rooted at `~/.okstra/worktrees/<project>/<task-group>/<task-id>/`. When skipped, this is the caller's `project_root`.