okstra 0.49.0 → 0.51.0

Files changed (86)
  1. package/README.kr.md +8 -7
  2. package/README.md +8 -7
  3. package/bin/okstra +2 -0
  4. package/docs/kr/architecture.md +23 -24
  5. package/docs/kr/cli.md +6 -6
  6. package/docs/project-structure-overview.md +13 -9
  7. package/docs/superpowers/plans/2026-06-05-wizard-batch-prompts.md +559 -0
  8. package/docs/superpowers/specs/2026-06-05-wizard-batch-prompts-design.md +121 -0
  9. package/docs/task-process/error-analysis.md +1 -1
  10. package/docs/task-process/final-verification.md +1 -1
  11. package/docs/task-process/release-handoff.md +1 -1
  12. package/docs/task-process/requirements-discovery.md +1 -1
  13. package/package.json +1 -1
  14. package/runtime/BUILD.json +2 -2
  15. package/runtime/agents/SKILL.md +18 -14
  16. package/runtime/agents/workers/claude-worker.md +4 -4
  17. package/runtime/agents/workers/codex-worker.md +3 -3
  18. package/runtime/agents/workers/gemini-worker.md +3 -3
  19. package/runtime/agents/workers/report-writer-worker.md +3 -3
  20. package/runtime/bin/lib/okstra/cli.sh +8 -1
  21. package/runtime/bin/lib/okstra/globals.sh +3 -0
  22. package/runtime/bin/lib/okstra/interactive.sh +14 -12
  23. package/runtime/bin/lib/okstra/usage.sh +6 -0
  24. package/runtime/bin/okstra-render-report-views.py +1 -1
  25. package/runtime/bin/okstra-team-reconcile.sh +28 -0
  26. package/runtime/bin/okstra.sh +2 -0
  27. package/runtime/prompts/launch.template.md +4 -2
  28. package/runtime/prompts/profiles/_common-contract.md +15 -15
  29. package/runtime/prompts/profiles/_implementation-deliverable.md +1 -1
  30. package/runtime/prompts/profiles/_implementation-executor.md +3 -3
  31. package/runtime/prompts/profiles/_implementation-verifier.md +2 -2
  32. package/runtime/prompts/profiles/error-analysis.md +1 -1
  33. package/runtime/prompts/profiles/final-verification.md +2 -2
  34. package/runtime/prompts/profiles/implementation-planning.md +10 -9
  35. package/runtime/prompts/profiles/implementation.md +1 -1
  36. package/runtime/prompts/profiles/improvement-discovery.md +5 -5
  37. package/runtime/prompts/profiles/release-handoff.md +2 -2
  38. package/runtime/prompts/profiles/requirements-discovery.md +2 -2
  39. package/runtime/python/okstra_ctl/analysis_packet.py +259 -0
  40. package/runtime/python/okstra_ctl/clarification_items.py +11 -11
  41. package/runtime/python/okstra_ctl/context_cost.py +308 -0
  42. package/runtime/python/okstra_ctl/migrate.py +2 -12
  43. package/runtime/python/okstra_ctl/paths.py +22 -0
  44. package/runtime/python/okstra_ctl/render.py +285 -126
  45. package/runtime/python/okstra_ctl/render_final_report.py +32 -1
  46. package/runtime/python/okstra_ctl/report_views.py +12 -12
  47. package/runtime/python/okstra_ctl/run.py +510 -248
  48. package/runtime/python/okstra_ctl/sequence.py +2 -5
  49. package/runtime/python/okstra_ctl/team_reconcile.py +131 -0
  50. package/runtime/python/okstra_ctl/wizard.py +219 -136
  51. package/runtime/python/okstra_ctl/workflow.py +1 -1
  52. package/runtime/python/okstra_ctl/worktree.py +13 -5
  53. package/runtime/schemas/final-report-v1.0.schema.json +4 -0
  54. package/runtime/skills/okstra-brief/SKILL.md +1 -1
  55. package/runtime/skills/okstra-coding-preflight/SKILL.md +69 -0
  56. package/runtime/skills/okstra-coding-preflight/architecture/hexagonal.md +116 -0
  57. package/runtime/skills/okstra-coding-preflight/clean-code.md +254 -0
  58. package/runtime/skills/okstra-coding-preflight/languages/java.md +64 -0
  59. package/runtime/skills/okstra-coding-preflight/languages/javascript-typescript.md +87 -0
  60. package/runtime/skills/okstra-coding-preflight/languages/kotlin.md +69 -0
  61. package/runtime/skills/okstra-coding-preflight/languages/nodejs.md +66 -0
  62. package/runtime/skills/okstra-coding-preflight/languages/python.md +179 -0
  63. package/runtime/skills/okstra-coding-preflight/languages/rust.md +105 -0
  64. package/runtime/skills/okstra-coding-preflight/languages/sql.md +68 -0
  65. package/runtime/skills/okstra-context-loader/SKILL.md +12 -6
  66. package/runtime/skills/okstra-convergence/SKILL.md +8 -8
  67. package/runtime/skills/okstra-inspect/SKILL.md +100 -1
  68. package/runtime/skills/okstra-report-writer/SKILL.md +27 -23
  69. package/runtime/skills/okstra-run/SKILL.md +3 -1
  70. package/runtime/skills/okstra-team-contract/SKILL.md +8 -5
  71. package/runtime/templates/reports/final-report.template.md +188 -187
  72. package/runtime/templates/reports/i18n/en.json +4 -4
  73. package/runtime/templates/reports/i18n/ko.json +4 -4
  74. package/runtime/templates/reports/implementation-planning-input.template.md +1 -1
  75. package/runtime/templates/reports/release-handoff-input.template.md +1 -1
  76. package/runtime/templates/reports/user-response.template.md +1 -1
  77. package/runtime/templates/worker-prompt-preamble.md +4 -4
  78. package/runtime/validators/lib/fixtures.sh +2 -2
  79. package/runtime/validators/validate-implementation-plan-stages.py +9 -9
  80. package/runtime/validators/validate-report-views.py +10 -10
  81. package/runtime/validators/validate-run.py +36 -36
  82. package/runtime/validators/validate_improvement_report.py +8 -8
  83. package/src/_python-helper.mjs +3 -3
  84. package/src/context-cost.mjs +27 -0
  85. package/src/install.mjs +1 -0
  86. package/src/uninstall.mjs +1 -0
@@ -24,7 +24,7 @@ profile document.
  - Create, modify, or delete only inside `<PROJECT_ROOT>/.okstra/**` unless the brief verbatim requests a specific non-okstra edit. The phase performing that edit must quote the user instruction in its report. Implementation source edits also require the approved implementation plan.
  - Authority & permissions assumption (applies to every okstra task-type):
  - **Assume the user (and their team) holds full authority and every permission required for the anticipated, in-flight, or follow-up work.** Treat external approvals, third-party access grants, role/IAM permissions, organisational sign-off, legal/compliance review, vendor coordination, and "verify access exists" steps as already satisfied unless the task brief explicitly states otherwise.
- - Do NOT add such items to routing decisions, missing-materials lists, clarification questions, option trade-offs, dependency/migration risk, validation checklists, rollout plans, acceptance blockers, residual risks, release recommendations, the `## 5. Clarification Items` table, or any day/effort estimate. They are not legitimate sources of schedule extension.
+ - Do NOT add such items to routing decisions, missing-materials lists, clarification questions, option trade-offs, dependency/migration risk, validation checklists, rollout plans, acceptance blockers, residual risks, release recommendations, the `## 1. Clarification Items` table, or any day/effort estimate. They are not legitimate sources of schedule extension.
  - Internal okstra phase handoffs (e.g. the `User Approval Request` block in `implementation-planning`) are unaffected — those are the user themselves approving and proceed without external coordination.
  - This rule does NOT relax any phase-specific Forbidden actions list; safety rules in the per-profile document remain in force regardless of the user's authority.
  - Anti-escalation rule (shared):
@@ -39,11 +39,11 @@ profile document.
  - Run-end team teardown (shared — runs AFTER Phase 7 persistence/token collection, BEFORE the pane disposition step below):
  - The lead created the worker team in Phase 3 (`TeamCreate(team_name: "okstra-<task-key>")`). Worker teammates are NOT reclaimed on their own — without an explicit teardown they linger in the FleetView roster across this and later runs in the session. The lead MUST release them once the run's work is done.
  - This step is **automatic and silent** — NO user prompt (workers are idle sessions that have already delivered their results; there is nothing for the user to preserve). It runs only when team-state's `teamCreate.status == "ok"` (Teams mode was actually used); in the no-`team_name` fallback there is no team to delete, so silent-skip.
+ - Why a reconcile step exists: each worker clears its own `isActive` flag in `~/.claude/teams/<team>/config.json` when its `Agent()` dispatch returns, so by Phase 7 every worker is normally already inactive and `TeamDelete()` succeeds immediately. The one failure mode is a worker whose tmux pane died WITHOUT clearing the flag (e.g. killed mid-turn): it stays `isActive: true`, and `TeamDelete` then refuses the entire team with an "active members" error that no amount of re-sending `shutdown_request` can clear — the addressee is already gone. `okstra-team-reconcile.sh` deterministically flips exactly those dead-pane stale-active members to inactive (never a live-pane member, never the lead).
  - Sequence (token-usage collection MUST already be complete — `TeamDelete` removes `~/.claude/teams/<team>/` + `~/.claude/tasks/<team>/` but NOT the `~/.claude/projects/` jsonls Phase 7 reads, yet the read MUST precede teardown):
- 1. Read `~/.claude/teams/okstra-<task-key>/config.json` and, for every `members` entry whose name is not the lead, `SendMessage(to: <name>, message: { type: "shutdown_request" })` to terminate it gracefully.
- 2. These workers already delivered their results and terminated when their `Agent()` dispatch returned (the lead's completion evidence is the returned output + the existing result/final-report file, not a teardown ack) a terminated session emits NO shutdown confirmation. Treat `shutdown_request` as best-effort (fire-and-forget); the lead MUST NOT block waiting for acks from addressed teammates. Proceed immediately to step 3.
- 3. Call `TeamDelete()` the single synchronization point for teardown. If it errors with an active-members message, one teammate is genuinely still shutting down: wait briefly, retry `TeamDelete()` once, then proceed regardless of the result. NEVER loop or re-send `shutdown_request`; teardown must never block run completion once the work and final report already exist.
- - Report it in one short line (e.g. `worker 6명 종료 + 팀 해제`) and proceed. Emit `PROGRESS: phase-7-teardown disbanding team` immediately before step 1.
+ 1. Run `$HOME/.okstra/bin/okstra-team-reconcile.sh "okstra-<task-key>"` exactly once. It flips dead-pane stale-active members to inactive, and no-ops when tmux is unavailable or nothing is stale. Do NOT loop it.
+ 2. Call `TeamDelete()` the single synchronization point for teardown. If it STILL errors with an active-members message, one worker pane is genuinely still live (rare at Phase 7, since every `Agent()` dispatch has already returned): send that one member a structured `SendMessage(to: <name>, message: { type: "shutdown_request" })` — the `message` MUST be the object literal shown, NEVER a JSON string stuffed into a text field (a stringified payload is delivered as a plain message and the shutdown protocol never fires) — wait briefly, then retry `TeamDelete()` once and proceed regardless of the result. NEVER loop, never use `TaskStop` (teammates are not background tasks — `TaskStop` 404s on a member address), and never let teardown block run completion once the work and final report already exist.
+ - Report it in one short line (e.g. `stale 멤버 1명 정리 + 해제`, or just `worker 해제` when nothing was stale) and proceed. Emit `PROGRESS: phase-7-teardown disbanding team` immediately before step 1.
  - Phase wrap-up — okstra pane disposition (shared, MUST be the *last* step before returning control to the user):
  - At run end the only residual okstra panes are the LAST phase's (e.g. the `report-writer-worker` agent pane and any codex/gemini trace pane). `okstra-trace-cleanup.sh --list --run-dir "<RUN_DIR>"` returns one tab-separated `<pane_id>\t<pane_title>` line per residual okstra pane (worker-agent + trace) for this run.
  - When `<RUN_DIR>/state/lead-pane.id` is non-empty, after the final-report file has been written and the routing recommendation has been issued, the lead MUST run `$HOME/.okstra/bin/okstra-trace-cleanup.sh --list --run-dir "<RUN_DIR>"` exactly once. The output lists every residual okstra pane (worker-agent + trace) for this run, never the lead's own pane.
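The reconcile rule added in this hunk (flip only dead-pane members that are still marked active, never a live-pane member, never the lead) can be sketched as below. This is an illustrative sketch, not the contents of `team_reconcile.py`: the `members` list and the `isActive` flag come from the text above, while the `name` and `paneId` keys, the `lead_name` argument, and the `live_panes` set are assumptions made for the example.

```python
from copy import deepcopy


def reconcile_stale_members(config, lead_name, live_panes):
    """Return (new_config, flipped_names) with dead-pane stale-active members set inactive.

    Assumed shape: config["members"] is a list of dicts carrying "name",
    "isActive", and a hypothetical "paneId" key. live_panes is the set of pane
    ids tmux currently reports, or None when tmux is unavailable.
    """
    if live_panes is None:
        # tmux unavailable: behave as a no-op, as the hunk describes.
        return config, []

    updated = deepcopy(config)
    flipped = []
    for member in updated.get("members", []):
        name = member.get("name")
        if name == lead_name:
            continue  # never touch the lead
        if not member.get("isActive"):
            continue  # already inactive, nothing to reconcile
        if member.get("paneId") in live_panes:
            continue  # live pane: left for the shutdown_request fallback
        member["isActive"] = False
        flipped.append(name)
    return updated, flipped


example = {
    "members": [
        {"name": "lead", "isActive": True, "paneId": "%0"},
        {"name": "claude-worker", "isActive": True, "paneId": "%1"},
        {"name": "codex-worker", "isActive": True, "paneId": "%2"},
    ]
}
new_config, flipped = reconcile_stale_members(example, "lead", {"%0", "%1"})
```

Returning a copy rather than mutating in place keeps the sketch easy to test; a real script would also have to write the result back to `config.json`.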
@@ -61,14 +61,14 @@ profile document.
  - **Reporter confirmation precondition (BLOCKING)**: the brief's frontmatter carries `reporter-confirmations: <complete | partial | pending | skipped>` set by `okstra-brief` Step 6.5. Every phase that consumes the brief MUST read this field before doing analysis. The handling matrix is:
  - `complete` → proceed normally.
  - `partial` → proceed; treat still-unmarked `intent-check:` / `conversion-block:` rows as the `skipped` branch.
- - `skipped` → do NOT silently infer the missing answers. Promote each unmarked `intent-check:` / `conversion-block:` row into this run's `## 5. Clarification Items` as `Kind=decision`. Use `Blocks=approval` in `implementation-planning`, where the row gates the User Approval Request; otherwise use `Blocks=next-phase`. The recommended answer is drawn from the brief's matching content and clearly labelled `보고자 직접 확인 권장`.
- - `pending` (or field missing) → ABORT analysis; render the Verdict Card with `Verdict Token = blocked` + `Direction = hold` and write a single `## Reporter Confirmation Required` block (no leading number) summarising which rows are pending. The `## 5. Clarification Items` table carries one row per pending item with `Blocks=approval` in `implementation-planning`, otherwise `Blocks=next-phase`. The operator must rerun `okstra-brief` Step 6.5. Do NOT emit `## 0.` for this case — Section 0 is reserved for clarification-response carry-in only.
+ - `skipped` → do NOT silently infer the missing answers. Promote each unmarked `intent-check:` / `conversion-block:` row into this run's `## 1. Clarification Items` as `Kind=decision`. Use `Blocks=approval` in `implementation-planning`, where the row gates the User Approval Request; otherwise use `Blocks=next-phase`. The recommended answer is drawn from the brief's matching content and clearly labelled `보고자 직접 확인 권장`.
+ - `pending` (or field missing) → ABORT analysis; render the Verdict Card with `Verdict Token = blocked` + `Direction = hold` and write a single `## Reporter Confirmation Required` block (no leading number) summarising which rows are pending. The `## 1. Clarification Items` table carries one row per pending item with `Blocks=approval` in `implementation-planning`, otherwise `Blocks=next-phase`. The operator must rerun `okstra-brief` Step 6.5. Do NOT emit `## 0.` for this case — Section 0 is reserved for clarification-response carry-in only.
  `[CONFIRMED <YYYY-MM-DD> → RC-N]` markers on `Open Questions` rows are the per-row signal that the reporter has answered; their answers live verbatim under `## Reporter Confirmations` in the brief.
  - `Source Material` is reporter-verbatim. Do NOT paraphrase, summarize, reorder, or restructure it. Quote it directly when needed.
  - `Augmentation` entries carry one of four labels — `evidence-link`, `format-conversion`, `terminology-mapping`, `intent-inference`. Treat them as follows:
  - `evidence-link` / `format-conversion` → trust without re-verification.
  - `terminology-mapping` → verify against `<PROJECT_ROOT>/.okstra/glossary.md` (authoritative); raise a `Clarification Items` row if the mapping is missing or contradicts the glossary.
- - `intent-inference` → treat as an **unverified hypothesis**. Every `intent-inference` augmentation MUST be paired in the brief with an `Open Questions` row prefixed `intent-check:`. Promote that row into the run's `## 5. Clarification Items` table as `Kind=decision, Blocks=next-phase` (or `Blocks=approval` for `implementation-planning`) with the recommended answer set to "보고자에게 직접 확인 후 응답" unless the codebase can be inspected to confirm or refute the inference.
+ - `intent-inference` → treat as an **unverified hypothesis**. Every `intent-inference` augmentation MUST be paired in the brief with an `Open Questions` row prefixed `intent-check:`. Promote that row into the run's `## 1. Clarification Items` table as `Kind=decision, Blocks=next-phase` (or `Blocks=approval` for `implementation-planning`) with the recommended answer set to "보고자에게 직접 확인 후 응답" unless the codebase can be inspected to confirm or refute the inference.
  - `Open Questions` row prefixes are signals — do not strip them when promoting:
  - `intent-check:` → `Kind=decision`, recommended answer = reporter confirmation. NEVER silently resolve an `intent-check:` by inference at this layer.
  - `terminology:` → `Kind=decision`, recommended answer = canonical term from `<PROJECT_ROOT>/.okstra/glossary.md` (or "extend okstra glossary via brief Step 4.5").
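The `reporter-confirmations` handling matrix in this hunk reduces to a small decision function. The sketch below is one possible reading of it, assuming the four documented field values; the function name and the keys of the returned dict are hypothetical and exist only for illustration.

```python
def plan_brief_handling(reporter_confirmations, task_type):
    """Map the brief's reporter-confirmations value to the action the hunk describes."""
    # Blocks=approval only inside implementation-planning, otherwise next-phase.
    blocks = "approval" if task_type == "implementation-planning" else "next-phase"
    # A missing field is handled exactly like `pending`.
    status = reporter_confirmations or "pending"

    if status == "complete":
        return {"action": "proceed", "promote_unmarked_rows": False, "blocks": None}
    if status in ("partial", "skipped"):
        # Unmarked intent-check / conversion-block rows become Kind=decision rows.
        return {"action": "proceed", "promote_unmarked_rows": True, "blocks": blocks}
    if status == "pending":
        return {
            "action": "abort",
            "verdict_token": "blocked",
            "direction": "hold",
            "blocks": blocks,
        }
    raise ValueError(f"unknown reporter-confirmations value: {status!r}")


decision = plan_brief_handling("skipped", "implementation-planning")
```

Treating `partial` and `skipped` identically follows the matrix, which sends still-unmarked rows of a `partial` brief down the `skipped` branch.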
@@ -77,22 +77,22 @@ profile document.
  - `general:` → free-form; classify per the standard `Clarification Items` rules.
  - Any decision in this run that contradicts the brief's `Source Material` must be raised back to the reporter via a `Clarification Items` row; it must NOT be silently overridden. Disagreement with the reporter is allowed only after the row is resolved.
  - This contract is the single authority on brief consumption. Phase-specific addenda may *tighten* these rules but may not relax them.
- - Clarification request policy (shared — applies whenever a profile uses `## 5. Clarification Items`):
- - **Canonical column schema (SSOT — must match `templates/reports/final-report.template.md` §5.1 exactly):** every `## 5. Clarification Items` table has exactly these 8 columns, in this order:
+ - Clarification request policy (shared — applies whenever a profile uses `## 1. Clarification Items`):
+ - **Canonical column schema (SSOT — must match `templates/reports/final-report.template.md` §1 exactly):** every `## 1. Clarification Items` table has exactly these 8 columns, in this order:
  `| ID | Ticket ID | Kind | Statement | Expected form | Blocks | Status | User input |`.
  Profile-specific addenda may tighten cell content but MUST NOT add, remove, rename, or reorder columns. The `ID` cell uses `C-NNN` (3-digit zero-padded), the `Status` cell ∈ `{open, answered, resolved, obsolete}`, and the `Kind` / `Blocks` legal values are listed below.
- - section 5 is a **single unified table** per `final-report-template.md`. Every clarification item — whether the user must attach a file, choose between options, or supply a single number/path — is one row of that table. Do not split it into sub-sections (`5.1 추가 자료 요청` / `5.2 사용자 확인 질문` / `4.5.9 Open Questions` are removed and the validator fails reports that reintroduce them), do not create a parallel table elsewhere in the report, and do not duplicate the same item into the top-of-report `User Approval Request (사용자 승인 게이트)` block or any other section.
+ - section 1 is a **single unified table** per `final-report-template.md`. Every clarification item — whether the user must attach a file, choose between options, or supply a single number/path — is one row of that table. Do not split it into sub-sections (`1.1 추가 자료 요청` / `1.2 사용자 확인 질문` / `5.5.9 Open Questions` are removed and the validator fails reports that reintroduce them), do not create a parallel table elsewhere in the report, and do not duplicate the same item into the top-of-report `User Approval Request (사용자 승인 게이트)` block or any other section.
  - each row's `Kind` column picks one of `{material, decision, data-point}`: `material` for files / snapshots / logs / screenshots the user must attach (the `User input` cell will hold a path or URL); `decision` for choices and yes/no confirmations only the user can make; `data-point` for a single number, ID, date, or short string the user can answer inline. Items that mix "yes/no + file path if yes" are one row of `Kind=material` with the combined expectation written into `Expected form`.
  - each row's `Blocks` column picks one of `{approval, next-phase, none}`. `approval` is reserved for items that gate an approval action, especially the `implementation-planning` User Approval Request; outside `implementation-planning`, unresolved brief reporter-confirmation rows use `next-phase` instead. `next-phase` blocks the next run from starting cleanly. `none` is informational/audit-only.
  - write every entry in full, descriptive sentences that a non-developer can act on without further context. Avoid abbreviations and internal jargon. The `Statement` cell must state *what* is needed, *why* the answer / attachment changes the next step, and (for `material`) *where* the user can find it and *where* to place it. The `Expected form` cell must state the answer shape (예/아니오, 보기 중 하나, 숫자/날짜, 파일 경로, 짧은 서술 등); supply concrete option choices when applicable.
  - if a phase requires a recommended answer, alternatives, or an evidence-check note, encode it inside the existing 8-column schema: put evidence notes in `Statement` as `Evidence checked: <path:line>` or `Evidence checked: none — <human-only reason>`, and put recommendations/options in `Expected form` as `Recommended: <answer> — <rationale>; Alternatives: <options>`. Do not add `Recommended`, `Evidence`, `Alternatives`, or `evidence-checked` columns.
  - the same `final-report.md` file is the canonical artifact carried into the next run; the user appends answers inline before rerunning. The preferred turn-around is `scripts/okstra.sh --resume-clarification --task-key <project-id>:<task-group>:<task-id>` (opens the latest report in `$EDITOR`, then auto-reruns the same phase with `--clarification-response` carry-in). The lower-level form `--clarification-response <path>` remains available for scripted runs.
- - if a clarification response was carried in for this run, render the conditional `## 0. Clarification Response Carried In From Previous Run` section (the template's `RENDER_IF` guard activates it), walk every `C-*` row of the prior report's `## 5. Clarification Items` table, reconcile each one against new evidence, and update its `Status` to `resolved` or `obsolete` before issuing the next decision/verdict. When no carry-in path was provided, omit the `## 0.` heading entirely — the validator fails reports that emit an empty Section 0 stub (e.g. "No prior clarification response was provided for this run.").
+ - if a clarification response was carried in for this run, render the conditional `## 0. Clarification Response Carried In From Previous Run` section (the template's `RENDER_IF` guard activates it), walk every `C-*` row of the prior report's `## 1. Clarification Items` table, reconcile each one against new evidence, and update its `Status` to `resolved` or `obsolete` before issuing the next decision/verdict. When no carry-in path was provided, omit the `## 0.` heading entirely — the validator fails reports that emit an empty Section 0 stub (e.g. "No prior clarification response was provided for this run.").
  - Verdict Card (shared — applies to every final-report regardless of profile):
- - The top-of-report `## Verdict Card` block is mandatory in every final-report. Its `Verdict Token`, `Direction`, and `Next Step` cells MUST byte-match the corresponding cells in `## 2. Final Verdict` and the first item of `## 6. Recommended Next Steps`. The validator treats the card as a non-authoritative index — when card values diverge from the authoritative sections, the run is `contract-violated`.
- - Cross-worker traceability (shared — applies to every analysis worker output and to the lead's `## 1.` / `## 3.` tables in the final-report):
+ - The top-of-report `## Verdict Card` block is mandatory in every final-report. Its `Verdict Token`, `Direction`, and `Next Step` cells MUST byte-match the corresponding cells in `## 7. Final Verdict` and the first item of `## 3. Recommended Next Steps`. The validator treats the card as a non-authoritative index — when card values diverge from the authoritative sections, the run is `contract-violated`.
+ - Cross-worker traceability (shared — applies to every analysis worker output and to the lead's `## 6.` / `## 2.` tables in the final-report):
  - **Worker-side item IDs (free-form but unique within the worker).** Every row item in sections 1–5 (and any optional section 6) of an analysis worker's output MUST carry an item ID that is unique within that one worker's result file. The ID convention is the worker's choice — `F-001` / `F-002` per the suggested schema, `1.1` / `1.2` / `1.3` as Codex tends to use, or any other shape — but it MUST appear as the leading column of the row (for table-form items) or as a `[<ID>]` prefix (for bullet/numbered items). Workers that emit findings without IDs make cross-worker reconciliation impossible.
- - **Lead-side ID assignment + source preservation.** When the lead (or `report-writer-worker`) synthesises `## 1.1 Consensus` / `## 1.2 Differences` / `## 3.1 Primary Evidence` rows from worker outputs, the lead assigns a fresh `C-NNN` / `D-NNN` / `E-NNN` row ID. The `Source items` column (or, where the template still calls it `Supporting workers` / `Workers (position)` / `Source`, that same column) MUST list every contributing worker:item pair (e.g. `claude:F-001, codex:1.1, gemini:F-3`) so a reviewer can trace the synthesised row back to each worker's original wording without re-reading every worker-results file. Bare worker names without item IDs (e.g. `claude, codex, gemini`) are deprecated for these tables; the validator does not yet fail on them but the readability pass treats it as a contract violation.
+ - **Lead-side ID assignment + source preservation.** When the lead (or `report-writer-worker`) synthesises `## 6.1 Consensus` / `## 6.2 Differences` / `## 2.1 Primary Evidence` rows from worker outputs, the lead assigns a fresh `C-NNN` / `D-NNN` / `E-NNN` row ID. The `Source items` column (or, where the template still calls it `Supporting workers` / `Workers (position)` / `Source`, that same column) MUST list every contributing worker:item pair (e.g. `claude:F-001, codex:1.1, gemini:F-3`) so a reviewer can trace the synthesised row back to each worker's original wording without re-reading every worker-results file. Bare worker names without item IDs (e.g. `claude, codex, gemini`) are deprecated for these tables; the validator does not yet fail on them but the readability pass treats it as a contract violation.
  - **Why this matters.** A real run had `claude=F-1..F-11`, `codex=1.1..1.8`, `gemini=F-3..F-9` — three incompatible ID schemes. When the lead synthesised `C-1..C-8`, the link from `C-3` back to "which sentence in which worker file" was lost. Source-item preservation restores that link without forcing every worker to adopt a single ID prefix, which would over-constrain worker output style.
  - Audit sidecar (shared — applies to every analysis-worker output and every final-report):
  - Reading Confirmation lines (one short line per input file confirming end-to-end reading) live in the **worker audit sidecar** at `runs/<task-type>/worker-results/<worker>-audit-<task-type>-<seq>.md`, NOT in the worker's main worker-results file. The worker-results body starts at section 1 (Findings). The validator fails worker-results files that contain a `## 0. Reading Confirmation` heading.
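The canonical 8-column schema for the `Clarification Items` table lends itself to a mechanical check. The following is a minimal sketch of such a check, not the packaged validator: column names, the `C-NNN` ID shape, and the legal `Kind` / `Blocks` / `Status` values are taken from the hunk above, while the function names and the naive pipe-splitting parser (which does not handle `|` inside a cell or an empty cell written as `||`) are assumptions.

```python
import re

COLUMNS = ["ID", "Ticket ID", "Kind", "Statement", "Expected form",
           "Blocks", "Status", "User input"]
KINDS = {"material", "decision", "data-point"}
BLOCKS = {"approval", "next-phase", "none"}
STATUSES = {"open", "answered", "resolved", "obsolete"}
ID_PATTERN = re.compile(r"C-\d{3}")  # C-NNN, 3-digit zero-padded


def split_row(line):
    """Split one markdown table line into stripped cell strings (naive parser)."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def check_clarification_table(lines):
    """Return a list of problems for a table given as header, separator, data rows."""
    if split_row(lines[0]) != COLUMNS:
        return ["header must be exactly: " + " | ".join(COLUMNS)]
    problems = []
    for number, line in enumerate(lines[2:], start=1):  # lines[1] is the separator
        cells = split_row(line)
        if len(cells) != len(COLUMNS):
            problems.append(f"row {number}: expected {len(COLUMNS)} cells, got {len(cells)}")
            continue
        row = dict(zip(COLUMNS, cells))
        if not ID_PATTERN.fullmatch(row["ID"]):
            problems.append(f"row {number}: ID {row['ID']!r} is not C-NNN")
        if row["Kind"] not in KINDS:
            problems.append(f"row {number}: illegal Kind {row['Kind']!r}")
        if row["Blocks"] not in BLOCKS:
            problems.append(f"row {number}: illegal Blocks {row['Blocks']!r}")
        if row["Status"] not in STATUSES:
            problems.append(f"row {number}: illegal Status {row['Status']!r}")
    return problems
```

A header mismatch short-circuits the check because added, removed, renamed, or reordered columns make every later cell lookup unreliable.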
@@ -30,7 +30,7 @@ are collected and convergence finished. Phase 1-5 do not need it.
  - **Feature-flag-gated changes**: confirm the off-switch path was exercised in this run's validation evidence (i.e. one of the validation commands ran with the flag off and succeeded). A plan that ships a flag without exercising the off-path does NOT satisfy this requirement.
  - **Schema migrations, config-format changes, or any change with persisted state**: a **dry-run of the rollback step is mandatory**, not preferred. Record the exact rollback command and its captured exit code / stdout. If the migration tool offers no dry-run mode (`--dry-run`, `--plan`, equivalent), the executor MUST refuse to claim rollback verification and instead end the run with a routing recommendation back to `implementation-planning` for a safer rollback strategy. Skipping this step on a stateful change is treated as a `contract-violated` outcome by `final-verification`.
  - **Routing recommendation for `final-verification`**: brief note on whether the changes are ready for final-verification phase or need a new error-analysis / planning loop first.
- - **Follow-up tasks (Section 7 of the final report)**: every item discovered during this run that was *not* delivered MUST appear in the final report's `## 7. Follow-up Tasks (후속 작업)` table with a concrete `Origin`, `New Task ID`, `Suggested task-type`, `Scope`, and `Reason / Why deferred`. Sources include: out-of-scope discoveries that the executor consciously chose not to fold into this run, verifier concerns the executor declined to fix in-place, scope-boundary items from the approved plan that turned out to need their own ticket, and any unresolved `## 5. Clarification Items` row carried over from the approved plan (`Status` ∈ `{open, answered}` at approval time). An empty section is acceptable but only when expressed as the single line `- 후속 작업 없음.` — silence is treated as a contract violation. Rows with `Auto-spawn? = yes` will be materialised by `scripts/okstra-spawn-followups.py` in Phase 7; rows with `Auto-spawn? = no` MUST also appear in `Section 6. Recommended Next Steps` so the user knows to act manually.
+ - **Follow-up tasks (Section 4 of the final report)**: every item discovered during this run that was *not* delivered MUST appear in the final report's `## 4. Follow-up Tasks (후속 작업)` table with a concrete `Origin`, `New Task ID`, `Suggested task-type`, `Scope`, and `Reason / Why deferred`. Sources include: out-of-scope discoveries that the executor consciously chose not to fold into this run, verifier concerns the executor declined to fix in-place, scope-boundary items from the approved plan that turned out to need their own ticket, and any unresolved `## 1. Clarification Items` row carried over from the approved plan (`Status` ∈ `{open, answered}` at approval time). An empty section is acceptable but only when expressed as the single line `- 후속 작업 없음.` — silence is treated as a contract violation. Rows with `Auto-spawn? = yes` will be materialised by `scripts/okstra-spawn-followups.py` in Phase 7; rows with `Auto-spawn? = no` MUST also appear in `Section 3. Recommended Next Steps` so the user knows to act manually.
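The follow-up rules in the changed line above (required columns, the single-line empty marker, and manual rows that must be repeated under Recommended Next Steps) can be expressed as a short check. This is a sketch under stated assumptions: rows are modelled as plain dicts keyed by the column names quoted in the text, and the function name and its three arguments are hypothetical.

```python
EMPTY_MARKER = "- 후속 작업 없음."  # the only accepted form of an empty section
REQUIRED = ("Origin", "New Task ID", "Suggested task-type", "Scope",
            "Reason / Why deferred")


def check_followup_section(rows, body_lines, next_steps_text):
    """Return problems for the Follow-up Tasks section.

    rows: parsed table rows as dicts; body_lines: raw lines of the section body
    (used only when there are no rows); next_steps_text: text of the
    Recommended Next Steps section.
    """
    if not rows:
        non_blank = [line.strip() for line in body_lines if line.strip()]
        if non_blank != [EMPTY_MARKER]:
            return ["empty section must be the single line " + EMPTY_MARKER]
        return []

    problems = []
    for row in rows:
        for column in REQUIRED:
            if not row.get(column, "").strip():
                problems.append(f"missing {column}")
        task_id = row.get("New Task ID", "")
        # Rows that are not auto-spawned must be surfaced for manual action.
        if row.get("Auto-spawn?") == "no" and task_id not in next_steps_text:
            problems.append(f"{task_id} not listed in Recommended Next Steps")
    return problems
```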
 
  ## Self-review pass before finalising the report (`Claude lead` runs this; do not delegate to a generic subagent)
 
@@ -19,15 +19,15 @@ until Phase 5 ends, then drop from active context for Phase 6/7.
  ## Pre-implementation context exploration (executor before first edit)
 
  - **Coding-conventions preflight (BLOCKING — runs before the first `Edit` / `Write`, and binds the TDD loop below):** load the applicable coding conventions for every language the diff will touch, then state in ONE line which conventions apply (e.g. `Applying TS + hexagonal overlay; domain at src/domains/*/domain/`). Lint/test green is necessary but NOT sufficient — self-mocked tests, interaction-only assertions, and untruthful names all pass a green pipeline; this gate is what keeps them out of the diff.
- - **Language-specific rules load per situation — never inline them here.** Detect each touched file's language (extension / project manifest) and load the matching reference from the project's coding-conventions skill: `coding-preflight`, when installed, routes `languages/<lang>.md` (mock/spy API, idioms, test framework) + `clean-code.md` + any `architecture/*` overlay. For a ports-and-adapters / NestJS-hex layout (`domain/` + `ports/` + `adapters/`, `*.port.*`), load the hexagonal overlay too. This per-language split is the skill's job — the executor does not carry a multi-language block in context.
22
+ - **Language-specific rules load per situation — never inline them here.** Detect each touched file's language (extension / project manifest) and load the matching reference by reading okstra's installed coding-conventions files directly at `~/.claude/skills/okstra-coding-preflight/` (placed there by `okstra install`): read `languages/<lang>.md` (mock/spy API, idioms, test framework) + `clean-code.md` + any `architecture/*` overlay via the Read tool by absolute path. The skill is `user-invocable: false`, so do NOT rely on Skill-tool auto-invocation — read the files directly. For a ports-and-adapters / NestJS-hex layout (`domain/` + `ports/` + `adapters/`, `*.port.*`), load the hexagonal overlay too. This per-language split is the skill's job — the executor does not carry a multi-language block in context.
23
23
  - **Language-agnostic principles that ALWAYS bind (the TDD loop below MUST satisfy them):** (1) no self-mocking of the SUT — stub/spy only injected collaborators, never the subject's own methods; (2) behavioral assertions on outcomes (return value, state, persisted rows, events, boundary calls) — never `toHaveBeenCalled*` on an internal helper as the only/primary assertion; (3) truthful names — a `get*` / `find*` that writes/inserts, or a name encoding the caller's use-case (`*ForInit`) or hiding a domain rule (`findValid*`), is a defect; (4) single-purpose functions ≤50 effective lines, plain-English readability.
24
- - **Graceful degradation (end-user, or codex / gemini executor runtimes where no coding-conventions skill is reachable):** do NOT skip the gate — apply the agnostic principles above plus the project's own `CLAUDE.md` / `CONTRIBUTING` / formatter+lint config, and record `coding-conventions: skill-unavailable → applied <project rules + agnostic principles>` in the final report. Never claim a skill read that did not happen.
24
+ - **Graceful degradation (codex / gemini executor runtimes, or any runtime where the `~/.claude/skills/okstra-coding-preflight/` files are absent or unreadable):** do NOT skip the gate — apply the agnostic principles above plus the project's own `CLAUDE.md` / `CONTRIBUTING` / formatter+lint config, and record `coding-conventions: skill-unavailable → applied <project rules + agnostic principles>` in the final report. Never claim a skill read that did not happen.
25
25
  - **Mandatory TDD loop**: BEFORE the first `Edit` or `Write` call, the executor MUST apply a red-green-refactor loop for every code change in this run. This is required; skipping it is a `contract-violated` outcome. This governs HOW each step is executed (failing test first → minimal implementation → refactor); it does not override the approved plan's WHAT/file scope.
26
26
  - Order of operations per plan step: (1) write/extend the test that captures the step's acceptance criterion and confirm it fails for the right reason, (2) commit the failing test (`test(<scope>): ...`), (3) implement the minimum change to make it pass, (4) commit the implementation (`feat|fix(<scope>): ...`), (5) refactor without changing behaviour and commit separately if any cleanup is made (`refactor(<scope>): ...`). The failing-then-passing transition between steps (2) and (4) is the `TDD evidence` required by the final report.
27
27
  - Doc-only / config-only / pure-rename steps that have no observable runtime behaviour are exempt from the failing-test requirement, but the executor MUST cite the exemption per step in the final report (`TDD exemption: <reason>`).
28
28
  - When the touched area has no existing test harness, the executor MUST stand up the minimum harness needed to host one regression test for this run rather than skipping TDD entirely. Record the harness-bootstrap step as an `Out-of-plan edit` if it is not in the plan.
29
29
  - **DB / IO / SQL changes require real execution — mock-only is NOT validation evidence:** when this run's diff touches DB/IO/SQL (ORM / query-builder code — sequelize / typeorm / prisma / knex / raw SQL — `*.repository.*`, model/entity files, `migrations/**`, `*.sql`, or any changed query string), a mocked unit test cannot observe the SQL the query builder actually emits — a mocked suite once passed while `count({ col: 'FontFamily.fontFamily' })` threw `Unknown column` on the real DB. The executor MUST run the change against a real (or faithful-replica) datastore — the `db-test` validation step (plan `validation` db step, else `project.json.qaCommands.db-test`), targeting a **local / replica** DB — and cite its exact command + exit code in the final report's `Validation evidence`. If no real DB / `db-test` command is reachable, do NOT claim the change verified: label the DB portion `정적 분석상 …, 미검증(실행 안 함)` in the report, surface it in the routing recommendation, and never downplay the real run as "too heavy". `git push` stays forbidden (universal list); the unverified DB state is carried forward so `final-verification` cannot accept it and `release-handoff` cannot push.
30
- - re-read the approved plan end-to-end and parse the `## 4.5 Stage Map`. Read the **Stage batch** injected in the launch prompt (`Stage batch for this implementation run`): it lists the stage numbers this run owns, ascending. The runtime already selected and reserved this batch — do NOT recompute the start stage from `consumers.jsonl`.
30
+ - re-read the approved plan end-to-end and parse the `## 5.5 Stage Map`. Read the **Stage batch** injected in the launch prompt (`Stage batch for this implementation run`): it lists the stage numbers this run owns, ascending. The runtime already selected and reserved this batch — do NOT recompute the start stage from `consumers.jsonl`.
31
31
  - for each stage in the batch, load every `runs/<plan-key>/carry/stage-<i>.json` for `i ∈ depends-on(stage)` and inject them into the executor's working context as "runtime carry-in". For `depends-on (none)` stages, no sidecar load — task-brief only.
32
32
  - the batch's stages are mutually independent (each one's `depends-on` are all already `status:done`, never another batch member), so execute them in ascending order; each stage's file list, step order, Stage Validation commands, Stage Exit Contract, and rollback path are the authoritative scope for that stage.
33
33
  - inspect the current state of every file the plan names; if any file has changed materially since the plan was written, stop and route to a new `implementation-planning` run instead of editing speculatively
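The carry-in rule in this hunk (load every `carry/stage-<i>.json` for the stages a batch member depends on, and nothing for `depends-on (none)` stages) might be sketched as below. This is an illustration only, not okstra's code: the function name `load_carry_in`, the `depends_on` mapping shape, and the `run_dir` argument (standing in for `runs/<plan-key>`) are assumptions; only the sidecar path pattern comes from the text.

```python
import json
from pathlib import Path


def load_carry_in(run_dir: Path, depends_on: dict, batch: list) -> dict:
    """Return, per batch stage, the parsed sidecars of the stages it depends on.

    Hypothetical helper: `depends_on` maps a stage number to the list of
    stage numbers it depends on (an empty list means `depends-on (none)`).
    """
    carry = {}
    for stage in sorted(batch):  # batch members are executed in ascending order
        sidecars = []
        for dep in depends_on.get(stage, []):
            path = run_dir / "carry" / f"stage-{dep}.json"
            # A missing sidecar raises FileNotFoundError rather than being skipped.
            sidecars.append(json.loads(path.read_text(encoding="utf-8")))
        carry[stage] = sidecars  # empty list: task-brief only, no sidecar load
    return carry
```

Raising on a missing sidecar is a design choice of this sketch; the source does not say how the runtime reacts when a depended-on stage has no carry file.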
@@ -66,7 +66,7 @@ The final report keeps both — executor's `Validation evidence` AND each verifi
66
66
  Re-running commands proves the diff *builds and passes*; it does NOT prove the diff is *well-designed*. Lint/test green is necessary but not sufficient — self-mocked tests, interaction-only assertions, and untruthful names all survive a green pipeline. This gate is the filter for exactly those defects, so the executor's design errors are caught here instead of in post-merge PR review. It is a real gate, not a checklist: it enumerates the full diff and a blocking hit forces `FAIL`.
67
67
 
68
68
  - **Scope (no silent sampling).** Enumerate every changed source/test file via `git diff --name-only <base>...HEAD` and review each one. Skipping a changed file silently is a `contract-violated` outcome. If a file's language has no reference and is not covered by the agnostic checks below, record `design-review skipped: <file> (language=<x> no reference)` — never pass it silently.
69
- - **Load the same conventions the executor used, per language.** For each touched language load the coding-conventions reference (`coding-preflight` `languages/<lang>.md` + `clean-code.md` + the hexagonal overlay when the layout matches); degrade to the agnostic checks below when no skill is reachable. The verifier does NOT inline language rules — it loads them per situation, identical to the executor preflight.
69
+ - **Load the same conventions the executor used, per language.** For each touched language load the coding-conventions reference by reading `~/.claude/skills/okstra-coding-preflight/languages/<lang>.md` + `clean-code.md` + the `architecture/hexagonal.md` overlay when the layout matches; degrade to the agnostic checks below when those files are not readable. The verifier does NOT inline language rules — it loads them per situation, identical to the executor preflight.
70
70
  - **Blocking checks (any hit → verdict `FAIL`, cited `path:line` + rule name, recommended fix recorded — the verifier does NOT apply it):**
71
71
  - **Self-mocking:** a test for `Foo` stubs/spies a method on the `Foo` instance under test (`jest.spyOn(sut, ...)`, `spyOn(FooService.prototype, ...)` in `foo.*.spec.*`, `vi.mocked(sut)` + stub). Mocking injected collaborators is fine.
72
72
  - **Interaction-only assertion:** a test whose only/primary assertion is `toHaveBeenCalled*` / `toHaveBeenCalledTimes` on an internal helper or a non-side-effecting collaborator, with no assertion on the returned value / resulting state / persisted row / emitted event.
@@ -83,7 +83,7 @@ A mocked unit test cannot observe the SQL a query builder actually emits — `co
83
83
  - **Requirement when fired.** The verifier MUST reproduce a real-DB execution: run the `db-test` tier (Tier 1 = plan `validation` db step; else Tier 2 = `project.json.qaCommands.db-test`) against a **local / replica** datastore (same engine + schema — never shared / staging / prod, consistent with the verifier forbidden-actions list) and record its exact command + exit code. A mock, an in-memory shim that does not parse real SQL, or static reasoning does NOT satisfy this.
84
84
  - **No `db-test` command available → blocking, not a passive skip.** If neither tier declares a `db-test` command, the verifier records the blocking finding `db-test not configured — DB change unverified (mock-only)` and sets the verdict to `FAIL`; it MUST NOT emit only the passive `qa-command not configured` note and pass. Recommended fix: declare a `db-test` command in `project.json.qaCommands` or the plan's validation set.
85
85
  - **Mock-only evidence → unverified.** If the diff's only DB coverage is mocked, the verifier labels the DB portion `정적 분석상 …, 미검증(실행 안 함)` (never `검증됨`), records it as a blocking finding, and sets `FAIL`. Never downplay the real run as "too heavy / static proof suffices".
86
- - **Surface it at every layer.** The finding is copied verbatim into the verifier result and MUST survive into the final report's `## 1.` and Verdict Card, so the user sees the DB-unverified state continuously — it is the load-bearing reason a downstream `final-verification` cannot reach `accepted` and `release-handoff` cannot push.
86
+ - **Surface it at every layer.** The finding is copied verbatim into the verifier result and MUST survive into the final report's `## 6.` and Verdict Card, so the user sees the DB-unverified state continuously — it is the load-bearing reason a downstream `final-verification` cannot reach `accepted` and `release-handoff` cannot push.
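The `db-test` lookup order described in these bullets (Tier 1 plan `validation` db step, else Tier 2 `project.json.qaCommands.db-test`, else a blocking finding) can be sketched as follows. The function name, the dictionary shapes, and the returned labels are assumptions made for illustration; the source only fixes the tier order and the fact that a missing command is blocking rather than a passive skip.

```python
from typing import Optional, Tuple


def resolve_db_test(plan_validation: dict, qa_commands: dict) -> Tuple[str, Optional[str]]:
    """Pick the db-test command by tier, or report that none is configured.

    Hypothetical shapes: both arguments map a category name to a command string.
    """
    tier1 = plan_validation.get("db-test")
    if tier1:
        return ("tier1", tier1)
    tier2 = qa_commands.get("db-test")
    if tier2:
        return ("tier2", tier2)
    # Neither tier declares a command: the verifier must record the blocking
    # finding and set the verdict to FAIL instead of passing with a note.
    return ("blocking", None)
```

A caller would run the returned command only against a local or replica datastore and record the exact command plus exit code, as the bullets require.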
87
87
 
88
88
  ## All-verifier-failure policy
89
89
 
@@ -25,7 +25,7 @@
25
25
  - uncertainty boundaries
26
26
  - practical next diagnostic steps
27
27
  - Clarification request policy (phase-specific addenda — shared policy is in `_common-contract.md`):
28
- - if any blocking uncertainty remains at the time of writing the final report, populate `## 5. Clarification Items` in `final-report-template.md` (a single unified table; `Blocks=next-phase` for items the next run cannot start without)
28
+ - if any blocking uncertainty remains at the time of writing the final report, populate `## 1. Clarification Items` in `final-report-template.md` (a single unified table; `Blocks=next-phase` for items the next run cannot start without)
29
29
  - prefer plain Korean over abbreviations (e.g. write "초당 평균 요청 수" instead of "QPS", "재현 절차" instead of "repro")
30
30
  - every clarification row carries a recommended answer + one-line rationale inside the `Expected form` cell; rows that lack a recommendation are rejected as half-formed.
31
31
  - **Codebase-first ambiguity resolution (defect rule)**: any ambiguity about repro, file behavior, or symbol semantics that can be answered by `Read` / `Grep` / log inspection MUST be resolved that way and recorded with file:line (or log-line) evidence. Writing a clarification row for something the codebase or shipped logs already answer is a defect of this phase.
@@ -29,7 +29,7 @@
29
29
  - if the cited implementation report is missing, lacks commits for delivered code changes, or the current checkout does not match the implementation report's commit list / diff summary, the run MUST end with status `blocked` and route back to `implementation` or `implementation-planning` rather than verifying an ambiguous target.
30
30
  - Required deliverable shape (final report, in addition to the standard sections):
31
31
  - **Source Implementation Report**: relative path of the originating `implementation` final-report file, the quoted commit list / diff summary used as the verification target, the worktree path inspected, and the base/head SHAs captured at run start. The lead injects this same target snapshot into every analyser prompt (`**Worktree:** / **Verification base ref:** / **Verification head SHA:** / **Verification diff stat:**`); a worker that cannot confirm its analysis ran against that exact head MUST record a `tool-failure` rather than verify an ambiguous target.
32
- - **Verdict vocabulary**: Section 2 (`Final Verdict`) MUST include a `Verdict Token` field whose value is exactly one of `accepted`, `conditional-accept`, or `blocked`. `conditional-accept` requires an explicit, exhaustive list of conditions; ambiguous verdicts ("looks good", "mostly ready") are not allowed. Each condition MUST be recorded as a row in the **Conditional Acceptance Conditions** deliverable (`id` `CA-NNN`, `condition`, `evidenceRequired`, `blocksReleaseHandoff`). The validator enforces verdict↔deliverable consistency: `accepted` ⇒ zero acceptance blockers, `blocked` ⇒ at least one, `conditional-accept` ⇒ at least one condition, and a `release-handoff` routing recommendation is allowed only when the verdict is `accepted`.
32
+ - **Verdict vocabulary**: Section 7 (`Final Verdict`) MUST include a `Verdict Token` field whose value is exactly one of `accepted`, `conditional-accept`, or `blocked`. `conditional-accept` requires an explicit, exhaustive list of conditions; ambiguous verdicts ("looks good", "mostly ready") are not allowed. Each condition MUST be recorded as a row in the **Conditional Acceptance Conditions** deliverable (`id` `CA-NNN`, `condition`, `evidenceRequired`, `blocksReleaseHandoff`). The validator enforces verdict↔deliverable consistency: `accepted` ⇒ zero acceptance blockers, `blocked` ⇒ at least one, `conditional-accept` ⇒ at least one condition, and a `release-handoff` routing recommendation is allowed only when the verdict is `accepted`.
33
33
  - **Acceptance Blockers block** (under section 4): one row per blocker with `id`, `severity` (`critical` / `major` / `minor`), evidence (file path, log excerpt, or test output), and the recommended follow-up phase (`error-analysis` or `implementation-planning`). Empty block is acceptable and preferred — render the single line `- No acceptance blockers found.`
34
34
  - **Residual Risk block** (under section 4): risks that are not blockers but should be tracked, each with mitigation owner and a trigger that would escalate them to a blocker.
35
35
  - **Validation Evidence**: for every requirement in the originating plan or task brief, cite the artifact (commit SHA, test output, log line, MCP SELECT result) that demonstrates coverage. Paraphrased "verified" claims without an artifact are rejected.
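The verdict and deliverable consistency rule stated under **Verdict vocabulary** in this hunk can be illustrated with a small check. This is a sketch under assumptions, not the actual validator: `check_verdict`, its parameters, and the error strings are hypothetical; the four rules themselves are taken from the text.

```python
ALLOWED_VERDICTS = {"accepted", "conditional-accept", "blocked"}


def check_verdict(verdict: str, blockers: int, conditions: int, routing: str) -> list:
    """Return a list of consistency errors (empty when the report is consistent)."""
    if verdict not in ALLOWED_VERDICTS:
        return [f"unknown verdict token: {verdict}"]
    errors = []
    if verdict == "accepted" and blockers != 0:
        errors.append("accepted requires zero acceptance blockers")
    if verdict == "blocked" and blockers < 1:
        errors.append("blocked requires at least one acceptance blocker")
    if verdict == "conditional-accept" and conditions < 1:
        errors.append("conditional-accept requires at least one condition")
    if routing == "release-handoff" and verdict != "accepted":
        errors.append("release-handoff routing requires verdict accepted")
    return errors
```

For example, `check_verdict("conditional-accept", 0, 1, "release-handoff")` reports a single routing error, because only `accepted` may route to `release-handoff`.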
@@ -37,7 +37,7 @@
37
37
  - **Two-tier command lookup (shared with `implementation`):** when this phase performs its own independent re-validation, the command source is exactly the same two tiers `implementation` verifiers use — Tier 1 is the originating task brief / approved plan's `validation` set, Tier 2 is `<PROJECT_ROOT>/.okstra/project.json` under `qaCommands`. Auto-detecting tools from manifest files is forbidden; missing tiers are recorded as `qa-command not configured: <category>` and do NOT trigger a guess. The `cmd` deny-list (`--fix`, `--write`, ` -w`, ` -u`, `--snapshot-update`, `INSTA_UPDATE=<not-no>`, `cargo update`, `npm install` without `ci`, etc.) is enforced identically. NOTE: runtime fail-fast validation (`okstra_ctl.qa_commands.validate_qa_commands`) only fires at `--task-type implementation` run-prep, so this phase MUST self-check each `qaCommands` entry against the deny-list before executing it — if a denied token is present, skip the command and record it as a `Read-only command log` line `qa-command rejected (denied token: <token>): <label>`.
38
38
  - **Routing recommendation**: the next safe phase — one of `release-handoff`, `done`, `error-analysis`, `implementation-planning` — tied to the verdict and blocker list. `release-handoff` is allowed ONLY when the Verdict Token is `accepted`.
39
39
  - Clarification request policy (phase-specific addendum — shared policy is in `_common-contract.md`):
40
- - populate `## 5. Clarification Items` only when a blocker hinges on information only the user can supply (deployment intent, intended target environment, business-rule interpretation); use `Blocks=next-phase` for items that gate continuing to release-handoff
40
+ - populate `## 1. Clarification Items` only when a blocker hinges on information only the user can supply (deployment intent, intended target environment, business-rule interpretation); use `Blocks=next-phase` for items that gate continuing to release-handoff
41
41
  - Self-review pass before finalising the report (`Claude lead` runs this; do not delegate to a generic subagent):
42
42
  1. **Verdict precision** — section 2 includes `Verdict Token` with one of the three allowed verdict tokens; `conditional-accept` lists every condition as an actionable item.
43
43
  2. **Blocker traceability** — every blocker cites a concrete artifact (file:line, log excerpt, test exit code, MCP SELECT). Blockers without evidence are demoted to residual risk or removed.
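The deny-list self-check required by the two-tier command lookup bullet in this hunk could look roughly like the sketch below. It covers only the tokens the text names (the source list ends in "etc."), uses naive substring matching, and reads "`npm install` without `ci`" as "`npm install` is denied, `npm ci` is allowed"; that reading, the function names, and the matching strategy are all assumptions.

```python
import re
from typing import Optional

# Tokens quoted from the text; the real deny-list is longer ("etc.").
DENIED_SUBSTRINGS = ("--fix", "--write", " -w", " -u", "--snapshot-update",
                     "cargo update", "npm install")
INSTA_UPDATE = re.compile(r"\bINSTA_UPDATE=(\S+)")


def denied_token(cmd: str) -> Optional[str]:
    """Return the first denied token found in `cmd`, or None if it looks clean."""
    for token in DENIED_SUBSTRINGS:
        if token in cmd:
            return token
    match = INSTA_UPDATE.search(cmd)
    if match and match.group(1) != "no":  # any value other than `no` is denied
        return match.group(0)
    return None


def rejection_line(token: str, label: str) -> str:
    """Format the `Read-only command log` line the text prescribes."""
    return f"qa-command rejected (denied token: {token.strip()}): {label}"
```

Substring matching is deliberately conservative: it may reject a harmless command that merely contains a denied token, which errs on the side of skipping rather than mutating the worktree.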
@@ -17,7 +17,7 @@
17
17
  - inspect the current state of every file the task names (or the closest matching files if names are stale) — record current responsibilities, public interfaces, and known coupling points
18
18
  - skim recent commits touching those files (`git log -- <path>`) to surface in-flight work or contested areas
19
19
  - **codebase-first ambiguity resolution**: any ambiguity that can be answered by `Read` / `Grep` MUST be resolved that way and recorded with file:line evidence. Only ambiguities that genuinely require a human decision are escalated as `Clarification Items` rows. Writing a clarification row for something the code already answers is a defect of this phase.
20
- - flag any requirement that is ambiguous, contradictory, or missing success criteria — register each one as a row in the report's `## 5. Clarification Items` table with `Blocks=approval` instead of guessing
20
+ - flag any requirement that is ambiguous, contradictory, or missing success criteria — register each one as a row in the report's `## 1. Clarification Items` table with `Blocks=approval` instead of guessing
21
21
  - read `<PROJECT_ROOT>/.okstra/glossary.md` and `<PROJECT_ROOT>/.okstra/decisions/` titles if present. Absent okstra memory files are the normal state — do not error. Treat the brief's `terminology:*` resolutions from `requirements-discovery` (if any) as authoritative; if missing, resolve any remaining fuzzy term as a `Blocks=approval` clarification row.
22
22
  - Primary focus areas:
23
23
  - requirement gaps
@@ -39,7 +39,7 @@
39
39
  - The YAML frontmatter `approved: true|false` field is the only authorised approval gate. report-writer always emits `approved: false`. The user clears it either by (a) editing the frontmatter line to `approved: true` directly, or (b) invoking the next phase with `--approve` so the CLI flips the frontmatter on the user's behalf. `okstra_ctl.run._validate_approved_plan` reads this field and refuses entry until it is `true`.
40
40
  - Cross-verification mode:
41
41
  - Phase 5.5 finding convergence runs in **adversarial mode** for this phase (`convergence.adversarial=true`). Verifiers actively try to refute each worker finding (requirement gap / risk / option) by re-inspecting its cited evidence; the burden of proof sits on the claim. See `skills/okstra-convergence/SKILL.md` §"Adversarial Verification Mode".
42
- - §4.5.9 plan-body verification runs with an **adversarial posture** (`skills/okstra-convergence/SKILL.md` §"Adversarial plan-body posture"): verifiers open and confirm every cited path / command and put the burden of proof on the plan. The gate threshold is unchanged — a *majority* `DISAGREE` (`majority-disagree`) is still required to block approval; a single dissent does not.
42
+ - §5.5.9 plan-body verification runs with an **adversarial posture** (`skills/okstra-convergence/SKILL.md` §"Adversarial plan-body posture"): verifiers open and confirm every cited path / command and put the burden of proof on the plan. The gate threshold is unchanged — a *majority* `DISAGREE` (`majority-disagree`) is still required to block approval; a single dissent does not.
43
43
  - **Coverage critic (opt-in)**: when `convergence.critic.enabled=true` (chosen via the okstra-run picker or `--critic`), a reused-worker critic pass runs after convergence to surface missed findings; its gaps are merged only after a 1-round adversarial reverify. See `skills/okstra-convergence/SKILL.md` "Coverage critic pass".
44
44
  - Non-goals:
45
45
  - code-level micro-optimization unless it changes the implementation approach
@@ -55,7 +55,7 @@
55
55
  - The final report MUST include section headings containing each of the following exact strings: `Option Candidates`, `Trade-off`, `Recommended Option`, `Stage Map`, `Stage Exit Contract`, `Stage Validation`, `Dependency`, `Validation Checklist`, `Rollback`. (Approval is no longer a body section — it is the YAML frontmatter `approved` field.)
56
56
  - Korean translations are allowed in parentheses (e.g. `### Recommended Option (권장 옵션)`), but the English keyword must be present verbatim in the heading line.
57
57
  - The shape and ordering follow `final-report-template.md` section 4.5 (`Implementation Plan Deliverables`). Do NOT translate the heading keywords — `validators/validate-run.py` does substring matching on the raw report text and 7-of-8 missing strings is a real, repeatedly observed failure mode (root cause: writer translated the headings to Korean).
58
- - Beyond substring matching, when the Plan Body Verification gate result is `passed` / `passed-with-dissent`, `validators/validate-run.py` runs the **structural** Stage Map validator (`validators/validate-implementation-plan-stages.py`) at the planning boundary — the exact `## 4.5 Stage Map` heading, each `## 4.5.<i> Stage <i>:` section with its four required subsections, the per-stage effective step count (≤6), and the `depends-on` DAG are all enforced here, not deferred to the `implementation` entry gate.
58
+ - Beyond substring matching, when the Plan Body Verification gate result is `passed` / `passed-with-dissent`, `validators/validate-run.py` runs the **structural** Stage Map validator (`validators/validate-implementation-plan-stages.py`) at the planning boundary — the exact `## 5.5 Stage Map` heading, each `## 5.5.<i> Stage <i>:` section with its four required subsections, the per-stage effective step count (≤6), and the `depends-on` DAG are all enforced here, not deferred to the `implementation` entry gate.
59
59
  - Required deliverable shape (final report, in addition to the standard sections):
60
60
  - at least two implementation options. **Each option must include**:
61
61
  - **File Structure**: an explicit list of files to create / modify / delete with each file's responsibility (one-line each). Use the form `Create: path — responsibility` / `Modify: path:line-range — change summary` / `Delete: path — reason`.
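The heading-keyword requirement earlier in this hunk (each English keyword must appear verbatim in a heading line, with Korean allowed only in parentheses) can be illustrated with a short check. The text says `validators/validate-run.py` does substring matching on the raw report text; the sketch below restricts the match to heading lines, which is an assumption, as is the function name.

```python
REQUIRED_KEYWORDS = (
    "Option Candidates", "Trade-off", "Recommended Option", "Stage Map",
    "Stage Exit Contract", "Stage Validation", "Dependency",
    "Validation Checklist", "Rollback",
)


def missing_heading_keywords(report_text: str) -> list:
    """Return the required keywords that appear in no markdown heading line."""
    headings = [line for line in report_text.splitlines()
                if line.lstrip().startswith("#")]
    return [kw for kw in REQUIRED_KEYWORDS
            if not any(kw in heading for heading in headings)]
```

A heading such as `### Recommended Option (권장 옵션)` satisfies the check, while a fully translated `### 권장 옵션` does not, which is the failure mode the text describes.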
@@ -64,7 +64,7 @@
64
64
  - trade-off matrix across options (rows = options, columns at minimum: complexity, risk, reversibility, test coverage cost, rollout cost)
65
65
  - recommended option with rationale tied to the design principles above
66
66
  - **Stage Map (mandatory — always emitted, even when N=1):** a table of all stages with `stage | title | depends-on | step-count | exit-contract-summary`. `depends-on` is `(none)` or a comma-separated stage number list. Stages with `depends-on (none)` can be implemented in parallel by two simultaneous `implementation` runs.
67
- - **Per-stage subsections** (`## 4.5.<i> Stage <i>: <title>` for each `i`), each containing the four required subsections:
67
+ - **Per-stage subsections** (`## 5.5.<i> Stage <i>: <title>` for each `i`), each containing the four required subsections:
68
68
  - `### Carry-In` — for `depends-on (none)`: task-brief only. Otherwise: each depended-on stage's static exit contract + runtime sidecar path `runs/<impl-key>/carry/stage-<i>.json` placeholder.
69
69
  - `### Stepwise Execution Order` — bite-sized table with `step | action | files | command | expected`. **Effective row count ≤ 6** (excluding header / divider / blank). Each step is one action completable in 2–5 minutes; for code steps include actual code or diff sketch; prefer TDD ordering (failing test → implementation → green → commit).
70
70
  - `### Stage Exit Contract` — predicted added/modified files, newly exposed identifiers/types/endpoints, downstream-usable resources.
@@ -76,9 +76,10 @@
76
76
  - validation checklist (pre / mid / post) — each item is an exact command or observable outcome
77
77
  - rollback strategy — exact revert path (commits, flags, migrations) and the signal that triggers rollback
78
78
  - the YAML frontmatter MUST include the line `approved: false` (report-writer always emits the unflipped value). The user authorises the next `implementation` run by flipping it to `approved: true` (manual edit or `--approve` CLI). Do NOT recreate any `User Approval Request` body block — the validator fails reports that contain one (see `validators/validate-run.py` deprecated patterns).
79
- - **the frontmatter `approved: false` line is rendered unconditionally; if the plan-body verification gate (§4.5.9) returns `blocked-by-disagreement` or `aborted-non-result`, the writer MUST keep `approved: false` and the validator refuses any report that ships with `approved: true` under such a gate result.**
80
- - every ambiguity flagged during pre-planning that the user must resolve before approval registered as a `Blocks=approval` row in the `## 5. Clarification Items` table (do NOT create a separate `Open Questions` block under `4.5.x` the unified table is the single home)
81
- - **§4.5.9 Plan Body Verification (BLOCKING).** After report-writer finishes the draft, the lead MUST run a worker peer-review round on the consolidated plan body (sections 4.5.1 – 4.5.7) and populate `### 4.5.9 Plan Body Verification` in the final report. The round protocol, plan-item ID scheme (`P-Opt-*` / `P-Step-*` / `P-Dep-*` / `P-Val-*` / `P-Rb-*`), verdict semantics, gate-result classification, and dissent log format are defined in `skills/okstra-convergence/SKILL.md` "Plan-body verification mode". The four gate-result values are `passed`, `passed-with-dissent`, `blocked-by-disagreement`, `aborted-non-result`. When the gate would have been `blocked-by-disagreement` or `aborted-non-result`, the lead MUST NOT silently flip it to one of the passing values to "unblock" the run that is a contract violation. When `convergence.adversarial=true` (the default for this phase), this round uses the adversarial posture — verifiers confirm cited paths/commands and the burden of proof is on the plan — but the gate threshold stays `majority-disagree` (see that skill's §"Adversarial plan-body posture").
79
+ - the YAML frontmatter MUST include the line `implementation-option:` directly under `approved:` (report-writer always emits it with an **empty value**). The user selects which Option Candidate the next `implementation` run executes by filling this line with that option's name (manual edit or `--implementation-option <name>` CLI). When left empty, the `implementation` run falls back to the `Recommended Option`.
80
+ - **the frontmatter `approved: false` line is rendered unconditionally; if the plan-body verification gate (§5.5.9) returns `blocked-by-disagreement` or `aborted-non-result`, the writer MUST keep `approved: false` and the validator refuses any report that ships with `approved: true` under such a gate result.**
81
+ - every ambiguity flagged during pre-planning that the user must resolve before approval registered as a `Blocks=approval` row in the `## 1. Clarification Items` table (do NOT create a separate `Open Questions` block under `4.5.x` — the unified table is the single home)
82
+ - **§5.5.9 Plan Body Verification (BLOCKING).** After report-writer finishes the draft, the lead MUST run a worker peer-review round on the consolidated plan body (sections 4.5.1 – 4.5.7) and populate `### 5.5.9 Plan Body Verification` in the final report. The round protocol, plan-item ID scheme (`P-Opt-*` / `P-Step-*` / `P-Dep-*` / `P-Val-*` / `P-Rb-*`), verdict semantics, gate-result classification, and dissent log format are defined in `skills/okstra-convergence/SKILL.md` "Plan-body verification mode". The four gate-result values are `passed`, `passed-with-dissent`, `blocked-by-disagreement`, `aborted-non-result`. When the gate would have been `blocked-by-disagreement` or `aborted-non-result`, the lead MUST NOT silently flip it to one of the passing values to "unblock" the run — that is a contract violation. When `convergence.adversarial=true` (the default for this phase), this round uses the adversarial posture — verifiers confirm cited paths/commands and the burden of proof is on the plan — but the gate threshold stays `majority-disagree` (see that skill's §"Adversarial plan-body posture").
82
83
  - **Decision-record evaluation (sole owner)**: this phase is the **single owner** of decision-record evaluation in the okstra lifecycle. The brief never evaluates or drafts decision records — it only forwards `adr-candidate:*` signals. Every `adr-candidate:*` entry inherited from the brief's `Open Questions` is a mandatory evaluation target. In addition, evaluate every decision the recommended option introduces against the three criteria:
83
84
  1. **Hard to reverse** — would changing the decision later cost meaningfully more than deciding now?
84
85
  2. **Surprising without context** — would a future reader, seeing only the code, wonder "why was it built this way?"?
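The `implementation-option:` fallback added in this hunk (an empty value means the `implementation` run uses the `Recommended Option`) might be sketched like this. The parser below is a deliberately minimal stand-in for real YAML parsing, and both function names are hypothetical; how okstra actually reads the frontmatter is not shown in the diff.

```python
def read_frontmatter(report_text: str) -> dict:
    """Parse simple `key: value` lines between the leading `---` fences."""
    lines = report_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def select_option(report_text: str, recommended: str) -> str:
    """Return the user-selected option, falling back to the recommended one."""
    chosen = read_frontmatter(report_text).get("implementation-option", "")
    return chosen or recommended
```

With the frontmatter as report-writer emits it (empty `implementation-option:`), `select_option` returns the recommended option; once the user fills the line, that name wins.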
@@ -95,7 +96,7 @@
95
96
  1. **Spec coverage** — for every requirement in the task brief, point to the option(s) and step(s) that satisfy it. List gaps explicitly.
96
97
  2. **Placeholder scan** — search the report for the patterns in the No-placeholder rule above and fix inline.
97
98
  3. **Internal consistency** — option file lists, trade-off matrix, and recommended step list must agree on file paths, names, and signatures. A symbol called `clearLayers()` in the matrix and `clearFullLayers()` in the steps is a bug.
98
- 4. **Ambiguity check** — any requirement that could be read two ways must be made explicit or moved to the `## 5. Clarification Items` table as a `Blocks=approval` row.
99
+ 4. **Ambiguity check** — any requirement that could be read two ways must be made explicit or moved to the `## 1. Clarification Items` table as a `Blocks=approval` row.
99
100
  5. **Scope check** — if the recommended plan now spans multiple independent subsystems, recommend splitting into separate planning runs rather than shipping an oversized plan.
100
- 6. **Plan-body verification reconciliation (BLOCKING for implementation-planning).** Inspect the `### 4.5.9 Plan Body Verification` verdict table. For every plan-item row classified as `majority-disagree → C-<N>`, the corresponding `C-<N>` row MUST exist in `## 5. Clarification Items` with `Kind` chosen per the standard policy and `Blocks=approval`. Do NOT create a parallel `### 4.5.x Open Questions` block — the unified table is the single home. Conversely, the `Classification` column's `C-<N>` reference and the `## 5. Clarification Items` `ID` column MUST match 1:1; an orphan on either side is a contract violation. For `partial-consensus` and `worker-unique` plan-items, the dissenting opinion lives in §4.5.9 `Dissent log` and is NOT promoted to §5.
+ 6. **Plan-body verification reconciliation (BLOCKING for implementation-planning).** Inspect the `### 5.5.9 Plan Body Verification` verdict table. For every plan-item row classified as `majority-disagree → C-<N>`, the corresponding `C-<N>` row MUST exist in `## 1. Clarification Items` with `Kind` chosen per the standard policy and `Blocks=approval`. Do NOT create a parallel `### 5.5.x Open Questions` block — the unified table is the single home. Conversely, the `Classification` column's `C-<N>` reference and the `## 1. Clarification Items` `ID` column MUST match 1:1; an orphan on either side is a contract violation. For `partial-consensus` and `worker-unique` plan-items, the dissenting opinion lives in §5.5.9 `Dissent log` and is NOT promoted to §5.
  7. **Stage Map self-check** — for every stage, count the effective rows of its `Stepwise Execution Order` table by hand; reject the draft if any stage exceeds 6. Walk the `depends-on` graph and confirm it is a DAG (no cycle, no self-reference). For each `depends-on` link, confirm it encodes a real data/contract dependency — do NOT add links to serialise unrelated work, and do NOT split a stage merely to create more parallel stages. **Parallel-safety:** for every pair of `depends-on (none)` stages, confirm their `Stage Exit Contract` predicted file sets are disjoint; if they share a file, merge them or add a `depends-on` link (validator S9 rejects overlap).
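The Stage Map self-check in item 7 can be sketched as a small script. This is a minimal illustration and not the package's validator: the `stages` dictionary shape and the function name `find_stage_map_problems` are hypothetical, and only the three mechanical checks (row limit, DAG, disjoint file sets for `depends-on (none)` stages) are covered. Whether a link encodes a real dependency still needs human judgment.

```python
from itertools import combinations

MAX_ROWS_PER_STAGE = 6  # limit stated in the self-check item above


def find_stage_map_problems(stages):
    """Return a list of problems; an empty list means the map passes.

    Hypothetical input shape:
    name -> {"rows": int, "depends_on": [names], "files": set of paths}
    """
    problems = []

    for name, stage in stages.items():
        if stage["rows"] > MAX_ROWS_PER_STAGE:
            problems.append(f"{name}: {stage['rows']} rows exceeds {MAX_ROWS_PER_STAGE}")
        if name in stage["depends_on"]:
            problems.append(f"{name}: self-reference in depends-on")

    # Depth-first search with three colours detects cycles in the depends-on graph.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {name: WHITE for name in stages}

    def visit(name):
        colour[name] = GREY
        for dep in stages[name]["depends_on"]:
            if dep == name or dep not in stages:
                continue  # self-references are reported above; unknown names are ignored here
            if colour[dep] == GREY:
                problems.append(f"cycle through {name} -> {dep}")
            elif colour[dep] == WHITE:
                visit(dep)
        colour[name] = BLACK

    for name in stages:
        if colour[name] == WHITE:
            visit(name)

    # Parallel-safety: stages without dependencies must predict disjoint file sets.
    roots = [n for n, s in stages.items() if not s["depends_on"]]
    for a, b in combinations(roots, 2):
        shared = stages[a]["files"] & stages[b]["files"]
        if shared:
            problems.append(f"{a} and {b} share files: {sorted(shared)}")

    return problems
```

A draft would be rejected whenever the returned list is non-empty. A shared file between two root stages is resolved by merging them or adding a `depends-on` link, as the item states.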
@@ -19,7 +19,7 @@
  - the run brief MUST cite `--approved-plan <path>` pointing to a `final-report.md` produced by a prior `implementation-planning` run located under `runs/implementation-planning/.../reports/final-report.md`
  - that file's YAML frontmatter MUST carry `approved: true`. report-writer emits `approved: false` by default; the user flips it to `true` to authorise this run. Free-form approvals such as "lgtm" / "go ahead" / paraphrased confirmations are NOT accepted; re-edit the plan file's frontmatter to `approved: true` before invoking implementation, or pass `--approve` so the CLI flips it on the user's behalf (`okstra_ctl.run._apply_cli_approval`).
  - The `--approve` flag is meaningful ONLY with `--task-type implementation` and `--approved-plan <path>`; any other use raises `PrepareError`. Idempotent — re-running with `approved: true` already set appends an audit line but does NOT re-toggle.
- - the file's `Recommended option` and its bite-sized step list become the authoritative scope for this run; deviations must be justified in the final report and routed back to a new `implementation-planning` run rather than silently expanded.
+ - the authoritative scope for this run is the Option Candidate named by the YAML frontmatter `implementation-option:` field. **If `implementation-option:` is empty, fall back to the plan's `Recommended Option`** (this is a soft fallback, not a hard block). The chosen option's bite-sized step list becomes the authoritative scope; deviations must be justified in the final report and routed back to a new `implementation-planning` run rather than silently expanded. If the chosen option name does not match any heading under `Option Candidates`, record it as a deviation.
  - Task worktree (provisioned by `okstra-ctl` at the first phase's run-prep time, reused for every subsequent phase of this task-key):
  - Status: `{{EXECUTOR_WORKTREE_STATUS}}` (one of: `created` | `reused` | `skipped-in-worktree` | `skipped-not-git`)
  - Working tree path: `{{EXECUTOR_WORKTREE_PATH}}` — when status is `created` or `reused`, this is the task's `git worktree` rooted at `~/.okstra/worktrees/<project>/<task-group>/<task-id>/`. When skipped, this is the caller's `project_root`.
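The approval gate and the new `implementation-option:` fallback in this hunk can be sketched as follows. This is an illustrative reading of the rules and not the code behind `okstra_ctl.run`: the helper names `read_frontmatter` and `resolve_scope` are hypothetical, and the frontmatter parser assumes flat, unquoted `key: value` lines where a real implementation would more likely use a YAML library.

```python
def read_frontmatter(text):
    """Parse a leading '---' delimited block of simple `key: value` lines into a dict."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def resolve_scope(plan_text, recommended_option, option_headings):
    """Return (option_name, deviations); raise if the plan is not approved."""
    fm = read_frontmatter(plan_text)
    # Only the literal frontmatter value counts; free-form approvals are not accepted.
    if fm.get("approved") != "true":
        raise PermissionError("plan frontmatter must carry `approved: true`")
    # Soft fallback: an empty `implementation-option:` selects the Recommended Option.
    chosen = fm.get("implementation-option", "") or recommended_option
    deviations = []
    if chosen not in option_headings:
        deviations.append(f"option {chosen!r} matches no heading under Option Candidates")
    return chosen, deviations
```

An unmatched option name is recorded as a deviation and does not raise, which mirrors the last sentence of the added bullet.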
@@ -13,7 +13,7 @@
  - this phase REQUIRES a codebase-scan brief whose frontmatter contains `scope: codebase`. A brief without that marker is rejected before worker dispatch.
  - the brief's `priority-lenses` MUST be a non-empty subset (size 1..4) of the lens whitelist defined in `scripts/okstra_ctl/improvement_lenses.py`. Lenses outside the whitelist are rejected.
  - the brief's `scan-scope` defines the only paths workers may read for candidate evidence. `out-of-scope` paths MUST be ignored even when the codebase is otherwise reachable.
- - the brief's `candidate-cap` (default 8 if absent, absolute cap 12) bounds the number of rows in `## 4.9 Improvement Candidates`.
+ - the brief's `candidate-cap` (default 8 if absent, absolute cap 12) bounds the number of rows in `## 5.9 Improvement Candidates`.
  - Apply the shared reporter-confirmation precondition as written. For this phase any unresolved `intent-check:` / `conversion-block:` row uses `Blocks=next-phase`.
  - Primary focus areas:
  - candidate discovery within the lens whitelist
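The brief preconditions listed at the top of this hunk (the `scope: codebase` marker, the `priority-lenses` subset rule, and the `candidate-cap` bounds) might be checked along these lines. The sketch makes several assumptions: the brief is passed as a plain dict, the whitelist is supplied by the caller because the contents of `improvement_lenses.py` are not shown here, and a cap above 12 is clamped where the real tool may reject it.

```python
DEFAULT_CANDIDATE_CAP = 8
ABSOLUTE_CANDIDATE_CAP = 12


def check_brief(brief, lens_whitelist):
    """Return (errors, effective_cap) for a brief given as a plain dict (hypothetical shape)."""
    errors = []
    if brief.get("scope") != "codebase":
        errors.append("frontmatter must contain `scope: codebase`")

    lenses = brief.get("priority-lenses") or []
    if not 1 <= len(set(lenses)) <= 4:
        errors.append("priority-lenses must name 1..4 lenses")
    unknown = sorted(set(lenses) - set(lens_whitelist))
    if unknown:
        errors.append(f"lenses outside the whitelist: {unknown}")

    # Assumption: a cap above the absolute limit is clamped rather than rejected.
    cap = min(brief.get("candidate-cap", DEFAULT_CANDIDATE_CAP), ABSOLUTE_CANDIDATE_CAP)
    return errors, cap
```

The returned cap bounds the number of rows in the Improvement Candidates table. Any non-empty error list corresponds to a brief rejected before worker dispatch.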
@@ -29,11 +29,11 @@
  - Decision-tree walk (bounded):
  - When candidates branch on a structural question (e.g. "is module X meant to own this responsibility?"), resolve via `Read` / `Grep` first. Only escalate to the user inside the Phase 1.5 budget.
  - Expected output emphasis:
- - the `## 4.9 Improvement Candidates` table populated with rows that obey the 10-column schema from `validators/validate-improvement-report.py` (Cand ID `I-NNN`, Lens from whitelist, Title, Scope ⊆ scan-scope, Severity, Effort, Consensus, Source workers `<worker>:<id>` from {claude, codex, gemini}, Recommended next-phase ∈ {requirements-discovery, implementation-planning, error-analysis}, Evidence as path:line list)
- - `## 2. Final Verdict` Verdict Token ∈ {`candidates-ready`, `no-candidates`, `blocked`}; Direction `routing`; Next Step "사용자에게 후보 K개 선택 의뢰 (## 4.9 표 참조)"
- - `## 6. Recommended Next Steps` first entry summarises per-candidate routing and proposes new task-key names of the form `<task-group>/imp-<Cand-ID>`
+ - the `## 5.9 Improvement Candidates` table populated with rows that obey the 10-column schema from `validators/validate-improvement-report.py` (Cand ID `I-NNN`, Lens from whitelist, Title, Scope ⊆ scan-scope, Severity, Effort, Consensus, Source workers `<worker>:<id>` from {claude, codex, gemini}, Recommended next-phase ∈ {requirements-discovery, implementation-planning, error-analysis}, Evidence as path:line list)
+ - `## 7. Final Verdict` Verdict Token ∈ {`candidates-ready`, `no-candidates`, `blocked`}; Direction `routing`; Next Step "사용자에게 후보 K개 선택 의뢰 (## 5.9 표 참조)"
+ - `## 3. Recommended Next Steps` first entry summarises per-candidate routing and proposes new task-key names of the form `<task-group>/imp-<Cand-ID>`
  - Clarification request policy (phase-specific addenda — shared policy is in `_common-contract.md`):
- - if scan-scope or priority-lenses cannot be made concrete during Phase 1.5, end the run with Verdict Token `blocked`, populate `## 5. Clarification Items` with `Blocks=next-phase` rows, and do not run worker dispatch
+ - if scan-scope or priority-lenses cannot be made concrete during Phase 1.5, end the run with Verdict Token `blocked`, populate `## 1. Clarification Items` with `Blocks=next-phase` rows, and do not run worker dispatch
  - every clarification row carries a recommended answer + one-line rationale inside the `Expected form` cell
  - Non-goals:
  - concrete implementation plans, cost estimates, or code edits for any candidate
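A few of the per-row constraints on the Improvement Candidates table, together with the proposed task-key naming, can be expressed as a short check. This is not the logic of `validators/validate-improvement-report.py`, which is not shown in this diff. The row dict shape and function names are hypothetical, only three of the ten columns are covered, and reading `I-NNN` as exactly three digits is an assumption.

```python
import re

CAND_ID = re.compile(r"I-\d{3}")  # assumption: `NNN` means exactly three digits
WORKERS = {"claude", "codex", "gemini"}
NEXT_PHASES = {"requirements-discovery", "implementation-planning", "error-analysis"}


def check_candidate(row):
    """Return a list of problems for one candidate row (hypothetical dict shape)."""
    errors = []
    if not CAND_ID.fullmatch(row["cand_id"]):
        errors.append("Cand ID must look like I-NNN")
    for source in row["source_workers"]:
        # Each entry is expected in the form `<worker>:<id>`.
        worker, sep, ident = source.partition(":")
        if not sep or worker not in WORKERS or not ident:
            errors.append(f"bad source worker entry: {source!r}")
    if row["next_phase"] not in NEXT_PHASES:
        errors.append(f"unsupported next-phase: {row['next_phase']!r}")
    return errors


def proposed_task_key(task_group, cand_id):
    """Build a task-key of the form <task-group>/imp-<Cand-ID>."""
    return f"{task_group}/imp-{cand_id}"
```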
@@ -6,12 +6,12 @@
  - Lead-only contract (replaces the shared team contract for this phase):
  - The Claude lead is the sole agent for this run. No `Agent(...)` worker dispatch, no `TeamCreate`, no parallel sub-agents, no convergence loop.
  - The lead drafts the PR title and PR body **inline** by reading the run brief, the cited final-verification report, `git log --oneline <base>..HEAD`, and `git diff <base>..HEAD --stat`. No drafter worker is dispatched.
- - The lead authors the final-report file directly (no `Report writer worker` dispatch). The report still conforms to the standard `okstra-final-report.template.md` structure, including the `## 4.6 Release Handoff Deliverables` section.
+ - The lead authors the final-report file directly (no `Report writer worker` dispatch). The report still conforms to the standard `okstra-final-report.template.md` structure, including the `## 5.6 Release Handoff Deliverables` section.
  - The shared anti-escalation rule from the common contract still applies: do not start any other lifecycle phase from inside this run.
  - The shared "authority & permissions assumption" rule from the common contract still applies: assume the user holds every permission needed; do not block on hypothetical approvals.
  - The shared "MCP read-only" rule still applies if the brief lists MCP servers, though most release-handoff runs do not use MCP.
  - Pre-handoff entry gate (mandatory — refuse to start if any item fails):
- - the task brief MUST cite the originating `final-verification` final-report path under `## Source Verification Report`. The lead opens that file and confirms section `## 2. Final Verdict` contains a `Verdict Token` field whose value is exactly `accepted`.
+ - the task brief MUST cite the originating `final-verification` final-report path under `## Source Verification Report`. The lead opens that file and confirms section `## 7. Final Verdict` contains a `Verdict Token` field whose value is exactly `accepted`.
  - if the verdict is `conditional-accept`, `blocked`, or any other token (including ambiguous phrasing like "looks good"), the run MUST end immediately with status `blocked` and a routing recommendation back to `error-analysis` or `implementation-planning`. Do NOT prompt the user; Do NOT run any git command.
  - the lead MUST capture `git status --short` and confirm the working tree is clean. Dirty state aborts the run; release-handoff packages the commits produced by `implementation`, it does not stage or commit changes.
  - the lead MUST capture `git rev-parse --abbrev-ref HEAD` and record it as the **feature branch**. If the current branch is itself `main`, `master`, `prod`, `preprod`, `staging`, or `dev`, the run MUST end immediately — release-handoff never operates on a base branch.
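The gate items visible in this hunk reduce to three ordered checks, sketched below as a pure function. The name `entry_gate` and its signature are hypothetical. The caller is assumed to pass the verdict token read from the cited report and, only if that token is `accepted`, the captured output of `git status --short` and `git rev-parse --abbrev-ref HEAD`. Any further gate items outside this hunk are not covered.

```python
BASE_BRANCHES = {"main", "master", "prod", "preprod", "staging", "dev"}


def entry_gate(verdict_token, status_short, branch):
    """Return None when the run may start, otherwise the reason it must end."""
    # The verdict is checked first because a failing verdict forbids any git command.
    if verdict_token != "accepted":
        return f"verdict token {verdict_token!r} is not exactly `accepted`"
    # Any output from `git status --short` means the working tree is dirty.
    if status_short.strip():
        return "working tree is dirty"
    if branch.strip() in BASE_BRANCHES:
        return f"{branch.strip()!r} is a base branch"
    return None
```

Comparing the token with `!=` keeps the match exact, so `conditional-accept` and loose phrasing such as "looks good" both fail the gate.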
@@ -39,14 +39,14 @@
  - When the brief's `Desired Outcome`, classification, or routing target depends on a chain of decisions, walk that chain one branch at a time. Each branch is one `Clarification Items` row, not a free-form interview.
  - For every clarification row, put the single best answer and one-line rationale in `Expected form` as `Recommended: ...`. Put other options and one-sentence consequences in the same cell as `Alternatives: ...`.
  - **Codebase-first rule**: if a branch can be resolved by `Read` / `Grep` / file inspection, resolve it that way and record `Evidence checked: <path:line>` in the `Statement` cell. Do NOT escalate to the user.
- - Budget: the unified `## 5. Clarification Items` table caps at the smaller of (a) one row per unresolved decision branch, (b) 8 rows total. Beyond the cap, fold remaining ambiguity into the routing recommendation's risk notes.
+ - Budget: the unified `## 1. Clarification Items` table caps at the smaller of (a) one row per unresolved decision branch, (b) 8 rows total. Beyond the cap, fold remaining ambiguity into the routing recommendation's risk notes.
  - Expected output emphasis:
  - evidence-backed routing decision
  - uncertainty boundaries and missing inputs
  - next recommended phase and safe resume guidance
  - canonical-term resolution for every `terminology:*` brief item, written as a one-line `<term> = <definition>` line in a new `Domain Alignment` subsection of the final report; alongside each, propose whether `<PROJECT_ROOT>/.okstra/glossary.md` should be updated (proposal only — actual writes happen via `okstra-brief` Step 4.5 on a subsequent run)
  - Clarification request policy (phase-specific addenda — shared policy is in `_common-contract.md`):
- - if any blocking input is missing at the time of writing the final report, populate `## 5. Clarification Items` in `final-report-template.md` (a single unified table; `Blocks=next-phase` for items the next run cannot start without)
+ - if any blocking input is missing at the time of writing the final report, populate `## 1. Clarification Items` in `final-report-template.md` (a single unified table; `Blocks=next-phase` for items the next run cannot start without)
  - prefer concrete questions whose answers map directly to a routing decision (`bugfix` vs `feature`, `error-analysis` vs `implementation-planning`, etc.). State each option in plain language with one sentence describing what choosing it would mean for the next phase.
  - every clarification row carries a recommended answer + one-line rationale inside the `Expected form` cell; rows that lack a recommendation are rejected as half-formed.
  - **Codebase-first ambiguity resolution (defect rule)**: any ambiguity that can be answered by `Read` / `Grep` / file inspection MUST be resolved that way and recorded with file:line evidence. Writing a clarification row for something the codebase already answers is a defect of this phase.
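The clarification budget and the "recommended answer" requirement from this hunk can be illustrated with two small helpers. The function names and the list-based representation of decision branches are hypothetical. The sketch assumes the codebase-first rule has already been applied, so every branch passed in is one that `Read` / `Grep` could not resolve.

```python
MAX_CLARIFICATION_ROWS = 8  # cap (b) from the budget rule above


def apply_budget(unresolved_branches):
    """Split unresolved decision branches into table rows and overflow for the risk notes."""
    rows = unresolved_branches[:MAX_CLARIFICATION_ROWS]
    overflow = unresolved_branches[MAX_CLARIFICATION_ROWS:]
    return rows, overflow


def has_recommendation(expected_form):
    """A row whose `Expected form` cell lacks a `Recommended: ...` entry is half-formed."""
    return "Recommended:" in expected_form
```

With one row per branch, slicing at 8 yields the smaller of the two caps. The overflow is what would be folded into the routing recommendation's risk notes.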