npm - okstra - Versions diffs - 0.26.0 → 0.28.0 - Mend

okstra 0.26.0 → 0.28.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (52)

package/README.kr.md +15 -0
package/README.md +15 -0
package/docs/kr/architecture.md +2 -6
package/docs/kr/cli.md +40 -6
package/docs/kr/performance-improvement-plan-v2.md +23 -0
package/docs/kr/performance-improvement-plan.md +22 -0
package/package.json +1 -1
package/runtime/BUILD.json +2 -2
package/runtime/agents/workers/claude-worker.md +4 -3
package/runtime/agents/workers/codex-worker.md +4 -3
package/runtime/agents/workers/gemini-worker.md +4 -3
package/runtime/agents/workers/report-writer-worker.md +7 -2
package/runtime/bin/okstra.sh +0 -1
package/runtime/prompts/launch.template.md +1 -1
package/runtime/prompts/profiles/_common-contract.md +36 -4
package/runtime/prompts/profiles/error-analysis.md +12 -0
package/runtime/prompts/profiles/implementation-planning.md +20 -0
package/runtime/prompts/profiles/requirements-discovery.md +20 -0
package/runtime/python/lib/okstra/cli.sh +1 -7
package/runtime/python/lib/okstra/globals.sh +0 -1
package/runtime/python/lib/okstra/usage.sh +1 -4
package/runtime/python/okstra_ctl/render.py +3 -0
package/runtime/python/okstra_ctl/run.py +0 -6
package/runtime/python/okstra_ctl/run_context.py +1 -1
package/runtime/python/okstra_ctl/wizard.py +25 -2
package/runtime/python/okstra_token_usage/blocks.py +5 -1
package/runtime/python/okstra_token_usage/claude.py +16 -1
package/runtime/python/okstra_token_usage/cli.py +9 -2
package/runtime/python/okstra_token_usage/collect.py +17 -3
package/runtime/python/okstra_token_usage/pricing.py +159 -24
package/runtime/python/okstra_token_usage/report.py +32 -3
package/runtime/skills/okstra-brief/SKILL.md +532 -65
package/runtime/skills/okstra-context-loader/SKILL.md +25 -11
package/runtime/skills/okstra-convergence/SKILL.md +38 -14
package/runtime/skills/okstra-history/SKILL.md +68 -37
package/runtime/skills/okstra-logs/SKILL.md +26 -4
package/runtime/skills/okstra-report-finder/SKILL.md +49 -22
package/runtime/skills/okstra-report-writer/SKILL.md +62 -65
package/runtime/skills/okstra-run/SKILL.md +35 -34
package/runtime/skills/okstra-schedule/SKILL.md +51 -20
package/runtime/skills/okstra-setup/SKILL.md +31 -12
package/runtime/skills/okstra-status/SKILL.md +20 -8
package/runtime/skills/okstra-team-contract/SKILL.md +41 -25
package/runtime/skills/okstra-time-summary/SKILL.md +53 -16
package/runtime/templates/reports/final-report.template.md +227 -207
package/runtime/templates/reports/settings.template.json +7 -4
package/runtime/validators/lib/fixtures.sh +47 -2
package/runtime/validators/lib/validate-assets.sh +50 -24
package/runtime/validators/validate-brief.py +385 -0
package/runtime/validators/validate-brief.sh +35 -0
package/runtime/validators/validate-run.py +313 -1
package/runtime/validators/validate-workflow.sh +7 -33

package/runtime/skills/okstra-report-writer/SKILL.md CHANGED

@@ -8,7 +8,7 @@ user-invocable: false
 ## File-author ownership (BLOCKING)
-The final-report file at `runs/<task-type>/reports/final-report-<task-type>-<seq>.md` is authored by the `Report writer worker` subagent when that worker is in the run's roster. Claude lead reviews the file but does NOT write it itself in that case. Lead-authored fallback is permitted only after a real Report writer worker dispatch attempt with a recorded non-`completed` terminal status (`error` / `timeout` / `not-run`) and a logged reason (`okstra-error-log.py`).
+The final-report file at `runs/<task-type>/reports/final-report-<task-type>-<seq>.md` is authored by the `Report writer worker` subagent when that worker is in the run's roster. Claude lead reviews the file but does NOT write it itself in that case. Lead-authored fallback is permitted only after a real Report writer worker dispatch attempt with a recorded non-`completed` terminal status (`error` / `timeout` / `not-run`) and a logged reason (`okstra-error-log.py`). **Except for `release-handoff`**, which has no worker roster — the Claude lead authors the final-report directly by design (see "Release-handoff section contract" below), and the fallback rules in this section do not apply.
 If you are reading this skill **as the report-writer-worker subagent**, YOU are the one calling the `Write` tool against the result path. Do not return the report inline — the file on disk is the canonical artifact.
@@ -60,7 +60,7 @@ A resumed lead session can ALWAYS dispatch a fresh Report writer worker. The Age
 ### Lead-authored fallback (only if dispatch failed)
-Lead-authored fallback is permitted only if all of the following are true and recorded in team-state:
+Except for `release-handoff` (which is single-lead by design and never dispatches a Report writer worker — see "Release-handoff section contract" below), lead-authored fallback is permitted only if all of the following are true and recorded in team-state:
 1. A Report writer worker dispatch was actually attempted (Agent call was issued).
 2. The attempt recorded a terminal status of `error`, `timeout`, or `not-run` with a concrete reason (tool error message, timeout duration, or external blocker).
@@ -68,50 +68,43 @@ Lead-authored fallback is permitted only if all of the following are true and re
 Speculative reasons such as "session resume constraint", "team object no longer exists", or "lead can do it faster" are NOT valid.
-## Phase 7 follow-up task spawner (BLOCKING when Section 7 is non-empty)
+## Phase 6 → Phase 7 execution sequence (BLOCKING order)
-After the token-usage collector finishes (the next subsection), Phase 7 must run the follow-up task spawner against the final-report file. This step is what turns the report's `## 7. Follow-up Tasks (후속 작업)` table into actual `tasks/<task-group>/<new-task-id>/` stubs that show up in `okstra-status`.
+The four steps below MUST execute in this exact order. Reordering them is the recurring root cause of reports shipping with unsubstituted `{{LEAD_TOTAL_TOKENS}}` placeholders, Section 6 missing the follow-up entries, or Section 7 rows never spawning.
-```bash
-python3 scripts/okstra-spawn-followups.py \
-  <runDirectoryPath>/reports/final-report-<task-type>-<seq>.md \
-  --project-root <project_root> \
-  --task-group <task-group> \
-  --parent-task-key <task-key>
-```
-Behaviour contract:
-- The script is **idempotent**: rows whose target directory already exists are reported as `existing` and skipped without modification. Re-running the spawner across reruns of the same parent task is safe.
-- Rows with `Auto-spawn? != yes` are reported as `skipped` and never written to disk. Surface them in the final-report's Section 6 if the user should still take manual action.
-- A row with an invalid `Origin`, `Suggested task-type`, missing `Title`, or missing `Reason / Why deferred` causes the script to exit `1`. The report-writer worker MUST refuse to ship a final-report whose Section 7 contains such rows. Either fix the row or change `Auto-spawn?` to `no` and document why in Section 6.
-- For `task-type` ∈ {`implementation`, `final-verification`, `release-handoff`}, Section 7 must be present in the final report. An empty section is acceptable and is expressed as the single line `- 후속 작업 없음.` under the heading. The spawner treats a missing or empty section as a no-op (exit `0`).
-After the spawner completes, the report-writer worker MUST update Section 6 ("Recommended Next Steps") to list every newly created task-key together with its entry command, so the user can pick the follow-up up immediately:
-```
-- Follow-up: `<task-group>/<new-task-id>` — Claude Code 세션 안 `/okstra-run task-key=<task-group>/<new-task-id> task-type=<suggested>` / 별도 터미널 `scripts/okstra.sh --task-key <task-group>/<new-task-id> --task-type <suggested>`
-```
-## Phase 7 token-usage collector (BLOCKING)
+1. **Phase 6 — Report writer worker drafts the final-report file** at `runs/<task-type>/reports/final-report-<task-type>-<seq>.md`. Token placeholders are left verbatim; Section 6 lists prioritized actions but does NOT yet include auto-spawned follow-ups (they don't exist yet).
+2. **Phase 7 step 1 — Token-usage collector with `--substitute-final-report`** (BLOCKING). One invocation aggregates `leadUsage` / `workers[].usage` / `usageSummary` into team-state AND substitutes the 10 placeholders in the final-report file. Skipping the flag ships literal `{{...}}` in the Token Usage Summary table.
-At the start of Phase 7, run the token-usage collector with the final-report substitution flag. This step is BLOCKING — both the team-state aggregation AND the final-report placeholder substitution happen here, in one invocation:
+   ```bash
+   python3 scripts/okstra-token-usage.py \
+     <runDirectoryPath>/state/team-state-<task-type>-<seq>.json \
+     --write --summary \
+     --substitute-final-report <runDirectoryPath>/reports/final-report-<task-type>-<seq>.md
+   ```
-```bash
-python3 scripts/okstra-token-usage.py \
-  <runDirectoryPath>/state/team-state-<task-type>-<seq>.json \
-  --write --summary \
-  --substitute-final-report <runDirectoryPath>/reports/final-report-<task-type>-<seq>.md
-```
+   The 10 substituted placeholders: `{{LEAD_TOTAL_TOKENS}}`, `{{LEAD_BILLABLE_TOKENS}}`, `{{LEAD_COST_USD}}`, `{{WORKER_TOTAL_TOKENS}}`, `{{WORKER_BILLABLE_TOKENS}}`, `{{WORKER_COST_USD}}`, `{{GRAND_TOTAL_TOKENS}}`, `{{GRAND_BILLABLE_TOKENS}}`, `{{GRAND_COST_USD}}`, `{{CLI_COST_USD}}`. The final-report file MUST already exist (Phase 6 output).
+3. **Phase 7 step 2 — Follow-up task spawner** (BLOCKING when Section 7 is non-empty). Turns the report's `## 7. Follow-up Tasks (후속 작업)` rows into `tasks/<task-group>/<new-task-id>/` stubs.
-This:
+   ```bash
+   python3 scripts/okstra-spawn-followups.py \
+     <runDirectoryPath>/reports/final-report-<task-type>-<seq>.md \
+     --project-root <project_root> \
+     --task-group <task-group> \
+     --parent-task-key <task-key>
+   ```
-- Populates `leadUsage`, every `workers[].usage`, and `usageSummary` in team-state from session transcripts.
-- Substitutes the 10 token-related placeholders (`{{LEAD_TOTAL_TOKENS}}`, `{{LEAD_BILLABLE_TOKENS}}`, `{{LEAD_COST_USD}}`, `{{WORKER_TOTAL_TOKENS}}`, `{{WORKER_BILLABLE_TOKENS}}`, `{{WORKER_COST_USD}}`, `{{GRAND_TOTAL_TOKENS}}`, `{{GRAND_BILLABLE_TOKENS}}`, `{{GRAND_COST_USD}}`, `{{CLI_COST_USD}}`) in the final-report file with concrete values from the freshly computed usageSummary.
+   Behaviour contract:
+   - Idempotent: rows whose target dir exists are reported as `existing` and skipped. Reruns of the same parent task are safe.
+   - Rows with `Auto-spawn? != yes` are reported as `skipped` and never written; surface them in Section 6 if manual action is still needed.
+   - An invalid `Origin`, `Suggested task-type`, missing `Title`, or missing `Reason / Why deferred` exits `1`. The report-writer MUST refuse to ship a Section 7 with such rows.
+   - **Canonical spawn rule (single source of truth):** the spawner runs when `task-type` ∈ {`implementation`, `final-verification`, `release-handoff`}, OR when Section 7 is non-empty for any other task-type. For the listed task-types Section 7 must be present in the report; an empty section renders as `- 후속 작업 없음.`. Missing / empty sections are no-ops (exit `0`). All other references to this rule (including the Persistence Checklist) defer to this statement.
+4. **Phase 7 step 3 — Update Section 6** after the spawner. The report-writer MUST append one row per newly spawned task-key with its entry command:
-Skipping `--substitute-final-report` is the recurring root cause of reports shipping with literal `{{LEAD_TOTAL_TOKENS}}` etc. in their Token Usage Summary table. Always pass the flag — never run the collector without it during Phase 7.
+   ```
+   - Follow-up: `<task-group>/<new-task-id>` — Claude Code 세션 안 `/okstra-run task-key=<task-group>/<new-task-id> task-type=<suggested>` / 별도 터미널 `scripts/okstra.sh --task-key <task-group>/<new-task-id> --task-type <suggested>`
+   ```
-The final-report file MUST already exist before this step runs (it's authored by Report writer worker in Phase 6, or by Lead in the documented fallback case). The status file can be written after this step completes.
+The status file is written after step 3 completes.
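Reviewer sketch (not part of the package): the failure mode named in the sequence above, a report shipping with literal `{{...}}` token placeholders, can be checked mechanically after Phase 7 step 1. The snippet below is a minimal illustration, assuming the report is available as plain text. The function name `leftover_token_placeholders` is hypothetical; only the 10 placeholder names come from the diff.

```python
import re

# The 10 placeholder names listed under Phase 7 step 1 in the diff.
TOKEN_PLACEHOLDERS = (
    "LEAD_TOTAL_TOKENS", "LEAD_BILLABLE_TOKENS", "LEAD_COST_USD",
    "WORKER_TOTAL_TOKENS", "WORKER_BILLABLE_TOKENS", "WORKER_COST_USD",
    "GRAND_TOTAL_TOKENS", "GRAND_BILLABLE_TOKENS", "GRAND_COST_USD",
    "CLI_COST_USD",
)


def leftover_token_placeholders(report_text: str) -> list[str]:
    """Return token placeholders that still appear as literal {{NAME}} text.

    Hypothetical helper: okstra's own collector is not shown in this diff.
    """
    found = set(re.findall(r"\{\{([A-Z_]+)\}\}", report_text))
    # Keep the documented order and ignore unrelated placeholders.
    return [name for name in TOKEN_PLACEHOLDERS if name in found]
```

An empty result suggests the collector ran with `--substitute-final-report`; a non-empty result points at step 1 being skipped or run without the flag.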
 ## Final Report Structure
@@ -181,7 +174,7 @@ Token Summary Generation Rules:
 ### Implementation-planning section heading contract (BLOCKING)
-When the run's `task-type` is `implementation-planning`, the final report MUST contain section headings whose **lines include each of the 8 literal English substrings below**. The validator (`validators/validate-run.py`) does plain substring matching on the report text — 7-of-8 missing was a real, repeatedly observed failure mode caused by translating the headings to Korean.
+When the run's `task-type` is `implementation-planning`, the final report MUST contain section headings whose **lines include each of the 9 literal English substrings below**. The validator (`validators/validate-run.py`) does plain substring matching on the report text — missing headings was a real, repeatedly observed failure mode caused by translating the headings to Korean.
 | # | Required substring | Recommended heading form |
 |---|--------------------|--------------------------|
@@ -192,12 +185,27 @@ When the run's `task-type` is `implementation-planning`, the final report MUST c
 | 5 | `Dependency` | `### Dependency / Migration Risk (의존성·마이그레이션 위험)` |
 | 6 | `Validation Checklist` | `### Validation Checklist (검증 체크리스트)` |
 | 7 | `Rollback` | `### Rollback Strategy (롤백 전략)` |
-| 8 | `User Approval Request` | `### User Approval Request (사용자 승인 요청)` |
+| 8 | `User Approval Request` | Satisfied by the top-of-report `## User Approval Request (사용자 승인 게이트)` block. Do NOT recreate a `### 4.5.8 User Approval Request` body stub — the validator now fails reports that contain one. |
+| 9 | `Plan Body Verification` + `Gate result:` | `### Plan Body Verification (계획 본문 검증)` containing a `Gate result:` line — copy `okstra-final-report.template.md §4.5.9` verbatim. Validator checks both substrings. |
 The Korean translation in parentheses is optional but the English keyword is mandatory. The body of each section is written in Korean per the writing rules below. For non-`implementation-planning` runs, omit this entire block — these headings are NOT validator-checked for other task-types.
 The final-report template `okstra-final-report.template.md` Section 4.5 already encodes this contract — copy that block verbatim and fill in.
+### Final-verification verdict token contract (BLOCKING)
+When the run's `task-type` is `final-verification`, the report's `## 2. Final Verdict` table MUST contain a `Verdict Token` row whose value is **exactly one of** the literal strings below. The `release-handoff` profile reads this row as its entry gate; any other value blocks the next phase.
+| # | Required substring | Meaning |
+|---|--------------------|---------|
+| 1 | `accepted` | All acceptance criteria pass; `release-handoff` may proceed. |
+| 2 | `conditional-accept` | Acceptance passes with caveats; user must resolve listed conditions before `release-handoff`. |
+| 3 | `blocked` | Acceptance failed; routing returns to `error-analysis` or `implementation-planning`. |
+For every other task-type, set the `Verdict Token` cell to `not-applicable`. Do NOT omit the row — the template renders it for all task-types and downstream tooling expects the field to exist.
+The final-report template `okstra-final-report.template.md` Section 2 already encodes this contract — copy that block verbatim and fill in.
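Reviewer sketch (not part of the package): the verdict token contract above amounts to an exact-match rule per task-type. The snippet below is one way to express it, assuming the `Verdict Token` cell has already been extracted as a string. The function name is hypothetical, and the real validator's behaviour is not shown in this diff.

```python
# Literal values allowed for final-verification runs, per the contract above.
FINAL_VERIFICATION_TOKENS = {"accepted", "conditional-accept", "blocked"}


def verdict_token_ok(task_type: str, token: str) -> bool:
    """Check a Verdict Token cell against the documented contract.

    Matching is exact: no trimming or case folding, since the contract
    asks for one of the literal strings.
    """
    if task_type == "final-verification":
        return token in FINAL_VERIFICATION_TOKENS
    # Every other task-type keeps the row but sets it to not-applicable.
    return token == "not-applicable"
```

For example, `verdict_token_ok("final-verification", "Accepted")` is false under this reading because the casing differs from the literal `accepted`.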
 ### Release-handoff section contract (release-handoff runs only)
 When the run's `task-type` is `release-handoff`, the final report MUST include Section `## 4.6 Release Handoff Deliverables` with all seven sub-sections (`4.6.1` Source Verification Report, `4.6.2` Feature Branch & Working-Tree State, `4.6.3` User Selections, `4.6.4` Executed Commands, `4.6.5` Commit List, `4.6.6` Pull Request Outcome, `4.6.7` Routing Recommendation). Every entry is dictated by the lead's recorded git/gh command log and the user's verbatim answers to the H1/H2/H3 menu prompts. H1 choices are `local only`, `push + PR`, or `skip`; release-handoff records existing implementation commits and MUST NOT create new commits. If the user picked `skip` (H1) or `cancel` (H3), keep 4.6.3 populated but leave 4.6.4–4.6.6 explicitly empty per the template's empty-state lines.
@@ -216,35 +224,24 @@ runs/<task-type>/worker-results/report-writer-worker-<task-type>-<seq>.md
 This file is checked by the validator whenever the role's terminal status is `completed`. Without it the run fails with `report-writer is completed but worker result file is missing`.
-The file content is short: it begins with the standard worker-result header from `okstra-team-contract`, then names the canonical final-report path you wrote, lists the input artifacts you reconciled, and records any structural deviations from `final-report-template.md`. Do NOT duplicate the full final-report body here — it's an audit pointer, not a second copy.
+**Frontmatter + header schema** — both the worker-results audit file AND the final-report file are governed by `okstra-team-contract` ("Result Frontmatter" and the standard worker-result header sections). That document is the single source of truth; do NOT restate the field list here. Use `workerId: "report-writer"` for both files and copy every other frontmatter value verbatim from `analysis-material.md`. The body of this audit file is short: name the canonical final-report path you wrote, list the input artifacts you reconciled, and record any structural deviations from `final-report.template.md`. Do NOT duplicate the full final-report body here — it's an audit pointer, not a second copy.
 Skipping this file because "the real report is in `reports/`" is wrong. Both files are required.
 ### Main Body Section
-Section numbering matches `okstra-final-report.template.md`. Section 0 is the carry-in reconciliation that runs first when a clarification response was provided; sections 1–7 follow the template's main body order.
-0. **Clarification Response Carried In** - if `{{CLARIFICATION_RESPONSE_RELATIVE_PATH}}` is non-empty, read `instruction-set/clarification-response.md`, walk every `C-*` row of the prior report's `## 5. Clarification Items` table, reconcile each one against new evidence, and record the outcome (`resolved`/`obsolete`) plus the citation in this section before drafting the verdict. If the prior report uses the deprecated `4.5.9 Open Questions` / `5.1` / `5.2` layout with `OQ-*`/`A*`/`Q*` IDs, follow the legacy-carry-in mapping rule in `final-report-template.md` section 0.
-1. **Problem or Verification Summary** - Key summary based on the brief and data (3–5 bullet points)
-2. **Cross Verification Results** (Use 4 categories when convergence is enabled, per `okstra-convergence`)
-   - Round History sub-table (convergence-enabled runs only): one row per executed round with columns `Round | inputQueueSize | resolvedCount | carriedForwardCount | dispatches (worker:status:durationMs) | skippedWorkers (worker:reason)`. Add a one-line note immediately under the table with `round2SkippedReason: <value>` (always present, even when `"not-skipped"`). Pull all values verbatim from `convergence-<task-type>-<seq>.json`.
-   - Full Consensus: Findings agreed upon by all workers
-   - Partial Consensus: Agreed upon by a majority of workers; dissenting opinions are specified
-   - Contested: No consensus after the last executed round; each worker’s position specified. Empty contested list is shown as the literal line `- 합의 미달 항목 없음.`
-   - Worker-Unique: Verified only by the discoverer; verification history specified
-   - In runs with convergence disabled, maintain the existing Consensus/Differences format and omit the Round History sub-table.
-3. **Final Verdict** - Conclusion based on comprehensive evidence; direction provided. For `final-verification`, include a `Verdict Token` field whose value is exactly `accepted`, `conditional-accept`, or `blocked`; `release-handoff` uses that field as its entry gate.
-4. **Evidence and Detailed Analysis**
-   - Key Evidence: File path, line number, actual evidence
-   - If explicit expected values are present in `reference-expectations.md`, specify whether they match or differ from the expected values in config files / deployment manifests
-   - Supporting evidence or alternative interpretations
-5. **Missing Information and Risks** - Uncertain/I don't know items
-6. **Clarification Items** - single unified table (`C-001`, `C-002`, ...) the user fills inline before reruns. Columns: `ID`, `Ticket ID`, `Kind` (`material` / `decision` / `data-point`), `Statement`, `Expected form`, `Blocks` (`approval` / `next-phase` / `none`), `Status`, `User input`. Replaces the legacy `4.5.9 Open Questions` / `5.1` / `5.2` triple; never create those sub-sections — same item appearing in two places is the failure mode this table prevents.
-   - Required for `task-type` `error-analysis` and `requirements-discovery` whenever blocking uncertainty remains
-   - Optional for other task-types; explicitly state "no clarification needed" when none
-   - Follow the table format from `final-report-template.md` exactly (columns: Question ID, Blocking, Why this matters, Question, Expected answer shape, Status, Answer)
-   - Use stable `Q1`, `Q2`, ... ids and never delete prior ids on rerun; mark them `resolved` or `obsolete` instead
-7. **Recommended Next Steps** - Actions by Priority
+Section numbering follows `templates/reports/final-report.template.md` exactly — that file is the single source of truth. Below is a one-line summary of each section's writer obligation; consult the template for full body structure.
+**Verdict Card (top-of-report, mandatory).** Render `## Verdict Card` between the report header and the (conditional) Approval block. Its `Verdict Token` / `Direction` / `Next Step` cells MUST byte-match the corresponding cells in `## 2. Final Verdict` and the first item of `## 6.`. Divergence is `contract-violated`.
+0. **Clarification Response Carried In** — render this `## 0.` heading ONLY when `{{CLARIFICATION_RESPONSE_RELATIVE_PATH}}` is non-empty. Walk every `C-*` row of the prior report's `## 5. Clarification Items` table, reconcile against new evidence, and record the outcome (`resolved` / `obsolete`) with citation before drafting the verdict. When no carry-in path was provided, OMIT the `## 0.` heading entirely — the validator fails an empty Section 0 stub.
+1. **Cross Verification Results** — 4 categories (Full / Partial / Contested / Worker-Unique) when convergence is enabled, per `okstra-convergence`. Prepend the Round History sub-table (columns: `Round | inputQueueSize | resolvedCount | carriedForwardCount | dispatches | skippedWorkers`) plus a `round2SkippedReason: <value>` note, pulled verbatim from `convergence-<task-type>-<seq>.json`. Empty contested list renders as `- 합의 미달 항목 없음.`. Convergence-disabled runs use the legacy Consensus/Differences format and omit the round table.
+2. **Final Verdict** — `Direction` ∈ `continue-investigation` / `begin-implementation` / `approve` / `reject` / `hold`. **Verdict Token** is `not-applicable` for every task-type except `final-verification` — see "Final-verification verdict token contract" below for that case.
+3. **Evidence and Detailed Analysis** — primary evidence rows (file path, line, snippet); secondary evidence / alternate interpretations. If `reference-expectations.md` lists explicit expected values, record match/gap per row.
+4. **Missing Information and Risks** — uncertain / "I don't know" items. `implementation-planning` adds §4.5 (see heading contract below); `release-handoff` adds §4.6.
+5. **Clarification Items** — single unified `C-*` table; column schema, ID convention, and rerun behaviour are owned by `_common-contract.md §Clarification request policy` (8-column SSOT). The deprecated `4.5.9 Open Questions` / `5.1 추가 자료 요청` / `5.2 사용자 확인 질문` sub-sections are removed; the validator fails reports that reintroduce them.
+6. **Recommended Next Steps** — prioritized actions. After Phase 7's follow-up spawner runs, append a row per newly created task-key (see "Phase 6 → Phase 7 execution sequence" above).
+7. **Follow-up Tasks** — auto-spawn-eligible table. Each row drives `okstra-spawn-followups.py`; see template §7 for the row schema.
 ### Writing Guidelines
@@ -280,7 +277,7 @@ Persistence steps that must be performed in Phase 7:
 - [ ] 5. **Update task-index.md**: Refresh human-readable summary
 - [ ] 6. **Generate final status file**: `runs/<task-type>/status/final-<task-type>-<seq>.status` (if necessary)
 - [ ] 7. **Save convergence state**: `runs/<task-type>/state/convergence-<task-type>-<seq>.json` (when convergence is enabled)
-- [ ] 8. **Spawn follow-up task stubs**: run `scripts/okstra-spawn-followups.py` against the final-report when the run's `task-type` ∈ {`implementation`, `final-verification`, `release-handoff`}, OR when Section 7 is non-empty for any other task-type. See "Phase 7 follow-up task spawner" above for the exact command and contract. The script is idempotent across reruns.
+- [ ] 8. **Spawn follow-up task stubs**: run `scripts/okstra-spawn-followups.py` against the final-report per the canonical spawn rule defined in "Phase 7 follow-up task spawner" above. Do not restate the trigger condition here — that section is the single source of truth. The script is idempotent across reruns.
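Reviewer sketch (not part of the package): the canonical spawn rule that checklist item 8 defers to reduces to a two-part predicate. The snippet below restates it, assuming the caller already knows how many rows Section 7 contains. `should_run_spawner` and `section7_row_count` are hypothetical names.

```python
# Task-types for which the spawner always runs, per the canonical spawn rule.
ALWAYS_SPAWN_TASK_TYPES = {"implementation", "final-verification", "release-handoff"}


def should_run_spawner(task_type: str, section7_row_count: int) -> bool:
    """Return True when the follow-up spawner should be invoked.

    The spawner runs for the listed task-types regardless of Section 7
    content (an empty section is a no-op), and for any other task-type
    only when Section 7 has at least one row.
    """
    return task_type in ALWAYS_SPAWN_TASK_TYPES or section7_row_count > 0
```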
 ### Response after Persistence

package/runtime/skills/okstra-run/SKILL.md CHANGED

@@ -18,7 +18,7 @@ Launch an okstra task — gather inputs interactively via the **wizard state mac
 ## When NOT to Use
-- User explicitly asks to spawn a new terminal / new claude — use `okstra-history` Step 4 (resume command) or instruct them to run `okstra.sh` in another terminal.
+- User explicitly asks to spawn a new terminal / new claude — use `okstra-history` Step 4 (resume command) or instruct them to run `okstra` in another terminal.
 - User wants status only — use `okstra-status`.
 - User wants past runs — use `okstra-history`.
@@ -48,29 +48,26 @@ Never invent additional questions. Never reorder. Never use `AskUserQuestion` fo
 ## Step 1: Verify okstra runtime + project setup
-```bash
-if command -v okstra >/dev/null 2>&1; then
-  okstra ensure-installed >/dev/null 2>&1 || { echo "FAIL: okstra ensure-installed failed" >&2; exit 1; }
-  eval "$(okstra paths --shell)"
-  export PYTHONPATH="$OKSTRA_PYTHONPATH"
-  okstra check-project --json || { echo "FAIL: this project has no okstra setup. Tell the user to run /okstra-setup first." >&2; exit 1; }
-else
-  npx -y okstra@latest ensure-installed >/dev/null 2>&1 || { echo "FAIL: okstra not installed; tell the user to run: npx okstra@latest install" >&2; exit 1; }
-  eval "$(npx -y okstra@latest paths --shell)"
-  export PYTHONPATH="$OKSTRA_PYTHONPATH"
-  npx -y okstra@latest check-project --json || { echo "FAIL: this project has no okstra setup. Tell the user to run /okstra-setup first." >&2; exit 1; }
-fi
-```
+Run each of the following commands as a **separate Bash tool call**. Each command starts with the literal token `okstra` so the `Bash(okstra:*)` permission match succeeds. Do **not** wrap any of them in `if`, `eval`, `export`, `$(...)`, `VAR=...`, `||`, or `&&` — those leading tokens defeat the permission match and force a confirmation prompt on every call. The LLM (you) inspects each command's JSON output and decides what to do next in natural language — never in shell.
-The `check-project --json` output goes to stdout; read it from the tool result. If its `ok` field is `false`, ask the user with a **plain text prompt** for an absolute project-root path; rerun `okstra check-project --cwd <path> --json`. Re-prompt with plain text on failure.
+1. `okstra ensure-installed`
+   If this exits non-zero, tell the user: "okstra runtime missing — run `npx okstra@latest install` once and retry." Then stop.
-Parse `projectRoot` and `projectId` from that JSON output.
+2. `okstra paths --json`
+   Read `pythonPath` and other path fields from the JSON output. **Do not** `export PYTHONPATH=...` from this skill — every subsequent `okstra <subcmd>` call already self-bootstraps its Python path. Keep the parsed values in mind only as diagnostic context.
-## Step 2: Initialize the wizard
+3. `okstra check-project --json`
+   Reads the project from the current working directory. Parse the JSON from stdout. If `ok` is `false`, ask the user with a **plain text prompt** for an absolute project-root path, then rerun as a separate tool call (still literal-token-first):
+   `okstra check-project --cwd /abs/path/from/user --json`
+   Substitute the literal absolute path the user gave you (no `$(pwd)`, no shell variables). Re-prompt with plain text on continued failure.
+Parse `projectRoot` and `projectId` from the successful `check-project --json` output and carry them as literal strings into Step 2.
-> **Permission-friendly invocation rule**: every `okstra wizard ...` / `okstra render-bundle ...` call below MUST start with the literal token `okstra` and use literal argument values copied from prior tool outputs. Do **not** introduce shell variables (`$STATE_FILE`, `$ANSWER`, `$projectRoot`, ...), `$(...)` command substitution, or leading assignments — they break the `Bash(okstra:*)` permission match and force a confirmation prompt on every call.
+> If the `okstra` binary is not on `PATH` at all, none of the commands above will run. In that case tell the user verbatim: "okstra not installed — run `npx okstra@latest install` once, then retry this skill." Do **not** try to invoke `npx -y okstra@latest ...` from this skill — `npx` is not on the literal-token allow-list and will force a confirmation prompt on every wizard call afterward.
-First, generate a state-file path:
+## Step 2: Initialize the wizard
+First, generate a state-file path (Bash invocation rule from the top of this file applies to every command below):
 ```bash
 okstra wizard new-state-file
@@ -101,6 +98,14 @@ Repeat until `next.kind == "done"`:
    okstra wizard step --state-file /var/folders/.../okstra-wizard.AbCd.json --answer preprod
    ```
    If the answer contains spaces or shell metacharacters, wrap it in double quotes around the literal string only — never inside `"$VAR"`.
+   **MANDATORY: empty answers must pass `--answer ""` explicitly.** If the user's reply is the empty string, the call MUST still include the flag with an empty literal value:
+   ```bash
+   okstra wizard step --state-file /var/folders/.../okstra-wizard.AbCd.json --answer ""
+   ```
+   Omitting `--answer` entirely is forbidden. The wizard interprets a missing `--answer` flag as "re-emit the current prompt" (a `get-current-prompt` style no-op), not as "submit empty" — so dropping the flag will loop the same prompt forever. Submitting `--answer ""` is the only way to advance past an intentionally-blank step (e.g. "use phase default").
+   **Escaping rule**: if the literal answer contains `"`, escape each occurrence as `\"` inside the double-quoted argument. Empty values must still be `--answer ""` — the flag itself is mandatory, even when the value is empty.
 3. **Handle result**:
    - `ok: true` → echo `result.echo` to the user on one short line, then loop with `result.next`.
    - `ok: false` → show `result.error` to the user verbatim, then loop with `result.current` (re-prompt the same step).
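Reviewer sketch (not part of the package): the empty-answer and escaping rules added in this hunk can be illustrated with a small formatter. It assumes only the one escape the skill text names (`"` becomes `\"`); other characters that are special inside shell double quotes (`$`, backtick, backslash) are not covered by that rule and are left alone here. `render_answer_flag` is a hypothetical name.

```python
def render_answer_flag(answer: str) -> str:
    """Render the --answer flag as a double-quoted literal.

    The flag is always emitted, even for the empty string, because the
    skill text says a missing --answer re-emits the current prompt.
    """
    escaped = answer.replace('"', '\\"')  # the only escape the skill documents
    return f'--answer "{escaped}"'
```

With this reading, an empty reply renders as `--answer ""`, which is the form the skill requires for advancing past an intentionally blank step.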
@@ -126,8 +131,6 @@ When `next.step == "confirm"`, before relaying the picker, fetch the human-reada
 okstra wizard confirmation --state-file /var/folders/.../okstra-wizard.AbCd.json
 ```
-(Substitute the literal state-file path captured in Step 2 — no `$STATE_FILE`.)
 Output: `{ok: true, text: "선택 확인:\n  task-type     : ...\n  ..."}`. Print `text` to the user, then render the `confirm` picker (Proceed / Edit).
 ## Step 5: Render the task bundle
@@ -138,10 +141,12 @@ When `next.kind == "done"`, fetch the final args:
 okstra wizard render-args --state-file /var/folders/.../okstra-wizard.AbCd.json
 ```
-(Again: literal state-file path, no `$STATE_FILE`.)
 Output: `{ok: true, args: {"project-root": "...", "task-type": "...", ...}}`. Build the `okstra render-bundle` invocation from `args`, passing each key as `--<key>` and the value verbatim (including empty strings — they are intentional `use phase default` markers).
+**Empty-value rule (same as Step 3)**: every flag whose value is the empty string MUST still be passed explicitly as `--<key> ""`. For example: `--workers ""`, `--directive ""`, `--related-tasks ""`. Omitting the flag is forbidden — `render-bundle` distinguishes "flag absent" from "flag present with empty value", and the wizard's intent is always the latter.
+**Escaping rule**: inside a double-quoted value, escape any literal `"` as `\"`. Do not collapse `--key ""` into `--key` even when the value is empty.
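The empty-value rule can be illustrated by mapping the `args` object to an argument list. This is a rough sketch, not how the skill itself invokes the command (the skill writes a literal Bash command line): the function name and the sample values are assumptions made for the example.

```python
def build_render_bundle_argv(args: dict[str, str]) -> list[str]:
    """Map the `args` object from `okstra wizard render-args` to an argv list."""
    argv = ["okstra", "render-bundle"]
    for key, value in args.items():
        # Every key becomes a flag. An empty string is kept as its own
        # argv element, so "flag present with empty value" is preserved.
        argv.extend([f"--{key}", value])
    return argv


# Hypothetical sample values, chosen only to show empty strings surviving.
example_args = {"project-root": "/path/to/project", "workers": "", "directive": ""}
print(build_render_bundle_argv(example_args))
```

In an argv list no quote escaping is needed; the `\"` rule only matters once the values are written out as a double-quoted shell command.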
 ```bash
 okstra render-bundle \
   --project-root "<args.project-root>" \
@@ -173,13 +178,7 @@ You can delete the literal state-file path after this point — its job is done.
 ## Step 6: Take over as Claude lead
-Read these files (do not paraphrase) and enter `Claude lead` mode:
-1. `<INSTRUCTION_SET_DIR>/claude-execution-prompt.md` — the lead prompt
-2. `<INSTRUCTION_SET_DIR>/analysis-profile.md` — per-task-type allowed outputs / forbidden actions
-3. `<INSTRUCTION_SET_DIR>/analysis-material.md` — task brief + directive
-4. `<INSTRUCTION_SET_DIR>/reference-expectations.md`
-5. `<INSTRUCTION_SET_DIR>/final-report-template.md`
+Read `<INSTRUCTION_SET_DIR>/claude-execution-prompt.md` verbatim and enter `Claude lead` mode. The lead prompt itself enumerates every other instruction-set file to load (`analysis-profile.md`, `analysis-material.md`, `reference-expectations.md`, `final-report-template.md`, the run manifest, the team-state artifact, etc.). Follow its order; do not preempt it.
 Then proceed through the phases exactly as the lead prompt directs (Phase 1 context → Phase 2+ worker dispatch → final synthesis → final report).
@@ -197,15 +196,17 @@ okstra config set pr-template-path "<path>" --scope project
 okstra config set pr-template-path "<path>" --scope global
 ```
-The scope is exposed via `wizard render-args` only as the `pr-template-path` value (1-shot override); the persist hint lives in the wizard state. Read it with:
+The scope is held in the wizard state but is not yet exposed by any `okstra wizard` subcommand. Until the subcommand below ships, read the JSON state file directly with the `Read` tool (literal path captured in Step 2) and inspect the `pr_template_scope` field — it is a plain serialized `WizardState`. Do not shell out (`python3 -c`, `jq`, etc.); the literal-token Bash rule rejects them.
+## Out-of-scope backlog
-Read the JSON state file directly with the `Read` tool (literal path captured in Step 2) and inspect the `pr_template_scope` field — it is a plain serialized `WizardState`. Avoid `python3 -c "...$STATE_FILE"` style commands; they trip Bash static analysis.
+- **`okstra wizard pr-template-scope --state-file PATH`**: add a thin subcommand to `scripts/okstra_ctl/wizard.py` that prints `{ok: true, scope: "once" | "project" | "global"}` so this skill can drop the `Read`-the-raw-state-file detour. The subcommand should reuse the existing `load_state_file` path; no schema changes required.
 ## Concurrency
 - `prepare_task_bundle` serializes per-task via `~/.okstra/.locks/<task-key>.lock`. Concurrent skill invocations on the same task wait; different tasks proceed in parallel.
-- Each wizard run owns its own `$STATE_FILE`; two parallel skill invocations do not collide.
-- The skill must NOT call `okstra.sh` or any other bash entrypoint that would re-implement the orchestration. The wizard + `render-bundle` is the single authority.
+- Each wizard run owns its own state file (one per `okstra wizard new-state-file`); two parallel skill invocations do not collide.
+- The skill must NOT call `okstra.sh` (or any other bash entrypoint) that would re-implement the orchestration. The wizard + `render-bundle` is the single authority.
 ## Failure Modes

package/runtime/skills/okstra-schedule/SKILL.md CHANGED

@@ -10,7 +10,7 @@ model: opus
 Generate a consolidated work schedule for all non-done tasks in a given `task-group`. The skill reads each task's `task-manifest.json` and `latestReport`, classifies tasks into phases by priority and risk, and writes a single Markdown plan file under `.project-docs/okstra/tasks/<task-group>/schedule/`.
-The default mode is lightweight (single Claude lead synthesis). The `--cross-verify` option triggers the full okstra multi-agent flow on the schedule itself.
+The skill runs as a single Claude lead synthesis (lightweight mode). A `--cross-verify` multi-agent variant was previously sketched here but never specified end-to-end; it has been dropped pre-1.0 and is tracked as a follow-up if multi-agent schedule verification is needed later.
 ## When to Use
@@ -30,12 +30,12 @@ The default mode is lightweight (single Claude lead synthesis). The `--cross-ver
 **Explicit command**:
 - `okstra schedule <task-group>`
-- `okstra schedule <task-group> --cross-verify`
 - `okstra schedule <task-group> --title "<custom title>"`
+- `okstra schedule <task-group> --directive-file <abs-path>` (optional; see "Directive override" below)
 If `--title` is omitted, derive a default title from `task-group` (e.g. `uploadFont` → `uploadFont — Work Schedule`).
-## Step 0: Verify okstra runtime + project setup
+## Preflight: Verify okstra runtime + project setup
 Run before anything else in this skill:
@@ -81,17 +81,19 @@ This skill performs cross-task synthesis (multi-task classification, dependency
 ### Step 1: Resolve task-group and collect tasks
 1. Read `.project-docs/okstra/discovery/task-catalog.json`.
-2. Filter entries where `taskGroupPathSegment == <task-group lowercased>` OR `taskGroup == <task-group>` (case-insensitive match).
+2. **Normalise the user-supplied `<task-group>` argument:** lowercase it, then strip every character that is not `[a-z0-9]` (drop spaces, hyphens, underscores, dots, etc.). Apply the same transform to each entry's `taskGroupPathSegment`. Match when the two normalised forms are equal. This is the single comparison rule — do NOT also fall back to the raw `taskGroup` field.
 3. If no tasks found, output `해당 task-group을 찾을 수 없습니다.` and stop.
 4. For each matched task, read `.project-docs/okstra/tasks/<task-group-segment>/<task-id-segment>/task-manifest.json` directly. Catalog data may be stale; the manifest is authoritative.
 5. **Derive `<project-id>`** for the schedule header: prefer `task-catalog.json`'s top-level `projectId` field if present, otherwise use the first matched manifest's `projectId` field. Do not invent a value.
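The normalisation rule in item 2 can be sketched as follows. The function names are hypothetical, and catalog entries are assumed to be plain dictionaries carrying a `taskGroupPathSegment` key.

```python
import re


def normalise_task_group(value: str) -> str:
    """Lowercase, then drop every character outside [a-z0-9]."""
    return re.sub(r"[^a-z0-9]", "", value.lower())


def match_tasks(task_group: str, entries: list[dict]) -> list[dict]:
    """Keep catalog entries whose normalised path segment equals the argument."""
    wanted = normalise_task_group(task_group)
    return [
        entry
        for entry in entries
        # Only taskGroupPathSegment is compared; the raw taskGroup field is ignored.
        if normalise_task_group(entry.get("taskGroupPathSegment", "")) == wanted
    ]


print(normalise_task_group("Upload-Font"))  # -> uploadfont
```

Applying the same transform to both sides is what lets `Upload-Font`, `upload_font`, and `uploadFont` all resolve to the same group.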
 ### Step 2: Filter by workStatus
-For each task-manifest:
-- If `workStatus` field is missing or empty → treat as `in-progress` (include).
-- If `workStatus == "done"` → exclude.
-- Otherwise (`todo` / `in-progress` / `blocked`) → include.
+Since the `feat(scripts/render): project workStatus fields into task-catalog entries` change (commit `c44c36b`), `workStatus` is also surfaced directly inside each catalog entry — Step 1 still re-reads each `task-manifest.json` because the manifest is authoritative, but no extra fetch is needed beyond that.
+For inference rules when `workStatus` is missing or empty, **defer to the inference table in `skills/okstra-status/SKILL.md` ("Step 4 → Inferring workStatus")** instead of duplicating it here. Apply that table to derive a working value, then filter:
+- If the resolved value is `done` (set by the user OR inferred via `currentStatus == "completed"` + terminal next-phase) → exclude.
+- Otherwise (`todo` / `in-progress` / `blocked` / `phase-done` / explicit user value / inferred non-done) → include.
 If after filtering 0 tasks remain, output:
 ```
@@ -133,20 +135,35 @@ If the report file does not exist or cannot be parsed:
 If parsing fails for a specific section:
 - Mark the entry with `[PARSE-ERROR: <section>]` and continue.
-### Step 4: Mode branching
+### Step 4: Synthesis
-- **Default (lightweight)**: Claude lead synthesizes the schedule directly using collected data. Proceed to Step 5.
-- **`--cross-verify`**: Invoke the parent `okstra` skill flow with the schedule synthesis itself as the analysis target. The schedule generation becomes the cross-verified deliverable.
+Claude lead synthesises the schedule directly using collected data. Proceed to Step 5.
 ### Step 5: Phase classification heuristic
-Classify each task into one of three phases based on report data:
+Classify each task into one of three phases based on report data.
+**`workCategory` accepted values** (de-facto enum, from `scripts/okstra_ctl/worktree.py::_WORK_CATEGORY_PREFIX` plus the `unknown` fallback emitted by `render.py`):
+| workCategory | Default phase |
+|--------------|---------------|
+| `bugfix` | Phase 1 (when risk is High/Med-High); otherwise Phase 2 |
+| `feature` | Phase 2 |
+| `improvement` | Phase 2 |
+| `refactor` | Phase 3 |
+| `ops` | Phase 3 |
+| `docs` / `doc` | Phase 2 |
+| `unknown` (or unmatched / missing) | **Phase 2**, with a one-line rationale `> _workCategory '<raw-value>' 미정의 — Phase 2로 기본 분류._` at the top of that phase section |
+Priority overrides category:
+- **Phase 1 (안정성/Critical)**: priority `P0`, OR `workCategory == "bugfix"` with High / Med-High risk
+- **Phase 2 (개선/기능)**: priority `P1` or `P2`, OR `workCategory in {"feature", "improvement", "docs", "doc"}`, OR fallback per the table above
+- **Phase 3 (확장/아키텍처)**: priority `P3`, OR `workCategory in {"refactor", "ops"}`, OR multi-repo + infrastructure scope
-- **Phase 1 (안정성/Critical)**: priority `P0`, OR `workCategory == "bugfix"` with high risk
-- **Phase 2 (개선/기능)**: priority `P1` or `P2`, `workCategory in {"improvement", "feature"}`
-- **Phase 3 (확장/아키텍처)**: priority `P3`, OR multi-repo + infrastructure scope
+When the classification is genuinely ambiguous after applying the table + priority override, place the task in the closest phase and add a one-line rationale at the top of that phase section.
-When the classification is ambiguous, place the task in the closest phase and add a one-line rationale at the top of that phase section.
+> Note: enum codification (turning this de-facto list into a `validators/`-enforced contract) is out of scope for this skill — file as a follow-up if needed.
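One possible reading of the table plus the priority override is sketched below. Several points are assumptions rather than anything the skill specifies: priority is checked before category, risk labels are taken to be the strings `High` and `Med-High`, and the "multi-repo + infrastructure scope" condition and the rationale line are not modelled. The function and constant names are hypothetical.

```python
from typing import Optional

# Category defaults copied from the table; bugfix is handled separately.
CATEGORY_DEFAULT_PHASE = {
    "feature": 2,
    "improvement": 2,
    "docs": 2,
    "doc": 2,
    "refactor": 3,
    "ops": 3,
}
PRIORITY_PHASE = {"P0": 1, "P1": 2, "P2": 2, "P3": 3}
HIGH_RISK = {"High", "Med-High"}  # assumed spelling of the risk labels


def classify_phase(priority: Optional[str], work_category: Optional[str],
                   risk: Optional[str]) -> int:
    """Return 1, 2, or 3 for a task under the assumptions stated above."""
    # Assumption: a known priority decides the phase on its own.
    if priority in PRIORITY_PHASE:
        return PRIORITY_PHASE[priority]
    if work_category == "bugfix":
        return 1 if risk in HIGH_RISK else 2
    # Unknown, unmatched, or missing categories fall back to Phase 2.
    return CATEGORY_DEFAULT_PHASE.get(work_category, 2)


print(classify_phase(None, "bugfix", "High"))
print(classify_phase(None, "not-a-category", None))
```

A caller would still need to emit the one-line rationale whenever the Phase 2 fallback is taken for an unknown or missing `workCategory`.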
 ### Step 6: Write the schedule file
@@ -168,7 +185,7 @@ After writing the file, reply briefly:
 - 포함 task: N개
 - 제외(done) task: M개
 - 예상 소요: X.X ~ Y.Y days (Effort 합산)
-- 모드: lightweight | cross-verify
+- 모드: lightweight
 ```
 ## Audience (READ FIRST — drives everything below)
@@ -312,7 +329,17 @@ When skipping, follow the skip-reason note rule below — but the skip should be
 #### Directive override (highest priority)
-Before applying the heuristics above, **check `instruction-set/analysis-material.md` for a `## Directive` section** (sourced from the `--directive` CLI flag). If that section is present and expresses any directive that affects Gantt rendering — e.g. "render a Gantt even with single XL task", "no Gantt needed" — that directive **overrides the heuristic and the skip-reason rule** for the affected section. When following a Directive override that contradicts the default heuristic, append a one-line note inside the rendered (or skipped) section: `> _Per Directive directive: <verbatim short excerpt>._` so the reader can trace why the section appeared/disappeared against the default. The Directive may also pre-supply day allocations, phase weights, or sub-task decompositions — use those verbatim as bar lengths in the Gantt.
+Before applying the heuristics above, **check the schedule-level directive source** for a `## Directive` section.
+**Resolution order (first hit wins):**
+1. Absolute path passed via the `--directive-file <abs-path>` argument when invoking the skill.
+2. `<PROJECT_ROOT>/.project-docs/okstra/tasks/<task-group-segment>/schedule/instruction-set/analysis-material.md` (the canonical schedule-level instruction-set location — okstra writes here when a schedule run is dispatched with `--directive`).
+3. **No directive file** — fall through and apply the default heuristic without any override. This is the normal case; do not warn, do not block.
+If a directive source is found and contains a `## Directive` section that affects Gantt rendering — e.g. "render a Gantt even with single XL task", "no Gantt needed" — that directive **overrides the heuristic and the skip-reason rule** for the affected section. When following a Directive override that contradicts the default heuristic, append a one-line note inside the rendered (or skipped) section: `> _Per Directive directive: <verbatim short excerpt>._` so the reader can trace why the section appeared/disappeared against the default. The Directive may also pre-supply day allocations, phase weights, or sub-task decompositions — use those verbatim as bar lengths in the Gantt.
+A directive file that exists but contains no `## Directive` heading is treated as "no directive" — fall through to the heuristic, no warning required.
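The resolution order above can be sketched as a lookup that stops at the first file that exists. The function names are hypothetical, and two details are assumptions: a `--directive-file` path that does not exist falls through to the canonical location, and the `## Directive` section is taken to run until the next `## ` heading.

```python
from pathlib import Path
from typing import Optional


def extract_directive_section(text: str) -> Optional[str]:
    """Return the body under `## Directive`, or None if the heading is absent."""
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == "## Directive":
            body = []
            for following in lines[index + 1:]:
                if following.startswith("## "):
                    break  # the next level-2 heading ends the section
                body.append(following)
            return "\n".join(body).strip()
    return None


def resolve_directive(directive_file: Optional[str], project_root: str,
                      task_group_segment: str) -> Optional[str]:
    """Apply the resolution order: explicit file, canonical file, then nothing."""
    candidates = []
    if directive_file:
        candidates.append(Path(directive_file))
    candidates.append(
        Path(project_root) / ".project-docs" / "okstra" / "tasks"
        / task_group_segment / "schedule" / "instruction-set"
        / "analysis-material.md"
    )
    for path in candidates:
        if path.is_file():
            # First hit wins, even when that file has no `## Directive` heading.
            return extract_directive_section(path.read_text(encoding="utf-8"))
    return None  # the normal case: no directive, apply the default heuristic
```

A `None` result covers both "no file found" and "file without a `## Directive` heading", which matches the no-warning fall-through described above.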
 `## Gantt Chart` is the only `##` section that MAY be added beyond the 11 mandatory ones. Any other extra `##` heading is forbidden.
@@ -415,6 +442,10 @@ Required shapes:
 | Skipping `## Gantt Chart` because the data is "wide range / single task / part-day allocation pending" | Per the render-by-default rule, these are reasons to render an **estimate-tagged** chart, not to skip. Skip is only legitimate when day data literally does not exist (all-XXL or no effort sizing anywhere) |
 | Rendering Gantt as a mermaid (`` ```mermaid `` / `gantt` block) or any other graph DSL (PlantUML, Graphviz, etc.) | This skill renders ASCII only. Mermaid output is forbidden — use the ASCII Gantt format described above. Validators may reject mermaid blocks under this heading |
+## Known Validator Gaps (out of scope here)
+- The ambiguous-classification rationale required by Step 5 (`> _workCategory '<raw>' 미정의 …_` / nearest-phase rationale) is **not** currently enforced by `validators/validate-schedule.py`. The skill is responsible for emitting it; closing the validator gap requires a validator change and is tracked as a follow-up.
 ## Self-Validation Before Reporting Completion
 After writing the file and before printing the completion message in Step 7, you MUST:
@@ -596,13 +627,13 @@ Markdown bullet list of next concrete engineering actions, one item per line: `-
 | Case | Handling |
 |------|----------|
-| `workStatus` field absent or empty | Treat as `in-progress` → include in schedule |
+| `workStatus` field absent or empty | Resolve via the okstra-status inference table; include unless the resolved value is `done` |
 | Filtered task count is 0 | Emit "모든 task가 done" message; do NOT create a file |
 | `latestReportPath` missing or unreadable | Tag entry with `[NEEDS-OKSTRA-RUN]`; include manifest metadata only |
 | Report parsing fails for a specific section | Tag with `[PARSE-ERROR: <section>]`; continue with the rest |
 | `task-group` argument matches no tasks | Output "해당 task-group을 찾을 수 없습니다." and stop |
 | Catalog and manifest disagree on `workStatus` | Manifest wins (catalog may be stale) |
-| Multiple `task-group` casings | Match case-insensitively; use the manifest's `taskGroupPathSegment` for path output |
+| Multiple `task-group` casings / punctuation variants | Normalise both sides (lowercase + strip non-`[a-z0-9]`) then compare against `taskGroupPathSegment` only. Use the manifest's `taskGroupPathSegment` verbatim for path output. |
 ## Output Rules