npm - okstra - Versions diffs - 0.26.0 → 0.27.0 - Mend

okstra 0.26.0 → 0.27.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (43)

package/README.kr.md +15 -0
package/README.md +15 -0
package/docs/kr/architecture.md +2 -6
package/docs/kr/cli.md +40 -6
package/docs/kr/performance-improvement-plan-v2.md +23 -0
package/docs/kr/performance-improvement-plan.md +22 -0
package/package.json +1 -1
package/runtime/BUILD.json +2 -2
package/runtime/bin/okstra.sh +0 -1
package/runtime/prompts/profiles/_common-contract.md +25 -1
package/runtime/prompts/profiles/error-analysis.md +12 -0
package/runtime/prompts/profiles/implementation-planning.md +20 -0
package/runtime/prompts/profiles/requirements-discovery.md +20 -0
package/runtime/python/lib/okstra/cli.sh +1 -7
package/runtime/python/lib/okstra/globals.sh +0 -1
package/runtime/python/lib/okstra/usage.sh +1 -4
package/runtime/python/okstra_ctl/render.py +3 -0
package/runtime/python/okstra_ctl/run.py +0 -6
package/runtime/python/okstra_ctl/run_context.py +1 -1
package/runtime/python/okstra_ctl/wizard.py +25 -2
package/runtime/python/okstra_token_usage/blocks.py +5 -1
package/runtime/python/okstra_token_usage/claude.py +16 -1
package/runtime/python/okstra_token_usage/collect.py +17 -3
package/runtime/python/okstra_token_usage/pricing.py +159 -24
package/runtime/skills/okstra-brief/SKILL.md +532 -65
package/runtime/skills/okstra-context-loader/SKILL.md +25 -11
package/runtime/skills/okstra-convergence/SKILL.md +37 -13
package/runtime/skills/okstra-history/SKILL.md +68 -37
package/runtime/skills/okstra-logs/SKILL.md +26 -4
package/runtime/skills/okstra-report-finder/SKILL.md +49 -22
package/runtime/skills/okstra-report-writer/SKILL.md +59 -64
package/runtime/skills/okstra-run/SKILL.md +35 -34
package/runtime/skills/okstra-schedule/SKILL.md +51 -20
package/runtime/skills/okstra-setup/SKILL.md +31 -12
package/runtime/skills/okstra-status/SKILL.md +20 -8
package/runtime/skills/okstra-team-contract/SKILL.md +27 -15
package/runtime/skills/okstra-time-summary/SKILL.md +53 -16
package/runtime/templates/reports/settings.template.json +7 -4
package/runtime/validators/lib/fixtures.sh +10 -2
package/runtime/validators/lib/validate-assets.sh +50 -24
package/runtime/validators/validate-brief.py +385 -0
package/runtime/validators/validate-brief.sh +35 -0
package/runtime/validators/validate-workflow.sh +7 -33

package/runtime/skills/okstra-context-loader/SKILL.md CHANGED

@@ -12,7 +12,9 @@ user-invocable: false
 - When the user needs to know the okstra task bundle path
 - When you need to derive all artifact paths based on `task-manifest.json`
-## Step 1: Locating the Task Bundle
+## Step 1: Resolve the Task Bundle Path
+(Resolve which task-root path to use; Step 2 opens `task-manifest.json` at that path.)
 ### Default Location Rules
@@ -28,11 +30,12 @@ user-invocable: false
 3. If the user attempts to find a task based on `task-group` + `task-id` or `task-id`, `.project-docs/okstra/discovery/task-catalog.json` is read to find candidates.
 4. If multiple candidates are found based on `task-id` alone, the situation is ambiguous, so `task-group` or the full `taskKey` is required.
 5. If the user has not provided an explicit task key/path, first read the current-task convenience pointer, `.project-docs/okstra/discovery/latest-task.json`.
-6. Even if the latest-task pointer is missing or corrupted, but the task catalog exists, first check the list of prepared task bundles based on the catalog. Do not use the legacy `CLAUDE.md`, project guide, or task scan fallback.
+6. If the latest-task pointer is missing or corrupted but the task catalog exists, list candidates from the catalog. Do not use the legacy `CLAUDE.md`, project guide, or task scan fallback.
+7. If **neither** `latest-task.json` **nor** `task-catalog.json` exists, ABORT Phase 1 with `OKSTRA_CONTEXT_NOT_INITIALIZED`. Suggest the user run `/okstra-setup` and `/okstra-brief` to bootstrap the project. Do NOT crawl `.project-docs/okstra/tasks/` directly — discovery pointers are the only supported entry path.
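A minimal sketch of rules 5 to 7, assuming both discovery files are plain JSON and that the catalog keeps its entries under a `tasks` array. The function name, the exception class, and the returned dictionary shape are hypothetical illustrations, not part of okstra.

```python
import json
from pathlib import Path

DISCOVERY_DIR = Path(".project-docs/okstra/discovery")


class ContextNotInitialized(RuntimeError):
    """Carries the OKSTRA_CONTEXT_NOT_INITIALIZED abort code from rule 7."""


def load_json(path):
    """Return the parsed JSON document, or None if it is missing or corrupted."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None


def resolve_without_explicit_key(project_root):
    """Apply rules 5-7 when the user supplied no task key or path."""
    discovery = Path(project_root) / DISCOVERY_DIR
    latest_path = discovery / "latest-task.json"
    catalog_path = discovery / "task-catalog.json"

    # Rule 7: neither discovery file exists, so abort instead of crawling tasks/.
    if not latest_path.exists() and not catalog_path.exists():
        raise ContextNotInitialized("OKSTRA_CONTEXT_NOT_INITIALIZED")

    # Rule 5: prefer the current-task convenience pointer.
    latest = load_json(latest_path)
    if latest is not None:
        return {"source": "latest-task", "pointer": latest}

    # Rule 6: pointer missing or corrupted, so list candidates from the catalog.
    catalog = load_json(catalog_path)
    if catalog is not None:
        return {"source": "task-catalog", "candidates": catalog.get("tasks", [])}

    # Assumption (not stated in the rules): an unreadable pointer with no
    # usable catalog is treated as the same abort.
    raise ContextNotInitialized("OKSTRA_CONTEXT_NOT_INITIALIZED")
```

The sketch never touches `.project-docs/okstra/tasks/`, which mirrors the rule that discovery pointers are the only supported entry path.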
-## Step 2: Read task-manifest.json
+## Step 2: Open and Parse task-manifest.json
-task-manifest.json is the canonical metadata source. The following fields must be extracted from this file:
+`task-manifest.json` (found at the task-root resolved in Step 1) is the canonical metadata source. Extract the following fields:
 | Field | Description |
 |------|------|
@@ -53,7 +56,7 @@ task-manifest.json is the canonical metadata source. The following fields must b
 | `workflow.awaitingApproval` | Approval wait marker |
 | `workflow.routingStatus` | Routing decision status |
 | `workflow.lastSafeCheckpoint` | Safe resume checkpoint metadata |
-| `instructionSetPath` | Instruction-set path |
+| `instructionSetPath` | Path to the `instruction-set/` **directory** containing `analysis-profile.md`, `analysis-material.md`, `reference-expectations.md`, `task-brief.md`, `final-report-template.md` (see Step 4). Not a single-file path. |
 | `referenceExpectationsPath` | config/deployment expectation artifact path |
 | `latestRunPath` | latest run path |
 | `latestRunStatus` | latest run status |
@@ -63,7 +66,7 @@ task-manifest.json is the canonical metadata source. The following fields must b
 | `historyTimelinePath` | timeline path |
 | `resultContract` | team contract and expected artifact metadata |
 | `resultContract.requiredWorkerRoles[*].promptPath` | worker prompt history path by role |
-| `convergence` | convergence loop settings (enabled, maxRounds, verificationMode). The `maxRounds` default is phase-aware: `1` for `requirements-discovery`, `2` otherwise. See [okstra-convergence](../okstra-convergence/SKILL.md) for details |
+| `convergence` | convergence loop settings (`enabled`, `maxRounds`, `verificationMode`). See [okstra-convergence](../okstra-convergence/SKILL.md) for the authoritative defaults — do not re-document the `maxRounds` value here. |
 ## Step 3: Directory Structure Rules
@@ -121,12 +124,21 @@ After verifying `task-manifest.json`, read the instruction set in the following
 4. `instruction-set/task-brief.md` (task brief)
 5. `instruction-set/final-report-template.md` (report template)
+### Brief Reporter-Confirmation Precondition (BLOCKING)
+After reading `task-brief.md`, extract the frontmatter `reporter-confirmations` field (`complete | partial | pending | skipped`). This precondition is shared across every consuming phase — see `prompts/profiles/_common-contract.md` "Brief consumption" block for the authoritative handling matrix.
+- `complete` or `partial` → proceed to Step 5 and hand off to `okstra-team-contract`.
+- `skipped` → proceed, but flag the unmarked `intent-check:` / `conversion-block:` rows for promotion by the phase profile.
+- `pending` (or field missing) → emit `REPORTER_CONFIRMATION_PENDING` and STOP. Do not invoke `okstra-team-contract` or any analyser. The operator must rerun `okstra-brief` Step 6.5 before Phase 2 can start.
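The three branches above can be sketched as a small gate function. This assumes the brief's frontmatter has already been parsed into a dictionary; the function name and the returned keys are hypothetical, and treating an unrecognised value as blocking is an assumption the rules do not spell out.

```python
def brief_gate(frontmatter):
    """Map the `reporter-confirmations` value to a proceed/stop decision."""
    status = frontmatter.get("reporter-confirmations")

    if status in ("complete", "partial"):
        # Proceed to Step 5 and hand off to okstra-team-contract.
        return {"proceed": True, "flagUnmarkedRows": False, "code": None}

    if status == "skipped":
        # Proceed, but flag unmarked intent-check / conversion-block rows.
        return {"proceed": True, "flagUnmarkedRows": True, "code": None}

    # `pending` or a missing field stops here. Assumption: any other
    # unrecognised value is handled the same conservative way.
    return {
        "proceed": False,
        "flagUnmarkedRows": False,
        "code": "REPORTER_CONFIRMATION_PENDING",
    }
```

A caller would check `proceed` before invoking `okstra-team-contract` or any analyser.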
 ## Step 5: Read Run Manifest and Team State
-1. Current run manifest: The latest run manifest pointed to by the discovery pointer or task-manifest
-2. Current team state: The latest team-state pointed to by the discovery pointer or task-manifest
-3. Extract the worker prompt directory path and per-worker prompt history paths from the current run manifest and team-state
-4. If an existing run report is available, use it solely as historical context.
+1. Identify the active run by reading `runDateTimeSegment` from the latest `runs/<task-type>/manifests/run-manifest-*.json` (mtime order). That segment is the shared run identifier across all category subdirectories (`state/`, `prompts/`, `reports/`, `status/`, `sessions/`, `worker-results/`).
+2. Resolve sibling artifacts for this run by matching the same `runDateTimeSegment`. Do NOT re-scan `<seq>` counters per category — they may diverge if an earlier run only wrote some categories.
+3. Current team state: the team-state file whose `runDateTimeSegment` matches the active run manifest.
+4. Extract the worker prompt directory path and per-worker prompt history paths from the current run manifest and team-state.
+5. If an existing run report is available, use it solely as historical context.
 ## Output
@@ -137,4 +149,6 @@ Information to be obtained after executing this skill:
 - Reference list of config files/deployment manifests and task-level expected values
 - Current run status and presence of existing worker results
 - Current run prompt history contract for attempted workers
-- Resume command path: `runs/<task-type>/sessions/claude-resume-<task-type>-<seq>.sh`
+- Candidate `teamName` for Phase 3 hand-off: `okstra-<task-key>` (with task-key slugified per Step 1's slug rule)
+- Current Claude `lead.sessionId` (the in-flight Claude Code session) — required by `okstra-team-contract` when registering the lead in `team-state.json`
+- Resume command path: from `task-manifest.json` → `latestResumeCommandPath` (fallback: latest `runs/<task-type>/sessions/claude-resume-*.sh` by mtime). Never reconstruct the filename — the `<seq>` counter is category-local and may diverge from `manifests/`.
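A minimal sketch of the resume-path lookup described in the last bullet. It assumes `runs/` sits directly under the task root and that the recorded `latestResumeCommandPath` can be returned as-is; the function name and parameters are hypothetical.

```python
from pathlib import Path


def resolve_resume_command(task_root, manifest, task_type):
    """Return the resume script path, or None when nothing is available."""
    # Preferred source: the path recorded in task-manifest.json.
    recorded = manifest.get("latestResumeCommandPath")
    if recorded:
        return Path(recorded)

    # Fallback: newest claude-resume-*.sh by modification time.
    # The filename is discovered, never rebuilt from a <seq> counter.
    sessions = Path(task_root) / "runs" / task_type / "sessions"
    candidates = sorted(
        sessions.glob("claude-resume-*.sh"),
        key=lambda p: p.stat().st_mtime,
    )
    return candidates[-1] if candidates else None
```

Globbing instead of formatting a filename keeps the lookup independent of the category-local `<seq>` counter.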

package/runtime/skills/okstra-convergence/SKILL.md CHANGED

@@ -6,6 +6,23 @@ user-invocable: false
 # OKSTRA Convergence
+## Index
+- [Scope and Terminology (BLOCKING)](#scope-and-terminology-blocking)
+- [When to Use](#when-to-use)
+- [Configuration](#configuration)
+- [Finding Category](#finding-category)
+- [Convergence Algorithm](#convergence-algorithm)
+  - [Round 0: Parse worker results](#round-0-parse-worker-results)
+  - [Round 1-N: Re-verification Loop (queue-pruned)](#round-1-n-re-verification-loop-queue-pruned)
+  - [Convergence Test](#convergence-test)
+- [Verification Mode](#verification-mode)
+- [Re-verification Agent Dispatch](#re-verification-agent-dispatch)
+- [Convergence State Artifact](#convergence-state-artifact)
+- [Output](#output)
+- [Convergence Disabled](#convergence-disabled)
+- [Plan-body verification mode (implementation-planning only)](#plan-body-verification-mode-implementation-planning-only)
 ## Scope and Terminology (BLOCKING)
 This skill governs **Phase 5.5 (Convergence loop)** — a *lead operating phase* inside a single okstra run, not a task-type lifecycle phase. The 6 task-type lifecycle phases (`requirements-discovery` → `error-analysis` → `implementation-planning` → `implementation` → `final-verification` → `release-handoff`, see [okstra/SKILL.md](../../SKILL.md) "Lifecycle Phase Boundaries") are unchanged by this skill. The lead operating phases (Phase 1 Intake → Phase 7 Persist, see [okstra/SKILL.md](../../SKILL.md) "Quick Reference") describe how the lead drives a *single* task-type run.
@@ -30,6 +47,8 @@ Configure this in the `convergence` block of `task-manifest.json`. If the block
 | `maxRounds` | phase-aware: `1` for `requirements-discovery`, `2` otherwise (range 1–3) | Maximum number of re-verification rounds. Discovery's routing/missing-input outputs gain little from a second round; other phases (especially `error-analysis`) keep `2`. Lead resolves the effective value when the manifest omits the key and records it in `config.maxRounds` of the convergence state artifact. |
 | `verificationMode` | `"lightweight"` | `"lightweight"` or `"full-reanalysis"` |
+**Auto-disable rule (BLOCKING).** Convergence requires ≥2 analyser workers to produce a meaningful consensus tally. When the active profile's `Required workers:` block (see `prompts/profiles/*.md`) resolves to fewer than 2 analyser workers, e.g. `release-handoff` (zero analyser workers, lead-only), the lead MUST treat `convergence.enabled` as `false` for that run regardless of manifest configuration, skip Phase 5.5 and the plan-body verification round, and record `finalState: "converged"` with `totalRounds: 0` and an explanatory note in `config` (e.g. `"autoDisabled": "fewer-than-two-analysers"`). The plan-body round inherits the same rule via its `gating=false` advisory path.
 ## Finding Category
 | Category | Definition | Included in Report |
@@ -37,10 +56,12 @@ Configure this in the `convergence` block of `task-manifest.json`. If the block
 | `full-consensus` | All participating workers agree | Required |
 | `partial-consensus` | Majority of workers agree; dissenting opinions are recorded | Required |
 | `contested` | Final classification only. Assigned to a finding that remains in the verification queue after the **last executed round** completes (round index = `effectiveMaxRounds`). Each worker's position across all executed rounds is recorded. NEVER used as an intermediate label. | Required |
-| `worker-unique` | Only the discoverer confirms; others oppose or remain unverified | Required |
+| `worker-unique` | Only the discoverer confirms and ALL other non-error votes are `DISAGREE`. `verification-error` votes are excluded from the tally per §"Worker failure handling in reverify"; a finding where every non-discoverer vote is `verification-error` is carried forward, never classified `worker-unique`. | Required |
 ## Convergence Algorithm
+**Majority definition (BLOCKING).** "Majority" means *strictly greater than half* of the non-error votes for that finding (`verification-error` votes are excluded from both numerator and denominator). Ties — including the 1-AGREE / 1-DISAGREE case in a two-analyser roster — are NOT a majority: in intermediate rounds the finding is **carried forward**; in the final executed round the finding is classified `contested`. This rule applies identically to the plan-body verification round (§"Plan-body verification mode") where the same verdict tokens are reused.
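The majority rule can be illustrated with a short sketch. It assumes votes arrive as lower-case verdict tokens; both function names are hypothetical and `carried-forward` is used here only as a label for "stays in the queue".

```python
def strict_majority(votes, token):
    """True when `token` holds strictly more than half of the non-error votes."""
    # verification-error votes leave both numerator and denominator.
    counted = [v for v in votes if v != "verification-error"]
    return 2 * counted.count(token) > len(counted)


def outcome_without_majority(is_final_round):
    """What happens to a finding that has no majority, ties included."""
    return "contested" if is_final_round else "carried-forward"
```

For a two-analyser roster, `strict_majority(["agree", "disagree"], "agree")` is false, so the finding is carried forward in an intermediate round and becomes `contested` in the final executed round.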
 ### Round 0: Parse worker results
 Read the worker result files generated in Phase 4/5 and extract individual findings.
@@ -132,7 +153,8 @@ The lead MUST construct the per-worker reverify prompt body from `items_for_W` o
 |---|---|---|
 | `effectiveMaxRounds >= 2` | true | `"max-rounds-1"` |
 | `len(queue) > 0` after round 1 | true | `"queue-empty"` |
-| At least one round-1 reverify dispatch terminated as `completed` | true | `"all-reverify-non-result"` |
+The third gate condition — "all reverify dispatches terminated as non-result" — is handled inline by the WHILE-loop body (see the `BREAK` on `all dispatches ... terminal non-result` and §"Worker failure handling in reverify" rule 4) which records `round2SkippedReason = "all-reverify-non-result"` and aborts before the predicate is re-evaluated. It is therefore not duplicated as a gate row here.
 When all conditions hold the predicate returns `true` and `round2SkippedReason` is set to `"not-skipped"`. The field is mandatory on every convergence state artifact — write `"not-skipped"` rather than omitting the key.
@@ -152,11 +174,7 @@ The final classifier (`FOR each finding F still in queue` block) treats `verific
 ### Convergence Test
-- If the verification queue is empty at the end of any round → Convergence complete (`finalState: "converged"`), remaining rounds are not executed
-- Upon completing the **last executed round** (where round index == `effectiveMaxRounds`, OR where Round 2 was suppressed per the Round 2 gate below) → Apply final classification to remaining queue items:
-  - Majority agreement across executed rounds → `partial-consensus`
-  - Otherwise → `contested`
-- The final classification step never runs while the queue is still being re-verified — confirmed items always exit the queue first.
+The exit conditions and final-classification rules are defined by the §"Convergence Algorithm" pseudocode (the `WHILE` exit, the post-loop `FOR each finding F still in queue` block, and the `finalState` mapping in §"Convergence State Artifact"). This is the single source — no separate prose copy is maintained here to prevent drift.
 ## Verification Mode
@@ -383,9 +401,11 @@ Schema rules:
 - `schemaVersion`: literal string `"1.1"` for new runs. Readers MUST accept `"1.0"` for historical artifacts and treat any missing v1.1 field as `null`.
 - `config.effectiveMaxRounds`: the integer the lead actually used after resolving the phase-aware default (`1` for `requirements-discovery`, `2` otherwise). MUST equal `config.maxRounds` when the manifest explicitly set it.
 - `findings[].ticketIds`: array of ticket keys from Phase 4 grouping (parsed per the Round 0 step 5 rule). MAY be empty when the discovering worker tagged the finding `unknown`.
+- `findings[].rounds[].votes.<worker>.verdict`: enum, one of `agree | disagree | supplement | verification-error`. Lower-case tokens; map upper-case AGREE/DISAGREE/SUPPLEMENT verdicts emitted by workers to their lower-case form before persisting. `verification-error` is reserved for terminal non-result dispatches (§"Worker failure handling in reverify").
+- `findings[].classification`: enum, one of `full-consensus | partial-consensus | worker-unique | contested`. No other value is permitted in v1.1.
 - `roundHistory[].inputQueueSize`: queue size at the start of this round.
 - `roundHistory[].resolvedCount`: number of findings that exited the queue this round (sum of full+partial+worker-unique classifications produced this round).
-- `roundHistory[].carriedForwardCount`: queue size at the END of this round (must equal `inputQueueSize - resolvedCount` when there are no in-round queue insertions; in-round insertions are forbidden).
+- `roundHistory[].carriedForwardCount`: queue size at the END of this round — the single definition. In-round insertions into the queue are forbidden, so this always equals `inputQueueSize - resolvedCount`. The pseudocode's per-item `carriedForwardCount += 1` accumulator is a counting convenience that lands on the same value; persist the post-round queue length, not the loop accumulator, if the two ever diverge.
 - `roundHistory[].dispatches[]`: one entry per worker that was actually dispatched in this round. Each entry is `{worker, status, durationMs}`. `status ∈ {completed, timeout, error, not-run}`. `durationMs` is integer milliseconds and is always present, even for terminal-non-result dispatches (use the elapsed time before the wrapper gave up).
 - `roundHistory[].skippedWorkers[]`: per-worker `{worker, reason}` for workers with no items to verify OR with a non-result dispatch.
 - `roundHistory[].verificationsRequested|verificationsCompleted|newConsensus|remainingInQueue|earlyExit`: legacy v1.0 aliases. New runs SHOULD populate them so existing parsers keep working: `verificationsRequested == len(dispatches)`, `verificationsCompleted == len(d for d in dispatches if d.status == "completed")`, `newConsensus == resolvedCount`, `remainingInQueue == carriedForwardCount`, `earlyExit == (round < effectiveMaxRounds AND carriedForwardCount == 0)`.
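The alias equalities above can be checked mechanically. The sketch below assumes one `roundHistory` entry is available as a dictionary with the field names listed in the schema rules; the round index is passed separately because its field name is not shown here, and the function name is hypothetical.

```python
def legacy_aliases(entry, round_index, effective_max_rounds):
    """Derive the legacy v1.0 alias fields for one roundHistory entry."""
    carried = entry["carriedForwardCount"]

    # In-round queue insertions are forbidden, so this must always hold.
    if carried != entry["inputQueueSize"] - entry["resolvedCount"]:
        raise ValueError("carriedForwardCount != inputQueueSize - resolvedCount")

    dispatches = entry["dispatches"]
    return {
        "verificationsRequested": len(dispatches),
        "verificationsCompleted": sum(
            1 for d in dispatches if d["status"] == "completed"
        ),
        "newConsensus": entry["resolvedCount"],
        "remainingInQueue": carried,
        "earlyExit": round_index < effective_max_rounds and carried == 0,
    }
```

Raising on a broken queue invariant keeps a writer from persisting a loop accumulator that has drifted from the post-round queue length.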
@@ -433,7 +453,7 @@ The finding queue (Phase 5.5) and the plan-item queue (this section) are **disjo
 - A finding-convergence reverify prompt MUST NOT contain any `P-*` item.
 - A plan-body verification prompt MUST NOT contain any `F-*` finding.
-- The two rounds write to **different state files**: `runs/<task-type>/state/convergence-state.json` (findings) vs. `runs/<task-type>/state/plan-body-verification.json` (plan items).
+- The two rounds write to **different state files**: `runs/<task-type>/state/convergence-<task-type>-<seq>.json` (findings, see §"Convergence State Artifact") vs. `runs/<task-type>/state/plan-body-verification-<task-type>-<seq>.json` (plan items, see §"`plan-body-verification.json` schema").
 - Aggregation logic (verdict counting, classification) MUST NOT carry votes from one queue into the other.
 Mixing the two queues — for example, parsing a Phase 6 draft's Stepwise Execution Order step as if it were an `F-*` finding — is a contract violation. Future Claude reading this skill: if you find yourself tempted to "just reuse the finding queue for plan items, they're similar enough", stop. They are not similar enough; the verdict semantics differ (see §"Plan-body verdict semantics" below).
@@ -493,15 +513,15 @@ Plan-body verification only supports **lightweight mode** (defined in §"Verific
 4. After all dispatches return, lead aggregates verdicts per `P-*` item across workers and classifies each:
    - `full-consensus` — all participating analysers `AGREE` (SUPPLEMENT counts as agree on the item itself).
    - `partial-consensus` — majority `AGREE`, dissenting `DISAGREE` recorded.
-   - `worker-unique` — only one worker `DISAGREE`s, others `AGREE` — treat as `partial-consensus` for gate purposes; record dissent.
+   - `dissent-isolated` — only one worker `DISAGREE`s, others `AGREE` — treat as `partial-consensus` for gate purposes; record dissent. (Distinct from finding-convergence `worker-unique`, which means the *opposite*: only one worker AGREEs. Plan-body classifications use this dedicated label to avoid the collision.)
    - `majority-disagree` — majority of analysers `DISAGREE` on this item. This is the only classification that **blocks the Approval marker**.
    - `contested`: only meaningful when `maxRounds > 1`; at the default `maxRounds=1`, fold any unresolved item into `partial-consensus`.
 5. Gate result resolution:
    - any `majority-disagree` item present AND `gating=true` → `blocked-by-disagreement`
    - all dispatches non-result → `aborted-non-result`
-   - any `partial-consensus` / `worker-unique` present, no `majority-disagree` → `passed-with-dissent`
+   - any `partial-consensus` / `dissent-isolated` present, no `majority-disagree` → `passed-with-dissent`
    - all items `full-consensus` → `passed`
-6. Lead writes `runs/<task-type>/state/plan-body-verification.json` (schema below) and populates `### 4.5.9 Plan Body Verification` in the final report (template at `templates/reports/final-report.template.md`).
+6. Lead writes `runs/<task-type>/state/plan-body-verification-<task-type>-<seq>.json` (schema below) and populates `### 4.5.9 Plan Body Verification` in the final report (template at `templates/reports/final-report.template.md`).
 7. For every `majority-disagree` item, lead adds a row to `## 5. Clarification Items` with:
    - new `C-<N>` ID (numbering continues from any existing rows)
    - `Statement` summarising the disagreement and the worker breakage `<kind>`
@@ -510,7 +530,7 @@ Plan-body verification only supports **lightweight mode** (defined in §"Verific
    - the §4.5.9 verdict table's `Classification` column for that row reads `majority-disagree → C-<N>` (1:1 ID match — orphan on either side is a contract violation per `prompts/profiles/implementation-planning.md` self-review step 6).
 8. The top-of-report `- [ ] Approved` marker line is rendered if and only if the Gate result is `passed` or `passed-with-dissent`. `validators/validate-run.py` `validate_phase_boundary` enforces this correspondence; manually adding the marker line when the gate did not pass is a contract violation.
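Steps 5 and 8 can be sketched as two small functions, using the new-side `dissent-isolated` label. The function names are hypothetical, and combinations the rules do not cover (for example a `majority-disagree` item with `gating=false`) deliberately return `None` rather than a guessed value.

```python
def gate_result(classifications, gating, all_dispatches_non_result):
    """Resolve the plan-body gate result from per-item classifications."""
    present = set(classifications)

    if "majority-disagree" in present and gating:
        return "blocked-by-disagreement"
    if all_dispatches_non_result:
        return "aborted-non-result"
    if "majority-disagree" not in present and present & {
        "partial-consensus",
        "dissent-isolated",
    }:
        return "passed-with-dissent"
    if present == {"full-consensus"}:
        return "passed"

    # Not specified by the rules above; left unresolved in this sketch.
    return None


def approval_marker_rendered(result):
    """Step 8: the `- [ ] Approved` line appears only for passing gates."""
    return result in ("passed", "passed-with-dissent")
```

Checking the blocking case first follows the order in which the gate rules are listed.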
-### `plan-body-verification.json` schema
+### `plan-body-verification-<task-type>-<seq>.json` schema
 ```json
 {
@@ -547,6 +567,10 @@ Plan-body verification only supports **lightweight mode** (defined in §"Verific
 `dispatches[].terminalStatus` mirrors finding convergence (`completed | timeout | error | not-run | cli-failure`).
+`planItems[].classification` enum: `full-consensus | partial-consensus | dissent-isolated | majority-disagree | contested`. `contested` only appears when `maxRounds > 1`; at default `maxRounds=1` any otherwise-unresolved item folds into `partial-consensus` per the round protocol above.
+`planItems[].votes.<worker>` is the verbatim verdict token emitted by the worker — `AGREE | DISAGREE(<a|b|c|d|e>) | SUPPLEMENT` — or `verification-error` for terminal non-result dispatches. The `DISAGREE` token retains its `<kind>` suffix so the breakage class is recoverable from the state file alone.
 ### Plan-body reverify prompt
 Required prompt anchor headers are identical to finding convergence (see §"Required reverify-prompt anchor headers"). The prompt body changes from F-* listing to P-* listing:

package/runtime/skills/okstra-history/SKILL.md CHANGED

@@ -7,9 +7,22 @@ description: Use when the user asks to list past okstra runs, check execution hi
 ## When to Use
-- When a user views the history of past okstra executions
-- When re-running or resuming a previous execution
-- When checking the execution status of each task
+- List past okstra task executions (one row per task, with latest-run summary).
+- Drill into the per-run history of a specific task.
+- Build a new run from an old run's parameters (re-run), or continue an in-flight one (resume).
+This skill is for **listing / re-dispatching**. For reading a single final report by task-key, use `okstra-report-finder` instead.
+## Re-run vs Resume — decide upfront
+Before invoking Step 3 or Step 4, classify the user's intent. The two paths are NOT interchangeable.
+| Intent | Trigger phrases | Use |
+|---|---|---|
+| **Re-run** — start a fresh run (new run-seq, new manifest, new report) reusing an old run's parameters | "re-run", "다시 실행", "another pass", "rerun with same brief" | Step 3 |
+| **Resume** — continue an interrupted Claude session for an existing run, no new run-seq | "resume", "continue", "이어서", "session ended" | Step 4 |
+If the user's intent is ambiguous, ask. Defaulting to the wrong one either wastes a fresh run-seq or silently abandons a recoverable session.
 ## Step 0: Verify okstra runtime + project setup
@@ -38,31 +51,45 @@ use `projectRoot` to locate `.project-docs/okstra/discovery/task-catalog.json`.
 ## Step 1: Read the Task Catalog
 1. Read `.project-docs/okstra/discovery/task-catalog.json`.
-2. Sort the `tasks` array in reverse order by `updatedAt` and display it.
-3. Extract the following fields from each task:
+2. Apply filters from user input (all optional, AND-combined):
+   - `--task-type <type>` → keep entries whose `taskType` matches.
+   - `--latest-run-status <status>` → keep entries whose `latestRunStatus` matches (e.g. `completed`, `contract-violated`, `error`).
+   - `--task-group <group>` → keep entries whose `taskGroup` matches.
+3. Sort the surviving `tasks` array by `updatedAt` descending.
+4. Page: default `--limit 20`. After printing the table, if rows were truncated, add `... <N> more (pass --limit <N> to see all)`.
+5. Extract the following fields from each task:
 | Field | Description |
 |------|------|
-| `taskKey` | Task identifier (`<project-id>:<task-group>:<task-id>`) |
-| `taskType` | Analysis type |
-| `currentStatus` | Task-level status |
-| `latestRunStatus` | Latest run status |
-| `updatedAt` | Last update time |
+| `taskKey` | Task identifier — always 3 colon-separated segments: `<project-id>:<task-group>:<task-id>` (see `parse_task_key` in `okstra_project/state.py`). |
+| `taskType` | `requirements-discovery` / `error-analysis` / `implementation-planning` / `implementation` / `final-verification` / `release-handoff` |
+| `currentStatus` | Task lifecycle status written by the contract validator. Values: `todo` (seeded by spawn-followups), `completed`, `contract-violated`. Empty string = validator has not yet run. NOT the same as the user-managed `workStatus` (managed by `okstra-status`). |
+| `latestRunStatus` | Status of the most recent run (`completed`, `contract-violated`, `error`, ...) |
+| `latestRunManifestPath` | Run-manifest path of the most recent run — feed this into Step 3 to re-run from the latest parameters |
+| `updatedAt` | Last update time (ISO 8601) |
 | `latestReportPath` | Latest report path |
-| `latestResumeCommandPath` | Resume command path |
+| `latestResumeCommandPath` | Resume command path (Step 4) |
+| `historyTimelinePath` | `<task-root>/history/timeline.json` (Step 2 reads from here) |
-4. Output format:
+6. Output format:
 ```markdown
 ## okstra Task History — <project-id>
-| # | Task Key | Type | Status | Last Run | Report |
-|---|----------|------|--------|----------|--------|
-| 1 | proj:group:id | error-analysis | completed | 2026-04-05 22:59 | .project-docs/.../final-report-*.md |
-| 2 | proj:group:id2 | final-verification | prepared | 2026-04-04 15:30 | -- |
+| # | Task Key | Type | currentStatus | latestRunStatus | Last Run | Report |
+|---|----------|------|---------------|------------------|----------|--------|
+| 1 | proj:group:id | error-analysis | completed | completed | 2026-04-05 22:59 | .project-docs/.../final-report-*.md |
+| 2 | proj:group:id2 | final-verification | todo | error | 2026-04-04 15:30 | -- |
 ```
-5. If `task-catalog.json` is missing, it responds with "There is no okstra execution history. Please run okstra.sh first."
+### Catalog absent — fallback
+If `.project-docs/okstra/discovery/task-catalog.json` does not exist, do NOT bail out. The catalog is a derived index — manifests on disk are the source of truth.
+1. Glob `<projectRoot>/.project-docs/okstra/tasks/*/*/task-manifest.json`.
+2. For each manifest, read `taskKey`, `taskGroup`, `taskType`, `currentStatus`, `latestRunStatus`, `updatedAt`, `latestReportPath`, `latestResumeCommandPath`, `latestRunManifestPath`, `historyTimelinePath`.
+3. Apply the same filters/sort/limit and print the same table, prefixed with: `note: task-catalog.json missing; reconstructed from task manifests on disk.`
+4. Only if the glob yields zero manifests: respond `There is no okstra execution history yet.`
 ## Step 2: Run History by Task
@@ -78,9 +105,10 @@ When a user selects a specific task or requests detailed history:
 | `runDateTimeSegment` | YYYY-MM-DD_HH-MM-SS |
 | `taskType` | `--task-type` argument value |
 | `status` | Run status |
-| `reportPath` | Report Path |
-| `resumeCommandPath` | resume Command path |
-| `relatedTasks` | List of related tasks |
+| `runManifestPath` | This run's `run-manifest-*.json` — feed into Step 3 to re-run from this specific run's parameters |
+| `reportPath` | Final report path |
+| `resumeCommandPath` | Resume Claude session for this run (Step 4) |
+| `relatedTasks` | List of related task-keys |
 4. Output format:
@@ -93,23 +121,26 @@ When a user selects a specific task or requests detailed history:
 | 2 | 2026-04-04 15:30 | error-analysis | error | -- |
 ```
-## Step 3: Create a re-execution command
+## Step 3: Re-run (build a NEW run from an old run's parameters)
-To re-run a specific run:
+This builds a **fresh run** — new run-seq, new manifest, new report — using the parameters captured in a previous `run-manifest-*.json`. It does NOT touch the old run's artifacts; use Step 4 if the user wants to continue an interrupted session instead.
-1. Read the run-manifest JSON from the `runManifestPath` of that run.
-2. Extract the required arguments:
+1. Pick the source run-manifest: the `runManifestPath` from a Step 2 timeline entry, or the task's `latestRunManifestPath` from Step 1.
+2. Read the run-manifest JSON and extract required arguments:
    - `projectId` → `--project-id`
    - `taskGroup` → `--task-group`
    - `taskId` → `--task-id`
    - `taskType` → `--task-type`
    - `taskBriefPath` → `--task-brief`
-3. Extract the optional arguments:
+3. Extract optional arguments (include only when present in the source manifest):
    - `recommendedWorkers` → `--workers` (comma-separated)
-   - `relatedTasks` → `--related-tasks` (if present)
-   - model overrides → `--claude-model`, `--codex-model`, `--gemini-model` (if different from default)
-   - for `taskType: implementation`: `teamContract.executor.provider` → `--executor <claude|codex|gemini>` (if different from `claude`)
-4. Display the assembled command:
+   - `relatedTasks` → `--related-tasks`
+   - model overrides → `--claude-model`, `--codex-model`, `--gemini-model` (when different from default)
+   - for `taskType: implementation`: `teamContract.executor.provider` → `--executor <claude|codex|gemini>` (when different from `claude`)
+4. **`taskType: implementation` only — resolve `--base-ref`:** the base ref is NOT stored in the run-manifest; it lives in the worktree registry at `~/.okstra/worktrees/registry.json` against the registered branch. Before assembling the command:
+   - If a worktree for this task-key is already registered, the existing branch & base are reused — omit `--base-ref` unless the user explicitly wants a different starting point.
+   - If no worktree is registered (e.g. it was cleaned up), `--base-ref` is mandatory. Ask the user for the ref to branch from (e.g. `main`, a commit SHA, a tag) before running.
+5. Display the assembled command:
 ```bash
 okstra.sh \
@@ -121,23 +152,23 @@ okstra.sh \
   --workers <worker-list>
 ```
-5. Once the user confirms, execute it using the Bash tool.
+6. Once the user confirms, execute it using the Bash tool.
-## Step 4: Resume
+## Step 4: Resume (continue an interrupted run)
-To resume a paused session:
+This continues an existing Claude session for a run that did not finish. It does NOT create a new run-seq — for a fresh dispatch, use Step 3.
-1. Check `latestResumeCommandPath` in the task catalog or timeline.
-2. Verify that the resume script file actually exists.
-3. If it exists, display the execution command:
+1. Read `latestResumeCommandPath` from the task catalog (Step 1) — or `resumeCommandPath` from a specific timeline entry (Step 2).
+2. Verify the file exists on disk.
+3. If present:
    ```bash
    bash <resume-command-path>
    ```
-4. If it does not exist, display the message: "No resume script found. Please run it again."
+4. If absent, report: `No resume script available for this run. Use Step 3 to start a fresh run instead.`
 ## Output Rules
 - Display concisely in a table format
 - Dates in `YYYY-MM-DD HH:MM` format
-- Display status as-is (`completed`, `prepared`, `error`, `not-run`, etc.)
+- Display status fields as-is from disk (`completed`, `contract-violated`, `todo`, `error`, empty, ...). Do not normalize or remap.
 - Display `--` if no report is available
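
The field-to-flag mapping in Step 3 of this skill can be sketched in Python as follows. This is a minimal illustration and not okstra's own code: the helper name `build_rerun_argv` is hypothetical, the comma-joined form of `--related-tasks` is an assumption (the skill only states it for `--workers`), and model overrides are left out because the diff does not name their manifest fields.

```python
from typing import List, Optional

# Required run-manifest fields and the okstra.sh flags they map to (from Step 3).
REQUIRED_FLAGS = [
    ("projectId", "--project-id"),
    ("taskGroup", "--task-group"),
    ("taskId", "--task-id"),
    ("taskType", "--task-type"),
    ("taskBriefPath", "--task-brief"),
]


def _csv(value) -> str:
    """Join a list into a comma-separated string; pass plain strings through."""
    if isinstance(value, (list, tuple)):
        return ",".join(str(item) for item in value)
    return str(value)


def build_rerun_argv(manifest: dict, base_ref: Optional[str] = None) -> List[str]:
    """Hypothetical helper: assemble an okstra.sh argv from a run-manifest dict."""
    argv = ["okstra.sh"]
    for field, flag in REQUIRED_FLAGS:
        if not manifest.get(field):
            raise ValueError(f"run-manifest is missing required field: {field}")
        argv += [flag, str(manifest[field])]

    # Optional arguments are included only when present in the source manifest.
    if manifest.get("recommendedWorkers"):
        argv += ["--workers", _csv(manifest["recommendedWorkers"])]
    if manifest.get("relatedTasks"):
        # Assumption: --related-tasks accepts a comma-separated value.
        argv += ["--related-tasks", _csv(manifest["relatedTasks"])]

    if manifest["taskType"] == "implementation":
        executor = (manifest.get("teamContract") or {}).get("executor") or {}
        provider = executor.get("provider")
        if provider and provider != "claude":
            argv += ["--executor", provider]
        # The base ref is not stored in the manifest; the caller supplies it
        # only when no worktree is registered or the user wants a new base.
        if base_ref:
            argv += ["--base-ref", base_ref]
    return argv
```

Returning a list rather than a single string keeps quoting out of the picture; the caller can render it with `shlex.join(argv)` for display before asking the user to confirm.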

package/runtime/skills/okstra-logs/SKILL.md CHANGED

@@ -36,6 +36,9 @@ multi-MB logs; analysis-phase dispatches are typically smaller.
 ## Step 0: Verify okstra runtime + project setup
+Before any other step, ensure both the okstra runtime and the current
+project's okstra metadata are in place:
 ```bash
 if command -v okstra >/dev/null 2>&1; then
   OKSTRA_CMD="okstra"
@@ -46,6 +49,8 @@ $OKSTRA_CMD ensure-installed >/dev/null 2>&1 || {
   echo "FAIL: okstra not installed; tell the user to run: npx okstra@latest install" >&2
   exit 1
 }
+eval "$($OKSTRA_CMD paths --shell)"
+export PYTHONPATH="$OKSTRA_PYTHONPATH"
 OKSTRA_PROJECT_INFO="$($OKSTRA_CMD check-project --json)" || {
   echo "FAIL: this project has no okstra setup. Tell the user to run /okstra-setup first." >&2
   echo "$OKSTRA_PROJECT_INFO" >&2
@@ -68,7 +73,7 @@ LOGS_ROOT="$PROJECT_ROOT/.project-docs/okstra/tasks"
 # columns: size_bytes | mtime_epoch | path
 find "$LOGS_ROOT" -type f -path '*/runs/*/prompts/*.log' \
   -printf '%s\t%T@\t%p\n' 2>/dev/null \
-  | sort -k2,2nr
+  | sort -k1,1nr
 ```
 On macOS, `find -printf` is unavailable. Fall back to `stat`:
@@ -78,7 +83,7 @@ find "$LOGS_ROOT" -type f -path '*/runs/*/prompts/*.log' 2>/dev/null \
   | while IFS= read -r p; do
       stat -f '%z%t%m%t%N' "$p"
     done \
-  | sort -k2,2nr
+  | sort -k1,1nr
 ```
 If the result is empty, report `No wrapper log files found under <PROJECT_ROOT>` and exit.
@@ -117,27 +122,40 @@ Total: N files, X.X MB across M tasks under <PROJECT_ROOT>
 ## Step 3: Suggested cleanup commands
 Emit a fenced bash block the user can copy-paste. Do NOT execute these.
+Each block pairs a dry-run preview (`-print`) with the destructive
+(`-delete`) command so the user can confirm the match set before
+committing.
 ```markdown
 ## Cleanup options (manual)
 # Delete only logs older than 7 days
+find <PROJECT_ROOT>/.project-docs/okstra/tasks \
+  -type f -path '*/runs/*/prompts/*.log' -mtime +7 -print    # dry-run
 find <PROJECT_ROOT>/.project-docs/okstra/tasks \
   -type f -path '*/runs/*/prompts/*.log' -mtime +7 -delete
 # Delete only logs older than 30 days
+find <PROJECT_ROOT>/.project-docs/okstra/tasks \
+  -type f -path '*/runs/*/prompts/*.log' -mtime +30 -print   # dry-run
 find <PROJECT_ROOT>/.project-docs/okstra/tasks \
   -type f -path '*/runs/*/prompts/*.log' -mtime +30 -delete
 # Delete all logs for a specific task-group (e.g. dev-9388)
+find <PROJECT_ROOT>/.project-docs/okstra/tasks/dev-9388 \
+  -type f -name '*.log' -print    # dry-run
 find <PROJECT_ROOT>/.project-docs/okstra/tasks/dev-9388 \
   -type f -name '*.log' -delete
 # Delete all logs for a specific task-id (e.g. dev-9428)
+find <PROJECT_ROOT>/.project-docs/okstra/tasks/*/dev-9428 \
+  -type f -name '*.log' -print    # dry-run
 find <PROJECT_ROOT>/.project-docs/okstra/tasks/*/dev-9428 \
   -type f -name '*.log' -delete
 # Delete everything (caution)
+find <PROJECT_ROOT>/.project-docs/okstra/tasks \
+  -type f -path '*/runs/*/prompts/*.log' -print    # dry-run
 find <PROJECT_ROOT>/.project-docs/okstra/tasks \
   -type f -path '*/runs/*/prompts/*.log' -delete
 ```
@@ -152,10 +170,14 @@ End the response with these short reminders:
 - Logs are truncated on each re-dispatch of the same `seq`, so deleting an
   in-flight run's log will cause the wrapper to recreate an empty file on
   the next dispatch — no data loss beyond the current trace.
+- **If a dispatch is currently running, check `okstra status` first** and
+  avoid deleting logs for tasks in `in-progress` state — you will lose
+  the live trace for the active run.
 - Prompt history files (`.md`) are separate and are NOT touched by these
   commands — only `.log` sidecars.
-- This skill does not modify `.gitignore`. If the project commits
-  `.project-docs/okstra/`, the user may want to add
+- This skill **does not modify any external files itself**, including
+  `.gitignore`. If the project commits `.project-docs/okstra/`, the user
+  may want to add
   `.project-docs/okstra/tasks/**/runs/**/prompts/*.log` to `.gitignore`
   manually to keep large logs out of git.
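
The size-sorted log listing in this skill needs two variants because `find -printf` is GNU-only and macOS falls back to `stat -f`. A single portable version could look like the Python sketch below. It is an illustration under assumptions, not part of the skill: the function name `list_wrapper_logs` is hypothetical, and the glob is copied from the `find -path` argument shown in the hunks above.

```python
from fnmatch import fnmatchcase
from pathlib import Path

# Same glob the skill passes to `find -path`; `*` may span directory separators.
LOG_PATTERN = "*/runs/*/prompts/*.log"


def list_wrapper_logs(logs_root):
    """Return (size_bytes, mtime_epoch, path) rows, largest file first."""
    rows = []
    for path in Path(logs_root).rglob("*.log"):
        if not path.is_file():
            continue
        # Keep only logs that sit under a runs/<...>/prompts/ directory.
        if not fnmatchcase(path.as_posix(), LOG_PATTERN):
            continue
        st = path.stat()
        rows.append((st.st_size, st.st_mtime, str(path)))
    # Mirrors `sort -k1,1nr`: numeric sort on the size column, descending.
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows
```

The rows keep the same column order as the shell output (`size_bytes | mtime_epoch | path`), so any downstream table formatting would not need to change.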

package/runtime/skills/okstra-report-finder/SKILL.md CHANGED

@@ -40,54 +40,81 @@ OKSTRA_PROJECT_INFO="$($OKSTRA_CMD check-project --json)" || {
 task-key format: `<project-id>:<task-group>:<task-id>`
+**Normalization**: task-key matching is performed in lowercase. The on-disk group/id segments are slugified (lowercase + non-alphanumeric runs → `-`) (`scripts/lib/okstra/interactive.sh:147`). Catalog lookup is therefore case-insensitive, but file paths must be built from the slugified segments.
 ### Method A: task-catalog.json (fast)
 1. Read `.project-docs/okstra/discovery/task-catalog.json`.
-2. Find the entry in the `tasks` array whose `taskKey` matches.
-3. The `latestReportPath` field is the path to the latest report.
+2. Match `taskKey` in the `tasks` array using a lowercase comparison.
+3. The `latestReportPath` field is the "most recent report" regardless of task-type.
 ### Method B: task-manifest.json (direct)
-When the task-catalog is missing or out of date:
+When the task-catalog is missing:
+1. Extract the task-group and task-id from the task-key and slugify them.
+2. Read `.project-docs/okstra/tasks/<group-segment>/<id-segment>/task-manifest.json`.
+3. Use the `latestReportPath` field (latest, regardless of task-type).
+### Method C: timeline.json (report for a specific run)
-1. Extract the task-group and task-id from the task-key.
-2. Read `.project-docs/okstra/tasks/<task-group>/<task-id>/task-manifest.json`.
-3. The `latestReportPath` field is the path to the latest report.
+When the report for a specific date or run is needed:
-### Method C: timeline.json (report for a specific run)
+1. Read `.project-docs/okstra/tasks/<group-segment>/<id-segment>/history/timeline.json`.
+2. Filter the `runs[]` array. The actual fields are:
+   - `runs[].runTimestamp` (ISO-8601 start time)
+   - `runs[].status` (final status of the run)
+   - `runs[].taskType` (phase name, e.g. `requirements-discovery`)
+   - `runs[].reportPath` (relative path to that run's report)
-When the report for a specific date or run is needed:
+### Method D: latest report for a specific task-type (fallback)
-1. Read `.project-docs/okstra/tasks/<task-group>/<task-id>/history/timeline.json`.
-2. Find the desired run in the `runs` array (filter by date, status, etc.).
-3. The `reportPath` field is the path to that run's report.
+`latestReportPath` (Methods A/B) does not distinguish between task-types. When the user asks for the latest report of a *specific* task-type (e.g. `implementation-planning`):
+1. Directory: `.project-docs/okstra/tasks/<group-segment>/<id-segment>/runs/<task-type-segment>/reports/`
+2. Filename pattern: `final-report-<task-type-segment>-<NNN>.md` (`scripts/okstra_ctl/sequence.py:31`)
+3. The file with the highest seq is the latest report for that task-type.
+4. This can be cross-checked against `runs[].taskType == <task-type>` in timeline.json.
 ## Step 2: Verify the report exists
-1. Check that a file actually exists at the path found.
+1. Check that `latestReportPath` is non-empty and that the file at that path actually exists.
+   - If either of the two signals indicates that a report exists, display the path (tolerant).
 2. If it exists, display the path and ask the user whether to read it.
-3. If it does not exist:
-   - Check `currentStatus` in `task-manifest.json`.
-   - If the status is not `completed`: "This task has not completed yet (status: `<status>`)."
-   - If only the file is missing: "The report file does not exist: `<path>`"
+3. If it does not exist, check the following signals in `task-manifest.json`:
+   - If `latestReportPath` is empty or missing, and
+   - `currentStatus` is not `completed`, `workStatus` is not `done`, and `workflow.routingStatus` does not indicate completion either
+     → "This task has not completed yet (currentStatus: `<currentStatus>`, workStatus: `<workStatus>`)."
+   - If any of the signals above indicates completion but only the file is missing
+     → "The report file does not exist: `<path>`"
+The workflow status enums are `todo | in-progress | blocked | done` (for `workStatus`) and values such as `completed` / `contract-violated` (for `currentStatus`). The string `"completed"` does not exist in `workStatus`, so do not confuse the two fields.
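 ## Step 3: Read the report and suggest follow-up actions
 After reading the report, tell the user which follow-up actions are available:
 1. **Proceed with implementation**: modify the code based on the report's "Recommended next steps" (`권장 다음 단계`) section
-2. **Additional verification**: start a new okstra run with the same task-key (generate the re-run command with the `okstra-history` skill)
-3. **Check related tasks**: if the report references related tasks, look up their reports as well
+2. **Additional verification**: start a new okstra run with the same task-key. Concrete command: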
+   ```bash
+   scripts/okstra.sh --task-key <task-key> --task-type <task-type>
+   ```
+   If richer options (base-ref, workers, render-only, etc.) are needed, use the `okstra-history` skill.
+3. **Check related tasks**: if the report references related tasks, look up their reports as well
 ## Output
 ```markdown
 ## Report for <task-key>
-- Status: `<status>`
-- Report path: `<relative-path>`
-- Run date: `<YYYY-MM-DD HH:MM>`
-- Task type: `<task-type>`
+| Field        | Value                                              |
+| ------------ | -------------------------------------------------- |
+| Status       | `<status>`                                         |
+| Task type    | `<task-type>`                                      |
+| Run seq      | `<NNN>`                                            |
+| Run date     | `<runTimestamp ISO-8601>`                          |
+| Report (rel) | `<relative-path-from-project-root>`                |
+| Report (abs) | `<absolute-path>`                                  |
 ```
 Then read the report and carry out follow-up work as the user requests.
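
The task-type-specific lookup (Method D) combines the slugify rule with the `final-report-<task-type-segment>-<NNN>.md` filename pattern. The Python sketch below illustrates it. Both function names are hypothetical, `slugify` only approximates the rule quoted in the Normalization note (the real implementation in `interactive.sh` is not shown in this diff, so edge cases such as leading or trailing hyphens may differ), and applying the same slugify to the task-type segment is an assumption.

```python
import re
from pathlib import Path
from typing import Optional


def slugify(segment: str) -> str:
    """Approximation of the stated rule: lowercase, non-alphanumeric runs -> '-'."""
    return re.sub(r"[^a-z0-9]+", "-", segment.lower())


def latest_report_for_type(project_root, task_key: str, task_type: str) -> Optional[Path]:
    """Return the highest-seq final report for one task-type, or None if absent."""
    # task-key format: <project-id>:<task-group>:<task-id>
    _project_id, group, task_id = task_key.split(":", 2)
    type_seg = slugify(task_type)  # assumption: same slug rule as group/id
    reports_dir = (
        Path(project_root) / ".project-docs" / "okstra" / "tasks"
        / slugify(group) / slugify(task_id) / "runs" / type_seg / "reports"
    )
    if not reports_dir.is_dir():
        return None

    pattern = re.compile(rf"^final-report-{re.escape(type_seg)}-(\d+)\.md$")
    best_seq, best_path = -1, None
    for entry in reports_dir.iterdir():
        match = pattern.match(entry.name)
        # Compare seq numerically so that 010 outranks 002.
        if match and int(match.group(1)) > best_seq:
            best_seq, best_path = int(match.group(1)), entry
    return best_path
```

Comparing the sequence as an integer rather than as a string avoids depending on the zero-padding width staying fixed.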