npm - okstra - Versions diffs - 0.50.0 → 0.51.0 - Mend

okstra 0.50.0 → 0.51.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (57)

package/README.kr.md +8 -7
package/README.md +8 -7
package/bin/okstra +2 -0
package/docs/kr/architecture.md +15 -16
package/docs/kr/cli.md +5 -5
package/docs/project-structure-overview.md +10 -6
package/package.json +1 -1
package/runtime/BUILD.json +2 -2
package/runtime/agents/SKILL.md +15 -11
package/runtime/agents/workers/claude-worker.md +3 -3
package/runtime/agents/workers/codex-worker.md +2 -2
package/runtime/agents/workers/gemini-worker.md +2 -2
package/runtime/bin/lib/okstra/cli.sh +8 -1
package/runtime/bin/lib/okstra/globals.sh +3 -0
package/runtime/bin/lib/okstra/interactive.sh +14 -12
package/runtime/bin/lib/okstra/usage.sh +6 -0
package/runtime/bin/okstra-team-reconcile.sh +28 -0
package/runtime/bin/okstra.sh +2 -0
package/runtime/prompts/launch.template.md +3 -1
package/runtime/prompts/profiles/_common-contract.md +4 -4
package/runtime/prompts/profiles/_implementation-executor.md +2 -2
package/runtime/prompts/profiles/_implementation-verifier.md +1 -1
package/runtime/prompts/profiles/implementation-planning.md +1 -0
package/runtime/prompts/profiles/implementation.md +1 -1
package/runtime/python/okstra_ctl/analysis_packet.py +259 -0
package/runtime/python/okstra_ctl/context_cost.py +308 -0
package/runtime/python/okstra_ctl/migrate.py +2 -12
package/runtime/python/okstra_ctl/paths.py +22 -0
package/runtime/python/okstra_ctl/render.py +284 -125
package/runtime/python/okstra_ctl/render_final_report.py +31 -0
package/runtime/python/okstra_ctl/run.py +507 -245
package/runtime/python/okstra_ctl/sequence.py +2 -5
package/runtime/python/okstra_ctl/team_reconcile.py +131 -0
package/runtime/python/okstra_ctl/wizard.py +129 -133
package/runtime/python/okstra_ctl/worktree.py +13 -5
package/runtime/schemas/final-report-v1.0.schema.json +4 -0
package/runtime/skills/okstra-coding-preflight/SKILL.md +69 -0
package/runtime/skills/okstra-coding-preflight/architecture/hexagonal.md +116 -0
package/runtime/skills/okstra-coding-preflight/clean-code.md +254 -0
package/runtime/skills/okstra-coding-preflight/languages/java.md +64 -0
package/runtime/skills/okstra-coding-preflight/languages/javascript-typescript.md +87 -0
package/runtime/skills/okstra-coding-preflight/languages/kotlin.md +69 -0
package/runtime/skills/okstra-coding-preflight/languages/nodejs.md +66 -0
package/runtime/skills/okstra-coding-preflight/languages/python.md +179 -0
package/runtime/skills/okstra-coding-preflight/languages/rust.md +105 -0
package/runtime/skills/okstra-coding-preflight/languages/sql.md +68 -0
package/runtime/skills/okstra-context-loader/SKILL.md +12 -6
package/runtime/skills/okstra-inspect/SKILL.md +100 -1
package/runtime/skills/okstra-report-writer/SKILL.md +5 -1
package/runtime/skills/okstra-run/SKILL.md +1 -1
package/runtime/skills/okstra-team-contract/SKILL.md +7 -4
package/runtime/templates/reports/final-report.template.md +1 -0
package/runtime/templates/worker-prompt-preamble.md +3 -3
package/src/_python-helper.mjs +3 -3
package/src/context-cost.mjs +27 -0
package/src/install.mjs +1 -0
package/src/uninstall.mjs +1 -0

package/runtime/skills/okstra-coding-preflight/languages/python.md ADDED

@@ -0,0 +1,179 @@
+# Python Conventions
+If a project-level Python style skill or repo-local `CONTRIBUTING.md` / `pyproject.toml` config exists, **it wins**. This file is the fallback.
+## Core principles
+- Write simple, explicit, readable Python. Prefer boring, maintainable code over clever abstractions.
+- Optimize for correctness first, then clarity, then performance.
+- Keep functions small and single-responsibility. Prefer composition over inheritance.
+- Type-hint all public functions, class methods, and non-trivial internal functions.
+- Use dataclasses or Pydantic models for structured data — never loose `dict`s for domain data.
+- Make side effects explicit. Fail fast with clear errors. No hidden global state.
+- Code must be testable without network, filesystem, database, or time dependencies.
+## Python version
+Target **Python 3.11+** unless told otherwise. Use modern syntax:
+- `str | None`, not `Optional[str]`
+- `list[str]` / `dict[str, int]`, not `List` / `Dict`
+- `match` only when it improves clarity
+- `pathlib.Path`, not string paths
+## Style guide
+PEP 8, enforced by `ruff format` (or `black`).
+- Indent: **4 spaces**. Line length: **88** (ruff/black default) unless the project sets otherwise.
+- `snake_case` functions/variables, `PascalCase` classes, `SCREAMING_SNAKE_CASE` constants, leading `_` for private.
+## Function design
+- One thing per function. Avoid bodies over ~40 lines without a strong reason.
+- Explicit parameters over reading globals. Return values instead of mutating arguments.
+- Avoid behavior-changing boolean flags — split into separate functions.
+- Do not hide I/O inside functions that look pure. Do not catch broad exceptions unless re-raising with useful context.
+Bad:
+```python
+def process(data, save=True, notify=False):
+    ...
+```
+Better:
+```python
+def build_invoice(data: InvoiceInput) -> Invoice: ...
+def save_invoice(invoice: Invoice) -> None: ...
+def notify_invoice_created(invoice: Invoice) -> None: ...
+```
+## Typing
+- Type hints everywhere meaningful. Avoid `Any` unless unavoidable; never silence a type error without a one-line why.
+- No untyped dicts for domain data — use `dataclass`, `Enum`, `TypedDict`, or Pydantic.
+- `Protocol` for dependency inversion.
+Bad:
+```python
+def create_user(data: dict): ...
+```
+Better:
+```python
+@dataclass(frozen=True)
+class CreateUserCommand:
+    email: str
+    name: str
+def create_user(command: CreateUserCommand) -> User: ...
+```
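Reviewer note: the Typing bullets name `Protocol` for dependency inversion without showing it. A minimal sketch follows, assuming hypothetical names (`User`, `UserRepository`, `InMemoryUserRepository`, `is_email_taken`) that do not come from the package.

```python
from __future__ import annotations  # keeps `X | None` hints inert on older interpreters

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class User:
    id: int
    email: str


class UserRepository(Protocol):
    """Port the application layer depends on (hypothetical name)."""

    def find_by_email(self, email: str) -> User | None: ...


class InMemoryUserRepository:
    """Adapter that satisfies the Protocol structurally, without inheriting from it."""

    def __init__(self, users: list[User]) -> None:
        self._by_email = {user.email: user for user in users}

    def find_by_email(self, email: str) -> User | None:
        return self._by_email.get(email)


def is_email_taken(repo: UserRepository, email: str) -> bool:
    # Depends only on the Protocol, so any conforming adapter can be injected.
    return repo.find_by_email(email) is not None
```

Because the match is structural, a database-backed adapter could replace the in-memory one without touching `is_email_taken`.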
+## Error handling
+- Specific exception types with useful context. Never swallow exceptions silently.
+- Do not use exceptions for normal control flow.
+- Convert infrastructure errors into application/domain errors at boundaries.
+- Never expose internal stack traces or secrets to users.
+Bad:
+```python
+try:
+    send_email(user)
+except Exception:
+    pass
+```
+Better:
+```python
+try:
+    send_email(user)
+except EmailProviderError as exc:
+    raise NotificationFailedError(f"Failed to notify user_id={user.id}") from exc
+```
+## Async
+- Use async only for real I/O concurrency. Never call sync-blocking I/O inside an async function.
+- Always set timeouts on external calls. No fire-and-forget tasks without explicit lifecycle + error handling.
+Bad:
+```python
+async def fetch():
+    requests.get(url)          # blocking call inside async
+```
+Better:
+```python
+async def fetch(client: httpx.AsyncClient, url: str) -> Response:
+    return await client.get(url, timeout=10)
+```
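Reviewer note: the Async bullets also ask for timeouts on external calls and for tasks with an explicit lifecycle. The sketch below uses only the standard library; `fetch_with_timeout` and its sleeping `external_call` are hypothetical stand-ins for a real network call.

```python
import asyncio


async def fetch_with_timeout(delay_s: float, timeout_s: float) -> str:
    """Stand-in for an external call that answers after `delay_s` seconds."""

    async def external_call() -> str:
        await asyncio.sleep(delay_s)
        return "ok"

    # wait_for cancels the inner call and raises a timeout error on expiry.
    return await asyncio.wait_for(external_call(), timeout=timeout_s)


async def main() -> list[str]:
    results: list[str] = []
    # Keep the task handle and await it, so failures surface instead of vanishing.
    task = asyncio.create_task(fetch_with_timeout(0.01, timeout_s=1.0))
    results.append(await task)
    try:
        await fetch_with_timeout(1.0, timeout_s=0.01)
    except asyncio.TimeoutError:
        results.append("timed out")
    return results
```

Running it with `asyncio.run(main())` exercises both the successful path and the timeout path.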
+## Project / module layout
+Separate domain logic from infrastructure; keep business rules independent of frameworks, DBs, queues, and external APIs.
+- `domain/` — entities, value objects, pure business logic
+- `application/` — use cases, orchestration, commands, queries
+- `infrastructure/` — DB, external APIs, filesystem, queues
+- `interfaces/` — HTTP, CLI, workers, event handlers
+- `tests/` — unit, integration, e2e
+When the project is ports-and-adapters, also read [../architecture/hexagonal.md](../architecture/hexagonal.md).
+## Security
+- Never hardcode secrets — read from env vars or a secret manager.
+- Parameterized SQL only; never build SQL via string interpolation / f-strings.
+- Validate all external input. Treat file paths, URLs, headers, and serialized input as untrusted.
+- No unsafe deserialization (arbitrary `pickle.load`).
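Reviewer note: the "parameterized SQL only" rule can be illustrated with the standard-library `sqlite3` driver. This is a sketch under assumptions: the `app_user` table and `find_user_ids_by_email` are hypothetical, and other drivers use different placeholder styles.

```python
import sqlite3


def find_user_ids_by_email(conn: sqlite3.Connection, email: str) -> list[int]:
    # The `?` placeholder sends `email` as data; it is never spliced into the SQL text.
    rows = conn.execute(
        "select id from app_user where email = ?", (email,)
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("create table app_user (id integer primary key, email text not null)")
conn.execute("insert into app_user (email) values (?)", ("a@example.com",))
```

An input such as `' or '1'='1` is compared as a literal string here, so it matches no rows instead of rewriting the query.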
+## Database
+- Explicit, reviewable SQL. Avoid N+1 queries. Wrap multi-step writes in transactions.
+- Use repository interfaces (`Protocol`) when DB access should be decoupled from domain logic.
+- Keep migrations backward-compatible when possible.
+## Logging
+- Structured logging — never `print()` for application logs.
+- Include IDs/context (request, user, job, entity). Never log secrets, tokens, passwords, cookies, API keys, or sensitive PII.
+## Tests
+- Unit-test domain logic; integration-test DB, external-API wrappers, and framework wiring.
+- Deterministic tests — no `sleep`, no execution-order dependence, no real external services unless marked integration/e2e.
+- Mock only at boundaries. Use factories/builders for test data. Test behavior, not implementation.
+- Every bug fix ships a regression test when feasible.
+### Self-mock signals to refuse (rule from `clean-code.md` → Testing discipline)
+- `unittest.mock.patch`-ing a method **on the class under test**, then asserting that method was called. Mock the collaborator it depends on, not the unit it *is*.
+- Splitting out a helper *only so* the test can patch it, then asserting `helper.assert_called_once()` as the real check — that proves the SUT calls the helper, not that the behavior works.
+- Reaching into privates (`obj._internal`) to assert on internal state instead of observable outcomes.
+- A `MagicMock` standing in for the SUT itself with a `return_value` that mirrors the very thing under test.
+What's fine: mocking injected dependencies (`Mock(spec=UserRepository)`, `httpx.MockTransport`), asserting on return values / raised exceptions / emitted events / boundary calls.
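Reviewer note: a minimal sketch of the "what's fine" case, assuming a hypothetical `SignupService` that takes a `UserRepository` collaborator. The service under test stays real, and only the injected dependency is mocked.

```python
from unittest.mock import Mock


class UserRepository:
    """Hypothetical collaborator; a real one would talk to a database."""

    def exists(self, email: str) -> bool:
        raise NotImplementedError


class SignupService:
    """The unit under test; it is never replaced by a mock."""

    def __init__(self, repo: UserRepository) -> None:
        self._repo = repo

    def can_register(self, email: str) -> bool:
        return "@" in email and not self._repo.exists(email)


def test_rejects_taken_email() -> None:
    repo = Mock(spec=UserRepository)  # mock the boundary, not the service
    repo.exists.return_value = True
    service = SignupService(repo)
    # Assert on the observable outcome, not on which helper was called.
    assert service.can_register("a@example.com") is False
```

`Mock(spec=...)` restricts the mock to the attributes the real class defines, so a typo in a collaborator method name fails the test instead of passing silently.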
+## Required tooling
+Run before declaring a Python change complete:
+- `ruff check .` — lint
+- `ruff format --check .` — formatting (or `black --check .`)
+- `mypy .` or `pyright` — type check (if configured)
+- `pytest` — tests
+Dependency/project management: `uv`, Poetry, or `pip-tools`. Pin dependencies in application projects; reach for the standard library before adding third-party deps, and avoid importing heavy deps at module-import time when unnecessary.
+## Forbidden anti-patterns
+God classes/functions, hidden global mutable state, circular imports, catch-all `except Exception` without re-raise, silent failure, untyped domain dicts, business logic in controllers/routes or coupled to ORM models, hardcoded secrets, string-formatted SQL, unbounded retries, missing timeouts on external calls, fire-and-forget async without error handling, order-dependent tests, abstractions before two concrete use cases, magic constants without names, copy-pasted logic, comments that restate obvious code, leftover `print()` debugging.

package/runtime/skills/okstra-coding-preflight/languages/rust.md ADDED

@@ -0,0 +1,105 @@
+# Rust Conventions
+If a project-level `rust-guidelines` skill or repo-local `CONTRIBUTING.md` exists, **it wins**. This file is the fallback.
+## Style guide
+Official Rust Style Guide (rustwiki.org/en/style-guide/), enforced by `rustfmt`. Set the edition in `rustfmt.toml`:
+```toml
+style_edition = "2024"
+```
+## Naming (RFC 430)
+| Element | Convention | Example |
+|---|---|---|
+| Crate / module | `snake_case` | `my_crate`, `parser` |
+| Type / trait / enum variant | `UpperCamelCase` | `HttpClient`, `Error::Timeout` |
+| Function / method / variable | `snake_case` | `send_request`, `retry_count` |
+| Const / static | `SCREAMING_SNAKE_CASE` | `MAX_RETRIES` |
+| Lifetime | short lowercase | `'a`, `'src` |
+| Type parameter | single uppercase | `T`, `E` |
+## Formatting
+- Indent: **4 spaces**.
+- Max line length: **100** (`rustfmt` default).
+- Always run `cargo fmt` before commit.
+## Ownership & borrowing
+- Prefer borrowing (`&T`) over cloning. `.clone()` is a code smell unless you can name the reason in one sentence.
+- Return owned types from constructors and "convert" methods; accept borrowed slices (`&str`, `&[T]`) as parameters.
+- `&mut` only when you must mutate. Two `&` borrows are usually better than one `&mut`.
+- `Cow<'_, str>` when a function sometimes allocates and sometimes does not.
+## Error handling
+- **Library** crates: define a typed error with `thiserror`. One `Error` enum per crate, one variant per failure mode.
+- **Application** crates: `anyhow::Result<T>` at the top level is fine; convert to typed errors at module boundaries.
+- **No `unwrap()` / `expect()` in production paths.** Acceptable in:
+  - Tests.
+  - `main` for setup that genuinely cannot fail.
+  - `expect("invariant: ...")` when the message documents the invariant.
+- Use the `?` operator. Do not write `match`-on-`Result` ladders.
+## Idioms
+- Iterators (`map`, `filter`, `collect`, `fold`, `find`) over indexed loops.
+- `if let` / `let else` over `match` with a single arm.
+- `#[derive(Debug, Clone, ...)]` aggressively. Add `PartialEq` / `Eq` / `Hash` when the type lands in a collection.
+- Newtype small primitives that carry meaning: `struct UserId(u64);`, not raw `u64`.
+- `mut` and `pub` only when needed. Start private, widen on demand.
+- Use `From` / `TryFrom` for conversions, not `parse_x_from_y` helpers.
+- Pattern-match deeply (`if let Some(User { id, .. }) = user`) rather than chained `.unwrap().unwrap()`.
+## Async / Tokio
+- Pick **one** runtime per binary (`tokio` is standard). Never mix `tokio` and `async-std`.
+- Never `block_on` inside an async function.
+- Spawn tasks deliberately; remember `JoinHandle`s and `await` them.
+- `tokio::select!` for races; never poll futures by hand.
+- Long CPU work goes in `tokio::task::spawn_blocking`.
+- Cancellation: use `CancellationToken` (tokio-util) for cooperative shutdown.
+## Unsafe
+- Every `unsafe` block needs a comment documenting **every** invariant the caller must uphold.
+- New `unsafe` requires a second pair of eyes on review.
+- Default to safe abstractions (`bytes`, `parking_lot`, `crossbeam`) before reaching for `unsafe`.
+## Module layout
+- Public API at the crate root via `pub use`; implementation in private modules.
+- One concept per file. Split a module before it passes ~500 lines.
+- Unit tests in `mod tests { use super::*; ... }` at the bottom of the file.
+- Integration tests in the `tests/` directory.
+## Tests
+- Unit: `#[test]` inside `mod tests`. Use `assert_eq!`, `assert!`, `assert_matches!`.
+- Async: `#[tokio::test]` or `#[test]` with an explicit `tokio::runtime::Runtime` when you need control.
+- Property: `proptest` or `quickcheck` for pure transformations.
+- **Doc tests**: runnable examples in doc comments are tests — keep them green.
+- Use `pretty_assertions::assert_eq!` for readable diffs on large structs.
+### Self-mock signals to refuse (rule from `clean-code.md` → Testing discipline)
+Rust's trait-based DI makes self-mocking rarer than in JVM/JS, but it still happens:
+- Using `mockall::mock!` (or `automock`) to generate a mock of the **same struct under test**, then asserting on its own methods. The unit under test must be the real impl; mock the trait it depends on, not the trait it *is*.
+- Splitting a behavior into a helper trait *only so* the test can stub it, then expecting `expect_helper().returning(...)` to be the real assertion. The test now proves the SUT calls the helper, not that the behavior works.
+- Reaching into privates via `pub(crate)` widening, `#[cfg(test)] pub` shortcuts, or test-only modules that expose internal state to assert on.
+- A test that constructs the SUT, replaces one of its trait-object dependencies with a mock whose `returning(...)` mirrors the very thing being tested.
+What's fine: `mockall` mocks of injected trait dependencies (`MockHttpClient`, `MockUserRepository`), `unwrap()` in tests for setup that genuinely cannot fail, `assert_eq!` on returned values, `assert_matches!` on returned `Result` / `Option`.
+## Required tooling
+Run before any commit:
+- `cargo fmt --all` — formatting.
+- `cargo clippy --all-targets --all-features -- -D warnings` — lint (warnings are errors).
+- `cargo check --all-targets` — fast type check.
+- `cargo test --all` — tests.

package/runtime/skills/okstra-coding-preflight/languages/sql.md ADDED

@@ -0,0 +1,68 @@
+# SQL Conventions
+Defaults below target PostgreSQL. MySQL-specific notes are marked.
+## Formatting / style
+- Keywords **lowercase**: `select`, `from`, `where`, `join`. (Some teams prefer uppercase — match the file you are editing; never mix in one statement.)
+- All identifiers `snake_case`: tables, columns, indexes, constraints.
+- Always use explicit `as` for column aliases: `select count(*) as user_count`.
+- Always state the join type: `inner join`, `left join`, `cross join`. **Never** a bare `join`.
+- One column per line in `select`; one table or join per line in `from`.
+- Prefer **CTEs** (`with foo as (...)`) over deeply nested subqueries.
+- Dates / timestamps in ISO 8601: `'2026-05-17'`, `'2026-05-17T12:00:00Z'`.
+## Schema design
+- Every table has an `id` primary key.
+  - PostgreSQL: `id bigint generated always as identity primary key`.
+  - MySQL: `id bigint unsigned not null auto_increment primary key`.
+- Foreign keys named `<referenced_table>_id`: `user_id`, `order_id`.
+- Singular table names (`user`, `order`, `invoice_line`) unless the project already uses plural — match, do not mix.
+- `comment on table ... is '...'` for every table; `comment on column ...` for non-obvious columns. (MySQL: `comment '...'` inline.)
+- Timestamps: `created_at`, `updated_at`, both `timestamptz` in Postgres. Keep timezone discipline explicit — store UTC, render in the user's zone at the edge.
+- Soft delete: `deleted_at timestamptz null`. Index it if you query on it. Add a partial index `where deleted_at is null` for hot reads.
+- Use `not null` aggressively. `null` should mean "unknown", not "default".
+- Use enum or `check` constraints over free-text status columns.
+- Composite indexes: column order matches your `where` / `order by` patterns; left-most prefix wins.
+## Migrations
+- One change per migration file. Name: `<timestamp>_<verb>_<noun>.sql` (e.g. `20260517_120000_add_user_deleted_at.sql`).
+- Migrations are **forward-only** in production. Write a "down" migration only if your tool requires it and you genuinely intend to run it.
+- Splitting a destructive change:
+  1. Add nullable column.
+  2. Backfill (separate deploy).
+  3. Add `not null` / index `concurrently`.
+  4. Drop the old column.
+- `create index concurrently` in Postgres for hot tables — avoids the write lock.
+- Application-deploy migrations **never** destroy data. Destructive operations are a separate, explicit step gated on backups.
+## Query practices
+- Always filter on indexed columns for OLTP queries. Verify with `explain` / `explain analyze`.
+- Never `select *` in application code (migrations and ad-hoc inspection are fine).
+- Parameterised queries from the application layer. String concatenation is SQL injection.
+- `limit` every query that could return an unbounded set.
+- `returning *` (Postgres) on `insert` / `update` / `delete` instead of a follow-up `select`.
+- For pagination, prefer keyset (`where id > $last_seen`) over offset on large tables.
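Reviewer note: the keyset pagination bullet can be sketched from the application side. The example uses Python's `sqlite3` only because it ships with the standard library; the `invoice` table and `fetch_page` are hypothetical, and Postgres drivers would use `$1` or `%s` style placeholders instead of `?`.

```python
import sqlite3


def fetch_page(conn: sqlite3.Connection, last_seen_id: int, page_size: int) -> list[int]:
    # Keyset: seek past the last id seen instead of skipping rows with offset.
    rows = conn.execute(
        "select id from invoice where id > ? order by id limit ?",
        (last_seen_id, page_size),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("create table invoice (id integer primary key)")
conn.executemany("insert into invoice (id) values (?)", [(i,) for i in range(1, 8)])

first = fetch_page(conn, last_seen_id=0, page_size=3)
second = fetch_page(conn, last_seen_id=first[-1], page_size=3)
```

Each page costs an index seek plus `page_size` rows, whereas an offset query has to walk every skipped row first.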
+## MySQL-specific
+- Engine: **InnoDB** (default). Charset: `utf8mb4`, collation `utf8mb4_0900_ai_ci` (8.0+).
+- Quote identifiers with backticks when they collide with reserved words.
+- `select ... for update` only inside an explicit transaction.
+- `on duplicate key update` for upserts; in Postgres use `on conflict (...) do update`.
+## Tests
+- Run query tests against a **real** database via `testcontainers` or a disposable schema. In-memory SQLite is not equivalent to Postgres / MySQL — different SQL dialect, different isolation, different `null` semantics.
+- Migrations themselves must be tested: apply from an empty schema in CI; assert the resulting structure.
+- Seed test data via factory functions (one per table), not raw `insert` statements copy-pasted into each test.
+- Verify performance assumptions with `explain` in CI for queries flagged as hot.
+## Tools
+- Formatting: `pg_format`, `sqlfluff`, `sql-formatter`.
+- Linting / safety: `squawk` (Postgres migration linter — catches missing `concurrently`, locking foot-guns, etc.).
+- Schema diff: `migra` (Postgres), `mysqldiff` (MySQL).

package/runtime/skills/okstra-context-loader/SKILL.md CHANGED

@@ -56,7 +56,7 @@ user-invocable: false
 | `workflow.awaitingApproval` | Approval wait marker |
 | `workflow.routingStatus` | Routing decision status |
 | `workflow.lastSafeCheckpoint` | Safe resume checkpoint metadata |
-| `instructionSetPath` | Path to the `instruction-set/` **directory** containing `analysis-profile.md`, `analysis-material.md`, `reference-expectations.md`, `task-brief.md`, `final-report-template.md` (see Step 4). Not a single-file path. |
+| `instructionSetPath` | Path to the `instruction-set/` **directory** containing `analysis-packet.md`, `analysis-profile.md`, `analysis-material.md`, `reference-expectations.md`, `task-brief.md`, `final-report-template.md` (see Step 4). Not a single-file path. |
 | `referenceExpectationsPath` | config/deployment expectation artifact path |
 | `latestRunPath` | latest run path |
 | `latestRunStatus` | latest run status |
@@ -78,6 +78,7 @@ After identifying the task root in `task-manifest.json`, derive all paths accord
 ├── task-index.md               (human-readable summary, non-canonical)
 ├── instruction-set/
 │   ├── analysis-profile.md     (analysis guide by task type)
+│   ├── analysis-packet.md      (primary compact input for analysis workers)
 │   ├── analysis-material.md    (analysis materials)
 │   ├── reference-expectations.md (config/deployment expected values)
 │   ├── task-brief.md          (task brief)
@@ -116,13 +117,18 @@ After identifying the task root in `task-manifest.json`, derive all paths accord
 ## Step 4: Instruction Set Reading Order
-After verifying `task-manifest.json`, read the instruction set in the following order:
+After verifying `task-manifest.json`, read only the compact intake files needed for the current action. Do not bulk-read the whole instruction-set directory.
 1. `instruction-set/analysis-profile.md` (analysis guide by task type)
-2. `instruction-set/analysis-material.md` (analysis materials, if available)
-3. `instruction-set/reference-expectations.md` (config/deployment expected values)
-4. `instruction-set/task-brief.md` (task brief)
-5. `instruction-set/final-report-template.md` (report template)
+2. `instruction-set/analysis-packet.md` (primary compact input for analysis workers)
+3. `runs/<task-type>/state/active-run-context-<task-type>-<seq>.json` if present (compact current-run path/worker snapshot)
+Read source files lazily:
+- `instruction-set/task-brief.md` only for reporter-confirmation checks, source verification, or report-writer synthesis.
+- `instruction-set/analysis-material.md` only when packet content is insufficient or a source citation needs verification.
+- `instruction-set/reference-expectations.md` for report-writer synthesis or when packet expectation extract is insufficient.
+- `instruction-set/final-report-template.md` only for report-writer authoring.
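Reviewer note: one way to picture the new lazy reading order is a lookup from the current action to the files worth opening. This is only a sketch. The action labels are hypothetical and are not okstra identifiers, the "packet insufficient" fallbacks are left out, and the optional `active-run-context` JSON is omitted because its path depends on the run.

```python
# Always-read compact intake, in the order Step 4 lists it.
COMPACT_INTAKE = ["analysis-profile.md", "analysis-packet.md"]

# Hypothetical action labels mapped to the lazily read source files.
LAZY_SOURCES = {
    "reporter-confirmation": ["task-brief.md"],
    "citation-verification": ["analysis-material.md"],
    "report-writer-synthesis": ["task-brief.md", "reference-expectations.md"],
    "report-writer-authoring": ["final-report-template.md"],
}


def files_to_read(action: str = "") -> list[str]:
    """Return instruction-set files for one action; unknown actions get intake only."""
    return COMPACT_INTAKE + LAZY_SOURCES.get(action, [])
```

An action that appears in no lazy rule reads just the two intake files, which is the behavior the "do not bulk-read" instruction asks for.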
 ### Brief Reporter-Confirmation Precondition (BLOCKING)

package/runtime/skills/okstra-inspect/SKILL.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: okstra-inspect
 description: |
-  Use for any read-side okstra inspection or status mutation. Single skill dispatches by sub-command to five facets — status, history, report, time, logs. Trigger words include "okstra status", "task status", "current phase", "next phase", "okstra status set", "okstra mark", "<task-id> done|in-progress|진행중|완료", "okstra history", "past runs", "re-run", "resume", "list tasks", "find report", "show report for", "read the okstra report", "continue from report", "작업 시간", "소요 시간", "time summary", "duration", "elapsed", "얼마나 걸렸", "시간 분석", "okstra logs", "로그 현황", "로그 파일", "log files", "log size", "log status", "로그 정리", "log cleanup".
+  Use for any read-side okstra inspection or status mutation. Single skill dispatches by sub-command to six facets — status, history, report, time, logs, cost. Trigger words include "okstra status", "task status", "current phase", "next phase", "okstra status set", "okstra mark", "<task-id> done|in-progress|진행중|완료", "okstra history", "past runs", "re-run", "resume", "list tasks", "find report", "show report for", "read the okstra report", "continue from report", "작업 시간", "소요 시간", "time summary", "duration", "elapsed", "얼마나 걸렸", "시간 분석", "okstra logs", "로그 현황", "로그 파일", "log size", "log status", "로그 정리", "log cleanup", "okstra context-cost", "context cost", "context-cost", "컨텍스트 비용", "읽기 비용", "산출물 비용".
 ---
 # OKSTRA Inspect
@@ -15,6 +15,7 @@ Single read-side entry point for okstra runtime inspection plus the one write-si
 | `report` | Resolve final-report path for a task-key. Optionally read it. |
 | `time` | Per-task-type and per-worker duration breakdown for a task. |
 | `logs` | Inventory codex/gemini wrapper `.log` sidecars; emit cleanup commands. |
+| `cost` | Estimate file/read context cost for a task bundle. |
 ## Step 0: Verify okstra runtime + project setup (shared)
@@ -467,6 +468,104 @@ Never write `claude (claude)` — the parenthesized agent is shown only when it
 ---
+## cost
+Trigger phrases: "okstra context-cost", "context cost", "context-cost", "컨텍스트 비용", "읽기 비용", "산출물 비용", "task bundle cost", "agent read cost".
+Read-only estimate of how much file/context surface a prepared task bundle asks the lead, analysis workers, and report-writer to absorb. This sub-command does **not** mutate task artifacts.
+### cost.1 — Resolve target
+Accepted target forms:
+1. Full task-key: `<project-id>:<task-group>:<task-id>`.
+2. Task id only, e.g. `DEV-9184`.
+3. Task root path, e.g. `<projectRoot>/.okstra/tasks/<group>/<task-id>`.
+If the user gives a task root path, run `okstra context-cost <absolute-or-user-provided-path>` directly.
+If the user gives a full task-key, run:
+```bash
+okstra context-cost <task-key> --project-root <projectRoot>
+```
+If the user gives only a task id:
+1. Read `.okstra/discovery/task-catalog.json`.
+2. Match `taskId` case-insensitively.
+3. Single match → use its `taskKey`.
+4. Multiple matches → list candidates and ask the user to retry with a full task-key.
+5. No match → report that the task cannot be found.
+If the user asks generally ("컨텍스트 비용 보여줘") and does not name a task:
+1. Read `.okstra/discovery/task-catalog.json`.
+2. If exactly one task exists, use it.
+3. If multiple tasks exist, show the latest 10 by `updatedAt` and ask which task to measure. Do not guess.
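Reviewer note: the task-id resolution rules in cost.1 can be sketched as a small function. The catalog layout is an assumption: the diff names the `taskId` and `taskKey` fields but does not show the structure of `task-catalog.json`, so the sketch takes a plain list of entries, and `resolve_task_key` is a hypothetical helper.

```python
def resolve_task_key(entries: list[dict[str, str]], task_id: str) -> str:
    """Map a bare task id to its task-key, refusing to guess on ambiguity."""
    wanted = task_id.casefold()  # case-insensitive match, as cost.1 requires
    matches = [e["taskKey"] for e in entries if e["taskId"].casefold() == wanted]
    if not matches:
        raise LookupError(f"task not found: {task_id}")
    if len(matches) > 1:
        candidates = ", ".join(matches)
        raise LookupError(f"ambiguous task id {task_id}; retry with one of: {candidates}")
    return matches[0]


# Hypothetical catalog entries for illustration only.
catalog = [
    {"taskId": "DEV-9184", "taskKey": "shop:payments:DEV-9184"},
    {"taskId": "DEV-1", "taskKey": "shop:payments:DEV-1"},
    {"taskId": "dev-1", "taskKey": "shop:search:dev-1"},
]
```

The ambiguous case raises with the candidate list rather than picking one, which mirrors the "list candidates and ask the user to retry" rule.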
+### cost.2 — Run estimator
+Use the CLI output as the source of truth:
+```bash
+okstra context-cost <resolved-target> --project-root <projectRoot>
+```
+Do not re-count files manually unless the CLI fails and the user explicitly asks for manual fallback.
+### cost.3 — Summarize output
+Parse the JSON and report these fields:
+| Field | Source |
+|---|---|
+| Task bundle | `totals.taskFileCount`, `totals.taskBytes` |
+| Current run | `totals.currentRunFileCount`, `totals.currentRunBytes`, `currentRunPath` |
+| Legacy timestamp artifacts | `totals.legacyTimestampFileCount` |
+| Instruction set | `instructionSet.fileCount`, `instructionSet.bytes`, `instructionSet.analysisPacketBytes`, `instructionSet.legacyTaskPacketBytes` |
+| Lead Phase 1 | `leadPhase1.mode`, `leadPhase1.fileCount`, `leadPhase1.bytes` |
+| Analysis worker | `analysisWorker.mode`, `analysisWorker.fileCount`, `analysisWorker.bytesPerWorker`, `analysisWorker.legacyFullContractBytesPerWorker`, `analysisWorker.estimatedPacketModeBytesPerWorker`, `analysisWorker.estimatedReductionPercent` |
+| Report writer | `reportWriter.fileCount`, `reportWriter.bytes` |
+Format bytes as both raw bytes and rounded KB/MB where useful. Use `analysisWorker.estimatedReductionPercent` for the worker-input reduction. Do not recompute it from `bytesPerWorker` when `analysisWorker.mode == "analysis-packet-primary"` because `bytesPerWorker` is already the packet-primary cost.
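Reviewer note: a minimal sketch of the byte formatting and reduction rule from cost.3. The JSON shape is inferred from the field table above and the sample values are made up; `human_size` and `worker_reduction_line` are hypothetical helpers, not part of okstra.

```python
import json


def human_size(num_bytes: int) -> str:
    """Render raw bytes plus a rounded KB/MB figure when that is useful."""
    if num_bytes >= 1024 * 1024:
        return f"{num_bytes} bytes ({num_bytes / (1024 * 1024):.1f} MB)"
    if num_bytes >= 1024:
        return f"{num_bytes} bytes ({num_bytes / 1024:.1f} KB)"
    return f"{num_bytes} bytes"


def worker_reduction_line(report: dict) -> str:
    worker = report["analysisWorker"]
    # Take the CLI's own percentage; in packet-primary mode bytesPerWorker is
    # already the reduced cost, so recomputing from it would be wrong.
    return f"Estimated worker-input reduction: {worker['estimatedReductionPercent']}%"


# Made-up sample in the assumed shape of the CLI output.
sample = json.loads(
    '{"analysisWorker": {"mode": "analysis-packet-primary",'
    ' "bytesPerWorker": 2048, "estimatedReductionPercent": 60}}'
)
```

Keeping the formatter separate from the JSON access means the same helper can fill every `<bytes> (<human>)` cell in the output template.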
+### cost.4 — Output template
+```markdown
+## okstra Context Cost — <task-key>
+| Surface | Files | Size |
+|---|---:|---:|
+| Task bundle | <N> | <bytes> (<human>) |
+| Current run | <N> | <bytes> (<human>) |
+| Instruction set | <N> | <bytes> (<human>) |
+| Lead Phase 1 (`<mode>`) | <N> | <bytes> (<human>) |
+| Analysis worker / worker (`<mode>`) | <N> | <bytes> (<human>) |
+| Report writer synthesis | <N> | <bytes> (<human>) |
+- Current run: `<currentRunPath-or-->`
+- Legacy timestamp artifacts: `<N>`
+- Legacy full worker contract: `<legacyFullContractBytesPerWorker>` bytes (`<human>`) per analysis worker
+- Packet estimate: `<estimatedPacketModeBytesPerWorker>` bytes (`<human>`) per analysis worker
+- Estimated worker-input reduction: `<percent>%`
+### Reading
+<One or two Korean sentences explaining the main bottleneck and the next likely optimization target.>
+```
+Interpretation rules:
+- `leadPhase1.mode == "active-run-context"` means the compact lead intake file is present and should be treated as the primary lead read surface.
+- `leadPhase1.mode == "legacy-five-file"` means this task was prepared before active-run-context, or the manifest does not reference it.
+- `analysisWorker.mode == "analysis-packet-primary"` means new workers should read `analysis-packet.md` first and open full source inputs only for evidence checks or missing detail.
+- If `analysisWorker.mode == "full-input-contract"` and `estimatedReductionPercent` is low, the next target is worker prompt/input contract slimming.
+- If `reportWriter.bytes` dominates, the next target is a compact `synthesis-input` artifact.
+- If `legacyTimestampFileCount` is high, recommend current-view/cold-artifact separation or retention cleanup, not destructive deletion by default.
+---
 ## logs
 Trigger phrases: "okstra logs", "로그 현황", "로그 파일", "log files", "log size", "log status", "로그 정리", "log cleanup".

package/runtime/skills/okstra-report-writer/SKILL.md CHANGED

@@ -12,6 +12,8 @@ The final-report **data.json** (JSON SSOT) at `runs/<task-type>/reports/final-re
 The data.json schema is `schemas/final-report-v1.0.schema.json`. The renderer + the run-validator both consume that schema, so a data.json that validates is guaranteed to render into a markdown that passes the contract checks.
+Two `frontmatter` approval fields are always emitted with their unset default — never pre-fill them: `frontmatter.approved` is emitted as `false`, and `frontmatter.implementationOption` is emitted as an empty string `""`. The user later flips `approved` to `true` (via `--approve` or manual edit) and fills `implementationOption` with the chosen Option Candidate name (via `--implementation-option <name>` or manual edit) to authorise and scope the next `implementation` run.
 If you are reading this skill **as the report-writer-worker subagent**, YOU are the one calling the `Write` tool against the data.json path AND invoking the renderer via `Bash`. Do not return either artifact inline — the files on disk are the canonical record.
 If you are reading this skill **as Claude lead**, your job in Phase 6 is to (a) prepare the report-writer prompt, (b) dispatch the Report writer worker per the Phase 6 dispatch template in SKILL.md, (c) review both files in Phase 7. Do not call `Write` against either path yourself when Report writer worker is in the roster.
@@ -33,11 +35,13 @@ Agent(
   name: "report-writer",
   subagent_type: "report-writer-worker",
   team_name: "okstra-<task-key>",   # omit if team is not alive — see Resume-safe dispatch
-  model: "opus",
+  model: "<family token of Report writer worker's modelExecutionValue>",   # opus/sonnet/haiku — NOT hardcoded; see below
   mode: "auto"
 )
 ```
+The `model:` parameter is **derived from the Report writer worker's `modelExecutionValue`** in `task-manifest.json`, mapped to an Agent family token (`opus` / `sonnet` / `haiku`) per [okstra-team-contract](../okstra-team-contract/SKILL.md) "Model Assignment Rules" #3–#4. Do NOT hardcode it — the report-writer-worker definition is `model: inherit`, so without this explicit parameter the worker silently runs on the lead's model instead of its assignment. The same `modelExecutionValue` feeds the prompt header in item 6 below, so the spawn model and the recorded `**Model:**` header always agree.
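Reviewer note: the mapping from `modelExecutionValue` to a family token is defined in okstra-team-contract rules this diff does not fully show. The sketch below assumes the family name appears as a substring of the value. That is an assumption, not the package's documented algorithm, and `family_token` is a hypothetical helper.

```python
FAMILY_TOKENS = ("opus", "sonnet", "haiku")


def family_token(model_execution_value: str) -> str:
    """Derive the Agent `model:` token; fail loudly rather than fall back to inherit."""
    lowered = model_execution_value.lower()
    for token in FAMILY_TOKENS:
        if token in lowered:  # assumed substring convention
            return token
    raise ValueError(f"no known model family in {model_execution_value!r}")
```

Raising on an unrecognized value keeps a bad assignment visible, instead of letting the worker silently run on the lead's model.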
 The prompt MUST include, in this order at the top:
 1. `**Project Root:** <absolute-path>`
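
The `model:` derivation described in this hunk (and spelled out as "Model Assignment Rules" #3 and #4 in the okstra-team-contract diff further down) can be sketched as a prefix lookup. This is a minimal illustration, not code from the package: the function name `family_token`, the lowercasing step, and the `None` return for unmatched values are assumptions made here for the example.

```python
from typing import Optional

# Family tokens named in the contract text.
FAMILIES = ("opus", "sonnet", "haiku")


def family_token(model_execution_value: str) -> Optional[str]:
    """Map a modelExecutionValue to an Agent `model:` family token by prefix.

    Hypothetical helper. Returns None when no family prefix matches; the
    source does not say what the lead should do in that case.
    """
    # Assumption: matching is case-insensitive and ignores surrounding spaces.
    value = model_execution_value.strip().lower()
    for family in FAMILIES:
        # Accept both the bare form (opus*) and the claude- form (claude-opus*).
        if value.startswith(family) or value.startswith(f"claude-{family}"):
            return family
    return None
```

Under these assumptions, a lead would pass `family_token(...)` as the `model:` argument for `Claude worker` and `Report writer worker` dispatches only. The Codex and Gemini wrappers stay on `inherit`, as rule #5 states.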

package/runtime/skills/okstra-run/SKILL.md CHANGED

@@ -186,7 +186,7 @@ You can delete the literal state-file path after this point — its job is done.
 ## Step 6: Take over as Claude lead
-Read `<INSTRUCTION_SET_PATH>/claude-execution-prompt.md` verbatim and enter `Claude lead` mode. The lead prompt itself enumerates every other instruction-set file to load (`analysis-profile.md`, `analysis-material.md`, `reference-expectations.md`, `final-report-template.md`, the run manifest, the team-state artifact, etc.) — follow its order, do not preempt it.
+Read `<INSTRUCTION_SET_PATH>/claude-execution-prompt.md` verbatim and enter `Claude lead` mode. The lead prompt now points to compact intake artifacts first (`active-run-context`, `analysis-profile.md`, and `analysis-packet.md`); full source files such as `analysis-material.md`, `reference-expectations.md`, and `final-report-template.md` are lazy/fallback inputs. Follow the rendered prompt order, do not preempt it.
 Then proceed through the phases exactly as the lead prompt directs (Phase 1 context → Phase 2+ worker dispatch → final synthesis → final report).

package/runtime/skills/okstra-team-contract/SKILL.md CHANGED

@@ -37,6 +37,9 @@ okstra tasks are always operated using the `Claude lead` + required worker team
 1. `resultContract.requiredWorkerRoles` in `task-manifest.json` (and the lead model metadata) is the canonical source. There is no role-level fallback — a missing assignment is a manifest defect, not a license to invent one.
 2. If `modelExecutionValue` differs from `model`, use `modelExecutionValue` during execution.
+3. **Spawn-time enforcement for in-process Claude subagents (BLOCKING).** `Claude worker` and `Report writer worker` are in-process Claude subagents whose agent definitions declare `model: inherit` (`agents/workers/claude-worker.md`, `agents/workers/report-writer-worker.md`). `inherit` follows the **lead's** runtime model, NOT the role's assignment — so an opus assignment silently runs on a sonnet lead. To make the assignment binding (not merely declared), lead MUST pass an explicit `model:` parameter on every `Agent(...)` dispatch for these two roles, derived from that role's `modelExecutionValue`. The dispatch `model:` parameter overrides the `inherit` frontmatter; the frontmatter remains only as the fallback when no parameter is supplied. Omitting `model:` on a Claude-side dispatch is a contract violation that reproduces the assigned-vs-actual model deviation.
+4. **`modelExecutionValue` → Agent `model:` family token.** The Agent tool's `model` parameter accepts family tokens only — `opus` / `sonnet` / `haiku` (an exact version such as `claude-opus-4-7` is NOT a valid value). Map by prefix: a `modelExecutionValue` of `opus*` / `claude-opus*` → `"opus"`, `sonnet*` / `claude-sonnet*` → `"sonnet"`, `haiku*` / `claude-haiku*` → `"haiku"`. This enforces the assignment at **family granularity** (opus vs sonnet vs haiku); the exact version within a family is still inherited from the lead session and cannot be pinned via this parameter.
+5. **Codex / Gemini wrappers are out of scope for the Agent `model:` rule.** `Codex worker` / `Gemini worker` subagents are Claude wrappers that shell out to an external CLI; the role's `modelExecutionValue` is already applied via the CLI's own `--model <modelExecutionValue>` argument (see `agents/workers/_cli-wrapper-template.md`). The Agent `model:` parameter for these wrappers would only set the wrapper's own orchestration model, not the external CLI's model — leave it at `inherit` and do NOT map it from `modelExecutionValue`.
 ### Dynamic Worker Role Determination
@@ -91,7 +94,7 @@ Send byte-identical dispatch prompts to every analysis worker (Claude / Codex /
 The lead does NOT inline `[Required reading]` or `[Error reporting]` blocks into worker prompts. Both contracts live in a single canonical file at `~/.okstra/templates/worker-prompt-preamble.md` (source: `templates/worker-prompt-preamble.md`). The lead injects the path via the `**Worker Preamble Path:**` anchor header (header #5 above) and each worker Reads that file end-to-end before producing output.
 What the lead MUST still do per dispatch:
-- Inject the input file enumeration into the dispatch prompt body via an `## Inputs` section (or any heading the recipient agent expects), listing the actual project-relative paths derived from the run's `instruction-set/` and the carry-in clarification response if any. The preamble describes the rules; the lead provides the specific paths for THIS run.
+- Inject the input file enumeration into the dispatch prompt body via an `## Inputs` section (or any heading the recipient agent expects), listing the actual project-relative primary inputs derived from the run's `instruction-set/`. For analysis workers, list `analysis-packet.md` as the required primary input and list task-brief / analysis-profile / analysis-material / reference-expectations / clarification-response as source/fallback paths only when useful. The preamble describes the rules; the lead provides the specific paths for THIS run.
 - Inject the absolute `**Errors log path:**` and `**Errors sidecar path:**` headers (#6 and #7 above) — workers cannot synthesize these paths.
 - Omit the preamble pointer for reverify dispatches (Phase 5.5 lightweight mode) — see [okstra-convergence](../okstra-convergence/SKILL.md) "Reverify prompt: required-reading suppression".
@@ -99,8 +102,8 @@ Audience-scoped file enumeration (BLOCKING — performance optimization):
 | Recipient | Files the lead lists under `## Inputs` |
 |---|---|
-| Claude / Codex / Gemini analysis workers | task-brief, analysis-profile, analysis-material (if present), reference-expectations, clarification-response (if carry-in) |
-| Report writer worker (Phase 6) | all of the above **plus** the instruction-set-local `final-report-template.md` (phase-stripped) and `final-report-schema.json` (per-task-type excerpt) — NOT the full `templates/reports/...` / `schemas/...` sources |
+| Claude / Codex / Gemini analysis workers | `analysis-packet.md` as primary input; source/fallback paths may be listed below it but are not automatic first-read files |
+| Report writer worker (Phase 6) | task-brief, analysis-profile, analysis-material, reference-expectations, clarification-response (if carry-in), **plus** the instruction-set-local `final-report-template.md` (phase-stripped) and `final-report-schema.json` (per-task-type excerpt) — NOT the full `templates/reports/...` / `schemas/...` sources |
 | Reverify dispatches | none — the lead provides only the items to reverify |
 Asymmetry note: `claude-worker` runs in-process and the Agent SDK auto-loads its agent definition; lead's dispatch prompt body for claude-worker can therefore be shorter than for codex/gemini. The Worker Preamble pointer is still emitted for all three so the contract source is identical regardless of dispatch path.
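
The audience-scoped table in the hunk above reads as a lookup from recipient to the files listed under `## Inputs`. The sketch below illustrates the new (`+`) side of that table. The recipient keys (`analysis-worker`, `report-writer`, `reverify`), the `primary` / `fallback` split, and the function name are hypothetical; only the file names come from the diff.

```python
from typing import Dict, List

# Source files named in the table; clarification-response is added on carry-in.
SOURCE_FILES: List[str] = [
    "task-brief",
    "analysis-profile",
    "analysis-material",
    "reference-expectations",
]


def inputs_for(recipient: str, carry_in: bool = False) -> Dict[str, List[str]]:
    """Return the files a lead might list under `## Inputs` (illustrative only)."""
    sources = list(SOURCE_FILES)
    if carry_in:
        sources.append("clarification-response")

    if recipient == "analysis-worker":
        # The packet is the required primary input; sources are fallback only.
        return {"primary": ["analysis-packet.md"], "fallback": sources}
    if recipient == "report-writer":
        # Report writer still reads the sources plus the instruction-set-local
        # template and schema excerpt.
        extras = ["final-report-template.md", "final-report-schema.json"]
        return {"primary": sources + extras, "fallback": []}
    if recipient == "reverify":
        # Reverify dispatches list no input files at all.
        return {"primary": [], "fallback": []}
    raise ValueError(f"unknown recipient: {recipient!r}")
```

Treating the source files as a separate `fallback` list mirrors the diff's wording that they "are not automatic first-read files" for analysis workers.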
@@ -147,7 +150,7 @@ After each worker subagent returns (regardless of role), Lead MUST verify the ca
 ### Result Frontmatter (mandatory, precedes Section 1)
-Every worker result file MUST begin with a YAML frontmatter block. The values are sourced from the corresponding fields of the input files' frontmatter (e.g. `analysis-material.md`, `task-brief.md`) — copy them verbatim; do NOT regenerate them. Only `workerId` and `title` are worker-specific.
+Every worker result file MUST begin with a YAML frontmatter block. For analysis workers, values are sourced from `analysis-packet.md` frontmatter; fall back to `analysis-material.md` or `task-brief.md` only if the packet is missing a field. Report-writer can use `analysis-material.md` / `task-brief.md` as before. Copy values verbatim; do NOT regenerate them. Only `workerId` and `title` are worker-specific.
 ```yaml
 ---

package/runtime/templates/reports/final-report.template.md CHANGED

@@ -18,6 +18,7 @@ project-id: {{ frontmatter.projectId | yaml_scalar }}
 taskType: {{ frontmatter.taskType | yaml_scalar }}
 workerId: {{ frontmatter.workerId | yaml_scalar }}
 approved: {{ frontmatter.approved | yaml_scalar }}
+implementation-option: {{ frontmatter.implementationOption | yaml_scalar }}
 ---
 # {{ header.taskKey }} - Multi-Agent Cross Verification Final Report
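
The template now emits two approval fields, `approved` and `implementation-option`. Per the report-writer skill text earlier in this diff, the report writer always emits them unset (`false` and `""`), and the user fills them in later. The sketch below shows those defaults plus one possible readiness check. The function names, and the rule that both fields must be set before an `implementation` run proceeds, are assumptions for illustration; the diff does not show how okstra actually evaluates them.

```python
from typing import Any, Dict


def default_approval_fields() -> Dict[str, Any]:
    """Unset defaults the report writer emits, as described in the skill text."""
    return {"approved": False, "implementationOption": ""}


def ready_for_implementation(frontmatter: Dict[str, Any]) -> bool:
    """Hypothetical gate: approved is true AND an option name was chosen."""
    approved = frontmatter.get("approved") is True
    # Treat a missing, empty, or whitespace-only option as "not chosen".
    option = str(frontmatter.get("implementationOption") or "").strip()
    return approved and bool(option)
```

For example, `ready_for_implementation(default_approval_fields())` is `False` until the user sets both fields, whether through `--approve` and `--implementation-option <name>` or by manual edit.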

package/runtime/templates/worker-prompt-preamble.md CHANGED

@@ -8,7 +8,7 @@ It replaces the previous practice of inlining ~80 lines of identical boilerplate
 ## Required reading (analysis workers + report-writer worker)
-You are required to read every input file enumerated by the dispatcher (the lead's prompt lists them under `[Required reading]`) from the very first character to the very last character before you produce any analysis output. Skimming, partial reads, jumping to a single section, or relying on prior knowledge of a similar file's structure is not acceptable. Each file may contain decisive context that is not surfaced in its summary or first page.
+You are required to read every primary input file enumerated by the dispatcher (the lead's prompt lists them under `[Required reading]`) from the very first character to the very last character before you produce any analysis output. Skimming, partial reads, jumping to a single section, or relying on prior knowledge of a similar file's structure is not acceptable. Source files listed as fallback/evidence paths are read on demand when you need to verify a citation, resolve ambiguity, or inspect material the packet says it omitted.
 ### Audience-scoped enumeration (BLOCKING — performance optimization)
@@ -16,7 +16,7 @@ Different recipients need different files. Do NOT include `final-report-template
 | Recipient | Files included in `[Required reading]` |
 |---|---|
-| Claude / Codex / Gemini analysis workers | task-brief, analysis-profile, analysis-material (if present), reference-expectations, clarification-response (if carry-in) |
+| Claude / Codex / Gemini analysis workers | analysis-packet.md as the primary compact input; task-brief, analysis-profile, analysis-material, reference-expectations, and clarification-response remain source/fallback paths, not automatic first-read files |
 | Report writer worker (Phase 6) | all of the above **plus** the instruction-set-local `final-report-template.md` (phase-stripped) and `final-report-schema.json` (per-task-type excerpt) — NOT the full `templates/reports/...` / `schemas/...` sources |
 | Reverify dispatches (Phase 5.5, lightweight mode) | **do NOT inject `[Required reading]` at all** — see [okstra-convergence](../skills/okstra-convergence/SKILL.md) "Reverify prompt: required-reading suppression". |
@@ -25,7 +25,7 @@ Different recipients need different files. Do NOT include `final-report-template
 - Use a single `Read` tool call per file with no `offset` and no `limit`. If a file is genuinely too large for one read, page through with explicit `offset` / `limit` covering the entire file, and state the page boundaries in your Findings.
 - For the carry-in clarification response, walk every row of `## 1. Clarification Items` (`C-001`, `C-002`, ...) in full, including rows whose `User input` cell is blank — a blank `User input` with `Status=open` is itself a signal you must surface. The structural similarity between the prior final report and the upcoming output is NOT a license to skim.
 - Write the Reading Confirmation block to your **audit sidecar** at `runs/<task-type>/worker-results/<worker>-audit-<task-type>-<seq>.md` (sibling to the main worker-results file). One short line per input file confirming end-to-end reading. Do NOT include a `## 0. Reading Confirmation` heading in the main worker-results file — the validator fails worker-results that contain one. If you cannot truthfully confirm a file end-to-end, record a `tool-failure` in the errors sidecar instead of fabricating Findings.
-- Do not collapse multiple input files into a single mental summary before reading them all individually. Each file has its own canonical role: brief = the user's request, profile = the lead's rules for this phase, reference-expectations = ground-truth config/deployment values, clarification-response = prior run's open questions and the user's answers, final-report-template = the structure your eventual writeup must conform to. Conflating them loses signal.
+- Treat `analysis-packet.md` as the canonical primary analysis input. It preserves the source files' frontmatter and extracts the task-specific brief, phase focus, reference expectations, and carry-in clarification rows. If the packet appears incomplete or a finding depends on a source citation, open the corresponding source file and cite it directly.
 ---