okstra 0.5.0 → 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.kr.md CHANGED
@@ -89,6 +89,18 @@ npx -y okstra@latest doctor
89
89
 
90
90
  `result: OK` 라인이 보이면 준비 완료입니다. FAIL row 가 있으면 그 줄에 바로 복구 방법이 인쇄됩니다 (대개는 install 재실행).
91
91
 
92
+ #### (선택) `okstra` 명령을 글로벌로 설치
93
+
94
+ 이 README 의 모든 예제는 `npx -y okstra@latest <cmd>` 를 씁니다 — 글로벌 설치 없이도 동작합니다. 매번 npx 의 패키지 fetch / 버전 체크를 거치지 않고 `okstra` 한 단어로 호출하고 싶다면 글로벌 설치:
95
+
96
+ ```bash
97
+ npm i -g okstra
98
+ okstra --version # CLI 가 PATH 에 잡혔는지 확인
99
+ okstra install # 'npx -y okstra@latest install' 와 동일
100
+ ```
101
+
102
+ 글로벌 설치는 Node CLI 를 PATH 에 등록할 뿐입니다. 런타임(`~/.okstra/`) 과 Claude 스킬(`~/.claude/skills/`) 은 여전히 `okstra install` 이 생성합니다 — `npm i -g` 에 포함되지 않습니다. 이후 업그레이드: `npm i -g okstra@latest && okstra install`. 글로벌 바이너리 제거: `npm uninstall -g okstra` (`~/.okstra/` 는 그대로; 그것까지 지우려면 `okstra uninstall`).
103
+
92
104
  ### 3.2 프로젝트 등록 (프로젝트당 1회)
93
105
 
94
106
  CLI 에서:
package/README.md CHANGED
@@ -88,6 +88,18 @@ npx -y okstra@latest doctor
88
88
 
89
89
  A `result: OK` line means you're ready. Any FAIL row prints its remediation in-line (usually: rerun install).
90
90
 
91
+ #### Optional: install the `okstra` command globally
92
+
93
+ Every example in this README uses `npx -y okstra@latest <cmd>` so you don't need a global install. If you'd rather type a bare `okstra` (and skip npx's fetch / version-check on each invocation), install it globally:
94
+
95
+ ```bash
96
+ npm i -g okstra
97
+ okstra --version # confirm the CLI is on PATH
98
+ okstra install # same as 'npx -y okstra@latest install'
99
+ ```
100
+
101
+ The global install only registers the Node CLI on your PATH. The runtime (`~/.okstra/`) and the Claude skills (`~/.claude/skills/`) are still provisioned by `okstra install` — they are not part of `npm i -g`. To upgrade later: `npm i -g okstra@latest && okstra install`. To remove the global binary: `npm uninstall -g okstra` (leaves `~/.okstra/` untouched; remove that with `okstra uninstall`).
102
+
91
103
  ### 3.2 Register a project (once per project)
92
104
 
93
105
  From the CLI:
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "okstra",
3
- "version": "0.5.0",
3
+ "version": "0.6.0",
4
4
  "description": "Multi-agent cross-verification orchestrator runtime + Claude Code skills.",
5
5
  "license": "MIT",
6
6
  "author": "devonshin",
@@ -1,5 +1,5 @@
1
1
  {
2
- "package": "0.5.0",
3
- "builtAt": "2026-05-12T04:37:48.519Z",
2
+ "package": "0.6.0",
3
+ "builtAt": "2026-05-12T04:57:34.255Z",
4
4
  "repoRoot": "/home/runner/work/okstra/okstra"
5
5
  }
@@ -14,16 +14,21 @@
14
14
  - `Report writer worker` is the **author** of the final-report file; `Claude lead` reviews and approves the produced draft and does NOT write the file itself (see `okstra-team-contract` and `okstra-report-writer` for the authoritative contract).
15
15
  - default model assignments are resolved from centralised defaults; the fallback values are `Claude lead`/`Claude executor`/`Report writer worker`=`opus`, `Claude verifier`=`sonnet`, `Codex verifier`=`gpt-5.5`, `Gemini verifier`=`auto`
16
16
  - all three verifier roles (`Gemini verifier`, `Codex verifier`, `Claude verifier`) must be attempted; the final verdict waits until each has either a result or an explicit terminal status
17
+ - **All-verifier-failure policy**: if every required verifier (`Gemini verifier`, `Codex verifier`, `Claude verifier`) ends with a non-result terminal status (`timeout`, `error`, `not-run`) — i.e. zero independent verdicts were produced — the run MUST end with status `blocked` and route to a follow-up `error-analysis` run. `Claude lead` MUST NOT substitute its own verdict in place of the missing verifier outputs; synthesis requires at least one independent verifier's verdict. If one or two verifiers fail but at least one returns a verdict, the run proceeds with the surviving verdict(s) and the final report MUST explicitly notate which verifiers were unavailable, with the captured error / timeout evidence per failed verifier.
17
18
  - unnamed generic parallel workers must not replace the required role roster, and no additional sub-agent dispatch is allowed beyond this roster
18
19
  - Tooling — read-only MCP availability:
19
20
  - `mcp__mysql-fontsninja-{common,fontradar,fontsninja,fonthelper}` (tools: `mysql_list_tables`, `mysql_describe_table`, `mysql_select_data`) may be queried by both executor and verifiers as a read-only cross-check (sanity-checking row counts after a migration script's dry-run, comparing observed schema against the plan's expectations, etc.); the canonical usage policy is in the task brief's `## Available MCP Servers` section, and any MCP-derived evidence MUST cite server, table, and the SELECT used. MCP MUST NEVER be used as a write path — schema/data mutations go through repository migration files, never through this MCP.
20
21
  - Pre-implementation gate (mandatory — refuse to start if any item fails):
21
22
  - the run brief MUST cite `--approved-plan <path>` pointing to a `final-report.md` produced by a prior `implementation-planning` run located under `runs/implementation-planning/.../reports/final-report.md`
22
- - that file MUST contain a `User Approval Request` block AND a recorded user approval marker (e.g. `APPROVED:` line, `[x] Approved`, or equivalent unambiguous evidence). If only the request block exists without an approval marker, refuse to start with status `contract-violated`.
23
+ - that file MUST contain a `User Approval Request` block AND a recorded user approval marker matching one of the following line-anchored, case-insensitive forms (the runtime regex in `okstra_ctl.run._validate_approved_plan` enforces this and rejects the run with `PrepareError` before any prompt is generated): `APPROVED` (alone, followed by `:`, or end-of-line), `[x] Approved`, or `User Approval: APPROVED|granted|yes`. Free-form approvals such as "lgtm", "go ahead", or paraphrased confirmations are intentionally NOT accepted; if the user's approval is informal, re-edit the plan file to add one of the exact markers above before invoking the implementation run.
23
24
  - the file's `Recommended option` and its bite-sized step list become the authoritative scope for this run; any deviation must be justified in the final report and routed back to a new `implementation-planning` run instead of being silently expanded.
24
25
  - Pre-implementation context exploration (executor before first edit):
25
26
  - re-read the approved plan end-to-end and extract: file list, step order, validation commands, rollback path
26
27
  - inspect the current state of every file the plan names; if any file has changed materially since the plan was written, stop and route to a new `implementation-planning` run instead of editing speculatively
28
+ - "materially changed" means: the function, class, section, or behaviour the plan targets has been edited, renamed, moved, removed, or otherwise altered in a way that invalidates the plan's reasoning. Cosmetic edits (whitespace, comment-only changes, unrelated function modifications elsewhere in the same file) do NOT trigger a re-plan; cite the diff (`git log --oneline <plan-created-at>..HEAD -- <file>`) in the final report and proceed.
29
+ - distinguish the two file-scope rules (they are not in conflict):
30
+ - **drift rule** (this section): if a file *named in the plan* has materially drifted, refuse to edit and route back to planning. This protects trust in the approved scope.
31
+ - **out-of-plan rule** (Allowed actions section below): if a step *requires touching a file NOT in the plan list*, that is permitted with `Out-of-plan edits` justification. This handles honest scope discovery during execution.
27
32
  - confirm the test/build commands referenced in the plan still exist and run from a clean state
28
33
  - Allowed actions during the run:
29
34
  - **Edit / Write on any project file** (no path whitelist — scope is bounded by the approved plan's file list, not by directory). Editing files outside the plan's list is permitted only when strictly needed to satisfy a step, and MUST be recorded in the final report's `Out-of-plan edits` block with rationale.
@@ -53,7 +58,10 @@
53
58
  - **Validation evidence**: actual command output (stdout/stderr) for every `pre / mid / post` validation command from the plan. Truncated output is acceptable but the command line and exit code MUST be exact. No paraphrasing of test results.
54
59
  - **TDD evidence (when applicable)**: for steps that should be TDD-ordered, show the failing-test output BEFORE the implementation commit and the passing-test output AFTER, with commit SHAs framing the transition.
55
60
  - **Verifier results**: a section per verifier (`Gemini verifier`, `Codex verifier`, `Claude verifier`) containing their independent verdict (PASS / CONCERNS / FAIL), their cited diff snippets, and any fix recommendations they declined to apply. `Claude lead` synthesises a unified verdict but MUST preserve dissent — do not collapse three opinions into one paragraph.
56
- - **Rollback verification**: confirmation that the plan's rollback path is still valid after the changes (e.g. revert SHA exists, feature flag toggle works, migration down step is reachable). Dry-run rollback is preferred when feasible.
61
+ - **Rollback verification**: confirmation that the plan's rollback path is still valid after the changes. Strength of verification depends on the change category:
62
+ - **Pure code changes** (no persisted state, no infra mutation): a reachable revert SHA is sufficient. Record the exact `git revert <SHA>` command that would undo the change, and confirm `git rev-parse <SHA>` resolves.
63
+ - **Feature-flag-gated changes**: confirm the off-switch path was exercised in this run's validation evidence (i.e. one of the validation commands ran with the flag off and succeeded). A plan that ships a flag without exercising the off-path does NOT satisfy this requirement.
64
+ - **Schema migrations, config-format changes, or any change with persisted state**: a **dry-run of the rollback step is mandatory**, not preferred. Record the exact rollback command and its captured exit code / stdout. If the migration tool offers no dry-run mode (`--dry-run`, `--plan`, equivalent), the executor MUST refuse to claim rollback verification and instead end the run with a routing recommendation back to `implementation-planning` for a safer rollback strategy. Skipping this step on a stateful change is treated as a `contract-violated` outcome by `final-verification`.
57
65
  - **Routing recommendation for `final-verification`**: brief note on whether the changes are ready for final-verification phase or need a new error-analysis / planning loop first.
58
66
  - Self-review pass before finalising the report (`Claude lead` runs this; do not delegate to a generic subagent):
59
67
  1. **Plan coverage** — every step in the approved plan's recommended option must point to a commit (or an explicit `Skipped: <reason>` entry). List gaps.
@@ -61,7 +69,11 @@
61
69
  3. **Out-of-plan honesty** — files in the diff that are NOT in the plan list must appear in the `Out-of-plan edits` block. Cross-check with `git diff --name-only`.
62
70
  4. **Verifier dissent preserved** — if the three verifiers disagree, the disagreement is visible in the report? Synthesis hides nothing?
63
71
  5. **Forbidden action audit** — `git push`, publish, deploy, migration, third-party write commands: scan the run's session transcripts for any occurrence and confirm none happened.
64
- 6. **Placeholder scan** — committed code searched for TBD / TODO / "handle edge cases" strings introduced by this run? (Pre-existing TODOs in untouched code are out of scope.)
72
+ 6. **Placeholder scan** — restrict the scan to lines this run actually introduced; pre-existing placeholders in unchanged regions of touched files are out of scope. Required command (substitute `<base>` with the parent of the first commit in this run's commit list):
73
+ ```
74
+ git diff <base>..HEAD | grep -E '^\+[^+].*\b(TBD|TODO|FIXME|XXX|implement later|handle edge cases|similar to|placeholder)\b' || echo 'clean'
75
+ ```
76
+ Only newly-added lines (those starting with `+` and not part of the `+++` header) are inspected. If output is anything other than `clean`, the run MUST either remove the placeholders before finalising or record an explicit justification per occurrence in the final report.
65
77
  - Skill provenance (for maintainers — these skills are referenced as inspiration, NOT invoked at runtime):
66
78
  - The pre-implementation gate, "Plan coverage" and "Evidence completeness" self-review items are adapted from the `executing-plans` skill (inline mode). The skill's "subagent-driven vs inline" choice prompt and its plan-document-header conventions are intentionally not adopted — okstra owns lifecycle handoff.
67
79
  - The "TDD evidence" requirement and the failing-test-before-implementation ordering preference are adapted from `test-driven-development`. Strict enforcement is relaxed where the touched area has no test infrastructure; in those cases the executor must add at minimum a regression test and record a justification.
@@ -98,7 +98,9 @@ PHASE_RULES: dict[str, dict[str, str]] = {
98
98
  " - source edits or Bash mutations performed by any verifier role (`Gemini verifier`, `Codex verifier`, `Claude verifier` are read-only — recommend, do not apply)\n"
99
99
  " - dispatching parallel sub-agents beyond the required worker roster\n"
100
100
  " - silent scope expansion: every file edited outside the approved plan list MUST appear in the `Out-of-plan edits` block with rationale\n"
101
- ' - leaving placeholders such as TBD / TODO / "implement later" / "handle edge cases" in committed code\n'
101
+ ' - leaving placeholders such as TBD / TODO / "implement later" / "handle edge cases" in newly-added lines of this run (check via `git diff <base>..HEAD | grep -E \'^\\+[^+].*\\b(TBD|TODO|FIXME|XXX|implement later|handle edge cases|similar to|placeholder)\\b\'`; pre-existing strings in untouched regions are out of scope)\n'
102
+ " - lead substituting its own verdict when every required verifier (`Gemini verifier`, `Codex verifier`, `Claude verifier`) returned a non-result terminal status (`timeout`/`error`/`not-run`); in that case the run MUST end as `blocked` with routing recommendation back to `error-analysis`, never with a lead-only verdict\n"
103
+ " - claiming rollback verification on a schema migration, config-format change, or any persisted-state mutation without a recorded dry-run of the rollback step and its captured exit code\n"
102
104
  ' - declaring overall task acceptance — that is `final-verification` ownership; this phase reports only "ready for final-verification" or "needs new planning loop"\n'
103
105
  " - delegating the self-review pass to a generic subagent — `Claude lead` must run it"
104
106
  ),