okstra 0.9.0 → 0.10.0

package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "okstra",
3
- "version": "0.9.0",
3
+ "version": "0.10.0",
4
4
  "description": "Multi-agent cross-verification orchestrator runtime + Claude Code skills.",
5
5
  "license": "MIT",
6
6
  "author": "devonshin",
@@ -1,5 +1,5 @@
1
1
  {
2
- "package": "0.9.0",
3
- "builtAt": "2026-05-12T15:09:49.794Z",
2
+ "package": "0.10.0",
3
+ "builtAt": "2026-05-12T16:26:44.380Z",
4
4
  "repoRoot": "/home/runner/work/okstra/okstra"
5
5
  }
@@ -0,0 +1,27 @@
1
+ <!--
2
+ Shared contract fragment included verbatim by every profile via the
3
+ INCLUDE directive (see scripts/okstra_ctl/run.py:_expand_profile_includes).
4
+ Edit here once; every profile picks the change up at next render. Do NOT
5
+ add phase-specific rules to this file — phase rules stay in the per-
6
+ profile document.
7
+ -->
8
+ - Team contract (shared):
9
+ - `Claude lead` is synthesis-only and stays distinct from `Claude worker` (or, in `implementation`, the `Executor` and verifiers).
10
+ - `Report writer worker` is the **author** of the final-report file; `Claude lead` reviews and approves the produced draft and does NOT write the file itself (see `okstra-team-contract` and `okstra-report-writer` for the authoritative contract).
11
+ - default model assignments are resolved from centralised defaults; the fallback values are `Claude lead`/`Report writer worker`=`opus`, `Claude worker`=`sonnet`, `Codex worker`=`gpt-5.5`, `Gemini worker`=`auto`. Phase-specific overrides (e.g. `implementation`'s executor binding) live in the per-profile document.
12
+ - every required worker listed in the per-profile `Required workers:` block must be attempted; the final verdict waits until each has either a result or an explicit terminal status (`timeout`, `error`, `not-run`).
13
+ - unnamed generic parallel workers must not replace the required role roster, and no additional sub-agent dispatch is allowed beyond this roster.
14
+ - Tooling — read-only MCP availability (shared):
15
+ - the read-only MCP servers declared in the task brief's `## Available MCP Servers` section may be queried as a read-only cross-check; that section is the canonical source of which servers and tools exist for this run, and any MCP-derived finding MUST cite server, table, and the SELECT used. MCP MUST NEVER be used as a write path — schema/data mutations go through repository migration files reviewed by humans.
16
+ - Authority & permissions assumption (HARD RULE — applies to every okstra task-type):
17
+ - **Assume the user (and their team) holds full authority and every permission required for the anticipated, in-flight, or follow-up work.** Treat external approvals, third-party access grants, role/IAM permissions, organisational sign-off, legal/compliance review, vendor coordination, and "verify access exists" steps as already satisfied unless the task brief explicitly states otherwise.
18
+ - Do NOT add such items to routing decisions, missing-materials lists, clarification questions, option trade-offs, dependency/migration risk, validation checklists, rollout plans, acceptance blockers, residual risks, release recommendations, `Open Questions`, or any day/effort estimate. They are not legitimate sources of schedule extension.
19
+ - Internal okstra phase handoffs (e.g. the `User Approval Request` block in `implementation-planning`) are unaffected — those are the user themselves approving and proceed without external coordination.
20
+ - This rule does NOT relax any phase-specific Forbidden actions list; safety rules in the per-profile document remain in force regardless of the user's authority.
21
+ - Anti-escalation rule (shared):
22
+ - treating "다음 단계 진행해" or equivalent user phrases as authorisation to start a *different* lifecycle phase is forbidden. The next phase begins only in a separate okstra run launched with the new `--task-type`. Per-profile documents may further restrict this within their own scope.
23
+ - Clarification request policy (shared — applies whenever a profile uses `## 5. Clarification Requests for the Next Run`):
24
+ - section 5 MUST be split into two distinct sub-sections per `final-report-template.md` — `5.1 추가 자료 요청 (Additional Materials Requested)` for files/logs/screenshots/links the user must attach, and `5.2 사용자 확인 질문 (Questions for the User)` for decisions or facts only the user can confirm. Never mix material requests and decision questions in the same row or list.
25
+ - write every entry in full, descriptive sentences that a non-developer can act on without further context. Avoid abbreviations and internal jargon. For each material request, state *why* it is needed, *where* the user can find it, and *where* to place it. For each question, state *why* the answer changes the next step, *what* is being asked in a complete sentence, and *what shape of answer* is expected (예/아니오, 보기 중 하나, 숫자/날짜, 짧은 서술 등); supply concrete option choices when applicable.
26
+ - the same `final-report.md` file is the canonical artifact carried into the next run; the user appends answers inline before rerunning. The preferred turn-around is `scripts/okstra.sh --resume-clarification --task-key <project-id>:<task-group>:<task-id>` (opens the latest report in `$EDITOR`, then auto-reruns the same phase with `--clarification-response` carry-in). The lower-level form `--clarification-response <path>` remains available for scripted runs.
27
+ - if a clarification response was carried in for this run, reconcile each prior `A*` (material) and `Q*` (question) row in section 0 and update its `Status` (`resolved`, `obsolete`) before issuing the next decision/verdict.
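As a rough illustration of how a directive such as `{{INCLUDE:_common-contract.md}}` could be expanded at render time, the sketch below replaces each directive with the text of the named fragment. It is not the actual `_expand_profile_includes` from `scripts/okstra_ctl/run.py`; the function name `expand_includes`, the regex, and the in-memory `fragments` mapping are all assumptions made for illustration.

```python
import re

# Hypothetical pattern for the {{INCLUDE:<name>}} directive; the real parser may differ.
INCLUDE_RE = re.compile(r"\{\{INCLUDE:([^}]+)\}\}")


def expand_includes(profile_text: str, fragments: dict[str, str]) -> str:
    """Replace every {{INCLUDE:name}} directive with the fragment's text.

    `fragments` maps a fragment file name to its content (an assumption made
    so the sketch stays self-contained; a real loader would read from disk).
    Raises KeyError when a directive names an unknown fragment.
    """

    def _substitute(match: re.Match) -> str:
        name = match.group(1).strip()
        # Drop the fragment's trailing newline so the profile's own line break is reused.
        return fragments[name].rstrip("\n")

    return INCLUDE_RE.sub(_substitute, profile_text)


profile = "- report-writer\n{{INCLUDE:_common-contract.md}}\n- Primary focus areas:\n"
shared = {"_common-contract.md": "- Team contract (shared):\n"}
rendered = expand_includes(profile, shared)
```

Failing loudly on an unknown fragment name is a design choice of this sketch; it keeps a misspelled directive from silently rendering a profile without its shared contract.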
@@ -6,16 +6,7 @@
6
6
  - codex
7
7
  - gemini
8
8
  - report-writer
9
- - Team contract:
10
- - `Claude lead` is synthesis-only and stays distinct from `Claude worker`
11
- - required worker roles are `Claude worker`, `Codex worker`, `Gemini worker`, and `Report writer worker`
12
- - `Report writer worker` is the **author** of the final-report file; `Claude lead` reviews and approves the produced draft and does NOT write the file itself (see `okstra-team-contract` and `okstra-report-writer` for the authoritative contract)
13
- - default model assignments are resolved from centralized defaults; the fallback values are `Claude lead`/`Report writer worker`=`opus`, `Claude worker`=`sonnet`, `Codex worker`=`gpt-5.5`, `Gemini worker`=`auto`
14
- - `Gemini worker` must always be attempted for this workflow
15
- - the final verdict waits until each required worker has either a result or an explicit terminal status
16
- - unnamed generic parallel workers must not replace the required role roster
17
- - Tooling — read-only MCP availability:
18
- - the read-only MCP servers declared in the task brief's `## Available MCP Servers` section may be queried to confirm symptoms against live schema or to inspect rows that reproduce the failure; that section is the canonical source of which servers and tools exist for this run, and any MCP-derived hypothesis MUST cite server, table, and the SELECT used
9
+ {{INCLUDE:_common-contract.md}}
19
10
  - Primary focus areas:
20
11
  - symptom and trigger clarification
21
12
  - root-cause candidates
@@ -27,17 +18,10 @@
27
18
  - evidence-backed cause analysis
28
19
  - uncertainty boundaries
29
20
  - practical next diagnostic steps
30
- - Clarification request policy:
21
+ - Clarification request policy (phase-specific addenda — shared policy is in `_common-contract.md`):
31
22
  - if any blocking uncertainty remains at the time of writing the final report, populate `## 5. Clarification Requests for the Next Run` in `final-report-template.md`
32
- - section 5 must be split into two distinct sub-sections per the template — `5.1 추가 자료 요청 (Additional Materials Requested)` for files/logs/screenshots the user must attach, and `5.2 사용자 확인 질문 (Questions for the User)` for decisions or facts only the user can confirm. Never mix material requests and decision questions in the same row or list.
33
- - write every entry in full, descriptive sentences that a non-developer can act on without further context. Avoid abbreviations and internal jargon (e.g. write "초당 평균 요청 수" instead of "QPS", "재현 절차" instead of "repro"). For each material request, state *why* it is needed, *where* the user can find it, and *where* to place it. For each question, state *why* the answer changes the next step, *what* is being asked in a complete sentence, and *what shape of answer* is expected (예/아니오, 보기 중 하나, 숫자/날짜, 짧은 서술 등); supply concrete option choices when applicable.
34
- - the same `final-report.md` file is the canonical artifact carried into the next run; the user appends answers inline before rerunning. The preferred turn-around is `scripts/okstra.sh --resume-clarification --task-key <project-id>:<task-group>:<task-id>` (opens the latest report in `$EDITOR`, then auto-reruns the same phase with `--clarification-response` carry-in). The lower-level form `--clarification-response <path>` remains available for scripted runs.
35
- - if a clarification response was carried in for this run, reconcile each prior `A*` (material) and `Q*` (question) row in section 0 and update its `Status` (`resolved`, `obsolete`) before deciding the verdict
36
- - Authority & permissions assumption (HARD RULE — applies to every okstra task-type):
37
- - **Assume the user (and their team) holds full authority and every permission required for the anticipated work.** Treat external approvals, third-party access grants, role/IAM permissions, organisational sign-off, legal/compliance review, vendor coordination, and "verify access exists" steps as already satisfied unless the user explicitly states otherwise in the task brief.
38
- - Do NOT add such items to root-cause hypotheses, validation paths, missing-evidence lists, clarification questions, risk lists, or day/effort estimates. They are not legitimate sources of schedule extension.
39
- - Internal okstra phase handoffs (e.g. the `User Approval Request` block in `implementation-planning`) are unaffected — those are the user themselves approving and proceed without external coordination.
23
+ - prefer plain Korean over abbreviations (e.g. write "초당 평균 요청 수" instead of "QPS", "재현 절차" instead of "repro")
40
24
  - Non-goals:
41
25
  - implementation details unless they are necessary to validate the cause
42
26
  - **source code edits, builds, migrations, or deployments** — this run produces evidence and cause analysis only; the fix belongs to a later `implementation-planning` run followed by an `implementation` run
43
- - treating "다음 단계 진행해" or equivalent user phrases as authorisation to begin planning or coding — this run stays in `error-analysis` regardless of such phrasing
27
+ - this run stays in `error-analysis` regardless of user phrasing — the shared anti-escalation rule applies
@@ -6,16 +6,7 @@
6
6
  - codex
7
7
  - gemini
8
8
  - report-writer
9
- - Team contract:
10
- - `Claude lead` is synthesis-only and stays distinct from `Claude worker`
11
- - required worker roles are `Claude worker`, `Codex worker`, `Gemini worker`, and `Report writer worker`
12
- - `Report writer worker` is the **author** of the final-report file; `Claude lead` reviews and approves the produced draft and does NOT write the file itself (see `okstra-team-contract` and `okstra-report-writer` for the authoritative contract)
13
- - default model assignments are resolved from centralized defaults; the fallback values are `Claude lead`/`Report writer worker`=`opus`, `Claude worker`=`sonnet`, `Codex worker`=`gpt-5.5`, `Gemini worker`=`auto`
14
- - `Gemini worker` must always be attempted for this workflow
15
- - the final verdict waits until each required worker has either a result or an explicit terminal status
16
- - unnamed generic parallel workers must not replace the required role roster
17
- - Tooling — read-only MCP availability:
18
- - the read-only MCP servers declared in the task brief's `## Available MCP Servers` section may be queried to verify that the delivered change matches the live schema, that expected rows exist after a migration, or that invariants in `reference-expectations.md` hold against the database; that section is the canonical source of which servers and tools exist for this run, and any MCP-derived blocker MUST cite server, table, and the SELECT used. MCP MUST NOT be used to perform fixes — defects become inputs to a new run.
9
+ {{INCLUDE:_common-contract.md}}
19
10
  - Primary focus areas:
20
11
  - requirement coverage
21
12
  - whether delivered config files and deployment manifests satisfy the recorded expected values
@@ -34,23 +25,16 @@
34
25
  - **Validation Evidence**: for every requirement in the originating plan or task brief, cite the artifact (commit SHA, test output, log line, MCP SELECT result) that demonstrates coverage. Paraphrased "verified" claims without an artifact are rejected.
35
26
  - **Read-only command log**: any pre-existing test/validation command executed during this run MUST be listed with its exact command line and exit code. No mutating commands may appear here.
36
27
  - **Routing recommendation**: brief note on the next safe phase (`done`, `error-analysis`, `implementation-planning`) tied to the verdict and blocker list.
37
- - Clarification request policy:
38
- - if a blocker hinges on information only the user can supply (deployment intent, intended target environment, business-rule interpretation), populate `## 5. Clarification Requests for the Next Run` in `final-report-template.md`
39
- - section 5 must be split into `5.1 추가 자료 요청 (Additional Materials Requested)` and `5.2 사용자 확인 질문 (Questions for the User)` per the template. Never mix material requests and decision questions in the same row or list.
40
- - write every entry in full, descriptive sentences that a non-developer can act on without further context. Avoid abbreviations and internal jargon. For each material request, state *why* it is needed, *where* the user can find it, and *where* to place it. For each question, state *why* the answer changes the verdict, *what* is being asked in a complete sentence, and *what shape of answer* is expected (예/아니오, 보기 중 하나, 숫자/날짜, 짧은 서술 등); supply concrete option choices when applicable.
41
- - the preferred turn-around is `scripts/okstra.sh --resume-clarification --task-key <project-id>:<task-group>:<task-id>`; the lower-level form `--clarification-response <path>` remains available for scripted runs.
42
- - if a clarification response was carried in for this run, reconcile each prior `A*` and `Q*` row in section 0 and update its `Status` (`resolved`, `obsolete`) before issuing the final verdict
28
+ - Clarification request policy (phase-specific addendum — shared policy is in `_common-contract.md`):
29
+ - populate section 5 only when a blocker hinges on information only the user can supply (deployment intent, intended target environment, business-rule interpretation)
43
30
  - Self-review pass before finalising the report (`Claude lead` runs this; do not delegate to a generic subagent):
44
31
  1. **Verdict precision** — section 2 uses one of the three allowed verdict tokens; `conditional-accept` lists every condition as an actionable item.
45
32
  2. **Blocker traceability** — every blocker cites a concrete artifact (file:line, log excerpt, test exit code, MCP SELECT). Blockers without evidence are demoted to residual risk or removed.
46
33
  3. **Coverage check** — every requirement in the originating plan/task brief is either marked covered (with artifact) or listed as a blocker. No silent omissions.
47
34
  4. **Verifier dissent preserved** — if workers reach different verdicts, the disagreement is visible in section 1.2; synthesis hides nothing.
48
35
  5. **No-mutation audit** — scan the run's session transcripts for any Edit / Write / mutating Bash command. Any occurrence means the run has crossed into implementation and MUST be re-routed; do NOT silently strip the evidence.
49
- - Authority & permissions assumption (HARD RULE — applies to every okstra task-type):
50
- - **Assume the user (and their team) holds full authority and every permission required for the delivered and follow-up work.** Treat external approvals, third-party access grants, role/IAM permissions, organisational sign-off, legal/compliance review, vendor coordination, and "verify access exists" steps as already satisfied unless the task brief explicitly states otherwise.
51
- - Do NOT raise such items as acceptance blockers, residual risks, or release recommendations, and do not factor them into any effort/day estimate for follow-up runs. They are not legitimate sources of schedule extension.
52
36
  - Non-goals:
53
37
  - proposing unrelated refactors beyond the delivered scope
54
38
  - **source code edits, follow-up bug fixes, or scope expansion** — this run renders a verdict only; defects detected here become inputs to a new `error-analysis` or `implementation-planning` run
55
39
  - read-only execution of pre-existing test or validation commands is permitted, but any command that mutates source, schema, or deployment state is forbidden
56
- - treating "다음 단계 진행해" or equivalent user phrases as authorisation to fix detected issues inside this run; record the issues and end the run
40
+ - this run records detected issues and ends; the shared anti-escalation rule forbids in-run fixes regardless of user phrasing
@@ -6,16 +6,7 @@
6
6
  - codex
7
7
  - gemini
8
8
  - report-writer
9
- - Team contract:
10
- - `Claude lead` is synthesis-only and stays distinct from `Claude worker`
11
- - required worker roles are `Claude worker`, `Codex worker`, `Gemini worker`, and `Report writer worker`
12
- - `Report writer worker` is the **author** of the final-report file; `Claude lead` reviews and approves the produced draft and does NOT write the file itself (see `okstra-team-contract` and `okstra-report-writer` for the authoritative contract)
13
- - default model assignments are resolved from centralized defaults; the fallback values are `Claude lead`/`Report writer worker`=`opus`, `Claude worker`=`sonnet`, `Codex worker`=`gpt-5.5`, `Gemini worker`=`auto`
14
- - `Gemini worker` must always be attempted for this workflow
15
- - the final verdict waits until each required worker has either a result or an explicit terminal status
16
- - unnamed generic parallel workers must not replace the required role roster
17
- - Tooling — read-only MCP availability:
18
- - the read-only MCP servers declared in the task brief's `## Available MCP Servers` section may be queried to size the blast radius of an option (table cardinality, column types, foreign-key fan-out, indexes), to validate migration assumptions, or to confirm that a proposed query shape returns the expected rows; that section is the canonical source of which servers and tools exist for this run, and any MCP-derived figure entering the trade-off matrix or risk assessment MUST cite server, table, and the SELECT used. MCP MUST NOT be used as a write path even when planning a migration — schema changes belong in migration files reviewed by humans.
9
+ {{INCLUDE:_common-contract.md}}
19
10
  - Pre-planning context exploration (mandatory before option drafting):
20
11
  - read the task brief, related-task briefs, and any cited spec / design doc end-to-end
21
12
  - inspect the current state of every file the task names (or the closest matching files if names are stale) — record current responsibilities, public interfaces, and known coupling points
@@ -37,15 +28,13 @@
37
28
  - feasible plan options
38
29
  - dependency and risk visibility
39
30
  - recommended execution order
40
- - Authority & permissions assumption (HARD RULE — applies to every okstra task-type):
41
- - **Assume the user (and their team) holds full authority and every permission required for the anticipated work.** Treat external approvals, third-party access grants, role/IAM permissions, organisational sign-off, legal/compliance review, vendor coordination, and "verify access exists" steps as already satisfied unless the user explicitly states otherwise in the task brief.
42
- - Do NOT add such items to option trade-offs, dependency/migration risk, validation checklists, rollout plans, `Open Questions`, or day/effort estimates. They are not legitimate sources of schedule extension; effort sizing must reflect engineering work only.
43
- - The `User Approval Request (사용자 승인 게이트)` block at the **top of the final report** (immediately under the metadata header, before `Section 0`) is the only authorised approval gate — that gate is the user themselves, who clears it either by (a) editing the single checkbox `- [ ] Approved` to `- [x] Approved` directly, or (b) invoking the next phase with `--approve` so the CLI invocation itself is treated as the approval signal and the runtime flips the checkbox on the user's behalf. No external coordination is expected.
31
+ - Approval gate (phase-specific addendum to shared authority rule):
32
+ - The `User Approval Request (사용자 승인 게이트)` block at the **top of the final report** (immediately under the metadata header, before `Section 0`) is the only authorised approval gate. The user clears it either by (a) editing the single checkbox `- [ ] Approved` to `- [x] Approved` directly, or (b) invoking the next phase with `--approve` so the CLI invocation itself is treated as the approval signal and the runtime flips the checkbox on the user's behalf.
44
33
  - Non-goals:
45
34
  - code-level micro-optimization unless it changes the implementation approach
46
35
  - **source code edits of any kind** — this run produces a plan document only; Edit/Write on project source files is forbidden until the plan is approved and a separate `implementation` run starts
47
36
  - executing builds, migrations, deployments, or any command that mutates project state outside the run's own artifact directories (`reports/`, `prompts/`, `state/`, `manifests/`, `worker-results/`, `status/`, `sessions/`)
48
- - treating "다음 단계 진행해" or equivalent user phrases as authorisation to start the implementation phase — this run stays in `implementation-planning` regardless of such phrasing
37
+ - this run stays in `implementation-planning` regardless of user phrasing — the shared anti-escalation rule applies
49
38
  - dispatching parallel sub-agents beyond the required worker roster — okstra owns worker fan-out
50
39
  - writing artifacts to `docs/superpowers/specs/` or `docs/superpowers/plans/` — the run's `reports/` directory is the canonical location
51
40
  - Section heading contract (BLOCKING — validator scans for these literal English substrings):
@@ -79,7 +68,3 @@
79
68
  3. **Internal consistency** — option file lists, trade-off matrix, and recommended step list must agree on file paths, names, and signatures. A symbol called `clearLayers()` in the matrix and `clearFullLayers()` in the steps is a bug.
80
69
  4. **Ambiguity check** — any requirement that could be read two ways must be made explicit or moved to `Open Questions`.
81
70
  5. **Scope check** — if the recommended plan now spans multiple independent subsystems, recommend splitting into separate planning runs rather than shipping an oversized plan.
82
- - Skill provenance (for maintainers):
83
- - "Pre-planning context exploration", "Design principles applied when scoring options", and the self-review pass items 3–5 are adapted from the `brainstorming` skill (clarification + spec self-review portions). The interactive one-question-at-a-time dialogue, visual companion, and `docs/superpowers/specs/` save path from that skill are intentionally **not** adopted — okstra is a non-interactive multi-worker run with its own artifact layout.
84
- - "File Structure" per option, bite-sized step granularity, the No-placeholder rule, and self-review items 1–2 are adapted from the `writing-plans` skill. The skill's default save path (`docs/superpowers/plans/`), the subagent-vs-inline execution-handoff prompt, and the plan-document header referencing other skills are intentionally **not** adopted — okstra owns artifact paths, worker dispatch, and lifecycle handoff.
85
- - Skill names above are written without the deprecated `superpowers:` prefix.
@@ -12,18 +12,13 @@
12
12
  - Executor subagent for dispatch: `{{EXECUTOR_WORKER_AGENT}}`
13
13
  - Executor model: `{{EXECUTOR_MODEL_DISPLAY}}` (launch value: `{{EXECUTOR_MODEL_EXECUTION_VALUE}}`)
14
14
  - Wherever this profile mentions the `Executor`, it refers to the role bound above. The other two providers in the roster (`claude` / `codex` / `gemini` minus the executor) are dispatched as **verifiers only** for this run and remain strictly read-only.
15
- - Team contract:
16
- - `Claude lead` is synthesis-only and stays distinct from the `Executor` and the verifiers
15
+ {{INCLUDE:_common-contract.md}}
16
+ - Team contract (phase-specific overrides — `Claude worker` is replaced by `Executor` + verifier triad in this phase):
17
17
  - **Executor role:** the `Executor` (bound above) is the **only worker permitted to use Edit / Write / state-mutating Bash commands** on project files. All other workers run read-only. When the executor provider is `codex` or `gemini`, the actual file mutation happens inside the executor CLI's own auto-edit mode (e.g. `codex exec --full-auto`, gemini's equivalent) — not through Claude-side Edit/Write tools — but the safety rules in this profile still apply identically.
18
18
  - **Verifier roles:** the three verifier slots are `Claude verifier`, `Codex verifier`, and `Gemini verifier`. All three are dispatched regardless of which provider holds the executor role; the executor's own provider is run *separately* as a verifier (a fresh CLI session with no shared context) so that no verdict is produced from the same session that wrote the diff. Verifiers MUST NOT call Edit, Write, or any Bash command that mutates files outside the run's artifact directories. If a verifier wants a fix, it records the recommendation in its worker result; it does not apply the fix itself.
19
- - Session isolation — not model-variant divergence — is the primary self-review safeguard: each verifier is a separate CLI invocation with its own context window, so reusing the same model variant for executor and same-provider verifier is acceptable. Assigning different model variants (e.g. executor=opus / Claude verifier=sonnet) remains recommended when available because it adds defence-in-depth, but it is no longer a hard requirement.
20
- - `Report writer worker` is the **author** of the final-report file; `Claude lead` reviews and approves the produced draft and does NOT write the file itself (see `okstra-team-contract` and `okstra-report-writer` for the authoritative contract).
21
- - default model assignments are resolved from centralised defaults; the fallback values are `Claude lead`/`Report writer worker`=`opus`, `Claude verifier`=`sonnet`, `Codex verifier`=`gpt-5.5`, `Gemini verifier`=`auto`. The `Executor`'s model is taken from the provider-specific worker model corresponding to `--executor`: claude→`--claude-model` (default `sonnet`, override to `opus` recommended when this run's executor is claude), codex→`--codex-model` (default `gpt-5.5`), gemini→`--gemini-model` (default `auto`).
22
- - all three verifier roles (`Gemini verifier`, `Codex verifier`, `Claude verifier`) must be attempted; the final verdict waits until each has either a result or an explicit terminal status
23
- - **All-verifier-failure policy**: if every required verifier (`Gemini verifier`, `Codex verifier`, `Claude verifier`) ends with a non-result terminal status (`timeout`, `error`, `not-run`) — i.e. zero independent verdicts were produced — the run MUST end with status `blocked` and route to a follow-up `error-analysis` run. `Claude lead` MUST NOT substitute its own verdict in place of the missing verifier outputs; synthesis requires at least one independent verifier's verdict. If one or two verifiers fail but at least one returns a verdict, the run proceeds with the surviving verdict(s) and the final report MUST explicitly notate which verifiers were unavailable, with the captured error / timeout evidence per failed verifier.
24
- - unnamed generic parallel workers must not replace the required role roster, and no additional sub-agent dispatch is allowed beyond this roster
25
- - Tooling — read-only MCP availability:
26
- - the read-only MCP servers declared in the task brief's `## Available MCP Servers` section may be queried by both executor and verifiers as a read-only cross-check (sanity-checking row counts after a migration script's dry-run, comparing observed schema against the plan's expectations, etc.); that section is the canonical source of which servers and tools exist for this run, and any MCP-derived evidence MUST cite server, table, and the SELECT used. MCP MUST NEVER be used as a write path — schema/data mutations go through repository migration files, never through this MCP.
19
+ - Session isolation — not model-variant divergence — is the primary self-review safeguard: each verifier is a separate CLI invocation with its own context window, so reusing the same model variant for executor and same-provider verifier is acceptable. Different model variants (e.g. executor=opus / Claude verifier=sonnet) remain recommended when available.
20
+ - Phase-specific model defaults override the shared defaults: `Claude verifier`=`sonnet`, `Codex verifier`=`gpt-5.5`, `Gemini verifier`=`auto`. The `Executor`'s model is taken from the provider-specific worker model corresponding to `--executor`: claude→`--claude-model` (default `sonnet`, override to `opus` recommended when this run's executor is claude), codex→`--codex-model` (default `gpt-5.5`), gemini→`--gemini-model` (default `auto`).
21
+ - **All-verifier-failure policy**: if every verifier (`Gemini verifier`, `Codex verifier`, `Claude verifier`) ends with a non-result terminal status (`timeout`, `error`, `not-run`), i.e. zero independent verdicts were produced, the run MUST end with status `blocked` and route to a follow-up `error-analysis` run. `Claude lead` MUST NOT substitute its own verdict in place of the missing verifier outputs; synthesis requires at least one independent verifier's verdict. If one or two verifiers fail but at least one returns a verdict, the run proceeds with the surviving verdict(s) and the final report MUST explicitly notate which verifiers were unavailable, with the captured error / timeout evidence per failed verifier.
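The all-verifier-failure policy can be read as a small decision function. The sketch below is one possible reading, not okstra's implementation: the function name `route_after_verification`, the `"proceed"` label, and the shape of the returned dictionary are hypothetical, and any status outside the three named failure statuses is assumed to mean that a verdict was produced.

```python
# Terminal statuses that count as "no verdict", as named in the policy text.
TERMINAL_FAILURES = {"timeout", "error", "not-run"}
VERIFIERS = ("Claude verifier", "Codex verifier", "Gemini verifier")


def route_after_verification(statuses: dict[str, str]) -> dict:
    """Apply the all-verifier-failure policy to per-verifier statuses.

    A verifier missing from `statuses` is treated as `not-run` (an assumption).
    """
    unavailable = [
        name for name in VERIFIERS
        if statuses.get(name, "not-run") in TERMINAL_FAILURES
    ]
    if len(unavailable) == len(VERIFIERS):
        # Zero independent verdicts: the lead must not substitute its own.
        return {"status": "blocked", "next": "error-analysis", "unavailable": unavailable}
    # At least one verdict survived; the report must list the unavailable verifiers.
    return {"status": "proceed", "next": None, "unavailable": unavailable}
```

Returning the `unavailable` list in both branches mirrors the requirement that the final report names every failed verifier even when the run proceeds.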
27
22
  - Pre-implementation gate (mandatory — refuse to start if any item fails):
28
23
  - the run brief MUST cite `--approved-plan <path>` pointing to a `final-report.md` produced by a prior `implementation-planning` run located under `runs/implementation-planning/.../reports/final-report.md`
29
24
  - that file MUST contain a `User Approval Request` block (canonically placed at the **top of the report**, immediately under the metadata header) AND a recorded user approval marker. The canonical, recommended form is the single markdown checkbox line `- [x] Approved`. The runtime regex in `okstra_ctl.run._validate_approved_plan` also accepts (case-insensitive, line-anchored, optional leading `-`/`*`/`+` bullet): `APPROVED` (alone, followed by `:`, or end-of-line), `[x] Approved`, or `User Approval: APPROVED|granted|yes`. Free-form approvals such as "lgtm", "go ahead", or paraphrased confirmations are intentionally NOT accepted; if the user's approval is informal, re-edit the plan file to flip the top checkbox to `- [x] Approved` before invoking the implementation run.
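To make the accepted approval markers concrete, here is a minimal sketch of a matcher built only from the forms listed above. It is not the regex used by `okstra_ctl.run._validate_approved_plan`, which is not shown in this diff; the names `APPROVAL_RE` and `has_approval_marker` are hypothetical and the real pattern may be stricter or looser at the edges.

```python
import re

# Hypothetical reconstruction from the prose description, not the runtime regex.
APPROVAL_RE = re.compile(
    r"""^[ \t]*(?:[-*+][ \t]+)?          # optional leading markdown bullet
    (?:
        \[x\][ \t]+approved              # checkbox form, e.g. "[x] Approved"
      | user[ \t]+approval[ \t]*:[ \t]*(?:approved|granted|yes)
      | approved(?=[ \t]*(?::|$))        # bare APPROVED, then ":" or end of line
    )""",
    re.IGNORECASE | re.MULTILINE | re.VERBOSE,
)


def has_approval_marker(plan_text: str) -> bool:
    """Return True when any line of the plan carries a recognised approval marker."""
    return APPROVAL_RE.search(plan_text) is not None


approved_plan = "# Plan\n- [x] Approved\n"
pending_plan = "# Plan\n- [ ] Approved\n"
informal = "lgtm, go ahead"
```

Under these assumptions `has_approval_marker(approved_plan)` is true, while the unchecked box and the informal "lgtm" text are both rejected, which matches the stated intent that free-form approvals do not count.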
@@ -42,7 +37,12 @@
42
37
  - **Verifier behaviour**: all three verifier roles read from the SAME working tree path so they observe the exact diff the Executor produced. Verifiers remain strictly read-only there.
43
38
  - **Lifecycle**: the worktree is kept after the run completes (no automatic cleanup). It is the canonical artefact for manual PR authoring, rollback verification, and follow-up `final-verification` runs. Cleanup, when desired, is manual: `git -C <project_root> worktree remove <path>` followed by `git -C <project_root> branch -D <branch>`.
44
39
  - **Skipped paths**: when status is `skipped-in-worktree` or `skipped-not-git`, the executor operates in `project_root` as before. Cite the status in the final report's metadata header so reviewers know which path was taken.
40
+ - **Synced state directories (symlinks into the original `project_root`)**: at provision time `okstra-ctl` symlinks `.project-docs/`, `.scratch/`, and `graphify-out/` from the original `project_root` into the worktree (override via `OKSTRA_WORKTREE_SYNC_DIRS`; empty string disables). These are NOT independent copies — writes through them land in `project_root`. Inside this run the executor MUST confine writes under these paths to its own task scope (i.e. only `.project-docs/okstra/tasks/<this-task-id>/...`). Reading from elsewhere under the symlinks (other tasks, `graphify-out/GRAPH_REPORT.md`, `.scratch/` issues) is allowed and expected for context.
45
41
  - Pre-implementation context exploration (executor before first edit):
42
+ - **Mandatory skill invocation — `tdd`**: BEFORE the first `Edit` or `Write` call, the executor MUST invoke the `tdd` skill via the `Skill` tool and follow its red-green-refactor loop for every code change in this run. This is a hard requirement, not a recommendation; skipping it is a `contract-violated` outcome. The skill governs HOW each step is executed (failing test first → minimal implementation → refactor); it does not override the approved plan's WHAT/file scope.
43
+ - Order of operations per plan step: (1) write/extend the test that captures the step's acceptance criterion and confirm it fails for the right reason, (2) commit the failing test (`test(<scope>): ...`), (3) implement the minimum change to make it pass, (4) commit the implementation (`feat|fix(<scope>): ...`), (5) refactor without changing behaviour and commit separately if any cleanup is made (`refactor(<scope>): ...`). The failing-then-passing transition between steps (2) and (4) is the `TDD evidence` required by the final report.
44
+ - Doc-only / config-only / pure-rename steps that have no observable runtime behaviour are exempt from the failing-test requirement, but the executor MUST cite the exemption per step in the final report (`TDD exemption: <reason>`).
45
+ - When the touched area has no existing test harness, the executor MUST stand up the minimum harness needed to host one regression test for this run rather than skipping TDD entirely. Record the harness-bootstrap step as an `Out-of-plan edit` if it is not in the plan.
46
46
  - re-read the approved plan end-to-end and extract: file list, step order, validation commands, rollback path
47
47
  - inspect the current state of every file the plan names; if any file has changed materially since the plan was written, stop and route to a new `implementation-planning` run instead of editing speculatively
48
48
  - "materially changed" means: the function, class, section, or behaviour the plan targets has been edited, renamed, moved, removed, or otherwise altered in a way that invalidates the plan's reasoning. Cosmetic edits (whitespace, comment-only changes, unrelated function modifications elsewhere in the same file) do NOT trigger a re-plan; cite the diff (`git log --oneline <plan-created-at>..HEAD -- <file>`) in the final report and proceed.
@@ -55,10 +55,17 @@
55
55
  - read-only inspection commands: `git status`, `git diff`, `git log`, `grep`, `rg`, `find`, `cat`, `ls`, file Read tools
56
56
  - build, lint, type-check, and test commands (`npm test`, `pytest`, `go build`, `cargo test`, `bash -n`, etc.)
57
57
  - **local git operations only**: `git add`, `git commit`. Prefer small commits keyed to plan steps.
58
- - Authority & permissions assumption (HARD RULE, applies to every okstra task-type):
59
- - **Assume the user (and their team) holds full authority and every permission required for the approved work.** Treat external approvals, third-party access grants, role/IAM permissions, organisational sign-off, legal/compliance review, vendor coordination, and "verify access exists" steps as already satisfied unless the approved plan or task brief explicitly states otherwise.
60
- - Do NOT add such items to commit ordering, validation evidence, rollback verification, routing recommendations, or any time/effort accounting. They are not legitimate sources of schedule extension.
61
- - The pre-implementation gate's recorded user approval marker is the only authorised approval gate at this phase; proceed once it is satisfied without further external coordination. (This rule does NOT relax the existing forbidden-action list below; deploy/publish/push/migrate to shared environments remain blocked for safety, not for permission reasons.)
58
+ - **Commit message format (mandatory)**: every commit message MUST follow Conventional Commits — `<type>(<scope>): <subject>` for the first line, optional body separated by a blank line, optional footer. Constraints:
59
+ - `<type>` MUST be one of: `feat` / `fix` / `perf` / `revert` / `deps` / `docs` / `refactor` / `build` / `ci` / `chore` / `test`. When the repo is `release-please`-managed, this aligns the commit with a configured changelog section.
60
+ - `<scope>` SHOULD be the plan step identifier or the primary module touched (e.g. `feat(report-writer): ...`). Omit the parentheses only when no meaningful scope applies.
61
+ - `<subject>` MUST be ≤72 characters, imperative mood (`add`, `fix`, `remove` not `added` / `adding`), no trailing period, no emoji, no AI attribution lines (no `Co-Authored-By: Claude ...`, no `Generated with Claude Code`).
62
+ - Body (when present) explains *why*, not *what*; wrap at ~100 chars.
63
+ - Do NOT append okstra artefact paths to the commit message — no `Plan: .project-docs/okstra/...`, no `Report: ...`, no `Run: ...`, no `Task: ...` footers, and no other reference to files under `.project-docs/okstra/`. Those paths belong in the final report's `Plan link & approval evidence` section, not in git history; they rot quickly and leak internal layout into the upstream changelog.
64
+ - Allowed footers are limited to standard Conventional Commits trailers (`BREAKING CHANGE: ...`, `Refs: <issue/ticket-id>`, `Closes #<n>`). When citing a ticket, use the ticket id only (e.g. `Refs: DEV-9423`) — never a filesystem path.
65
+ - One commit MUST correspond to one plan step (or one cohesive sub-step). Do NOT bundle unrelated steps into a single commit, and do NOT split a single step across commits unless the plan explicitly sequenced it that way.
66
+ - The exact message used for each commit MUST be reproduced verbatim in the final report's `Commit list` so reviewers can audit it without re-running `git log`.
67
+ - Approval gate (phase-specific addendum to shared authority rule):
68
+ - the pre-implementation gate's recorded user approval marker is the only authorised approval gate at this phase — proceed once it is satisfied without further external coordination
62
69
  - Forbidden actions (any occurrence → terminal status `contract-violated`):
63
70
  - **`git push` of any kind**, including `--dry-run` against a real remote that produces side-effects
64
71
  - publishing or release commands: `npm publish`, `cargo publish`, `pip publish`, `gh release`, `docker push`
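The commit message constraints added in this hunk are mechanical enough to lint. The following is a minimal sketch, not part of okstra: `commit_message_problems` and its constants are hypothetical names, and it checks only the rules that can be tested textually (allowed type, subject length, trailing period, blank line before the body, forbidden attribution and artefact paths). Imperative mood and emoji are not checked.

```python
import re

ALLOWED_TYPES = ("feat", "fix", "perf", "revert", "deps", "docs",
                 "refactor", "build", "ci", "chore", "test")
# <type>(<scope>): <subject>, with the scope optional
HEADER = re.compile(r"^(?P<type>[a-z]+)(?:\((?P<scope>[^()]+)\))?: (?P<subject>.+)$")
FORBIDDEN_SNIPPETS = ("Co-Authored-By: Claude", "Generated with Claude Code",
                      ".project-docs/okstra/")


def commit_message_problems(message: str) -> list[str]:
    """Return a list of rule violations; an empty list means the message passes."""
    problems = []
    lines = message.splitlines()
    header = lines[0] if lines else ""
    m = HEADER.match(header)
    if m is None:
        problems.append("first line is not `<type>(<scope>): <subject>`")
    else:
        if m["type"] not in ALLOWED_TYPES:
            problems.append(f"type {m['type']!r} is not allowed")
        subject = m["subject"]
        if len(subject) > 72:
            problems.append("subject exceeds 72 characters")
        if subject.endswith("."):
            problems.append("subject ends with a period")
    if len(lines) > 1 and lines[1].strip():
        problems.append("body must be separated from the subject by a blank line")
    for snippet in FORBIDDEN_SNIPPETS:
        if snippet in message:
            problems.append(f"forbidden content: {snippet}")
    return problems
```

A check like this could run over `git log --format=%B` output before the final report is written, so that the `Commit list` section only ever reproduces messages that already satisfy the contract.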
@@ -83,6 +90,7 @@
83
90
  - **Feature-flag-gated changes**: confirm the off-switch path was exercised in this run's validation evidence (i.e. one of the validation commands ran with the flag off and succeeded). A plan that ships a flag without exercising the off-path does NOT satisfy this requirement.
84
91
  - **Schema migrations, config-format changes, or any change with persisted state**: a **dry-run of the rollback step is mandatory**, not preferred. Record the exact rollback command and its captured exit code / stdout. If the migration tool offers no dry-run mode (`--dry-run`, `--plan`, equivalent), the executor MUST refuse to claim rollback verification and instead end the run with a routing recommendation back to `implementation-planning` for a safer rollback strategy. Skipping this step on a stateful change is treated as a `contract-violated` outcome by `final-verification`.
85
92
  - **Routing recommendation for `final-verification`**: brief note on whether the changes are ready for final-verification phase or need a new error-analysis / planning loop first.
93
+ - **Follow-up tasks (Section 7 of the final report)**: every item discovered during this run that was *not* delivered MUST appear in the final report's `## 7. Follow-up Tasks (후속 작업)` table with a concrete `Origin`, `New Task ID`, `Suggested task-type`, `Scope`, and `Reason / Why deferred`. Sources include: out-of-scope discoveries that the executor consciously chose not to fold into this run, verifier concerns the executor declined to fix in-place, scope-boundary items from the approved plan that turned out to need their own ticket, and any open question carried over from `4.5.9`. An empty section is acceptable but only when expressed as the single line `- 후속 작업 없음.` — silence is treated as a contract violation. Rows with `Auto-spawn? = yes` will be materialised by `scripts/okstra-spawn-followups.py` in Phase 7; rows with `Auto-spawn? = no` MUST also appear in `Section 6. Recommended Next Steps` so the user knows to act manually.
86
94
  - Self-review pass before finalising the report (`Claude lead` runs this; do not delegate to a generic subagent):
87
95
  1. **Plan coverage** — every step in the approved plan's recommended option must point to a commit (or an explicit `Skipped: <reason>` entry). List gaps.
88
96
  2. **Evidence completeness** — every `Validation evidence` and `TDD evidence` claim has the actual command line and exit code? No paraphrased "tests pass" without output?
@@ -94,10 +102,5 @@
94
102
  git diff <base>..HEAD | grep -E '^\+[^+].*\b(TBD|TODO|FIXME|XXX|implement later|handle edge cases|similar to|placeholder)\b' || echo 'clean'
95
103
  ```
96
104
  Only newly-added lines (those starting with `+` and not part of the `+++` header) are inspected. If output is anything other than `clean`, the run MUST either remove the placeholders before finalising or record an explicit justification per occurrence in the final report.
97
- - Skill provenance (for maintainers — these skills are referenced as inspiration, NOT invoked at runtime):
98
- - The pre-implementation gate, "Plan coverage" and "Evidence completeness" self-review items are adapted from the `executing-plans` skill (inline mode). The skill's "subagent-driven vs inline" choice prompt and its plan-document-header conventions are intentionally not adopted; okstra owns lifecycle handoff.
99
- - The "TDD evidence" requirement and the failing-test-before-implementation ordering preference are adapted from `test-driven-development`. Strict enforcement is relaxed where the touched area has no test infrastructure; in those cases the executor must add at minimum a regression test and record a justification.
100
- - The "Validation evidence" and "Forbidden action audit" requirements are adapted from `verification-before-completion`. The skill's bare-name principle ("evidence before assertions") is treated as a hard rule.
101
- - Verifier role behaviour (read-only review, recommendation without application, dissent preservation) is adapted from `receiving-code-review` and `requesting-code-review`. The skills' UI/dialogue framing is replaced by structured worker-result sections.
102
- - In-phase debugging follows the spirit of `systematic-debugging` (root cause before fix), but the executor must NOT route to a separate `error-analysis` phase mid-run; if a defect blocks plan progress, the executor records findings and routes to a new run after this one ends.
103
- - Skill names above are written without the deprecated `superpowers:` prefix.
105
+ - In-phase debugging:
106
+ - follows the spirit of `systematic-debugging` (root cause before fix), but the executor MUST NOT route to a separate `error-analysis` phase mid-run; if a defect blocks plan progress, the executor records findings and routes to a new run after this one ends.
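The placeholder scan shown earlier in this hunk (`git diff <base>..HEAD | grep -E ...`) can also be expressed in Python for environments where a shell pipeline is inconvenient. This is a sketch under the assumption that the diff text has already been captured as a string; `placeholder_lines` is a hypothetical helper, not an okstra function, and it reuses the same pattern as the `grep` call.

```python
import re

# Same pattern as the grep call: added lines only ("+" not followed by "+"),
# containing one of the placeholder phrases as a whole word.
PLACEHOLDER = re.compile(
    r"^\+[^+].*\b(TBD|TODO|FIXME|XXX|implement later|handle edge cases|similar to|placeholder)\b"
)


def placeholder_lines(diff_text: str) -> list[str]:
    """Return the newly added diff lines that still contain a placeholder."""
    return [line for line in diff_text.splitlines() if PLACEHOLDER.search(line)]
```

An empty result corresponds to the `clean` outcome of the shell version; any returned line needs either removal or a per-occurrence justification in the final report.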
@@ -4,15 +4,11 @@
4
4
  - Required workers:
5
5
  - claude
6
6
  - report-writer
7
- - Workers:
8
- - claude
9
- - report-writer
10
- - Team contract:
7
+ {{INCLUDE:_common-contract.md}}
8
+ - Team contract (phase-specific overrides):
11
9
  - `Claude lead` is the **executor of every git / gh command** in this phase. Workers never call mutating git commands themselves.
12
10
  - `Claude worker` (drafter) is read-only and produces **commit message candidate(s) and a PR body candidate** in markdown; the lead presents these to the user, accepts edits, and only then runs git.
13
- - `Report writer worker` authors the final-report file documenting what was executed (commit SHA list, PR URL, user selections, skipped actions). `Claude lead` reviews and approves the draft and does NOT write the file itself (see `okstra-team-contract` and `okstra-report-writer`).
14
- - default model assignments are resolved from centralised defaults; the fallback values are `Claude lead`/`Report writer worker`=`opus`, `Claude worker`=`sonnet`.
15
- - unnamed generic parallel workers must not replace the required role roster, and no additional sub-agent dispatch is allowed beyond this roster.
11
+ - Codex / Gemini workers are NOT part of this profile's roster (see `Required workers:` above); the shared contract's `Gemini worker must always be attempted` clause does not apply to release-handoff.
16
12
  - Pre-handoff entry gate (mandatory — refuse to start if any item fails):
17
13
  - the task brief MUST cite the originating `final-verification` final-report path under `## Source Verification Report`. The lead opens that file and confirms section `## 2. Final Verdict` contains exactly the token `accepted`.
18
14
  - if the verdict is `conditional-accept`, `blocked`, or any other token (including ambiguous phrasing like "looks good"), the run MUST end immediately with status `blocked` and a routing recommendation back to `error-analysis` or `implementation-planning`. Do NOT prompt the user; do NOT run any git command.
@@ -50,10 +46,6 @@
50
46
  - PR creation (only when the user picked `commit + PR` AND no PR with the same head already exists on origin): `gh pr create --base <chosen-base> --head <current-branch> --title "<title>" --body "<body>"`. The title is the commit message subject by default; the body is the user-confirmed PR body.
51
47
  - PR reuse: if `gh pr list --head <branch> --state open --json url --jq '.[0].url'` returns a URL, treat that PR as already existing — record the URL in the final report and SKIP `gh pr create`.
52
48
  - Idempotency: if `git diff --cached` and `git diff` are both empty (nothing to commit), record "no staged changes; commit skipped" in the final report and skip `git commit` while still proceeding to the PR step if requested.
53
- - Authority & permissions assumption (HARD RULE — applies to every okstra task-type):
54
- - **Assume the user (and their team) holds full authority and every permission required for the delivered and follow-up work.** Treat external approvals, third-party access grants, role/IAM permissions, organisational sign-off, legal/compliance review, vendor coordination, and "verify access exists" steps as already satisfied unless the task brief explicitly states otherwise.
55
- - Do NOT raise such items as blockers, residual risks, or release recommendations.
56
- - This rule does NOT relax the forbidden-action list below; the safety rules below remain in force regardless of the user's authority.
57
49
  - Forbidden actions (any occurrence → terminal status `contract-violated`):
58
50
  - any of the following git push variants, regardless of intent or whether the user said "force it":
59
51
  - `git push --force`
@@ -94,4 +86,4 @@
94
86
  - re-litigating the final-verification verdict — release-handoff trusts the cited `accepted` verdict and does not reopen acceptance checks.
95
87
  - opening additional PRs, releases, or deployments beyond the single PR the user chose to create.
96
88
  - merging the PR. Merging is a separate, manual step performed by the user (or by repo automation) after release-handoff ends; the lead MUST NOT call `gh pr merge`.
97
- - treating "다음 단계 진행해" ("proceed to the next step") or equivalent user phrases as authorisation to escalate beyond the menu choices; every mutating action requires an explicit menu selection.
89
+ - escalating beyond the menu choices based on user phrasing alone; every mutating action requires an explicit menu selection (the shared anti-escalation rule applies, with this phase-specific tightening).
@@ -6,16 +6,7 @@
6
6
  - codex
7
7
  - gemini
8
8
  - report-writer
9
- - Team contract:
10
- - `Claude lead` is synthesis-only and stays distinct from `Claude worker`
11
- - required worker roles are `Claude worker`, `Codex worker`, `Gemini worker`, and `Report writer worker`
12
- - `Report writer worker` is the **author** of the final-report file; `Claude lead` reviews and approves the produced draft and does NOT write the file itself (see `okstra-team-contract` and `okstra-report-writer` for the authoritative contract)
13
- - default model assignments are resolved from centralized defaults; the fallback values are `Claude lead`/`Report writer worker`=`opus`, `Claude worker`=`sonnet`, `Codex worker`=`gpt-5.5`, `Gemini worker`=`auto`
14
- - `Gemini worker` must always be attempted for this workflow
15
- - the final verdict waits until each required worker has either a result or an explicit terminal status
16
- - unnamed generic parallel workers must not replace the required role roster
17
- - Tooling — read-only MCP availability:
18
- - the read-only MCP servers declared in the task brief's `## Available MCP Servers` section may be queried when local schema or sample data clarifies the work category or routing decision; that section is the canonical source of which servers and tools exist for this run, and any MCP-derived finding MUST cite server, table, and the SELECT used
9
+ {{INCLUDE:_common-contract.md}}
19
10
  - Primary focus areas:
20
11
  - classify the work as bugfix, feature, improvement, refactor, or ops-change
21
12
  - determine whether `error-analysis`, `implementation-planning`, or a direct implementation handoff is the next safe step
@@ -26,18 +17,9 @@
26
17
  - evidence-backed routing decision
27
18
  - uncertainty boundaries and missing inputs
28
19
  - next recommended phase and safe resume guidance
29
- - Clarification request policy:
20
+ - Clarification request policy (phase-specific addenda — shared policy is in `_common-contract.md`):
30
21
  - if any blocking input is missing at the time of writing the final report, populate `## 5. Clarification Requests for the Next Run` in `final-report-template.md`
31
- - section 5 must be split into two distinct sub-sections per the template — `5.1 추가 자료 요청 (Additional Materials Requested)` for files/screenshots/links the user must attach, and `5.2 사용자 확인 질문 (Questions for the User)` for routing or scoping decisions only the user can make. Never mix material requests and decision questions in the same row or list.
32
22
  - prefer concrete questions whose answers map directly to a routing decision (`bugfix` vs `feature`, `error-analysis` vs `implementation-planning`, etc.). State each option in plain language with one sentence describing what choosing it would mean for the next phase.
33
- - write every entry in full, descriptive sentences that a non-developer can act on without further context. Avoid abbreviations and internal jargon. For each material request, state *why* it is needed, *where* the user can find it, and *where* to place it. For each question, state *why* the answer changes the routing decision, *what* is being asked in a complete sentence, and *what shape of answer* is expected (yes/no, one of the listed options, a number or date, a short description, etc.); supply concrete option choices when applicable.
34
- - the same `final-report.md` file is the canonical artifact carried into the next run; the user appends answers inline before rerunning. The preferred turn-around is `scripts/okstra.sh --resume-clarification --task-key <project-id>:<task-group>:<task-id>` (opens the latest report in `$EDITOR`, then auto-reruns the same phase with `--clarification-response` carry-in). The lower-level form `--clarification-response <path>` remains available for scripted runs.
35
- - if a clarification response was carried in for this run, reconcile each prior `A*` (material) and `Q*` (question) row in section 0 and update its `Status` (`resolved`, `obsolete`) before issuing the routing decision
36
- - Authority & permissions assumption (HARD RULE — applies to every okstra task-type):
37
- - **Assume the user (and their team) holds full authority and every permission required for the anticipated work.** Treat external approvals, third-party access grants, role/IAM permissions, organisational sign-off, legal/compliance review, vendor coordination, and "verify access exists" steps as already satisfied unless the user explicitly states otherwise in the task brief.
38
- - Do NOT add such items to routing decisions, missing-materials lists, clarification questions, risk lists, dependencies, open questions, or day/effort estimates. They are not legitimate sources of schedule extension.
39
- - Internal okstra phase handoffs (e.g. the `User Approval Request` block in `implementation-planning`) are unaffected — those are the user themselves approving and proceed without external coordination.
40
23
  - Non-goals:
41
24
  - full implementation design unless it is required to decide the next phase
42
25
  - **source code edits, plan authoring, builds, or deployments** — this run only classifies the work and routes it; deeper analysis and planning belong to subsequent phases
43
- - treating "다음 단계 진행해" or equivalent user phrases as authorisation to start `error-analysis`, `implementation-planning`, or `implementation` — the next phase begins only in a separate okstra run launched with the new `--task-type`
@@ -340,6 +340,36 @@ def _canonical_argv(inp: PrepareInputs, ctx: dict) -> list[str]:
340
340
  return argv
341
341
 
342
342
 
+ _INCLUDE_DIRECTIVE = re.compile(r"\{\{INCLUDE:([^}]+?)\}\}")
+
+
+ def _expand_profile_includes(profile_path: Path, _depth: int = 0) -> str:
+     """Resolve `{{INCLUDE:<name>}}` directives in a profile file.
+
+     Includes are resolved relative to the profile's directory. A maximum
+     recursion depth of 4 prevents accidental cycles; the included file
+     contents replace the directive line in-place. Missing include targets
+     raise PrepareError so a bad reference fails fast instead of silently
+     leaving a `{{INCLUDE:...}}` token in the rendered profile.
+     """
+     if _depth > 4:
+         raise PrepareError(
+             f"profile include recursion exceeded depth 4 while resolving {profile_path}"
+         )
+     text = profile_path.read_text(encoding="utf-8")
+
+     def _sub(match: "re.Match[str]") -> str:
+         target_name = match.group(1).strip()
+         target = profile_path.parent / target_name
+         if not target.is_file():
+             raise PrepareError(
+                 f"profile include target missing: {target} (referenced from {profile_path})"
+             )
+         return _expand_profile_includes(target, _depth + 1).rstrip("\n")
+
+     return _INCLUDE_DIRECTIVE.sub(_sub, text)
+
+
343
373
  def prepare_task_bundle(inp: PrepareInputs) -> PrepareOutputs:
344
374
  """Produce a complete okstra task bundle on disk. See module docstring."""
345
375
  workspace_root = Path(inp.workspace_root)
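To see what the `{{INCLUDE:...}}` expansion added in this hunk does to a profile, the following standalone sketch re-creates the same logic against a temporary directory. It is an illustration only: `expand_includes` and `demo` are hypothetical names, the built-in `RuntimeError` and `FileNotFoundError` stand in for the package's `PrepareError`, and the file contents are made up for the example.

```python
import re
import tempfile
from pathlib import Path

INCLUDE = re.compile(r"\{\{INCLUDE:([^}]+?)\}\}")


def expand_includes(path: Path, depth: int = 0) -> str:
    """Standalone re-creation of the include expansion, for illustration."""
    if depth > 4:
        raise RuntimeError(f"include recursion exceeded depth 4 at {path}")

    def _sub(match: "re.Match[str]") -> str:
        # Targets are resolved relative to the including file's directory.
        target = path.parent / match.group(1).strip()
        if not target.is_file():
            raise FileNotFoundError(f"include target missing: {target}")
        # Drop trailing newlines so the directive's own line ending is reused.
        return expand_includes(target, depth + 1).rstrip("\n")

    return INCLUDE.sub(_sub, path.read_text(encoding="utf-8"))


def demo() -> str:
    """Render a two-file profile and return the expanded text."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "_common-contract.md").write_text(
            "- Team contract (shared):\n", encoding="utf-8")
        (root / "profile.md").write_text(
            "- Required workers:\n{{INCLUDE:_common-contract.md}}\n- Non-goals:\n",
            encoding="utf-8")
        return expand_includes(root / "profile.md")
```

Because the included text is stripped of trailing newlines before substitution, the directive line is replaced without introducing an extra blank line, which keeps the surrounding bullet list contiguous in the rendered profile.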
@@ -510,7 +540,7 @@ def prepare_task_bundle(inp: PrepareInputs) -> PrepareOutputs:
510
540
  claude_session_id = "" if inp.render_only else generate_claude_session_id()
511
541
 
512
542
  # ---- material + related-tasks ----
513
- profile_content = profile_file.read_text(encoding="utf-8")
543
+ profile_content = _expand_profile_includes(profile_file)
514
544
  review_material = build_analysis_material(inp.brief_path, inp.directive)
515
545
  related_items = resolve_related_tasks(
516
546
  task_manifest_path=Path(ctx["TASK_MANIFEST_FILE"]),
@@ -9,7 +9,12 @@ from __future__ import annotations
9
9
  from pathlib import Path
10
10
 
11
11
  ALLOWED_WORKERS = ["claude", "codex", "gemini", "report-writer"]
- PROFILE_BULLET_HEADERS = {"- Workers:", "- Reviewers:", "- Analysers:"}
+ PROFILE_BULLET_HEADERS = {
+     "- Workers:",
+     "- Required workers:",
+     "- Reviewers:",
+     "- Analysers:",
+ }
13
18
 
14
19
 
15
20
  class WorkersError(Exception):
@@ -27,6 +27,25 @@ from typing import Optional
27
27
  OKSTRA_WORKTREES_RELATIVE = Path(".okstra/worktrees")
28
28
 
29
29
 
+ # Project-root directories that hold okstra task state, ignored by git, or
+ # otherwise required for the executor to operate but NOT carried across by
+ # `git worktree add`. Each is symlinked from project_root into the new
+ # worktree at provision time. Symlinks (not copies) so the executor sees
+ # the live state and disk/CPU cost stays near zero; the trade-off is that
+ # any write through the link reaches the original project_root, which is
+ # acceptable because the executor only writes inside its own task-scoped
+ # subdirectory (e.g. `.project-docs/okstra/tasks/<task-id>/runs/...`).
+ #
+ # Override via the `OKSTRA_WORKTREE_SYNC_DIRS` env var: a colon-separated
+ # list of project-root-relative paths that REPLACES this default. Use an
+ # empty string to disable the feature entirely.
+ DEFAULT_WORKTREE_SYNC_DIRS: tuple[str, ...] = (
+     ".project-docs",
+     ".scratch",
+     "graphify-out",
+ )
+
+
49
  # Work-category → short branch prefix. Mirrors the values accepted by
31
50
  # `--work-category` (bugfix / feature / refactor / ops / improvement) and
32
51
  # falls back to `task` when the category is unset or unrecognised.
@@ -105,6 +124,47 @@ def _head_sha(project_root: Path) -> str:
105
124
  return res.stdout.strip()
106
125
 
107
126
 
+ def _resolve_sync_dirs() -> tuple[str, ...]:
+     """Return the list of project-root-relative dirs to symlink into the
+     new worktree. Reads `OKSTRA_WORKTREE_SYNC_DIRS` if set (colon-separated,
+     empty string disables); otherwise returns the built-in default.
+     """
+     raw = os.environ.get("OKSTRA_WORKTREE_SYNC_DIRS")
+     if raw is None:
+         return DEFAULT_WORKTREE_SYNC_DIRS
+     raw = raw.strip()
+     if not raw:
+         return ()
+     return tuple(part for part in (p.strip() for p in raw.split(":")) if part)
+
+
+ def _link_sync_dirs(project_root: Path, worktree_path: Path) -> list[str]:
+     """Symlink each configured project-root dir into the new worktree.
+
+     Skip rules:
+     - Source missing in project_root → silently skipped.
+     - Target path already exists in worktree (e.g. tracked content
+       checked out by `git worktree add`) → skipped to avoid clobbering
+       version-controlled files.
+     - Parent directories are created as needed for nested entries.
+
+     Returns a list of human-readable notes (one per linked entry) so the
+     caller can include them in the provisioning note.
+     """
+     notes: list[str] = []
+     for rel in _resolve_sync_dirs():
+         src = (project_root / rel).resolve()
+         if not src.exists():
+             continue
+         dst = worktree_path / rel
+         if dst.exists() or dst.is_symlink():
+             continue
+         dst.parent.mkdir(parents=True, exist_ok=True)
+         os.symlink(src, dst)
+         notes.append(rel)
+     return notes
+
+
108
168
  def compute_worktree_path(
109
169
  *,
110
170
  project_id: str,
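The three behaviours of the `OKSTRA_WORKTREE_SYNC_DIRS` override (unset, empty, custom list) are easy to confuse, so here is a small sketch that mirrors the resolution logic added above. It is a re-creation for illustration: `resolve_sync_dirs` takes the environment as an explicit mapping, which is an assumption made here for testability, whereas the real helper reads `os.environ` directly.

```python
DEFAULT_SYNC_DIRS = (".project-docs", ".scratch", "graphify-out")


def resolve_sync_dirs(env: dict[str, str]) -> tuple[str, ...]:
    """Mirror of the override logic, with the environment passed in explicitly."""
    raw = env.get("OKSTRA_WORKTREE_SYNC_DIRS")
    if raw is None:
        # Variable unset: fall back to the built-in default list.
        return DEFAULT_SYNC_DIRS
    # Variable set: it REPLACES the default; blank entries are dropped,
    # so an empty string yields an empty tuple and disables linking.
    return tuple(part.strip() for part in raw.split(":") if part.strip())
```

Note that the override replaces the default list instead of extending it, so a caller who wants the defaults plus one extra directory has to spell out all four entries.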
@@ -223,6 +283,9 @@ def provision_implementation_worktree(
223
283
  f"{(res.stderr or res.stdout).strip()}"
224
284
  )
225
285
 
286
+ linked = _link_sync_dirs(project_root, worktree_path)
287
+ linked_suffix = f"; linked {', '.join(linked)}" if linked else ""
288
+
226
289
  return WorktreeProvision(
227
290
  status="created",
228
291
  path=str(worktree_path),
@@ -230,6 +293,6 @@ def provision_implementation_worktree(
230
293
  base_ref=base_ref,
231
294
  note=(
232
295
  f"executor worktree created at {worktree_path} on branch {branch} "
233
- f"(base {base_ref[:12]})"
296
+ f"(base {base_ref[:12]}){linked_suffix}"
234
297
  ),
235
298
  )
@@ -68,6 +68,31 @@ Lead-authored fallback is permitted only if all of the following are true and re
68
68
 
69
69
  Speculative reasons such as "session resume constraint", "team object no longer exists", or "lead can do it faster" are NOT valid.
70
70
 
71
+ ## Phase 7 follow-up task spawner (BLOCKING when Section 7 is non-empty)
72
+
73
+ After the token-usage collector finishes (the next subsection), Phase 7 must run the follow-up task spawner against the final-report file. This step is what turns the report's `## 7. Follow-up Tasks (후속 작업)` table into actual `tasks/<task-group>/<new-task-id>/` stubs that show up in `okstra-status`.
74
+
75
+ ```bash
76
+ python3 scripts/okstra-spawn-followups.py \
77
+ <runDirectoryPath>/reports/final-report-<task-type>-<seq>.md \
78
+ --project-root <project_root> \
79
+ --task-group <task-group> \
80
+ --parent-task-key <task-key>
81
+ ```
82
+
83
+ Behaviour contract:
84
+
85
+ - The script is **idempotent**: rows whose target directory already exists are reported as `existing` and skipped without modification. Re-running the spawner across reruns of the same parent task is safe.
86
+ - Rows with `Auto-spawn? != yes` are reported as `skipped` and never written to disk. Surface them in the final-report's Section 6 if the user should still take manual action.
87
+ - A row with an invalid `Origin`, `Suggested task-type`, missing `Title`, or missing `Reason / Why deferred` causes the script to exit `1`. The report-writer worker MUST refuse to ship a final-report whose Section 7 contains such rows. Either fix the row or change `Auto-spawn?` to `no` and document why in Section 6.
88
+ - For `task-type` ∈ {`implementation`, `final-verification`, `release-handoff`}, Section 7 must be present in the final report. An empty section is acceptable and is expressed as the single line `- 후속 작업 없음.` under the heading. The spawner treats a missing or empty section as a no-op (exit `0`).
89
+
90
+ After the spawner completes, the report-writer worker MUST update Section 6 ("Recommended Next Steps") to list every newly created task-key together with its entry command, so the user can pick up the follow-up immediately:
91
+
92
+ ```
93
+ - Follow-up: `<task-group>/<new-task-id>` — `okstra --task-key <task-group>/<new-task-id> --task-type <suggested>`
94
+ ```
95
+
71
96
  ## Phase 7 token-usage collector (BLOCKING)
72
97
 
73
98
  At the start of Phase 7, run the token-usage collector with the final-report substitution flag. This step is BLOCKING — both the team-state aggregation AND the final-report placeholder substitution happen here, in one invocation:
@@ -219,7 +244,7 @@ Section numbering matches `okstra-final-report.template.md`. Section 0 is the ca
219
244
 
220
245
  ### Writing Guidelines
221
246
 
222
- - Write in Markdown (actively use tables, bullet points, and code blocks)
247
+ - Write in Markdown. **Prefer tables over prose bullet lists** for any section that enumerates multiple items with the same shape (evidence rows, risks, options, dependencies, rollback steps, follow-ups, open questions). Bullets are reserved for short, single-line standalone statements (e.g., "- 추가 정보 요청 없음."). When the template provides a table form, do NOT degrade it back to bullets in the rendered report.
223
248
  - Write the final report body in Korean.
224
249
  - Keep technical identifiers such as file paths, code symbols, model names, and status values in their original form when needed.
225
250
  - If only one worker is usable, perform a reduced-confidence synthesis
@@ -250,6 +275,7 @@ Persistence steps that must be performed in Phase 7:
250
275
  - [ ] 5. **Update task-index.md**: Refresh human-readable summary
251
276
  - [ ] 6. **Generate final status file**: `runs/<task-type>/status/final-<task-type>-<seq>.status` (if necessary)
252
277
  - [ ] 7. **Save convergence state**: `runs/<task-type>/state/convergence-<task-type>-<seq>.json` (when convergence is enabled)
278
+ - [ ] 8. **Spawn follow-up task stubs**: run `scripts/okstra-spawn-followups.py` against the final-report when the run's `task-type` ∈ {`implementation`, `final-verification`, `release-handoff`}, OR when Section 7 is non-empty for any other task-type. See "Phase 7 follow-up task spawner" above for the exact command and contract. The script is idempotent across reruns.
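The trigger condition in step 8 can be sketched as a predicate. This is a sketch under assumptions: the function names are hypothetical, and it assumes the Section 7 body has already been extracted from the rendered report as a string containing either the table or the single placeholder line.

```python
# Task-types for which Section 7 is mandatory and the spawner always runs.
REQUIRED_TASK_TYPES = frozenset({"implementation", "final-verification", "release-handoff"})

# Verbatim empty-state placeholder from the final-report template.
EMPTY_PLACEHOLDER = "- 후속 작업 없음."


def section7_is_non_empty(section_body: str) -> bool:
    """True when the section holds anything other than blank lines or the placeholder."""
    lines = [line.strip() for line in section_body.splitlines() if line.strip()]
    return bool(lines) and lines != [EMPTY_PLACEHOLDER]


def should_run_spawner(task_type: str, section_body: str) -> bool:
    """Step 8 condition: mandatory task-types, or a non-empty Section 7 otherwise."""
    return task_type in REQUIRED_TASK_TYPES or section7_is_non_empty(section_body)
```

Because the spawner treats a missing or empty section as a no-op, running it unconditionally for the three mandatory task-types is harmless.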
253
279
 
254
280
  ### Response after Persistence
255
281
 
@@ -25,8 +25,12 @@
25
25
  - Cite the entry id (`A*` or `Q*`) and the new evidence (file path, line number, log excerpt, worker finding, etc.) that resolves or refutes the prior answer or material.
26
26
 
27
27
  ## Summary of the Problem or Verification Target
28
- - Summarize the core problem, requirement, or verification target in 3 to 5 bullets.
29
- - Base the summary on the brief, source materials, and worker results.
28
+
29
+ Summarize the core problem, requirement, or verification target as a table of 3 to 5 rows. Base it on the brief, source materials, and worker results.
30
+
31
+ | ID | Ticket ID | 한 줄 요약 | 출처 (brief/source/worker) |
32
+ |----|-----------|------------|----------------------------|
33
+ | P-001 | `<TICKET-or-fallback>` | <핵심 항목 한 줄> | <출처> |
30
34
 
31
35
  ## Ticket Coverage
32
36
 
@@ -46,8 +50,9 @@
46
50
  - `관련 항목 IDs` lists row IDs such as `F-001`, `M-002`, `A1`, `Q1`, separated by commas. When a ticket is marked only at the section-header level (no individual row ID), write the header number itself (e.g., `4.5.3`).
47
51
 
48
52
  ## Execution Status by Agent
49
- - Summarize the status, assigned model, and key finding for each worker.
50
- - Do not replace worker outputs with unsupported claims.
53
+
54
+ Collect each worker's status, assigned model, and key finding in a single table. Do not replace worker outputs with unsupported claims.
55
+
51
56
  {{EXECUTION_STATUS_TABLE_ROWS}}
52
57
 
53
58
  ## Token Usage Summary
@@ -79,8 +84,15 @@
79
84
  - When there is no meaningful difference, write the following single line instead of the table: `- 유의미한 차이 없음. 1.1 Consensus가 그대로 유효합니다.`
80
85
 
81
86
  ## 2. Final Verdict
82
- - State the final conclusion clearly.
83
- - Recommend the most appropriate direction: continue investigation, begin implementation, approve, reject, or hold.
87
+
88
+ State the final conclusion and the recommended direction in a table. `Direction` is one of: `continue-investigation`, `begin-implementation`, `approve`, `reject`, `hold`.
89
+
90
+ | 항목 | 값 |
91
+ |------|----|
92
+ | Final Conclusion | <한 줄 결론> |
93
+ | Direction | `<continue-investigation / begin-implementation / approve / reject / hold>` |
94
+ | 근거 요약 | <`1.1`, `3.1` 등 본 보고서 행 ID를 콤마로> |
95
+ | 다음 단계 | <Section 6 또는 7 중 어디로 이어지는지> |
84
96
 
85
97
  ## 3. Evidence and Detailed Analysis
86
98
  ### 3.1 Primary Evidence
@@ -127,7 +139,15 @@
127
139
  | <Option A> | <…> | <…> | <…> | <…> | <…> |
128
140
 
129
141
  ### 4.5.3 Recommended Option (권장 옵션)
130
- - State which option is recommended and why. Tie the rationale to the `4.5.2` comparison results plus the design principles (isolation, files-that-change-together, follow-established-patterns, YAGNI).
142
+
143
+ Present the recommended option and its rationale in a table. For `근거`, cite the `4.5.2` comparison results plus whichever design principles applied (isolation, files-that-change-together, follow-established-patterns, YAGNI).
144
+
145
+ | 항목 | 값 |
146
+ |------|----|
147
+ | Recommended Option | <옵션 이름> |
148
+ | 핵심 이유 | <한 줄> |
149
+ | 근거 (Trade-off 행 / 원칙) | <4.5.2 행 + 적용 디자인 원칙> |
150
+ | 채택되지 않은 옵션 요약 | <짧게: 어떤 비용 때문에 탈락> |
131
151
 
132
152
  ### 4.5.4 Stepwise Execution Order (단계별 실행 순서)
133
153
 
@@ -143,7 +163,12 @@
143
163
  - If a single step advances several tickets at once, list them together in `Ticket ID`, separated by commas.
144
164
 
145
165
  ### 4.5.5 Dependency / Migration Risk (의존성·마이그레이션 위험)
146
- - List all ordering constraints, data backfills, feature-flag preconditions, cross-team coordination, and similar items.
166
+
167
+ List ordering constraints, data backfills, feature-flag preconditions, cross-team coordination, and similar items in a table. If none apply, write the single line `- 의존성·마이그레이션 위험 없음.`
168
+
169
+ | ID | Kind (order / backfill / flag-precondition / coordination / other) | Item | 영향 | 완화 / 선행 작업 |
170
+ |----|--------------------------------------------------------------------|------|------|------------------|
171
+ | DM-001 | <kind> | <한 줄 요약> | <영향 범위> | <대응 방안> |
147
172
 
148
173
  ### 4.5.6 Validation Checklist (검증 체크리스트)
149
174
 
@@ -156,14 +181,24 @@
156
181
  No abstract descriptions. Every row must include a command or an observable result.
157
182
 
158
183
  ### 4.5.7 Rollback Strategy (롤백 전략)
159
- - Specify the exact revert path (commits, flags, migrations, etc.) and the signals that trigger a rollback (error rate, latency, user reports, etc.).
184
+
185
+ List the revert paths and the rollback trigger signals in a table. No abstract descriptions: each entry must be a command or an observable signal.
186
+
187
+ | ID | Step | Action (revert command / flag toggle / migration down) | Trigger signal (error rate / latency / user report 등) | 확인 방법 |
188
+ |----|------|---------------------------------------------------------|--------------------------------------------------------|-----------|
189
+ | RB-001 | 1 | `<예: git revert <SHA>>` | <예: 에러율 > 1% 5분 지속> | `<관찰 명령 / 대시보드>` |
160
190
 
161
191
  ### 4.5.8 User Approval Request (사용자 승인 요청 — 본 보고서 상단 참조)
162
192
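  - The actual approval gate lives in the **`User Approval Request (사용자 승인 게이트)` block at the top** of this report. This subsection is kept for the English keyword the validator requires (`User Approval Request`) and for structural consistency of the body.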
163
193
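  - Write only *plan-side supplementary notes* that affect the approval decision here (e.g., preparatory work that reduces risk, or items the user should confirm before approving). The approval marker is granted only through the checkbox in the top block, not in this section.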
164
194
 
165
195
  ### 4.5.9 Open Questions
166
- - Record every ambiguity found during pre-planning as an item, to serve as the list of questions the user must resolve before approval.
196
+
197
+ Record the ambiguities found during pre-planning in a table. This is the list of questions the user must resolve before approval. If there are none, write the single line `- Open question 없음.`
198
+
199
+ | ID | Ticket ID | 질문 | Blocking (예/아니오) | 기대 답 형태 | Status |
200
+ |----|-----------|------|----------------------|--------------|--------|
201
+ | OQ-001 | `<TICKET-or-fallback>` | <질문 본문> | 예 | <예/아니오 / 선택 / 자유서술> | open |
167
202
 
168
203
  ## 4.6 Release Handoff Deliverables (release-handoff runs only)
169
204
 
@@ -280,6 +315,42 @@ Empty-state placeholder, copy verbatim when nothing else applies:
280
315
  - No further action required. Final verdict in section 2 stands.
281
316
  ```
282
317
 
318
+ ## 7. Follow-up Tasks (후속 작업)
319
+
320
+ This section tabulates work that goes **beyond** the implementation and verification scope of this run, so that it can continue as follow-up tasks.
321
+
322
+ - `task-type` = `implementation` / `final-verification` / `release-handoff` runs: **required**. If there is nothing to record, write the single line `- 후속 작업 없음.` instead of the table.
323
+ - Runs of any other task-type: optional. Fill it in only when the lead judges it necessary.
324
+
325
+ The origin of each follow-up item must be one of the following:
326
+
327
+ - `out-of-plan`: an item discovered during implementation that could not be handled because it falls outside this plan's file list / step scope (left unhandled rather than recorded in the Out-of-plan edits block).
328
+ - `verifier-concern`: an item for which the verifier gave PASS but left a recommendation for follow-up improvement.
329
+ - `scope-boundary`: an item intentionally excluded in the `4.5.5 Dependency / Migration Risk` of `implementation-planning` or in the task-brief `Out of Scope`, but which needs a separate ticket in light of this run's results.
330
+ - `open-question`: follow-up work split off from `4.5.9 Open Questions` / `Section 5 Clarification Requests`.
331
+ - `manual`: a follow-up additionally identified by the lead.
332
+
333
+ Rules:
334
+
335
+ - When a row's `Auto-spawn?` value is `yes`, `scripts/okstra-spawn-followups.py` in Phase 7 automatically creates the follow-up task directory (`tasks/<TASK_GROUP>/<New Task ID>/`), a `task-manifest.json` (status: `todo`), and a stub task-brief. When it is `no`, a person decides separately.
336
+ - `New Task ID` is a short slug using only alphanumerics and hyphens (e.g., `8852-followup-logs`). It must be unique within the same task-group.
337
+ - `Suggested task-type` is one of `requirements-discovery` / `error-analysis` / `implementation-planning` / `implementation` / `final-verification` / `release-handoff`.
338
+ - `Scope` lists the affected files or areas, comma-separated or on one line.
339
+ - `Reason / Why deferred` states in one or two sentences why the item could not be handled in this run. Vague reasons such as "there was no time" are rejected.
340
+ - If the same follow-up appears across several runs, keep the `New Task ID` identical to prevent duplicate spawns (the script is idempotent based on whether the directory already exists).
341
+
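The slug and idempotency rules above can be sketched as follows. This is a minimal sketch, not the real spawner: the function name and the `title` manifest field are hypothetical, and only the directory layout, the `task-manifest.json` file name, and `status: todo` come from the rules themselves. Stub task-brief creation is omitted.

```python
import json
import re
from pathlib import Path

# "Alphanumerics and hyphens only", per the New Task ID rule.
SLUG_RE = re.compile(r"[A-Za-z0-9-]+")


def spawn_stub(tasks_root: Path, task_group: str, new_task_id: str, title: str) -> bool:
    """Create tasks/<TASK_GROUP>/<New Task ID>/ once; return False if it already exists."""
    if not SLUG_RE.fullmatch(new_task_id):
        raise ValueError(f"invalid New Task ID slug: {new_task_id!r}")
    task_dir = tasks_root / task_group / new_task_id
    if task_dir.exists():
        # Idempotent: an existing directory means the follow-up was already spawned.
        return False
    task_dir.mkdir(parents=True)
    # Only "status" is documented; "title" is an illustrative extra field.
    manifest = {"status": "todo", "title": title}
    (task_dir / "task-manifest.json").write_text(
        json.dumps(manifest, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return True
```

Keying idempotency on directory existence is what makes reusing the same `New Task ID` across runs safe: a second call is a no-op instead of a duplicate.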
342
+ | ID | Origin | New Task ID | Title | Suggested task-type | Scope (files/areas) | Reason / Why deferred | Priority (P0/P1/P2) | Auto-spawn? |
343
+ |----|--------|-------------|-------|---------------------|---------------------|------------------------|---------------------|-------------|
344
+ | FU-001 | `<out-of-plan / verifier-concern / scope-boundary / open-question / manual>` | `<new-task-id-slug>` | <한 줄 제목> | `<task-type>` | `<files / areas>` | <한 두 문장 사유> | `P1` | `yes` |
345
+
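For illustration, a table of this shape can be read into dictionaries keyed by column header. This is a simplified sketch with a hypothetical function name; it assumes pipe-delimited rows with no escaped `|` inside cells and strips the backticks that wrap cell values in the template.

```python
from typing import Dict, List


def parse_markdown_table(text: str) -> List[Dict[str, str]]:
    """Parse the first pipe table found in `text` into one dict per data row."""
    header = None
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue
        # Drop the outer pipes, split on inner pipes, and unwrap backticked values.
        cells = [cell.strip().strip("`") for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
            continue
        if all(set(cell) <= set("-: ") for cell in cells):
            continue  # separator row such as |----|----|
        rows.append(dict(zip(header, cells)))
    return rows
```

Each resulting dict can then be checked against the column rules, for example `row["Auto-spawn?"] == "yes"`.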
346
+ Empty-state example (nothing applicable):
347
+
348
+ ```
349
+ - 후속 작업 없음.
350
+ ```
351
+
352
+ When this section is filled in, list the auto-generated task-keys together with their entry commands under the "Follow-up tasks or related tasks" item of Section 6 so the user can continue immediately.
353
+
283
354
  ## Writing Rules
284
355
  - Write the final report in Markdown.
285
356
  - Write the final report body in Korean.