okstra 0.6.1 → 0.7.0
- package/README.kr.md +1 -1
- package/README.md +1 -1
- package/docs/kr/architecture.md +4 -3
- package/docs/kr/cli.md +26 -3
- package/package.json +1 -1
- package/runtime/BUILD.json +2 -2
- package/runtime/agents/SKILL.md +20 -4
- package/runtime/agents/TODO.md +15 -2
- package/runtime/agents/workers/claude-worker.md +2 -2
- package/runtime/agents/workers/report-writer-worker.md +2 -2
- package/runtime/bin/okstra.sh +2 -0
- package/runtime/prompts/launch.template.md +2 -2
- package/runtime/prompts/profiles/error-analysis.md +2 -2
- package/runtime/prompts/profiles/final-verification.md +20 -1
- package/runtime/prompts/profiles/implementation-planning.md +1 -1
- package/runtime/prompts/profiles/implementation.md +12 -6
- package/runtime/prompts/profiles/requirements-discovery.md +1 -1
- package/runtime/python/lib/okstra/cli.sh +6 -1
- package/runtime/python/lib/okstra/globals.sh +1 -0
- package/runtime/python/lib/okstra/usage.sh +8 -2
- package/runtime/python/okstra_ctl/render.py +65 -0
- package/runtime/python/okstra_ctl/run.py +36 -1
- package/runtime/skills/okstra-history/SKILL.md +1 -0
- package/runtime/skills/okstra-run/SKILL.md +3 -1
- package/runtime/skills/okstra-setup/SKILL.md +1 -1
- package/runtime/skills/okstra-team-contract/SKILL.md +1 -0
- package/runtime/templates/reports/settings.template.json +1 -13
- package/runtime/templates/reports/task-brief.template.md +3 -14
- package/runtime/validators/validate-run.py +145 -0
- package/src/setup.mjs +1 -1
package/README.kr.md
CHANGED
@@ -107,7 +107,7 @@ CLI 에서:
 
 ```bash
 cd <대상 프로젝트>
-npx -y okstra@latest setup --project-id <id> # 예: INV-1234,
+npx -y okstra@latest setup --project-id <id> # 예: INV-1234, my-app, okstra
 ```
 
 또는 Claude Code 세션 안에서 동일한 슬래시 커맨드:
package/README.md
CHANGED
@@ -106,7 +106,7 @@ From the CLI:
 
 ```bash
 cd <your project>
-npx -y okstra@latest setup --project-id <id> # e.g. INV-1234,
+npx -y okstra@latest setup --project-id <id> # e.g. INV-1234, my-app, okstra
 ```
 
 Or, inside a Claude Code session, invoke the equivalent slash command:
package/docs/kr/architecture.md
CHANGED
@@ -283,8 +283,8 @@ Claude launch prompt 본문은 항상 `prompts/launch.template.md` 템플릿에
 
 ```json
 {
-"projectId": "
-"projectRoot": "/Volumes/Workspaces/workspace/projects/
+"projectId": "sample-project-v2-api",
+"projectRoot": "/Volumes/Workspaces/workspace/projects/sample-project",
 "createdAt": "2026-05-10T00:00:00Z",
 "updatedAt": "2026-05-10T00:00:00Z"
 }
|
@@ -844,6 +844,7 @@ Claude가 작성하는 최종 보고서는 brief에 더 구체적인 형식이
|
|
|
844
844
|
- worker 생성과 결과 취합은 Claude가 수행합니다.
|
|
845
845
|
- standard workflow는 `Claude lead` + required worker `Claude worker`, `Codex worker`, `Gemini worker`, `Report writer worker`를 사용합니다.
|
|
846
846
|
- worker 모델은 `--lead-model`, `--claude-model`, `--codex-model`, `--gemini-model`, `--report-writer-model`로 override할 수 있고, 기본값은 `OKSTRA_DEFAULT_*` 환경 변수에서 중앙 관리합니다. fallback 기본값은 `Claude lead`/`Report writer worker`=`opus`, `Claude worker`=`sonnet`, `Codex worker`=`gpt-5.5`, `Gemini worker`=`auto`입니다.
|
|
847
|
+
- `--task-type implementation` 에서는 Executor 역할을 맡을 provider 를 `--executor <claude|codex|gemini>` (또는 `OKSTRA_DEFAULT_EXECUTOR`, fallback `claude`) 로 선택합니다. Executor 만 프로젝트 파일을 mutate 할 수 있고, 나머지 두 provider 와 자기 자신의 provider 가 모두 별도 CLI 세션으로 verifier 로 dispatch 됩니다 (세션 분리만으로도 self-review 안전장치 유지). Executor 의 모델은 선택된 provider 의 worker 모델 플래그(`--claude-model` / `--codex-model` / `--gemini-model`) 를 그대로 재사용하며, run-manifest 의 `teamContract.executor` 블록에 provider / displayName / workerAgent / model 이 기록됩니다.
|
|
847
848
|
- project-level current-task convenience pointer는 `.project-docs/okstra/discovery/latest-task.json`입니다.
|
|
848
849
|
- project-level canonical task inventory는 `.project-docs/okstra/discovery/task-catalog.json`입니다.
|
|
849
850
|
- project-local okstra Claude asset은 `.claude/skills/`와 `.claude/agents/` 아래에 seed되며, 기본 rerun에서는 보존되고 `--refresh-assets`로 다시 생성할 수 있습니다.
|
|
@@ -876,7 +877,7 @@ Claude가 작성하는 최종 보고서는 brief에 더 구체적인 형식이
|
|
|
876
877
|
- 종료 처리는 `okstra-ctl` 의 모든 진입점에서 호출되는 lazy reconcile 이 수행한다(타깃 프로젝트의 `final-report-*.md` 존재로 추론).
|
|
877
878
|
- 다중 rerun 은 대상 1건당 tmux 세션 1개를 detached 로 spawn 하고 즉시 반환한다(fire-and-forget). 사용자는 반환된 attach 명령으로 임의 세션에 접속한다.
|
|
878
879
|
- spawn 임계 기본값은 10. `--max-spawn N` 또는 `OKSTRA_CTL_MAX_SPAWN` 으로 변경 가능.
|
|
879
|
-
- runId 형식: `<project-id>/<task-group>/<task-id>/<task-type>/r<run-seq>` (예: `
|
|
880
|
+
- runId 형식: `<project-id>/<task-group>/<task-id>/<task-type>/r<run-seq>` (예: `sample-project/payment/fail/error-analysis/r07`). 입력 시 prefix substring 매칭을 지원한다.
|
|
880
881
|
|
|
881
882
|
### 동시성 제어 (두 단계 mutex)
|
|
882
883
|
|
package/docs/kr/cli.md
CHANGED
|
@@ -9,7 +9,7 @@
|
|
|
9
9
|
기본 명령(첫 진입 / full args):
|
|
10
10
|
|
|
11
11
|
```bash
|
|
12
|
-
scripts/okstra.sh [--render-only] [--yes] [--refresh-assets] --task-type <task-type> [--workers worker1,worker2] [--lead-model <model>] [--claude-model <model>] [--codex-model <model>] [--gemini-model <model>] [--report-writer-model <model>] [--related-tasks taskA,taskB] [--clarification-response <previous-final-report>] --project-id <project-id> --task-group <task-group> --task-id <task-id> --task-brief <brief-path> [--directive <directive>]
|
|
12
|
+
scripts/okstra.sh [--render-only] [--yes] [--refresh-assets] --task-type <task-type> [--workers worker1,worker2] [--lead-model <model>] [--claude-model <model>] [--codex-model <model>] [--gemini-model <model>] [--report-writer-model <model>] [--executor claude|codex|gemini] [--related-tasks taskA,taskB] [--clarification-response <previous-final-report>] --project-id <project-id> --task-group <task-group> --task-id <task-id> --task-brief <brief-path> [--directive <directive>]
|
|
13
13
|
```
|
|
14
14
|
|
|
15
15
|
후속 phase 단축 형식(기존 task-manifest.json이 존재할 때):
|
|
@@ -41,7 +41,7 @@ interactive terminal에서 실행하면 다음 규칙이 추가로 적용됩니
|
|
|
41
41
|
|
|
42
42
|
예:
|
|
43
43
|
|
|
44
|
-
- `
|
|
44
|
+
- `sample-project-v2-api`
|
|
45
45
|
- `jobs`
|
|
46
46
|
|
|
47
47
|
### `--task-group`
|
|
@@ -268,6 +268,7 @@ scripts/okstra.sh --task-type implementation-planning --workers claude,codex --p
|
|
|
268
268
|
- `OKSTRA_DEFAULT_CODEX_MODEL`
|
|
269
269
|
- `OKSTRA_DEFAULT_GEMINI_MODEL`
|
|
270
270
|
- `OKSTRA_DEFAULT_REPORT_WRITER_MODEL`
|
|
271
|
+
- `OKSTRA_DEFAULT_EXECUTOR` (`claude` | `codex` | `gemini`, fallback `claude`)
|
|
271
272
|
|
|
272
273
|
fallback 기본값은 아래와 같습니다.
|
|
273
274
|
|
|
@@ -276,6 +277,28 @@ fallback 기본값은 아래와 같습니다.
|
|
|
276
277
|
- `Claude worker`: `sonnet`
|
|
277
278
|
- `Codex worker`: `gpt-5.5`
|
|
278
279
|
- `Gemini worker`: `auto`
|
|
280
|
+
- Implementation executor: `claude` (즉 기본은 `Claude executor`)
|
|
281
|
+
|
|
282
|
+
### `--executor`
|
|
283
|
+
|
|
284
|
+
`--task-type implementation` 에서 Executor 역할을 맡을 provider 를 선택합니다. 값은 `claude` | `codex` | `gemini` 중 하나이며, 다른 task-type 에서는 무시됩니다.
|
|
285
|
+
|
|
286
|
+
- 기본값: `OKSTRA_DEFAULT_EXECUTOR` → fallback `claude`
|
|
287
|
+
- Executor 는 이 run 에서 **유일하게 프로젝트 파일을 mutate 할 수 있는 worker** 입니다. 나머지 두 provider 는 같은 run 에서 strict read-only verifier 로 dispatch 됩니다.
|
|
288
|
+
- Executor 의 모델은 provider 별 worker 모델 플래그를 그대로 재사용합니다. 즉 `--executor codex` 이면 Executor 의 모델은 `--codex-model` (기본 `gpt-5.5`), `--executor gemini` 이면 `--gemini-model` (기본 `auto`) 가 됩니다.
|
|
289
|
+
- Claude/Codex/Gemini 세 verifier 는 executor provider 와 관계없이 항상 dispatch 됩니다. Executor 와 같은 provider 라도 별도 CLI 세션으로 verifier 가 호출되어 context 가 분리되므로 self-review 안전장치는 유지됩니다.
|
|
290
|
+
- 실제 파일 변경은 Codex/Gemini 의 경우 각 CLI 의 auto-edit 모드 (예: `codex exec --full-auto`) 를 통해 일어나며, Claude-side Edit/Write tool 을 거치지 않습니다.
|
|
291
|
+
|
|
292
|
+
예:
|
|
293
|
+
|
|
294
|
+
```bash
|
|
295
|
+
scripts/okstra.sh --task-type implementation \
|
|
296
|
+
--executor codex \
|
|
297
|
+
--codex-model gpt-5.5 \
|
|
298
|
+
--approved-plan .project-docs/.../runs/implementation-planning/.../reports/final-report-implementation-planning-001.md \
|
|
299
|
+
--project-id jobs --task-group tasks --task-id 8852 \
|
|
300
|
+
--task-brief .project-docs/tasks/8852/BUG_REPORT.md
|
|
301
|
+
```
|
|
279
302
|
|
|
280
303
|
### `--related-tasks`
|
|
281
304
|
|
|
@@ -387,7 +410,7 @@ chmod +x ~/.local/bin/okstra-ctl
|
|
|
387
410
|
|---|---|
|
|
388
411
|
| 작업한 프로젝트 목록 | `okstra-ctl projects` |
|
|
389
412
|
| 최근 run 검색 | `okstra-ctl list --since 7d` |
|
|
390
|
-
| 특정 프로젝트만 | `okstra-ctl list --project
|
|
413
|
+
| 특정 프로젝트만 | `okstra-ctl list --project sample-project` |
|
|
391
414
|
| 진행 중 run 보기 | `okstra-ctl tail active` |
|
|
392
415
|
| 단일 run 결과 메타 | `okstra-ctl show <runId-or-prefix>` |
|
|
393
416
|
| 결과 보고서 경로 | `okstra-ctl open <runId-or-prefix>` |
|
package/package.json
CHANGED
package/runtime/BUILD.json
CHANGED
package/runtime/agents/SKILL.md
CHANGED
|
@@ -96,6 +96,17 @@ Unless the task bundle overrides:
|
|
|
96
96
|
|
|
97
97
|
If the prepared task bundle contains explicit model assignments, those assignments are canonical for the run. All three analysis workers use dedicated agent definitions; Codex/Gemini wrappers handle external CLI invocation internally; Claude worker runs as an in-process subagent with explicitly registered MCP tools so it does not fall back to `claude --mcp-cli` Bash invocations.
|
|
98
98
|
|
|
99
|
+
### Implementation phase: Executor binding
|
|
100
|
+
|
|
101
|
+
For `--task-type implementation` runs, the task bundle additionally pins one of `claude` / `codex` / `gemini` as the Executor — the only worker permitted to mutate project files in that run. The binding is exposed in two canonical places:
|
|
102
|
+
|
|
103
|
+
- `instruction-set/analysis-profile.md` — top "Executor binding" block (provider, displayName, workerAgent, model)
|
|
104
|
+
- `runs/implementation/manifests/run-manifest-*.json` — `teamContract.executor` object (same fields plus `appliesTo: "implementation"`)
|
|
105
|
+
|
|
106
|
+
Lead MUST dispatch Edit/Write-bearing work only through the `workerAgent` declared there. The other two providers still run as read-only verifiers in the same run; the executor's own provider is *also* dispatched separately as a verifier (a fresh CLI session) so the diff is reviewed by a context-isolated session. Session isolation is the primary self-review safeguard — same-model executor and same-provider verifier is acceptable when running in distinct sessions. Selecting a different model variant (e.g. executor=opus / Claude verifier=sonnet) is recommended but no longer mandatory.
|
|
107
|
+
|
|
108
|
+
Executor is chosen at run-prep time via `--executor <claude|codex|gemini>` (or `OKSTRA_DEFAULT_EXECUTOR`, fallback `claude`); the model used by the executor is taken from the corresponding worker model flag (`--claude-model` / `--codex-model` / `--gemini-model`). For Codex/Gemini executors, the underlying file mutation happens inside the executor CLI's own auto-edit mode (e.g. `codex exec --full-auto`), not through Claude-side Edit/Write tools.
|
|
109
|
+
|
|
99
110
|
## Phase 1: Task-bundle intake and required reading order
|
|
100
111
|
|
|
101
112
|
**REQUIRED SUB-SKILL:** Invoke [okstra-context-loader](./skills/okstra-context-loader/SKILL.md) first to discover task bundle paths.
|
|
@@ -130,13 +141,14 @@ These phases are governed by [okstra-team-contract](./skills/okstra-team-contrac
|
|
|
130
141
|
|
|
131
142
|
`Report writer worker` is NOT an analysis worker. Do not dispatch it in Phase 4/5 alongside analysis workers. It is invoked only in Phase 6 — see [okstra-report-writer](./skills/okstra-report-writer/SKILL.md).
|
|
132
143
|
|
|
133
|
-
### Phase 3 — Team creation
|
|
144
|
+
### Phase 3 — Team creation (BLOCKING)
|
|
134
145
|
|
|
135
|
-
|
|
146
|
+
`TeamCreate` MUST be the first Agent-related tool call after Phase 2 prompt preparation. Do not call `Agent(... team_name: ...)` for any worker until this phase has executed — the Agent tool rejects `team_name` for non-existent teams with `"team을 먼저 생성하거나 team_name 없이 호출해야 합니다"` / `"team must be created first or call without team_name"`, and silently stripping `team_name` to retry is NOT a valid recovery (it loses the Teams split-pane behavior and is indistinguishable from never having attempted Teams mode).
|
|
136
147
|
|
|
137
148
|
1. Call `TeamCreate(team_name: "okstra-<task-key>", description: "Lead-plus-worker okstra run for <task-key>")`.
|
|
138
|
-
2.
|
|
139
|
-
3. If `TeamCreate`
|
|
149
|
+
2. Record the `TeamCreate` outcome in team-state under `teamCreate: { attempted: true, status: "ok"|"error", error?: <message> }` before any dispatch. This is the audit trail that justifies a later no-`team_name` fallback.
|
|
150
|
+
3. If `TeamCreate` succeeds, proceed to Phase 4 (dispatch with `team_name`).
|
|
151
|
+
4. If `TeamCreate` fails (tool unavailable, permission denied, environment lacks Agent Teams support), proceed to Phase 5 fallback (dispatch with `run_in_background: true` and no `team_name`).
|
|
140
152
|
|
|
141
153
|
Use agent and subagent names that map cleanly to the selected worker roles. Do not create ambiguous role names that differ from `Claude worker`, `Codex worker`, `Gemini worker`, or `Report writer worker`.
|
|
142
154
|
|
|
@@ -144,6 +156,8 @@ Use agent and subagent names that map cleanly to the selected worker roles. Do n
|
|
|
144
156
|
|
|
145
157
|
Spawn **analysis workers only** in the same turn (Phase 4 in Teams mode; Phase 5 with `run_in_background: true` and no `team_name` when Teams unavailable). Preserve exact roster, role labels, assigned models from the task bundle.
|
|
146
158
|
|
|
159
|
+
The no-`team_name` fallback (Phase 5) is only legal when team-state's `teamCreate.status` is `"error"` for this run. If `teamCreate` is missing or `attempted: false`, the correct action when an Agent dispatch is rejected for a missing team is to GO BACK to Phase 3 and call `TeamCreate` — never to strip `team_name` and continue.
|
|
160
|
+
|
|
147
161
|
After each worker terminates (any terminal status), if a worker errors sidecar exists at `runs/.../worker-results/<role-slug>-errors.json`, dump it to the run error log:
|
|
148
162
|
|
|
149
163
|
```bash
|
|
@@ -224,6 +238,8 @@ After persistence, reply briefly in Korean with: completion status, final report
|
|
|
224
238
|
|
|
225
239
|
| Mistake | Fix |
|
|
226
240
|
|---------|-----|
|
|
241
|
+
| Dispatching workers with `team_name` before calling `TeamCreate` (Phase 3 skipped) | Phase 3 is BLOCKING — call `TeamCreate` first. The Agent tool's `"team must be created first"` rejection is not an environment-availability signal |
|
|
242
|
+
| Stripping `team_name` and retrying when the Agent tool rejects the call for a non-existent team | This is silent loss of Teams split-pane mode. Correct action: go back to Phase 3 and call `TeamCreate`. The no-`team_name` fallback (Phase 5) is only legal after `TeamCreate` was attempted and recorded as `error` in team-state |
|
|
227
243
|
| Substituting Claude lead reasoning for a worker result | Claude lead synthesizes only — spawn the worker |
|
|
228
244
|
| Skipping a worker silently | Always record terminal status with reason |
|
|
229
245
|
| Writing verdict before all workers report | Wait for all results or explicit terminal statuses |
|
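The Phase 3 and Phase 5 rules added in this hunk reduce to a small decision on the `teamCreate` record in team-state. The following is a minimal Python sketch of that decision, assuming team-state is available as a plain dict. The function name `dispatch_mode` and its three return labels are hypothetical and are not part of okstra.

```python
from typing import Any, Mapping


def dispatch_mode(team_state: Mapping[str, Any]) -> str:
    """Decide how analysis workers may be dispatched (hypothetical helper).

    Returns one of three illustrative labels:
      "teams"           -> dispatch with team_name (Phase 4)
      "fallback"        -> run_in_background with no team_name (Phase 5)
      "run-team-create" -> go back to Phase 3 and call TeamCreate first
    """
    record = team_state.get("teamCreate")
    # A missing record or attempted: false means Phase 3 was skipped.
    if not record or not record.get("attempted"):
        return "run-team-create"
    if record.get("status") == "ok":
        return "teams"
    if record.get("status") == "error":
        # Only a recorded TeamCreate failure makes the fallback legal.
        return "fallback"
    # Any other status is not described in the hunk; assume Phase 3 must rerun.
    return "run-team-create"
```

Under these assumptions, an Agent rejection for a missing team never leads to `"fallback"` unless a failed `TeamCreate` attempt was recorded first.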
package/runtime/agents/TODO.md
CHANGED
|
@@ -46,9 +46,22 @@ runs/<task-type>/
|
|
|
46
46
|
|
|
47
47
|
---
|
|
48
48
|
|
|
49
|
-
## 항목 F4 — `implementation.md` 프로필의 워커 이름 매핑 누락 [
|
|
49
|
+
## 항목 F4 — `implementation.md` 프로필의 워커 이름 매핑 누락 [부분 진행 — Executor 선택 CLI 만 처리됨, 2026-05-12]
|
|
50
50
|
|
|
51
|
-
###
|
|
51
|
+
### 부분 진행 메모 (2026-05-12)
|
|
52
|
+
|
|
53
|
+
`--executor <claude|codex|gemini>` 플래그가 추가되어 Executor provider 를 run-prep 시점에 선택할 수 있게 됐고, profile 텍스트도 "Claude executor" → 일반화된 "Executor" 로 재작성됨. 즉 **CLI / 매니페스트 / 프로필 표현 레이어는 정리됨**.
|
|
54
|
+
|
|
55
|
+
그러나 아래는 여전히 미해결:
|
|
56
|
+
|
|
57
|
+
- `claude-executor.md` / `claude-verifier.md` / `codex-verifier.md` / `gemini-verifier.md` 4종 subagent 정식 등록은 안 됨. 현재 lead 는 dispatch 할 때 기존 `claude-worker` / `codex-worker` / `gemini-worker` 를 그대로 재사용하며, executor vs verifier 의 도구 화이트리스트 차이는 프롬프트 레벨에서만 강제됨.
|
|
58
|
+
- run-manifest 의 `teamContract.executor.workerAgent` 가 가리키는 subagent 도 위 기존 `*-worker` 이름이라 도구-레벨 차단은 동작하지 않음.
|
|
59
|
+
|
|
60
|
+
본 항목의 원래 의도(도구-레벨 read-only 강제)는 4종 subagent 등록이 머지되어야 완성됨.
|
|
61
|
+
|
|
62
|
+
### 문제 (원본)
|
|
63
|
+
|
|
64
|
+
> 2026-05-12 주: 아래 본문은 F4 가 처음 기록된 시점의 분석. "다음 워커 이름을 사용한다" 의 명단(특히 `Claude executor`)은 그 후 프로필 일반화로 `Executor` 단일 role 로 통합됐고 provider 는 `--executor` 로 선택하게 됐다. 핵심 미해결인 *도구-레벨 read-only 강제* 부분은 그대로 유효.
|
|
52
65
|
|
|
53
66
|
[prompts/profiles/implementation.md](../../../prompts/profiles/implementation.md) 프로필은 다른 4개 프로필과 달리 다음 워커 이름을 사용한다:
|
|
54
67
|
|
|
package/runtime/agents/workers/claude-worker.md
CHANGED
|
@@ -18,7 +18,7 @@ description: |
|
|
|
18
18
|
</example>
|
|
19
19
|
model: inherit
|
|
20
20
|
color: blue
|
|
21
|
-
tools: ["Bash", "Read", "Write", "Edit", "Glob", "Grep", "TodoWrite", "WebFetch", "WebSearch"
|
|
21
|
+
tools: ["Bash", "Read", "Write", "Edit", "Glob", "Grep", "TodoWrite", "WebFetch", "WebSearch"]
|
|
22
22
|
---
|
|
23
23
|
|
|
24
24
|
You are a Claude worker agent for okstra cross-verification. Your emphasis: **broad reasoning quality, hidden assumptions, missing context, execution risk**.
|
|
@@ -43,7 +43,7 @@ Unlike the Codex / Gemini workers, you are an in-process Claude subagent — you
|
|
|
43
43
|
|
|
44
44
|
4. Anchor all file operations to the absolute `Project Root` from the lead prompt. Use absolute paths — do NOT rely on inherited cwd. Never use `cd` to change directory.
|
|
45
45
|
|
|
46
|
-
5. **MCP usage**: When the task requires
|
|
46
|
+
5. **MCP usage**: The canonical list of MCP servers and tools available for this run lives in the lead prompt's `## Available MCP Servers` section (sourced from `.project-docs/okstra/project.json`'s `mcpServers` array). When the task requires inspection of an external system covered by one of those servers, call the listed tool directly by name (e.g. `mcp__<server>__<tool>`). Do NOT shell out via `claude --mcp-cli call ...` or run the tool name as a Bash command — those are not valid invocation paths. If a server you need is not listed, record `MCP not available for this run` in your worker output rather than guessing a tool name.
|
|
47
47
|
|
|
48
48
|
6. If the task brief includes an `## Available MCP Servers` section in the lead prompt, treat that as the canonical list of MCP tools you may invoke for this run. If a needed server is not listed, record `MCP not available for this run` rather than calling it.
|
|
49
49
|
|
|
package/runtime/agents/workers/report-writer-worker.md
CHANGED
|
@@ -11,7 +11,7 @@ description: |
|
|
|
11
11
|
</example>
|
|
12
12
|
color: purple
|
|
13
13
|
model: inherit
|
|
14
|
-
tools: ["Bash", "Read", "Write", "Edit", "Glob", "Grep", "TodoWrite", "WebFetch", "WebSearch"
|
|
14
|
+
tools: ["Bash", "Read", "Write", "Edit", "Glob", "Grep", "TodoWrite", "WebFetch", "WebSearch"]
|
|
15
15
|
---
|
|
16
16
|
|
|
17
17
|
You are the `Report writer worker` for okstra cross-verification. Your sole responsibility is to **author the final-report file** at the assigned `Result Path`. You are NOT an analysis worker — you do not produce independent findings, you do not vote in convergence, and you do not re-do the workers' analysis.
|
|
@@ -39,7 +39,7 @@ If you find yourself thinking "I'll just return the report inline and let lead s
|
|
|
39
39
|
|
|
40
40
|
5. Anchor all file operations to the absolute `Project Root`. Use absolute paths everywhere — do not rely on inherited cwd, do not `cd`.
|
|
41
41
|
|
|
42
|
-
6. **MCP usage**: If the lead prompt's `## Available MCP Servers` block lists tools, you may invoke them by name to verify evidence cited by analysis workers
|
|
42
|
+
6. **MCP usage**: If the lead prompt's `## Available MCP Servers` block lists tools, you may invoke them by name (e.g. `mcp__<server>__<tool>`) to verify evidence cited by analysis workers. Do not invent MCP tools that are not listed.
|
|
43
43
|
|
|
44
44
|
## Required Reading Before Authoring
|
|
45
45
|
|
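Both worker definitions above tell the worker to call a listed MCP tool by its `mcp__<server>__<tool>` name and not to guess names for servers that are not listed. A minimal Python sketch of that lookup follows, assuming the `## Available MCP Servers` section has already been parsed into a mapping from server name to tool names. The function `resolve_mcp_tool` and the example server names are hypothetical.

```python
from typing import Mapping, Sequence

# Note the worker is told to record when a needed server is not listed.
MCP_UNAVAILABLE = "MCP not available for this run"


def resolve_mcp_tool(
    available: Mapping[str, Sequence[str]], server: str, tool: str
) -> str:
    """Return the `mcp__<server>__<tool>` name if listed, else the note."""
    if tool in available.get(server, ()):
        return f"mcp__{server}__{tool}"
    # Never invent a tool name for a server or tool that is not listed.
    return MCP_UNAVAILABLE
```

For example, with `{"db": ["query"]}` as the parsed list, `resolve_mcp_tool(..., "db", "query")` yields `mcp__db__query`, while any unlisted server yields the note.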
package/runtime/bin/okstra.sh
CHANGED
@@ -77,6 +77,7 @@ okstra execution summary:
 directive: ${DIRECTIVE:-None}
 clarification response: ${CLARIFICATION_RESPONSE_PATH:-None}
 workers override: ${WORKERS_OVERRIDE:-None}
+executor (implementation only): ${EXECUTOR_OVERRIDE:-default(claude)}
 approved plan: ${APPROVED_PLAN_PATH:-None}
 related tasks: ${RELATED_TASKS_RAW:-None}
 CONFIRM_EOF
@@ -111,6 +112,7 @@ PY_ARGS=(
 [[ -n "${CODEX_MODEL_OVERRIDE-}" ]] && PY_ARGS+=(--codex-model "$CODEX_MODEL_OVERRIDE")
 [[ -n "${GEMINI_MODEL_OVERRIDE-}" ]] && PY_ARGS+=(--gemini-model "$GEMINI_MODEL_OVERRIDE")
 [[ -n "${REPORT_WRITER_MODEL_OVERRIDE-}" ]] && PY_ARGS+=(--report-writer-model "$REPORT_WRITER_MODEL_OVERRIDE")
+[[ -n "${EXECUTOR_OVERRIDE-}" ]] && PY_ARGS+=(--executor "$EXECUTOR_OVERRIDE")
 [[ -n "${RELATED_TASKS_RAW-}" ]] && PY_ARGS+=(--related-tasks "$RELATED_TASKS_RAW")
 [[ -n "${APPROVED_PLAN_PATH-}" ]] && PY_ARGS+=(--approved-plan "$APPROVED_PLAN_PATH")
 [[ -n "${CLARIFICATION_RESPONSE_PATH-}" ]] && PY_ARGS+=(--clarification-response "$CLARIFICATION_RESPONSE_PATH")
|
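The shell change above forwards `--executor` only when an override is set, and the docs in this release describe the fallback order as flag, then `OKSTRA_DEFAULT_EXECUTOR`, then `claude`. A minimal Python sketch of that resolution follows. It is not the code in `okstra_ctl/run.py`, which this diff does not show. The helper names are hypothetical, and rejecting unknown values with `ValueError` is an assumption.

```python
import os
from typing import Mapping, Optional

PROVIDERS = ("claude", "codex", "gemini")


def resolve_executor(
    flag_value: Optional[str], env: Mapping[str, str] = os.environ
) -> str:
    """Pick the executor: CLI flag, then OKSTRA_DEFAULT_EXECUTOR, then claude."""
    provider = flag_value or env.get("OKSTRA_DEFAULT_EXECUTOR") or "claude"
    if provider not in PROVIDERS:
        # Assumption: invalid values are rejected rather than ignored.
        raise ValueError(f"executor must be one of {PROVIDERS}, got {provider!r}")
    return provider


def executor_model(provider: str, worker_models: Mapping[str, str]) -> str:
    """The executor reuses the model of the matching --<provider>-model flag."""
    return worker_models[provider]
```

With the documented fallback models (`sonnet`, `gpt-5.5`, `auto`), `--executor codex` would resolve to the `gpt-5.5` model unless `--codex-model` overrides it.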
|
package/runtime/prompts/launch.template.md
CHANGED
|
@@ -39,9 +39,9 @@ Invoke the `okstra` skill now. Read the manifests below for all task metadata, p
|
|
|
39
39
|
|
|
40
40
|
## Available MCP Servers
|
|
41
41
|
|
|
42
|
-
|
|
42
|
+
{{AVAILABLE_MCP_SERVERS}}
|
|
43
43
|
- The full usage policy and per-phase rules live in the task brief's `## Available MCP Servers` section. Read them there before dispatching workers and **forward that section verbatim into every worker prompt** during Phase 2 so workers know they are allowed to call these tools.
|
|
44
|
-
- **Invocation rule (forward to every worker prompt)**: MCP tools are addressed by their tool name through the host's tool interface — **never via `Bash`**. Claude-side workers call the tool directly (e.g. `
|
|
44
|
+
- **Invocation rule (forward to every worker prompt)**: MCP tools are addressed by their tool name through the host's tool interface — **never via `Bash`**. Claude-side workers call the tool directly (e.g. `mcp__<server>__<tool>`). Codex/Gemini workers call through their CLI's own MCP transport (e.g. `codex mcp call ...`). Running the tool name as a shell command is a contract violation and will always fail regardless of permission grants.
|
|
45
45
|
- Codex worker and Gemini worker run external CLIs; they can only use these MCP servers if their own CLI configs mirror them. If not, instruct the worker to record `MCP not available in this CLI` in its `Missing Information or Assumptions` block rather than guessing or shell-falling-back.
|
|
46
46
|
- MCP queries are evidence-grade. Cite server, table, and the SELECT used in worker output. MCP must NOT be used as a write path in any phase, including `implementation`.
|
|
47
47
|
|
|
package/runtime/prompts/profiles/error-analysis.md
CHANGED
|
@@ -15,7 +15,7 @@
|
|
|
15
15
|
- the final verdict waits until each required worker has either a result or an explicit terminal status
|
|
16
16
|
- unnamed generic parallel workers must not replace the required role roster
|
|
17
17
|
- Tooling — read-only MCP availability:
|
|
18
|
-
-
|
|
18
|
+
- the read-only MCP servers declared in the task brief's `## Available MCP Servers` section may be queried to confirm symptoms against live schema or to inspect rows that reproduce the failure; that section is the canonical source of which servers and tools exist for this run, and any MCP-derived hypothesis MUST cite server, table, and the SELECT used
|
|
19
19
|
- Primary focus areas:
|
|
20
20
|
- symptom and trigger clarification
|
|
21
21
|
- root-cause candidates
|
|
@@ -31,7 +31,7 @@
|
|
|
31
31
|
- if any blocking uncertainty remains at the time of writing the final report, populate `## 5. Clarification Requests for the Next Run` in `final-report-template.md`
|
|
32
32
|
- section 5 must be split into two distinct sub-sections per the template — `5.1 추가 자료 요청 (Additional Materials Requested)` for files/logs/screenshots the user must attach, and `5.2 사용자 확인 질문 (Questions for the User)` for decisions or facts only the user can confirm. Never mix material requests and decision questions in the same row or list.
|
|
33
33
|
- write every entry in full, descriptive sentences that a non-developer can act on without further context. Avoid abbreviations and internal jargon (e.g. write "초당 평균 요청 수" instead of "QPS", "재현 절차" instead of "repro"). For each material request, state *why* it is needed, *where* the user can find it, and *where* to place it. For each question, state *why* the answer changes the next step, *what* is being asked in a complete sentence, and *what shape of answer* is expected (예/아니오, 보기 중 하나, 숫자/날짜, 짧은 서술 등); supply concrete option choices when applicable.
|
|
34
|
-
- the same `final-report.md` file is the canonical artifact carried into the next run; the user appends answers inline before rerunning. The preferred turn-around is `okstra --resume-clarification --task-key <project-id>:<task-group>:<task-id>` (opens the latest report in `$EDITOR`, then auto-reruns the same phase with `--clarification-response` carry-in). The lower-level form `--clarification-response <path>` remains available for scripted runs.
|
|
34
|
+
- the same `final-report.md` file is the canonical artifact carried into the next run; the user appends answers inline before rerunning. The preferred turn-around is `scripts/okstra.sh --resume-clarification --task-key <project-id>:<task-group>:<task-id>` (opens the latest report in `$EDITOR`, then auto-reruns the same phase with `--clarification-response` carry-in). The lower-level form `--clarification-response <path>` remains available for scripted runs.
|
|
35
35
|
- if a clarification response was carried in for this run, reconcile each prior `A*` (material) and `Q*` (question) row in section 0 and update its `Status` (`resolved`, `obsolete`) before deciding the verdict
|
|
36
36
|
- Authority & permissions assumption (HARD RULE — applies to every okstra task-type):
|
|
37
37
|
- **Assume the user (and their team) holds full authority and every permission required for the anticipated work.** Treat external approvals, third-party access grants, role/IAM permissions, organisational sign-off, legal/compliance review, vendor coordination, and "verify access exists" steps as already satisfied unless the user explicitly states otherwise in the task brief.
|
package/runtime/prompts/profiles/final-verification.md
CHANGED
|
@@ -15,7 +15,7 @@
|
|
|
15
15
|
- the final verdict waits until each required worker has either a result or an explicit terminal status
|
|
16
16
|
- unnamed generic parallel workers must not replace the required role roster
|
|
17
17
|
- Tooling — read-only MCP availability:
|
|
18
|
-
-
|
|
18
|
+
- the read-only MCP servers declared in the task brief's `## Available MCP Servers` section may be queried to verify that the delivered change matches the live schema, that expected rows exist after a migration, or that invariants in `reference-expectations.md` hold against the database; that section is the canonical source of which servers and tools exist for this run, and any MCP-derived blocker MUST cite server, table, and the SELECT used. MCP MUST NOT be used to perform fixes — defects become inputs to a new run.
|
|
19
19
|
- Primary focus areas:
|
|
20
20
|
- requirement coverage
|
|
21
21
|
- whether delivered config files and deployment manifests satisfy the recorded expected values
|
|
@@ -27,6 +27,25 @@
|
|
|
27
27
|
- acceptance blockers
|
|
28
28
|
- residual risk
|
|
29
29
|
- final release recommendations
|
|
30
|
+
- Required deliverable shape (final report, in addition to the standard sections):
|
|
31
|
+
- **Verdict vocabulary**: Section 2 (`Final Verdict`) MUST state exactly one of `accepted`, `conditional-accept`, or `blocked`. `conditional-accept` requires an explicit, exhaustive list of conditions; ambiguous verdicts ("looks good", "mostly ready") are not allowed.
|
|
32
|
+
- **Acceptance Blockers block** (under section 4): one row per blocker with `id`, `severity` (`critical` / `major` / `minor`), evidence (file path, log excerpt, or test output), and the recommended follow-up phase (`error-analysis` or `implementation-planning`). Empty block is acceptable and preferred — render the single line `- No acceptance blockers found.`
|
|
33
|
+
- **Residual Risk block** (under section 4): risks that are not blockers but should be tracked, each with mitigation owner and a trigger that would escalate them to a blocker.
|
|
34
|
+
- **Validation Evidence**: for every requirement in the originating plan or task brief, cite the artifact (commit SHA, test output, log line, MCP SELECT result) that demonstrates coverage. Paraphrased "verified" claims without an artifact are rejected.
|
|
35
|
+
- **Read-only command log**: any pre-existing test/validation command executed during this run MUST be listed with its exact command line and exit code. No mutating commands may appear here.
|
|
36
|
+
- **Routing recommendation**: brief note on the next safe phase (`done`, `error-analysis`, `implementation-planning`) tied to the verdict and blocker list.
|
|
37
|
+
- Clarification request policy:
|
|
38
|
+
- if a blocker hinges on information only the user can supply (deployment intent, intended target environment, business-rule interpretation), populate `## 5. Clarification Requests for the Next Run` in `final-report-template.md`
|
|
39
|
+
- section 5 must be split into `5.1 추가 자료 요청 (Additional Materials Requested)` and `5.2 사용자 확인 질문 (Questions for the User)` per the template. Never mix material requests and decision questions in the same row or list.
|
|
40
|
+
- write every entry in full, descriptive sentences that a non-developer can act on without further context. Avoid abbreviations and internal jargon. For each material request, state *why* it is needed, *where* the user can find it, and *where* to place it. For each question, state *why* the answer changes the verdict, *what* is being asked in a complete sentence, and *what shape of answer* is expected (예/아니오, 보기 중 하나, 숫자/날짜, 짧은 서술 등); supply concrete option choices when applicable.
|
|
41
|
+
- the preferred turn-around is `scripts/okstra.sh --resume-clarification --task-key <project-id>:<task-group>:<task-id>`; the lower-level form `--clarification-response <path>` remains available for scripted runs.
|
|
42
|
+
- if a clarification response was carried in for this run, reconcile each prior `A*` and `Q*` row in section 0 and update its `Status` (`resolved`, `obsolete`) before issuing the final verdict
|
|
43
|
+
- Self-review pass before finalising the report (`Claude lead` runs this; do not delegate to a generic subagent):
|
|
44
|
+
1. **Verdict precision** — section 2 uses one of the three allowed verdict tokens; `conditional-accept` lists every condition as an actionable item.
|
|
45
|
+
2. **Blocker traceability** — every blocker cites a concrete artifact (file:line, log excerpt, test exit code, MCP SELECT). Blockers without evidence are demoted to residual risk or removed.
|
|
46
|
+
3. **Coverage check** — every requirement in the originating plan/task brief is either marked covered (with artifact) or listed as a blocker. No silent omissions.
|
|
47
|
+
4. **Verifier dissent preserved** — if workers reach different verdicts, the disagreement is visible in section 1.2; synthesis hides nothing.
|
|
48
|
+
5. **No-mutation audit** — scan the run's session transcripts for any Edit / Write / mutating Bash command. Any occurrence means the run has crossed into implementation and MUST be re-routed; do NOT silently strip the evidence.
|
|
30
49
|
- Authority & permissions assumption (HARD RULE — applies to every okstra task-type):
|
|
31
50
|
- **Assume the user (and their team) holds full authority and every permission required for the delivered and follow-up work.** Treat external approvals, third-party access grants, role/IAM permissions, organisational sign-off, legal/compliance review, vendor coordination, and "verify access exists" steps as already satisfied unless the task brief explicitly states otherwise.
|
|
32
51
|
- Do NOT raise such items as acceptance blockers, residual risks, or release recommendations, and do not factor them into any effort/day estimate for follow-up runs. They are not legitimate sources of schedule extension.
|
|
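The coverage check in the self-review pass above amounts to a small set comparison. The sketch below is illustrative only: the function name `find_silent_omissions` and the shapes of `requirements`, `covered`, and `blockers` are assumptions made for this example, not structures defined by okstra.

```python
def find_silent_omissions(requirements, covered, blockers):
    """Return requirement ids that are neither covered by an artifact nor listed as blockers.

    requirements: iterable of requirement ids from the plan / task brief (hypothetical shape)
    covered:      dict mapping requirement id -> artifact reference (e.g. "file:line")
    blockers:     set of requirement ids raised as blockers
    """
    omitted = []
    for req in requirements:
        # A requirement counts as covered only when a non-empty artifact is cited.
        has_artifact = bool(covered.get(req, "").strip())
        if not has_artifact and req not in blockers:
            omitted.append(req)
    return omitted


requirements = ["R1", "R2", "R3", "R4"]
covered = {"R1": "tests/test_api.py:42", "R3": ""}  # R3 is marked covered without evidence
blockers = {"R2"}
omissions = find_silent_omissions(requirements, covered, blockers)
```

Under these assumptions, a non-empty result means the report still has silent omissions and is not ready for a verdict.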
@@ -15,7 +15,7 @@
|
|
|
15
15
|
- the final verdict waits until each required worker has either a result or an explicit terminal status
|
|
16
16
|
- unnamed generic parallel workers must not replace the required role roster
|
|
17
17
|
- Tooling — read-only MCP availability:
|
|
18
|
-
-
|
|
18
|
+
- the read-only MCP servers declared in the task brief's `## Available MCP Servers` section may be queried to size the blast radius of an option (table cardinality, column types, foreign-key fan-out, indexes), to validate migration assumptions, or to confirm that a proposed query shape returns the expected rows; that section is the canonical source of which servers and tools exist for this run, and any MCP-derived figure entering the trade-off matrix or risk assessment MUST cite server, table, and the SELECT used. MCP MUST NOT be used as a write path even when planning a migration — schema changes belong in migration files reviewed by humans.
|
|
19
19
|
- Pre-planning context exploration (mandatory before option drafting):
|
|
20
20
|
- read the task brief, related-task briefs, and any cited spec / design doc end-to-end
|
|
21
21
|
- inspect the current state of every file the task names (or the closest matching files if names are stale) — record current responsibilities, public interfaces, and known coupling points
|
|
@@ -6,18 +6,24 @@
|
|
|
6
6
|
- codex
|
|
7
7
|
- gemini
|
|
8
8
|
- report-writer
|
|
9
|
+
- **Executor binding (resolved at run-prep time, fixed for this run):**
|
|
10
|
+
- Executor display name: `{{EXECUTOR_DISPLAY_NAME}}`
|
|
11
|
+
- Executor provider: `{{EXECUTOR_PROVIDER}}` (one of: `claude` | `codex` | `gemini`; chosen via `--executor` or `OKSTRA_DEFAULT_EXECUTOR`, default `claude`)
|
|
12
|
+
- Executor subagent for dispatch: `{{EXECUTOR_WORKER_AGENT}}`
|
|
13
|
+
- Executor model: `{{EXECUTOR_MODEL_DISPLAY}}` (launch value: `{{EXECUTOR_MODEL_EXECUTION_VALUE}}`)
|
|
14
|
+
- Wherever this profile mentions the `Executor`, it refers to the role bound above. The other two providers in the roster (`claude` / `codex` / `gemini` minus the executor) are dispatched as **verifiers only** for this run and remain strictly read-only.
|
|
9
15
|
- Team contract:
|
|
10
|
-
- `Claude lead` is synthesis-only and stays distinct from `
|
|
11
|
-
- **Executor role:**
|
|
12
|
-
- **Verifier roles:** `
|
|
13
|
-
-
|
|
16
|
+
- `Claude lead` is synthesis-only and stays distinct from the `Executor` and the verifiers
|
|
17
|
+
- **Executor role:** the `Executor` (bound above) is the **only worker permitted to use Edit / Write / state-mutating Bash commands** on project files. All other workers run read-only. When the executor provider is `codex` or `gemini`, the actual file mutation happens inside the executor CLI's own auto-edit mode (e.g. `codex exec --full-auto`, gemini's equivalent) — not through Claude-side Edit/Write tools — but the safety rules in this profile still apply identically.
|
|
18
|
+
- **Verifier roles:** the three verifier slots are `Claude verifier`, `Codex verifier`, and `Gemini verifier`. All three are dispatched regardless of which provider holds the executor role; the executor's own provider is run *separately* as a verifier (a fresh CLI session with no shared context) so that no verdict is produced from the same session that wrote the diff. Verifiers MUST NOT call Edit, Write, or any Bash command that mutates files outside the run's artifact directories. If a verifier wants a fix, it records the recommendation in its worker result; it does not apply the fix itself.
|
|
19
|
+
- Session isolation — not model-variant divergence — is the primary self-review safeguard: each verifier is a separate CLI invocation with its own context window, so reusing the same model variant for executor and same-provider verifier is acceptable. Assigning different model variants (e.g. executor=opus / Claude verifier=sonnet) remains recommended when available because it adds defence-in-depth, but it is no longer a hard requirement.
|
|
14
20
|
- `Report writer worker` is the **author** of the final-report file; `Claude lead` reviews and approves the produced draft and does NOT write the file itself (see `okstra-team-contract` and `okstra-report-writer` for the authoritative contract).
|
|
15
|
-
- default model assignments are resolved from centralised defaults; the fallback values are `Claude lead`/`
|
|
21
|
+
- default model assignments are resolved from centralised defaults; the fallback values are `Claude lead`/`Report writer worker`=`opus`, `Claude verifier`=`sonnet`, `Codex verifier`=`gpt-5.5`, `Gemini verifier`=`auto`. The `Executor`'s model is taken from the provider-specific worker model corresponding to `--executor`: claude→`--claude-model` (default `sonnet`, override to `opus` recommended when this run's executor is claude), codex→`--codex-model` (default `gpt-5.5`), gemini→`--gemini-model` (default `auto`).
|
|
16
22
|
- all three verifier roles (`Gemini verifier`, `Codex verifier`, `Claude verifier`) must be attempted; the final verdict waits until each has either a result or an explicit terminal status
|
|
17
23
|
- **All-verifier-failure policy**: if every required verifier (`Gemini verifier`, `Codex verifier`, `Claude verifier`) ends with a non-result terminal status (`timeout`, `error`, `not-run`) — i.e. zero independent verdicts were produced — the run MUST end with status `blocked` and route to a follow-up `error-analysis` run. `Claude lead` MUST NOT substitute its own verdict in place of the missing verifier outputs; synthesis requires at least one independent verifier's verdict. If one or two verifiers fail but at least one returns a verdict, the run proceeds with the surviving verdict(s) and the final report MUST explicitly notate which verifiers were unavailable, with the captured error / timeout evidence per failed verifier.
|
|
18
24
|
- unnamed generic parallel workers must not replace the required role roster, and no additional sub-agent dispatch is allowed beyond this roster
|
|
19
25
|
- Tooling — read-only MCP availability:
|
|
20
|
-
-
|
|
26
|
+
- the read-only MCP servers declared in the task brief's `## Available MCP Servers` section may be queried by both executor and verifiers as a read-only cross-check (sanity-checking row counts after a migration script's dry-run, comparing observed schema against the plan's expectations, etc.); that section is the canonical source of which servers and tools exist for this run, and any MCP-derived evidence MUST cite server, table, and the SELECT used. MCP MUST NEVER be used as a write path — schema/data mutations go through repository migration files, never through this MCP.
|
|
21
27
|
- Pre-implementation gate (mandatory — refuse to start if any item fails):
|
|
22
28
|
- the run brief MUST cite `--approved-plan <path>` pointing to a `final-report.md` produced by a prior `implementation-planning` run located under `runs/implementation-planning/.../reports/final-report.md`
|
|
23
29
|
- that file MUST contain a `User Approval Request` block AND a recorded user approval marker matching one of the following line-anchored, case-insensitive forms (the runtime regex in `okstra_ctl.run._validate_approved_plan` enforces this and rejects the run with `PrepareError` before any prompt is generated): `APPROVED` (alone, followed by `:`, or end-of-line), `[x] Approved`, or `User Approval: APPROVED|granted|yes`. Free-form approvals such as "lgtm", "go ahead", or paraphrased confirmations are intentionally NOT accepted; if the user's approval is informal, re-edit the plan file to add one of the exact markers above before invoking the implementation run.
|
|
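The approval-marker forms listed in the pre-implementation gate can be approximated with a line-anchored, case-insensitive pattern. This is a sketch under stated assumptions, not the regex used by `okstra_ctl.run._validate_approved_plan`: the names `APPROVAL_MARKER` and `has_approval_marker` are hypothetical, and tolerating leading indentation or a list bullet is a guess about how "line-anchored" is meant.

```python
import re

# Hypothetical approximation of the accepted approval markers.
APPROVAL_MARKER = re.compile(
    r"""
    ^[ \t]*(?:[-*][ \t]+)?          # optional indentation and list bullet (assumption)
    (?:
        APPROVED[ \t]*(?::.*)?$     # APPROVED alone, or followed by ':'
      | \[x\][ \t]*Approved\b       # checked checkbox form
      | User[ \t]+Approval[ \t]*:[ \t]*(?:APPROVED|granted|yes)\b
    )
    """,
    re.IGNORECASE | re.MULTILINE | re.VERBOSE,
)


def has_approval_marker(plan_text: str) -> bool:
    """Return True when at least one line carries an accepted marker."""
    return APPROVAL_MARKER.search(plan_text) is not None
```

A pre-check like this could run before invoking the implementation run. The runtime validation remains authoritative and may accept or reject inputs that this sketch does not.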
@@ -15,7 +15,7 @@
|
|
|
15
15
|
- the final verdict waits until each required worker has either a result or an explicit terminal status
|
|
16
16
|
- unnamed generic parallel workers must not replace the required role roster
|
|
17
17
|
- Tooling — read-only MCP availability:
|
|
18
|
-
-
|
|
18
|
+
- the read-only MCP servers declared in the task brief's `## Available MCP Servers` section may be queried when local schema or sample data clarifies the work category or routing decision; that section is the canonical source of which servers and tools exist for this run, and any MCP-derived finding MUST cite server, table, and the SELECT used
|
|
19
19
|
- Primary focus areas:
|
|
20
20
|
- classify the work as bugfix, feature, improvement, refactor, or ops-change
|
|
21
21
|
- determine whether `error-analysis`, `implementation-planning`, or a direct implementation handoff is the next safe step
|
|
@@ -51,6 +51,7 @@ okstra execution summary:
|
|
|
51
51
|
recommended workers: ${SELECTED_REVIEWERS}
|
|
52
52
|
lead model: ${LEAD_MODEL_DISPLAY}
|
|
53
53
|
worker models: claude=${CLAUDE_WORKER_MODEL_DISPLAY}, codex=${CODEX_WORKER_MODEL_DISPLAY}, gemini=${GEMINI_WORKER_MODEL_DISPLAY}, report-writer=${REPORT_WRITER_MODEL_DISPLAY}
|
|
54
|
+
executor (implementation only): ${EXECUTOR_OVERRIDE:-default(claude)}
|
|
54
55
|
task key input: ${TASK_KEY_INPUT:-None}
|
|
55
56
|
task key: ${TASK_KEY}
|
|
56
57
|
task root: ${TASK_ROOT}
|
|
@@ -131,6 +132,10 @@ while [[ $# -gt 0 ]]; do
|
|
|
131
132
|
REPORT_WRITER_MODEL_OVERRIDE="$(require_option_value --report-writer-model "${2-}")"
|
|
132
133
|
shift 2
|
|
133
134
|
;;
|
|
135
|
+
--executor)
|
|
136
|
+
EXECUTOR_OVERRIDE="$(require_option_value --executor "${2-}")"
|
|
137
|
+
shift 2
|
|
138
|
+
;;
|
|
134
139
|
--related-tasks)
|
|
135
140
|
RELATED_TASKS_RAW="$(require_option_value --related-tasks "${2-}")"
|
|
136
141
|
shift 2
|
|
@@ -204,7 +209,7 @@ while [[ $# -gt 0 ]]; do
|
|
|
204
209
|
printf ' hint: did you mean --task-id?\n' >&2
|
|
205
210
|
;;
|
|
206
211
|
esac
|
|
207
|
-
printf ' valid options: --render-only --resume-clarification --yes --refresh-assets --workers --lead-model --claude-model --codex-model --gemini-model --report-writer-model --related-tasks --task-type --project-id --project-root --task-group --task-id --task-brief --directive --clarification-response --approved-plan -h|--help\n' >&2
|
|
212
|
+
printf ' valid options: --render-only --resume-clarification --yes --refresh-assets --workers --lead-model --claude-model --codex-model --gemini-model --report-writer-model --executor --related-tasks --task-type --project-id --project-root --task-group --task-id --task-brief --directive --clarification-response --approved-plan -h|--help\n' >&2
|
|
208
213
|
usage
|
|
209
214
|
exit 1
|
|
210
215
|
;;
|
|
@@ -3,7 +3,7 @@
|
|
|
3
3
|
usage() {
|
|
4
4
|
cat >&2 <<USAGE_EOF
|
|
5
5
|
usage:
|
|
6
|
-
$DISPLAY_COMMAND_NAME [--render-only] [--yes] [--refresh-assets] --task-type <task-type> [--workers worker1,worker2] [--lead-model <model>] [--claude-model <model>] [--codex-model <model>] [--gemini-model <model>] [--report-writer-model <model>] [--related-tasks taskA,taskB] --project-id <project-id> [--project-root <path>] --task-group <task-group> --task-id <task-id> --task-brief <brief-path> [--directive <directive>]
|
|
6
|
+
$DISPLAY_COMMAND_NAME [--render-only] [--yes] [--refresh-assets] --task-type <task-type> [--workers worker1,worker2] [--lead-model <model>] [--claude-model <model>] [--codex-model <model>] [--gemini-model <model>] [--report-writer-model <model>] [--executor claude|codex|gemini] [--related-tasks taskA,taskB] --project-id <project-id> [--project-root <path>] --task-group <task-group> --task-id <task-id> --task-brief <brief-path> [--directive <directive>]
|
|
7
7
|
|
|
8
8
|
summary:
|
|
9
9
|
$DISPLAY_TOOL_NAME prepares a task-keyed instruction bundle for Claude Code and launches an interactive Claude session by default.
|
|
@@ -15,7 +15,7 @@ summary:
|
|
|
15
15
|
permissions are injected via 'claude --settings' at launch time.
|
|
16
16
|
|
|
17
17
|
required arguments:
|
|
18
|
-
--project-id Globally unique project ID. Example:
|
|
18
|
+
--project-id Globally unique project ID. Example: sample-project-v2-api.
|
|
19
19
|
Each project is registered at <project-root>/.project-docs/okstra/project.json
|
|
20
20
|
on first run; subsequent runs verify the projectId there matches.
|
|
21
21
|
--task-group Logical task group. Example: backend-api, bugfix, linear-8858
|
|
@@ -66,6 +66,11 @@ options:
|
|
|
66
66
|
--gemini-model Model for Gemini worker. Default: OKSTRA_DEFAULT_GEMINI_MODEL or auto
|
|
67
67
|
--report-writer-model
|
|
68
68
|
Model for report writer worker. Default: OKSTRA_DEFAULT_REPORT_WRITER_MODEL or lead model default
|
|
69
|
+
--executor Provider that performs the Executor role during --task-type=implementation.
|
|
70
|
+
One of: claude | codex | gemini. Default: OKSTRA_DEFAULT_EXECUTOR or claude.
|
|
71
|
+
The Executor is the only worker allowed to mutate project files; the other two
|
|
72
|
+
providers are dispatched as read-only verifiers regardless of this selection.
|
|
73
|
+
Has no effect on other task types.
|
|
69
74
|
--related-tasks Optional comma-separated related task identifiers. Example: auth-token-refresh,frontend-login-ui
|
|
70
75
|
--task-type Set the task purpose for this run and select the matching profile file.
|
|
71
76
|
-h, --help Show this help.
|
|
@@ -76,6 +81,7 @@ model defaults:
|
|
|
76
81
|
Claude worker: OKSTRA_DEFAULT_CLAUDE_MODEL or sonnet
|
|
77
82
|
Codex worker: OKSTRA_DEFAULT_CODEX_MODEL or gpt-5.5
|
|
78
83
|
Gemini worker: OKSTRA_DEFAULT_GEMINI_MODEL or auto
|
|
84
|
+
Implementation executor: OKSTRA_DEFAULT_EXECUTOR or claude (one of: claude | codex | gemini)
|
|
79
85
|
|
|
80
86
|
output:
|
|
81
87
|
Stable task bundles are stored under:
|
|
@@ -681,6 +681,14 @@ def render_run_manifest(run_manifest_path: str, ctx: dict) -> None:
|
|
|
681
681
|
"disallowLeadSoloAnalysisAsWorkerResult": True,
|
|
682
682
|
"disallowGenericParallelOnlyExecution": True,
|
|
683
683
|
"preferredCompletedWorkerResults": len(reviewers),
|
|
684
|
+
"executor": {
|
|
685
|
+
"provider": ctx.get("EXECUTOR_PROVIDER", ""),
|
|
686
|
+
"displayName": ctx.get("EXECUTOR_DISPLAY_NAME", ""),
|
|
687
|
+
"workerAgent": ctx.get("EXECUTOR_WORKER_AGENT", ""),
|
|
688
|
+
"model": ctx.get("EXECUTOR_MODEL_DISPLAY", ""),
|
|
689
|
+
"modelExecutionValue": ctx.get("EXECUTOR_MODEL_EXECUTION_VALUE", ""),
|
|
690
|
+
"appliesTo": "implementation",
|
|
691
|
+
},
|
|
684
692
|
},
|
|
685
693
|
"validation": {
|
|
686
694
|
"required": True,
|
|
@@ -857,6 +865,59 @@ def render_task_index(template_path: str, output_path: str, ctx: dict) -> None:
|
|
|
857
865
|
_write_text(Path(output_path), rendered.rstrip() + "\n")
|
|
858
866
|
|
|
859
867
|
|
|
868
|
+
# --------------------------------------------------------------------------- #
|
|
869
|
+
# Available MCP servers block
|
|
870
|
+
# --------------------------------------------------------------------------- #
|
|
871
|
+
|
|
872
|
+
|
|
873
|
+
_NO_MCP_SERVERS_LINE = (
|
|
874
|
+
"- No MCP servers are declared in `.project-docs/okstra/project.json`'s "
|
|
875
|
+
"`mcpServers` array. Treat MCP tools as unavailable for this run. To enable "
|
|
876
|
+
"them, add entries shaped `{name, description, tools, notes?}` to that array "
|
|
877
|
+
"and re-render the bundle."
|
|
878
|
+
)
|
|
879
|
+
|
|
880
|
+
|
|
881
|
+
def build_available_mcp_servers_block(project_root: Path) -> str:
|
|
882
|
+
"""Render the `## Available MCP Servers` first bullet from project.json.
|
|
883
|
+
|
|
884
|
+
The MCP server list used to be hardcoded for one specific environment.
|
|
885
|
+
It now comes from the project's `.project-docs/okstra/project.json`
|
|
886
|
+
(`mcpServers` array), so each user/project declares the MCP surface
|
|
887
|
+
available to their lead+workers. Missing file or empty array yields a
|
|
888
|
+
generic "none declared" fallback.
|
|
889
|
+
"""
|
|
890
|
+
config_path = project_root / ".project-docs" / "okstra" / "project.json"
|
|
891
|
+
try:
|
|
892
|
+
raw = json.loads(config_path.read_text(encoding="utf-8"))
|
|
893
|
+
except (FileNotFoundError, json.JSONDecodeError):
|
|
894
|
+
return _NO_MCP_SERVERS_LINE
|
|
895
|
+
servers = raw.get("mcpServers") if isinstance(raw, dict) else None
|
|
896
|
+
if not isinstance(servers, list) or not servers:
|
|
897
|
+
return _NO_MCP_SERVERS_LINE
|
|
898
|
+
lines: list[str] = []
|
|
899
|
+
for entry in servers:
|
|
900
|
+
if not isinstance(entry, dict):
|
|
901
|
+
continue
|
|
902
|
+
name = str(entry.get("name", "")).strip()
|
|
903
|
+
if not name:
|
|
904
|
+
continue
|
|
905
|
+
description = str(entry.get("description", "")).strip()
|
|
906
|
+
tools = entry.get("tools") or []
|
|
907
|
+
notes = str(entry.get("notes", "")).strip()
|
|
908
|
+
parts = [f"`mcp__{name}`"]
|
|
909
|
+
if description:
|
|
910
|
+
parts.append(description)
|
|
911
|
+
if isinstance(tools, list) and tools:
|
|
912
|
+
tool_names = ", ".join(f"`{str(t).strip()}`" for t in tools if str(t).strip())
|
|
913
|
+
if tool_names:
|
|
914
|
+
parts.append(f"Tools: {tool_names}")
|
|
915
|
+
if notes:
|
|
916
|
+
parts.append(notes)
|
|
917
|
+
lines.append("- " + ". ".join(parts) + ".")
|
|
918
|
+
return "\n".join(lines) if lines else _NO_MCP_SERVERS_LINE
|
|
919
|
+
|
|
920
|
+
|
|
860
921
|
# --------------------------------------------------------------------------- #
|
|
861
922
|
# launch.template.md rendering
|
|
862
923
|
# --------------------------------------------------------------------------- #
|
|
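To see what `build_available_mcp_servers_block` produces for a declared server, the bullet-building logic can be replayed on an in-memory document. The sketch below mirrors the steps in the hunk above but is a standalone approximation: `render_mcp_bullets` is a hypothetical helper, and the `inventory-db` server with its tool names is invented sample data.

```python
import json


def render_mcp_bullets(project_json_text: str) -> list[str]:
    """Build one Markdown bullet per declared MCP server (illustrative re-implementation)."""
    raw = json.loads(project_json_text)
    servers = raw.get("mcpServers") if isinstance(raw, dict) else None
    bullets: list[str] = []
    for entry in servers or []:
        if not isinstance(entry, dict):
            continue
        name = str(entry.get("name", "")).strip()
        if not name:
            continue  # entries without a name are skipped
        parts = [f"`mcp__{name}`"]
        description = str(entry.get("description", "")).strip()
        if description:
            parts.append(description)
        tools = entry.get("tools") or []
        if isinstance(tools, list):
            tool_names = ", ".join(f"`{str(t).strip()}`" for t in tools if str(t).strip())
            if tool_names:
                parts.append(f"Tools: {tool_names}")
        notes = str(entry.get("notes", "")).strip()
        if notes:
            parts.append(notes)
        bullets.append("- " + ". ".join(parts) + ".")
    return bullets


# Invented sample declaration; real projects list their own servers.
SAMPLE = json.dumps({
    "mcpServers": [
        {
            "name": "inventory-db",
            "description": "Local read-only MySQL replica",
            "tools": ["list_tables", "select_data"],
            "notes": "Read-only",
        },
        {"description": "entry without a name is ignored"},
    ]
})
bullets = render_mcp_bullets(SAMPLE)
```

Because the parts are joined with `". "` and a final `"."` is appended, a description or note that already ends in a period yields a doubled period, so entries read best without trailing punctuation.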
@@ -1003,6 +1064,10 @@ def render_template_file(template_path: str, output_path: str, ctx: dict) -> Non
|
|
|
1003
1064
|
"{{WORKFLOW_NEXT_RECOMMENDED_PHASE}}": ctx.get("WORKFLOW_NEXT_RECOMMENDED_PHASE", ""),
|
|
1004
1065
|
"{{PHASE_ALLOWED_OUTPUTS}}": ctx.get("PHASE_ALLOWED_OUTPUTS", ""),
|
|
1005
1066
|
"{{PHASE_FORBIDDEN_ACTIONS}}": ctx.get("PHASE_FORBIDDEN_ACTIONS", ""),
|
|
1067
|
+
"{{AVAILABLE_MCP_SERVERS}}": ctx.get(
|
|
1068
|
+
"AVAILABLE_MCP_SERVERS",
|
|
1069
|
+
build_available_mcp_servers_block(Path(ctx.get("PROJECT_ROOT", "."))),
|
|
1070
|
+
),
|
|
1006
1071
|
}
|
|
1007
1072
|
rendered = template
|
|
1008
1073
|
for k, v in mapping.items():
|
|
@@ -81,6 +81,7 @@ class PrepareInputs:
|
|
|
81
81
|
codex_model: str = ""
|
|
82
82
|
gemini_model: str = ""
|
|
83
83
|
report_writer_model: str = ""
|
|
84
|
+
executor: str = ""
|
|
84
85
|
related_tasks_raw: str = ""
|
|
85
86
|
approved_plan_path: str = ""
|
|
86
87
|
clarification_response_path: str = "" # absolute or empty
|
|
@@ -265,6 +266,7 @@ def _canonical_argv(inp: PrepareInputs, ctx: dict) -> list[str]:
|
|
|
265
266
|
("--codex-model", inp.codex_model or ctx.get("CODEX_WORKER_MODEL_DISPLAY", "")),
|
|
266
267
|
("--gemini-model", inp.gemini_model or ctx.get("GEMINI_WORKER_MODEL_DISPLAY", "")),
|
|
267
268
|
("--report-writer-model", inp.report_writer_model or ctx.get("REPORT_WRITER_MODEL_DISPLAY", "")),
|
|
269
|
+
("--executor", inp.executor or ctx.get("EXECUTOR_PROVIDER", "")),
|
|
268
270
|
("--related-tasks", inp.related_tasks_raw),
|
|
269
271
|
]
|
|
270
272
|
argv: list[str] = []
|
|
@@ -383,6 +385,22 @@ def prepare_task_bundle(inp: PrepareInputs) -> PrepareOutputs:
|
|
|
383
385
|
default_display=report_writer_default, default_execution=report_writer_default,
|
|
384
386
|
)
|
|
385
387
|
|
|
388
|
+
# ---- executor binding (implementation phase only; recorded universally for manifest consistency) ----
|
|
389
|
+
executor_default = _default("OKSTRA_DEFAULT_EXECUTOR", "claude")
|
|
390
|
+
executor_provider = (inp.executor or executor_default).strip().lower()
|
|
391
|
+
if executor_provider not in ("claude", "codex", "gemini"):
|
|
392
|
+
raise PrepareError(
|
|
393
|
+
f"--executor must be one of: claude, codex, gemini (got: {executor_provider!r})"
|
|
394
|
+
)
|
|
395
|
+
executor_provider_to_meta = {
|
|
396
|
+
"claude": ("Claude executor", "claude-worker", cw),
|
|
397
|
+
"codex": ("Codex executor", "codex-worker", co),
|
|
398
|
+
"gemini": ("Gemini executor", "gemini-worker", ge),
|
|
399
|
+
}
|
|
400
|
+
executor_display_name, executor_worker_agent, executor_model_meta = (
|
|
401
|
+
executor_provider_to_meta[executor_provider]
|
|
402
|
+
)
|
|
403
|
+
|
|
386
404
|
# ---- paths under per-task mutex (writes run-context-*.json) ----
|
|
387
405
|
# OKSTRA_RUN_SEQ_OVERRIDE: user-knob environment variable that forces a seq
|
|
388
406
|
# reserved in advance by the okstra-ctl rerun / test hook.
|
|
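The executor binding above resolves in three steps: take the explicit `--executor` value, fall back to `OKSTRA_DEFAULT_EXECUTOR`, then fall back to `claude`. Below is a minimal standalone sketch of that precedence. It assumes the `_default` helper simply reads an environment variable with a fallback (its body is not shown in this diff); `resolve_executor` and the local `PrepareError` are stand-ins written for this example.

```python
import os


class PrepareError(Exception):
    """Stand-in for the runtime's PrepareError, defined locally for this sketch."""


_EXECUTOR_META = {
    "claude": ("Claude executor", "claude-worker"),
    "codex": ("Codex executor", "codex-worker"),
    "gemini": ("Gemini executor", "gemini-worker"),
}


def resolve_executor(cli_value: str = "", env=None):
    """Return (provider, display_name, worker_agent) for the implementation executor."""
    env = os.environ if env is None else env
    # Assumed behaviour of _default: environment value if set, otherwise "claude".
    default = env.get("OKSTRA_DEFAULT_EXECUTOR", "").strip() or "claude"
    provider = (cli_value or default).strip().lower()
    if provider not in _EXECUTOR_META:
        raise PrepareError(
            f"--executor must be one of: claude, codex, gemini (got: {provider!r})"
        )
    display_name, worker_agent = _EXECUTOR_META[provider]
    return provider, display_name, worker_agent
```

An explicit `--executor` value always wins over the environment default, and the value is normalised with `strip().lower()` before validation, so `" Codex "` resolves to `codex`.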
@@ -444,6 +462,11 @@ def prepare_task_bundle(inp: PrepareInputs) -> PrepareOutputs:
|
|
|
444
462
|
"GEMINI_WORKER_MODEL_EXECUTION_VALUE": ge.execution,
|
|
445
463
|
"REPORT_WRITER_MODEL_DISPLAY": rw.display,
|
|
446
464
|
"REPORT_WRITER_MODEL_EXECUTION_VALUE": rw.execution,
|
|
465
|
+
"EXECUTOR_PROVIDER": executor_provider,
|
|
466
|
+
"EXECUTOR_DISPLAY_NAME": executor_display_name,
|
|
467
|
+
"EXECUTOR_WORKER_AGENT": executor_worker_agent,
|
|
468
|
+
"EXECUTOR_MODEL_DISPLAY": executor_model_meta.display,
|
|
469
|
+
"EXECUTOR_MODEL_EXECUTION_VALUE": executor_model_meta.execution,
|
|
447
470
|
"RELATED_TASKS_JSON": related_tasks_json_str,
|
|
448
471
|
"RELATED_TASKS_BULLETS": bullets,
|
|
449
472
|
"RELATED_TASKS_INLINE": inline,
|
|
@@ -473,7 +496,16 @@ def prepare_task_bundle(inp: PrepareInputs) -> PrepareOutputs:
|
|
|
473
496
|
# ---- write instruction-set scaffolding ----
|
|
474
497
|
instruction_set = Path(ctx["INSTRUCTION_SET_DIR"])
|
|
475
498
|
instruction_set.mkdir(parents=True, exist_ok=True)
|
|
476
|
-
|
|
499
|
+
profile_rendered = profile_content
|
|
500
|
+
for key in (
|
|
501
|
+
"EXECUTOR_PROVIDER",
|
|
502
|
+
"EXECUTOR_DISPLAY_NAME",
|
|
503
|
+
"EXECUTOR_WORKER_AGENT",
|
|
504
|
+
"EXECUTOR_MODEL_DISPLAY",
|
|
505
|
+
"EXECUTOR_MODEL_EXECUTION_VALUE",
|
|
506
|
+
):
|
|
507
|
+
profile_rendered = profile_rendered.replace("{{" + key + "}}", ctx.get(key, ""))
|
|
508
|
+
(instruction_set / "analysis-profile.md").write_text(profile_rendered, encoding="utf-8")
|
|
477
509
|
(instruction_set / "analysis-material.md").write_text(review_material, encoding="utf-8")
|
|
478
510
|
shutil.copyfile(inp.brief_path, instruction_set / "task-brief.md")
|
|
479
511
|
if inp.clarification_response_path:
|
|
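The profile rendering loop above is plain string substitution over a fixed key list. The sketch below isolates that behaviour to make two consequences visible: placeholders outside the key list pass through untouched, and a listed key missing from the context becomes an empty string. `render_placeholders` is a hypothetical helper name, and `{{SOME_OTHER_TOKEN}}` is an invented placeholder.

```python
EXECUTOR_KEYS = (
    "EXECUTOR_PROVIDER",
    "EXECUTOR_DISPLAY_NAME",
    "EXECUTOR_WORKER_AGENT",
    "EXECUTOR_MODEL_DISPLAY",
    "EXECUTOR_MODEL_EXECUTION_VALUE",
)


def render_placeholders(template: str, ctx: dict, keys=EXECUTOR_KEYS) -> str:
    """Replace {{KEY}} tokens for the listed keys only; other tokens are left as-is."""
    rendered = template
    for key in keys:
        rendered = rendered.replace("{{" + key + "}}", ctx.get(key, ""))
    return rendered


template = (
    "- Executor provider: `{{EXECUTOR_PROVIDER}}`\n"
    "- Executor model: `{{EXECUTOR_MODEL_DISPLAY}}`\n"
    "- Untouched: {{SOME_OTHER_TOKEN}}\n"
)
rendered = render_placeholders(template, {"EXECUTOR_PROVIDER": "codex"})
```

Since only the executor keys are substituted, profiles that do not mention them come out byte-for-byte identical.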
@@ -512,6 +544,7 @@ def prepare_task_bundle(inp: PrepareInputs) -> PrepareOutputs:
|
|
|
512
544
|
"codexModel": co.display,
|
|
513
545
|
"geminiModel": ge.display,
|
|
514
546
|
"reportWriterModel": rw.display,
|
|
547
|
+
"executor": executor_provider,
|
|
515
548
|
"relatedTasks": inp.related_tasks_raw,
|
|
516
549
|
"approvedPlanPath": inp.approved_plan_path,
|
|
517
550
|
"clarificationResponsePath": inp.clarification_response_path,
|
|
@@ -603,6 +636,7 @@ def main(argv: list[str]) -> int:
|
|
|
603
636
|
p.add_argument("--codex-model", default="")
|
|
604
637
|
p.add_argument("--gemini-model", default="")
|
|
605
638
|
p.add_argument("--report-writer-model", default="")
|
|
639
|
+
p.add_argument("--executor", default="")
|
|
606
640
|
p.add_argument("--related-tasks", default="", dest="related_tasks_raw")
|
|
607
641
|
p.add_argument("--approved-plan", default="", dest="approved_plan_path")
|
|
608
642
|
p.add_argument("--clarification-response", default="", dest="clarification_response_path")
|
|
@@ -641,6 +675,7 @@ def main(argv: list[str]) -> int:
|
|
|
641
675
|
codex_model=args.codex_model,
|
|
642
676
|
gemini_model=args.gemini_model,
|
|
643
677
|
report_writer_model=args.report_writer_model,
|
|
678
|
+
executor=args.executor,
|
|
644
679
|
related_tasks_raw=args.related_tasks_raw,
|
|
645
680
|
approved_plan_path=args.approved_plan_path,
|
|
646
681
|
clarification_response_path=clarification_abs,
|
|
@@ -103,6 +103,7 @@ To re-run a specific run:
|
|
|
103
103
|
- `recommendedWorkers` → `--workers` (comma-separated)
|
|
104
104
|
- `relatedTasks` → `--related-tasks` (if present)
|
|
105
105
|
- model overrides → `--claude-model`, `--codex-model`, `--gemini-model` (if different from default)
|
|
106
|
+
- for `taskType: implementation`: `teamContract.executor.provider` → `--executor <claude|codex|gemini>` (if different from `claude`)
|
|
106
107
|
4. Display the assembled command:
|
|
107
108
|
|
|
108
109
|
```bash
|
|
@@ -128,8 +128,9 @@ Validate that slugified `task_group` and `task_id` each contain at least one alp
|
|
|
128
128
|
|
|
129
129
|
For existing tasks, present `nextRecommendedPhase` as the first option (recommended default).
|
|
130
130
|
|
|
131
|
-
If `implementation` chosen, ask
|
|
131
|
+
If `implementation` chosen, ask two more `AskUserQuestion` in order:
|
|
132
132
|
- `"Path to the approved final-report.md (must contain APPROVED marker)"` — the underlying python `prepare_task_bundle` re-validates the marker, but you can pre-check with `grep`.
|
|
133
|
+
- `"Executor provider for this run (claude | codex | gemini)?"` — only this provider mutates project files; the other two run as read-only verifiers. Default `claude` (or `OKSTRA_DEFAULT_EXECUTOR` if set). Pass the answer through `PrepareInputs.executor`.
|
|
133
134
|
|
|
134
135
|
## Step 5: Brief path
|
|
135
136
|
|
|
@@ -176,6 +177,7 @@ out = prepare_task_bundle(PrepareInputs(
|
|
|
176
177
|
lead_model="...", claude_model="...", codex_model="...",
|
|
177
178
|
gemini_model="...", report_writer_model="...",
|
|
178
179
|
related_tasks_raw="...",
|
|
180
|
+
executor="<claude|codex|gemini or empty>", # implementation only; empty → default (claude / OKSTRA_DEFAULT_EXECUTOR)
|
|
179
181
|
approved_plan_path="<approved-plan-or-empty>",
|
|
180
182
|
clarification_response_path=str(clarification_abs) if clarification_abs else "",
|
|
181
183
|
render_only=True,
|
|
@@ -96,7 +96,7 @@ so overwriting requires manually deleting the file first.
|
|
|
96
96
|
|
|
97
97
|
If the file does NOT exist, ask via `AskUserQuestion`:
|
|
98
98
|
|
|
99
|
-
- **Question**: `"Project id for okstra (e.g. INV-1234,
|
|
99
|
+
- **Question**: `"Project id for okstra (e.g. INV-1234, my-app, okstra)"`
|
|
100
100
|
- **Validate**: slugified must contain at least one alphanumeric character.
|
|
101
101
|
|
|
102
102
|
Then create the file:
|
|
@@ -42,6 +42,7 @@ Only workers selected from `recommendedWorkers` in `task-manifest.json` and `res
|
|
|
42
42
|
|
|
43
43
|
## Operating Rules
|
|
44
44
|
|
|
45
|
+
0. **TeamCreate ordering (BLOCKING).** Before issuing any `Agent` dispatch that includes `team_name`, Lead MUST have called `TeamCreate(team_name: "okstra-<task-key>", ...)` in this run and recorded the outcome in team-state as `teamCreate: { attempted: true, status: "ok"|"error", error?: <message> }`. If the Agent tool rejects a dispatch with `"team must be created first or call without team_name"` / `"team을 먼저 생성하거나 team_name 없이 호출해야 합니다"`, the correct response is to go back to Phase 3 and call `TeamCreate` — NOT to strip `team_name` and retry. The no-`team_name` Phase 5 fallback is only legal when `teamCreate.status == "error"` is already recorded; otherwise stripping `team_name` silently degrades the run to in-process background dispatch and loses the Teams split-pane behavior. See [okstra agent SKILL.md Phase 3](../../agents/SKILL.md) for the full team-creation sequence.
|
|
45
46
|
1. `Claude lead` is responsible for orchestration, convergence supervision, and final-report review/approval. It never overrides worker analysis results, and it never authors the final-report file when `Report writer worker` is in the roster.
|
|
46
47
|
2. `Report writer worker` is NOT an analysis worker. It is excluded from Phase 4/5 (initial analysis) and Phase 5.5 (convergence re-verification). It is spawned only in Phase 6 and is the **author** of the final-report file at `runs/<task-type>/reports/final-report-<task-type>-<seq>.md`.
|
|
47
48
|
3. When `Report writer worker` is in the roster, Lead MUST dispatch it in Phase 6. The only legal lead-authored fallback is when a dispatch was attempted and recorded a terminal status of `error` / `timeout` / `not-run` with a concrete logged reason. Speculative reasons such as "session resume constraint" or "team is no longer alive" are NOT valid — Lead can always dispatch a fresh subagent (omit `team_name` if the team is gone).
|
|
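Rule 0 above reduces to a decision on the recorded `teamCreate` state whenever a dispatch is rejected for a missing team. A minimal sketch follows, assuming team-state is available as a dictionary with the `teamCreate` shape quoted in the rule; the function names and the returned action labels are hypothetical.

```python
def may_omit_team_name(team_state: dict) -> bool:
    """The no-team_name Phase 5 fallback is legal only after a recorded TeamCreate error."""
    team_create = team_state.get("teamCreate") or {}
    return bool(team_create.get("attempted")) and team_create.get("status") == "error"


def next_action_after_rejection(team_state: dict) -> str:
    """Pick the follow-up when a dispatch with team_name is rejected (labels are illustrative)."""
    if may_omit_team_name(team_state):
        return "dispatch-without-team-name"
    # No recorded error: go back to Phase 3 and create the team first.
    return "call-TeamCreate"
```

With nothing recorded, or with `status: "ok"`, the sketch always routes back to `TeamCreate`, which matches the rule that stripping `team_name` is never the first response.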
@@ -83,19 +83,7 @@
|
|
|
83
83
|
"Bash(curl:*)",
|
|
84
84
|
"Bash(wget:*)",
|
|
85
85
|
"mcp__test-context7__resolve-library-id",
|
|
86
|
-
"mcp__test-context7__query-docs"
|
|
87
|
-
"mcp__mysql-fontsninja-common__mysql_list_tables",
|
|
88
|
-
"mcp__mysql-fontsninja-common__mysql_describe_table",
|
|
89
|
-
"mcp__mysql-fontsninja-common__mysql_select_data",
|
|
90
|
-
"mcp__mysql-fontsninja-fontradar__mysql_list_tables",
|
|
91
|
-
"mcp__mysql-fontsninja-fontradar__mysql_describe_table",
|
|
92
|
-
"mcp__mysql-fontsninja-fontradar__mysql_select_data",
|
|
93
|
-
"mcp__mysql-fontsninja-fontsninja__mysql_list_tables",
|
|
94
|
-
"mcp__mysql-fontsninja-fontsninja__mysql_describe_table",
|
|
95
|
-
"mcp__mysql-fontsninja-fontsninja__mysql_select_data",
|
|
96
|
-
"mcp__mysql-fontsninja-fonthelper__mysql_list_tables",
|
|
97
|
-
"mcp__mysql-fontsninja-fonthelper__mysql_describe_table",
|
|
98
|
-
"mcp__mysql-fontsninja-fonthelper__mysql_select_data"
|
|
86
|
+
"mcp__test-context7__query-docs"
|
|
99
87
|
]
|
|
100
88
|
}
|
|
101
89
|
}
|
|
@@ -127,24 +127,13 @@
|
|
|
127
127
|
|
|
128
128
|
## Available MCP Servers
|
|
129
129
|
|
|
130
|
-
The
|
|
130
|
+
 The MCP servers available to this run are declared in `.project-docs/okstra/project.json`'s `mcpServers` array and rendered into the Claude lead's launch prompt under `## Available MCP Servers`. They may be invoked **as needed** by Claude lead, Claude worker, and Report writer worker. The lead is responsible for forwarding the rendered list verbatim into the worker prompts (Phase 2) so workers know which tools they are allowed to call.
 
-
-|--------|--------------|------|-------------|
-| `mcp__mysql-fontsninja-common` | local Docker MySQL `common` schema | read-only | shared lookups, code list, reference data |
-| `mcp__mysql-fontsninja-fontradar` | local Docker MySQL `fontradar` schema | read-only | fontradar service data inspection |
-| `mcp__mysql-fontsninja-fontsninja` | local Docker MySQL `fontsninja` schema | read-only | fontsninja service data inspection |
-| `mcp__mysql-fontsninja-fonthelper` | local Docker MySQL `fonthelper` schema | read-only | fonthelper service data inspection |
-
-Available tools per server (all read-only — write tools are disabled at the server):
-
-- `mysql_list_tables`
-- `mysql_describe_table`
-- `mysql_select_data`
+To declare servers, add entries shaped `{ "name": "<server>", "description": "...", "tools": ["..."], "notes": "..." }` to that array. If the array is empty or absent, treat MCP as unavailable for this run.
 
 How to invoke (worker-by-worker):
 
-- **Claude lead / Claude worker / Report writer worker**: invoke the MCP tool **directly by its tool name** (e.g. `
+- **Claude lead / Claude worker / Report writer worker**: invoke the MCP tool **directly by its tool name** (e.g. `mcp__<server>__<tool>`) through the host's tool interface. **Do NOT call it via `Bash`** — these names are MCP tools, not shell commands; running them in a shell will always fail with `command not found` regardless of permission settings.
 - **Codex worker / Gemini worker**: invoke through the external CLI's own MCP transport (e.g. `codex mcp call <server> <tool> <args>` for Codex CLI; the equivalent Gemini CLI MCP invocation for Gemini). If the worker's CLI has no matching MCP config, treat the server as unavailable for this run and record `MCP not available in this CLI` in `Missing Information or Assumptions` — do **not** attempt a shell fallback such as `mysql -h ...` or piping a tool name into `bash`.
 - All workers: cite the exact server, tool, and SELECT (or `WHERE` filters) used in the result file. Tool-call failures must be logged in the worker's `*-errors.json` (commandKind `mcp_call`) so the lead can decide whether to retry under a different worker.
 
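The entry shape described in the added line above can be illustrated with a minimal Python sketch. It is not the package's actual renderer (the `render.py` changes are not shown in this part of the diff); the function name `render_mcp_section`, the bullet layout, and the wording of the "unavailable" message are all assumptions made for illustration. Only the `mcpServers` key, the four entry fields, the `mcp__<server>__<tool>` naming, and the empty-or-absent rule come from the text.

```python
from __future__ import annotations


def render_mcp_section(project: dict) -> str:
    """Render a hypothetical `## Available MCP Servers` block from project.json data."""
    servers = project.get("mcpServers") or []
    lines = ["## Available MCP Servers", ""]
    if not servers:
        # Empty or absent array: MCP is treated as unavailable for this run.
        lines.append("MCP is unavailable for this run.")
        return "\n".join(lines)
    for server in servers:
        name = server["name"]
        lines.append(f"- `{name}`: {server.get('description', '')}".rstrip())
        for tool in server.get("tools") or []:
            # Host-side tool names follow the mcp__<server>__<tool> pattern.
            lines.append(f"  - `mcp__{name}__{tool}`")
        if server.get("notes"):
            lines.append(f"  - notes: {server['notes']}")
    return "\n".join(lines)


if __name__ == "__main__":
    # "example-db" is a placeholder server name, not one shipped with okstra.
    sample = {
        "mcpServers": [
            {
                "name": "example-db",
                "description": "demo database",
                "tools": ["mysql_list_tables", "mysql_select_data"],
                "notes": "read-only",
            }
        ]
    }
    print(render_mcp_section(sample))
```

Keeping the server list in project configuration rather than in the template means the template no longer hard-codes one project's servers, which appears to be the motivation for removing the table in this hunk.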
package/runtime/validators/validate-run.py
CHANGED
@@ -4,6 +4,7 @@ from __future__ import annotations
 
 import argparse
 import json
+import os
 import sys
 from datetime import datetime, timezone
 from pathlib import Path
@@ -478,6 +479,137 @@ def validate_phase_boundary(
         )
 
 
+def _import_token_usage():
+    """Resolve and import the okstra_token_usage package across layouts.
+
+    Source tree: <repo>/scripts/okstra_token_usage
+    Built runtime: <runtime>/python/okstra_token_usage (next to validators/)
+    Installed: $OKSTRA_PYTHONPATH/okstra_token_usage (~/.okstra/lib/python)
+    """
+    here = Path(__file__).resolve().parent
+    candidates = [
+        here.parent / "scripts",
+        here.parent / "python",
+    ]
+    env_pp = os.environ.get("OKSTRA_PYTHONPATH", "").strip()
+    if env_pp:
+        candidates.append(Path(env_pp))
+    for candidate in candidates:
+        if candidate.is_dir() and (candidate / "okstra_token_usage").is_dir():
+            if str(candidate) not in sys.path:
+                sys.path.insert(0, str(candidate))
+            break
+    from okstra_token_usage.collect import collect # noqa: E402
+    from okstra_token_usage.report import substitute_final_report # noqa: E402
+    return collect, substitute_final_report
+
+
+def _needs_token_autofix(team_state: dict, report_path: Path) -> bool:
+    summary = team_state.get("usageSummary") or {}
+    if not summary or not summary.get("collectedAt"):
+        return True
+    if report_path.is_file():
+        content = report_path.read_text()
+        if any(p in content for p in TOKEN_PLACEHOLDERS):
+            return True
+    return False
+
+
+def _accuracy_failures(updated: dict) -> list[str]:
+    """Return human-readable reasons the collected usage is incomplete.
+
+    Goal: never let zero-valued usage be silently written or substituted into
+    the final report. If a session jsonl is missing, the operator must know
+    which one and why so they can re-collect — recording accurate token usage
+    is the contract this autofix preserves.
+    """
+    reasons: list[str] = []
+    lead_usage = updated.get("leadUsage") or {}
+    if lead_usage.get("source") == "unavailable":
+        reasons.append(
+            "lead Claude session jsonl was not found — "
+            f"{lead_usage.get('note', 'reason unknown')}. "
+            "Token usage cannot be recorded accurately until the lead session is locatable."
+        )
+    for worker in updated.get("workers") or []:
+        role = worker.get("role") or worker.get("workerId") or "<unknown worker>"
+        status = worker.get("status")
+        usage = worker.get("usage") or {}
+        if status == "completed" and usage.get("source") == "unavailable":
+            reasons.append(
+                f"worker `{role}` (status=completed) has no usage data — "
+                f"{usage.get('note', 'reason unknown')}."
+            )
+        if worker.get("agent") in ("codex", "gemini") and usage.get("source") != "unavailable":
+            if "cliTotalTokens" not in usage:
+                reasons.append(
+                    f"worker `{role}` ({worker.get('agent')}) wrapper jsonl was located "
+                    f"but its underlying CLI session usage was not — "
+                    f"{usage.get('cliNote', 'reason unknown')}."
+                )
+    return reasons
+
+
+def attempt_token_usage_autofix(
+    team_state: dict,
+    team_state_path: Path,
+    report_path: Path,
+    project_root: Path,
+) -> tuple[str, list[str]]:
+    """Run the Phase 7 token-usage collector in-process when artifacts indicate
+    Phase 7 was skipped.
+
+    Returns ``(state, messages)`` where ``state`` is one of:
+
+    - ``"skipped"`` — opt-out or autofix not needed; messages is empty.
+    - ``"recovered"`` — collector ran AND every session that should have a
+      jsonl was found; team-state is rewritten and the final report's token
+      placeholders are substituted with real values. messages carries a
+      single info line.
+    - ``"accuracy-failed"`` — collector ran but at least one expected
+      session is missing. Nothing is written to disk; messages contains the
+      contract violations the validator must surface so the operator can
+      re-collect accurately rather than ship a report containing zeros.
+    - ``"import-failed"`` / ``"collector-error"`` — autofix could not run;
+      caller falls back to the original contract failures.
+    """
+    if os.environ.get("OKSTRA_VALIDATE_NO_AUTOFIX") == "1":
+        return "skipped", []
+    if not _needs_token_autofix(team_state, report_path):
+        return "skipped", []
+    try:
+        collect, substitute_final_report = _import_token_usage()
+    except Exception as exc: # noqa: BLE001
+        return "import-failed", [f"okstra_token_usage import failed: {exc}"]
+    try:
+        updated = collect(team_state_path, project_root)
+    except Exception as exc: # noqa: BLE001
+        return "collector-error", [f"token-usage collector raised: {exc}"]
+
+    accuracy_problems = _accuracy_failures(updated)
+    if accuracy_problems:
+        # Refuse to persist zeroed usage. Surface specific reasons so the
+        # operator can locate the missing session(s) instead of silently
+        # shipping a report with `0` token counts.
+        return "accuracy-failed", [
+            f"Phase 7 token-usage auto-recovery refused to write incomplete data: {reason}"
+            for reason in accuracy_problems
+        ]
+
+    team_state_path.write_text(
+        json.dumps(updated, indent=2, ensure_ascii=False) + "\n"
+    )
+    replaced = substitute_final_report(report_path, updated)
+    detail = (
+        f"replaced {replaced} placeholder(s)"
+        if replaced > 0
+        else "no placeholders to replace"
+        if replaced == 0
+        else "report file missing"
+    )
+    return "recovered", [f"usageSummary repopulated; {detail}"]
+
+
 def main() -> int:
     parser = argparse.ArgumentParser(
         description="Validate okstra run contract artifacts."
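The `detail` expression at the end of `attempt_token_usage_autofix` chains two conditional expressions, which can be easy to misread. The sketch below isolates that expression in a hypothetical helper named `describe_replacements` to show how Python groups it. That `substitute_final_report` signals a missing report with a negative return value is an inference from the branch structure, since that function is not shown in this diff.

```python
def describe_replacements(replaced: int) -> str:
    """Mirror the chained conditional used for `detail` in the hunk above."""
    # Python parses this as: A if replaced > 0 else (B if replaced == 0 else C),
    # so the final branch is only reached for negative values.
    return (
        f"replaced {replaced} placeholder(s)"
        if replaced > 0
        else "no placeholders to replace"
        if replaced == 0
        else "report file missing"
    )


if __name__ == "__main__":
    for value in (3, 0, -1):
        print(value, "->", describe_replacements(value))
```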
@@ -527,7 +659,20 @@ def main() -> int:
     report_path = resolve_input(args.report)
     team_state = load_json(team_state_path)
 
+    autofix_state, autofix_messages = attempt_token_usage_autofix(
+        team_state, team_state_path, report_path, project_root
+    )
+    if autofix_state == "recovered":
+        team_state = load_json(team_state_path)
+        for msg in autofix_messages:
+            print(f"validate-run: Phase 7 auto-recovery — {msg}", file=sys.stderr)
+    elif autofix_state in ("import-failed", "collector-error"):
+        for msg in autofix_messages:
+            print(f"validate-run: Phase 7 auto-recovery skipped — {msg}", file=sys.stderr)
+
     failures: list[str] = []
+    if autofix_state == "accuracy-failed":
+        failures.extend(autofix_messages)
     contract = extract_contract(run_manifest, task_manifest, failures)
     validate_team_state(team_state, project_root, contract, failures)
     validate_report(report_path, contract["required_agent_status_entries"], failures)
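The way `main()` consumes the `(state, messages)` pair can be summarized with a small stand-alone sketch. The helper name `route_autofix_result` and its return shape are hypothetical; the state names and the rule that only `"accuracy-failed"` feeds the failure list are taken from the hunks above. The message prefixes are simplified here and do not reproduce the exact strings the validator prints.

```python
from __future__ import annotations


def route_autofix_result(state: str, messages: list[str]) -> tuple[list[str], list[str]]:
    """Split autofix output into (stderr_notes, contract_failures).

    Hypothetical helper mirroring the branching added to main().
    """
    notes: list[str] = []
    failures: list[str] = []
    if state == "recovered":
        # Informational only: team-state was rewritten and is reloaded by the caller.
        notes = [f"validate-run: Phase 7 auto-recovery: {m}" for m in messages]
    elif state in ("import-failed", "collector-error"):
        # Autofix could not run; the normal contract checks still decide the outcome.
        notes = [f"validate-run: Phase 7 auto-recovery skipped: {m}" for m in messages]
    elif state == "accuracy-failed":
        # Only this state turns autofix messages into validation failures.
        failures = list(messages)
    # "skipped" carries no messages and changes nothing.
    return notes, failures


if __name__ == "__main__":
    print(route_autofix_result("recovered", ["usageSummary repopulated; replaced 2 placeholder(s)"]))
    print(route_autofix_result("accuracy-failed", ["lead session missing"]))
    print(route_autofix_result("skipped", []))
```

According to the docstring in the diff, setting `OKSTRA_VALIDATE_NO_AUTOFIX=1` forces the `"skipped"` path, so the validator then behaves as it did before this change.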
package/src/setup.mjs
CHANGED
@@ -265,7 +265,7 @@ export async function run(args) {
       return 1;
     }
     process.stderr.write(`PROJECT_ROOT: ${projectRoot}\n`);
-    const answer = await prompt("project-id (e.g. INV-1234,
+    const answer = await prompt("project-id (e.g. INV-1234, my-app): ");
     projectId = answer;
   }
 