okstra 0.18.0 → 0.18.2

package/docs/kr/cli.md CHANGED
@@ -236,6 +236,11 @@ scripts/okstra.sh --task-type implementation-planning ... \
236
236
  scripts/okstra.sh --task-type implementation-planning --workers claude,codex --project-id jobs --task-group tasks --task-id 8852 --task-brief .project-docs/tasks/8852/BUG_REPORT.md
237
237
  ```
238
238
 
239
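+ > All `--*-model` flags accept only aliases registered in the per-provider mapping in `scripts/okstra_ctl/models.py`. Any unregistered value is rejected immediately with `UnknownModelError` (this pre-empts the contract-violation caused by a mismatch between the manifest's `modelExecutionValue` and the value actually executed). Allowed values: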
240
+ > - Claude (`--lead-model` / `--claude-model` / `--report-writer-model`): `opus`, `opus-4-7`, `claude-opus-4-7`, `sonnet`, `sonnet-4-6`, `claude-sonnet-4-6`, `haiku`, `haiku-4-5`, `claude-haiku-4-5`, `claude-haiku-4-5-20251001`
241
+ > - Codex (`--codex-model`): `gpt-5.5`, `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.3-codex`, `gpt-5.2`, `codex-auto-review`
242
+ > - Gemini (`--gemini-model`): `auto`, `pro`, `gemini-3-flash-preview`, `gemini-3-pro-preview` (plus the `gemini auto` / `gemini pro` aliases)
243
+
239
244
  ### `--claude-model`
240
245
 
241
246
  Specifies the model to use for the `Claude worker`.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "okstra",
3
- "version": "0.18.0",
3
+ "version": "0.18.2",
4
4
  "description": "Multi-agent cross-verification orchestrator runtime + Claude Code skills.",
5
5
  "license": "MIT",
6
6
  "author": "devonshin",
@@ -1,5 +1,5 @@
1
1
  {
2
- "package": "0.18.0",
3
- "builtAt": "2026-05-13T11:17:29.529Z",
2
+ "package": "0.18.2",
3
+ "builtAt": "2026-05-13T13:51:20.597Z",
4
4
  "repoRoot": "/home/runner/work/okstra/okstra"
5
5
  }
@@ -175,12 +175,21 @@ Spawn **analysis workers only** in the same turn (Phase 4 in Teams mode; Phase 5
175
175
 
176
176
  The no-`team_name` fallback (Phase 5) is only legal when team-state's `teamCreate.status` is `"error"` for this run. If `teamCreate` is missing or `attempted: false`, the correct action when an Agent dispatch is rejected for a missing team is to GO BACK to Phase 3 and call `TeamCreate` — never to strip `team_name` and continue.
177
177
 
178
- After each worker terminates (any terminal status), if a worker errors sidecar exists at `runs/.../worker-results/<role-slug>-errors.json`, dump it to the run error log:
178
+ ### Errors log path wiring (BLOCKING)
179
+
180
+ The launch prompt's `## Run Logs (error-log wiring)` section gives Lead the resolved absolute paths for the run-level errors log and every per-worker sidecar. When Lead constructs each worker's dispatch prompt body, Lead MUST inject the matching two header lines verbatim:
181
+
182
+ - `**Errors log path:** <absolute run-level errors log path from launch prompt>`
183
+ - `**Errors sidecar path:** <absolute per-worker sidecar path matching the dispatched worker>`
184
+
185
+ Workers are contractually required to extract these two lines and abort with `<WORKER>_ERRORS_PATH_MISSING` if either is absent (see each worker definition's "Path extraction (BLOCKING)" block). Omitting these headers reproduces the historical bug class where every run's `errors-<task-type>-<seq>.jsonl` stayed empty because workers had only template placeholders to work from.
186
+
187
+ After each worker terminates (any terminal status), if its errors sidecar exists, dump it to the run error log using the same resolved paths from the launch prompt:
179
188
 
180
189
  ```bash
181
190
  python3 scripts/okstra-error-log.py append-from-worker \
182
- --sidecar <sidecar> \
183
- --out <runDir>/logs/errors-<task-type>-<seq>.jsonl \
191
+ --sidecar <absolute-sidecar-path-from-launch-prompt> \
192
+ --out <absolute-errors-log-path-from-launch-prompt> \
184
193
  --task-key <taskKey> --agent <agent> --agent-role <role> --model <model>
185
194
  ```
186
195
 
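As a rough illustration of the header wiring described in this hunk, the Python sketch below prepends the two header lines to a dispatch prompt body. The function name `build_dispatch_prompt` and its validation rule are assumptions made for illustration; the diff does not show how Lead actually assembles prompts.

```python
from pathlib import PurePosixPath

ERRORS_LOG_HEADER = "**Errors log path:**"
ERRORS_SIDECAR_HEADER = "**Errors sidecar path:**"


def build_dispatch_prompt(body: str, errors_log: str, sidecar: str) -> str:
    """Prepend the two error-path header lines to a worker prompt body."""
    for label, path in (("errors log", errors_log), ("sidecar", sidecar)):
        # Assumed check: reject relative paths and unresolved <placeholders>,
        # since the contract asks for resolved absolute paths.
        if not PurePosixPath(path).is_absolute() or "<" in path:
            raise ValueError(f"{label} path must be absolute and resolved: {path!r}")
    headers = [
        f"{ERRORS_LOG_HEADER} {errors_log}",
        f"{ERRORS_SIDECAR_HEADER} {sidecar}",
    ]
    return "\n".join(headers) + "\n\n" + body
```

Rejecting a path that still contains `<...>` is one way to catch the template-placeholder mistake early, before a worker ever sees the prompt.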
@@ -261,7 +270,7 @@ After persistence, reply briefly in Korean with: completion status, final report
261
270
  | Skipping a worker silently | Always record terminal status with reason |
262
271
  | Writing verdict before all workers report | Wait for all results or explicit terminal statuses |
263
272
  | Ignoring task bundle model assignments | Task bundle overrides are canonical |
264
- | Sending identical prompts to all workers | Add role-specific emphasis per [okstra-team-contract](./skills/okstra-team-contract/SKILL.md) |
273
+ | Inserting per-worker emphasis sentences ("you focus on X") into dispatch prompts | Send byte-identical dispatch prompts per [okstra-team-contract](./skills/okstra-team-contract/SKILL.md) "Dispatch-prompt invariant" — specialization lives in Section 6 of the worker output, not the prompt body |
265
274
  | Omitting contested or worker-unique findings | All categories must appear in the report |
266
275
  | Running full re-analysis when lightweight suffices | Default lightweight; full only when manifest opts in |
267
276
  | Using `/tmp/*prompt*.txt` for worker prompt persistence | Persist the exact worker prompt to the assigned run-level `prompts/` path |
@@ -21,7 +21,9 @@ color: blue
21
21
  tools: ["Bash", "Read", "Write", "Edit", "Glob", "Grep", "TodoWrite", "WebFetch", "WebSearch"]
22
22
  ---
23
23
 
24
- You are a Claude worker agent for okstra cross-verification. Your emphasis: **broad reasoning quality, hidden assumptions, missing context, execution risk**.
24
+ You are a Claude worker agent for okstra cross-verification. You share an **identical core responsibility** with the Codex and Gemini workers: cover every brief question across feasibility, requirement interpretation, hidden assumptions, alternatives, and execution risk in sections 1–5 of the worker output. Cross-verification only triangulates if all three workers answer the same questions against the same brief.
25
+
26
+ Your specialization lens — **broad reasoning depth, hidden-assumption surfacing, execution-risk decomposition** — is the only content that belongs in optional Section 6 (additive, not subject to convergence). Do NOT let the lens narrow sections 1–5: a Claude-only "Findings" populated solely with assumption-class items is a contract violation.
25
27
 
26
28
  Unlike the Codex / Gemini workers, you are an in-process Claude subagent — you do NOT shell out to a CLI. Use your native tools (Read / Grep / Glob / MCP) directly.
27
29
 
@@ -92,13 +94,15 @@ If you find yourself thinking "let me double-check section 3" or "I should read
92
94
 
93
95
  This agent is responsible for recording its own tool failures via `scripts/okstra-error-log.py`:
94
96
 
95
- **Tool failure (worker-reported)** if `Write` of the prompt history file, `mkdir`, any MCP call, or any other pre-/in-analysis tool call fails (non-zero exit, exception, or empty result where data was required), append a `tool-failure` entry to the worker errors sidecar at:
97
+ **Path extraction (BLOCKING).** Before recording anything, extract the absolute sidecar path from the lead's dispatch prompt body:
98
+
99
+ - `**Errors sidecar path:** <abs-path>` — this worker's per-run sidecar JSON.
100
+
101
+ If the header line is absent from the dispatch prompt, return `CLAUDE_WORKER_ERRORS_PATH_MISSING: lead prompt did not include **Errors sidecar path:** header` without proceeding. Do NOT synthesize the path from `runs/<task-type>/...` — that template syntax is documentation only and historically led to silent log loss.
96
102
 
97
- ```
98
- runs/<task-type>/worker-results/claude-worker-errors-<task-type>-<seq>.json
99
- ```
103
+ **Tool failure (worker-reported)** — if `Write` of the prompt history file, `mkdir`, any MCP call, or any other pre-/in-analysis tool call fails (non-zero exit, exception, or empty result where data was required), append a `tool-failure` entry to the worker errors sidecar at the absolute path extracted from the `**Errors sidecar path:**` header. If the file does not exist, create it with `{"schemaVersion": 1, "errors": []}` then append.
100
104
 
101
- The sidecar follows the schema in `skills/okstra-team-contract/SKILL.md` (Optional errors sidecar). If the file does not exist, create it with `{"schemaVersion": 1, "errors": []}` then append. Lead will dump it to the run error log after this subagent terminates.
105
+ The sidecar follows the schema in `skills/okstra-team-contract/SKILL.md` (Optional errors sidecar). Lead will dump it to the run error log after this subagent terminates.
102
106
 
103
107
  There is NO `cli-failure` category for this worker — Claude worker has no external CLI to invoke. Treat MCP errors and Bash errors uniformly as `tool-failure`.
104
108
 
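A minimal sketch of the path-extraction and sidecar-append steps this hunk describes, assuming the header occupies its own line and the path is not wrapped in backticks. The helper names and the fields inside each error entry are hypothetical; only the header text, the `CLAUDE_WORKER_ERRORS_PATH_MISSING` message, and the `{"schemaVersion": 1, "errors": []}` skeleton come from the diff.

```python
import json
import re
from pathlib import Path
from typing import Optional

SIDECAR_HEADER = re.compile(
    r"^\*\*Errors sidecar path:\*\*[ \t]+(\S.*)$", re.MULTILINE
)
MISSING = (
    "CLAUDE_WORKER_ERRORS_PATH_MISSING: lead prompt did not include "
    "**Errors sidecar path:** header"
)


def extract_sidecar_path(prompt: str) -> Optional[Path]:
    """Return the absolute sidecar path from the header line, or None."""
    match = SIDECAR_HEADER.search(prompt)
    if match is None:
        return None
    path = Path(match.group(1).strip())
    # Never fall back to a synthesized path; a relative value counts as missing.
    return path if path.is_absolute() else None


def append_tool_failure(sidecar: Path, entry: dict) -> None:
    """Append one entry, creating the sidecar skeleton if it does not exist."""
    if sidecar.exists():
        data = json.loads(sidecar.read_text(encoding="utf-8"))
    else:
        data = {"schemaVersion": 1, "errors": []}
    data["errors"].append(entry)
    sidecar.parent.mkdir(parents=True, exist_ok=True)
    sidecar.write_text(json.dumps(data, indent=2), encoding="utf-8")
```

A caller would return `MISSING` as soon as `extract_sidecar_path` yields `None`, rather than continuing with a guessed location.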
@@ -106,4 +110,4 @@ There is NO `cli-failure` category for this worker — Claude worker has no exte
106
110
 
107
111
  - Return error messages as-is on failure.
108
112
  - Do not summarize or modify your own analysis output beyond the structured sections above.
109
- - Your emphasis: broad reasoning quality, hidden assumptions, missing context, execution risk distinct from Codex (implementation realism) and Gemini (requirement interpretation, alternative viewpoints).
113
+ - Sections 1–5 are the common core — same dimensions for every analysis worker. Your specialization (broad reasoning depth, hidden-assumption surfacing, execution-risk decomposition) only enters Section 6 if you have additive content beyond the core. See `skills/okstra-team-contract/SKILL.md` "Worker Output Contract" for the authoritative split.
@@ -148,19 +148,26 @@ This contract mirrors the `okstra-team-contract` skill's Worker Output Contract
148
148
  The wrapper agent (this Codex worker subagent) is responsible for recording
149
149
  two kinds of errors via `scripts/okstra-error-log.py`:
150
150
 
151
- 1. **Wrapper-internal tool failure (worker-reported)** if `Write` of the
152
- prompt history file, `mkdir`, or any pre-CLI tool call fails, append a
153
- `tool-failure` entry to the worker errors sidecar at:
151
+ **Path extraction (BLOCKING).** Before recording anything, extract the
152
+ following two absolute paths verbatim from the lead's dispatch prompt body:
154
153
 
155
- ```
156
- runs/<task-type>/worker-results/codex-worker-errors-<task-type>-<seq>.json
157
- ```
154
+ - `**Errors log path:** <abs-path>` — the run-level errors JSONL.
155
+ - `**Errors sidecar path:** <abs-path>` — this worker's per-run sidecar JSON.
158
156
 
159
- The sidecar follows the schema in
160
- `skills/okstra-team-contract/SKILL.md` (Optional errors sidecar). If the
161
- file does not exist, create it with `{"schemaVersion": 1, "errors": []}`
162
- then append. Lead will dump it to the run error log after this subagent
163
- terminates.
157
+ If either header line is absent from the dispatch prompt, return
158
+ `CODEX_ERRORS_PATH_MISSING: lead prompt did not include **Errors log path:** / **Errors sidecar path:** headers`
159
+ without proceeding. Do NOT synthesize the path from `<runDir>/logs/...`;
160
+ that reproduces the historical bug class: workers writing to a literally-named template path
161
+ and the run-level error log staying empty.
162
+
163
+ 1. **Wrapper-internal tool failure (worker-reported)** — if `Write` of the
164
+ prompt history file, `mkdir`, or any pre-CLI tool call fails, append a
165
+ `tool-failure` entry to the worker errors sidecar at the absolute path
166
+ extracted from the `**Errors sidecar path:**` header. If the file does
167
+ not exist, create it with `{"schemaVersion": 1, "errors": []}` then
168
+ append. The sidecar follows the schema in
169
+ `skills/okstra-team-contract/SKILL.md` (Optional errors sidecar). Lead
170
+ will dump it to the run error log after this subagent terminates.
164
171
 
165
172
  2. **CLI failure (lead-observed)** — if the wrapper's final `BashOutput`
166
173
  reports a non-zero `exit_code`, the 30-minute polling cap is hit, or the
@@ -171,7 +178,7 @@ two kinds of errors via `scripts/okstra-error-log.py`:
171
178
 
172
179
  ```bash
173
180
  python3 scripts/okstra-error-log.py append-observed \
174
- --out "<runDir>/logs/errors-<task-type>-<seq>.jsonl" \
181
+ --out "<absolute-errors-log-path-from-lead-prompt>" \
175
182
  --task-key "<task-key>" \
176
183
  --phase "<phase>" \
177
184
  --agent codex-worker --agent-role worker \
@@ -184,10 +191,10 @@ two kinds of errors via `scripts/okstra-error-log.py`:
184
191
  --stderr-excerpt-file "<captured-stderr-path or omit>"
185
192
  ```
186
193
 
187
- The lead prompt provides `<runDir>`, `<task-key>`, and `<phase>` alongside
188
- the prompt history path. If any of these are missing, fall back to logging
189
- to the worker errors sidecar instead — never silently swallow a CLI
190
- failure.
194
+ The lead prompt provides `**Errors log path:**`, `<task-key>`, and
195
+ `<phase>` alongside the prompt history path. If any of these are
196
+ missing, fall back to logging to the worker errors sidecar instead —
197
+ never silently swallow a CLI failure.
191
198
 
192
199
  Do not record a `cli-failure` for `CODEX_NOT_INSTALLED` returns — that is a
193
200
  pre-flight terminal status, not a runtime CLI error.
@@ -197,4 +204,4 @@ pre-flight terminal status, not a runtime CLI error.
197
204
  - Ignore stderr warnings from MCP integration.
198
205
  - Return error messages as-is on failure.
199
206
  - Do not summarize or modify Codex results.
200
- - Your emphasis: implementation realism, code-path implications, edge cases, technical trade-offs.
207
+ - Sections 1–5 of the worker output are the common core shared with the Claude and Gemini workers — the dispatched prompt asks identical questions for all three roles, and the Codex CLI must answer all of them, not only implementation-realism findings. Your specialization (implementation realism, code-path implications, edge cases, technical trade-offs) belongs only in optional Section 6 as additive depth. A Codex result whose Findings section is populated solely with implementation-feasibility items is in breach of contract; see `skills/okstra-team-contract/SKILL.md` "Worker Output Contract".
@@ -148,19 +148,26 @@ This contract mirrors the `okstra-team-contract` skill's Worker Output Contract
148
148
  The wrapper agent (this Gemini worker subagent) is responsible for recording
149
149
  two kinds of errors via `scripts/okstra-error-log.py`:
150
150
 
151
- 1. **Wrapper-internal tool failure (worker-reported)** if `Write` of the
152
- prompt history file, `mkdir`, or any pre-CLI tool call fails, append a
153
- `tool-failure` entry to the worker errors sidecar at:
151
+ **Path extraction (BLOCKING).** Before recording anything, extract the
152
+ following two absolute paths verbatim from the lead's dispatch prompt body:
154
153
 
155
- ```
156
- runs/<task-type>/worker-results/gemini-worker-errors-<task-type>-<seq>.json
157
- ```
154
+ - `**Errors log path:** <abs-path>` — the run-level errors JSONL.
155
+ - `**Errors sidecar path:** <abs-path>` — this worker's per-run sidecar JSON.
158
156
 
159
- The sidecar follows the schema in
160
- `skills/okstra-team-contract/SKILL.md` (Optional errors sidecar). If the
161
- file does not exist, create it with `{"schemaVersion": 1, "errors": []}`
162
- then append. Lead will dump it to the run error log after this subagent
163
- terminates.
157
+ If either header line is absent from the dispatch prompt, return
158
+ `GEMINI_ERRORS_PATH_MISSING: lead prompt did not include **Errors log path:** / **Errors sidecar path:** headers`
159
+ without proceeding. Do NOT synthesize the path from `<runDir>/logs/...`;
160
+ that reproduces the historical bug class: workers writing to a literally-named template path
161
+ and the run-level error log staying empty.
162
+
163
+ 1. **Wrapper-internal tool failure (worker-reported)** — if `Write` of the
164
+ prompt history file, `mkdir`, or any pre-CLI tool call fails, append a
165
+ `tool-failure` entry to the worker errors sidecar at the absolute path
166
+ extracted from the `**Errors sidecar path:**` header. If the file does
167
+ not exist, create it with `{"schemaVersion": 1, "errors": []}` then
168
+ append. The sidecar follows the schema in
169
+ `skills/okstra-team-contract/SKILL.md` (Optional errors sidecar). Lead
170
+ will dump it to the run error log after this subagent terminates.
164
171
 
165
172
  2. **CLI failure (lead-observed)** — if the wrapper's final `BashOutput`
166
173
  reports a non-zero `exit_code`, the 30-minute polling cap is hit, or the
@@ -171,7 +178,7 @@ two kinds of errors via `scripts/okstra-error-log.py`:
171
178
 
172
179
  ```bash
173
180
  python3 scripts/okstra-error-log.py append-observed \
174
- --out "<runDir>/logs/errors-<task-type>-<seq>.jsonl" \
181
+ --out "<absolute-errors-log-path-from-lead-prompt>" \
175
182
  --task-key "<task-key>" \
176
183
  --phase "<phase>" \
177
184
  --agent gemini-worker --agent-role worker \
@@ -184,10 +191,10 @@ two kinds of errors via `scripts/okstra-error-log.py`:
184
191
  --stderr-excerpt-file "<captured-stderr-path or omit>"
185
192
  ```
186
193
 
187
- The lead prompt provides `<runDir>`, `<task-key>`, and `<phase>` alongside
188
- the prompt history path. If any of these are missing, fall back to logging
189
- to the worker errors sidecar instead — never silently swallow a CLI
190
- failure.
194
+ The lead prompt provides `**Errors log path:**`, `<task-key>`, and
195
+ `<phase>` alongside the prompt history path. If any of these are
196
+ missing, fall back to logging to the worker errors sidecar instead —
197
+ never silently swallow a CLI failure.
191
198
 
192
199
  Do not record a `cli-failure` for `GEMINI_NOT_INSTALLED` returns — that is a
193
200
  pre-flight terminal status, not a runtime CLI error.
@@ -197,4 +204,4 @@ pre-flight terminal status, not a runtime CLI error.
197
204
  - Always specify the assigned `-m` value for the current run.
198
205
  - Return error messages as-is on failure.
199
206
  - Do not summarize or modify Gemini results.
200
- - Your emphasis: requirement interpretation, consistency, safety, documentation quality, alternative viewpoints.
207
+ - Sections 1–5 of the worker output are the common core shared with the Claude and Codex workers — the dispatched prompt asks identical questions for all three roles, and the Gemini CLI must answer all of them, not only requirement-interpretation findings. Your specialization (requirement interpretation, consistency, safety, documentation quality, alternative viewpoints) belongs only in optional Section 6 as additive depth. A Gemini result whose Findings section is populated solely with requirement-interpretation items is in breach of contract; see `skills/okstra-team-contract/SKILL.md` "Worker Output Contract".
@@ -46,6 +46,21 @@ Invoke the `okstra` skill now. Read the manifests below for all task metadata, p
46
46
  - Final status: `{{FINAL_STATUS_RELATIVE_PATH}}`
47
47
  - Validator: `{{RUN_VALIDATOR_RELATIVE_PATH}}`
48
48
 
49
+ ## Run Logs (error-log wiring)
50
+
51
+ - Run-level errors log (absolute): `{{RUN_ERRORS_LOG_PATH}}`
52
+ - Run-level errors log (relative): `{{RUN_ERRORS_LOG_RELATIVE_PATH}}`
53
+ - Worker error sidecars (absolute):
54
+ - Claude worker: `{{CLAUDE_WORKER_ERRORS_SIDECAR_PATH}}`
55
+ - Codex worker: `{{CODEX_WORKER_ERRORS_SIDECAR_PATH}}`
56
+ - Gemini worker: `{{GEMINI_WORKER_ERRORS_SIDECAR_PATH}}`
57
+ - Report writer worker: `{{REPORT_WRITER_WORKER_ERRORS_SIDECAR_PATH}}`
58
+ - When dispatching any worker you MUST inject **two header lines** into the dispatch prompt body so the worker subagent can record errors without guessing paths:
59
+ - `**Errors log path:** <absolute run-level errors log path>`
60
+ - `**Errors sidecar path:** <absolute per-worker sidecar path matching the dispatched worker>`
61
+ - These lines are the canonical contract — worker subagents extract them verbatim and pass them to `okstra-error-log.py append-observed --out ...` (run-level cli-failure / contract-violation events) and to their internal sidecar writes (worker-reported tool-failure events) respectively.
62
+ - After each worker terminates, dump its sidecar into the run-level errors log via `python3 scripts/okstra-error-log.py append-from-worker --sidecar <sidecar-path> --out <run-errors-log-path> --task-key {{TASK_KEY}} --agent <worker-id> --agent-role worker --model <assigned-model-execution-value>` (per `okstra-team-contract` Worker Output Contract).
63
+
49
64
  ## Executor Worktree
50
65
 
51
66
  - Status: `{{EXECUTOR_WORKTREE_STATUS}}`
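To make the placeholder wiring in this template hunk concrete, here is a small sketch of how such `{{...}}` tokens might be filled. It mirrors the `ctx.get(..., "")` pattern used in `render_template_file`, but the `render` and `leftover_placeholders` helpers are hypothetical and cover only three of the new placeholders.

```python
import re

# Placeholder -> context key, following the naming pattern in this diff:
# absolute placeholders read *_FILE keys, relative ones read *_RELATIVE_PATH keys.
PLACEHOLDER_KEYS = {
    "{{RUN_ERRORS_LOG_PATH}}": "RUN_ERRORS_LOG_FILE",
    "{{RUN_ERRORS_LOG_RELATIVE_PATH}}": "RUN_ERRORS_LOG_RELATIVE_PATH",
    "{{CODEX_WORKER_ERRORS_SIDECAR_PATH}}": "CODEX_WORKER_ERRORS_SIDECAR_FILE",
}


def render(template, ctx):
    """Substitute each known placeholder; missing keys render as empty strings."""
    for placeholder, key in PLACEHOLDER_KEYS.items():
        template = template.replace(placeholder, ctx.get(key, ""))
    return template


def leftover_placeholders(text):
    """Hypothetical safety check: list any {{NAME}} tokens still unresolved."""
    return re.findall(r"\{\{[A-Z_]+\}\}", text)
```

Because a missing context key silently becomes an empty string, a check such as `leftover_placeholders` (or a test for empty backtick pairs) may be worth running on the rendered launch prompt.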
@@ -4,8 +4,9 @@
4
4
  - Required workers:
5
5
  - claude
6
6
  - codex
7
- - gemini
8
7
  - report-writer
8
+ - Optional workers (opt-in via `--workers`):
9
+ - gemini — when added to the roster it joins the analyser set; omitted by default
9
10
  {{INCLUDE:_common-contract.md}}
10
11
  - Primary focus areas:
11
12
  - symptom and trigger clarification
@@ -4,8 +4,9 @@
4
4
  - Required workers:
5
5
  - claude
6
6
  - codex
7
- - gemini
8
7
  - report-writer
8
+ - Optional workers (opt-in via `--workers`):
9
+ - gemini — when added to the roster it joins the analyser set; omitted by default
9
10
  {{INCLUDE:_common-contract.md}}
10
11
  - Primary focus areas:
11
12
  - requirement coverage
@@ -4,8 +4,9 @@
4
4
  - Required workers:
5
5
  - claude
6
6
  - codex
7
- - gemini
8
7
  - report-writer
8
+ - Optional workers (opt-in via `--workers`):
9
+ - gemini — when added to the roster it joins the analyser set; omitted by default
9
10
  {{INCLUDE:_common-contract.md}}
10
11
  - Pre-planning context exploration (mandatory before option drafting):
11
12
  - read the task brief, related-task briefs, and any cited spec / design doc end-to-end
@@ -4,8 +4,9 @@
4
4
  - Required workers:
5
5
  - claude
6
6
  - codex
7
- - gemini
8
7
  - report-writer
8
+ - Optional workers (opt-in via `--workers`):
9
+ - gemini — when added to the roster it joins the analyser set; omitted by default
9
10
  {{INCLUDE:_common-contract.md}}
10
11
  - Primary focus areas:
11
12
  - classify the work as bugfix, feature, improvement, refactor, or ops-change
@@ -45,19 +45,36 @@ class ModelAssignment:
45
45
  execution: str
46
46
 
47
47
 
48
+ class UnknownModelError(ValueError):
49
+     """Raised when raw_value is not registered in the provider mapping."""
50
+
51
+
48
52
  def resolve_model_metadata(
49
53
      *, provider: str, raw_value: str, default_display: str, default_execution: str,
50
54
  ) -> ModelAssignment:
51
-     """alias → (display, execution_value). Unknown values pass through unchanged.
55
+     """alias → (display, execution_value).
52
56
 
53
57
      provider: "claude" | "codex" | "gemini" (no mapping is applied to any other value)
58
+
59
+     For a provider with a defined mapping, a non-empty raw_value supplied by the user
60
+     that is missing from the mapping raises `UnknownModelError`. This guards against a
61
+     phantom model such as `gpt-5.5-high`, which Codex does not support, being recorded
62
+     in the manifest so that the contract disagrees with the value actually executed.
54
63
      """
55
64
      raw_value = (raw_value or "").strip()
65
+     mapping = PROVIDER_MAPPINGS.get(provider)
66
+     if raw_value and mapping is not None:
67
+         normalized = raw_value.lower()
68
+         if normalized not in mapping:
69
+             allowed = ", ".join(sorted(mapping.keys()))
70
+             raise UnknownModelError(
71
+                 f"{provider} model {raw_value!r} is not a supported alias. "
72
+                 f"Allowed values: {allowed}"
73
+             )
56
74
      value = raw_value or default_display
57
75
      display = value
58
76
      execution = raw_value or default_execution
59
77
      normalized = value.strip().lower()
60
-     mapping = PROVIDER_MAPPINGS.get(provider)
61
78
      if mapping and normalized in mapping:
62
79
          display, execution = mapping[normalized]
63
80
      return ModelAssignment(display=display, execution=execution)
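The behavioural change in this hunk can be exercised with a self-contained sketch. The two-entry `PROVIDER_MAPPINGS` table and its display strings are invented for illustration (the real table lives in `scripts/okstra_ctl/models.py` and is not shown in the diff), and `resolve` is a condensed stand-in for `resolve_model_metadata`, not the packaged function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelAssignment:
    display: str
    execution: str


class UnknownModelError(ValueError):
    """Raised when raw_value is not registered in the provider mapping."""


# Hypothetical mapping: alias -> (display, execution value).
PROVIDER_MAPPINGS = {
    "codex": {"gpt-5.5": ("GPT-5.5", "gpt-5.5"), "gpt-5.4": ("GPT-5.4", "gpt-5.4")},
}


def resolve(provider, raw_value, default_display, default_execution):
    raw_value = (raw_value or "").strip()
    mapping = PROVIDER_MAPPINGS.get(provider)
    # New in 0.18.2: a non-empty value must be a registered alias.
    if raw_value and mapping is not None and raw_value.lower() not in mapping:
        allowed = ", ".join(sorted(mapping))
        raise UnknownModelError(
            f"{provider} model {raw_value!r} is not a supported alias. "
            f"Allowed values: {allowed}"
        )
    value = raw_value or default_display
    display, execution = value, raw_value or default_execution
    if mapping and value.lower() in mapping:
        display, execution = mapping[value.lower()]
    return ModelAssignment(display=display, execution=execution)
```

Under these assumptions, `resolve("codex", "gpt-5.5-high", "gpt-5.4", "gpt-5.4")` raises `UnknownModelError`, while an empty value still falls back to the defaults and a provider without a mapping passes its value through untouched.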
@@ -143,6 +143,12 @@ def compute_run_paths(
143
143
  gemini_worker_result = worker_results / f"gemini-worker{suffixes['worker_results']}.md"
144
144
  report_writer_worker_result = worker_results / f"report-writer-worker{suffixes['worker_results']}.md"
145
145
 
146
+ run_errors_log = run_logs / f"errors-{task_type_segment}-{seqs['state']}.jsonl"
147
+ claude_worker_errors_sidecar = worker_results / f"claude-worker-errors{suffixes['worker_results']}.json"
148
+ codex_worker_errors_sidecar = worker_results / f"codex-worker-errors{suffixes['worker_results']}.json"
149
+ gemini_worker_errors_sidecar = worker_results / f"gemini-worker-errors{suffixes['worker_results']}.json"
150
+ report_writer_worker_errors_sidecar = worker_results / f"report-writer-worker-errors{suffixes['worker_results']}.json"
151
+
146
152
  run_validator_script = workspace_root / "validators" / "validate-run.py"
147
153
 
148
154
  abs_paths = {
@@ -193,6 +199,11 @@ def compute_run_paths(
193
199
  "CODEX_WORKER_RESULT_FILE": str(codex_worker_result),
194
200
  "GEMINI_WORKER_RESULT_FILE": str(gemini_worker_result),
195
201
  "REPORT_WRITER_WORKER_RESULT_FILE": str(report_writer_worker_result),
202
+ "RUN_ERRORS_LOG_FILE": str(run_errors_log),
203
+ "CLAUDE_WORKER_ERRORS_SIDECAR_FILE": str(claude_worker_errors_sidecar),
204
+ "CODEX_WORKER_ERRORS_SIDECAR_FILE": str(codex_worker_errors_sidecar),
205
+ "GEMINI_WORKER_ERRORS_SIDECAR_FILE": str(gemini_worker_errors_sidecar),
206
+ "REPORT_WRITER_WORKER_ERRORS_SIDECAR_FILE": str(report_writer_worker_errors_sidecar),
196
207
  "RUN_VALIDATOR_SCRIPT": str(run_validator_script),
197
208
  "RUN_MANIFEST_FILENAME": run_manifest_file.name,
198
209
  "RUN_PROMPT_SNAPSHOT_FILENAME": run_prompt_snapshot.name,
@@ -245,6 +256,11 @@ def compute_run_paths(
245
256
  ("CODEX_WORKER_RESULT_RELATIVE_PATH", codex_worker_result),
246
257
  ("GEMINI_WORKER_RESULT_RELATIVE_PATH", gemini_worker_result),
247
258
  ("REPORT_WRITER_WORKER_RESULT_RELATIVE_PATH", report_writer_worker_result),
259
+ ("RUN_ERRORS_LOG_RELATIVE_PATH", run_errors_log),
260
+ ("CLAUDE_WORKER_ERRORS_SIDECAR_RELATIVE_PATH", claude_worker_errors_sidecar),
261
+ ("CODEX_WORKER_ERRORS_SIDECAR_RELATIVE_PATH", codex_worker_errors_sidecar),
262
+ ("GEMINI_WORKER_ERRORS_SIDECAR_RELATIVE_PATH", gemini_worker_errors_sidecar),
263
+ ("REPORT_WRITER_WORKER_ERRORS_SIDECAR_RELATIVE_PATH", report_writer_worker_errors_sidecar),
248
264
  ("LATEST_RUN_RELATIVE_PATH", run_dir),
249
265
  ]
250
266
  rel_paths = {key: _rel(project_root, target) for key, target in rel_pairs}
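The path derivation added to `compute_run_paths` follows a regular naming scheme, which the sketch below reproduces in isolation. The function name `error_log_paths`, the example directory, and the `-error-analysis-001` suffix are assumptions; the diff shows only that the suffix comes from `suffixes['worker_results']`.

```python
from pathlib import PurePosixPath

# Worker slugs as they appear in the sidecar file names in this hunk.
WORKERS = ("claude", "codex", "gemini", "report-writer")


def error_log_paths(run_logs, worker_results, task_type, seq, suffix):
    """Return the run-level errors log path plus one sidecar path per worker."""
    paths = {"RUN_ERRORS_LOG_FILE": str(run_logs / f"errors-{task_type}-{seq}.jsonl")}
    for worker in WORKERS:
        key = worker.upper().replace("-", "_") + "_WORKER_ERRORS_SIDECAR_FILE"
        paths[key] = str(worker_results / f"{worker}-worker-errors{suffix}.json")
    return paths


# Example values are made up for illustration.
run_dir = PurePosixPath("/workspace/runs/error-analysis")
example = error_log_paths(
    run_dir / "logs",
    run_dir / "worker-results",
    task_type="error-analysis",
    seq="001",
    suffix="-error-analysis-001",
)
```

Computing every path once and handing the resolved strings to the launch prompt is what lets workers avoid reconstructing paths from template syntax.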
@@ -1078,6 +1078,16 @@ def render_template_file(template_path: str, output_path: str, ctx: dict) -> Non
1078
1078
  "{{CODEX_WORKER_RESULT_RELATIVE_PATH}}": ctx.get("CODEX_WORKER_RESULT_RELATIVE_PATH", ""),
1079
1079
  "{{GEMINI_WORKER_RESULT_RELATIVE_PATH}}": ctx.get("GEMINI_WORKER_RESULT_RELATIVE_PATH", ""),
1080
1080
  "{{REPORT_WRITER_WORKER_RESULT_RELATIVE_PATH}}": ctx.get("REPORT_WRITER_WORKER_RESULT_RELATIVE_PATH", ""),
1081
+ "{{RUN_ERRORS_LOG_PATH}}": ctx.get("RUN_ERRORS_LOG_FILE", ""),
1082
+ "{{RUN_ERRORS_LOG_RELATIVE_PATH}}": ctx.get("RUN_ERRORS_LOG_RELATIVE_PATH", ""),
1083
+ "{{CLAUDE_WORKER_ERRORS_SIDECAR_PATH}}": ctx.get("CLAUDE_WORKER_ERRORS_SIDECAR_FILE", ""),
1084
+ "{{CLAUDE_WORKER_ERRORS_SIDECAR_RELATIVE_PATH}}": ctx.get("CLAUDE_WORKER_ERRORS_SIDECAR_RELATIVE_PATH", ""),
1085
+ "{{CODEX_WORKER_ERRORS_SIDECAR_PATH}}": ctx.get("CODEX_WORKER_ERRORS_SIDECAR_FILE", ""),
1086
+ "{{CODEX_WORKER_ERRORS_SIDECAR_RELATIVE_PATH}}": ctx.get("CODEX_WORKER_ERRORS_SIDECAR_RELATIVE_PATH", ""),
1087
+ "{{GEMINI_WORKER_ERRORS_SIDECAR_PATH}}": ctx.get("GEMINI_WORKER_ERRORS_SIDECAR_FILE", ""),
1088
+ "{{GEMINI_WORKER_ERRORS_SIDECAR_RELATIVE_PATH}}": ctx.get("GEMINI_WORKER_ERRORS_SIDECAR_RELATIVE_PATH", ""),
1089
+ "{{REPORT_WRITER_WORKER_ERRORS_SIDECAR_PATH}}": ctx.get("REPORT_WRITER_WORKER_ERRORS_SIDECAR_FILE", ""),
1090
+ "{{REPORT_WRITER_WORKER_ERRORS_SIDECAR_RELATIVE_PATH}}": ctx.get("REPORT_WRITER_WORKER_ERRORS_SIDECAR_RELATIVE_PATH", ""),
1081
1091
  "{{LEAD_MODEL}}": lead_model,
1082
1092
  "{{LEAD_MODEL_EXECUTION_VALUE}}": lead_model_execution,
1083
1093
  "{{CLAUDE_WORKER_MODEL}}": ctx.get("CLAUDE_WORKER_MODEL_DISPLAY", ""),
@@ -37,6 +37,8 @@ Configure this in the `convergence` block of `task-manifest.json`. If the block
37
37
 
38
38
  Read the worker result files generated in Phase 4/5 and extract individual findings.
39
39
 
40
+ **Convergence scope.** Convergence operates on sections 1–5 of the worker output (the common core, see `okstra-team-contract` "Worker Output Contract"). Section 6 ("Specialization Lens") is additive worker-specific depth and MUST NOT be fed into the consensus grouping, the verification queue, or the round-N reverify prompts. Carry Section 6 forward into the final report verbatim through the report-writer worker — do not let it inflate `unique` counts or trigger spurious `verification-error` statuses.
41
+
40
42
  1. In the "Findings" section of each worker's results, identify individual items by number (F-001, F-002, ...) and parse the ticket identifier attached to each item:
41
43
  - For table-form findings, read the `Ticket ID` column.
42
44
  - For bullet/numbered findings, parse `[TICKETID: <id>]` from the item title.
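One way to honour the convergence scope rule is to split each worker result by section number before any grouping takes place. The sketch below assumes sections are introduced by headings of the form `## <n>. <title>`, which the diff does not confirm; both function names are hypothetical.

```python
import re

# Assumed heading shape: "## 6. Specialization Lens".
SECTION_HEADING = re.compile(r"^##[ \t]+(\d+)\.[ \t]", re.MULTILINE)


def split_sections(markdown):
    """Map section number -> section text, heading line included."""
    matches = list(SECTION_HEADING.finditer(markdown))
    sections = {}
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        sections[int(match.group(1))] = markdown[match.start():end]
    return sections


def partition_for_convergence(markdown):
    """Return (core sections 1-5 for convergence, Section 6 text or None)."""
    sections = split_sections(markdown)
    core = {n: text for n, text in sections.items() if 1 <= n <= 5}
    lens = sections.get(6)  # carried forward verbatim, never grouped
    return core, lens
```

Only the `core` mapping would feed consensus grouping and the verification queue; `lens` would be passed unchanged to the report-writer worker.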
@@ -184,31 +184,41 @@ Single `AskUserQuestion` first: `"기본 워커/모델로 진행할까요, 아
184
184
  - `Use defaults` → all overrides remain empty.
185
185
  - `Customize` → the prompts you ask depend on the `task_type` chosen in Step 4. Blank answer always means "use default". Never call the prompt label "worker CSV" — use plain Korean labels as shown below.
186
186
 
187
+ ### Model selection options (used by 6a and 6b)
188
+
189
+ All model prompts MUST use `AskUserQuestion` with a fixed option list — never free text. This prevents typos like `gpt-5.5-high` (a non-existent model) reaching the manifest. The options below are derived from `scripts/okstra_ctl/models.py` `*_MAPPING` and show "default + 3 latest". Blank/`default` means "use phase default".
190
+
191
+ - **Claude (lead / claude-worker / report-writer)** options: `default`, `opus`, `sonnet`, `haiku`
192
+ - **Codex (codex-worker)** options: `default`, `gpt-5.5`, `gpt-5.4`, `gpt-5.4-mini`
193
+ - **Gemini (gemini-worker)** options: `default`, `gemini-3-pro-preview`, `gemini-3-flash-preview`, `auto`
194
+
195
+ When the user picks `default`, pass an empty string to the corresponding `--*-model` flag; pass any other option verbatim. If the user truly needs a value outside the list (e.g. a pinned long-form id), they can type it via the question's built-in `Other`, but the four canonical options cover the supported set, so `Other` should be rare.
196
+
187
197
  ### 6a. `implementation` phase (executor-driven)
188
198
 
189
- In this phase the roster is fixed by the profile (executor + two verifiers + report-writer). The Step 4 `executor` answer already determines who mutates code; verifier models use phase-specific defaults (`Claude verifier`=sonnet, `Codex verifier`=gpt-5.5, `Gemini verifier`=auto). So ask **only three model prompts**, plus directive/related/clarification:
199
+ In this phase the roster is fixed by the profile (executor + two verifiers + report-writer). The Step 4 `executor` answer already determines who mutates code; verifier models use phase-specific defaults (`Claude verifier`=sonnet, `Codex verifier`=gpt-5.5, `Gemini verifier`=auto). So ask **only three model prompts** (each via `AskUserQuestion` with options from the list above), plus directive/related/clarification:
190
200
 
191
- 1. `"리더(Claude lead) 모델? ( 칸 = 기본값)"` → `lead_model`
192
- 2. `"실행자({executor-provider}) 모델? ( = 기본값)"` → maps to `claude_model` / `codex_model` / `gemini_model` based on the Step 4 executor choice. The other two provider model fields stay empty (verifiers use defaults).
193
- 3. `"리포트 작성자(report-writer) 모델? ( 칸 = 기본값)"` → `report_writer_model`
194
- 4. `"추가 directive (선택, 빈 칸 가능)"` → `directive`
195
- 5. `"관련 task id 목록, 쉼표 구분 (선택, 빈 칸 가능)"` → `related_tasks_raw`
201
+ 1. `AskUserQuestion` `"리더(Claude lead) 모델?"` (Claude options) → `lead_model`
202
+ 2. `AskUserQuestion` `"실행자({executor-provider}) 모델?"` with options matching the executor's provider (Claude / Codex / Gemini list above) → maps to `claude_model` / `codex_model` / `gemini_model`. The other two provider model fields stay empty (verifiers use defaults).
203
+ 3. `AskUserQuestion` `"리포트 작성자(report-writer) 모델?"` (Claude options) → `report_writer_model`
204
+ 4. `AskUserQuestion` `"추가 directive (선택, 빈 칸 가능)"` (free text) → `directive`
205
+ 5. `AskUserQuestion` `"관련 task id 목록, 쉼표 구분 (선택, 빈 칸 가능)"` (free text) → `related_tasks_raw`
196
206
 
197
207
  Do NOT ask for `workers_override` in implementation — the profile's required roster must be preserved (verifier slots are mandatory). Leave `workers_override=""`.
198
208
 
199
209
  ### 6b. Other phases (`requirements-discovery`, `error-analysis`, `implementation-planning`, `final-verification`, `release-handoff`)
200
210
 
201
- Ask each in turn (free text, blank = default):
211
+ Ask each in turn (model prompts use `AskUserQuestion` with the option lists above; others are free text):
202
212
 
203
- 1. `"참여 워커 목록 (쉼표 구분, 빈 칸 = 프로필 기본값 claude,codex,report-writer). 선택지: claude, codex, gemini, report-writer — gemini는 옵션이므로 필요할 때 명시"` → `workers_override`
204
- 2. `"리더(Claude lead) 모델? (빈 칸 = 기본값)"` → `lead_model`
205
- 3. `"claude 워커 모델? (빈 칸 = 기본값)"` → `claude_model`
206
- 4. `"codex 워커 모델? (빈 칸 = 기본값)"` → `codex_model`
207
- 5. `"gemini 워커 모델? (빈 칸 = 기본값)"` → `gemini_model`
208
- 6. `"리포트 작성자 모델? (빈 칸 = 기본값)"` → `report_writer_model`
209
- 7. `"추가 directive (선택, 빈 칸 가능)"` → `directive`
210
- 8. `"관련 task id 목록, 쉼표 구분 (선택, 빈 칸 가능)"` → `related_tasks_raw`
211
- 9. `"clarification-response 파일 경로 (follow-up 시에만, 빈 칸 가능)"` → `clarification_response_path`
213
+ 1. `AskUserQuestion` `"참여 워커 목록 (쉼표 구분, 빈 칸 = 프로필 기본값 claude,codex,report-writer). 선택지: claude, codex, gemini, report-writer — gemini는 옵션이므로 필요할 때 명시"` (free text) → `workers_override`
214
+ 2. `AskUserQuestion` `"리더(Claude lead) 모델?"` (Claude options) → `lead_model`
215
+ 3. `AskUserQuestion` `"claude 워커 모델?"` (Claude options) → `claude_model`
216
+ 4. `AskUserQuestion` `"codex 워커 모델?"` (Codex options) → `codex_model`
217
+ 5. `AskUserQuestion` `"gemini 워커 모델?"` (Gemini options) → `gemini_model`
218
+ 6. `AskUserQuestion` `"리포트 작성자 모델?"` (Claude options) → `report_writer_model`
219
+ 7. `AskUserQuestion` `"추가 directive (선택, 빈 칸 가능)"` (free text) → `directive`
220
+ 8. `AskUserQuestion` `"관련 task id 목록, 쉼표 구분 (선택, 빈 칸 가능)"` (free text) → `related_tasks_raw`
221
+ 9. `AskUserQuestion` `"clarification-response 파일 경로 (follow-up 시에만, 빈 칸 가능)"` (free text) → `clarification_response_path`
212
222
 
213
223
  For prompts whose target worker is NOT in the resolved workers list (after override), skip the prompt and present a single line such as `gemini-model 생략 (workers에 gemini 없음)`.
214
224
 
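The skip rule for prompts whose worker is absent from the resolved list could look roughly like the sketch below. The names `MODEL_PROMPT_TARGETS` and `plan_model_prompts` are hypothetical, and the sketch assumes that the lead prompt is always asked (the lead is not a worker) and that an empty override falls back to the profile default `claude,codex,report-writer` quoted in the prompt text.

```python
# Hypothetical mapping from model field to the worker it configures.
MODEL_PROMPT_TARGETS = {
    "claude_model": "claude",
    "codex_model": "codex",
    "gemini_model": "gemini",
    "report_writer_model": "report-writer",
}


def plan_model_prompts(workers_override, profile_default=("claude", "codex", "report-writer")):
    """Return (fields to ask, skip notices) for the resolved workers list."""
    requested = [w.strip() for w in workers_override.split(",") if w.strip()]
    workers = requested or list(profile_default)  # blank override = profile default

    ask = ["lead_model"]  # assumption: the lead model prompt is never skipped
    skipped = []
    for field, worker in MODEL_PROMPT_TARGETS.items():
        if worker in workers:
            ask.append(field)
        else:
            # One-line notice in the same shape as the example in the text.
            flag = field.replace("_", "-")
            skipped.append(f"{flag} 생략 (workers에 {worker} 없음)")
    return ask, skipped
```

With a blank override, only the gemini prompt is skipped, since gemini is opt-in in the default roster.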
@@ -224,7 +234,7 @@ Before invoking `okstra render-bundle`, echo the resolved selections back to the
224
234
  executor : codex
225
235
  workers : (프로필 기본 — executor + verifier 2 + report-writer)
226
236
  lead-model : default (opus)
227
- codex-model : gpt-5.5-high ← executor model
237
+ codex-model : gpt-5.5 ← executor model
228
238
  claude-model : default (sonnet) ← verifier
229
239
  gemini-model : default (auto) ← verifier
230
240
  report-writer : default (opus)
@@ -17,13 +17,17 @@ okstra tasks are always operated using the `Claude lead` + required worker team
17
17
 
18
18
  ### Role Definitions
19
19
 
20
- | Role | Responsibilities | Default Model | subagent_type | Notes |
21
- |------|------|-----------|---------------|------|
22
- | Claude lead | orchestration + convergence supervision + final-report review/approval | opus | -- | Does NOT author the final-report file when `Report writer worker` is in the roster |
23
- | Claude worker | Inference quality, hidden assumptions, execution risks | sonnet | claude-worker | `agents/claude-worker.md` |
24
- | Codex worker | Implementation feasibility, code paths, edge cases | gpt-5.5 | codex-worker | `agents/codex-worker.md` |
25
- | Gemini worker | Requirement interpretation, consistency, safety, alternatives | auto | gemini-worker | `agents/gemini-worker.md` |
26
- | Report writer worker | **Authors** the final-report file in Phase 6. NOT an analysis worker. | opus | report-writer-worker | `agents/report-writer-worker.md`. Excluded from Phase 4/5 and convergence |
20
+ **All analysis workers (Claude / Codex / Gemini) share an identical core responsibility.** Specialization is additive — it lives in optional Section 6 of the worker output, NOT in differentiated core questions. This is intentional: cross-verification only converges if all three workers are answering the same questions against the same brief. Disjoint per-worker scopes produce union-of-perspectives, not triangulation.
21
+
22
+ | Role | Core responsibility | Specialization lens (Section 6 only) | Default Model | subagent_type | Notes |
23
+ |------|------|------|-----------|---------------|------|
24
+ | Claude lead | orchestration + convergence supervision + final-report review/approval | | opus | -- | Does NOT author the final-report file when `Report writer worker` is in the roster |
25
+ | Claude worker | Answer every brief question across feasibility, requirement interpretation, hidden assumptions, and alternatives — with file:line evidence | broad reasoning depth, hidden assumptions, execution-risk surfacing | sonnet | claude-worker | `agents/claude-worker.md` |
26
+ | Codex worker | Same core responsibility as Claude worker: identical questions, identical sections 1–5 | implementation realism, code-path implications, edge cases, technical trade-offs | gpt-5.5 | codex-worker | `agents/codex-worker.md` |
27
+ | Gemini worker | Same core responsibility as Claude worker — identical questions, identical sections 1–5 | requirement interpretation, consistency, safety, alternative viewpoints | auto | gemini-worker | `agents/gemini-worker.md` |
28
+ | Report writer worker | **Authors** the final-report file in Phase 6. NOT an analysis worker. | — | opus | report-writer-worker | `agents/report-writer-worker.md`. Excluded from Phase 4/5 and convergence |
29
+
30
+ **Dispatch-prompt invariant (BLOCKING).** Lead's dispatch prompt body for Claude / Codex / Gemini workers MUST be byte-identical except for the role label and any wrapper-specific path headers (e.g. `**Worktree:**`, `**Errors sidecar path:**`). Lead MUST NOT bias the brief by inserting per-worker emphasis sentences ("you focus on X") into the body. Bias-by-prompt reproduces the historical failure mode where Claude commented only on assumptions, Codex only on code paths, and Gemini only on requirements — leaving convergence with nothing to converge on.
27
31
 
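One way to check the dispatch-prompt invariant before sending is to normalize away the parts that are allowed to differ and compare the remaining bytes. The sketch below is an assumption-laden illustration, not okstra code: `normalized_body` and `bodies_match` are hypothetical names, and the allowed headers are limited to the two examples named in the rule (`**Worktree:**`, `**Errors sidecar path:**`).

```python
import re

# Header lines that may differ per worker; only the two examples from the rule are covered.
PATH_HEADER = re.compile(r"^\*\*(Worktree|Errors sidecar path):\*\*.*$", re.MULTILINE)


def normalized_body(prompt: str, role_label: str) -> bytes:
    """Blank out wrapper-specific path headers and the role label, then encode."""
    without_headers = PATH_HEADER.sub("", prompt)
    return without_headers.replace(role_label, "<ROLE>").encode("utf-8")


def bodies_match(prompts: dict) -> bool:
    """`prompts` maps role label -> full dispatch prompt; True if the bodies are byte-identical."""
    normalized = {normalized_body(text, role) for role, text in prompts.items()}
    return len(normalized) <= 1
```

A per-worker emphasis sentence such as "you focus on X" survives normalization, so `bodies_match` returns `False` and the lead can stop before dispatching a biased prompt.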
28
32
  ### Model Assignment Rules
29
33
 
@@ -72,7 +76,7 @@ If the task brief contains an `## Available MCP Servers` section, copy that sect
72
76
 
73
77
  Before dispatching any required worker, lead persists the exact worker prompt to the assigned current-run prompt history path under `runs/<task-type>/prompts/`. Do not use `/tmp/*prompt*.txt` as the canonical artifact path.
74
78
 
75
- Do not send identical undifferentiated prompts to every worker unless that is clearly the best option. Role-specific emphasis (Phase 2 of `okstra` skill) is the canonical guidance.
79
+ Send byte-identical dispatch prompts to every analysis worker (Claude / Codex / Gemini), modulo the role label and the wrapper-specific path headers enumerated in the "Dispatch-prompt invariant" rule of the Role Definitions section. The prior "role-specific emphasis" guidance is retired — emphasis in the body biases each worker toward its lens and silently kills convergence (see Role Definitions for the failure mode). Specialization lives in Section 6 of the worker output, not in the dispatch prompt body.
76
80
 
77
81
  ### Required reading clause (analysis workers + report-writer worker)
78
82
 
@@ -206,6 +210,11 @@ A successful worker result must include the following sections in this exact ord
206
210
  3. Safe or Reasonable Areas
207
211
  4. Uncertain Points
208
212
  5. Recommended Next Actions
213
+ 6. **Specialization Lens (optional, worker-specific deep dive)** — additive content produced from this worker's specialization lens (see Role Definitions table). Items here are NOT subject to convergence cross-verification and MUST NOT duplicate sections 1–5. If the worker has nothing additional to add from its lens, omit Section 6 entirely or write `- No additional lens-specific findings.`
214
+
215
+ **Sections 1–5 are the common core.** Every analysis worker (Claude / Codex / Gemini) MUST cover the same set of dimensions in sections 1–5 — feasibility, requirement interpretation, hidden assumptions, alternatives, and execution risk — regardless of which model is producing the result. The point of running three models is to triangulate the same answer space, not to partition it. A worker that produces "Findings" populated only with items from its own lens (e.g. Codex only listing implementation-feasibility findings) is in breach of contract; convergence will treat coverage gaps as `verification-error`.
216
+
217
+ **Section 6 is the only legal home for specialization.** When the worker has a depth-of-perspective contribution that genuinely sits outside the common core (e.g. a Codex-specific stack-trace decomposition, a Claude-specific assumption-chain teardown, a Gemini-specific alternative-architecture sketch), put it there. Lead and convergence treat Section 6 as additive context, not as input to consensus measurement.
209
218
 
210
219
  Code evidence must include file paths and line numbers.
211
220
 
@@ -263,12 +272,27 @@ Schema:
263
272
  ```
264
273
 
265
274
  Workers MUST omit `source` / `recordedAt` / `agent` / `agentRole` / `model` /
266
- `taskKey`. Claude lead fills those in when dumping the sidecar to
267
- `runs/<task-type>/logs/errors-<task-type>-<seq>.jsonl` via
268
- `scripts/okstra-error-log.py append-from-worker`.
275
+ `taskKey`. Claude lead fills those in when dumping the sidecar to the
276
+ run-level errors log (`runs/<task-type>/logs/errors-<task-type>-<seq>.jsonl`)
277
+ via `scripts/okstra-error-log.py append-from-worker`.
269
278
 
270
279
  Workers MUST use only `errorType: "tool-failure"` in the **sidecar file**.
271
280
 
281
+ **Path delivery contract (BLOCKING).** Workers do NOT synthesize the
282
+ run-level errors log path or their sidecar path from the
283
+ `runs/<task-type>/...` template syntax. Both absolute paths are delivered
284
+ by Lead via two dispatch-prompt header lines:
285
+
286
+ - `**Errors log path:** <absolute path>` — run-level JSONL (`okstra-error-log.py append-observed --out ...`)
287
+ - `**Errors sidecar path:** <absolute path>` — per-worker JSON (`{ "schemaVersion": 1, "errors": [...] }`)
288
+
289
+ Lead obtains both paths from the launch prompt's `## Run Logs (error-log
290
+ wiring)` section (resolved by the okstra runtime via `paths.py`). If Lead
291
+ omits either header, the worker MUST return `<WORKER>_ERRORS_PATH_MISSING`
292
+ without proceeding — this is the contractual replacement for the previous
293
+ "derive from template placeholders" behavior, which silently produced
294
+ empty run-level error logs in production.
295
+
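On the worker side, the path delivery contract amounts to reading two header lines and refusing to continue if either is absent. The following is a minimal sketch under stated assumptions: `resolve_error_paths` and `ErrorsPathMissing` are hypothetical names, and expanding `<WORKER>` to the upper-cased worker name is a guess at the token format, not something the contract text spells out.

```python
import re

# Matches the two header lines named in the contract; the path must sit on the same line.
HEADER = re.compile(
    r"^\*\*(Errors log path|Errors sidecar path):\*\*[ \t]*(\S.*)$",
    re.MULTILINE,
)


class ErrorsPathMissing(Exception):
    """Carries the token the worker returns instead of proceeding."""


def resolve_error_paths(dispatch_prompt: str, worker: str):
    """Return (errors_log_path, sidecar_path) exactly as delivered by the lead."""
    found = {label: path.strip() for label, path in HEADER.findall(dispatch_prompt)}
    log_path = found.get("Errors log path")
    sidecar_path = found.get("Errors sidecar path")
    if not log_path or not sidecar_path:
        # Never derive the paths from the runs/<task-type>/... template; fail loudly.
        raise ErrorsPathMissing(f"{worker.upper()}_ERRORS_PATH_MISSING")
    return log_path, sidecar_path
```

Raising instead of falling back keeps the failure visible, which is the point of the contract: a synthesized path is what previously left the run-level log empty.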
272
296
  - `cli-failure` events are recorded by the wrapper subagent itself (Codex / Gemini), but **directly to the run-level error log** via `okstra-error-log.py append-observed --error-type cli-failure ...` — NOT via the sidecar. The sidecar is an in-process tool-failure channel only.
273
297
  - **Wrapper invocation arity.** Both `okstra-codex-exec.sh` and `okstra-gemini-exec.sh` accept four positional arguments: `<project-root> <model> <prompt-path> [<worktree-path>]`. The fourth (worktree) argument is **mandatory for implementation phase** and optional otherwise. For codex it becomes `--add-dir <worktree>` (sandbox write access); for gemini it is appended to `--include-directories`. Omitting it during implementation causes the codex sandbox to reject every Edit/Write targeting the worktree with EPERM. Workers extract the path from the `**Worktree:**` / `EXECUTOR_WORKTREE_PATH` / `cwd for every mutating command:` line in the lead prompt.
274
298
  - **Background dispatch + polling contract (Codex / Gemini wrappers).** Both wrapper subagents MUST dispatch `okstra-codex-exec.sh` / `okstra-gemini-exec.sh` via `Bash(run_in_background: true)` and poll with `BashOutput(bash_id)` on a 60-second cadence, capped at 30 minutes (1800s). The legacy "single foreground `Bash` with 120000ms timeout" rule is retired — it forced workers into ad-hoc background dispatch that lost stdout and silently broke Phase 5 synthesis. The new rule applies in **every phase** (analysis runs typically complete in 1–2 polls, so there is no regression for short jobs). Recording responsibilities: