okstra 0.18.1 → 0.18.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/docs/kr/cli.md +5 -0
- package/package.json +1 -1
- package/runtime/BUILD.json +2 -2
- package/runtime/agents/SKILL.md +1 -1
- package/runtime/agents/workers/claude-worker.md +4 -2
- package/runtime/agents/workers/codex-worker.md +1 -1
- package/runtime/agents/workers/gemini-worker.md +1 -1
- package/runtime/python/okstra_ctl/models.py +19 -2
- package/runtime/python/okstra_token_usage/collect.py +104 -28
- package/runtime/skills/okstra-convergence/SKILL.md +2 -0
- package/runtime/skills/okstra-report-writer/SKILL.md +1 -1
- package/runtime/skills/okstra-run/SKILL.md +27 -17
- package/runtime/skills/okstra-team-contract/SKILL.md +17 -8
- package/runtime/templates/reports/final-report.template.md +22 -8
package/docs/kr/cli.md
CHANGED
|
@@ -236,6 +236,11 @@ scripts/okstra.sh --task-type implementation-planning ... \
|
|
|
236
236
|
scripts/okstra.sh --task-type implementation-planning --workers claude,codex --project-id jobs --task-group tasks --task-id 8852 --task-brief .project-docs/tasks/8852/BUG_REPORT.md
|
|
237
237
|
```
|
|
238
238
|
|
|
239
|
+
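> Every `--*-model` flag accepts only aliases registered in the per-provider mapping in `scripts/okstra_ctl/models.py`. Unregistered values are rejected immediately with `UnknownModelError` (this blocks, ahead of time, contract violations caused by a mismatch between the manifest's `modelExecutionValue` and the value actually executed). Allowed values: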
|
|
240
|
+
> - Claude (`--lead-model` / `--claude-model` / `--report-writer-model`): `opus`, `opus-4-7`, `claude-opus-4-7`, `sonnet`, `sonnet-4-6`, `claude-sonnet-4-6`, `haiku`, `haiku-4-5`, `claude-haiku-4-5`, `claude-haiku-4-5-20251001`
|
|
241
|
+
> - Codex (`--codex-model`): `gpt-5.5`, `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.3-codex`, `gpt-5.2`, `codex-auto-review`
|
|
242
|
+
> - Gemini (`--gemini-model`): `auto`, `pro`, `gemini-3-flash-preview`, `gemini-3-pro-preview` (plus the `gemini auto` / `gemini pro` aliases)
|
|
243
|
+
|
|
239
244
|
### `--claude-model`
|
|
240
245
|
|
|
241
246
|
Specifies the model to use for the `Claude worker`.
|
package/package.json
CHANGED
package/runtime/BUILD.json
CHANGED
package/runtime/agents/SKILL.md
CHANGED
|
@@ -270,7 +270,7 @@ After persistence, reply briefly in Korean with: completion status, final report
|
|
|
270
270
|
| Skipping a worker silently | Always record terminal status with reason |
|
|
271
271
|
| Writing verdict before all workers report | Wait for all results or explicit terminal statuses |
|
|
272
272
|
| Ignoring task bundle model assignments | Task bundle overrides are canonical |
|
|
273
|
-
|
|
|
273
|
+
| Inserting per-worker emphasis sentences ("you focus on X") into dispatch prompts | Send byte-identical dispatch prompts per [okstra-team-contract](./skills/okstra-team-contract/SKILL.md) "Dispatch-prompt invariant" — specialization lives in Section 6 of the worker output, not the prompt body |
|
|
274
274
|
| Omitting contested or worker-unique findings | All categories must appear in the report |
|
|
275
275
|
| Running full re-analysis when lightweight suffices | Default lightweight; full only when manifest opts in |
|
|
276
276
|
| Using `/tmp/*prompt*.txt` for worker prompt persistence | Persist the exact worker prompt to the assigned run-level `prompts/` path |
|
package/runtime/agents/workers/claude-worker.md
CHANGED
|
@@ -21,7 +21,9 @@ color: blue
|
|
|
21
21
|
tools: ["Bash", "Read", "Write", "Edit", "Glob", "Grep", "TodoWrite", "WebFetch", "WebSearch"]
|
|
22
22
|
---
|
|
23
23
|
|
|
24
|
-
You are a Claude worker agent for okstra cross-verification.
|
|
24
|
+
You are a Claude worker agent for okstra cross-verification. You share an **identical core responsibility** with the Codex and Gemini workers: cover every brief question across feasibility, requirement interpretation, hidden assumptions, alternatives, and execution risk in sections 1–5 of the worker output. Cross-verification only triangulates if all three workers answer the same questions against the same brief.
|
|
25
|
+
|
|
26
|
+
Your specialization lens — **broad reasoning depth, hidden-assumption surfacing, execution-risk decomposition** — is the only content that belongs in optional Section 6 (additive, not subject to convergence). Do NOT let the lens narrow sections 1–5: a Claude-only "Findings" populated solely with assumption-class items is a contract violation.
|
|
25
27
|
|
|
26
28
|
Unlike the Codex / Gemini workers, you are an in-process Claude subagent — you do NOT shell out to a CLI. Use your native tools (Read / Grep / Glob / MCP) directly.
|
|
27
29
|
|
|
@@ -108,4 +110,4 @@ There is NO `cli-failure` category for this worker — Claude worker has no exte
|
|
|
108
110
|
|
|
109
111
|
- Return error messages as-is on failure.
|
|
110
112
|
- Do not summarize or modify your own analysis output beyond the structured sections above.
|
|
111
|
-
- Your
|
|
113
|
+
- Sections 1–5 are the common core — same dimensions for every analysis worker. Your specialization (broad reasoning depth, hidden-assumption surfacing, execution-risk decomposition) only enters Section 6 if you have additive content beyond the core. See `skills/okstra-team-contract/SKILL.md` "Worker Output Contract" for the authoritative split.
|
package/runtime/agents/workers/codex-worker.md
CHANGED
|
@@ -204,4 +204,4 @@ pre-flight terminal status, not a runtime CLI error.
|
|
|
204
204
|
- Ignore stderr warnings from MCP integration.
|
|
205
205
|
- Return error messages as-is on failure.
|
|
206
206
|
- Do not summarize or modify Codex results.
|
|
207
|
-
- Your
|
|
207
|
+
- Sections 1–5 of the worker output are the common core shared with the Claude and Gemini workers — the dispatched prompt asks identical questions for all three roles, and the Codex CLI must answer all of them, not only implementation-realism findings. Your specialization (implementation realism, code-path implications, edge cases, technical trade-offs) belongs only in optional Section 6 as additive depth. A Codex result whose Findings section is populated solely with implementation-feasibility items is in breach of contract; see `skills/okstra-team-contract/SKILL.md` "Worker Output Contract".
|
package/runtime/agents/workers/gemini-worker.md
CHANGED
|
@@ -204,4 +204,4 @@ pre-flight terminal status, not a runtime CLI error.
|
|
|
204
204
|
- Always specify the assigned `-m` value for the current run.
|
|
205
205
|
- Return error messages as-is on failure.
|
|
206
206
|
- Do not summarize or modify Gemini results.
|
|
207
|
-
- Your
|
|
207
|
+
- Sections 1–5 of the worker output are the common core shared with the Claude and Codex workers — the dispatched prompt asks identical questions for all three roles, and the Gemini CLI must answer all of them, not only requirement-interpretation findings. Your specialization (requirement interpretation, consistency, safety, documentation quality, alternative viewpoints) belongs only in optional Section 6 as additive depth. A Gemini result whose Findings section is populated solely with requirement-interpretation items is in breach of contract; see `skills/okstra-team-contract/SKILL.md` "Worker Output Contract".
|
package/runtime/python/okstra_ctl/models.py
CHANGED
|
@@ -45,19 +45,36 @@ class ModelAssignment:
|
|
|
45
45
|
execution: str
|
|
46
46
|
|
|
47
47
|
|
|
48
|
+
class UnknownModelError(ValueError):
|
|
49
|
+
"""Raised when raw_value is not registered in the provider mapping."""
|
|
50
|
+
|
|
51
|
+
|
|
48
52
|
def resolve_model_metadata(
|
|
49
53
|
*, provider: str, raw_value: str, default_display: str, default_execution: str,
|
|
50
54
|
) -> ModelAssignment:
|
|
51
|
-
"""alias → (display, execution_value).
|
|
55
|
+
"""alias → (display, execution_value).
|
|
52
56
|
|
|
53
57
|
provider: "claude" | "codex" | "gemini" (no mapping is applied to any other provider)
|
|
58
|
+
|
|
59
|
+
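For a provider that has a mapping defined, if the caller supplies a non-empty
|
|
60
|
+
raw_value that is not in the mapping, raise `UnknownModelError`. This prevents a
|
|
61
|
+
phantom model such as `gpt-5.5-high`, which Codex does not support, from being
|
|
62
|
+
recorded in the manifest, leaving the contract out of sync with the actual execution value.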
|
|
54
63
|
"""
|
|
55
64
|
raw_value = (raw_value or "").strip()
|
|
65
|
+
mapping = PROVIDER_MAPPINGS.get(provider)
|
|
66
|
+
if raw_value and mapping is not None:
|
|
67
|
+
normalized = raw_value.lower()
|
|
68
|
+
if normalized not in mapping:
|
|
69
|
+
allowed = ", ".join(sorted(mapping.keys()))
|
|
70
|
+
raise UnknownModelError(
|
|
71
|
+
f"{provider} model {raw_value!r} is not a supported alias. "
|
|
72
|
+
f"Allowed values: {allowed}"
|
|
73
|
+
)
|
|
56
74
|
value = raw_value or default_display
|
|
57
75
|
display = value
|
|
58
76
|
execution = raw_value or default_execution
|
|
59
77
|
normalized = value.strip().lower()
|
|
60
|
-
mapping = PROVIDER_MAPPINGS.get(provider)
|
|
61
78
|
if mapping and normalized in mapping:
|
|
62
79
|
display, execution = mapping[normalized]
|
|
63
80
|
return ModelAssignment(display=display, execution=execution)
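The validation added in this hunk can be sketched in isolation as follows. This is a minimal illustration, not the packaged module: the one-entry `PROVIDER_MAPPINGS` table and its display string are assumptions made for the example (the real table is not part of this diff), and `resolve` is a hypothetical, shortened stand-in for `resolve_model_metadata`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelAssignment:
    display: str
    execution: str


class UnknownModelError(ValueError):
    """Raised when raw_value is not registered in the provider mapping."""


# Hypothetical stand-in table: alias -> (display, execution_value).
PROVIDER_MAPPINGS = {
    "codex": {"gpt-5.5": ("GPT-5.5", "gpt-5.5")},
}


def resolve(provider, raw_value, default_display, default_execution):
    raw_value = (raw_value or "").strip()
    mapping = PROVIDER_MAPPINGS.get(provider)
    # Reject early: a non-empty value for a mapped provider must be a known alias.
    if raw_value and mapping is not None and raw_value.lower() not in mapping:
        allowed = ", ".join(sorted(mapping))
        raise UnknownModelError(
            f"{provider} model {raw_value!r} is not a supported alias. "
            f"Allowed values: {allowed}"
        )
    value = raw_value or default_display
    display, execution = value, raw_value or default_execution
    if mapping and value.lower() in mapping:
        display, execution = mapping[value.lower()]
    return ModelAssignment(display=display, execution=execution)
```

Under these assumptions, `resolve("codex", "gpt-5.5-high", ...)` raises before anything is written to the manifest, while an empty value still falls back to the defaults and a provider without a mapping passes through unchecked.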
|
package/runtime/python/okstra_token_usage/collect.py
CHANGED
|
@@ -2,6 +2,7 @@
|
|
|
2
2
|
from __future__ import annotations
|
|
3
3
|
|
|
4
4
|
import json
|
|
5
|
+
from datetime import datetime
|
|
5
6
|
from pathlib import Path
|
|
6
7
|
from .blocks import na_block, usage_block
|
|
7
8
|
from .claude import claude_session_totals, find_claude_team_sessions
|
|
@@ -11,6 +12,76 @@ from .paths import claude_project_dir, utc_now
|
|
|
11
12
|
from .pricing import codex_cost_usd, gemini_cost_usd
|
|
12
13
|
|
|
13
14
|
|
|
15
|
+
def match_prefixes(worker_id: str) -> list[str]:
|
|
16
|
+
"""Return the agentName prefixes that should be attributed to ``worker_id``.
|
|
17
|
+
|
|
18
|
+
The Agent harness records the `name` arg on every dispatch as `agentName`
|
|
19
|
+
in the subagent jsonl. Lead frequently appends suffixes (`-002`,
|
|
20
|
+
`-reverify-r1`, `-impl`, `-2`) when it dispatches the same role multiple
|
|
21
|
+
times or in different sub-flows. We treat every `agentName` matching one of
|
|
22
|
+
these prefixes — either exactly or as `<prefix>-<suffix>` — as belonging
|
|
23
|
+
to this worker so its tokens get aggregated. For implementation runs the
|
|
24
|
+
executor variant `<provider>-executor` is also attributed back to the
|
|
25
|
+
matching provider worker.
|
|
26
|
+
"""
|
|
27
|
+
if not worker_id:
|
|
28
|
+
return []
|
|
29
|
+
if worker_id == "report-writer":
|
|
30
|
+
return ["report-writer"]
|
|
31
|
+
prefixes = [worker_id]
|
|
32
|
+
if not worker_id.endswith("-worker"):
|
|
33
|
+
prefixes.append(f"{worker_id}-worker")
|
|
34
|
+
prefixes.append(f"{worker_id}-executor")
|
|
35
|
+
return prefixes
|
|
36
|
+
|
|
37
|
+
|
|
38
|
+
def agent_matches(agent_name: str, prefixes: list[str]) -> bool:
|
|
39
|
+
if not agent_name:
|
|
40
|
+
return False
|
|
41
|
+
for prefix in prefixes:
|
|
42
|
+
if agent_name == prefix or agent_name.startswith(f"{prefix}-"):
|
|
43
|
+
return True
|
|
44
|
+
return False
|
|
45
|
+
|
|
46
|
+
|
|
47
|
+
def _aggregate_totals(items: list[dict]) -> dict:
|
|
48
|
+
"""Sum token + tool counters across multiple session totals dicts.
|
|
49
|
+
|
|
50
|
+
`startedAt` / `endedAt` collapse to the union window; `durationMs` is
|
|
51
|
+
recomputed from that window so re-tries and convergence rounds count
|
|
52
|
+
against a single contiguous span. `model` and `agentName` keep the first
|
|
53
|
+
non-empty value (the canonical role identity).
|
|
54
|
+
"""
|
|
55
|
+
aggregate: dict = {
|
|
56
|
+
"totalTokens": 0, "inputTokens": 0, "outputTokens": 0,
|
|
57
|
+
"cacheCreationTokens": 0, "cacheReadTokens": 0,
|
|
58
|
+
"toolUses": 0, "durationMs": 0,
|
|
59
|
+
"agentName": None, "model": None,
|
|
60
|
+
"startedAt": None, "endedAt": None,
|
|
61
|
+
}
|
|
62
|
+
for t in items:
|
|
63
|
+
for k in ("totalTokens", "inputTokens", "outputTokens",
|
|
64
|
+
"cacheCreationTokens", "cacheReadTokens", "toolUses"):
|
|
65
|
+
aggregate[k] += t.get(k, 0) or 0
|
|
66
|
+
if aggregate["agentName"] is None and t.get("agentName"):
|
|
67
|
+
aggregate["agentName"] = t["agentName"]
|
|
68
|
+
if aggregate["model"] is None and t.get("model"):
|
|
69
|
+
aggregate["model"] = t["model"]
|
|
70
|
+
s, e = t.get("startedAt"), t.get("endedAt")
|
|
71
|
+
if s and (aggregate["startedAt"] is None or s < aggregate["startedAt"]):
|
|
72
|
+
aggregate["startedAt"] = s
|
|
73
|
+
if e and (aggregate["endedAt"] is None or e > aggregate["endedAt"]):
|
|
74
|
+
aggregate["endedAt"] = e
|
|
75
|
+
if aggregate["startedAt"] and aggregate["endedAt"]:
|
|
76
|
+
try:
|
|
77
|
+
a = datetime.fromisoformat(aggregate["startedAt"].replace("Z", "+00:00"))
|
|
78
|
+
b = datetime.fromisoformat(aggregate["endedAt"].replace("Z", "+00:00"))
|
|
79
|
+
aggregate["durationMs"] = max(0, int((b - a).total_seconds() * 1000))
|
|
80
|
+
except ValueError:
|
|
81
|
+
pass
|
|
82
|
+
return aggregate
|
|
83
|
+
|
|
84
|
+
|
|
14
85
|
def collect(team_state_path: Path, project_root: Path | None = None) -> dict:
|
|
15
86
|
state = json.loads(team_state_path.read_text())
|
|
16
87
|
cwd = project_root or _infer_project_root(team_state_path, state)
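The prefix rule introduced above can be exercised on its own. The sketch below restates `match_prefixes` and `agent_matches` under one stated assumption: the diff view has lost indentation, so placing both `append` calls under the `if not worker_id.endswith("-worker")` branch is a guess. The sample agent names are illustrative.

```python
from __future__ import annotations


def match_prefixes(worker_id: str) -> list[str]:
    """Return the agentName prefixes attributed to ``worker_id``."""
    if not worker_id:
        return []
    if worker_id == "report-writer":
        return ["report-writer"]
    prefixes = [worker_id]
    # Assumption: both appends sit under this `if` (indentation is not
    # recoverable from the diff view).
    if not worker_id.endswith("-worker"):
        prefixes.append(f"{worker_id}-worker")
        prefixes.append(f"{worker_id}-executor")
    return prefixes


def agent_matches(agent_name: str, prefixes: list[str]) -> bool:
    """True when agent_name equals a prefix or extends it as `<prefix>-<suffix>`."""
    return bool(agent_name) and any(
        agent_name == p or agent_name.startswith(f"{p}-") for p in prefixes
    )


codex = match_prefixes("codex")
print(codex)
print(agent_matches("codex-worker-reverify-r1", codex))  # re-dispatch suffix
print(agent_matches("codexx-worker", codex))  # different role, no hyphen boundary
```

Requiring the hyphen boundary is what keeps a name such as `codexx-worker` from being attributed to `codex`, while suffixed re-dispatches still aggregate under the same worker.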
|
|
@@ -26,19 +97,20 @@ def collect(team_state_path: Path, project_root: Path | None = None) -> dict:
|
|
|
26
97
|
team_name = f"okstra-{task_id}" if task_id else ""
|
|
27
98
|
lead_sid = (state.get("lead") or {}).get("sessionId")
|
|
28
99
|
|
|
29
|
-
# 1) Claude sessions (lead + claude-side workers).
|
|
100
|
+
# 1) Claude sessions (lead + claude-side workers). Cache totals at scan
|
|
101
|
+
# time so we don't re-read the jsonl when a worker matches multiple
|
|
102
|
+
# sessions.
|
|
30
103
|
claude_sessions = find_claude_team_sessions(cwd, team_name, lead_sid)
|
|
31
|
-
by_agent: dict[str, tuple[str, Path]] = {}
|
|
104
|
+
by_agent: dict[str, list[tuple[str, Path, dict]]] = {}
|
|
32
105
|
lead_path: Path | None = None
|
|
33
106
|
for sid, path in claude_sessions.items():
|
|
34
107
|
if sid == lead_sid:
|
|
35
108
|
lead_path = path
|
|
36
109
|
continue
|
|
37
|
-
# Read agentName lazily.
|
|
38
110
|
totals = claude_session_totals(path)
|
|
39
111
|
agent = totals.get("agentName")
|
|
40
112
|
if agent:
|
|
41
|
-
by_agent[
|
|
113
|
+
by_agent.setdefault(agent, []).append((sid, path, totals))
|
|
42
114
|
|
|
43
115
|
# Lead.
|
|
44
116
|
if lead_path is not None:
|
|
@@ -50,35 +122,39 @@ def collect(team_state_path: Path, project_root: Path | None = None) -> dict:
|
|
|
50
122
|
f"lead session jsonl not found under {claude_project_dir(cwd)} (sessionId={lead_sid})"
|
|
51
123
|
)
|
|
52
124
|
|
|
53
|
-
# Workers
|
|
125
|
+
# Workers — match by prefix and aggregate every session that belongs to
|
|
126
|
+
# the same role (re-dispatches with `-002`, convergence `-reverify-r1`,
|
|
127
|
+
# implementation `-executor`, report-writer `-impl` / `-2`, etc.).
|
|
54
128
|
for worker in state.get("workers", []):
|
|
55
129
|
worker_id = worker.get("workerId")
|
|
56
130
|
agent = worker.get("agent")
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
if
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
wrapper = by_agent[cand]
|
|
69
|
-
break
|
|
70
|
-
if wrapper is None:
|
|
71
|
-
worker["usage"] = na_block(f"no Claude subagent jsonl found with agentName matching {agent_name_candidates}")
|
|
131
|
+
prefixes = match_prefixes(worker_id) if worker_id else []
|
|
132
|
+
|
|
133
|
+
matched: list[tuple[str, Path, dict]] = []
|
|
134
|
+
for agent_name, entries in by_agent.items():
|
|
135
|
+
if agent_matches(agent_name, prefixes):
|
|
136
|
+
matched.extend(entries)
|
|
137
|
+
|
|
138
|
+
if not matched:
|
|
139
|
+
worker["usage"] = na_block(
|
|
140
|
+
f"no Claude subagent jsonl found with agentName matching prefixes {prefixes}"
|
|
141
|
+
)
|
|
72
142
|
continue
|
|
73
|
-
sid, path = wrapper
|
|
74
|
-
totals = claude_session_totals(path)
|
|
75
|
-
block = usage_block(totals, source="claude-jsonl")
|
|
76
|
-
block["sessionId"] = sid
|
|
77
143
|
|
|
78
|
-
#
|
|
79
|
-
|
|
80
|
-
|
|
81
|
-
|
|
144
|
+
# Stable order by startedAt so the "primary" session is the first one.
|
|
145
|
+
matched.sort(key=lambda x: x[2].get("startedAt") or "")
|
|
146
|
+
primary_sid, _primary_path, _primary_totals = matched[0]
|
|
147
|
+
aggregate = _aggregate_totals([t for _, _, t in matched])
|
|
148
|
+
block = usage_block(aggregate, source="claude-jsonl")
|
|
149
|
+
block["sessionId"] = primary_sid
|
|
150
|
+
if len(matched) > 1:
|
|
151
|
+
block["additionalSessionIds"] = [sid for sid, _, _ in matched[1:]]
|
|
152
|
+
block["matchedAgentNames"] = sorted({t.get("agentName") for _, _, t in matched if t.get("agentName")})
|
|
153
|
+
|
|
154
|
+
# For codex/gemini workers, find every CLI session that fell inside the
|
|
155
|
+
# aggregated wrapper window.
|
|
156
|
+
wrapper_start = aggregate.get("startedAt") or ""
|
|
157
|
+
wrapper_end = aggregate.get("endedAt") or ""
|
|
82
158
|
if agent in ("codex", "gemini"):
|
|
83
159
|
if agent == "codex":
|
|
84
160
|
cli = find_codex_session(cwd, wrapper_start, wrapper_end)
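To make the effect of the aggregated wrapper window concrete, here is a compact sketch of the union-window idea behind `_aggregate_totals`. It is a simplified illustration: `aggregate`, the reduced counter set, and the sample sessions are hypothetical, and the timestamps are assumed to share one ISO-8601 format so that string comparison orders them correctly.

```python
from datetime import datetime

COUNTERS = ("totalTokens", "inputTokens", "outputTokens", "toolUses")


def aggregate(sessions):
    """Sum counters and collapse timestamps to a single union window."""
    out = {k: sum(s.get(k, 0) or 0 for s in sessions) for k in COUNTERS}
    starts = [s["startedAt"] for s in sessions if s.get("startedAt")]
    ends = [s["endedAt"] for s in sessions if s.get("endedAt")]
    out["startedAt"] = min(starts) if starts else None
    out["endedAt"] = max(ends) if ends else None
    out["durationMs"] = 0
    if starts and ends:
        a = datetime.fromisoformat(out["startedAt"].replace("Z", "+00:00"))
        b = datetime.fromisoformat(out["endedAt"].replace("Z", "+00:00"))
        out["durationMs"] = max(0, int((b - a).total_seconds() * 1000))
    return out


# Illustrative sessions: a first dispatch and a later re-verification round.
sessions = [
    {"totalTokens": 1200, "startedAt": "2025-01-01T10:00:00Z", "endedAt": "2025-01-01T10:05:00Z"},
    {"totalTokens": 300, "startedAt": "2025-01-01T10:20:00Z", "endedAt": "2025-01-01T10:30:00Z"},
]
result = aggregate(sessions)
print(result["totalTokens"], result["durationMs"])
```

Note the design consequence: the two sessions were active for 15 minutes in total, but the reported duration covers the whole 30-minute span, and that same span is what the Codex / Gemini CLI session lookup receives as `wrapper_start` and `wrapper_end`.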
|
package/runtime/skills/okstra-convergence/SKILL.md
CHANGED
|
@@ -37,6 +37,8 @@ Configure this in the `convergence` block of `task-manifest.json`. If the block
|
|
|
37
37
|
|
|
38
38
|
Read the worker result files generated in Phase 4/5 and extract individual findings.
|
|
39
39
|
|
|
40
|
+
**Convergence scope.** Convergence operates on sections 1–5 of the worker output (the common core, see `okstra-team-contract` "Worker Output Contract"). Section 6 ("Specialization Lens") is additive worker-specific depth and MUST NOT be fed into the consensus grouping, the verification queue, or the round-N reverify prompts. Carry Section 6 forward into the final report verbatim through the report-writer worker — do not let it inflate `unique` counts or trigger spurious `verification-error` statuses.
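One way to enforce this scope is to split each worker result before any finding is extracted. The sketch below is only an illustration under a loud assumption: it presumes worker results use level-2 Markdown headings of the form `## N. Title`, which this diff does not confirm. `split_core_and_lens` is a hypothetical helper, not part of okstra.

```python
import re

# Assumed heading shape: "## 6. Specialization Lens" (level-2, numbered).
SECTION = re.compile(r"^## (?:(\d+)\.\s)?")


def split_core_and_lens(result_text):
    """Return (core, lens): sections 1-5 versus Section 6, as text."""
    core, lens = [], []
    in_lens = False
    for line in result_text.splitlines():
        m = SECTION.match(line)
        if m:
            # Any level-2 heading switches mode; only number 6 enters the lens.
            in_lens = m.group(1) == "6"
        (lens if in_lens else core).append(line)
    return "\n".join(core), "\n".join(lens)


sample = "\n".join([
    "## 4. Uncertain Points",
    "- F-001 unclear retry policy",
    "## 5. Recommended Next Actions",
    "- confirm the timeout budget",
    "## 6. Specialization Lens",
    "- stack-trace decomposition",
    "### detail",
    "- frame 3 is the first divergent call",
])
core, lens = split_core_and_lens(sample)
```

Only `core` would then feed consensus grouping and the verification queue; `lens` would be handed unchanged to the report-writer worker.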
|
|
41
|
+
|
|
40
42
|
1. In the "Findings" section of each worker's results, identify individual items by number (F-001, F-002, ...) and parse the ticket identifier attached to each item:
|
|
41
43
|
- For table-form findings, read the `Ticket ID` column.
|
|
42
44
|
- For bullet/numbered findings, parse `[TICKETID: <id>]` from the item title.
|
package/runtime/skills/okstra-report-writer/SKILL.md
CHANGED
|
@@ -90,7 +90,7 @@ Behaviour contract:
|
|
|
90
90
|
After the spawner completes, the report-writer worker MUST update Section 6 ("Recommended Next Steps") to list every newly created task-key together with its entry command, so the user can pick the follow-up up immediately:
|
|
91
91
|
|
|
92
92
|
```
|
|
93
|
-
- Follow-up: `<task-group>/<new-task-id>` — `okstra --task-key <task-group>/<new-task-id> --task-type <suggested>`
|
|
93
|
+
- Follow-up: `<task-group>/<new-task-id>` — Claude Code 세션 안 `/okstra-run task-key=<task-group>/<new-task-id> task-type=<suggested>` / 별도 터미널 `scripts/okstra.sh --task-key <task-group>/<new-task-id> --task-type <suggested>`
|
|
94
94
|
```
|
|
95
95
|
|
|
96
96
|
## Phase 7 token-usage collector (BLOCKING)
|
package/runtime/skills/okstra-run/SKILL.md
CHANGED
|
@@ -184,31 +184,41 @@ Single `AskUserQuestion` first: `"기본 워커/모델로 진행할까요, 아
|
|
|
184
184
|
- `Use defaults` → all overrides remain empty.
|
|
185
185
|
- `Customize` → the prompts you ask depend on the `task_type` chosen in Step 4. Blank answer always means "use default". Never call the prompt label "worker CSV" — use plain Korean labels as shown below.
|
|
186
186
|
|
|
187
|
+
### Model selection options (used by 6a and 6b)
|
|
188
|
+
|
|
189
|
+
All model prompts MUST use `AskUserQuestion` with a fixed option list — never free text. This prevents typos like `gpt-5.5-high` (a non-existent model) reaching the manifest. The options below are derived from `scripts/okstra_ctl/models.py` `*_MAPPING` and show "default + 3 latest". Blank/`default` means "use phase default".
|
|
190
|
+
|
|
191
|
+
- **Claude (lead / claude-worker / report-writer)** options: `default`, `opus`, `sonnet`, `haiku`
|
|
192
|
+
- **Codex (codex-worker)** options: `default`, `gpt-5.5`, `gpt-5.4`, `gpt-5.4-mini`
|
|
193
|
+
- **Gemini (gemini-worker)** options: `default`, `gemini-3-pro-preview`, `gemini-3-flash-preview`, `auto`
|
|
194
|
+
|
|
195
|
+
When the user picks `default`, pass an empty string to the corresponding `--*-model` flag. When the user picks any other option, pass it verbatim. If the user truly needs a value outside the list (e.g. a pinned long-form id), they can use the question's built-in `Other` to type it. The four listed options cover the supported set, so `Other` should be rare.
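The option-to-flag rule can be sketched as below. `MODEL_OPTIONS` copies the three option lists from this section; `flag_value` and `is_listed` are hypothetical helper names introduced only for illustration.

```python
MODEL_OPTIONS = {
    "claude": ["default", "opus", "sonnet", "haiku"],
    "codex": ["default", "gpt-5.5", "gpt-5.4", "gpt-5.4-mini"],
    "gemini": ["default", "gemini-3-pro-preview", "gemini-3-flash-preview", "auto"],
}


def flag_value(choice):
    """Map a picked option to the string passed to a `--*-model` flag."""
    choice = (choice or "").strip()
    # Blank and `default` both mean "use the phase default".
    return "" if choice in ("", "default") else choice


def is_listed(provider, choice):
    """True when the choice is one of the fixed options for the provider."""
    return choice in MODEL_OPTIONS.get(provider, [])
```

A value typed through `Other` passes through `flag_value` unchanged, so it is still subject to the alias validation in `models.py` when the flag is resolved.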
|
|
196
|
+
|
|
187
197
|
### 6a. `implementation` phase (executor-driven)
|
|
188
198
|
|
|
189
|
-
In this phase the roster is fixed by the profile (executor + two verifiers + report-writer). The Step 4 `executor` answer already determines who mutates code; verifier models use phase-specific defaults (`Claude verifier`=sonnet, `Codex verifier`=gpt-5.5, `Gemini verifier`=auto). So ask **only three model prompts
|
|
199
|
+
In this phase the roster is fixed by the profile (executor + two verifiers + report-writer). The Step 4 `executor` answer already determines who mutates code; verifier models use phase-specific defaults (`Claude verifier`=sonnet, `Codex verifier`=gpt-5.5, `Gemini verifier`=auto). So ask **only three model prompts** (each via `AskUserQuestion` with options from the lists above), plus directive/related/clarification:
|
|
190
200
|
|
|
191
|
-
1. `"리더(Claude lead) 모델? (
|
|
192
|
-
2. `"실행자({executor-provider}) 모델? (
|
|
193
|
-
3. `"리포트 작성자(report-writer) 모델? (
|
|
194
|
-
4. `"추가 directive (선택, 빈 칸 가능)"` → `directive`
|
|
195
|
-
5. `"관련 task id 목록, 쉼표 구분 (선택, 빈 칸 가능)"` → `related_tasks_raw`
|
|
201
|
+
1. `AskUserQuestion` `"리더(Claude lead) 모델?"` (Claude options) → `lead_model`
|
|
202
|
+
2. `AskUserQuestion` `"실행자({executor-provider}) 모델?"` with options matching the executor's provider (Claude / Codex / Gemini list above) → maps to `claude_model` / `codex_model` / `gemini_model`. The other two provider model fields stay empty (verifiers use defaults).
|
|
203
|
+
3. `AskUserQuestion` `"리포트 작성자(report-writer) 모델?"` (Claude options) → `report_writer_model`
|
|
204
|
+
4. `AskUserQuestion` `"추가 directive (선택, 빈 칸 가능)"` (free text) → `directive`
|
|
205
|
+
5. `AskUserQuestion` `"관련 task id 목록, 쉼표 구분 (선택, 빈 칸 가능)"` (free text) → `related_tasks_raw`
|
|
196
206
|
|
|
197
207
|
Do NOT ask for `workers_override` in implementation — the profile's required roster must be preserved (verifier slots are mandatory). Leave `workers_override=""`.
|
|
198
208
|
|
|
199
209
|
### 6b. Other phases (`requirements-discovery`, `error-analysis`, `implementation-planning`, `final-verification`, `release-handoff`)
|
|
200
210
|
|
|
201
|
-
Ask each in turn (
|
|
211
|
+
Ask each in turn (model prompts use `AskUserQuestion` with the option lists above; others are free text):
|
|
202
212
|
|
|
203
|
-
1. `"참여 워커 목록 (쉼표 구분, 빈 칸 = 프로필 기본값 claude,codex,report-writer). 선택지: claude, codex, gemini, report-writer — gemini는 옵션이므로 필요할 때 명시"` → `workers_override`
|
|
204
|
-
2. `"리더(Claude lead) 모델? (
|
|
205
|
-
3. `"claude 워커 모델? (
|
|
206
|
-
4. `"codex 워커 모델? (
|
|
207
|
-
5. `"gemini 워커 모델? (
|
|
208
|
-
6. `"리포트 작성자 모델? (
|
|
209
|
-
7. `"추가 directive (선택, 빈 칸 가능)"` → `directive`
|
|
210
|
-
8. `"관련 task id 목록, 쉼표 구분 (선택, 빈 칸 가능)"` → `related_tasks_raw`
|
|
211
|
-
9. `"clarification-response 파일 경로 (follow-up 시에만, 빈 칸 가능)"` → `clarification_response_path`
|
|
213
|
+
1. `AskUserQuestion` `"참여 워커 목록 (쉼표 구분, 빈 칸 = 프로필 기본값 claude,codex,report-writer). 선택지: claude, codex, gemini, report-writer — gemini는 옵션이므로 필요할 때 명시"` (free text) → `workers_override`
|
|
214
|
+
2. `AskUserQuestion` `"리더(Claude lead) 모델?"` (Claude options) → `lead_model`
|
|
215
|
+
3. `AskUserQuestion` `"claude 워커 모델?"` (Claude options) → `claude_model`
|
|
216
|
+
4. `AskUserQuestion` `"codex 워커 모델?"` (Codex options) → `codex_model`
|
|
217
|
+
5. `AskUserQuestion` `"gemini 워커 모델?"` (Gemini options) → `gemini_model`
|
|
218
|
+
6. `AskUserQuestion` `"리포트 작성자 모델?"` (Claude options) → `report_writer_model`
|
|
219
|
+
7. `AskUserQuestion` `"추가 directive (선택, 빈 칸 가능)"` (free text) → `directive`
|
|
220
|
+
8. `AskUserQuestion` `"관련 task id 목록, 쉼표 구분 (선택, 빈 칸 가능)"` (free text) → `related_tasks_raw`
|
|
221
|
+
9. `AskUserQuestion` `"clarification-response 파일 경로 (follow-up 시에만, 빈 칸 가능)"` (free text) → `clarification_response_path`
|
|
212
222
|
|
|
213
223
|
For prompts whose target worker is NOT in the resolved workers list (after override), skip the prompt and present a single line such as `gemini-model 생략 (workers에 gemini 없음)`.
|
|
214
224
|
|
|
@@ -224,7 +234,7 @@ Before invoking `okstra render-bundle`, echo the resolved selections back to the
|
|
|
224
234
|
executor : codex
|
|
225
235
|
workers : (프로필 기본 — executor + verifier 2 + report-writer)
|
|
226
236
|
lead-model : default (opus)
|
|
227
|
-
codex-model : gpt-5.5
|
|
237
|
+
codex-model : gpt-5.5 ← executor model
|
|
228
238
|
claude-model : default (sonnet) ← verifier
|
|
229
239
|
gemini-model : default (auto) ← verifier
|
|
230
240
|
report-writer : default (opus)
|
package/runtime/skills/okstra-team-contract/SKILL.md
CHANGED
|
@@ -17,13 +17,17 @@ okstra tasks are always operated using the `Claude lead` + required worker team
|
|
|
17
17
|
|
|
18
18
|
### Role Definitions
|
|
19
19
|
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
|
23
|
-
|
|
24
|
-
|
|
|
25
|
-
|
|
|
26
|
-
|
|
|
20
|
+
**All analysis workers (Claude / Codex / Gemini) share an identical core responsibility.** Specialization is additive — it lives in optional Section 6 of the worker output, NOT in differentiated core questions. This is intentional: cross-verification only converges if all three workers are answering the same questions against the same brief. Disjoint per-worker scopes produce union-of-perspectives, not triangulation.
|
|
21
|
+
|
|
22
|
+
| Role | Core responsibility | Specialization lens (Section 6 only) | Default Model | subagent_type | Notes |
|
|
23
|
+
|------|------|------|-----------|---------------|------|
|
|
24
|
+
| Claude lead | orchestration + convergence supervision + final-report review/approval | — | opus | -- | Does NOT author the final-report file when `Report writer worker` is in the roster |
|
|
25
|
+
| Claude worker | Answer every brief question across feasibility, requirement interpretation, hidden assumptions, and alternatives — with file:line evidence | broad reasoning depth, hidden assumptions, execution-risk surfacing | sonnet | claude-worker | `agents/claude-worker.md` |
|
|
26
|
+
| Codex worker | Same core responsibility as Claude worker — identical questions, identical sections 1–5 | implementation realism, code-path implications, edge cases, technical trade-offs | gpt-5.5 | codex-worker | `agents/codex-worker.md` |
|
|
27
|
+
| Gemini worker | Same core responsibility as Claude worker — identical questions, identical sections 1–5 | requirement interpretation, consistency, safety, alternative viewpoints | auto | gemini-worker | `agents/gemini-worker.md` |
|
|
28
|
+
| Report writer worker | **Authors** the final-report file in Phase 6. NOT an analysis worker. | — | opus | report-writer-worker | `agents/report-writer-worker.md`. Excluded from Phase 4/5 and convergence |
|
|
29
|
+
|
|
30
|
+
**Dispatch-prompt invariant (BLOCKING).** Lead's dispatch prompt body for Claude / Codex / Gemini workers MUST be byte-identical except for the role label and any wrapper-specific path headers (e.g. `**Worktree:**`, `**Errors sidecar path:**`). Lead MUST NOT bias the brief by inserting per-worker emphasis sentences ("you focus on X") into the body. Bias-by-prompt reproduces the historical failure mode where Claude commented only on assumptions, Codex only on code paths, and Gemini only on requirements — leaving convergence with nothing to converge on.
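A lead-side self-check for this invariant might look like the sketch below. It is an assumption-heavy illustration: the two path-header prefixes come from the rule text, but the way the role label appears in the prompt (`Claude worker` and so on) and the helper names are hypothetical.

```python
# Headers that are allowed to differ between workers, per the rule above.
PATH_HEADER_PREFIXES = ("**Worktree:**", "**Errors sidecar path:**")
# Assumed spelling of the role labels inside a dispatch prompt.
ROLE_LABELS = ("Claude worker", "Codex worker", "Gemini worker")


def normalize(prompt):
    """Drop wrapper-specific path headers and mask the role label."""
    kept = [
        line for line in prompt.splitlines()
        if not line.startswith(PATH_HEADER_PREFIXES)
    ]
    body = "\n".join(kept)
    for label in ROLE_LABELS:
        body = body.replace(label, "<role>")
    return body


def prompts_are_invariant(prompts):
    """True when all prompts are identical after normalization."""
    return len({normalize(p) for p in prompts}) <= 1


claude = "Role: Claude worker\n**Worktree:** /a\nAnswer every brief question."
codex = "Role: Codex worker\n**Worktree:** /b\n**Errors sidecar path:** /e\nAnswer every brief question."
biased = "Role: Gemini worker\nAnswer every brief question.\nYou focus on requirements."
```

Here `claude` and `codex` normalize to the same body, while `biased` fails the check because of the extra emphasis sentence.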
|
|
27
31
|
|
|
28
32
|
### Model Assignment Rules
|
|
29
33
|
|
|
@@ -72,7 +76,7 @@ If the task brief contains an `## Available MCP Servers` section, copy that sect
|
|
|
72
76
|
|
|
73
77
|
Before dispatching any required worker, lead persists the exact worker prompt to the assigned current-run prompt history path under `runs/<task-type>/prompts/`. Do not use `/tmp/*prompt*.txt` as the canonical artifact path.
|
|
74
78
|
|
|
75
|
-
|
|
79
|
+
Send byte-identical dispatch prompts to every analysis worker (Claude / Codex / Gemini), modulo the role label and the wrapper-specific path headers enumerated in the "Dispatch-prompt invariant" rule of the Role Definitions section. The prior "role-specific emphasis" guidance is retired — emphasis in the body biases each worker toward its lens and silently kills convergence (see Role Definitions for the failure mode). Specialization lives in Section 6 of the worker output, not in the dispatch prompt body.
|
|
76
80
|
|
|
77
81
|
### Required reading clause (analysis workers + report-writer worker)
|
|
78
82
|
|
|
@@ -206,6 +210,11 @@ A successful worker result must include the following sections in this exact ord
|
|
|
206
210
|
3. Safe or Reasonable Areas
|
|
207
211
|
4. Uncertain Points
|
|
208
212
|
5. Recommended Next Actions
|
|
213
|
+
6. **Specialization Lens (optional, worker-specific deep dive)** — additive content produced from this worker's specialization lens (see Role Definitions table). Items here are NOT subject to convergence cross-verification and MUST NOT duplicate sections 1–5. If the worker has nothing additional to add from its lens, omit Section 6 entirely or write `- No additional lens-specific findings.`
|
|
214
|
+
|
|
215
|
+
**Sections 1–5 are the common core.** Every analysis worker (Claude / Codex / Gemini) MUST cover the same set of dimensions in sections 1–5 — feasibility, requirement interpretation, hidden assumptions, alternatives, and execution risk — regardless of which model is producing the result. The point of running three models is to triangulate the same answer space, not to partition it. A worker that produces "Findings" populated only with items from its own lens (e.g. Codex only listing implementation-feasibility findings) is in breach of contract; convergence will treat coverage gaps as `verification-error`.
|
|
216
|
+
|
|
217
|
+
**Section 6 is the only legal home for specialization.** When the worker has a depth-of-perspective contribution that genuinely sits outside the common core (e.g. a Codex-specific stack-trace decomposition, a Claude-specific assumption-chain teardown, a Gemini-specific alternative-architecture sketch), put it there. Lead and convergence treat Section 6 as additive context, not as input to consensus measurement.
|
|
209
218
|
|
|
210
219
|
Code evidence must include file paths and line numbers.
|
|
211
220
|
|
package/runtime/templates/reports/final-report.template.md
CHANGED
|
@@ -13,8 +13,12 @@
|
|
|
13
13
|
> The next `implementation` run can be entered only when the checkbox below is marked `[x]` (`okstra_ctl.run._validate_approved_plan` checks this marker with a line-anchored regular expression and accepts or rejects the run). Read the body (`Sections 1`–`4.5`) to the end, and approve only once `4.5.9 Open Questions` is empty or fully resolved.
|
|
14
14
|
|
|
15
15
|
- Approval status (edited directly by the user): `- [ ] Approved` ← to approve, change `[ ]` to `[x]` and save.
|
|
16
|
-
- Next-step command after approval (Method A: manual edit):
|
|
17
|
-
-
|
|
16
|
+
- Next-step command after approval (Method A: manual edit):
|
|
17
|
+
- Inside a Claude Code session: `/okstra-run task-key={{TASK_KEY}} task-type=implementation approved-plan=<path to this report>`
|
|
18
|
+
- Separate terminal: `scripts/okstra.sh --task-key {{TASK_KEY}} --task-type implementation --approved-plan <path to this report>`
|
|
19
|
+
- Approve and run in one step (Method B: the entry command itself serves as the approval):
|
|
20
|
+
- Inside a Claude Code session: `/okstra-run task-key={{TASK_KEY}} task-type=implementation approved-plan=<path to this report> approve`
|
|
21
|
+
- Separate terminal: `scripts/okstra.sh --task-key {{TASK_KEY}} --task-type implementation --approved-plan <path to this report> --approve`
|
|
18
22
|
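- Method B models the act of passing `--approve` as the approval itself. The runtime automatically flips this block's checkbox to `[x]` and appends a one-line audit entry, `승인 일시 (CLI ack): <ISO8601>` (an approval timestamp), at the bottom of this section.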
|
|
19
23
|
- To withhold or reject approval, leave the checkbox as `[ ]` and do not pass `--approve`. Record the required changes in `4.5.9 Open Questions` or `Section 5 Clarification Requests`, then rerun the same phase.
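The approval gate described in this block relies on a line-anchored check of the checkbox marker. The actual expression used by `okstra_ctl.run._validate_approved_plan` is not shown in this diff, so the sketch below is a guess at the general shape only: the pattern, the helper name `is_plan_approved`, and the sample report text are all hypothetical.

```python
import re

# Hypothetical pattern: a list item whose marker reads `[x] Approved`.
# re.MULTILINE makes `^` match at the start of every line.
APPROVED = re.compile(r"^- .*\[x\] Approved", re.MULTILINE)


def is_plan_approved(report_text: str) -> bool:
    """True when some list-item line carries the checked approval marker."""
    return APPROVED.search(report_text) is not None


pending = "## Approval\n- Approval: `- [ ] Approved` <- change `[ ]` to `[x]` and save\n"
approved = pending.replace("- [ ] Approved", "- [x] Approved")
```

Anchoring at the start of the line means a passing mention of the marker inside ordinary prose does not count, and the `[x]` that appears in the instruction hint on the same line does not trigger a match either, because it is not followed by ` Approved`.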
|
|
20
24
|
|
|
@@ -270,7 +274,11 @@ H1 이 `skip` 이거나 H3 가 `cancel` 인 경우, 본 섹션 다음의 4.6.4 ~
|
|
|
270
274
|
- `resolved`: in the next run, the lead received the answer and completed verification.
|
|
271
275
|
- `obsolete`: later analysis made this item unnecessary.
|
|
272
276
|
|
|
273
|
-
이 보고서에 답을 채우신 뒤에는
|
|
277
|
+
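After filling in your answers in this report, you can rerun the same phase with a single command (`$EDITOR` opens this file automatically, and once you save, the same phase reruns with the file carried in via `--clarification-response`).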
|
|
278
|
+
- Inside a Claude Code session: `/okstra-run resume-clarification task-key={{TASK_KEY}}`
|
|
279
|
+
- Separate terminal: `scripts/okstra.sh --resume-clarification --task-key {{TASK_KEY}}`
|
|
280
|
+
|
|
281
|
+
When automating with scripts, you can also use the shell form `scripts/okstra.sh --task-key {{TASK_KEY}} --task-type {{TASK_TYPE}} --clarification-response <path to this file>` as-is. The Node `okstra` admin CLI does not accept `--task-key`/`--task-type`/`--resume-clarification`, so use one of the two entry points above.
|
|
274
282
|
|
|
275
283
|
### 5.1 추가 자료 요청 (Additional Materials Requested)
|
|
276
284
|
|
|
@@ -298,16 +306,22 @@ H1 이 `skip` 이거나 H3 가 `cancel` 인 경우, 본 섹션 다음의 4.6.4 ~
|
|
|
298
306
|
|
|
299
307
|
This section is **always present** in every final report — never omit the heading. If there are no concrete actions to take, write the single line `- No further action required. Final verdict in section 2 stands.` under the heading and stop.
|
|
300
308
|
|
|
301
|
-
When concrete actions exist, list them as a numbered list using the rules below. Each item must include the exact command(s) the user can copy-paste.
|
|
309
|
+
When concrete actions exist, list them as a numbered list using the rules below. Each item must include the exact command(s) the user can copy-paste. Show **both** the Claude Code in-session form (`/okstra-run …`) and the external-terminal shell form (`scripts/okstra.sh …`) — the Node `okstra` admin CLI does NOT accept `--task-key` / `--task-type` / `--resume-clarification`. Prefer the `task-key` shorthand for follow-up runs and `resume-clarification` for clarification answer turn-arounds; show the equivalent full-args form only when useful.
|
|
302
310
|
|
|
303
311
|
1. **Highest-priority next action.** State what to do and why in one sentence, then the command. Example shortcut forms:
|
|
304
|
-
- Same phase rerun:
|
|
305
|
-
|
|
312
|
+
- Same phase rerun:
|
|
313
|
+
- Inside a Claude Code session: `/okstra-run task-key={{TASK_KEY}} task-type={{TASK_TYPE}}`
|
|
314
|
+
- Separate terminal: `scripts/okstra.sh --task-key {{TASK_KEY}} --task-type {{TASK_TYPE}}`
|
|
315
|
+
- Next phase (omit `task-type` to use the manifest's `workflow.nextRecommendedPhase` automatically when it is a concrete phase, not `pending-routing-decision` / `done-or-follow-up`):
|
|
316
|
+
- Inside a Claude Code session: `/okstra-run task-key={{TASK_KEY}} task-type=<next-phase>`
|
|
317
|
+
- Separate terminal: `scripts/okstra.sh --task-key {{TASK_KEY}} --task-type <next-phase>`
|
|
306
318
|
2. **Additional verification needed before implementation or release.** List read-only checks (test commands, log queries, dashboard URLs) that the user should run before moving to the next phase. No state-mutating commands here.
|
|
307
319
|
3. **Follow-up tasks or related tasks if needed.** Reference them by `task-key` when they already exist; otherwise describe the new brief to draft.
|
|
308
320
|
4. **If section 5 has any `open` rows**, the highest-priority next step MUST be the clarification turn-around. Show both forms:
|
|
309
|
-
- Preferred (interactive)
|
|
310
|
-
|
|
321
|
+
- Preferred (interactive) — opens this file in `$EDITOR`, then auto-reruns the same phase with `--clarification-response` carry-in:
|
|
322
|
+
- Inside a Claude Code session: `/okstra-run resume-clarification task-key={{TASK_KEY}}`
|
|
323
|
+
- Separate terminal: `scripts/okstra.sh --resume-clarification --task-key {{TASK_KEY}}`
|
|
324
|
+
- Scripted: `scripts/okstra.sh --task-key {{TASK_KEY}} --task-type {{TASK_TYPE}} --clarification-response <path-to-this-file-after-editing>`.
|
|
311
325
|
|
|
312
326
|
Empty-state placeholder, copy verbatim when nothing else applies:
|
|
313
327
|
|