okstra 0.20.1 → 0.21.1

Files changed (41)
  1. package/README.kr.md +2 -2
  2. package/README.md +2 -2
  3. package/docs/kr/architecture.md +1 -0
  4. package/docs/kr/cli.md +1 -1
  5. package/docs/kr/performance-improvement-plan-v2.md +330 -0
  6. package/docs/kr/performance-improvement-plan.md +125 -0
  7. package/docs/project-structure-overview.md +388 -0
  8. package/docs/superpowers/plans/2026-05-14-convergence-queue-pruning.md +1568 -0
  9. package/package.json +1 -1
  10. package/runtime/BUILD.json +2 -2
  11. package/runtime/agents/SKILL.md +7 -1
  12. package/runtime/agents/workers/claude-worker.md +3 -1
  13. package/runtime/agents/workers/report-writer-worker.md +4 -0
  14. package/runtime/bin/okstra-codex-exec.sh +42 -0
  15. package/runtime/bin/okstra-gemini-exec.sh +7 -0
  16. package/runtime/bin/okstra-trace-cleanup.sh +42 -0
  17. package/runtime/prompts/profiles/final-verification.md +8 -2
  18. package/runtime/prompts/profiles/implementation-planning.md +1 -1
  19. package/runtime/prompts/profiles/release-handoff.md +26 -28
  20. package/runtime/prompts/profiles/requirements-discovery.md +1 -1
  21. package/runtime/python/okstra_ctl/render.py +78 -4
  22. package/runtime/python/okstra_ctl/run_context.py +5 -0
  23. package/runtime/python/okstra_ctl/workflow.py +8 -7
  24. package/runtime/python/okstra_ctl/worktree.py +155 -12
  25. package/runtime/skills/okstra-brief/SKILL.md +523 -0
  26. package/runtime/skills/okstra-convergence/SKILL.md +149 -37
  27. package/runtime/skills/okstra-report-writer/SKILL.md +8 -6
  28. package/runtime/templates/prd/brief.template.md +12 -0
  29. package/runtime/templates/project-docs/task-index.template.md +12 -0
  30. package/runtime/templates/reports/error-analysis-input.template.md +12 -0
  31. package/runtime/templates/reports/final-report.template.md +39 -12
  32. package/runtime/templates/reports/final-verification-input.template.md +22 -0
  33. package/runtime/templates/reports/implementation-input.template.md +12 -0
  34. package/runtime/templates/reports/implementation-planning-input.template.md +12 -0
  35. package/runtime/templates/reports/quick-input.template.md +12 -0
  36. package/runtime/templates/reports/release-handoff-input.template.md +23 -10
  37. package/runtime/templates/reports/schedule.template.md +12 -0
  38. package/runtime/templates/reports/settings.template.json +92 -30
  39. package/runtime/templates/reports/task-brief.template.md +12 -0
  40. package/src/install.mjs +1 -0
  41. package/src/uninstall.mjs +1 -0
@@ -6,6 +6,14 @@ user-invocable: false
6
6
 
7
7
  # OKSTRA Convergence
8
8
 
9
+ ## Scope and Terminology (BLOCKING)
10
+
11
+ This skill governs **Phase 5.5 (Convergence loop)** — a *lead operating phase* inside a single okstra run, not a task-type lifecycle phase. The 6 task-type lifecycle phases (`requirements-discovery` → `error-analysis` → `implementation-planning` → `implementation` → `final-verification` → `release-handoff`, see [okstra/SKILL.md](../../SKILL.md) "Lifecycle Phase Boundaries") are unchanged by this skill. The lead operating phases (Phase 1 Intake → Phase 7 Persist, see [okstra/SKILL.md](../../SKILL.md) "Quick Reference") describe how the lead drives a *single* task-type run.
12
+
13
+ **`contested` is a final classification only.** It is NEVER an intermediate queue label. The verification queue carries findings that are *unique to a single worker* (entered in Round 0) or *mixed/unresolved after a re-verification round* (carried forward). The `contested` label is assigned only when the **last executed round** completes and the queue is still non-empty.
14
+
15
+ When this skill says "queue" without qualifier, it means the *verification queue*: the set of findings that are still candidates for re-verification in subsequent rounds. The queue shrinks monotonically as findings get classified as `full-consensus`, `partial-consensus`, or `worker-unique`. Findings classified into any of these three categories MUST NOT appear in any subsequent round's reverify prompt, for any worker.
16
+
9
17
  ## When to Use
10
18
 
11
19
  - When the okstra skill Phase 5.5 (convergence loop) begins
@@ -28,7 +36,7 @@ Configure this in the `convergence` block of `task-manifest.json`. If the block
28
36
  |------|------|------------|
29
37
  | `full-consensus` | All participating workers agree | Required |
30
38
  | `partial-consensus` | Majority of workers agree; dissenting opinions are recorded | Required |
31
- | `contested` | No consensus reached even after max rounds; each worker's position is recorded | Required |
39
+ | `contested` | Final classification only. Assigned to a finding that remains in the verification queue after the **last executed round** completes (round index = `effectiveMaxRounds`). Each worker's position across all executed rounds is recorded. NEVER used as an intermediate label. | Required |
32
40
  | `worker-unique` | Only the discoverer confirms; others oppose or remain unverified | Required |
33
41
 
34
42
  ## Convergence Algorithm
@@ -51,37 +59,104 @@ Read the worker result files generated in Phase 4/5 and extract individual findi
51
59
  - Only one worker confirms a finding → `unique`, enter the verification queue.
52
60
  4. When grouping is ambiguous, prefer splitting over merging (avoid over-merging).
53
61
  5. Persist each finding's ticket set in the convergence state artifact under a `ticketIds` field on the finding record. Re-verification rounds carry the same field forward.
62
+ 6. After grouping, the verification queue contains EXACTLY the `unique`-marked findings (Step 3 case "Only one worker confirms"). `full-consensus` findings reached in Step 3 are recorded immediately in the convergence state with `classification: "full-consensus"` and DO NOT enter the queue.
54
63
 
55
- ### Round 1-N: Re-verification Loop
64
+ ### Round 1-N: Re-verification Loop (queue-pruned)
56
65
 
66
+ The verification queue holds only findings that are not yet classified. Confirmed items are *removed* from the queue and never re-sent.
67
+
68
+ ```text
69
+ roundIndex = 0
70
+ WHILE roundIndex < effectiveMaxRounds AND queue is non-empty:
71
+     roundIndex += 1
72
+
73
+     # Round 2 gate (only evaluated when entering round 2 or higher)
74
+     IF roundIndex > 1 AND NOT round_gate_open(queue, roundHistory[-1].dispatches):
75
+         record round2SkippedReason in convergence state
76
+         BREAK
77
+
78
+     inputQueueSize = len(queue)
79
+     dispatches = []
80
+     skippedWorkers = []
81
+
82
+     FOR each analysis worker W (excluding report-writer-worker):
83
+         items_for_W = [f for f in queue if W != f.originWorker]
84
+         IF items_for_W is empty:
85
+             skippedWorkers.append({worker: W, reason: "no items to verify"})
86
+             CONTINUE
87
+         dispatch = send_reverify_request(W, items_for_W, roundIndex)
88
+         dispatches.append(dispatch)
89
+
90
+     IF len(dispatches) > 0 AND all dispatches in this round are terminal non-result (timeout/error/no-result-file):
91
+         # Per "Worker failure handling in reverify" below: do NOT treat as DISAGREE.
92
+         record verification-error evidence on each finding in the queue for this round
93
+         record round2SkippedReason = "all-reverify-non-result" for any subsequent round
94
+         BREAK
95
+
96
+     resolvedCount = 0
97
+     carriedForwardCount = 0
98
+
99
+     FOR each finding F in queue (snapshot):
100
+         votes = aggregate_votes(F, dispatches) # AGREE / DISAGREE / SUPPLEMENT / verification-error
101
+         IF all non-error votes are AGREE or SUPPLEMENT:
102
+             F.classification = "full-consensus"
103
+             queue.remove(F); resolvedCount += 1
104
+         ELIF majority non-error votes are AGREE or SUPPLEMENT:
105
+             F.classification = "partial-consensus"
106
+             queue.remove(F); resolvedCount += 1
107
+         ELIF all non-error votes are DISAGREE:
108
+             F.classification = "worker-unique"
109
+             queue.remove(F); resolvedCount += 1
110
+         ELSE:
111
+             # mixed / insufficient non-error votes, or all-error votes → carry forward
112
+             carriedForwardCount += 1
113
+
114
+     record roundHistory entry { round: roundIndex, inputQueueSize, resolvedCount,
115
+         carriedForwardCount, dispatches, skippedWorkers }
116
+
117
+ # Final classification: runs after the WHILE loop exits (queue empty OR roundIndex == effectiveMaxRounds OR Round 2 gate closed)
118
+ FOR each finding F still in queue:
119
+     IF majority AGREE-or-SUPPLEMENT across all executed rounds:
120
+         F.classification = "partial-consensus"
121
+     ELSE:
122
+         F.classification = "contested"
57
123
  ```
58
- FOR round = 1 to convergence.maxRounds:
59
-     IF the verification queue is empty:
60
-         BREAK (early convergence)
61
-
62
-     FOR each worker W (excluding the report writer):
63
-         List of findings W must verify = items in the verification queue for which W is not the discoverer
64
-         IF the list is empty:
65
-             SKIP
66
-         Send a re-verification request to W (batch: spawn once per worker)
67
-         Collect responses: AGREE / DISAGREE / SUPPLEMENT for each finding
68
-
69
-     FOR each finding F in the verification queue:
70
-         Vote aggregation:
71
-         - All AGREE or SUPPLEMENT → full consensus
72
-         - Majority AGREE or SUPPLEMENT → partial consensus
73
-         - All DISAGREE → worker-unique
74
-         - Mixed results → Carry over to next round (or marked as contested if this is the final round)
75
-
76
-     Update convergence state (record current round results)
77
- ```
124
+
125
+ The lead MUST construct the per-worker reverify prompt body from `items_for_W` only — confirmed findings from earlier rounds MUST NOT appear in the prompt, even as background. The dispatch-prompt invariant (every worker gets the same prompt content modulo their own findings) continues to apply to the per-round prompt body.
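The per-round vote aggregation in the pseudocode above can be sketched in Python to make the pruning rule concrete. This is only an illustration under stated assumptions: verdicts are taken to be lowercase strings, "majority" is read as a strict majority of non-error votes, and `classify_finding` is a hypothetical helper name, not an okstra function.

```python
POSITIVE = {"agree", "supplement"}  # verdicts that count toward consensus


def classify_finding(verdicts):
    """Classify one queued finding from the verdicts of a single round.

    `verdicts` holds strings such as "agree", "supplement", "disagree",
    or "verification-error" (assumed encoding). Returns a classification
    label, or None when the finding stays in the queue for the next round.
    """
    # verification-error is "no usable vote": drop it before counting.
    usable = [v for v in verdicts if v != "verification-error"]
    if not usable:
        return None  # all-error votes: carry forward
    positive = sum(1 for v in usable if v in POSITIVE)
    if positive == len(usable):
        return "full-consensus"
    if positive * 2 > len(usable):  # strict majority (assumption)
        return "partial-consensus"
    if positive == 0:
        return "worker-unique"
    return None  # mixed votes: carry forward


def prune_queue(queue_votes):
    """Split {findingId: verdicts} into (resolved labels, carried-forward ids)."""
    resolved, carried = {}, []
    for finding_id, verdicts in queue_votes.items():
        label = classify_finding(verdicts)
        if label is None:
            carried.append(finding_id)
        else:
            resolved[finding_id] = label
    return resolved, carried
```

Only the ids in the carried-forward list would be eligible for the next round's `items_for_W`; resolved findings never re-enter a prompt.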
126
+
127
+ #### Round 2 gate (`round_gate_open` predicate)
128
+
129
+ `round_gate_open(queue, previous_dispatches)` returns `true` iff ALL three conditions hold (here `previous_dispatches` is the most recent entry's `dispatches` array in `roundHistory`); otherwise the lead records `round2SkippedReason` and breaks out of the loop:
130
+
131
+ | Condition | Required value | `round2SkippedReason` if not met |
132
+ |---|---|---|
133
+ | `effectiveMaxRounds >= 2` | true | `"max-rounds-1"` |
134
+ | `len(queue) > 0` after round 1 | true | `"queue-empty"` |
135
+ | At least one round-1 reverify dispatch terminated as `completed` | true | `"all-reverify-non-result"` |
136
+
137
+ When all conditions hold, the predicate returns `true` and `round2SkippedReason` is set to `"not-skipped"`. The field is mandatory on every convergence state artifact: write `"not-skipped"` rather than omitting the key.
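The three gate conditions and their skip reasons can be expressed as a small Python sketch. The function names and the flat argument list are hypothetical (the skill's own predicate is written as `round_gate_open(queue, previous_dispatches)`), and checking the conditions in table order is an assumption.

```python
def round2_skipped_reason(effective_max_rounds, queue_size, previous_dispatches):
    """Return the round2SkippedReason value for the state artifact.

    `previous_dispatches` is assumed to be the `dispatches` list of the
    most recent roundHistory entry, each item carrying a "status" key.
    """
    if effective_max_rounds < 2:
        return "max-rounds-1"
    if queue_size == 0:
        return "queue-empty"
    if not any(d["status"] == "completed" for d in previous_dispatches):
        return "all-reverify-non-result"
    return "not-skipped"


def gate_is_open(effective_max_rounds, queue_size, previous_dispatches):
    """True iff all three conditions hold, i.e. Round 2 may run."""
    reason = round2_skipped_reason(effective_max_rounds, queue_size, previous_dispatches)
    return reason == "not-skipped"
```

Returning the reason string rather than a bare boolean keeps the mandatory `round2SkippedReason` field and the gate decision derived from a single place.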
138
+
139
+ #### Worker failure handling in reverify (BLOCKING)
140
+
141
+ A reverify dispatch that returns a **terminal non-result** (`timeout`, `error`, no result file, or the wrapper records `cli-failure`) MUST NOT be aggregated as `DISAGREE`. Misclassifying a worker failure as DISAGREE biases the queue toward `contested`/`worker-unique` and produces meaningless final classifications.
142
+
143
+ Rules:
144
+
145
+ 1. For each affected finding, append a `votes[W].verdict = "verification-error"` entry instead of `disagree`, plus the wrapper's captured exit reason in `votes[W].explanation`.
146
+ 2. Record one event per failed dispatch via `python3 scripts/okstra-error-log.py append-observed --error-type cli-failure --agent <worker> ...` (the worker wrapper does this for Codex/Gemini; for Claude worker timeouts the lead does it).
147
+ 3. Add an entry to the round's `skippedWorkers[]` with `{worker: <W>, reason: "dispatch-non-result", terminalStatus: <timeout|error|not-run>}`.
148
+ 4. If at least one dispatch was issued AND all reverify dispatches in a round terminate as non-result (mirroring the pseudocode's `len(dispatches) > 0` guard), the round is treated as gate-closed: write `round2SkippedReason: "all-reverify-non-result"` (even if the round in question is round 1 — i.e. round 2 never runs because round 1 produced no usable votes), record one `contract-violation` event per non-result dispatch, and exit the WHILE loop.
149
+ 5. Section 6 (Specialization Lens) of a worker output is OUT of convergence scope per "Convergence scope" above — its absence is NEVER a `verification-error`.
150
+
151
+ The final classifier (`FOR each finding F still in queue` block) treats `verification-error` as "no usable vote" — it counts neither toward AGREE nor toward DISAGREE.
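Rules 1 and 3 above might look like the following in Python. This is a sketch, not the lead's actual bookkeeping: `record_dispatch_failure` and the per-finding return shape are hypothetical, and the error-log call from rule 2 is deliberately left out.

```python
NON_RESULT = {"timeout", "error", "not-run"}  # terminal non-result statuses


def record_dispatch_failure(dispatch, finding_ids, exit_reason):
    """Build the vote entries and skippedWorkers entry for a failed dispatch.

    Returns (votes_by_finding, skipped_entry). The vote is recorded as
    "verification-error", never as "disagree".
    """
    if dispatch["status"] not in NON_RESULT:
        raise ValueError("dispatch completed; aggregate its real verdicts instead")
    votes_by_finding = {
        fid: {"verdict": "verification-error", "explanation": exit_reason}
        for fid in finding_ids
    }
    skipped_entry = {
        "worker": dispatch["worker"],
        "reason": "dispatch-non-result",
        "terminalStatus": dispatch["status"],
    }
    return votes_by_finding, skipped_entry
```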
78
152
 
79
153
  ### Convergence Test
80
154
 
81
- - If the validation queue is empty → Convergence complete (`converged`)
82
- - Upon reaching the maximum number of rounds → Apply final classification to remaining unresolved findings:
83
- - Majority agreement → `partial-consensus`
155
+ - If the verification queue is empty at the end of any round → Convergence complete (`finalState: "converged"`), remaining rounds are not executed
156
+ - Upon completing the **last executed round** (where round index == `effectiveMaxRounds`, OR where Round 2 was suppressed per the Round 2 gate above) → Apply final classification to remaining queue items:
157
+ - Majority agreement across executed rounds → `partial-consensus`
84
158
  - Otherwise → `contested`
159
+ - The final classification step never runs while the queue is still being re-verified — confirmed items always exit the queue first.
85
160
 
86
161
  ## Verification Mode
87
162
 
@@ -229,13 +304,16 @@ For each finding:
229
304
 
230
305
  Save it to `runs/<task-type>/state/convergence-<task-type>-<seq>.json`.
231
306
 
307
+ Schema version `1.1` extends `1.0` (legacy fields kept as aliases for backward-compat with already-shipped reports):
308
+
232
309
  ```json
233
310
  {
234
- "schemaVersion": "1.0",
311
+ "schemaVersion": "1.1",
235
312
  "taskKey": "<task-key>",
236
313
  "config": {
237
314
  "enabled": true,
238
315
  "maxRounds": 2,
316
+ "effectiveMaxRounds": 2,
239
317
  "verificationMode": "lightweight"
240
318
  },
241
319
  "findings": [
@@ -243,36 +321,52 @@ Save it to `runs/<task-type>/state/convergence-<task-type>-<seq>.json`.
243
321
  "findingId": "F-001",
244
322
  "summary": "<one-line summary>",
245
323
  "category": "<bug|risk|missing|observation|...>",
246
- "originWorker": "<worker-id>",
324
+ "ticketIds": ["TICKET-123"],
325
+ "originWorker": "claude-worker",
247
326
  "originEvidence": "<evidence text>",
248
327
  "classification": "full-consensus",
249
328
  "rounds": [
250
329
  {
251
330
  "round": 1,
252
331
  "votes": {
253
- "<worker-id>": {
254
- "verdict": "agree",
255
- "explanation": "<brief>"
256
- }
332
+ "codex-worker": { "verdict": "agree", "explanation": "<brief>" },
333
+ "gemini-worker": { "verdict": "supplement", "explanation": "<brief>" }
257
334
  }
258
335
  }
259
336
  ],
260
- "consensusWorkers": ["worker-a", "worker-b", "worker-c"],
337
+ "consensusWorkers": ["claude-worker", "codex-worker", "gemini-worker"],
261
338
  "dissentingWorkers": []
262
339
  }
263
340
  ],
264
341
  "roundHistory": [
265
342
  {
266
343
  "round": 1,
267
- "verificationsRequested": 4,
268
- "verificationsCompleted": 4,
269
- "newConsensus": 2,
270
- "remainingInQueue": 1,
271
- "earlyExit": false
344
+ "inputQueueSize": 3,
345
+ "resolvedCount": 3,
346
+ "carriedForwardCount": 0,
347
+ "dispatches": [
348
+ { "worker": "codex-worker", "status": "completed", "durationMs": 184221 },
349
+ { "worker": "gemini-worker", "status": "completed", "durationMs": 201337 }
350
+ ],
351
+ "skippedWorkers": [
352
+ { "worker": "claude-worker", "reason": "no items to verify" }
353
+ ],
354
+ "verificationsRequested": 2,
355
+ "verificationsCompleted": 2,
356
+ "newConsensus": 3,
357
+ "remainingInQueue": 0,
358
+ "earlyExit": true
272
359
  }
273
360
  ],
361
+ "round2SkippedReason": "queue-empty",
274
362
  "finalState": "converged",
275
363
  "totalRounds": 1,
364
+ "finalClassificationCounts": {
365
+ "fullConsensus": 5,
366
+ "partialConsensus": 1,
367
+ "contested": 0,
368
+ "workerUnique": 1
369
+ },
276
370
  "summary": {
277
371
  "fullConsensus": 5,
278
372
  "partialConsensus": 1,
@@ -282,6 +376,24 @@ Save it to `runs/<task-type>/state/convergence-<task-type>-<seq>.json`.
282
376
  }
283
377
  ```
284
378
 
379
+ > The example above shows an abbreviated artifact: the `findings[]` array contains only `F-001` even though `finalClassificationCounts` totals 7 — a real artifact has one `findings[]` entry per finding. The example uses a clean one-round queue-drained run for clarity; runs that hit Round 2 add a second `roundHistory[]` entry with the same shape.
380
+
381
+ Schema rules:
382
+
383
+ - `schemaVersion`: literal string `"1.1"` for new runs. Readers MUST accept `"1.0"` for historical artifacts and treat any missing v1.1 field as `null`.
384
+ - `config.effectiveMaxRounds`: the integer the lead actually used after resolving the phase-aware default (`1` for `requirements-discovery`, `2` otherwise). MUST equal `config.maxRounds` when the manifest explicitly set it.
385
+ - `findings[].ticketIds`: array of ticket keys from Phase 4 grouping (parsed per the Round 0 step 5 rule). MAY be empty when the discovering worker tagged the finding `unknown`.
386
+ - `roundHistory[].inputQueueSize`: queue size at the start of this round.
387
+ - `roundHistory[].resolvedCount`: number of findings that exited the queue this round (sum of full+partial+worker-unique classifications produced this round).
388
+ - `roundHistory[].carriedForwardCount`: queue size at the END of this round (must equal `inputQueueSize - resolvedCount` when there are no in-round queue insertions; in-round insertions are forbidden).
389
+ - `roundHistory[].dispatches[]`: one entry per worker that was actually dispatched in this round. Each entry is `{worker, status, durationMs}`. `status ∈ {completed, timeout, error, not-run}`. `durationMs` is integer milliseconds and is always present, even for terminal-non-result dispatches (use the elapsed time before the wrapper gave up).
390
+ - `roundHistory[].skippedWorkers[]`: per-worker `{worker, reason}` for workers with no items to verify OR with a non-result dispatch.
391
+ - `roundHistory[].verificationsRequested|verificationsCompleted|newConsensus|remainingInQueue|earlyExit`: legacy v1.0 aliases. New runs SHOULD populate them so existing parsers keep working: `verificationsRequested == len(dispatches)`, `verificationsCompleted == len([d for d in dispatches if d.status == "completed"])`, `newConsensus == resolvedCount`, `remainingInQueue == carriedForwardCount`, `earlyExit == (round < effectiveMaxRounds AND carriedForwardCount == 0)`.
392
+ - `round2SkippedReason`: literal enum `queue-empty | max-rounds-1 | all-reverify-non-result | not-skipped`. Always present. Use `"not-skipped"` when Round 2 actually ran. Use `"max-rounds-1"` when `effectiveMaxRounds == 1` (Round 2 was never attempted). Use `"queue-empty"` when Round 1 fully drained the queue. Use `"all-reverify-non-result"` when all Round 1 dispatches terminated as non-result.
393
+ - `finalClassificationCounts`: post-loop counts. New required field — must equal `summary` 1:1. `summary` is retained as the v1.0 alias.
394
+ - `finalState ∈ {converged, max-rounds-reached, aborted-non-result}`. Assigned by the lead at WHILE-loop exit: `converged` when the queue is empty at the end of any round; `max-rounds-reached` when the loop exits because `roundIndex == effectiveMaxRounds` with the queue still non-empty; `aborted-non-result` when the loop exits via the Worker-failure BREAK ("Worker failure handling in reverify" rule 4). `aborted-non-result` is the new v1.1 value.
395
+ - `totalRounds`: count of rounds actually executed (not `effectiveMaxRounds`). May be `0` when Round 0 produced no queue items (all findings reached consensus during grouping).
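Several of the schema rules above are mechanical enough to check in code. The sketch below derives the legacy v1.0 aliases from the v1.1 fields and flags two rule violations; `legacy_aliases` and `check_round_entry` are hypothetical helper names, and reading entries as plain dictionaries is an assumption.

```python
ALLOWED_STATUS = {"completed", "timeout", "error", "not-run"}


def legacy_aliases(entry, effective_max_rounds):
    """Derive the v1.0 alias fields for one roundHistory[] entry."""
    dispatches = entry["dispatches"]
    return {
        "verificationsRequested": len(dispatches),
        "verificationsCompleted": sum(1 for d in dispatches if d["status"] == "completed"),
        "newConsensus": entry["resolvedCount"],
        "remainingInQueue": entry["carriedForwardCount"],
        "earlyExit": entry["round"] < effective_max_rounds
        and entry["carriedForwardCount"] == 0,
    }


def check_round_entry(entry):
    """Return a list of human-readable violations (empty when the entry is consistent)."""
    problems = []
    expected = entry["inputQueueSize"] - entry["resolvedCount"]
    if entry["carriedForwardCount"] != expected:
        problems.append("carriedForwardCount != inputQueueSize - resolvedCount")
    for d in entry["dispatches"]:
        if d.get("status") not in ALLOWED_STATUS:
            problems.append(f"{d.get('worker')}: unknown status {d.get('status')!r}")
        if not isinstance(d.get("durationMs"), int):
            problems.append(f"{d.get('worker')}: durationMs must be an integer")
    return problems
```

Applied to the round-1 entry of the example artifact, the derived aliases match the values shown there (2 requested, 2 completed, 3 new consensus, 0 remaining, early exit).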
396
+
285
397
  ## Output
286
398
 
287
399
  Information to be passed to Phase 6 after executing this skill:
@@ -46,7 +46,7 @@ The prompt MUST include, in this order at the top:
46
46
  6. `**Model:** Report writer worker, <modelExecutionValue>` (resolved per Phase 5.5 anchor-header rules)
47
47
  7. The full `[Required reading]` clause (see [okstra-team-contract](../okstra-team-contract/SKILL.md)) including `final-report-template.md`.
48
48
  8. The verbatim `## Available MCP Servers` block from the task brief, if present.
49
- 9. The convergence classifications (Full/Partial/Contested/Worker-Unique) and pointers to all worker result files under `worker-results/`.
49
+ 9. The convergence classifications (Full/Partial/Contested/Worker-Unique), the round history table (`roundHistory[]`), the `round2SkippedReason` value, and pointers to all worker result files under `worker-results/`. The report-writer worker must reproduce a Round History sub-table in Section 1 of the final report so the reader can see which rounds executed, queue sizes, and why Round 2 was (or was not) skipped.
50
50
  10. For implementation-planning runs: a literal block listing the 8 required English section headings the validator scans for (`Option Candidates`, `Trade-off`, `Recommended Option`, `Stepwise Execution Order`, `Dependency`, `Validation Checklist`, `Rollback`, `User Approval Request`). The writer must use these exact substrings as section headings (Korean translation in parentheses is allowed).
51
51
  11. An explicit instruction: `You are the author of TWO files: (a) the final-report file at <Result Path>, (b) the worker-results file at <Worker Result Path>. Write both directly using your Write tool. Do not return the report inline. The validator fails the run when (b) is missing.`
52
52
 
@@ -199,7 +199,7 @@ The final-report template `okstra-final-report.template.md` Section 4.5 already
199
199
 
200
200
  ### Release-handoff section contract (release-handoff runs only)
201
201
 
202
- When the run's `task-type` is `release-handoff`, the final report MUST include Section `## 4.6 Release Handoff Deliverables` with all seven sub-sections (`4.6.1` Source Verification Report, `4.6.2` Feature Branch & Working-Tree State, `4.6.3` User Selections, `4.6.4` Executed Commands, `4.6.5` Commit List, `4.6.6` Pull Request Outcome, `4.6.7` Routing Recommendation). Every entry is dictated by the lead's recorded git/gh command log and the user's verbatim answers to the H1/H2/H3 menu prompts. If the user picked `skip` (H1) or `cancel` (H3), keep 4.6.3 populated but leave 4.6.4–4.6.6 explicitly empty per the template's empty-state lines.
202
+ When the run's `task-type` is `release-handoff`, the final report MUST include Section `## 4.6 Release Handoff Deliverables` with all seven sub-sections (`4.6.1` Source Verification Report, `4.6.2` Feature Branch & Working-Tree State, `4.6.3` User Selections, `4.6.4` Executed Commands, `4.6.5` Commit List, `4.6.6` Pull Request Outcome, `4.6.7` Routing Recommendation). Every entry is dictated by the lead's recorded git/gh command log and the user's verbatim answers to the H1/H2/H3 menu prompts. H1 choices are `local only`, `push + PR`, or `skip`; release-handoff records existing implementation commits and MUST NOT create new commits. If the user picked `skip` (H1) or `cancel` (H3), keep 4.6.3 populated but leave 4.6.4–4.6.6 explicitly empty per the template's empty-state lines.
203
203
 
204
204
  **Single-lead authorship (release-handoff only):** release-handoff has no worker roster (no `Report writer worker`, no `Claude worker` drafter). The Claude lead authors the final-report file directly — there is no `Report writer worker` dispatch to perform in Phase 6, no resume-safe dispatch concern, and no mandatory worker-results file for a report-writer role. The rest of this skill's dispatch / resume / fallback machinery applies ONLY when `Report writer worker` is in the roster (i.e. every task-type other than `release-handoff`).
205
205
 
@@ -226,12 +226,13 @@ Section numbering matches `okstra-final-report.template.md`. Section 0 is the ca
226
226
  0. **Clarification Response Carried In** - if `{{CLARIFICATION_RESPONSE_RELATIVE_PATH}}` is non-empty, read `instruction-set/clarification-response.md`, reconcile every prior `Q*` row, and record the outcome (`resolved`/`obsolete`) plus the new evidence in this section before drafting the verdict
227
227
  1. **Problem or Verification Summary** - Key summary based on the brief and data (3–5 bullet points)
228
228
  2. **Cross Verification Results** (Use 4 categories when convergence is enabled, per `okstra-convergence`)
229
+ - Round History sub-table (convergence-enabled runs only): one row per executed round with columns `Round | inputQueueSize | resolvedCount | carriedForwardCount | dispatches (worker:status:durationMs) | skippedWorkers (worker:reason)`. Add a one-line note immediately under the table with `round2SkippedReason: <value>` (always present, even when `"not-skipped"`). Pull all values verbatim from `convergence-<task-type>-<seq>.json`.
229
230
  - Full Consensus: Findings agreed upon by all workers
230
231
  - Partial Consensus: Agreed upon by a majority of workers; dissenting opinions are specified
231
- - Contested: No consensus after max rounds; each worker’s position specified
232
+ - Contested: No consensus after the last executed round; each worker’s position specified. Empty contested list is shown as the literal line `- 합의 미달 항목 없음.`
232
233
  - Worker-Unique: Verified only by the discoverer; verification history specified
233
- - In runs with convergence disabled, maintain the existing Consensus/Differences format
234
- 3. **Final Verdict** - Conclusion based on comprehensive evidence; direction provided
234
+ - In runs with convergence disabled, maintain the existing Consensus/Differences format and omit the Round History sub-table.
235
+ 3. **Final Verdict** - Conclusion based on comprehensive evidence; direction provided. For `final-verification`, include a `Verdict Token` field whose value is exactly `accepted`, `conditional-accept`, or `blocked`; `release-handoff` uses that field as its entry gate.
235
236
  4. **Evidence and Detailed Analysis**
236
237
  - Key Evidence: File path, line number, actual evidence
237
238
  - If explicit expected values are present in `reference-expectations.md`, specify whether they match or differ from the expected values in config files / deployment manifests
@@ -257,7 +258,8 @@ Section numbering matches `okstra-final-report.template.md`. Section 0 is the ca
257
258
  - Write the actual analysis text instead of a meta-description
258
259
  - Do not make unfounded assertions
259
260
  - Include findings from all four categories. Do not omit "contested" or "worker-unique" findings
260
- - Include the convergence round history and a summary of votes by worker for each finding
261
+ - Include the convergence round history sub-table (Section 1) so the reader can audit which rounds executed and what `round2SkippedReason` indicates (e.g. `"not-skipped"` when Round 2 ran, or one of the three skip reasons). Pull values verbatim from `convergence-<task-type>-<seq>.json`; do NOT recompute.
262
+ - For each finding, include a brief summary of votes per worker across executed rounds. `verification-error` votes are listed as such — never as `DISAGREE`.
261
263
  - The report writer worker does not participate in the re-verification vote. It is responsible only for drafting the final report
262
264
 
263
265
  ## Artifact Persistence Checklist
@@ -1,3 +1,15 @@
1
+ ---
2
+ title: OKSTRA PRD Brief - {{TASK_KEY}}
3
+ id: {{FM_ID}}
4
+ tags: {{FM_TAGS}}
5
+ status: new
6
+ aliases: {{FM_ALIASES}}
7
+ date: {{TASK_DATE}}
8
+ task-id: "{{TASK_ID}}"
9
+ task-group: "{{TASK_GROUP}}"
10
+ project-id: "{{PROJECT_ID}}"
11
+ ---
12
+
1
13
  # OKSTRA Task Brief
2
14
 
3
15
  <!--
@@ -1,3 +1,15 @@
1
+ ---
2
+ title: OKSTRA Task Index - {{TASK_KEY}}
3
+ id: {{FM_ID}}
4
+ tags: {{FM_TAGS}}
5
+ status: "{{CURRENT_TASK_STATUS}}"
6
+ aliases: {{FM_ALIASES}}
7
+ date: {{TASK_DATE}}
8
+ task-id: "{{TASK_ID}}"
9
+ task-group: "{{TASK_GROUP}}"
10
+ project-id: "{{PROJECT_ID}}"
11
+ ---
12
+
1
13
  # OKSTRA Task Summary
2
14
 
3
15
  ## Current Snapshot
@@ -1,3 +1,15 @@
1
+ ---
2
+ title: OKSTRA Error Analysis Input - {{TASK_KEY}}
3
+ id: {{FM_ID}}
4
+ tags: {{FM_TAGS}}
5
+ status: ready-for-agent
6
+ aliases: {{FM_ALIASES}}
7
+ date: {{TASK_DATE}}
8
+ task-id: "{{TASK_ID}}"
9
+ task-group: "{{TASK_GROUP}}"
10
+ project-id: "{{PROJECT_ID}}"
11
+ ---
12
+
1
13
  # OKSTRA Error Analysis Input
2
14
 
3
15
  ## Identity
@@ -1,3 +1,15 @@
1
+ ---
2
+ title: OKSTRA Final Report - {{TASK_KEY}}
3
+ id: {{FM_ID}}
4
+ tags: {{FM_TAGS}}
5
+ status: in-progress
6
+ aliases: {{FM_ALIASES}}
7
+ date: {{TASK_DATE}}
8
+ task-id: "{{TASK_ID}}"
9
+ task-group: "{{TASK_GROUP}}"
10
+ project-id: "{{PROJECT_ID}}"
11
+ ---
12
+
1
13
  # {{TASK_KEY}} - Multi-Agent Cross Verification Final Report
2
14
  - Created at: {{RUN_TIMESTAMP_ISO}}
3
15
  - Task Key: {{TASK_KEY}}
@@ -71,6 +83,19 @@
71
83
  > Processed tokens = input + output + cache_creation + cache_read (raw). Converted tokens = cache_read×0.1 + cache_creation×1.25 + output×5 + input (input-equivalent). Cost is an estimate based on published prices.
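The two token formulas in the note above can be worked through with a small Python sketch. The function names and the sample counts are illustrative assumptions; only the weights (0.1, 1.25, 5, 1) come from the template.

```python
def processed_tokens(input_tokens, output_tokens, cache_creation, cache_read):
    """Raw total: every token counted once."""
    return input_tokens + output_tokens + cache_creation + cache_read


def converted_tokens(input_tokens, output_tokens, cache_creation, cache_read):
    """Input-equivalent total using the weights from the template note."""
    return (
        cache_read * 0.1
        + cache_creation * 1.25
        + output_tokens * 5
        + input_tokens
    )


# Hypothetical usage numbers, chosen only to make the arithmetic easy to follow.
raw = processed_tokens(1000, 200, 400, 10000)    # 1000 + 200 + 400 + 10000
equiv = converted_tokens(1000, 200, 400, 10000)  # roughly 1000 + 500 + 1000 + 1000
```

With these sample counts the raw total is dominated by cache reads, while the converted total weights the 200 output tokens as heavily as the 10000 cached ones.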
72
84
 
73
85
  ## 1. Cross Verification Results
86
+
87
+ ### 1.0 Round History (convergence-enabled runs only)
88
+
89
+ Copy the values from `state/convergence-<task-type>-<seq>.json` verbatim. In runs with convergence disabled, delete this entire section.
90
+
91
+ | Round | inputQueueSize | resolvedCount | carriedForwardCount | dispatches (worker:status:durationMs) | skippedWorkers (worker:reason) |
92
+ |-------|----------------|---------------|----------------------|----------------------------------------|---------------------------------|
93
+ | 1 | 3 | 2 | 1 | codex-worker:completed:184221, gemini-worker:completed:201337 | claude-worker:no-items |
94
+ | 2 | 1 | 1 | 0 | claude-worker:completed:92110 | -- |
95
+
96
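+ - `round2SkippedReason`: `not-skipped` ← the value is one of `queue-empty | max-rounds-1 | all-reverify-non-result | not-skipped`.
97
+ - If the number of executed rounds is 0 (every finding went straight to full-consensus in Round 0), write a single line instead of the table: `- Round 0 grouping에서 모든 finding이 합의되어 재검증 라운드가 실행되지 않았습니다.` (in English: "All findings reached consensus in Round 0 grouping, so no re-verification round was executed.")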
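Because the Round History values are meant to be copied verbatim from the convergence state file, the table can be generated rather than typed. The following Python sketch shows one way to do that; `render_round_history` is a hypothetical helper, and using `--` for empty cells follows the sample row in the template rather than any stated rule.

```python
ZERO_ROUNDS_LINE = "- Round 0 grouping에서 모든 finding이 합의되어 재검증 라운드가 실행되지 않았습니다."

HEADER = (
    "| Round | inputQueueSize | resolvedCount | carriedForwardCount "
    "| dispatches (worker:status:durationMs) | skippedWorkers (worker:reason) |"
)


def render_round_history(round_history, round2_skipped_reason):
    """Render the 1.0 Round History block from roundHistory[] entries."""
    if not round_history:
        lines = [ZERO_ROUNDS_LINE]  # no re-verification round ran
    else:
        lines = [HEADER, "|---|---|---|---|---|---|"]
        for e in round_history:
            dispatches = ", ".join(
                f"{d['worker']}:{d['status']}:{d['durationMs']}" for d in e["dispatches"]
            ) or "--"
            skipped = ", ".join(
                f"{s['worker']}:{s['reason']}" for s in e["skippedWorkers"]
            ) or "--"
            lines.append(
                f"| {e['round']} | {e['inputQueueSize']} | {e['resolvedCount']} "
                f"| {e['carriedForwardCount']} | {dispatches} | {skipped} |"
            )
    lines.append("")
    lines.append(f"- `round2SkippedReason`: `{round2_skipped_reason}`")
    return "\n".join(lines)
```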
98
+
74
99
  ### 1.1 Consensus
75
100
 
76
101
  | ID | Ticket ID | Statement | Supporting workers | Evidence (path:line / log / worker report) |
@@ -89,11 +114,12 @@
89
114
 
90
115
  ## 2. Final Verdict
91
116
 
92
- State the final conclusion and recommended direction in a single table. `Direction` is one of: `continue-investigation`, `begin-implementation`, `approve`, `reject`, `hold`.
117
+ State the final conclusion and recommended direction in a single table. `Direction` is one of: `continue-investigation`, `begin-implementation`, `approve`, `reject`, `hold`. When `task-type` is `final-verification`, the `Verdict Token` value must be exactly one of `accepted` / `conditional-accept` / `blocked`, and `release-handoff` uses this value as its entry gate. For every other task-type, write `not-applicable` in `Verdict Token`.
93
118
 
94
119
  | Item | Value |
95
120
  |------|----|
96
121
  | Final Conclusion | <one-line conclusion> |
122
+ | Verdict Token | `<accepted / conditional-accept / blocked / not-applicable>` |
97
123
  | Direction | `<continue-investigation / begin-implementation / approve / reject / hold>` |
98
124
  | Evidence summary | <comma-separated row IDs from this report, e.g. `1.1`, `3.1`> |
99
125
  | Next step | <whether this continues into Section 6 or Section 7> |
@@ -168,10 +194,10 @@
168
194
 
169
195
  ### 4.5.5 Dependency / Migration Risk
170
196
 
171
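- Summarize ordering constraints, data backfills, feature-flag preconditions, coordination, and similar items in a table. If none apply, write the single line `- 의존성·마이그레이션 위험 없음.`
197
+ Summarize ordering constraints, data backfills, feature-flag preconditions, repo-internal sequencing, and similar items in a table. Under the common authority rules, external approvals, permission checks, and coordination with vendors or external teams are not added as risk or schedule items. If none apply, write the single line `- 의존성·마이그레이션 위험 없음.`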
172
198
 
173
- | ID | Kind (order / backfill / flag-precondition / coordination / other) | Item | Impact | Mitigation / Prerequisite work |
174
- |----|--------------------------------------------------------------------|------|------|------------------|
199
+ | ID | Kind (order / backfill / flag-precondition / repo-sequencing / other) | Item | Impact | Mitigation / Prerequisite work |
200
+ |----|------------------------------------------------------------------------|------|------|------------------|
175
201
  | DM-001 | <kind> | <one-line summary> | <scope of impact> | <mitigation plan> |
176
202
 
177
203
  ### 4.5.6 Validation Checklist
@@ -212,9 +238,9 @@ pre-planning에서 발견된 모호점을 표로 남깁니다. 사용자가 승
212
238
 
213
239
  ### 4.6.1 Source Verification Report (citing the preceding final-verification)
214
240
  - Path (project-relative): `<runs/final-verification/.../reports/final-report-final-verification-<seq>.md>`
215
- - Quoted final verdict line (must contain exactly the `accepted` token):
241
+ - Quoted `Verdict Token` row from that report's `## 2. Final Verdict` table (the value must be exactly `accepted`):
216
242
  > <verbatim quote>
217
- - If the original verdict is not `accepted`, this run **should never have been executed**. Treat it as contract-violated in the self-review step and route back to `final-verification`.
243
+ - If the original `Verdict Token` value is not `accepted`, this run **should never have been executed**. Treat it as contract-violated in the self-review step and route back to `final-verification`.
218
244
 
219
245
  ### 4.6.2 Feature Branch & Working-Tree State (at run start)
220
246
  - Feature branch (`git rev-parse --abbrev-ref HEAD`): `<branch-name>`
@@ -227,9 +253,9 @@ pre-planning에서 발견된 모호점을 표로 남깁니다. 사용자가 승
227
253
  ### 4.6.3 User Selections (record of menu responses)
228
254
  | Question ID | Question | User response (verbatim) | Available choices |
229
255
  |---------|-----------|--------------------|--------------------|
230
- | H1 | Which action should be run? | <`commit only` / `commit + PR` / `skip`> | `commit only` / `commit + PR` / `skip` |
231
- | H2 | Choose the PR base branch. (Asked only when H1=`commit + PR`) | <`staging` / `preprod` / `prod` / `main` / `dev` / branch name entered by the user> | `staging` / `preprod` / `prod` / `main` / `dev` / free-form input |
232
- | H3 | How should the commit message / PR body draft written by the worker be handled? | <`use as-is` / `edit then proceed` / `cancel`> | `use as-is` / `edit then proceed` / `cancel` |
256
+ | H1 | Which action should be run? | <`local only` / `push + PR` / `skip`> | `local only` / `push + PR` / `skip` |
257
+ | H2 | Choose the PR base branch. (Asked only when H1=`push + PR`) | <`staging` / `preprod` / `prod` / `main` / `dev` / branch name entered by the user> | `staging` / `preprod` / `prod` / `main` / `dev` / free-form input |
258
+ | H3 | How should the PR title / PR body draft written by the lead be handled? | <`use as-is` / `edit then proceed` / `cancel`> | `use as-is` / `edit then proceed` / `cancel` |
233
259
 
234
260
  If H1 is `skip` or H3 is `cancel`, fill sections 4.6.4 to 4.6.6 that follow with empty results (no mutating command is run) and fill in only the 4.6.7 routing.
235
261
 
@@ -242,13 +268,14 @@ H1 이 `skip` 이거나 H3 가 `cancel` 인 경우, 본 섹션 다음의 4.6.4 ~
242
268
  | 1 | `<e.g. git add path/to/file.py>` | `0` | `<summary>` |
243
269
 
244
270
  ### 4.6.5 Commit List (commits created)
271
+ - Record the commit range already created in the `implementation` phase (`git log <base>..HEAD`). release-handoff does not create new commits.
245
272
  - Record each commit's short SHA / full SHA / subject / list of affected files, one entry per commit.
246
- - If there were no staged changes and no commit was created, write only the following line.
247
- > `- No commit was produced (working tree had no staged changes).`
273
+ - If the commit range is empty, release-handoff must not run. Write the following line and route back to `implementation`.
274
+ > `- No implementation commits found; release-handoff is blocked.`
248
275
 
249
276
  ### 4.6.6 Pull Request Outcome
250
277
  - Write one line in exactly one of the following four formats.
251
- - `- No PR action requested.` (when H1=`commit only` or `skip`)
278
+ - `- No PR action requested.` (when H1=`local only` or `skip`)
252
279
  - `- PR created: <url>` + title + base branch
253
280
  - `- PR reused: <url>` (when an open PR for the same head already existed at run start, so `gh pr create` was skipped)
254
281
  - `- PR creation skipped: <reason>` (when H3=`cancel`, or when the user asked to stop during push / PR creation. Write the reason as one full sentence)
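The four permitted outcome lines in section 4.6.6 lend themselves to a mechanical check. The sketch below is only an illustration: the line prefixes are taken from the template, but the regular expressions, the labels, and the function name are hypothetical.

```python
import re
from typing import Optional

# One pattern per line shape listed in 4.6.6. The labels are illustrative.
_PR_OUTCOME_PATTERNS = {
    "none": re.compile(r"^- No PR action requested\.$"),
    "created": re.compile(r"^- PR created: \S+"),
    "reused": re.compile(r"^- PR reused: \S+"),
    "skipped": re.compile(r"^- PR creation skipped: .+"),
}


def classify_pr_outcome(line: str) -> Optional[str]:
    """Return the label of the single matching format, or None if the line
    matches none of the four permitted shapes."""
    text = line.strip()
    matches = [name for name, pattern in _PR_OUTCOME_PATTERNS.items()
               if pattern.match(text)]
    return matches[0] if len(matches) == 1 else None
```

A self-review step could reject a report whose 4.6.6 line returns `None` here, which is one way to enforce "exactly one of the four formats".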
@@ -1,3 +1,15 @@
1
+ ---
2
+ title: OKSTRA Final Verification Input - {{TASK_KEY}}
3
+ id: {{FM_ID}}
4
+ tags: {{FM_TAGS}}
5
+ status: ready-for-agent
6
+ aliases: {{FM_ALIASES}}
7
+ date: {{TASK_DATE}}
8
+ task-id: "{{TASK_ID}}"
9
+ task-group: "{{TASK_GROUP}}"
10
+ project-id: "{{PROJECT_ID}}"
11
+ ---
12
+
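The frontmatter added to each input template relies on `{{NAME}}` placeholders such as `{{TASK_KEY}}` and `{{FM_ID}}`. How okstra fills them is not shown in this diff; the following is a minimal sketch of one way such placeholders could be substituted, with a hypothetical function name and a strict policy (fail on any unknown placeholder) chosen purely for illustration.

```python
import re
from typing import Mapping

# Matches placeholders of the form {{UPPER_SNAKE_CASE}} as seen in the templates.
_PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")


def render_placeholders(template: str, values: Mapping[str, str]) -> str:
    """Replace every {{NAME}} with values[NAME]; raise if a value is missing."""
    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder {name}")
        return values[name]

    return _PLACEHOLDER.sub(substitute, template)
```

For example, `render_placeholders('task-id: "{{TASK_ID}}"', {"TASK_ID": "T-1"})` yields `task-id: "T-1"`. Failing loudly on a missing key keeps a half-rendered frontmatter block from reaching an agent, though a real renderer might prefer defaults.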
1
13
  # OKSTRA Final Verification Input
2
14
 
3
15
  ## Identity
@@ -16,6 +28,16 @@
16
28
  - What was supposed to be delivered?
17
29
  - What is the intended acceptance decision?
18
30
 
31
+ ## Source Implementation Report
32
+
33
+ - Path (project-relative) to the originating `implementation` final-report:
34
+ - Worktree / checkout path that final-verification must inspect:
35
+ - Implementation base ref (`<base>` for `git diff --stat <base>..HEAD`):
36
+ - Implementation head SHA expected at verification start:
37
+ - Quoted `Commit list` / `Diff summary` excerpt from the implementation report:
38
+
39
+ > If this section is empty, points to a missing report, or names a checkout that does not match the implementation report's commit list / diff summary, final-verification MUST end with status `blocked` and route back to `implementation` or `implementation-planning`. Do not verify an ambiguous target.
40
+
19
41
  ## Verification Evidence
20
42
 
21
43
  - PR or change summary:
@@ -1,3 +1,15 @@
1
+ ---
2
+ title: OKSTRA Implementation Input - {{TASK_KEY}}
3
+ id: {{FM_ID}}
4
+ tags: {{FM_TAGS}}
5
+ status: ready-for-agent
6
+ aliases: {{FM_ALIASES}}
7
+ date: {{TASK_DATE}}
8
+ task-id: "{{TASK_ID}}"
9
+ task-group: "{{TASK_GROUP}}"
10
+ project-id: "{{PROJECT_ID}}"
11
+ ---
12
+
1
13
  # OKSTRA Implementation Input
2
14
 
3
15
  ## Identity
@@ -1,3 +1,15 @@
1
+ ---
2
+ title: OKSTRA Implementation Planning Input - {{TASK_KEY}}
3
+ id: {{FM_ID}}
4
+ tags: {{FM_TAGS}}
5
+ status: ready-for-agent
6
+ aliases: {{FM_ALIASES}}
7
+ date: {{TASK_DATE}}
8
+ task-id: "{{TASK_ID}}"
9
+ task-group: "{{TASK_GROUP}}"
10
+ project-id: "{{PROJECT_ID}}"
11
+ ---
12
+
1
13
  # OKSTRA Implementation Planning Input
2
14
 
3
15
  ## Identity
@@ -1,3 +1,15 @@
1
+ ---
2
+ title: OKSTRA Quick Input - {{TASK_KEY}}
3
+ id: {{FM_ID}}
4
+ tags: {{FM_TAGS}}
5
+ status: ready-for-agent
6
+ aliases: {{FM_ALIASES}}
7
+ date: {{TASK_DATE}}
8
+ task-id: "{{TASK_ID}}"
9
+ task-group: "{{TASK_GROUP}}"
10
+ project-id: "{{PROJECT_ID}}"
11
+ ---
12
+
1
13
  # OKSTRA Quick Input
2
14
 
3
15
  ## Basic Identity
@@ -1,3 +1,15 @@
1
+ ---
2
+ title: OKSTRA Release Handoff Input - {{TASK_KEY}}
3
+ id: {{FM_ID}}
4
+ tags: {{FM_TAGS}}
5
+ status: ready-for-agent
6
+ aliases: {{FM_ALIASES}}
7
+ date: {{TASK_DATE}}
8
+ task-id: "{{TASK_ID}}"
9
+ task-group: "{{TASK_GROUP}}"
10
+ project-id: "{{PROJECT_ID}}"
11
+ ---
12
+
1
13
  # OKSTRA Release Handoff Input
2
14
 
3
15
  ## Identity
@@ -13,15 +25,16 @@
13
25
  ## Source Verification Report
14
26
 
15
27
  - Path (project-relative) to the `final-verification` final-report whose verdict authorises this handoff:
16
- - Verbatim quoted line from that report's `## 2. Final Verdict` (MUST read exactly `accepted`):
28
+ - Verbatim quoted `Verdict Token` row from that report's `## 2. Final Verdict` table (MUST have value `accepted`):
17
29
  - Run timestamp of that final-verification run:
18
30
 
19
- > If this section is empty or cites a verdict other than `accepted`, the lead MUST end the run immediately and route back to `final-verification`. Release-handoff never operates on `conditional-accept` or `blocked` outcomes.
31
+ > If this section is empty or cites a `Verdict Token` value other than `accepted`, the lead MUST end the run immediately and route back to `final-verification`. Release-handoff never operates on `conditional-accept` or `blocked` outcomes.
20
32
 
21
33
  ## Working-Tree Snapshot (filled at run start)
22
34
 
23
35
  - Feature branch (`git rev-parse --abbrev-ref HEAD`):
24
36
  - `git status --short` output at run start:
37
+ - Existing implementation commits (`git log --oneline <base>..HEAD`):
25
38
  - Existing PR for this head, if any (`gh pr list --head <branch> --state open --json url --jq '.[0].url'`):
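The new `Existing implementation commits` entry captures `git log --oneline <base>..HEAD` output, and the report template blocks release-handoff when that range is empty. A minimal sketch of that handling follows. It works on already-captured text rather than invoking git, and the function names and the returned routing labels are hypothetical.

```python
from typing import List, Tuple

# Literal line taken from the report template's 4.6.5 section.
BLOCKED_LINE = "- No implementation commits found; release-handoff is blocked."


def parse_oneline_log(output: str) -> List[Tuple[str, str]]:
    """Split `git log --oneline` text into (short_sha, subject) pairs."""
    commits = []
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        sha, _, subject = line.partition(" ")
        commits.append((sha, subject))
    return commits


def summarize_commit_range(output: str) -> Tuple[str, List[str]]:
    """Return (next_route, report_lines) for the captured commit range."""
    commits = parse_oneline_log(output)
    if not commits:
        # Empty range: nothing to hand off, so route back to implementation.
        return "implementation", [BLOCKED_LINE]
    return "release-handoff", [f"- {sha} {subject}" for sha, subject in commits]
```

Keeping the parser free of subprocess calls makes it easy to test. The caller would supply the text captured at run start.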
26
39
 
27
40
  ## Candidate PR Base Branches
@@ -30,10 +43,10 @@
30
43
  - Repo-specific preference, if known (e.g. `main` is the integration branch):
31
44
  - Branches that MUST NOT be used as a base in this repo (security / freeze rules):
32
45
 
33
- ## Commit Message Drafter Inputs
46
+ ## PR Draft Inputs
34
47
 
35
- - Commit type convention this repo follows (`release-please` types, plain conventional commits, free-form):
36
- - `git diff <base>..HEAD --stat` (or equivalent change summary) for the drafter to ground its message on:
48
+ - PR title convention this repo follows (`release-please` types, plain conventional commits, free-form):
49
+ - `git log --oneline <base>..HEAD` and `git diff <base>..HEAD --stat` for the lead to ground its PR draft on:
37
50
  - Files known to be part of the prior `implementation` run's approved plan:
38
51
  - Files appearing in the diff that were in the prior run's `Out-of-plan edits` block:
39
52
 
@@ -45,7 +58,7 @@
45
58
 
46
59
  ## User-Selection Defaults (advisory only — the user still chooses interactively)
47
60
 
48
- - Suggested action (Q1): `commit only` | `commit + PR` | `skip`
61
+ - Suggested action (Q1): `local only` | `push + PR` | `skip`
49
62
  - Suggested base (Q2): one of the candidate base branches above
50
63
  - Suggested message handling (Q3): `use as-is` | `edit then proceed`
51
64
 
@@ -59,12 +72,12 @@
59
72
 
60
73
  > The lead MUST NOT extend handoff actions into items listed here. If an excluded item should ship in this PR, edit this section before the run starts — do not silently fold it in.
61
74
 
62
- ## Questions for Drafter Worker
75
+ ## Questions for Lead Drafting
63
76
 
64
- 1. What commit type and scope best describe the cumulative diff?
65
- 2. What single subject line summarises the change in under 72 characters?
77
+ 1. What PR title best describes the cumulative committed diff?
78
+ 2. Which implementation commits should be highlighted in the PR body?
66
79
  3. What changed at a behavioural level (not just file-level) that reviewers need to know?
67
- 4. Which prior commits in this feature branch should be referenced or amended by this commit?
80
+ 4. Which prior commits in this feature branch should be referenced in the PR?
68
81
  5. Does the diff include any change that requires a follow-up PR (migration squash, config split, etc.) — and if so, should that be noted in the PR body's `## Follow-ups` block?
69
82
 
70
83
  ## Conversion Note