@open-agent-toolkit/cli 0.0.43 → 0.0.51

Files changed (63)
  1. package/assets/agents/oat-phase-implementer.md +230 -0
  2. package/assets/agents/oat-reviewer.md +3 -3
  3. package/assets/docs/cli-utilities/configuration.md +15 -3
  4. package/assets/docs/reference/cli-reference.md +17 -14
  5. package/assets/docs/reference/oat-directory-structure.md +17 -17
  6. package/assets/docs/workflows/projects/artifacts.md +34 -0
  7. package/assets/docs/workflows/projects/implementation-execution.md +161 -0
  8. package/assets/docs/workflows/projects/lifecycle.md +22 -29
  9. package/assets/docs/workflows/projects/reviews.md +4 -2
  10. package/assets/docs/workflows/skills/index.md +0 -1
  11. package/assets/public-package-versions.json +4 -4
  12. package/assets/skills/oat-doctor/SKILL.md +3 -3
  13. package/assets/skills/oat-project-implement/SKILL.md +363 -126
  14. package/assets/skills/oat-project-import-plan/SKILL.md +2 -3
  15. package/assets/skills/oat-project-next/SKILL.md +11 -12
  16. package/assets/skills/oat-project-plan/SKILL.md +23 -5
  17. package/assets/skills/oat-project-plan-writing/SKILL.md +2 -2
  18. package/assets/skills/oat-project-progress/SKILL.md +29 -35
  19. package/assets/skills/oat-project-quick-start/SKILL.md +13 -3
  20. package/assets/skills/oat-worktree-bootstrap-auto/SKILL.md +2 -2
  21. package/assets/templates/implementation.md +8 -3
  22. package/assets/templates/plan.md +24 -3
  23. package/assets/templates/state.md +1 -1
  24. package/dist/commands/config/index.d.ts.map +1 -1
  25. package/dist/commands/config/index.js +17 -4
  26. package/dist/commands/init/tools/index.js +1 -1
  27. package/dist/commands/init/tools/shared/skill-manifest.d.ts +2 -2
  28. package/dist/commands/init/tools/shared/skill-manifest.d.ts.map +1 -1
  29. package/dist/commands/init/tools/shared/skill-manifest.js +1 -1
  30. package/dist/commands/project/index.d.ts.map +1 -1
  31. package/dist/commands/project/index.js +3 -1
  32. package/dist/commands/project/set-mode/index.d.ts +0 -6
  33. package/dist/commands/project/set-mode/index.d.ts.map +1 -1
  34. package/dist/commands/project/set-mode/index.js +16 -96
  35. package/dist/commands/project/validate-plan/index.d.ts +3 -0
  36. package/dist/commands/project/validate-plan/index.d.ts.map +1 -0
  37. package/dist/commands/project/validate-plan/index.js +44 -0
  38. package/dist/commands/project/validate-plan/validate-plan.d.ts +20 -0
  39. package/dist/commands/project/validate-plan/validate-plan.d.ts.map +1 -0
  40. package/dist/commands/project/validate-plan/validate-plan.js +77 -0
  41. package/dist/commands/tools/update/index.d.ts +4 -0
  42. package/dist/commands/tools/update/index.d.ts.map +1 -1
  43. package/dist/commands/tools/update/index.js +17 -1
  44. package/dist/commands/tools/update/update-tools.d.ts +1 -0
  45. package/dist/commands/tools/update/update-tools.d.ts.map +1 -1
  46. package/dist/commands/tools/update/update-tools.js +80 -1
  47. package/dist/config/oat-config.d.ts +1 -0
  48. package/dist/config/oat-config.d.ts.map +1 -1
  49. package/dist/config/oat-config.js +3 -0
  50. package/dist/config/resolve.d.ts.map +1 -1
  51. package/dist/config/resolve.js +9 -0
  52. package/package.json +2 -2
  53. package/assets/skills/oat-project-subagent-implement/SKILL.md +0 -549
  54. package/assets/skills/oat-project-subagent-implement/examples/pattern-hill-checkpoint.md +0 -110
  55. package/assets/skills/oat-project-subagent-implement/examples/pattern-parallel-phases.md +0 -118
  56. package/assets/skills/oat-project-subagent-implement/scripts/dispatch.sh +0 -133
  57. package/assets/skills/oat-project-subagent-implement/scripts/reconcile.sh +0 -182
  58. package/assets/skills/oat-project-subagent-implement/scripts/review-gate.sh +0 -187
  59. package/assets/skills/oat-project-subagent-implement/tests/fixtures/sample-plan.md +0 -234
  60. package/assets/skills/oat-project-subagent-implement/tests/test-dry-run.sh +0 -126
  61. package/assets/skills/oat-project-subagent-implement/tests/test-hill-checkpoint.sh +0 -127
  62. package/assets/skills/oat-project-subagent-implement/tests/test-reconcile.sh +0 -254
  63. package/assets/skills/oat-project-subagent-implement/tests/test-review-gate.sh +0 -220
@@ -1,10 +1,11 @@
1
1
  ---
2
2
  name: oat-project-implement
3
- version: 1.3.1
4
- description: Use when plan.md is ready for execution. Implements plan tasks sequentially with TDD discipline and state tracking.
3
+ version: 2.0.5
4
+ description: Use when plan.md is ready for execution. Dispatches phase-level subagents with bounded fix loops; supports plan-declared parallel phase groups with worktree-isolated execution and ordered fan-in.
5
+ argument-hint: '[--retry-limit <N>] [--dry-run]'
5
6
  disable-model-invocation: true
6
7
  user-invocable: true
7
- allowed-tools: Read, Write, Bash(git:*), Glob, Grep, AskUserQuestion
8
+ allowed-tools: Read, Write, Bash(git:*), Glob, Grep, AskUserQuestion, Task
8
9
  ---
9
10
 
10
11
  # Implementation Phase
@@ -29,7 +30,7 @@ Do not enter checkpoint review, final review, revise, or PR-final handoff with d
29
30
 
30
31
  ## Progress Indicators (User-Facing)
31
32
 
32
- When executing this skill, provide lightweight progress feedback so the user can tell whats happening after they confirm.
33
+ When executing this skill, provide lightweight progress feedback so the user can tell what's happening after they confirm.
33
34
 
34
35
  - Print a phase banner once at start using horizontal separators, e.g.:
35
36
 
@@ -39,13 +40,13 @@ When executing this skill, provide lightweight progress feedback so the user can
39
40
 
40
41
  - For each task, announce a compact header before doing work:
41
42
  - `OAT ▸ IMPLEMENT {task_id}: {task_name}`
42
- - Before multi-step bookkeeping work (updating artifacts/state, verification, committing, dashboard refresh), print 2–5 short step indicators, e.g.:
43
+ - Before multi-step "bookkeeping" work (updating artifacts/state, verification, committing, dashboard refresh), print 2–5 short step indicators, e.g.:
43
44
  - `[1/4] Updating implementation.md + state.md…`
44
45
  - `[2/4] Running verification…`
45
46
  - `[3/4] Committing…`
46
47
  - `[4/4] Refreshing dashboard…`
47
48
  - For long-running operations (tests/lint/type-check/build, reviews, large diffs), print a start line and a completion line (duration optional).
48
- - Keep it concise; dont print a line for every shell command.
49
+ - Keep it concise; don't print a line for every shell command.
49
50
 
50
51
  **BLOCKED Activities:**
51
52
 
@@ -97,19 +98,95 @@ PROJECTS_ROOT="${PROJECTS_ROOT%/}"
97
98
 
98
99
  **If `PROJECT_PATH` is valid:** derive `{project-name}` as the directory name (basename of the path).
99
100
 
100
- ### Step 0.5: Execution Mode Redirect Guard
101
+ ### Step 0.5: Capability Detection and Tier Selection
101
102
 
102
- Read execution mode from `"$PROJECT_PATH/state.md"` frontmatter:
103
+ Detect whether native subagent dispatch is available. The detection logic follows the same pattern used by `oat-project-review-provide` but produces a two-tier outcome (no fresh-session tier — this skill runs autonomously and cannot block on user-initiated fresh sessions mid-run).
103
104
 
104
- ```bash
105
- EXEC_MODE=$(grep "^oat_execution_mode:" "$PROJECT_PATH/state.md" 2>/dev/null | awk '{print $2}')
106
- EXEC_MODE="${EXEC_MODE:-single-thread}"
105
+ Detection logic:
106
+
107
+ - If the host is Claude Code, check Task-tool availability with `subagent_type: "oat-phase-implementer"` and `subagent_type: "oat-reviewer"`. Available → Tier 1.
108
+ - If the host is Cursor, use Cursor-native invocation. Available → Tier 1.
109
+ - If the host is Codex multi-agent, verify that `[features] multi_agent = true` is set and check whether `spawn_agent` requires explicit authorization.
110
+ - Codex Tier 1 dispatches for `oat-phase-implementer` and `oat-reviewer` must use self-contained scope packets and fresh context. Do not rely on forked full-thread context when pinning a specialized OAT role.
111
+ - Available without auth → Tier 1.
112
+ - Available with auth required → fail closed. You MUST ask the user once at skill start before selecting Tier 2 or starting implementation work:
113
+
114
+ ```
115
+ This OAT implementation skill normally delegates phase implementation and review to subagents. Authorize subagent delegation for this run?
116
+
117
+ Yes authorizes both oat-phase-implementer and oat-reviewer across every phase in this run.
118
+ ```
119
+
120
+ - Approved → Tier 1.
121
+ - Declined → Tier 2.
122
+
123
+ - If the host does not resolve either agent → Tier 2.
124
+
125
+ **Approval scope rule:** this Tier selection applies to both phase implementation and checkpoint review. Do not infer a mixed mode from conversational emphasis on review checkpoints. If the user has not explicitly approved Tier 1 for the run, stay Tier 2 throughout. Mixed mode is only valid when the user explicitly requests it.
126
+
127
+ **Codex fail-closed rule:** after this skill is invoked, "user did not separately ask for subagents" is not a valid Tier 2 reason. If Codex can spawn agents but requires explicit user authorization, the implementation MUST NOT continue until the delegation question above is answered. Tier 2 is allowed only when:
128
+
129
+ - `user declined delegation`
130
+ - `spawn_agent unavailable`
131
+ - `required agent role unresolved`
132
+
133
+ Report the selected tier to the user:
134
+
135
+ ```
136
+ [preflight] Checking subagent availability…
137
+ → oat-phase-implementer + oat-reviewer: {available | authorization required | not resolved}
138
+ → Selected: Tier {1 | 2} — {Subagents | Inline}
139
+ → Reason: {authorized | available without auth | user declined delegation | spawn_agent unavailable | required agent role unresolved}
107
140
  ```
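The tier decision described in Step 0.5 can be read as a small decision function. The following is a minimal sketch under stated assumptions: the `Probe` record and `select_tier` function are hypothetical names invented for illustration and are not part of the package; only the reason strings come from the skill text.

```python
from dataclasses import dataclass

# Allowed Tier 2 reasons, copied from the skill text.
TIER2_REASONS = {
    "user declined delegation",
    "spawn_agent unavailable",
    "required agent role unresolved",
}


@dataclass
class Probe:
    """Hypothetical result of probing the host for subagent support."""
    can_spawn: bool       # host exposes a subagent mechanism at all
    roles_resolved: bool  # oat-phase-implementer and oat-reviewer both resolve
    auth_required: bool   # host needs explicit user authorization


def select_tier(probe, user_approved=None):
    """Return (tier, reason); raise while authorization is still unanswered."""
    if not probe.can_spawn:
        return 2, "spawn_agent unavailable"
    if not probe.roles_resolved:
        return 2, "required agent role unresolved"
    if not probe.auth_required:
        return 1, "available without auth"
    if user_approved is None:
        # Fail closed: the delegation question must be answered first.
        raise RuntimeError("ask the user to authorize subagent delegation")
    if user_approved:
        return 1, "authorized"
    return 2, "user declined delegation"
```

Raising while the answer is pending mirrors the fail-closed rule: there is no code path that reaches Tier 2 merely because the user never mentioned subagents.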
108
141
 
109
- If `EXEC_MODE` is `subagent-driven`:
142
+ Do not print `[0/N]` for this preflight step. The implementation denominator is not established by capability detection; use the literal `[preflight]` label above.
143
+
144
+ **Hard pre-work guard:** before any code edit, test run, or implementation commit, print the selected tier and reason. If Tier 2 is selected, the reason must be one of the three allowed Tier 2 reasons above. Do not run tests, edit files, or create implementation commits until Step 0.5 has completed and the tier report has been printed.
145
+
146
+ **Tier is locked for the remainder of the run.** Subsequent phase implementation and review dispatches use the same tier. No mid-run re-evaluation or downgrade unless the user explicitly asks to change execution mode.
147
+
148
+ **Recovery if Step 0.5 was skipped:** If implementation work has already started inline before completing Step 0.5, STOP immediately. Preserve any work in progress, complete or revert to a clean task boundary, and re-run Step 0.5 before continuing. Do not silently continue in Tier 2.
149
+
150
+ **Codex authorization example:**
151
+
152
+ ```
153
+ User invokes: $oat-project-implement
154
+ Detected: Codex multi-agent support available; explicit authorization required.
155
+ Expected: ask "This OAT implementation skill normally delegates phase implementation and review to subagents. Authorize subagent delegation for this run?"
156
+ If approved: Selected: Tier 1 — Subagents
157
+ Forbidden: Selected: Tier 2 — Inline because the user did not separately mention subagents.
158
+ ```
159
+
160
+ **Legacy state migration:** If `state.md` contains `oat_execution_mode: subagent-driven`, silently ignore it. On the next bookkeeping write, remove that key. Do not redirect to `oat-project-subagent-implement` — that skill is deprecated.
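As an illustration of the legacy-key removal, here is a minimal sketch that drops `oat_execution_mode` from the frontmatter of a `state.md` string. The function name is hypothetical, and the sketch assumes the frontmatter is delimited by `---` lines at the top of the file; the package's real bookkeeping writer is not shown in this diff.

```python
def strip_legacy_execution_mode(state_md):
    """Remove the deprecated oat_execution_mode key from the frontmatter only."""
    kept = []
    delimiters_seen = 0
    for line in state_md.splitlines(keepends=True):
        if line.strip() == "---":
            delimiters_seen += 1
        elif delimiters_seen == 1 and line.startswith("oat_execution_mode:"):
            continue  # inside frontmatter: drop the deprecated key
        kept.append(line)
    return "".join(kept)
```

Lines in the document body that happen to mention the key are left untouched, since only the frontmatter entry is treated as state.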
161
+
162
+ ### Dry-Run Mode
163
+
164
+ When the skill is invoked with `--dry-run`:
165
+
166
+ 1. Perform Steps 0–2 fully (resolve project, capability detection, read plan, validate metadata, build schedule).
167
+ 2. Skip all phase dispatches, merges, and artifact writes.
168
+ 3. Output the execution plan:
169
+
170
+ ```
171
+ OAT ▸ IMPLEMENT (dry-run)
172
+
173
+ Project: {PROJECT_PATH}
174
+ Tier: {1 | 2}
175
+ Retry: {N}
176
+
177
+ Schedule:
178
+ [1] p01 (sequential)
179
+ [2] p02, p03 (parallel group, worktrees)
180
+ [3] p04 (sequential)
181
+
182
+ Worktrees that would be created:
183
+ - {project-name}/p02
184
+ - {project-name}/p03
185
+
186
+ No commits, no artifact writes.
187
+ ```
110
188
 
111
- - Tell the user: `Execution mode is subagent-driven. Use oat-project-subagent-implement instead.`
112
- - STOP (do not proceed with sequential implementation)
189
+ 4. Exit without modifying any files.
113
190
 
114
191
  ### Step 1: Check Plan Complete
115
192
 
@@ -124,6 +201,33 @@ cat "$PROJECT_PATH/plan.md" | head -10 | grep "oat_status:"
124
201
 
125
202
  **If not complete:** Block and ask user to finish plan first.
126
203
 
204
+ ### Step 1.5: Resumption Detection
205
+
206
+ If `{PROJECT_PATH}/implementation.md` already contains orchestration run entries, we may be resuming an interrupted run.
207
+
208
+ 1. Read `implementation.md` and find the most recent `### Run N` entry.
209
+ 2. Compare its phases-passed / phases-failed / phases-stopped counts against the plan's phase list.
210
+ 3. If there are phases in the plan that are not yet covered by any run entry, those are the resume targets.
211
+ 4. Read `state.md` for `oat_current_task` to cross-check the expected resume point.
212
+ 5. Read `git log` to verify the most recent bookkeeping commit matches the last reported state.
213
+
214
+ **Detected state reconciliation:**
215
+
216
+ - If there is an in-flight phase (implementer committed but no review verdict in implementation.md), re-dispatch the reviewer for that phase's current HEAD.
217
+ - If there are un-cleaned worktrees from a prior parallel group, list them and ask the user whether to resume or clean up:
218
+
219
+ ```
220
+ Found un-cleaned worktrees from a prior run:
221
+ - ../worktrees/{name}/p02 — verdict was: excluded
222
+ - ../worktrees/{name}/p03 — verdict was: pass, not merged
223
+
224
+ Resume (merge pending verdicts into orchestration branch) or clean up?
225
+ ```
226
+
227
+ 6. Once the resume target is identified, continue from that phase with the normal per-phase flow.
228
+
229
+ **On first-ever invocation** (no prior run entries), skip resumption detection and proceed to Step 2.
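The core of resumption detection (find the latest `### Run N` entry, then treat uncovered plan phases as resume targets) can be sketched as follows. Both helper names are hypothetical, and how covered phases are extracted from a run entry is left to the caller because the entry layout is not shown in this diff.

```python
import re

# Matches the "### Run N" headings named in the skill text.
RUN_HEADING = re.compile(r"^### Run (\d+)\b", re.MULTILINE)


def latest_run_number(implementation_md):
    """Return the highest run number found, or None on a first-ever invocation."""
    numbers = [int(n) for n in RUN_HEADING.findall(implementation_md)]
    return max(numbers) if numbers else None


def resume_targets(plan_phases, covered_phases):
    """Return plan phases, in plan order, not covered by any run entry."""
    covered = set(covered_phases)
    return [phase for phase in plan_phases if phase not in covered]
```

A `None` result from `latest_run_number` corresponds to the first-ever case, where resumption detection is skipped.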
230
+
127
231
  ### Step 2: Read Plan Document
128
232
 
129
233
  Read `"$PROJECT_PATH/plan.md"` completely to understand:
@@ -133,6 +237,44 @@ Read `"$PROJECT_PATH/plan.md"` completely to understand:
133
237
  - Verification commands
134
238
  - Commit messages
135
239
 
240
+ ### Step 2.1: Validate Parallelism Metadata
241
+
242
+ Invoke the CLI validator to check plan.md parallelism metadata:
243
+
244
+ ```bash
245
+ oat project validate-plan --project-path "${PROJECT_PATH}"
246
+ ```
247
+
248
+ (If `oat` is not in PATH, use: `pnpm run cli -- project validate-plan --project-path "${PROJECT_PATH}"`)
249
+
250
+ The command validates:
251
+
252
+ - `oat_plan_parallel_groups` is either missing / empty (meaning fully sequential, no check needed) or a nested array of phase ID strings.
253
+ - Every referenced phase ID exists in the plan.
254
+ - No phase ID appears in more than one group.
255
+ - No singleton groups (each group must contain at least 2 phases).
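The four checks listed above can be sketched in a few lines. This is an illustrative approximation only: the authoritative logic lives in the CLI's `validate-plan` command, whose source is not reproduced here, and the function name and error messages below are assumptions.

```python
def validate_parallel_groups(phase_ids, groups):
    """Return a list of error strings; an empty list means the metadata is valid."""
    if not groups:
        return []  # missing or empty: fully sequential, nothing to check
    well_formed = isinstance(groups, list) and all(
        isinstance(group, list) and all(isinstance(p, str) for p in group)
        for group in groups
    )
    if not well_formed:
        return ["oat_plan_parallel_groups must be a nested array of phase ID strings"]

    errors = []
    known = set(phase_ids)
    first_seen = {}  # phase ID -> index of the first group that lists it
    for index, group in enumerate(groups):
        if len(group) < 2:
            errors.append(f"group {index} must contain at least 2 phases")
        for phase in group:
            if phase not in known:
                errors.append(f"unknown phase ID {phase!r} in group {index}")
            if first_seen.setdefault(phase, index) != index:
                errors.append(
                    f"phase {phase!r} appears in groups {first_seen[phase]} and {index}"
                )
    return errors
```

Returning every error at once, rather than stopping at the first, is a design choice of this sketch so that a plan author can fix all problems in one pass.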
256
+
257
+ **Reactions:**
258
+
259
+ - Exit code 0 → validation passed; continue to Step 2.2.
260
+ - Non-zero exit code → STOP immediately. Surface the validator's stderr output to the user. Do not silently fall back to sequential — the plan must be fixed first.
261
+
262
+ The validation contract is enforced by the CLI command and unit-tested there; the skill is just the consumer.
263
+
264
+ ### Step 2.2: Build Execution Schedule
265
+
266
+ From the phase list and the validated parallel groups, build an execution schedule:
267
+
268
+ - Phases not listed in any group form singleton entries (run sequentially).
269
+ - Each parallel group forms a multi-phase entry (run concurrently in worktrees).
270
+ - Schedule entries execute in plan order.
271
+
272
+ Example:
273
+
274
+ - Plan phases: p01, p02, p03, p04, p05
275
+ - `oat_plan_parallel_groups: [["p02", "p03"], ["p04", "p05"]]`
276
+ - Schedule: `[p01]` → `[p02, p03]` (group) → `[p04, p05]` (group)
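The schedule construction in the example above can be sketched as follows. The function name is hypothetical, and the sketch assumes a group is scheduled at the position of its first member in plan order, which matches the example but is not stated as a general rule.

```python
def build_schedule(phase_ids, groups):
    """Return schedule entries: singleton lists run sequentially, longer lists are groups."""
    group_of = {phase: list(group) for group in (groups or []) for phase in group}
    schedule = []
    emitted = set()
    for phase in phase_ids:
        if phase in emitted:
            continue  # already scheduled as part of an earlier group
        entry = group_of.get(phase, [phase])
        schedule.append(entry)
        emitted.update(entry)
    return schedule
```

With the plan from the example, `build_schedule(["p01", "p02", "p03", "p04", "p05"], [["p02", "p03"], ["p04", "p05"]])` yields one sequential entry followed by two group entries.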
277
+
136
278
  ### Step 2.5: Confirm Plan HiLL Checkpoints
137
279
 
138
280
  Read `oat_plan_hill_phases` from `"$PROJECT_PATH/plan.md"` frontmatter when present and validate it.
@@ -191,21 +333,24 @@ When user confirms/changes:
191
333
  - Update `"$PROJECT_PATH/plan.md"` frontmatter `oat_plan_hill_phases` to the confirmed value before executing tasks.
192
334
  - Keep the value stable for the rest of the run unless the user explicitly requests a change.
193
335
 
194
- #### Auto-Review at Checkpoints (Touchpoint A)
336
+ #### Auto-Review at HiLL Checkpoints (Touchpoint A)
195
337
 
196
338
  After checkpoint behavior is confirmed, resolve auto-review preference:
197
339
 
198
- 1. Read `.oat/config.json` `autoReviewAtCheckpoints` (default: `false`)
199
- 2. **If config explicitly `true`:** Skip the prompt. Write `oat_auto_review_at_checkpoints: true` to plan.md frontmatter. Print: "Auto-review at checkpoints: enabled (from config)."
200
- 3. **If config `false` or absent:** Add one question after the checkpoint choice:
340
+ 1. Read `workflow.autoReviewAtHillCheckpoints` via `oat config get workflow.autoReviewAtHillCheckpoints`. This uses local > shared > user resolution and falls back to legacy `.oat/config.json` `autoReviewAtCheckpoints` when the workflow key is unset.
341
+ 2. **If config explicitly `true`:** Skip the prompt. Write `oat_auto_review_at_hill_checkpoints: true` to plan.md frontmatter. Print: "Auto-review at HiLL checkpoints: enabled (from workflow.autoReviewAtHillCheckpoints)."
342
+ 3. **If config explicitly `false`:** Skip the prompt. Write `oat_auto_review_at_hill_checkpoints: false` to plan.md frontmatter. Print: "Auto-review at HiLL checkpoints: disabled (from workflow.autoReviewAtHillCheckpoints)."
343
+ 4. **If config is unset:** Add one question after the checkpoint choice:
201
344
  ```
202
- 4. Auto-review at checkpoints?
203
- - yes: automatically spawn a subagent code review when a checkpoint phase completes
204
- - no (default): manual review triggering (current behavior)
345
+ 4. Auto-review at HiLL checkpoints?
346
+ - yes: automatically run the lifecycle review when a HiLL checkpoint phase completes
347
+ - no (default): manual lifecycle review triggering
205
348
  ```
206
- 4. Write `oat_auto_review_at_checkpoints: true|false` to plan.md frontmatter alongside `oat_plan_hill_phases`.
349
+ 5. Write `oat_auto_review_at_hill_checkpoints: true|false` to plan.md frontmatter alongside `oat_plan_hill_phases`.
350
+
351
+ This setting controls only the extra `oat-project-review-provide` lifecycle review at HiLL checkpoints. It does not control Tier 1 phase gate reviews; Tier 1 always runs `oat-reviewer` after each phase.
207
352
 
208
- **On resume:** If `oat_auto_review_at_checkpoints` is already present in plan.md frontmatter, skip Touchpoint A entirely — do not re-ask, do not re-read config, do not print the auto-review note. The stored value is authoritative.
353
+ **On resume:** If `oat_auto_review_at_hill_checkpoints` is already present in plan.md frontmatter, skip Touchpoint A entirely — do not re-ask, do not re-read config, do not print the auto-review note. The stored value is authoritative. If only legacy `oat_auto_review_at_checkpoints` is present, treat it as authoritative for this run and write the new `oat_auto_review_at_hill_checkpoints` key on the next plan frontmatter update.
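The precedence in Touchpoint A (stored plan value, then legacy plan key, then explicit config, then a prompt) can be sketched as below. The function signature is an assumption made for illustration: `frontmatter` stands for the parsed plan.md frontmatter, `config_value` for the result of `oat config get workflow.autoReviewAtHillCheckpoints` (`None` when unset), and `ask_user` for the interactive question.

```python
NEW_KEY = "oat_auto_review_at_hill_checkpoints"
LEGACY_KEY = "oat_auto_review_at_checkpoints"


def resolve_auto_review(frontmatter, config_value, ask_user):
    """Return (enabled, source) following the precedence described in the skill."""
    if NEW_KEY in frontmatter:
        return frontmatter[NEW_KEY], "plan"  # stored value is authoritative
    if LEGACY_KEY in frontmatter:
        return frontmatter[LEGACY_KEY], "legacy plan key"
    if config_value is not None:
        return config_value, "config"  # explicit true or false skips the prompt
    return ask_user(), "prompt"
```

The caller would still need to write the resolved value back under the new key on the next frontmatter update; that step is outside this sketch.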
209
354
 
210
355
  ### Step 3: Check Implementation State
211
356
 
@@ -268,148 +413,240 @@ Initialize project state so other skills (e.g., `oat-project-progress`) reflect
268
413
  - `oat_current_task: p01-t01`
269
414
  - `oat_project_state_updated: "{ISO 8601 UTC timestamp}"`
270
415
 
271
- ### Step 5: Execute Current Task
416
+ ### Step 5: Per-Phase Execution
272
417
 
273
- For the current task in plan.md:
418
+ For each phase `pNN` in the plan (or each phase in the current parallel group), the orchestrator dispatches phase-level work as follows.
274
419
 
275
- **5a. Announce task:**
420
+ **Tier 1 dispatch (native subagents):**
276
421
 
277
- ```
278
- Starting {task_id}: {Task Name}
279
- Files: {file list}
280
- ```
422
+ 1. Build the Phase Scope block:
281
423
 
282
- **5b. Follow steps exactly:**
424
+ ```
425
+ project: {PROJECT_PATH}
426
+ phase: {pNN}
427
+ mode: implement
428
+ artifact_paths:
429
+ plan: {PROJECT_PATH}/plan.md
430
+ design: {PROJECT_PATH}/design.md
431
+ spec: {PROJECT_PATH}/spec.md
432
+ implementation: {PROJECT_PATH}/implementation.md
433
+ discovery: {PROJECT_PATH}/discovery.md
434
+ commit_convention: {from plan.md header}
435
+ workflow_mode: {from state.md or plan.md frontmatter}
436
+ ```
283
437
 
284
- - Read each step from plan
285
- - Execute as specified
286
- - Run verification commands
438
+ 2. Dispatch `oat-phase-implementer` (Tier 1 via provider-native subagent mechanism) with the Phase Scope block as input.
287
439
 
288
- **5c. Apply TDD discipline:**
440
+ 3. Receive the structured summary (DONE | DONE_WITH_CONCERNS | NEEDS_CONTEXT | BLOCKED).
289
441
 
290
- 1. Write test first (if applicable)
291
- 2. Run tests → expect red
292
- 3. Write implementation
293
- 4. Run tests → expect green
294
- 5. Refactor if needed
442
+ **Tier 2 dispatch (inline fallback):**
295
443
 
296
- **5d. Handle issues:**
444
+ If Tier 2 is selected, do not dispatch. Instead:
297
445
 
298
- - If step unclear ask user
299
- - If verification fails debug and retry
300
- - If blocked mark task as blocked, note reason
446
+ 1. Read `.agents/agents/oat-phase-implementer.md` for the phase-execution process.
447
+ 2. Execute that process yourself against the same Phase Scope.
448
+ 3. Produce an equivalent summary in your own context.
301
449
 
302
- ### Step 6: Commit Task
450
+ #### Handling Implementer Status
303
451
 
304
- After task verification passes:
452
+ - **DONE:** Proceed to phase review (see Per-Phase Review below).
453
+ - **DONE_WITH_CONCERNS:** Read the concerns block. If any concern is correctness-related (bug, wrong behavior, missing requirement), address it before review — re-dispatch implementer with a targeted fix instruction. If concerns are advisory (e.g., "this file is getting large"), note them in `implementation.md` and proceed to review.
454
+ - **NEEDS_CONTEXT:** Provide the missing context (usually an artifact path or a cross-phase reference) and re-dispatch. This counts toward the retry limit.
455
+ - **BLOCKED:** STOP the run. Surface the block to the user with:
456
+ - Phase ID
457
+ - What the implementer reported as blocking
458
+ - Recommended next step (plan fix, external resolution, user guidance)
459
+ Do not proceed to subsequent phases while a phase is blocked.
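The status handling above amounts to a small dispatch table. A minimal sketch, assuming hypothetical action labels that are not defined anywhere in the package:

```python
def next_action(status, has_correctness_concern=False):
    """Map an implementer status to the orchestrator's next step (labels are illustrative)."""
    if status == "DONE":
        return "review"
    if status == "DONE_WITH_CONCERNS":
        # Correctness concerns are fixed before review; advisory ones are only noted.
        return "redispatch_fix" if has_correctness_concern else "note_and_review"
    if status == "NEEDS_CONTEXT":
        return "provide_context_and_redispatch"  # counts toward the retry limit
    if status == "BLOCKED":
        return "stop_run"
    raise ValueError(f"unknown implementer status: {status!r}")
```

Rejecting unknown statuses keeps a malformed summary from being silently treated as success.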
305
460
 
306
- ```bash
307
- git add {files from plan}
308
- git commit -m "{commit message from plan}"
309
- ```
461
+ #### Dispatch Retry (Transient Failures)
310
462
 
311
- Store commit SHA for implementation.md.
463
+ If a Tier 1 dispatch fails (agent did not resolve, returned empty, etc.), retry exactly once. If the second attempt also fails, treat the phase as `failed` via the same mechanism as fix-loop retry exhaustion (see Step 7 below). Tier is never silently downgraded.
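The retry-exactly-once rule can be sketched as a wrapper around a dispatch callable. The `dispatch` parameter is a stand-in for whatever provider-native mechanism is in use; treating an exception or an empty result as a transient failure is an assumption of this sketch.

```python
def dispatch_with_retry(dispatch, max_attempts=2):
    """Call dispatch() up to max_attempts times; return the first non-empty result or None."""
    for _ in range(max_attempts):
        try:
            result = dispatch()
        except Exception:
            result = None  # treat a raised error like an empty response
        if result:
            return result
    return None  # caller marks the phase as failed; the tier is not downgraded
```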
312
464
 
313
- ### Step 7: Update Implementation State
465
+ ### Per-Phase Review
314
466
 
315
- After each task:
467
+ After the implementer returns DONE (or DONE_WITH_CONCERNS without correctness concerns), dispatch the reviewer for the phase.
316
468
 
317
- **Update frontmatter:**
469
+ **Dispatch:**
318
470
 
319
- ```yaml
320
- oat_current_task_id: { next_task_id } # e.g., p01-t02
321
- oat_last_updated: { today }
322
- ```
471
+ - Use the same tier that was selected at start.
472
+ - Tier 1: dispatch `oat-reviewer` via provider-native subagent mechanism with Review Scope:
323
473
 
324
- **Update task entry:**
474
+ ```
475
+ project: {PROJECT_PATH}
476
+ type: code
477
+ scope: {pNN}
478
+ commits: {base_sha}..{head_sha}
479
+ files_changed: {optional hint from implementer's report}
480
+ workflow_mode: {from state.md}
481
+ artifact_paths: {same as Phase Scope}
482
+ tasks_in_scope: {list of pNN-tNN IDs in the phase}
483
+ ```
325
484
 
326
- ```markdown
327
- ### Task {task_id}: {Task Name}
485
+ - For Codex Tier 1 dispatches, send the Review Scope block as a self-contained packet and keep fresh context (`fork_context: false`). The reviewer is expected to reconstruct context from git state and the OAT artifacts listed above.
486
+ - Treat the commit range as authoritative for review scope. `files_changed` is optional orientation metadata only.
487
+ - If a Codex reviewer does not return a terminal result on the first wait, poll once more. If it still has not concluded, send one concise nudge to return immediately with current findings. If the reviewer still does not conclude, treat the Tier 1 review dispatch as failed for this phase and perform the review inline instead of waiting indefinitely.
328
488
 
329
- **Status:** completed
330
- **Commit:** {sha}
489
+ - Tier 2: inline — read `.agents/agents/oat-reviewer.md` and perform the review yourself.
331
490
 
332
- **Outcome (required):**
491
+ **Verdict outcomes:**
333
492
 
334
- - {2-5 bullets describing what materially changed}
493
+ Parse the reviewer's confirmation for verdict + finding severities. Map to pass / fail:
335
494
 
336
- **Files changed:**
495
+ - **pass:** zero Critical and zero Important findings.
496
+ - **fail:** one or more Critical or Important findings.
337
497
 
338
- - `{path}` - {why}
498
+ Medium / Minor findings do not block the phase but are recorded.
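The pass/fail mapping reduces to a check on finding severities. A minimal sketch, assuming findings arrive as a list of severity strings (the reviewer's actual output format is not shown here):

```python
BLOCKING_SEVERITIES = {"critical", "important"}


def phase_verdict(severities):
    """Return 'fail' if any finding is Critical or Important, otherwise 'pass'."""
    blocking = [s for s in severities if s.lower() in BLOCKING_SEVERITIES]
    return "fail" if blocking else "pass"
```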
339
499
 
340
- **Verification:**
500
+ #### Bounded Fix Loop
341
501
 
342
- - Run: `{command(s)}`
343
- - Result: {pass/fail + notes}
502
+ On reviewer verdict `fail`, run a bounded fix loop.
344
503
 
345
- **Notes / Decisions:**
504
+ 1. Read `oat_orchestration_retry_limit` from `state.md` frontmatter (default: `2`, range 0–5).
505
+ 2. For each retry (up to the limit):
506
+ a. Dispatch `oat-phase-implementer` in `fix` mode (Tier 1) OR read the agent and apply fixes inline (Tier 2), passing `review_artifact` (the path written by the reviewer), `findings` (the Critical + Important findings list), and `prior_summary` (the last implementer summary).
507
+ b. Receive the fix summary.
508
+ c. Re-dispatch the reviewer with the updated commit range.
509
+ d. Parse the new verdict.
510
+ e. If pass → exit the loop successfully.
511
+ f. If fail and retries remain → continue.
512
+ g. If fail and retries exhausted → exit the loop with terminal verdict `failed`.
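The loop in steps a through g can be sketched as follows. The `fix` and `review` callables are placeholders for the implementer and reviewer dispatches, and the function is assumed to be entered only after an initial `fail` verdict.

```python
def run_fix_loop(retry_limit, fix, review):
    """Run up to retry_limit fix/review rounds; return (verdict, rounds_used)."""
    if not 0 <= retry_limit <= 5:
        raise ValueError("retry limit must be in the range 0-5")
    for attempt in range(1, retry_limit + 1):
        fix(attempt)                    # steps a-b: apply fixes, receive summary
        if review(attempt) == "pass":   # steps c-d: re-review, parse verdict
            return "pass", attempt      # step e
        # step f: fall through and continue while retries remain
    return "failed", retry_limit        # step g: retries exhausted
```

With a limit of `0` the loop body never runs, so an initial `fail` becomes terminal `failed` immediately.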
346
513
 
347
- - {gotchas, trade-offs, design deltas}
348
- ```
514
+ **Terminal `failed` handling:**
349
515
 
350
- **Update progress overview table.**
516
+ - **Sequential mode:** STOP the run. Surface to user with phase ID, unresolved findings, review artifact path. Do not proceed to subsequent phases.
517
+ - **Parallel group mode:** mark the phase `excluded`. Do not merge its worktree. Continue the remaining phases in the group. Report in Outstanding Items after the group completes.
351
518
 
352
- Keep project state in sync after each task (recommended source of truth for “where are we?” across sessions):
519
+ ### Parallel Group Execution
353
520
 
354
- - Update `"$PROJECT_PATH/state.md"` frontmatter:
355
- - `oat_phase: implement`
356
- - `oat_phase_status: in_progress`
357
- - `oat_current_task: {next_task_id}`
358
- - `oat_last_commit: {sha}`
359
- - `oat_project_state_updated: "{ISO 8601 UTC timestamp}"`
521
+ When the current schedule entry is a multi-phase group, execute as follows.
360
522
 
361
- **Bookkeeping commit (required):**
523
+ **Tier 2 degradation:** Tier 2 cannot run concurrent subagents. If Tier 2 was selected at skill start, degrade the entire group to sequential inline execution: run each phase in the group on the orchestration branch, in plan order, without creating worktrees. Proceed through the per-phase loop (dispatch / review / fix-loop / bookkeeping) for each phase.
362
524
 
363
- **DO NOT SKIP.** This commit prevents state drift across sessions.
525
+ **Tier 1 parallel execution:**
364
526
 
365
- After the code commit (Step 6) and state updates above, commit all modified OAT tracking files:
527
+ 1. **Bootstrap worktrees:** for each phase in the group, invoke `oat-worktree-bootstrap-auto` with branch name `{project-name}/{pNN}` and base = orchestration branch.
528
+ - If **any** bootstrap fails, cancel any worktrees that bootstrapped successfully for this group and degrade the whole group to sequential inline execution. Log the degradation reason to `implementation.md` Outstanding Items.
366
529
 
367
- ```bash
368
- git add "$PROJECT_PATH/implementation.md" "$PROJECT_PATH/state.md" "$PROJECT_PATH/plan.md"
369
- git diff --cached --quiet || git commit -m "chore(oat): update tracking artifacts for {task_id}"
370
- ```
530
+ 2. **Concurrent dispatch:** for each successfully bootstrapped worktree, dispatch `oat-phase-implementer` (with the worktree as working directory) concurrently. Each dispatch runs the per-phase loop internally (implementer → reviewer → fix-loop).
371
531
 
372
- Do not use `git add -A` or glob patterns. Only commit the three OAT project files listed above.
532
+ 3. **Wait for all phases:** do not proceed until every phase in the group reports a terminal verdict (pass or excluded).
373
533
 
374
- **If executing review-generated tasks** (task title prefixed with `(review)`):
534
+ 4. **Fan-in reconciliation (merge back in plan order):**
375
535
 
376
- - Ensure `implementation.md` stays accurate:
377
- - The “Review Received” section reflects whether findings were deferred vs converted to tasks
378
- - The “Next” line is updated once review fix tasks are complete (don’t leave “Next: execute fix tasks” after they’re done)
379
- - Keep `plan.md` internally consistent:
380
- - If `## Implementation Complete` contains phase/task totals, update totals when review fix tasks are added (via `oat-project-review-receive`) or removed.
381
- - Review status lifecycle:
382
- - When review-generated fix tasks are added, the Reviews table should be `fixes_added`.
383
- - After all fix tasks are implemented, update the Reviews table to `fixes_completed` (not `passed`).
384
- - Only set `passed` after a re-review is run and processed via `oat-project-review-receive` with no Critical/Important findings.
536
+ For each phase in the group, in plan order (p02 before p03, etc.), if its verdict is pass:
385
537
 
386
- **Review-fix completion bookkeeping (required):**
538
+ a. Attempt `git merge --no-ff {project-name}/{pNN} -m "merge({pNN}): {summary from implementer}"`.
539
+ b. If merge produces conflicts, abort the merge and attempt cherry-pick of the phase's commits.
540
+ c. If cherry-pick also produces conflicts, dispatch an inline conflict-resolution subagent via the Task tool. The orchestrator MUST NOT read the conflicted files itself — delegate to the subagent. Use this dispatch shape:
387
541
 
388
- - When you complete the last outstanding review-fix task:
389
- 1. Update the relevant `plan.md` `## Reviews` row from `fixes_added` → `fixes_completed` and set Date to `{today}`.
390
- - If multiple rows are `fixes_added`, ask the user which scope you just addressed (or choose the matching phase if obvious).
391
- 2. Update `plan.md` `## Implementation Complete` totals (phase counts + total tasks) so summaries reflect the additional fix work.
392
- 3. Update `implementation.md` so it’s unambiguous that tasks are complete and the project is awaiting re-review:
393
- - `oat_current_task_id: null` (reviews are not tasks)
394
- - “Next” guidance should say “request re-review” (not “execute fix tasks”).
395
- 4. Update `{PROJECT_PATH}/state.md` to reflect the correct “awaiting re-review” posture:
396
- - `oat_phase: implement`
397
- - `oat_phase_status: in_progress` (until the re-review passes)
398
- - `oat_current_task: null`
399
- - `oat_project_state_updated: “{ISO 8601 UTC timestamp}”`
542
+ ```
543
+ Task (general-purpose subagent):
544
+ description: "Resolve merge conflict for phase {pNN}"
545
+ prompt: |
546
+ You are resolving a git merge conflict during parallel-phase fan-in.
400
547
 
401
- **Bookkeeping commit (required):**
548
+ Phase: {pNN}
549
+ Orchestration branch: {orchestration-branch}
550
+ Worktree: {worktree-path}
551
+ Conflicted files: {list from git status}
552
+ Project artifacts:
553
+ plan: {PROJECT_PATH}/plan.md
554
+ design: {PROJECT_PATH}/design.md
555
+ spec: {PROJECT_PATH}/spec.md
402
556
 
403
- **DO NOT SKIP.** This commit prevents state drift across sessions.
557
+ Steps:
558
+ 1. Read each conflicted file. Parse conflict markers (<<<<<<<, =======, >>>>>>>).
559
+ 2. Read the project artifacts to understand intent from both sides.
560
+ 3. Apply a resolution that preserves intent from both sides where possible.
561
+ 4. Remove conflict markers. Save files.
562
+ 5. Stage resolved files with `git add <files>`.
563
+ 6. Run integration verification: `pnpm test && pnpm lint && pnpm type-check`.
564
+ 7. If all pass: commit with `merge({pNN}): resolved conflict during fan-in`.
565
+ 8. If any step fails: do NOT commit. Return with the appropriate status.
404
566
 
405
- After completing the review-fix checklist above, commit all modified OAT tracking files:
567
+ Return format (end of response):
568
+ status: RESOLVED | UNRESOLVABLE | VERIFICATION_FAILED
569
+ reasoning: <2-4 sentence summary of what you did or why you stopped>
570
+ commit: <sha if RESOLVED, else null>
571
+ ```
406
572
 
407
- ```bash
408
- git add "$PROJECT_PATH/implementation.md" "$PROJECT_PATH/state.md" "$PROJECT_PATH/plan.md"
409
- git diff --cached --quiet || git commit -m "chore(oat): update tracking artifacts for {task_id}"
410
- ```
573
+ d. Parse the subagent's return status:
+ - `RESOLVED` → subagent has committed the merge; orchestrator proceeds to integration verification (Step 5) and the next phase in the group.
+ - `UNRESOLVABLE` or `VERIFICATION_FAILED` → STOP the run. Surface to user with phase ID, conflicting files, worktree path, and the subagent's reasoning summary. Do not merge remaining phases.
574
+
575
+ **Tier 2 (inline) exception:** In Tier 2 runs, parallel groups already degrade to sequential, so fan-in conflicts don't arise from this code path. If a conflict ever surfaces in Tier 2 (e.g., from another operation), the orchestrator resolves inline since the whole run is already inline — consistent with Tier 2 semantics.
576
+
577
+ 5. **Integration verification after each merge:**
578
+
579
+ After each successful merge, run project verification (tests, lint, type-check). If verification fails:
580
+ - Attempt a tractable fix (missing import, trivial type error). If the fix succeeds and verification passes, commit the fix.
581
+ - If the fix is not tractable → revert the merge, STOP the run. Surface to user.
582
+
583
+ 6. **Worktree cleanup:**
584
+
585
+ For phases that merged successfully and passed integration verification, clean up the worktree using the existing worktree cleanup mechanism (e.g., `git worktree remove`).
586
+
587
+ For phases that were excluded (fix-loop exhausted), preserve the worktree and log its path in `implementation.md` Outstanding Items.
588
+
589
+ 7. **Bookkeeping commit** after the group completes. Then run the HiLL checkpoint check.
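The status handling in step 4d can be sketched as a small parser plus a dispatch table. This is a minimal sketch, not the skill's actual implementation: `parse_subagent_status` and `next_action` are hypothetical helper names, and treating a missing or unrecognized status as a stop is an assumption the skill text does not state.

```shell
# Hypothetical helpers; names and the "unknown means stop" rule are assumptions.

# Reads a subagent response on stdin and prints the value of its last
# "status:" line (the return format places it at the end of the response).
parse_subagent_status() {
  sed -n 's/^[[:space:]]*status:[[:space:]]*//p' | tail -n 1 | tr -d '[:space:]'
}

# Maps a parsed status to the orchestrator's next action.
next_action() {
  case "$1" in
    RESOLVED) echo "verify" ;;                          # go on to Step 5
    UNRESOLVABLE|VERIFICATION_FAILED) echo "stop" ;;    # surface to user
    *) echo "stop" ;;                                   # assumption: fail closed
  esac
}
```

Failing closed on an unparseable response keeps the orchestrator from merging remaining phases on top of an unknown state.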
590
+
591
+ ### Step 7: Artifact Updates After Each Phase (or Group)
592
+
593
+ After each phase (sequential) or each parallel group (multi-phase) completes, update the tracking artifacts before moving on.
594
+
595
+ **`implementation.md`:**
596
+
597
+ Append a new entry to the `## Orchestration Runs` section between the `<!-- orchestration-runs-start -->` and `<!-- orchestration-runs-end -->` markers. Format:
598
+
599
+ ```markdown
600
+ ### Run {N} — {YYYY-MM-DD HH:MM}
601
+
602
+ **Branch:** {orchestration-branch}
603
+ **Tier:** {1 | 2}
604
+ **Policy:** merge-strategy=merge, retry-limit={N}
605
+ **Phases:** {N} executed, {N} passed, {N} failed, {N} stopped
606
+
607
+ #### Phase Outcomes
608
+
609
+ | Phase | Implementer | Review | Fix Iterations | Disposition |
610
+ | ----- | ----------- | ------ | -------------- | ----------- |
611
+ | pNN | {status} | {pass \| fail} | N/{limit} | {merged \| excluded \| stopped} |
612
+
613
+ #### Parallel Groups
614
+
615
+ - Group {N} [{phase list}]: worktree-based, merged in order
616
+ - {singleton phases}: sequential
617
+
618
+ #### Outstanding Items
619
+
620
+ - {None | list of excluded phases with review paths and worktree paths}
621
+ ```
622
+
623
+ Append only — never overwrite prior run entries.
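One way to keep the section append-only is to insert each new entry immediately before the end marker, so earlier runs are never rewritten. The sketch below assumes the entry has already been rendered to a separate file; `append_run_entry` is a hypothetical helper name, not part of the toolkit.

```shell
# append_run_entry TARGET ENTRY_FILE
# Inserts the contents of ENTRY_FILE just before the end marker in TARGET.
# Hypothetical helper; fails if the end marker is missing.
append_run_entry() {
  target=$1
  entry=$2
  grep -qF -- '<!-- orchestration-runs-end -->' "$target" || return 1
  tmp=$(mktemp) || return 1
  awk -v entry="$entry" '
    /<!-- orchestration-runs-end -->/ {
      # Emit the new entry, then a blank line, ahead of the end marker.
      while ((getline line < entry) > 0) print line
      close(entry)
      print ""
    }
    { print }
  ' "$target" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Write back through the original path to preserve its permissions.
  cat "$tmp" > "$target" && rm -f "$tmp"
}
```

Because existing lines are copied through unchanged, prior run entries keep their original order and content.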
624
+
625
+ **`plan.md` review table:**
626
+
627
+ For each phase that completed:
628
+
629
+ - Pass on first try → set phase row to `passed` with date + review artifact path.
630
+ - Pass after fixes → set to `fixes_added` → `fixes_completed` → `passed` (match existing lifecycle).
631
+ - Fix-loop exhausted → leave at `fixes_added` with an "excluded" note in the artifact link.
632
+ - `final` review row is never touched by this skill.
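The three phase-row outcomes above can be summarized as the sequence of statuses a row passes through. This is an illustrative sketch only: `review_status_sequence` is a hypothetical helper, and representing a fix-loop-exhausted phase as verdict `fail` is an assumption.

```shell
# review_status_sequence VERDICT FIX_ITERATIONS
# Prints, in order, the statuses the phase row moves through.
# Hypothetical helper; "fail" here stands for a fix-loop-exhausted phase.
review_status_sequence() {
  verdict=$1
  fixes=$2
  if [ "$verdict" = "pass" ] && [ "$fixes" -eq 0 ]; then
    echo "passed"                                # pass on first try
  elif [ "$verdict" = "pass" ]; then
    echo "fixes_added fixes_completed passed"    # pass after fixes
  else
    echo "fixes_added"                           # excluded; row stays here
  fi
}
```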
633
+
634
+ **`state.md`:**
635
+
636
+ - Update `oat_current_task` to the next un-run task ID (or the final task if the run is complete).
637
+ - Update `oat_last_commit` to the bookkeeping commit SHA about to be made.
638
+ - Update `oat_project_state_updated` to current ISO 8601 UTC timestamp.
639
+ - If `oat_execution_mode: subagent-driven` is present, remove the key.
640
+ - If the user supplied a `--retry-limit` override, persist as `oat_orchestration_retry_limit`.
641
+
642
+ **Bookkeeping commit (mandatory):**
643
+
644
+ ```bash
645
+ git add {PROJECT_PATH}/implementation.md {PROJECT_PATH}/state.md {PROJECT_PATH}/plan.md
646
+ git commit -m "chore(oat): bookkeeping after {pNN} {pass|fail}"
647
+ ```
411
648
 
412
- Do not use `git add -A` or glob patterns. Only commit the three OAT project files listed above.
649
+ Then check the HiLL checkpoint: if the phase ID is in `oat_plan_hill_phases`, pause for user approval before continuing.
413
650
 
414
651
  ### Step 8: Check Plan Phase Completion
415
652
 
@@ -432,11 +669,11 @@ At the end of each plan phase (p01, p02, etc.), check `oat_plan_hill_phases` in
432
669
 
433
670
  **Key semantic: listed phases are where you stop AFTER completing them, not before.** `["p03"]` means "complete p03, then pause" — not "pause before starting p03."
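The "stop after, not before" semantic amounts to a membership check that runs once a phase has finished. A minimal sketch, assuming the `oat_plan_hill_phases` list has already been read from frontmatter and flattened to a space-separated string; `should_pause_after` is a hypothetical helper name.

```shell
# should_pause_after COMPLETED_PHASE "p01 p03"
# Returns 0 when the phase that just completed is a listed checkpoint.
# Hypothetical helper; call it only after the phase's work is done.
should_pause_after() {
  completed=$1
  for listed in $2; do
    [ "$listed" = "$completed" ] && return 0
  done
  return 1
}
```

With a list of `p03`, the check is false after p02 completes, so p03 starts without a pause, and true after p03 completes.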
434
671
 
435
- **Auto-review at checkpoints (Touchpoint B):**
672
+ **Auto-review at HiLL checkpoints (Touchpoint B):**
436
673
 
437
674
  Before pausing at a checkpoint, check if auto-review is enabled:
438
675
 
439
- 1. Read `oat_auto_review_at_checkpoints` from plan.md frontmatter. If not present, fall back to `.oat/config.json` `autoReviewAtCheckpoints` (default: `false`).
676
+ 1. Read `oat_auto_review_at_hill_checkpoints` from plan.md frontmatter. If not present, fall back to legacy `oat_auto_review_at_checkpoints`. If neither is present, fall back to `oat config get workflow.autoReviewAtHillCheckpoints` (which itself falls back to legacy `.oat/config.json` `autoReviewAtCheckpoints` when unset).
440
677
 
441
678
  2. If enabled and this is a checkpoint phase:
442
679
  a. **Determine review scope:** Find the highest completed implementation phase already covered by a **`passed`** code-review row in plan.md Reviews table. Count only whole-phase scopes: `pNN` or `pNN-pMM`. Ignore task scopes (`pNN-tNN`) and rows with `fixes_added` or `fixes_completed` because those reviews did not pass and must be re-covered. Scope = every implementation phase after that passed coverage through the current phase, inclusive. If no earlier passed whole-phase review exists, start from the first implementation phase. Use `pNN-pMM` when the scope spans multiple phases. If this is the final implementation phase checkpoint, use scope `final`.
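The scope rule in step 2a can be sketched as follows. This is a sketch under stated assumptions, not the skill's implementation: `review_scope` is a hypothetical helper, the Reviews table is assumed to be pre-extracted into `scope status` pairs on stdin, phase IDs are assumed to be two-digit (`p01`, `p02`, ...), and the first implementation phase is assumed to be `p01`.

```shell
# review_scope CURRENT_PHASE IS_FINAL < rows
# rows: one "scope status" pair per line, e.g. "p01-p02 passed".
# Hypothetical helper; exits non-zero if the current phase is already covered.
review_scope() {
  current=$1
  is_final=$2
  if [ "$is_final" = "yes" ]; then
    echo "final"          # final implementation phase checkpoint
    return 0
  fi
  awk -v current="$current" '
    function num(p) { sub(/^p/, "", p); return p + 0 }
    # Count only passed whole-phase scopes: pNN or pNN-pMM.
    $2 == "passed" && $1 ~ /^p[0-9]+(-p[0-9]+)?$/ {
      n = split($1, parts, "-")
      hi = num(parts[n])
      if (hi > covered) covered = hi
    }
    END {
      first = covered + 1
      last = num(current)
      if (first > last) exit 1
      if (first == last) printf "p%02d\n", last
      else printf "p%02d-p%02d\n", first, last
    }
  '
}
```

For example, with rows `p01 passed`, `p02-p03 fixes_added`, and `p02-t01 passed`, a checkpoint at p04 yields `p02-p04`: the `fixes_added` row and the task-scoped row are both ignored, so p02 and p03 must be covered again.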
@@ -468,7 +705,7 @@ When pausing:
468
705
 
469
706
  **Phase summaries (required):**
470
707
 
471
- - When a plan phase completes (p01, p02, etc.), update the Phase Summary section in `implementation.md` for that phase:
708
+ - When a plan phase completes (p01, p02, etc.), update the "Phase Summary" section in `implementation.md` for that phase:
472
709
  - Outcome (behavior-level)
473
710
  - Key files touched (paths)
474
711
  - Verification run
@@ -551,7 +788,7 @@ Options:
551
788
 
552
789
  When all plan tasks are complete (i.e., there is no next incomplete `pNN-tNN` task):
553
790
 
554
- **Update Final Summary (required):**
791
+ **Update "Final Summary" (required):**
555
792
 
556
793
  - Before requesting final review / running `oat-project-pr-final`, update the `## Final Summary (for PR/docs)` section in `"$PROJECT_PATH/implementation.md"`:
557
794
  - What shipped (capabilities, behavior-level)
@@ -731,13 +968,13 @@ To run in a separate session use: oat-project-review-provide code final
731
968
  - `oat_phase_status: complete`
732
969
  - `oat_project_state_updated: "{ISO 8601 UTC timestamp}"`
733
970
  - Append `"implement"` to `oat_hill_completed` (only if configured as a HiLL gate)
734
- - Update state content to Implementation complete”.
971
+ - Update state content to "Implementation complete".
735
972
  - Update `"$PROJECT_PATH/plan.md"`:
736
973
  - Set the `final` review row status to `passed` (if not already)
737
974
  - Ensure `## Implementation Complete` totals reflect any review fix tasks that were added
738
975
  - Update `"$PROJECT_PATH/implementation.md"`:
739
976
  - Ensure `oat_current_task_id: null`
740
- - Ensure the Review Received section reflects completed fixes and points to the next action (PR) rather than execute fix tasks
977
+ - Ensure the "Review Received" section reflects completed fixes and points to the next action (PR) rather than "execute fix tasks"
741
978
 
742
979
  ### Step 15: Prompt for Next Steps
743
980