@open-agent-toolkit/cli 0.1.6 → 0.1.8

Files changed (34)
  1. package/assets/agents/oat-phase-implementer.md +14 -3
  2. package/assets/agents/oat-reviewer.md +19 -3
  3. package/assets/docs/cli-utilities/configuration.md +33 -1
  4. package/assets/docs/provider-sync/config.md +1 -0
  5. package/assets/docs/reference/oat-directory-structure.md +2 -0
  6. package/assets/docs/workflows/projects/implementation-execution.md +60 -12
  7. package/assets/docs/workflows/projects/lifecycle.md +4 -0
  8. package/assets/public-package-versions.json +4 -4
  9. package/assets/skills/oat-project-implement/SKILL.md +193 -94
  10. package/assets/skills/oat-project-plan/SKILL.md +57 -1
  11. package/assets/skills/oat-project-plan-writing/SKILL.md +6 -2
  12. package/assets/skills/oat-project-quick-start/SKILL.md +57 -1
  13. package/assets/skills/oat-project-review-provide/SKILL.md +9 -1
  14. package/assets/skills/oat-project-review-receive/SKILL.md +21 -1
  15. package/assets/skills/oat-project-summary/SKILL.md +15 -13
  16. package/assets/templates/implementation.md +5 -5
  17. package/assets/templates/plan.md +1 -1
  18. package/assets/templates/state.md +4 -0
  19. package/assets/templates/summary.md +2 -1
  20. package/dist/commands/config/index.d.ts.map +1 -1
  21. package/dist/commands/config/index.js +36 -0
  22. package/dist/commands/project/dispatch-ceiling/index.d.ts +16 -0
  23. package/dist/commands/project/dispatch-ceiling/index.d.ts.map +1 -0
  24. package/dist/commands/project/dispatch-ceiling/index.js +288 -0
  25. package/dist/commands/project/index.d.ts.map +1 -1
  26. package/dist/commands/project/index.js +2 -0
  27. package/dist/config/oat-config.d.ts +7 -0
  28. package/dist/config/oat-config.d.ts.map +1 -1
  29. package/dist/config/oat-config.js +23 -0
  30. package/dist/config/resolve.d.ts.map +1 -1
  31. package/dist/config/resolve.js +4 -0
  32. package/dist/providers/codex/codec/sync-extension.d.ts.map +1 -1
  33. package/dist/providers/codex/codec/sync-extension.js +16 -8
  34. package/package.json +2 -2
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: oat-project-implement
3
- version: 2.0.16
3
+ version: 2.0.18
4
4
  description: Use when plan.md is ready for execution. Dispatches phase-level subagents with bounded fix loops; supports plan-declared parallel phase groups with worktree-isolated execution and ordered fan-in.
5
5
  argument-hint: '[--retry-limit <N>] [--dry-run]'
6
6
  disable-model-invocation: true
@@ -28,6 +28,9 @@ After every code commit and after every phase/review-fix completion, you MUST co
28
28
  **CRITICAL — Review boundaries require a committed artifact baseline.**
29
29
  Do not enter checkpoint review, final review, revise, or PR-final handoff with dirty core project artifacts (`discovery.md`, `spec.md`, `design.md`, `plan.md`, `implementation.md`, `state.md`, plus `.oat/state.md` when refreshed). If one of those boundaries is next and artifact bookkeeping is still uncommitted, stop and create the bookkeeping commit first.
30
30
 
31
+ **CRITICAL — Intentional artifact divergence must be recorded.**
32
+ If implementation intentionally diverges from `spec.md`, `design.md`, or `plan.md`, record the delta in `implementation.md` before the next phase/review boundary. Include what diverged, why it diverged, whether the implementation or original artifact is now source of truth, and any follow-up artifact updates or explicit deferral. Do not leave accepted design drift only in chat, a review artifact, or code comments; final summary generation depends on `implementation.md` preserving the delta.
33
+
31
34
  ## Progress Indicators (User-Facing)
32
35
 
33
36
  When executing this skill, provide lightweight progress feedback so the user can tell what's happening after they confirm.
@@ -159,132 +162,204 @@ Forbidden: Selected: Tier 2 — Inline because the user did not separately menti
159
162
 
160
163
  **Legacy state migration:** If `state.md` contains `oat_execution_mode: subagent-driven`, silently ignore it. On the next bookkeeping write, remove that key. Do not redirect to `oat-project-subagent-implement` — that skill is deprecated.
161
164
 
162
- ### Runtime dispatch selection
165
+ ### Dispatch Ceiling Preflight
163
166
 
164
- Before each phase implementation dispatch, choose and log the phase's runtime dispatch controls. This is separate from the Tier 1/Tier 2 execution mode above: Tier 1/Tier 2 decides whether OAT uses subagents or inline fallback; runtime dispatch selection decides the model and effort controls to use for the specific phase when the host exposes them.
167
+ Before any phase work, resolve and print the OAT dispatch ceiling for the
168
+ current provider. This is a preflight gate, not a mid-run question.
165
169
 
166
- Use these inputs:
170
+ Use the CLI helper as the source of truth for resolution:
167
171
 
168
- - phase ID
169
- - phase scope, including task count, file boundaries, verification commands, and integration risk
170
- - optional `## Dispatch Profile` row in `plan.md`
171
- - host-exposed provider controls, by axis
172
- - prior outcomes for the phase, including review results and failed retries
172
+ ```bash
173
+ oat project dispatch-ceiling resolve --provider <codex|claude> --preflight --json
174
+ ```
173
175
 
174
- Selection rule:
176
+ If `oat` is not in PATH, use:
175
177
 
176
- 1. If a valid Dispatch Profile override row applies and the host can honor it, use the requested provider control and log that the choice came from the override.
177
- 2. If no override applies, choose the lowest available model and/or effort that can confidently complete the phase.
178
- 3. Treat model and effort as separate axes. Each axis logs exactly one state:
179
- - `selected:<value>` — host exposes the axis and the orchestrator chose a value.
180
- - `inherited` — host exposes the axis and the orchestrator deliberately defers to the parent session.
181
- - `not-applicable` — this host/API has no meaningful per-dispatch concept for that axis.
182
- - `host-auto` — exceptional; the host uses that axis internally but the orchestrator cannot read or pin it.
183
- 4. In Codex implementation/fix dispatch, the model axis normally logs `inherited`; choose `effort_axis=selected:low|medium|high` from phase complexity and dispatch the matching effort-specific implementer role. Treat `effort_axis=inherited` as the parent-session ceiling path, not a neutral default.
184
- 5. In Claude Code, when subagent model selection is available, choose the lowest sufficient model on the model axis; the effort axis is `not-applicable` because Claude Code does not expose a separate `reasoning_effort` control for subagent dispatch.
185
- 6. If a host uses model/effort internally but exposes neither axis to the orchestrator, log `model_axis=host-auto, effort_axis=host-auto` and include the rationale that would have informed selection.
186
- 7. If confidence is low, choose a stronger available control before dispatch rather than knowingly underpowering the phase.
178
+ ```bash
179
+ pnpm run cli -- project dispatch-ceiling resolve --provider <codex|claude> --preflight --json
180
+ ```
187
181
 
188
- **Payload-first dispatch invariant.** Select dispatch controls, construct the actual host dispatch argument map, then print the dispatch log from that argument map. Do not emit an `OAT Dispatch:` block with a `Model axis: selected:<value>` or `Effort axis: selected:<value>` field until the corresponding host-tool selection is present in the argument map you are about to call. A selected axis that exists only in the Phase Scope text is invalid; if you cannot or will not pass the host-tool selection, log that axis as `inherited`, `not-applicable`, or `host-auto` instead of `selected:<value>`.
182
+ Resolution order:
189
183
 
190
- **Passing axis values to the host dispatch API.** The log shape and the actual dispatch call must agree: never log a `selected:<value>` axis without passing the corresponding parameter on the dispatch invocation, and never pass an explicit parameter that the log does not reflect.
184
+ 1. Effective config key `workflow.dispatchCeiling.<provider>` via the resolver CLI
185
+ 2. Project `state.md` frontmatter key `oat_dispatch_ceiling`
186
+ 3. Interactive implementation preflight prompt
187
+ 4. Non-interactive unresolved: block before work starts
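The resolution order above can be read as a simple fall-through. The following is a minimal sketch of that order, not the package's resolver: the function name `resolve_ceiling`, its parameters, and the `CeilingUnresolved` exception are hypothetical, and the source labels are borrowed from the `Ceiling source` field of the dispatch log.

```python
from typing import Callable, Optional, Tuple


class CeilingUnresolved(RuntimeError):
    """Hypothetical error: no ceiling resolved and no prompt is possible."""


def resolve_ceiling(
    config_value: Optional[str],
    state_value: Optional[str],
    interactive: bool,
    prompt: Callable[[], str],
) -> Tuple[str, str]:
    """Return (ceiling, source) following the documented resolution order."""
    # 1. Effective config key workflow.dispatchCeiling.<provider>
    if config_value:
        return config_value, "repo config"
    # 2. Project state.md frontmatter key oat_dispatch_ceiling
    if state_value:
        return state_value, "project state"
    # 3. Interactive implementation preflight prompt
    if interactive:
        return prompt(), "preflight prompt"
    # 4. Non-interactive and unresolved: block before work starts
    raise CeilingUnresolved("dispatch ceiling is unresolved in non-interactive mode")
```

In this sketch the caller is expected to persist a prompted answer to project state itself; the function only decides which source wins.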
191
188
 
192
- - **Claude Code implementer/fix dispatch:** when `model_axis=selected:<value>`, pass `model: "<value>"` on the Task tool call. When `model_axis=inherited`, omit the `model` parameter so Claude Code uses its own default. `effort_axis=not-applicable` for both cases because the Task tool exposes no per-dispatch `reasoning_effort` control.
193
- - **Codex implementer/fix dispatch:** default to a selected effort. Classify phase complexity, choose the lowest sufficient `effort_axis=selected:low|medium|high`, and dispatch the matching configured role: `agent_type: "oat-phase-implementer-low"`, `agent_type: "oat-phase-implementer-medium"`, or `agent_type: "oat-phase-implementer-high"`. Those roles set `model_reasoning_effort` in `.codex/agents/*.toml`. Use the base `agent_type: "oat-phase-implementer"` only when `effort_axis=inherited` is intentionally selected for an allowed reason below. Do not use top-level per-call `reasoning_effort` as the standard OAT selected-effort path; dogfooding showed that path can be inconsistent in some Codex runs.
194
- - **Codex inherited effort is the ceiling path:** because inherited effort follows the parent/orchestrator session and may be `xhigh`, do not use `effort_axis=inherited` merely because it is valid, convenient, or avoids choosing a selected effort. Use inherited effort for Codex implementer/fix dispatch only when the user explicitly requested inherited/default parent controls; a valid Dispatch Profile row explicitly requests inherited/default controls; the phase requires the parent-session ceiling and `selected:high` is insufficient; or the selected-effort roles are unavailable or fail to resolve. The dispatch rationale must cite the allowed reason. For ceiling-needed dispatch, explain why `selected:high` is insufficient.
195
- - **Codex xhigh:** do not create or select an `xhigh` implementer variant. Use `xhigh` only when the parent/orchestrator session is already xhigh and therefore `effort_axis=inherited` on the base role is the correct representation. If a phase appears to require xhigh while the parent is not xhigh, choose `selected:high` only if high is sufficient; otherwise split/revise the phase or stop for user re-invocation at xhigh.
196
- - **Claude Code `opus`:** unlike Codex `xhigh`, `opus` is directly selectable. Claude Code exposes `opus` through the Task tool's `model` parameter, so OAT may select it when available (`model_axis=selected:opus`) — including as a terminal escalation step. There is no `opus` inherited-only restriction; the `xhigh` rule above is specific to Codex's effort-variant mechanism, not a general "never select the maximum tier" rule.
197
- - **Reviewer dispatch on either host:** use `model_axis=inherited` by default. For `effort_axis`: use `inherited` on hosts that expose an effort axis (such as Codex); use `not-applicable` on hosts that do not expose a meaningful effort axis (such as Claude Code). Omit `model` and, on Codex, `reasoning_effort` overrides entirely.
189
+ Provider values:
198
190
 
199
- Codex selected-effort implementer/fix dispatch shape:
191
+ - Codex: `low`, `medium`, `high`, `xhigh`
192
+ - Claude: `haiku`, `sonnet`, `opus`
200
193
 
201
- ```yaml
202
- agent_type: oat-phase-implementer-low # or oat-phase-implementer-medium/high
203
- message: |
204
- Phase Scope:
205
- model_axis: inherited
206
- effort_axis: selected:low
207
- ...
194
+ For Codex, also resolve the provider default effort when possible by reading
195
+ Codex configuration (for example `.codex/config.toml`). If it cannot be found,
196
+ display `unknown`. Do not treat provider default as the OAT ceiling.
197
+ The resolver prints this as `providerDefaultEffort` in JSON and includes it in
198
+ human-readable output.
199
+
200
+ Print this before phase work:
201
+
202
+ ```text
203
+ Codex dispatch ceiling: high
204
+ Source: project state
205
+ Codex provider default effort: medium
206
+ Note: OAT will use pinned subagent variants up to high. Base/unpinned roles resolve through the provider default.
208
207
  ```
209
208
 
210
- Invalid Codex selected-effort dispatch shape:
209
+ If no ceiling resolves and the session is interactive, ask before starting
210
+ implementation and persist the answer in project `state.md` frontmatter:
211
211
 
212
212
  ```yaml
213
- agent_type: oat-phase-implementer
214
- reasoning_effort: low
215
- message: |
216
- Phase Scope:
217
- effort_axis: selected:low
213
+ oat_dispatch_ceiling:
214
+ provider: codex
215
+ value: high
216
+ source: project-state
218
217
  ```
219
218
 
220
- The invalid shape relies on per-call override behavior that has proven inconsistent during dogfooding. It also risks creating a log/dispatch mismatch if the override is ignored.
219
+ If no ceiling resolves and `OAT_NON_INTERACTIVE=1` or no user-response channel
220
+ exists, rerun the resolver with non-interactive behavior and stop before work
221
+ starts if it blocks:
221
222
 
222
- **Post-spawn verification gate.** After any Codex implementer/fix `spawn_agent` call with `effort_axis=selected:<value>`, immediately inspect the returned spawn status before waiting for work or updating the plan. If the status shows a different effort, such as `effort_axis=selected:low` followed by `(gpt-5.5 high)`, this is an orchestration deviation. Stop using that agent, record the mismatch in `implementation.md`, and redispatch with the correct effort-specific `agent_type`. Do not continue to `wait_agent`, phase bookkeeping, or the next phase with a mismatched selected-effort dispatch.
223
+ ```bash
224
+ oat project dispatch-ceiling resolve --provider <codex|claude> --preflight --non-interactive
225
+ ```
226
+
227
+ ```text
228
+ BLOCKED: Codex dispatch ceiling is unresolved in non-interactive mode.
229
+ Set workflow.dispatchCeiling.codex in .oat/config.json or oat_dispatch_ceiling in project state.
230
+ ```
231
+
232
+ Dry-run mode must report the unresolved ceiling and planned behavior without
233
+ modifying project state.
234
+
235
+ ### Runtime dispatch selection
236
+
237
+ Before each phase implementation, fix, or review dispatch, choose and log the
238
+ runtime dispatch controls. This is separate from Tier 1/Tier 2 execution mode:
239
+ Tier 1/Tier 2 decides whether OAT uses subagents or inline fallback; runtime
240
+ dispatch selection decides model/effort controls for the specific work.
241
+
242
+ Use these inputs:
223
243
 
224
- After the payload-first check, log the choice before dispatch in this structured shape:
244
+ - resolved dispatch ceiling and source
245
+ - phase ID and phase scope
246
+ - optional `## Dispatch Profile` row in `plan.md`
247
+ - host-exposed provider controls, by axis
248
+ - prior outcomes for the phase, including review results and failed retries
249
+
250
+ Axis states:
251
+
252
+ - `selected:<value>` - host exposes the axis and the orchestrator chose a value.
253
+ - `provider-default` - Codex base/unpinned role follows configured/provider default effort.
254
+ - `inherited` - host/API explicitly inherits the parent setting and OAT can trust that behavior.
255
+ - `not-applicable` - this host/API has no meaningful per-dispatch concept for that axis.
256
+ - `host-auto` - exceptional; the host uses that axis internally but OAT cannot read or pin it.
257
+
258
+ Codex rules:
259
+
260
+ 1. Codex effort order is `low < medium < high < xhigh`.
261
+ 2. Classify preferred effort from scope:
262
+ - `low`: trivial docs-only, narrow single-file, or mechanical changes
263
+ - `medium`: normal multi-file implementation and moderate integration risk
264
+ - `high`: broad architecture, security/auth/redaction boundaries, subtle state behavior, or repeated substantive review failures
265
+ - `xhigh`: highest-risk work that requires the configured ceiling to allow xhigh
266
+ 3. Selected effort is `min(preferred, resolved_ceiling)`.
267
+ 4. Dispatch implementer/fix work through `oat-phase-implementer-<selected>`.
268
+ 5. Dispatch review work through `oat-reviewer-<resolved_ceiling>` for deterministic quality gate behavior.
269
+ 6. Use base/unpinned Codex roles only as a fallback or explicit provider-default choice. Log `Selected effort: provider-default`, display provider default effort when known, and do not describe this as parent-ceiling inheritance.
270
+ 7. Do not use top-level per-call `reasoning_effort` as the standard OAT selected-effort path; dogfooding showed that path can be inconsistent.
271
+
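Rules 1 and 3 to 5 amount to an ordered comparison followed by a role-name lookup. The sketch below illustrates that under stated assumptions: the helper names `select_effort`, `implementer_role`, and `reviewer_role` are hypothetical, and only the effort order and the `oat-phase-implementer-<selected>` / `oat-reviewer-<resolved_ceiling>` naming come from the rules above.

```python
# Effort order from rule 1: low < medium < high < xhigh.
CODEX_EFFORTS = ["low", "medium", "high", "xhigh"]


def select_effort(preferred: str, ceiling: str) -> str:
    """Rule 3: selected effort is min(preferred, resolved_ceiling)."""
    rank = CODEX_EFFORTS.index  # raises ValueError for unknown values
    return CODEX_EFFORTS[min(rank(preferred), rank(ceiling))]


def implementer_role(preferred: str, ceiling: str) -> str:
    """Rule 4: implementer/fix work uses the role pinned to the selected effort."""
    return f"oat-phase-implementer-{select_effort(preferred, ceiling)}"


def reviewer_role(ceiling: str) -> str:
    """Rule 5: review work always runs at the resolved ceiling."""
    if ceiling not in CODEX_EFFORTS:
        raise ValueError(f"unknown Codex effort: {ceiling}")
    return f"oat-reviewer-{ceiling}"
```

For example, a phase classified as `high` under a `medium` ceiling resolves to `oat-phase-implementer-medium`, while its review is dispatched to `oat-reviewer-medium`.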
272
+ Claude rules:
273
+
274
+ - Claude ceiling is model-based: `haiku < sonnet < opus`.
275
+ - Select the lowest sufficient model capped by `workflow.dispatchCeiling.claude` or project `oat_dispatch_ceiling`.
276
+ - Pass `model: "<value>"` when `model_axis=selected:<value>` on the Task tool call.
277
+ - Keep `effort_axis=not-applicable`; Claude Code has no separate per-dispatch effort axis.
278
+
279
+ Payload-first invariant:
280
+
281
+ - Build the actual host dispatch argument map before logging.
282
+ - Do not emit `selected:<value>` unless the host invocation contains the corresponding role/model selection.
283
+ - Derive `Dispatch target` and `Effort axis` / `Model axis` from the payload.
284
+
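One way to honor the payload-first invariant for Codex is to compute the logged fields from the argument map itself rather than from the intended selection. This is a rough sketch only: the function names are hypothetical, and it assumes the argument map carries an `agent_type` key whose effort suffix, when present, identifies the pinned variant.

```python
from typing import Dict, List

PINNED_EFFORTS = {"low", "medium", "high", "xhigh"}


def effort_axis_from_payload(payload: Dict[str, str]) -> str:
    """Derive the Codex effort axis from the role named in the payload."""
    suffix = payload["agent_type"].rsplit("-", 1)[-1]
    if suffix in PINNED_EFFORTS:
        return f"selected:{suffix}"
    # Base/unpinned roles follow the provider default effort.
    return "provider-default"


def dispatch_log_fields(payload: Dict[str, str]) -> List[str]:
    """Build the log fields that must agree with the actual dispatch call."""
    return [
        f"Effort axis: {effort_axis_from_payload(payload)}",
        f"Dispatch target: {payload['agent_type']}",
    ]
```

Because the log is built from the payload, a `selected:<value>` entry cannot appear unless the matching pinned role is actually being dispatched.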
285
+ Structured dispatch log:
225
286
 
226
287
  ```text
227
288
  OAT Dispatch: Phase {phase_id} {implementation | fix | review}
228
289
  Host: {Claude Code | Codex | Cursor | other host}
290
+ Preferred effort: {low | medium | high | xhigh | not-applicable}
291
+ Dispatch ceiling: {resolved ceiling value}
292
+ Selected effort: {low | medium | high | xhigh | provider-default | not-applicable}
293
+ Ceiling source: {repo config | project state | preflight prompt}
294
+ Provider default effort: {value | unknown | not-applicable}
229
295
  Model axis: { selected:<value> | inherited | not-applicable | host-auto }
230
- Effort axis: { selected:<value> | inherited | not-applicable | host-auto }
296
+ Effort axis: { selected:<value> | provider-default | inherited | not-applicable | host-auto }
231
297
  Dispatch target: {host-specific subagent/role/tool target}
232
- Rationale: {short rationale grounded in phase scope}
298
+ Rationale: {short rationale grounded in phase scope and any ceiling cap}
233
299
  ```
234
300
 
235
- For Codex implementation/fix dispatches, the rationale must include the phase complexity class that drove the selected effort (for example, mechanical, normal multi-file, or broad/high-risk). If `Effort axis: inherited`, the rationale must also cite the allowed reason for using the parent-session ceiling instead of `selected:low|medium|high`.
236
-
237
- Examples:
301
+ Codex capped example:
238
302
 
239
303
  ```text
240
- OAT Dispatch: Phase p01 implementation
241
- Host: Claude Code
242
- Model axis: selected:haiku
243
- Effort axis: not-applicable
244
- Dispatch target: oat-phase-implementer
245
- Rationale: mechanical template edits; haiku is the lowest sufficient Claude model.
246
-
247
304
  OAT Dispatch: Phase p02 implementation
248
- Host: Claude Code
249
- Model axis: selected:sonnet
250
- Effort axis: not-applicable
251
- Dispatch target: oat-phase-implementer
252
- Rationale: multi-file integration with mock wiring; sonnet is the lowest sufficient Claude model.
253
-
254
- OAT Dispatch: Phase p03 implementation
255
305
  Host: Codex
306
+ Preferred effort: high
307
+ Dispatch ceiling: medium
308
+ Selected effort: medium
309
+ Ceiling source: repo config
310
+ Provider default effort: high
256
311
  Model axis: inherited
257
312
  Effort axis: selected:medium
258
313
  Dispatch target: oat-phase-implementer-medium
259
- Rationale: shared TypeScript/config substrate with cross-file contracts; medium is the lowest sufficient Codex effort.
314
+ Rationale: normal multi-file implementation; high preferred due to integration risk, capped by configured ceiling.
315
+ ```
260
316
 
261
- OAT Dispatch: Phase p04 implementation
262
- Host: Other
263
- Model axis: host-auto
264
- Effort axis: host-auto
265
- Dispatch target: host default
266
- Rationale: host does not expose readable or pinnable dispatch controls; rationale maps to standard effort.
317
+ Codex reviewer example:
267
318
 
268
- OAT Dispatch: Phase p05 review
319
+ ```text
320
+ OAT Dispatch: Phase p02 review
269
321
  Host: Codex
322
+ Preferred effort: high
323
+ Dispatch ceiling: high
324
+ Selected effort: high
325
+ Ceiling source: project state
326
+ Provider default effort: medium
270
327
  Model axis: inherited
271
- Effort axis: inherited
272
- Dispatch target: oat-reviewer
273
- Rationale: reviewer dispatches inherit parent controls by default.
328
+ Effort axis: selected:high
329
+ Dispatch target: oat-reviewer-high
330
+ Rationale: reviewer runs at the configured ceiling for deterministic quality gate behavior.
274
331
  ```
275
332
 
276
- Use `low` for trivial docs-only, narrow single-file, or mechanical changes; `medium` for normal multi-file implementation and moderate integration risk; `high` for broad architecture, security/auth/redaction boundaries, subtle state behavior, or repeated substantive review failures. Use inherited `xhigh` only when the parent/orchestrator session is already xhigh.
333
+ Codex base/unpinned fallback example:
334
+
335
+ ```text
336
+ OAT Dispatch: Phase p02 review
337
+ Host: Codex
338
+ Preferred effort: provider-default
339
+ Dispatch ceiling: high
340
+ Selected effort: provider-default
341
+ Ceiling source: project state
342
+ Provider default effort: medium
343
+ Model axis: inherited
344
+ Effort axis: provider-default
345
+ Dispatch target: oat-reviewer
346
+ Rationale: base unpinned role fallback; effective effort follows Codex provider default.
347
+ ```
277
348
 
278
- Include the resolved implementation dispatch axes and rationale in the Phase Scope packet when known. Reserve `host-auto` for an axis the host uses internally but the orchestrator cannot read or pin; use `inherited` for deliberate inheritance and `not-applicable` when an axis is not meaningful for that host/API.
349
+ Include resolved dispatch context in scope packets when known:
279
350
 
280
351
  ```yaml
281
352
  model_axis: { selected:<value> | inherited | not-applicable | host-auto }
282
- effort_axis: { selected:<value> | inherited | not-applicable | host-auto }
353
+ effort_axis:
354
+ {
355
+ selected:<value> | provider-default | inherited | not-applicable | host-auto,
356
+ }
357
+ dispatch_ceiling: { resolved ceiling value }
358
+ ceiling_source: { repo config | project state | preflight prompt }
359
+ provider_default_effort: { value | unknown | not-applicable }
283
360
  dispatch_rationale: { short rationale }
284
361
  ```
285
362
 
286
- Review dispatch is intentionally different. A reviewer should inherit the parent session's model and effort axes unless the user explicitly requests a review override. In Codex, omit `model` and `reasoning_effort` overrides when spawning `oat-reviewer`; in Claude Code, do not pass a per-review model override. Log review scope as `model_axis=inherited` and `effort_axis=inherited` on hosts that expose an effort axis (such as Codex), or `effort_axis=not-applicable` on hosts that do not (such as Claude Code).
287
-
288
363
  ### Dry-Run Mode
289
364
 
290
365
  When the skill is invoked with `--dry-run`:
@@ -557,25 +632,29 @@ For each phase `pNN` in the plan (or each phase in the current parallel group),
557
632
  spec: {PROJECT_PATH}/spec.md
558
633
  implementation: {PROJECT_PATH}/implementation.md
559
634
  discovery: {PROJECT_PATH}/discovery.md
635
+ delta_recording: record any intentional divergence from spec/design/plan in implementation.md with rationale, source of truth, and follow-up artifact disposition
560
636
  commit_convention: {from plan.md header}
561
637
  workflow_mode: {from state.md or plan.md frontmatter}
562
638
  model_axis: {selected:<value> | inherited | not-applicable | host-auto; omit if unknown}
563
- effort_axis: {selected:<value> | inherited | not-applicable | host-auto; omit if unknown}
639
+ effort_axis: {selected:<value> | provider-default | inherited | not-applicable | host-auto; omit if unknown}
640
+ dispatch_ceiling: {resolved ceiling value; omit if unknown}
641
+ ceiling_source: {repo config | project state | preflight prompt; omit if unknown}
642
+ provider_default_effort: {value | unknown | not-applicable; omit if unknown}
564
643
  dispatch_rationale: {short rationale; omit if unknown}
565
644
  ```
566
645
 
567
646
  2. Perform a pre-dispatch assertion against the host invocation parameters. The Phase Scope fields are audit/context fields; selected axes must also be represented in the actual host dispatch call.
568
647
  - Codex implementer/fix dispatch:
569
- - Before building the `spawn_agent` argument map, classify the phase complexity and choose the lowest sufficient selected effort (`low`, `medium`, or `high`) when the matching effort-specific role is available.
570
- - Build the `spawn_agent` argument map before logging the dispatch. If `effort_axis=selected:low|medium|high`, the argument map MUST use the matching `agent_type`: `"oat-phase-implementer-low"`, `"oat-phase-implementer-medium"`, or `"oat-phase-implementer-high"`. Then derive the `OAT Dispatch:` block `Effort axis:` field from that same argument map.
648
+ - Before building the `spawn_agent` argument map, classify the phase complexity and choose preferred effort (`low`, `medium`, `high`, or `xhigh`), then cap it to the resolved Codex dispatch ceiling.
649
+ - Build the `spawn_agent` argument map before logging the dispatch. If `effort_axis=selected:low|medium|high|xhigh`, the argument map MUST use the matching `agent_type`: `"oat-phase-implementer-low"`, `"oat-phase-implementer-medium"`, `"oat-phase-implementer-high"`, or `"oat-phase-implementer-xhigh"`. Then derive the `OAT Dispatch:` block `Effort axis:` field from that same argument map.
571
650
  - Example selected low payload shape: `agent_type: "oat-phase-implementer-low"` and a Phase Scope message containing `effort_axis: selected:low`.
572
651
  - Immediately after spawning, compare the returned Codex status line with the selected effort before waiting on the agent. If the spawned status reports a different effort than the selected value (for example, the log says `effort_axis=selected:medium` but the spawn result reports `gpt-5.5 high`), treat this as an orchestration deviation. Stop, record the deviation in `implementation.md`, and redispatch with corrected parameters before continuing. Do not use work from the mismatched dispatch.
573
- - If `effort_axis=inherited`, use base `agent_type: "oat-phase-implementer"` and omit `reasoning_effort`. This is the parent-session ceiling path, so the dispatch rationale MUST cite the explicit user/Dispatch Profile override, explain why `selected:high` is insufficient, or record that the selected-effort roles are unavailable or failed to resolve.
652
+ - If `effort_axis=provider-default`, use base `agent_type: "oat-phase-implementer"` and omit `reasoning_effort`. The dispatch rationale MUST say this is a base/unpinned fallback and include provider default effort when known.
574
653
  - Claude Code implementer/fix dispatch:
575
654
  - If `model_axis=selected:<value>`, the Task tool call MUST include `model: "<value>"`.
576
655
  - If `model_axis=inherited`, omit `model`.
577
656
 
578
- 3. Dispatch the selected implementer role (Tier 1 via provider-native subagent mechanism) — the role asserted in the pre-dispatch step above (e.g., `oat-phase-implementer-low`, `oat-phase-implementer-medium`, `oat-phase-implementer-high`, or base `oat-phase-implementer` for inherited effort) — with the Phase Scope block as input and with the asserted host invocation parameters.
657
+ 3. Dispatch the selected implementer role (Tier 1 via provider-native subagent mechanism) — the role asserted in the pre-dispatch step above (e.g., `oat-phase-implementer-low`, `oat-phase-implementer-medium`, `oat-phase-implementer-high`, `oat-phase-implementer-xhigh`, or base `oat-phase-implementer` only for provider-default fallback) — with the Phase Scope block as input and with the asserted host invocation parameters.
579
658
 
580
659
  4. Receive the structured summary (DONE | DONE_WITH_CONCERNS | NEEDS_CONTEXT | BLOCKED).
581
660
 
@@ -610,8 +689,8 @@ Escalate the runtime dispatch control when there is evidence that the current co
610
689
  When escalation is needed:
611
690
 
612
691
  1. If a stronger available control exists, re-dispatch at the next stronger control and include the reason in the scope packet. The escalation ladder is provider-specific:
613
- - **Codex:** `selected:low → selected:medium → selected:high → exhausted`. `high` is the strongest control OAT can select. Beyond `high`: if the parent/orchestrator session is already `xhigh`, dispatch uses `effort_axis=inherited`; otherwise escalation is exhausted — stop, split the phase, or ask the user to re-invoke at `xhigh` (see step 4).
614
- - **Claude Code:** `selected:haiku → selected:sonnet → selected:opus`. `opus` is a selectable terminal step when available (and not capped by a future Claude-specific ceiling).
692
+ - **Codex:** `selected:low -> selected:medium -> selected:high -> selected:xhigh`, capped by the resolved Codex dispatch ceiling.
693
+ - **Claude Code:** `selected:haiku -> selected:sonnet -> selected:opus`, capped by the resolved Claude dispatch ceiling.
615
694
  2. Count the escalation redispatch against the existing bounded retry budget. Escalation changes the control; it does not create extra retry attempts.
616
695
  3. Record a compact note in `implementation.md` when practical:
617
696
  - `Dispatch: p03 escalated to model_axis=selected:opus, effort_axis=not-applicable after repeated review failures.` (Claude Code)
@@ -630,8 +709,9 @@ After the implementer returns DONE (or DONE_WITH_CONCERNS without correctness co
630
709
  **Dispatch:**
631
710
 
632
711
  - Use the same tier that was selected at start.
633
- - Inherit the parent session's model/effort/control for review. Do not choose a separate reviewer model or reasoning effort unless the user explicitly requests an override.
634
- - Tier 1: dispatch `oat-reviewer` via provider-native subagent mechanism with Review Scope:
712
+ - For Codex, dispatch the reviewer variant matching the resolved ceiling (`oat-reviewer-low|medium|high|xhigh`) for deterministic quality gates.
713
+ - For Claude Code, cap any selected review model by the resolved Claude ceiling and keep `effort_axis=not-applicable`.
714
+ - Tier 1: dispatch the selected reviewer target via provider-native subagent mechanism with Review Scope:
635
715
 
636
716
  ```
637
717
  project: {PROJECT_PATH}
@@ -642,13 +722,16 @@ After the implementer returns DONE (or DONE_WITH_CONCERNS without correctness co
642
722
  workflow_mode: {from state.md}
643
723
  artifact_paths: {same as Phase Scope}
644
724
  tasks_in_scope: {list of pNN-tNN IDs in the phase}
725
+ dispatch_ceiling: {resolved ceiling value}
726
+ ceiling_source: {repo config | project state | preflight prompt}
727
+ provider_default_effort: {value | unknown | not-applicable}
645
728
  model_axis: inherited
646
- effort_axis: inherited # on Codex; use not-applicable on Claude Code
647
- dispatch_rationale: review dispatch inherits parent session controls
729
+ effort_axis: selected:{resolved Codex ceiling} # on Codex; use not-applicable on Claude Code
730
+ dispatch_rationale: reviewer runs at the configured ceiling for deterministic quality gate behavior
648
731
  ```
649
732
 
650
733
  - For Codex Tier 1 dispatches, send the Review Scope block as a self-contained packet and keep fresh context (`fork_context: false`). The reviewer is expected to reconstruct context from git state and the OAT artifacts listed above.
651
- - For Codex Tier 1 review dispatches, omit `model` and `reasoning_effort` overrides in the `spawn_agent` call. For Claude Code review dispatches, do not pass a per-review model override. `host-auto` is not the right label when the review is intentionally inheriting parent controls.
734
+ - For Codex Tier 1 review dispatches, use `agent_type: "oat-reviewer-low|medium|high|xhigh"` matching the resolved ceiling. Use base `oat-reviewer` only as a provider-default fallback and log `effort_axis=provider-default`. For Claude Code review dispatches, do not pass a per-review effort override because the effort axis is not applicable; if selecting a model, cap it by the resolved Claude ceiling.
652
735
  - Treat the commit range as authoritative for review scope. `files_changed` is optional orientation metadata only.
653
736
  - If a Codex reviewer does not return a terminal result on the first wait, poll once more. If it still has not concluded, send one concise nudge to return immediately with current findings. If the reviewer still does not conclude, treat the Tier 1 review dispatch as failed for this phase and perform the review inline instead of waiting indefinitely.
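The reviewer-target rules in this hunk can be sketched as a small selection function. This is only an illustration: `reviewerDispatch` and the `ReviewDispatch` shape are hypothetical names, and returning base `oat-reviewer` for Claude Code is an assumption. Only the `oat-reviewer-<ceiling>` variants and the axis labels come from the skill text. Claude model capping is left out.

```typescript
// Hypothetical sketch; not the toolkit's actual dispatch code.
type Provider = "codex" | "claude";
type CodexEffort = "low" | "medium" | "high" | "xhigh";

interface ReviewDispatch {
  agentType: string; // value passed as agent_type on the dispatch call
  effortAxis: string; // value logged as effort_axis
}

function reviewerDispatch(provider: Provider, codexCeiling?: CodexEffort): ReviewDispatch {
  if (provider === "claude") {
    // Effort is not a Claude Code axis, so it is never selected here.
    // Assumption: the base reviewer role is used on Claude Code.
    return { agentType: "oat-reviewer", effortAxis: "not-applicable" };
  }
  if (codexCeiling === undefined) {
    // Provider-default fallback: base role, logged as provider-default.
    return { agentType: "oat-reviewer", effortAxis: "provider-default" };
  }
  // Pinned variant matching the resolved ceiling.
  return {
    agentType: `oat-reviewer-${codexCeiling}`,
    effortAxis: `selected:${codexCeiling}`,
  };
}
```

Pinning the variant to the ceiling, rather than inheriting the parent session, is what makes the quality gate behave the same way regardless of how the orchestrator session was started.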
654
737
 
@@ -669,7 +752,7 @@ On reviewer verdict `fail`, run a bounded fix loop.
669
752
 
670
753
  1. Read `oat_orchestration_retry_limit` from `state.md` frontmatter (default: `2`, range 0–5).
671
754
  2. For each retry (up to the limit):
672
- a. Select/log fix dispatch axes from the fix scope, then perform the same pre-dispatch assertion used for implementation dispatch. A Codex fix dispatch with `effort_axis=selected:low|medium|high` MUST use matching `agent_type: "oat-phase-implementer-low|medium|high"`; a Claude Code fix dispatch with `model_axis=selected:<value>` MUST pass `model: "<value>"` on the Task call.
755
+ a. Select/log fix dispatch axes from the fix scope, then perform the same pre-dispatch assertion used for implementation dispatch. A Codex fix dispatch with `effort_axis=selected:low|medium|high|xhigh` MUST use matching `agent_type: "oat-phase-implementer-low|medium|high|xhigh"`; a Claude Code fix dispatch with `model_axis=selected:<value>` MUST pass `model: "<value>"` on the Task call.
673
756
  b. Dispatch the selected phase implementer role in `fix` mode (Tier 1) OR read the agent and apply fixes inline (Tier 2), with: - `review_artifact`: the path written by the reviewer - `findings`: the Critical + Important findings list - `prior_summary`: the last implementer summary
674
757
  c. Receive the fix summary.
675
758
  d. Re-dispatch the reviewer with the updated commit range.
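The bounded fix loop above can be sketched as follows. The callback names (`dispatchFix`, `reReview`) are hypothetical stand-ins for the Tier 1 or Tier 2 mechanics. Clamping an out-of-range limit into 0–5 is an assumption, since the text states the range but not how violations are handled.

```typescript
// Hypothetical sketch of the bounded fix loop; not the skill's real code.
type Verdict = "pass" | "fail";

interface FixLoopResult {
  verdict: Verdict;
  retriesUsed: number;
}

// Default is 2 when the frontmatter key is absent; the range is 0 to 5.
// Assumption: out-of-range values are clamped rather than rejected.
function clampRetryLimit(raw: number | undefined): number {
  if (raw === undefined || !Number.isInteger(raw)) return 2;
  return Math.min(5, Math.max(0, raw));
}

function runFixLoop(
  initial: Verdict,
  retryLimit: number | undefined,
  dispatchFix: () => void,
  reReview: () => Verdict,
): FixLoopResult {
  const limit = clampRetryLimit(retryLimit);
  let verdict = initial;
  let retriesUsed = 0;
  while (verdict === "fail" && retriesUsed < limit) {
    dispatchFix(); // steps a to c: select axes, dispatch fix, receive summary
    retriesUsed += 1; // escalated redispatches draw from this same budget
    verdict = reReview(); // step d: review the updated commit range
  }
  return { verdict, retriesUsed };
}
```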
@@ -793,6 +876,14 @@ Append a new entry to the `## Orchestration Runs` section between the `<!-- orch
793
876
  #### Outstanding Items
794
877
 
795
878
  - {None | list of excluded phases with review paths and worktree paths}
879
+
880
+ #### Artifact / Design Deltas
881
+
882
+ Run-scoped snapshot only. The durable record is `## Deviations from Plan / Design`; consolidate any non-`None` entries there at the next phase boundary.
883
+
884
+ | Task / Review | Source Artifact | Planned / Documented | Actual / Accepted | Reason | Source of Truth | Follow-up |
885
+ | ----------------------------- | ----------------------------------- | ------------------------------- | -------------------------------------- | ---------------------------- | ------------------------- | ------------------------------------------- |
886
+ | {task_id/review_id or `None`} | {spec.md/design.md/plan.md section} | {planned behavior/taxonomy/API} | {actual shipped behavior/taxonomy/API} | {why divergence is accepted} | {implementation/artifact} | {artifact update task or explicit deferral} |
796
887
  ```
797
888
 
798
889
  Append only — never overwrite prior run entries.
@@ -887,6 +978,14 @@ When pausing:
887
978
  - Verification run
888
979
  - Notable decisions/deviations
889
980
 
981
+ **Design/artifact deltas (required when present):**
982
+
983
+ - If a completed task intentionally diverged from `spec.md`, `design.md`, or `plan.md`, update the `## Deviations from Plan / Design` table in `implementation.md`.
984
+ - For existing project artifacts, treat any `## Deviations...` heading as the deviations section; migrate to the preferred `## Deviations from Plan / Design` heading and table shape when already touching the section.
985
+ - Each delta must include: the affected source artifact/section, the planned/documented expectation, the actual shipped implementation, the reason the divergence is accepted, the current source of truth, and any follow-up artifact update task or explicit deferral.
986
+ - If the implementation is now source of truth and the design/spec/plan is stale, write that directly. Do not treat the stale artifact as a no-op just because code is correct.
987
+ - If no deltas exist for the phase, do not invent one; leave the table unchanged.
988
+
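As a rough illustration of the required delta fields, the sketch below models one row of the `## Deviations from Plan / Design` table and renders it as a markdown row. The `ArtifactDelta` interface, its field names, and both helpers are hypothetical. Only the column set comes from the table template in this diff.

```typescript
// Hypothetical model of one deviations-table row; field names are illustrative.
interface ArtifactDelta {
  taskOrReview: string;
  sourceArtifact: string;
  planned: string;
  actual: string;
  reason: string;
  sourceOfTruth: string;
  followUp: string;
}

// Lists the required fields that are blank, so an incomplete delta can be
// flagged before it is written.
function missingFields(delta: ArtifactDelta): string[] {
  return (Object.keys(delta) as (keyof ArtifactDelta)[]).filter(
    (key) => delta[key].trim() === "",
  );
}

// Renders the row in the template's column order, escaping pipes so a cell
// cannot break the table layout.
function renderDeltaRow(delta: ArtifactDelta): string {
  const cells = [
    delta.taskOrReview,
    delta.sourceArtifact,
    delta.planned,
    delta.actual,
    delta.reason,
    delta.sourceOfTruth,
    delta.followUp,
  ].map((cell) => cell.trim().replace(/\|/g, "\\|"));
  return `| ${cells.join(" | ")} |`;
}
```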
890
989
  **Bookkeeping commit (required):**
891
990
 
892
991
  **DO NOT SKIP.** This commit prevents state drift across sessions.
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: oat-project-plan
3
- version: 1.3.2
3
+ version: 1.3.3
4
4
  description: Use when design.md is complete and executable implementation tasks are needed. Breaks design into bite-sized TDD tasks in canonical plan.md format.
5
5
  disable-model-invocation: true
6
6
  user-invocable: true
@@ -312,6 +312,62 @@ Unless the source artifact or user already supplied a confirmed `oat_plan_hill_p
312
312
 
313
313
  If `## Planning Checklist` is missing (older plans), add it before finalizing with the items above.
314
314
 
315
+ ### Step 11.5: Resolve Dispatch Ceiling Before Implementation Readiness
316
+
317
+ Before marking the plan ready for implementation, resolve the dispatch ceiling
318
+ for the current provider.
319
+
320
+ Resolution order:
321
+
322
+ 1. Repo/user/local config key `workflow.dispatchCeiling.<provider>` via `oat config get`
323
+ 2. Project `state.md` frontmatter key `oat_dispatch_ceiling`
324
+ 3. Interactive planning prompt
325
+ 4. Leave unresolved for implementation preflight when non-interactive
326
+
327
+ Provider values:
328
+
329
+ - Codex: `low`, `medium`, `high`, `xhigh`
330
+ - Claude: `haiku`, `sonnet`, `opus`
331
+
332
+ If no ceiling resolves for the current provider and the session is interactive,
333
+ ask once before final plan review:
334
+
335
+ ```text
336
+ No Codex dispatch ceiling is configured for this project.
337
+
338
+ Choose the maximum Codex reasoning effort OAT may dispatch during implementation:
339
+ low | medium | high | xhigh
340
+
341
+ This controls implementer/reviewer subagent variants. It does not change your Codex config.
342
+ ```
343
+
344
+ Adapt the wording for Claude:
345
+
346
+ ```text
347
+ No Claude dispatch ceiling is configured for this project.
348
+
349
+ Choose the maximum Claude model tier OAT may dispatch during implementation:
350
+ haiku | sonnet | opus
351
+
352
+ This controls provider-native subagent model selection. It does not change your Claude config.
353
+ ```
354
+
355
+ Persist the answer in `"$PROJECT_PATH/state.md"` frontmatter:
356
+
357
+ ```yaml
358
+ oat_dispatch_ceiling:
359
+   provider: codex
360
+   value: high
361
+   source: project-state
362
+ ```
363
+
364
+ Do not prompt when `OAT_NON_INTERACTIVE=1` or when no user-response channel
365
+ exists. In that case, leave the value unresolved. `oat-project-implement`
366
+ must block before work starts if it still cannot resolve a ceiling.
367
+
368
+ Do not treat Codex provider default effort as the OAT dispatch ceiling. Provider
369
+ default is informational for base/unpinned roles only.
370
+
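The resolution order in Step 11.5 can be sketched as a pure function. Everything here is illustrative: `resolveDispatchCeiling`, the `CeilingInputs` shape, and the `source` labels are hypothetical. The caller is assumed to have already read the config value (for example via `oat config get`) and the `state.md` frontmatter value. Treating an invalid value as absent is also an assumption.

```typescript
// Hypothetical sketch of the Step 11.5 resolution order.
type Provider = "codex" | "claude";

const ALLOWED_VALUES: Record<Provider, readonly string[]> = {
  codex: ["low", "medium", "high", "xhigh"],
  claude: ["haiku", "sonnet", "opus"],
};

interface CeilingInputs {
  provider: Provider;
  configValue?: string; // workflow.dispatchCeiling.<provider>, if set
  stateValue?: string; // oat_dispatch_ceiling value from state.md, if set
  interactive: boolean; // false when no user-response channel exists
  ask?: () => string; // one-time planning prompt
}

type CeilingResolution =
  | { status: "resolved"; value: string; source: "config" | "project-state" | "prompt" }
  | { status: "unresolved" };

function resolveDispatchCeiling(inputs: CeilingInputs): CeilingResolution {
  const allowed = ALLOWED_VALUES[inputs.provider];
  const { configValue, stateValue } = inputs;

  // 1. Repo/user/local config wins.
  if (configValue !== undefined && allowed.includes(configValue)) {
    return { status: "resolved", value: configValue, source: "config" };
  }
  // 2. Project state.md frontmatter.
  if (stateValue !== undefined && allowed.includes(stateValue)) {
    return { status: "resolved", value: stateValue, source: "project-state" };
  }
  // 3. Ask once, only when the session is interactive.
  if (inputs.interactive && inputs.ask !== undefined) {
    const answer = inputs.ask().trim();
    if (allowed.includes(answer)) {
      return { status: "resolved", value: answer, source: "prompt" };
    }
  }
  // 4. Leave unresolved for implementation preflight.
  return { status: "unresolved" };
}
```

The provider default effort never appears as an input, which mirrors the rule that it is informational only and is not a ceiling.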
315
371
  ### Step 12: Review Plan with User
316
372
 
317
373
  Present plan summary:
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: oat-project-plan-writing
3
- version: 1.2.3
3
+ version: 1.2.4
4
4
  description: Use when authoring or mutating plan.md in any OAT workflow. Defines canonical format invariants — stable task IDs, required sections, review table rules, and resume guardrails.
5
5
  disable-model-invocation: true
6
6
  user-invocable: false
@@ -48,6 +48,10 @@ Runtime routing note:
48
48
 
49
49
  - Keep `oat_ready_for` canonical as `oat-project-implement`.
50
50
  - Declare parallelism via `oat_plan_parallel_groups` in plan.md frontmatter (empty = sequential; nested arrays of phase IDs = parallel groups). `oat-project-implement` reads this field to choose sequential vs worktree-isolated parallel execution.
51
+ - Dispatch ceilings are not stored in `plan.md`. Plan-producing skills resolve
52
+ them from `workflow.dispatchCeiling.<provider>` or project `state.md`
53
+ frontmatter, then persist interactive answers back to `state.md` as
54
+ `oat_dispatch_ceiling`.
51
55
 
52
56
  Additional frontmatter keys (`oat_phase`, `oat_phase_status`, `oat_blockers`, `oat_last_updated`, `oat_generated`, `oat_template`, `oat_import_reference`, `oat_import_source_path`, `oat_import_provider`) are set by calling skills as needed.
53
57
 
@@ -71,7 +75,7 @@ Validation rules for explicit rows:
71
75
 
72
76
  - `Phase` must match a real `pNN` phase in the plan.
73
77
  - `Claude model` must be `haiku`, `sonnet`, `opus`, `auto`, or blank.
74
- - `Codex effort` must be `low`, `medium`, `high`, `xhigh`, `auto`, or blank. In Codex, `low`, `medium`, and `high` map to effort-specific implementer roles. Codex xhigh is inherited-only; `xhigh` can be honored only by inheriting an already-xhigh parent/orchestrator session, not by selecting an `xhigh` implementer variant.
78
+ - `Codex effort` must be `low`, `medium`, `high`, `xhigh`, `auto`, or blank. In Codex, explicit effort values are preferred controls that `oat-project-implement` caps against the resolved OAT dispatch ceiling and maps to pinned implementer variants when selected. Provider default effort is informational for base/unpinned roles and is not an OAT ceiling.
75
79
  - Blank or `auto` means no explicit constraint for that provider.
76
80
  - `Rationale` is recommended and should explain why runtime selection should not decide on its own.
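One way to read "caps against the resolved OAT dispatch ceiling" is as taking the lower of the preferred effort and the ceiling on the ordered scale `low < medium < high < xhigh`. The sketch below assumes that reading. `capEffort` and `implementerAgentType` are hypothetical helper names, and only the `oat-phase-implementer-<effort>` variant names come from the skill text.

```typescript
// Hypothetical sketch: cap a preferred Codex effort at the resolved ceiling.
const CODEX_ORDER = ["low", "medium", "high", "xhigh"] as const;
type CodexEffort = (typeof CODEX_ORDER)[number];

// Returns the lower of the two efforts on the ordered scale.
function capEffort(preferred: CodexEffort, ceiling: CodexEffort): CodexEffort {
  const preferredRank = CODEX_ORDER.indexOf(preferred);
  const ceilingRank = CODEX_ORDER.indexOf(ceiling);
  return CODEX_ORDER[Math.min(preferredRank, ceilingRank)];
}

// Maps the capped effort to the pinned implementer variant name.
function implementerAgentType(preferred: CodexEffort, ceiling: CodexEffort): string {
  return `oat-phase-implementer-${capEffort(preferred, ceiling)}`;
}
```

Under this reading, a plan row that asks for `xhigh` in a project capped at `medium` dispatches `oat-phase-implementer-medium`, while a row that asks for `low` is left unchanged.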
77
81
 
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: oat-project-quick-start
3
- version: 2.1.1
3
+ version: 2.1.2
4
4
  description: Use when a task is small enough for quick mode or rapid iteration is preferred. Scaffolds a lightweight OAT project from discovery directly to a runnable plan, with optional brainstorming and lightweight design.
5
5
  argument-hint: '<project-name> ["project description"]'
6
6
  disable-model-invocation: true
@@ -456,6 +456,62 @@ Required parallelism pass before finalizing the plan:
456
456
  - Quick mode is not "sequential by default." A quick-start plan is sequential only when the dependency and write-set analysis says it should be.
457
457
  - When a task claims scoped verification, prefer the exact runner invocation that truly scopes to the intended file, test, or target instead of package-level shortcuts that may execute the full suite.
458
458
 
459
+ ### Step 3.5: Resolve Dispatch Ceiling Before Implementation Readiness
460
+
461
+ Before moving the quick project to ready-for-implementation, resolve the
462
+ dispatch ceiling for the current provider.
463
+
464
+ Resolution order:
465
+
466
+ 1. Repo/user/local config key `workflow.dispatchCeiling.<provider>` via `oat config get`
467
+ 2. Project `state.md` frontmatter key `oat_dispatch_ceiling`
468
+ 3. Interactive quick-planning prompt
469
+ 4. Leave unresolved for implementation preflight when non-interactive
470
+
471
+ Provider values:
472
+
473
+ - Codex: `low`, `medium`, `high`, `xhigh`
474
+ - Claude: `haiku`, `sonnet`, `opus`
475
+
476
+ If no ceiling resolves for the current provider and the session is interactive,
477
+ ask once before finalizing `plan.md`:
478
+
479
+ ```text
480
+ No Codex dispatch ceiling is configured for this project.
481
+
482
+ Choose the maximum Codex reasoning effort OAT may dispatch during implementation:
483
+ low | medium | high | xhigh
484
+
485
+ This controls implementer/reviewer subagent variants. It does not change your Codex config.
486
+ ```
487
+
488
+ Adapt the wording for Claude:
489
+
490
+ ```text
491
+ No Claude dispatch ceiling is configured for this project.
492
+
493
+ Choose the maximum Claude model tier OAT may dispatch during implementation:
494
+ haiku | sonnet | opus
495
+
496
+ This controls provider-native subagent model selection. It does not change your Claude config.
497
+ ```
498
+
499
+ Persist the answer in `"$PROJECT_PATH/state.md"` frontmatter:
500
+
501
+ ```yaml
502
+ oat_dispatch_ceiling:
503
+   provider: codex
503
+   value: high
504
+   source: project-state
506
+ ```
507
+
508
+ Do not prompt when `OAT_NON_INTERACTIVE=1` or when no user-response channel
509
+ exists. In that case, leave the value unresolved. `oat-project-implement`
510
+ must block before work starts if it still cannot resolve a ceiling.
511
+
512
+ Do not treat Codex provider default effort as the OAT dispatch ceiling. Provider
513
+ default is informational for base/unpinned roles only.
514
+
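The prompt gating and persistence rules in Step 3.5 can be illustrated with two small helpers. Both names (`mayPrompt`, `renderCeilingFrontmatter`) are hypothetical, and the sketch only builds the frontmatter text. How it is merged into `state.md` is not shown and is left to the skill.

```typescript
// Hypothetical helpers for prompt gating and frontmatter rendering.
type Provider = "codex" | "claude";

// Prompting is allowed only when OAT_NON_INTERACTIVE is not "1" and a
// user-response channel exists.
function mayPrompt(
  env: Record<string, string | undefined>,
  hasUserChannel: boolean,
): boolean {
  return env["OAT_NON_INTERACTIVE"] !== "1" && hasUserChannel;
}

// Builds the oat_dispatch_ceiling block as nested YAML with two-space indent.
function renderCeilingFrontmatter(provider: Provider, value: string): string {
  return [
    "oat_dispatch_ceiling:",
    `  provider: ${provider}`,
    `  value: ${value}`,
    "  source: project-state",
  ].join("\n");
}
```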
459
515
  ### Step 4: Sync Project State
460
516
 
461
517
  Update `"$PROJECT_PATH/state.md"`:
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: oat-project-review-provide
3
- version: 1.3.3
3
+ version: 1.3.4
4
4
  description: Use when completed work in an active OAT project needs a quality gate before merge. Performs a lifecycle-scoped review after a task, phase, or full implementation, unlike oat-review-provide.
5
5
  disable-model-invocation: true
6
6
  user-invocable: true
@@ -15,6 +15,8 @@ Request and execute a code or artifact review for the current project scope.
15
15
 
16
16
  Produce an independent review artifact that verifies requirements/design alignment (mode-aware) and code quality.
17
17
 
18
+ Reviewers should distinguish implementation defects from artifact drift. If code is defensible but `spec.md`, `design.md`, or `plan.md` is stale, frame the finding as artifact alignment rather than a required code change.
19
+
18
20
  ## Prerequisites
19
21
 
20
22
  **Required:** Active project with at least one completed task.
@@ -481,6 +483,12 @@ Build the "Review Scope" metadata for the reviewer:
481
483
  - Deferred Medium count: {DEFERRED_MEDIUM_COUNT}
482
484
  - Deferred Minor count: {DEFERRED_MINOR_COUNT}
483
485
  {DEFERRED_LEDGER}
486
+
487
+ **Design Drift Review Guidance:**
488
+
489
+ - If implementation differs from `spec.md`, `design.md`, or `plan.md`, decide whether the code should change or whether the artifact is stale.
490
+ - Use artifact-alignment framing when shipped implementation is defensible and the lifecycle artifact should be updated.
491
+ - Do not force a code-defect framing for accepted design drift; `oat-project-review-receive` can convert artifact drift into alignment tasks or explicit deferrals.
484
492
  ```
485
493
 
486
494
  ### Step 6: Execute Review (3-Tier Capability Model)