@delegance/claude-autopilot 5.2.2 → 6.2.2
- package/CHANGELOG.md +1027 -1
- package/README.md +104 -17
- package/dist/src/adapters/council/claude.js +2 -1
- package/dist/src/adapters/council/openai.js +14 -7
- package/dist/src/adapters/deploy/_http.d.ts +43 -0
- package/dist/src/adapters/deploy/_http.js +99 -0
- package/dist/src/adapters/deploy/fly.d.ts +206 -0
- package/dist/src/adapters/deploy/fly.js +696 -0
- package/dist/src/adapters/deploy/generic.d.ts +39 -0
- package/dist/src/adapters/deploy/generic.js +98 -0
- package/dist/src/adapters/deploy/index.d.ts +15 -0
- package/dist/src/adapters/deploy/index.js +78 -0
- package/dist/src/adapters/deploy/render.d.ts +181 -0
- package/dist/src/adapters/deploy/render.js +550 -0
- package/dist/src/adapters/deploy/types.d.ts +221 -0
- package/dist/src/adapters/deploy/types.js +15 -0
- package/dist/src/adapters/deploy/vercel.d.ts +143 -0
- package/dist/src/adapters/deploy/vercel.js +426 -0
- package/dist/src/adapters/pricing.d.ts +36 -0
- package/dist/src/adapters/pricing.js +40 -0
- package/dist/src/adapters/review-engine/claude.js +2 -1
- package/dist/src/adapters/review-engine/codex.js +12 -8
- package/dist/src/adapters/review-engine/gemini.js +2 -1
- package/dist/src/adapters/review-engine/openai-compatible.js +2 -1
- package/dist/src/adapters/sdk-loader.d.ts +15 -0
- package/dist/src/adapters/sdk-loader.js +77 -0
- package/dist/src/cli/autopilot.d.ts +71 -0
- package/dist/src/cli/autopilot.js +735 -0
- package/dist/src/cli/brainstorm.d.ts +23 -0
- package/dist/src/cli/brainstorm.js +131 -0
- package/dist/src/cli/costs.d.ts +15 -1
- package/dist/src/cli/costs.js +99 -10
- package/dist/src/cli/deploy.d.ts +71 -0
- package/dist/src/cli/deploy.js +539 -0
- package/dist/src/cli/fix.d.ts +18 -0
- package/dist/src/cli/fix.js +105 -11
- package/dist/src/cli/help-text.d.ts +52 -0
- package/dist/src/cli/help-text.js +400 -0
- package/dist/src/cli/implement.d.ts +91 -0
- package/dist/src/cli/implement.js +196 -0
- package/dist/src/cli/index.js +784 -222
- package/dist/src/cli/json-envelope.d.ts +187 -0
- package/dist/src/cli/json-envelope.js +270 -0
- package/dist/src/cli/json-mode.d.ts +33 -0
- package/dist/src/cli/json-mode.js +201 -0
- package/dist/src/cli/migrate.d.ts +111 -0
- package/dist/src/cli/migrate.js +305 -0
- package/dist/src/cli/plan.d.ts +81 -0
- package/dist/src/cli/plan.js +149 -0
- package/dist/src/cli/pr.d.ts +106 -0
- package/dist/src/cli/pr.js +191 -19
- package/dist/src/cli/preflight.js +102 -1
- package/dist/src/cli/review.d.ts +27 -0
- package/dist/src/cli/review.js +126 -0
- package/dist/src/cli/runs-watch-renderer.d.ts +45 -0
- package/dist/src/cli/runs-watch-renderer.js +275 -0
- package/dist/src/cli/runs-watch.d.ts +41 -0
- package/dist/src/cli/runs-watch.js +395 -0
- package/dist/src/cli/runs.d.ts +122 -0
- package/dist/src/cli/runs.js +902 -0
- package/dist/src/cli/scan.d.ts +93 -0
- package/dist/src/cli/scan.js +166 -40
- package/dist/src/cli/spec.d.ts +66 -0
- package/dist/src/cli/spec.js +132 -0
- package/dist/src/cli/validate.d.ts +29 -0
- package/dist/src/cli/validate.js +131 -0
- package/dist/src/core/config/schema.d.ts +43 -0
- package/dist/src/core/config/schema.js +25 -0
- package/dist/src/core/config/types.d.ts +17 -0
- package/dist/src/core/council/runner.d.ts +10 -1
- package/dist/src/core/council/runner.js +25 -3
- package/dist/src/core/council/types.d.ts +7 -0
- package/dist/src/core/errors.d.ts +1 -1
- package/dist/src/core/errors.js +12 -0
- package/dist/src/core/logging/redaction.d.ts +13 -0
- package/dist/src/core/logging/redaction.js +20 -0
- package/dist/src/core/migrate/detector-rules.js +6 -0
- package/dist/src/core/migrate/schema-validator.js +22 -1
- package/dist/src/core/phases/static-rules.d.ts +5 -1
- package/dist/src/core/phases/static-rules.js +2 -5
- package/dist/src/core/run-state/budget.d.ts +88 -0
- package/dist/src/core/run-state/budget.js +141 -0
- package/dist/src/core/run-state/cli-internal.d.ts +21 -0
- package/dist/src/core/run-state/cli-internal.js +174 -0
- package/dist/src/core/run-state/events.d.ts +59 -0
- package/dist/src/core/run-state/events.js +504 -0
- package/dist/src/core/run-state/lock.d.ts +61 -0
- package/dist/src/core/run-state/lock.js +206 -0
- package/dist/src/core/run-state/phase-context.d.ts +60 -0
- package/dist/src/core/run-state/phase-context.js +108 -0
- package/dist/src/core/run-state/phase-registry.d.ts +137 -0
- package/dist/src/core/run-state/phase-registry.js +162 -0
- package/dist/src/core/run-state/phase-runner.d.ts +80 -0
- package/dist/src/core/run-state/phase-runner.js +447 -0
- package/dist/src/core/run-state/provider-readback.d.ts +130 -0
- package/dist/src/core/run-state/provider-readback.js +426 -0
- package/dist/src/core/run-state/replay-decision.d.ts +69 -0
- package/dist/src/core/run-state/replay-decision.js +144 -0
- package/dist/src/core/run-state/resolve-engine.d.ts +100 -0
- package/dist/src/core/run-state/resolve-engine.js +190 -0
- package/dist/src/core/run-state/resume-preflight.d.ts +66 -0
- package/dist/src/core/run-state/resume-preflight.js +116 -0
- package/dist/src/core/run-state/run-phase-with-lifecycle.d.ts +73 -0
- package/dist/src/core/run-state/run-phase-with-lifecycle.js +186 -0
- package/dist/src/core/run-state/runs.d.ts +57 -0
- package/dist/src/core/run-state/runs.js +288 -0
- package/dist/src/core/run-state/snapshot.d.ts +14 -0
- package/dist/src/core/run-state/snapshot.js +114 -0
- package/dist/src/core/run-state/state.d.ts +40 -0
- package/dist/src/core/run-state/state.js +164 -0
- package/dist/src/core/run-state/types.d.ts +278 -0
- package/dist/src/core/run-state/types.js +13 -0
- package/dist/src/core/run-state/ulid.d.ts +11 -0
- package/dist/src/core/run-state/ulid.js +95 -0
- package/dist/src/core/schema-alignment/extractor/index.d.ts +1 -1
- package/dist/src/core/schema-alignment/extractor/index.js +2 -2
- package/dist/src/core/schema-alignment/extractor/prisma.d.ts +13 -1
- package/dist/src/core/schema-alignment/extractor/prisma.js +65 -10
- package/dist/src/core/schema-alignment/git-history.d.ts +19 -0
- package/dist/src/core/schema-alignment/git-history.js +53 -0
- package/dist/src/core/static-rules/rules/brand-tokens.js +2 -2
- package/dist/src/core/static-rules/rules/schema-alignment.js +14 -4
- package/package.json +9 -5
- package/scripts/autoregress.ts +3 -2
- package/skills/claude-autopilot.md +1 -1
- package/skills/make-interfaces-feel-better/SKILL.md +104 -0
- package/skills/migrate/SKILL.md +193 -47
- package/skills/simplify-ui/SKILL.md +103 -0
- package/skills/ui/SKILL.md +117 -0
- package/skills/ui-ux-pro-max/SKILL.md +90 -0
package/CHANGELOG.md
CHANGED
|
@@ -1,4 +1,1030 @@
|
|
|
1
|
-
|
|
1
|
+
## Unreleased
|
|
2
|
+
|
|
3
|
+
- v5.6 Phase 7 (docs reconciliation) — pending.
|
|
4
|
+
|
|
5
|
+
## 6.2.2 — `claude-autopilot autopilot --json` envelope + cache version policy (2026-05-07)
|
|
6
|
+
|
|
7
|
+
**Headline.** Closes out the v6.2.x track. `claude-autopilot autopilot --json` now emits exactly one machine-readable envelope on stdout: successful runs, pre-run failures, and mid-pipeline failures all produce the same shape, so CI consumers can branch on `.exitCode` / `.failedAtPhase` / `.errorCode` directly without parsing stderr NDJSON. The cache contract gains a `MIN_SUPPORTED..MAX_SUPPORTED` schema-version window, so a run dir whose `schema_version` falls outside the window (for example, one written by a newer binary) fails with a clear error instead of an opaque shape crash. The migration guide gets a new "v6.1 → v6.2: one runId across the pipeline" section.
|
|
8
|
+
|
|
9
|
+
**Motivation — Codex review of the v6.2 spec (3 WARNING + 3 NOTE).** The v6.2 orchestrator spec reserved `--json` for v6.2.2; the spec for this PR (Codex 5.3-reviewed) folded back three warnings (strict equality on schemaVersion blocks rolling deploys, exactly-once envelope needs uncaughtException coverage, exit-code taxonomy ambiguous for pre-run failures) and three notes (six-phases vs four-phases migration text, `errorCode` union too loose, stdout purity test under stderr load).
|
|
10
|
+
|
|
11
|
+
**What's in (the 9 deliverables from the spec's "Scope" section).**
|
|
12
|
+
|
|
13
|
+
- **Outer JSON envelope** for `claude-autopilot autopilot --json`. New `AutopilotJsonEnvelope` shape (`version: '1'`, `verb: 'autopilot'`, `runId | null`, `status`, `exitCode`, `phases[]`, `totalCostUSD`, `durationMs`, `errorCode?`, `errorMessage?`, `failedAtPhase?`, `failedPhaseName?`). Pre-run failures get `runId: null` + populated `errorCode`. Mid-pipeline failures get `failedAtPhase` + `failedPhaseName`.
|
|
14
|
+
- **Bounded `AutopilotErrorCode` enum.** Exact strings: `invalid_config | budget_exceeded | lock_held | corrupted_state | partial_write | needs_human | phase_failed | internal_error`. CI consumers can rely on these specific values; new codes ship as minor versions of the envelope schema. Per codex NOTE #5.
|
|
15
|
+
- **Single-write latch + uncaughtException / unhandledRejection handlers.** Module-scoped boolean in `src/cli/json-envelope.ts` flips BEFORE writing so subsequent calls no-op. The orchestrator's `runAutopilotWithJsonEnvelope` installs process-level fatal handlers that consult the latch — if an envelope already shipped, they exit silently; otherwise they emit a fallback `internal_error` envelope before exiting `1`. Test seam `__testInstallProcessHandlers: false` keeps the handlers from leaking across the suite. Per codex WARNING #2.
|
|
16
|
+
- **Deterministic errorCode-to-exit-code mapping** via `computeAutopilotExitCode`: `0` for success; `1` for `invalid_config | phase_failed | internal_error`; `2` for `lock_held | corrupted_state | partial_write`; `78` for `budget_exceeded | needs_human`. Per codex WARNING #3.
|
|
17
|
+
- **Cache contract version policy** in `src/core/run-state/state.ts` + the replay path in `events.ts`. New exports `RUN_STATE_MIN_SUPPORTED_SCHEMA_VERSION = 1` and `RUN_STATE_MAX_SUPPORTED_SCHEMA_VERSION = RUN_STATE_SCHEMA_VERSION`. `replayState()` throws `corrupted_state` when the persisted `schema_version` falls outside the window, with a message naming both bounds for operator triage. Future minor versions can additively expand the schema while preserving forward-read compatibility (bump writer, leave reader); major bumps reset `MIN_SUPPORTED` to break with the past explicitly. Per codex WARNING #1.
|
|
18
|
+
- **Migration guide section.** New "v6.1 → v6.2: one runId across the pipeline" section in `docs/v6/migration-guide.md` walks through the per-verb → orchestrator collapse, the `--json` envelope shape (success / pre-run failure / mid-pipeline failure examples), the `AutopilotErrorCode` taxonomy table, and the cache version policy. Flags the v6.2.0 vs v6.2.1 phase-set difference per codex NOTE #4 — examples assume the v6.2.1 6-phase set (`scan → spec → plan → implement → migrate → pr`).
|
|
19
|
+
- **Channel discipline preserved.** The envelope is the only thing on stdout in `--json` mode (orchestrator runs with `__silent: true`). NDJSON events continue to flow to stderr unchanged via the existing v6 Phase 5 helpers.
|
|
20
|
+
- **Dispatcher wiring.** `src/cli/index.ts` plumbs `--json` through to `runAutopilotWithJsonEnvelope`; pre-run validation failures (`--mode`, `--budget`) emit envelopes too so CI never sees free-text errors when `--json` is on.
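
The envelope shape and the exit-code mapping listed above can be sketched roughly as follows. The field names, error-code strings, and exit-code buckets are taken from the entries above; the `status` literal, the `phases` element type, the type of `failedAtPhase`, and the body of `computeAutopilotExitCode` are assumptions for illustration, not the package's actual source.

```typescript
// Illustrative sketch only; not copied from the package source.
type AutopilotErrorCode =
  | "invalid_config"
  | "budget_exceeded"
  | "lock_held"
  | "corrupted_state"
  | "partial_write"
  | "needs_human"
  | "phase_failed"
  | "internal_error";

interface AutopilotJsonEnvelope {
  version: "1";
  verb: "autopilot";
  runId: string | null; // null for pre-run failures
  status: string; // assumption: the exact literals are not listed in the changelog
  exitCode: number;
  phases: unknown[]; // assumption: the element shape is not listed in the changelog
  totalCostUSD: number;
  durationMs: number;
  errorCode?: AutopilotErrorCode;
  errorMessage?: string;
  failedAtPhase?: number; // assumption: treated here as a phase index
  failedPhaseName?: string;
}

// One exit code per error code, following the buckets in the changelog.
const EXIT_CODE_BY_ERROR: Record<AutopilotErrorCode, number> = {
  invalid_config: 1,
  phase_failed: 1,
  internal_error: 1,
  lock_held: 2,
  corrupted_state: 2,
  partial_write: 2,
  budget_exceeded: 78,
  needs_human: 78,
};

function computeAutopilotExitCode(errorCode?: AutopilotErrorCode): number {
  return errorCode === undefined ? 0 : EXIT_CODE_BY_ERROR[errorCode];
}

// Example: a pre-run failure has no run dir yet, so runId is null.
const preRunFailure: AutopilotJsonEnvelope = {
  version: "1",
  verb: "autopilot",
  runId: null,
  status: "failed", // hypothetical literal
  exitCode: computeAutopilotExitCode("invalid_config"),
  phases: [],
  totalCostUSD: 0,
  durationMs: 0,
  errorCode: "invalid_config",
  errorMessage: "unknown phase name in --phases",
};
```

Keeping the mapping in a `Record` keyed by the enum means that adding a new error code without choosing an exit code for it fails type-checking.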
|
|
21
|
+
|
|
22
|
+
**Tests.** Baseline 1534 → 1548 (+14 net new):
|
|
23
|
+
|
|
24
|
+
- 9 envelope tests in `tests/cli/autopilot-json-envelope.test.ts` covering the 6 spec scenarios (success, pre-run failure, mid-pipeline failure, no-ANSI on stdout, stdout purity under stderr load, single-write latch + uncaughtException) plus 1 latch sanity test and 2 exit-code/enum mapping tests.
|
|
25
|
+
- 5 schema-version range tests in `tests/run-state/state.test.ts` covering the bounds export plus accept-in-range, reject-below-MIN, reject-above-MAX, and message-names-both-bounds.
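
The single-write latch exercised by the envelope tests can be pictured with a small sketch. The changelog describes a module-scoped boolean in `src/cli/json-envelope.ts`; the version below uses a closure and an injected writer so it stays self-contained, and every name in it (`createEnvelopeEmitter`, `handleFatal`, the abbreviated fallback envelope) is hypothetical.

```typescript
// Illustrative sketch; all names here are hypothetical.
type Writer = (text: string) => void;

interface EnvelopeEmitter {
  emit(envelope: object): boolean;
  hasEmitted(): boolean;
}

function createEnvelopeEmitter(write: Writer): EnvelopeEmitter {
  let emitted = false; // the latch
  return {
    emit(envelope: object): boolean {
      if (emitted) return false; // every later call is a no-op
      emitted = true; // flip BEFORE writing, so a failing write cannot lead to a second attempt
      write(JSON.stringify(envelope) + "\n");
      return true;
    },
    hasEmitted(): boolean {
      return emitted;
    },
  };
}

// What a fatal handler might do: stay silent if an envelope already shipped,
// otherwise emit a fallback internal_error envelope (abbreviated here).
function handleFatal(emitter: EnvelopeEmitter, err: unknown): boolean {
  if (emitter.hasEmitted()) return false;
  const message = err instanceof Error ? err.message : String(err);
  return emitter.emit({
    version: "1",
    verb: "autopilot",
    runId: null,
    exitCode: 1,
    errorCode: "internal_error",
    errorMessage: message,
  });
}
```

Flipping the latch before the write is the important ordering: if the write itself throws and reaches a fatal handler, the handler sees the latch already set and does not write a second envelope.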
|
|
26
|
+
|
|
27
|
+
**Engine-off path unchanged.** The schema-version range check applies inside `replayState()` (engine-on territory). Engine-off invocations don't read run dirs and are byte-for-byte identical to v6.2.1.
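
A minimal sketch of the schema-version window check, assuming the current schema version is `1` (consistent with "v6.x has only one schema version" below). The two bound constants use the export names from the entry above; `CorruptedStateError` and `assertSupportedSchemaVersion` are hypothetical stand-ins for whatever `replayState()` does internally.

```typescript
// Illustrative sketch; the error class and function name are hypothetical.
const RUN_STATE_SCHEMA_VERSION = 1; // assumption: current writer version
const RUN_STATE_MIN_SUPPORTED_SCHEMA_VERSION = 1;
const RUN_STATE_MAX_SUPPORTED_SCHEMA_VERSION = RUN_STATE_SCHEMA_VERSION;

class CorruptedStateError extends Error {
  readonly code = "corrupted_state";
}

function assertSupportedSchemaVersion(persisted: number): void {
  const min = RUN_STATE_MIN_SUPPORTED_SCHEMA_VERSION;
  const max = RUN_STATE_MAX_SUPPORTED_SCHEMA_VERSION;
  if (!Number.isInteger(persisted) || persisted < min || persisted > max) {
    // The message names both bounds so an operator can tell whether the run
    // dir is too old or too new for this binary.
    throw new CorruptedStateError(
      `run state schema_version ${persisted} is outside the supported window [${min}, ${max}]`,
    );
  }
}
```

With a range rather than strict equality, a reader can accept every version between the two bounds, which is what lets a writer bump ahead of older readers during a rolling deploy.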
|
|
28
|
+
|
|
29
|
+
**Out of scope (deliberate, see spec for full list).**
|
|
30
|
+
- `--json` envelope on individual wrapped verbs other than `autopilot`. They already emit per-verb envelopes via the v6 Phase 5 helper; no change needed.
|
|
31
|
+
- Streaming JSON (newline-delimited progress events on stdout). v6.3 — would need a major channel-discipline change.
|
|
32
|
+
- Schema migration tooling. v6.x has only one schema version; migration tooling is reserved for the v7 layout change.
|
|
33
|
+
|
|
34
|
+
**Spec.** docs/specs/v6.2.2-json-envelope-and-docs.md (3 WARNING + 3 NOTE folded from the Codex 5.3 review).
|
|
35
|
+
|
|
36
|
+
## 6.2.1 — Side-effect phase idempotency contracts (`migrate` + `pr`) (2026-05-07)
|
|
37
|
+
|
|
38
|
+
**Headline.** Side-effecting phases now satisfy a registry-enforced two-step contract — record a deterministic "I'm starting this work" breadcrumb BEFORE the side-effect, then one reconciliation ref per durable artifact AFTER. With the contract in place, `migrate` and `pr` enter the orchestrator's `--mode=full` registry, expanding the v6.2.0 `scan → spec → plan → implement` pipeline to the full **6-phase** flow `scan → spec → plan → implement → migrate → pr` under one runId.
|
|
39
|
+
|
|
40
|
+
**Motivation — Codex CRITICAL gate from v6.2.** The v6.2 orchestrator spec flagged side-effect resume as the riskiest property to certify before adding `migrate` or `pr`: a partial crash mid-dispatch could leave the engine blind to applied work, causing the resume preflight to either silently re-run side effects (data loss) or pessimistically refuse every retry (operability tax). v6.2.1 closes the gap with a uniform contract every side-effecting phase must declare AND a registry-time guard that throws if the declaration is missing.
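
The registry-time guard mentioned above can be sketched in a few lines. The `hasSideEffects`, `preEffectRefKinds`, and `postEffectRefKinds` names and the error message format come from this release's notes; the `PhaseRegistration` interface below is a simplified assumption (the real type is generic over phase input and output).

```typescript
// Illustrative sketch; the interface is a simplified assumption.
interface PhaseRegistration {
  name: string;
  hasSideEffects: boolean;
  preEffectRefKinds?: readonly string[];
  postEffectRefKinds?: readonly string[];
}

function registerPhase(reg: PhaseRegistration): PhaseRegistration {
  const missingContract =
    reg.preEffectRefKinds === undefined || reg.postEffectRefKinds === undefined;
  if (reg.hasSideEffects && missingContract) {
    throw new Error(
      `registry: side-effect phase ${reg.name} missing idempotency contract`,
    );
  }
  return reg;
}

// Read-only phases may omit both arrays.
registerPhase({ name: "scan", hasSideEffects: false });

// Side-effecting phases must declare both.
registerPhase({
  name: "migrate",
  hasSideEffects: true,
  preEffectRefKinds: ["migration-batch"],
  postEffectRefKinds: ["migration-version"],
});
```

The check only looks for a missing declaration, so an explicitly empty array still counts as declared.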
|
|
41
|
+
|
|
42
|
+
**What's in (the 7 deliverables from spec section "Scope of THIS PR").**
|
|
43
|
+
|
|
44
|
+
- **New `migration-batch` ref kind** in `ExternalRefKind` (`src/core/run-state/types.ts`). Documented semantics: "deterministic id covers a planned migration batch; emitted BEFORE dispatch so a partial crash leaves a resume target." Joins `migration-version` (the post-effect reconciliation ref).
|
|
45
|
+
- **`migrate` pre-effect breadcrumb.** `src/cli/migrate.ts` now emits a `migration-batch` ref BEFORE `dispatchFn(input)` — a partial crash leaves the orchestrator a resume target. The post-success `migration-version` refs stay (one per applied migration). Per the v6.2.1 spec, the batch id uses the `${env}:pre-dispatch:${Date.now()}` fallback form because no Delegance migrate skill (Supabase, Rails, Alembic, …) exposes its planned set pre-dispatch — the deterministic-id form `sha256(env+plannedMigrations)` is reserved for a follow-up that adds a planning verb to the skill protocol.
|
|
46
|
+
- **Provider readback for `migration-batch`** in `src/core/run-state/provider-readback.ts`. Queries the dispatcher's ledger for the planned set + applied set, returns `merged` (all applied), `open` (some pending), `failed` (any errored), or `unknown` (fail closed on missing fetcher / throw / null). New `MigrationBatchFetcher` interface + `registerMigrationBatchFetcher` seam alongside the existing `MigrationStateFetcher`.
|
|
47
|
+
- **Registry-time enforcement** in `src/core/run-state/phase-registry.ts`. New `registerPhase()` helper throws `Error: registry: side-effect phase <name> missing idempotency contract` when a `hasSideEffects: true` registration omits `preEffectRefKinds` or `postEffectRefKinds`. Applied to all six entries; the four read-only phases (scan/spec/plan/implement) omit the arrays without complaint.
|
|
48
|
+
- **`buildMigratePhase` and `buildPrPhase` builders** extracted following the v6.2.0 builder pattern (scan/spec/plan/implement). Each verb's existing `runX(options)` continues to delegate to its builder — direct CLI behavior is byte-for-byte identical to v6.2.0. The full registry now has: `scan / spec / plan / implement / migrate / pr`.
|
|
49
|
+
- **Resume preflight in orchestrator** (`src/cli/autopilot.ts` + new `src/core/run-state/resume-preflight.ts`). Before invoking `runPhase` on any side-effecting phase, the orchestrator collects prior `phase.success` + `phase.externalRef` events from `events.ndjson` and routes per the spec decision matrix: all post-effect refs `merged`/`live` → emit synthetic `phase.success` and skip; pre-effect breadcrumb `open` → retry (the phase body's own ledger handles dedup); otherwise → emit `replay.override` + throw `GuardrailError('needs_human')`. New error code `needs_human` joins the taxonomy in `src/core/errors.ts`.
|
|
50
|
+
- **`--mode=full` extended** to 6 phases (`DEFAULT_FULL_PHASES` in `phase-registry.ts`). After v6.2.1, `claude-autopilot autopilot` runs the entire pipeline under one runId — the YC-demo win deferred from v6.2.0.
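
The resume-preflight routing described in the list above reduces to a small decision function. This is a simplified sketch: the status strings and the three outcomes follow the changelog text, while the `PriorRefs` shape, the function name, and the `proceed_fresh` branch (inferred from the "proceed-fresh" test case name) are assumptions.

```typescript
// Illustrative sketch; shapes and names are assumptions.
type ReadbackStatus = "merged" | "live" | "open" | "failed" | "unknown";
type PreflightDecision = "skip" | "retry" | "needs_human" | "proceed_fresh";

interface PriorRefs {
  preEffect: ReadbackStatus[]; // readback of pre-effect breadcrumbs
  postEffect: ReadbackStatus[]; // readback of post-effect reconciliation refs
}

function decideResume(prior: PriorRefs): PreflightDecision {
  // Nothing recorded from an earlier attempt: run the phase normally.
  if (prior.preEffect.length === 0 && prior.postEffect.length === 0) {
    return "proceed_fresh";
  }
  // Every durable artifact is confirmed: emit a synthetic success and skip.
  const allConfirmed =
    prior.postEffect.length > 0 &&
    prior.postEffect.every((s) => s === "merged" || s === "live");
  if (allConfirmed) return "skip";
  // The breadcrumb shows work still pending: retry and let the phase's own
  // ledger handle dedup.
  const breadcrumbOpen =
    prior.preEffect.length > 0 && prior.preEffect.every((s) => s === "open");
  if (breadcrumbOpen) return "retry";
  // Anything else is ambiguous, so fail closed.
  return "needs_human";
}
```

In the orchestrator, the `needs_human` outcome corresponds to emitting `replay.override` and throwing `GuardrailError('needs_human')`.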
|
|
51
|
+
|
|
52
|
+
**Tests.** Baseline 1509 → 1532 (+23 net new):
|
|
53
|
+
|
|
54
|
+
- 9 gating tests in `tests/cli/autopilot-side-effect-resume.test.ts` covering the 6 spec scenarios (migrate partial-crash retry, migrate full-success skip, pr-open skip, pr-closed needs-human, registry rejection, run-scope budget no-double-charge) plus 3 edge cases (proceed-fresh, prior success without refs, errored-ledger needs-human).
|
|
55
|
+
- 8 unit tests in `tests/run-state/provider-readback.test.ts` covering the new `migration-batch` readback (merged / open / failed / empty plan / null fetcher / throw / no fetcher / default-registry routing).
|
|
56
|
+
- 2 updated tests in `tests/cli/migrate-engine-smoke.test.ts` to account for the new pre-effect breadcrumb (now `1 + N` refs per run instead of `N`).
|
|
57
|
+
- 4 new test variants for the contract guard (`hasSideEffects: true` with each missing array, plus the empty-postEffect / read-only positive cases).
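
The `migration-batch` readback covered by the unit tests above classifies a batch from the dispatcher's ledger. The sketch below is synchronous and uses a plain function type for the fetcher to stay short; the real `MigrationBatchFetcher` is described as an interface and may well be async. The `BatchLedger` shape is hypothetical, and the changelog does not say how an empty plan is classified, so failing closed there is an assumption.

```typescript
// Illustrative sketch; ledger shape and sync fetcher are assumptions.
type BatchStatus = "merged" | "open" | "failed" | "unknown";

interface BatchLedger {
  planned: string[];
  applied: string[];
  errored: string[];
}

type MigrationBatchFetcher = (batchId: string) => BatchLedger | null;

function readbackMigrationBatch(
  batchId: string,
  fetcher?: MigrationBatchFetcher,
): BatchStatus {
  if (!fetcher) return "unknown"; // no fetcher registered: fail closed
  let ledger: BatchLedger | null;
  try {
    ledger = fetcher(batchId);
  } catch {
    return "unknown"; // fetcher threw: fail closed
  }
  if (ledger === null) return "unknown";
  if (ledger.errored.length > 0) return "failed"; // any errored migration
  if (ledger.planned.length === 0) return "unknown"; // assumption, see lead-in
  const applied = ledger.applied;
  const allApplied = ledger.planned.every((id) => applied.indexOf(id) !== -1);
  return allApplied ? "merged" : "open";
}
```

Returning `unknown` on every uncertain path keeps the resume preflight from treating missing information as proof that work was applied.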
|
|
58
|
+
|
|
59
|
+
**Engine-off path unchanged.** Existing `migrate`/`pr` invocations without `--engine` continue byte-for-byte identical. The engine-off escape hatch threads through `executeMigratePhase(input, null)` / `executePrPhase(input, null)`, where a null `ctx` makes `emitExternalRef` a no-op — same precedent as every other wrapped verb.
|
|
60
|
+
|
|
61
|
+
**Out of scope (deliberate, see spec for full list).**
|
|
62
|
+
- Deterministic batch id (`sha256(env + plannedMigrations)`) — requires extracting a `planMigrations()` verb from each migrate skill's protocol. v6.2.x follow-up.
|
|
63
|
+
- `implement`'s `git-remote-push` ref (declared in the spec table but not yet emitted by `implement.ts`). v6.2.x follow-up.
|
|
64
|
+
- Cross-run ref dedup (e.g. recognizing two pre-dispatch breadcrumbs as the same operation across runs). Not needed for orchestrator MVP.
|
|
65
|
+
- Provider readback for non-Delegance migrate skills (Rails, Alembic, …). v6.2.1 ships the contract; per-skill readback is per-skill follow-up work.
|
|
66
|
+
|
|
67
|
+
**Spec.** docs/specs/v6.2.1-side-effect-idempotency.md (Codex CRITICAL gate from v6.2 — folded back as the foundation for this PR).
|
|
68
|
+
|
|
69
|
+
## 6.2.0 — Multi-phase orchestrator (`claude-autopilot autopilot`) (2026-05-07)
|
|
70
|
+
|
|
71
|
+
**Headline.** New top-level `claude-autopilot autopilot` verb runs `scan → spec → plan → implement` under **one runId**. The pre-v6.2 chain (`scan && spec && plan && implement`) created four separate runs with no parent — the orchestrator collapses them into a single ledger so `claude-autopilot runs watch <id>` covers the whole pipeline and a `--budget=$25` cap ticks down across phases instead of resetting per verb.
|
|
72
|
+
|
|
73
|
+
**What's in.**
|
|
74
|
+
- **`claude-autopilot autopilot [options]`**: sequential N-phase orchestrator. Engine-on REQUIRED (rejected at pre-flight if `--no-engine` / `CLAUDE_AUTOPILOT_ENGINE=off` / `engine.enabled: false`). Lifecycle: `createRun({ phases })` → per-phase `buildPhase + runPhase` → emit `run.complete` exactly once → refresh state snapshot → release lock in `finally`. Non-interactive (a `pause` budget decision becomes a hard fail), so it works in CI without prompting.
|
|
75
|
+
- **`build<Phase>Phase()` builders** extracted from `scan`, `spec`, `plan`, `implement`. Each verb's existing `runX(options)` continues to call its builder internally — direct CLI behavior is byte-for-byte identical to v6.1. Per-verb parity tests (`tests/cli/<verb>-builder-parity.test.ts`) compare stdout / stderr / `events.ndjson` between the legacy entry and the explicit builder + `runPhaseWithLifecycle` path.
|
|
76
|
+
- **Phase registry** at `src/core/run-state/phase-registry.ts`. `as const` + per-entry `satisfies PhaseRegistration<I, O>` preserves per-phase I/O typing through dynamic dispatch (per codex review NOTE #5). `getPhase(name)`, `listPhaseNames()`, and `validatePhaseNames(names)` are the public surface; `--phases=<csv>` validation lives here.
|
|
77
|
+
- **Run-scope budget** — `BudgetConfig.scope: 'phase' | 'run'` (default `'phase'` for back-compat). When `scope === 'run'` the orchestrator's per-phase budget gates resolve against cross-phase `phase.cost` totals so the `$25` demo narrative ticks down across the whole pipeline. `sumPhaseCost(events, '*')` cross-phase overload added. Both `BudgetCheck.scope` and `BudgetCheckEvent.scope` carry the resolution forward to observers (`runs show <id> --events`, future cost dashboards). Per codex review WARNING #2 — pulled forward into v6.2.0 (was deferred to v6.2.2 in the initial draft).
|
|
78
|
+
- **Exit-code matrix** (per codex review WARNING #3) — 0 success, 78 budget_exceeded, 2 engine error (`lock_held` / `corrupted_state` / `partial_write`), 1 everything else. Phase failure wins over finalization error.
|
|
79
|
+
- **CLI surface**: `--mode=full` (default — `scan → spec → plan → implement`), `--phases=<csv>` for custom lists, `--budget=<usd>` for the run-scope cap. `--mode=fix` and `--mode=review` reserved for v6.2.1+; `--json` envelope reserved for v6.2.2.
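
The difference between phase-scope and run-scope budgeting comes down to which `phase.cost` events are summed. A minimal sketch, assuming a simplified event shape (`type`, `phase`, `costUSD` are hypothetical field names) and a hypothetical `spentAgainstCap` helper; only `sumPhaseCost(events, '*')` and the `'phase' | 'run'` scope values come from the entries above.

```typescript
// Illustrative sketch; event field names are assumptions.
interface RunEvent {
  type: string;
  phase?: string;
  costUSD?: number;
}

type BudgetScope = "phase" | "run";

// Passing "*" sums across every phase; any other value filters to one phase.
function sumPhaseCost(events: readonly RunEvent[], phase: string): number {
  let total = 0;
  for (const e of events) {
    if (e.type !== "phase.cost") continue;
    if (phase !== "*" && e.phase !== phase) continue;
    total += e.costUSD ?? 0;
  }
  return total;
}

// The amount a budget gate compares against the cap.
function spentAgainstCap(
  events: readonly RunEvent[],
  currentPhase: string,
  scope: BudgetScope = "phase", // default keeps pre-v6.2 behavior
): number {
  return sumPhaseCost(events, scope === "run" ? "*" : currentPhase);
}
```

With `scope: 'run'`, spend from earlier phases still counts when a later phase's gate is evaluated, which is what makes a single cap tick down across the pipeline.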
|
|
80
|
+
|
|
81
|
+
**Tests.** Baseline 1492 → 1509 (+17 new):
|
|
82
|
+
- 4 builder-parity tests (`scan`, `spec`, `plan`, `implement`) covering stdout / stderr / events triple-snapshot.
|
|
83
|
+
- 6 run-scope budget tests in `tests/run-state/budget.test.ts` covering scope flag default, run-scope happy path, run-scope cap exceeded across phases, Layer 1 advisory in run-scope, and phase/run scope math equivalence (regression guard).
|
|
84
|
+
- 7 orchestrator integration tests in `tests/cli/autopilot.test.ts` covering: 3-phase happy path, scan-failure phase 0, run-scope budget exceeded → exit 78, resume lookup `already-complete` short-circuit, `--phases=invalid,scan` → exit 1 invalid_config no run dir, `CLAUDE_AUTOPILOT_ENGINE=off` → exit 1 invalid_config, `cliEngine: false` → exit 1 invalid_config.
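
The `--phases=invalid,scan` case above can be illustrated with a small validator. This is a sketch of the idea only: `parsePhasesCsv` and its result type are hypothetical and are not the signature of the package's `validatePhaseNames`; the phase list is the v6.2.0 set.

```typescript
// Illustrative sketch; function name and result shape are hypothetical.
const PHASE_NAMES = ["scan", "spec", "plan", "implement"] as const;
type PhaseName = (typeof PHASE_NAMES)[number];

type PhasesResult =
  | { ok: true; phases: PhaseName[] }
  | { ok: false; errorCode: "invalid_config"; unknown: string[] };

function isPhaseName(name: string): name is PhaseName {
  return (PHASE_NAMES as readonly string[]).indexOf(name) !== -1;
}

function parsePhasesCsv(csv: string): PhasesResult {
  const names = csv
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  const unknown = names.filter((n) => !isPhaseName(n));
  if (names.length === 0 || unknown.length > 0) {
    // Rejected before any run dir is created; the CLI maps this to exit 1.
    return { ok: false, errorCode: "invalid_config", unknown };
  }
  return { ok: true, phases: names.filter(isPhaseName) };
}
```

Validating the whole list up front is what allows the orchestrator to fail with `invalid_config` without leaving a partially created run behind.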
|
|
85
|
+
|
|
86
|
+
**Out of scope (deliberate, see spec for full list).**
|
|
87
|
+
- `migrate`, `pr` — gated on per-phase idempotency contracts (preflight readback + externalRef recorded BEFORE side-effect). v6.2.1.
|
|
88
|
+
- `--mode=fix`, `--mode=review` — v6.2.1+.
|
|
89
|
+
- `--json` envelope — v6.2.2.
|
|
90
|
+
- Parallel phase execution. Sequential by design.
|
|
91
|
+
- Interactive prompts inside the orchestrator. CI/scripts get deterministic exit codes; pause budget decisions hard-fail.
|
|
92
|
+
|
|
93
|
+
**Spec.** docs/specs/v6.2-multi-phase-orchestrator.md (Codex-reviewed: 1 CRITICAL + 3 WARNING + 3 NOTE folded back into the spec before implementation).
|
|
94
|
+
|
|
95
|
+
## 6.1.0 — Default flip: engine on by default + `--no-engine` deprecated (2026-05-07)
|
|
96
|
+
|
|
97
|
+
**Headline.** The Run State Engine is now ON by default. Bare
|
|
98
|
+
`claude-autopilot <verb>` invocations create a `.guardrail-cache/runs/<ulid>/`
|
|
99
|
+
directory, emit typed NDJSON events on stderr, apply budget gates if
|
|
100
|
+
`budgets:` is configured, and write a state snapshot — without any opt-in
|
|
101
|
+
config. v6.0 shipped the engine OFF behind an explicit `engine.enabled: true`
|
|
102
|
+
opt-in to give users control during a stabilization window; v6.1 closes
|
|
103
|
+
that window.
|
|
104
|
+
|
|
105
|
+
**Motivation — v6.0 stabilization criteria met.**
|
|
106
|
+
- 10 of 10 pipeline phases wrapped through `runPhaseWithLifecycle`
|
|
107
|
+
(`scan` v6.0.1, `costs`/`fix` v6.0.2, `brainstorm`/`spec` v6.0.3,
|
|
108
|
+
`plan`/`review` v6.0.4, `validate` v6.0.5, `implement` v6.0.7,
|
|
109
|
+
`migrate` v6.0.8 — first side-effecting wrap with `migration-version`
|
|
110
|
+
externalRefs, `pr` v6.0.9 — second side-effecting wrap with `github-pr`
|
|
111
|
+
externalRefs).
|
|
112
|
+
- Lifecycle helper extracted (v6.0.6) so all 10 wraps share the same
|
|
113
|
+
byte-for-byte engine-on / engine-off behavior.
|
|
114
|
+
- Side-effecting wraps proven (`migrate` + `pr`) — externalRef ledger
|
|
115
|
+
+ provider readback semantics exercised end-to-end.
|
|
116
|
+
- Live adapter cert suite green (Vercel + Fly + Render).
|
|
117
|
+
- `runs watch <id>` live cost/budget meter shipped (this release's
|
|
118
|
+
`v6.1.0-pre` entry below) — the YC-demo moment for the events stream.
|
|
119
|
+
- `npm test` baseline: 1469 → 1492 (+23 net new this release; all green).
|
|
120
|
+
|
|
121
|
+
**Deprecation.** `--no-engine`, `CLAUDE_AUTOPILOT_ENGINE=off|false|0|no`,
|
|
122
|
+
and `engine.enabled: false` continue to work as the legacy escape hatch
|
|
123
|
+
in v6.1.x. Each invocation that resolves to engine-off via one of those
|
|
124
|
+
explicit opt-outs now prints a single-line stderr deprecation notice:
|
|
125
|
+
|
|
126
|
+
```
|
|
127
|
+
[deprecation] --no-engine / engine.enabled: false will be removed in v7. Migrate to engine-on (default).
|
|
128
|
+
```
|
|
129
|
+
|
|
130
|
+
The notice fires only on user-driven opt-outs (`source: 'cli' | 'env' |
|
|
131
|
+
'config'`); the new (engine-on) default never trips it. **v7 removes
|
|
132
|
+
the escape hatch** — `engine.enabled: false` becomes a config validation
|
|
133
|
+
error and `--no-engine` / `CLAUDE_AUTOPILOT_ENGINE=off` are silently
|
|
134
|
+
ignored.
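
The rule that the notice fires only on explicit opt-outs can be sketched as a resolver plus a predicate. The constant names, the message text, the env values, and the `source` values come from this entry; the precedence order used below (CLI flag, then env var, then config, then built-in default) and the `resolveEngine` input shape are assumptions for illustration.

```typescript
// Illustrative sketch; precedence order and input shape are assumptions.
type EngineSource = "cli" | "env" | "config" | "default";

interface EngineResolution {
  enabled: boolean;
  source: EngineSource;
}

const ENGINE_DEFAULT_V6_1 = true;
const ENGINE_OFF_DEPRECATION_MESSAGE =
  "[deprecation] --no-engine / engine.enabled: false will be removed in v7. Migrate to engine-on (default).";
const ENV_OFF_VALUES = ["off", "false", "0", "no"];

function resolveEngine(input: {
  cliNoEngine?: boolean;
  envValue?: string;
  configEnabled?: boolean;
}): EngineResolution {
  if (input.cliNoEngine) return { enabled: false, source: "cli" };
  if (input.envValue !== undefined && ENV_OFF_VALUES.indexOf(input.envValue) !== -1) {
    return { enabled: false, source: "env" };
  }
  if (input.configEnabled !== undefined) {
    return { enabled: input.configEnabled, source: "config" };
  }
  return { enabled: ENGINE_DEFAULT_V6_1, source: "default" };
}

// Warn only when the user explicitly opted out; the default never trips it.
function shouldWarnEngineOffDeprecation(r: EngineResolution): boolean {
  return !r.enabled && r.source !== "default";
}
```

A caller would print `ENGINE_OFF_DEPRECATION_MESSAGE` to stderr once per invocation whenever the predicate returns `true`.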
|
|
135
|
+
|
|
136
|
+
**Spec.** [`docs/specs/v6.1-default-flip.md`](docs/specs/v6.1-default-flip.md)
|
|
137
|
+
is the canonical reference for what flipped, why, and the v7 follow-up.
|
|
138
|
+
|
|
139
|
+
**Migration tips.**
|
|
140
|
+
- If your CI parses stderr as free-form text and relies on the v5.x
|
|
141
|
+
shape, set `CLAUDE_AUTOPILOT_ENGINE=off` (or pass `--no-engine`)
|
|
142
|
+
to pin the legacy behavior. You'll see the deprecation notice on
|
|
143
|
+
every invocation until you remove it — that's expected.
|
|
144
|
+
- If you opt out via config (`engine.enabled: false`), the same notice
|
|
145
|
+
fires on every invocation. Plan to remove that line before bumping
|
|
146
|
+
to v7.
|
|
147
|
+
- Existing users on `engine.enabled: true` need no changes; your config
|
|
148
|
+
still wins via the same precedence rules.
|
|
149
|
+
- See [`docs/v6/migration-guide.md#migrating-from-v60-to-v61`](docs/v6/migration-guide.md)
|
|
150
|
+
for the full upgrade walkthrough.
|
|
151
|
+
|
|
152
|
+
**Test surface.**
|
|
153
|
+
- `tests/run-state/resolve-engine.test.ts` — flipped 4 default-related
|
|
154
|
+
cases. New `v6.1 default-flip` describe block + `v6.1 deprecation
|
|
155
|
+
warning` describe block covering the predicate, the emitter, the
|
|
156
|
+
default `process.stderr` branch, and the `builtInDefault` override
|
|
157
|
+
path.
|
|
158
|
+
- `tests/run-state/run-phase-with-lifecycle.test.ts` — added 4 new
|
|
159
|
+
cases pinning engine-on as the new default + the deprecation banner
|
|
160
|
+
firing on opt-out / staying silent on the new default.
|
|
161
|
+
- 9 engine-smoke tests (`brainstorm`, `costs`, `implement`, `migrate`,
|
|
162
|
+
`plan`, `pr`, `review`, `spec`, `validate`) updated — the
|
|
163
|
+
"engine off (default)" cases are now "engine on (v6.1 default)";
|
|
164
|
+
the matching `cliEngine: false` cases stay as legacy-escape-hatch
|
|
165
|
+
coverage.
|
|
166
|
+
|
|
167
|
+
**Files changed.**
|
|
168
|
+
- `src/core/run-state/resolve-engine.ts` — new active default constant
|
|
169
|
+
`ENGINE_DEFAULT_V6_1 = true`. The deprecated `ENGINE_DEFAULT_V6_0`
|
|
170
|
+
export keeps its historical value (`false`) so out-of-tree consumers
|
|
171
|
+
who pinned that symbol get what the name promises; both constants are
|
|
172
|
+
removed in v7. New `emitEngineOffDeprecationWarning` helper +
|
|
173
|
+
`shouldWarnEngineOffDeprecation` predicate +
|
|
174
|
+
`ENGINE_OFF_DEPRECATION_MESSAGE` stable copy.
|
|
175
|
+
- `src/core/run-state/run-phase-with-lifecycle.ts` — wires the
|
|
176
|
+
deprecation helper into the engine-off branch.
|
|
177
|
+
- `docs/v6/migration-guide.md` — new "Migrating from v6.0 to v6.1"
|
|
178
|
+
section, updated precedence matrix, refreshed default-flip plan,
|
|
179
|
+
relabeled "What changes" table.
|
|
180
|
+
- `README.md` — v6 section updated (engine on by default + v7 removal
|
|
181
|
+
timeline).
|
|
182
|
+
- `package.json` — version `5.5.2` → `6.1.0`.
|
|
183
|
+
|
|
184
|
+
## v6.1.0-pre — `runs watch <id>` live cost meter (2026-05-07)
|
|
185
|
+
|
|
186
|
+
**The YC-demo moment.** v6.0.x hardened the events.ndjson stream across
|
|
187
|
+
all 10 wrapped phases; v6.1 makes that stream visible in real time.
|
|
188
|
+
`runs watch <runId>` tails events.ndjson via `fs.watchFile` (1s poll —
|
|
189
|
+
inotify/FSEvents are unreliable for tiny appends across our matrix) and
|
|
190
|
+
pretty-renders each event with a running cost/budget meter so a user
|
|
191
|
+
running `claude-autopilot autopilot ...` in one terminal can `runs watch`
|
|
192
|
+
in another and watch their $25 budget tick down while phases ship code.
|
|
193
|
+
|
|
194
|
+
**Demo transcript.** Live tail of a fixture run, ANSI-stripped:
|
|
195
|
+
|
|
196
|
+
```
|
|
197
|
+
* run 01HZK7P3D8Q9V00000000000AB
|
|
198
|
+
phases: spec -> plan -> implement -> pr
|
|
199
|
+
budget: $0.00 / $25.00 (0%)
|
|
200
|
+
[12:00:01] phase.start spec
|
|
201
|
+
[12:00:42] phase.cost spec +$0.07 (in: 1.2k, out: 3.4k) total: $0.07
|
|
202
|
+
[12:00:45] phase.success spec OK 44.2s
|
|
203
|
+
[12:00:46] phase.start plan
|
|
204
|
+
[12:01:12] phase.cost plan +$0.21 (in: 4.1k, out: 8.2k) total: $0.28
|
|
205
|
+
[12:01:15] phase.success plan OK 29.0s
|
|
206
|
+
[12:08:33] phase.externalRef pr -> github-pr#123
|
|
207
|
+
[12:08:34] run.complete status=success totalCostUSD=$4.20 duration=8m32s
|
|
208
|
+
|
|
209
|
+
done run 01HZK7P3D8Q9V00000000000AB
|
|
210
|
+
status=success totalCostUSD=$4.20 duration=8m33s
|
|
211
|
+
```
|
|
212
|
+
|
|
213
|
+
**Modes.**
|
|
214
|
+
|
|
215
|
+
- `runs watch <id>` — live tail, exits on `run.complete` / Ctrl-C
|
|
216
|
+
- `runs watch <id> --since <seq>` — replay forward from a specific seq
|
|
217
|
+
(resume after disconnect)
|
|
218
|
+
- `runs watch <id> --no-follow` — render snapshot once and exit (CI /
|
|
219
|
+
scripting)
|
|
220
|
+
- `runs watch <id> --json` — emit raw NDJSON to stdout (one event per
|
|
221
|
+
line) for piping to `jq` or external dashboards. ANSI suppressed.
|
|
222
|
+
- `runs watch <id> --no-color` — force ANSI off even on a TTY
|
|
223
|
+
**Pretty rendering.** Color thresholds on the budget bar — green <50%,
yellow 50-90%, red >90%. Per-event coloring: cyan for phase.start, yellow
for phase.cost, green for phase.success, red for phase.failed, magenta
for phase.externalRef + lock.takeover + replay.override, bold-green for
run.complete success, bold-red for run.complete failed/aborted. ANSI
auto-strips when stdout is not a TTY (CI), when `--no-color` or `--json`
is set, or when `NO_COLOR` env var is present.

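The threshold and ANSI rules could be expressed as two small pure functions. This is a sketch under assumptions, not the shipped renderer: `budgetBarColor`, `ansiEnabled`, and the `AnsiInputs` shape are illustrative names, a missing or non-positive budget is treated as green, and "present" is read as "`NO_COLOR` is defined at all".

```typescript
type BarColor = "green" | "yellow" | "red";

// Thresholds from the changelog: green <50%, yellow 50-90%, red >90%.
function budgetBarColor(spentUSD: number, budgetUSD: number): BarColor {
  // Assumption: with no usable budget there is nothing to warn about.
  if (budgetUSD <= 0) return "green";
  // Compare spent*100 against pct*budget so the boundaries need no division.
  if (spentUSD * 100 < 50 * budgetUSD) return "green";
  if (spentUSD * 100 <= 90 * budgetUSD) return "yellow";
  return "red";
}

// Hypothetical bundle of everything the ANSI decision depends on.
interface AnsiInputs {
  isTTY: boolean;
  noColorFlag: boolean;
  jsonFlag: boolean;
  env: Record<string, string | undefined>;
}

// ANSI is on only when no rule from the changelog turns it off.
function ansiEnabled(inputs: AnsiInputs): boolean {
  if (!inputs.isTTY) return false; // piped output, CI
  if (inputs.noColorFlag || inputs.jsonFlag) return false; // --no-color / --json
  if (inputs.env["NO_COLOR"] !== undefined) return false; // env var present
  return true;
}

console.log(budgetBarColor(4.2, 25));
console.log(ansiEnabled({ isTTY: true, noColorFlag: false, jsonFlag: false, env: {} }));
```

Passing the TTY flag and environment in as arguments keeps both functions free of global state, which is what makes them easy to test.
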
**Pure renderer.** `src/cli/runs-watch-renderer.ts` is referentially
transparent — `renderEventLine(event, runningTotal, opts)` is the core
primitive, exported and 100% pure. Tests run as string-equality
assertions in <300ms.

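Purity is what makes string-equality tests cheap. A sketch of the idea, assuming a simplified cost event: `renderCostLineSketch` and its field names are hypothetical and only mimic the `(event, runningTotal, opts)` argument order, not the real `renderEventLine` output format.

```typescript
// Hypothetical, simplified event and options shapes.
interface CostEvent {
  time: string;
  phase: string;
  costUSD: number;
}
interface RenderOpts {
  color: boolean;
}

const YELLOW = "\u001b[33m";
const RESET = "\u001b[0m";

// Pure: the result depends only on the arguments, so the same inputs
// always produce the same string and no I/O is involved.
function renderCostLineSketch(e: CostEvent, runningTotal: number, opts: RenderOpts): string {
  const total = runningTotal + e.costUSD;
  const line = `[${e.time}] phase.cost ${e.phase} +$${e.costUSD.toFixed(2)} total: $${total.toFixed(2)}`;
  return opts.color ? `${YELLOW}${line}${RESET}` : line;
}

const planCost: CostEvent = { time: "12:01:12", phase: "plan", costUSD: 0.21 };
console.log(renderCostLineSketch(planCost, 0.07, { color: false }));
```

A test for a function like this is a single comparison against the expected string, with no terminal, clock, or filesystem to stub.
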
**Engine modules untouched.** This is purely a consumer of the existing
event stream — no changes to `src/core/run-state/**`, no changes to the
10 wrapped phase verbs, no changes to `runPhaseWithLifecycle`.

**Tests.** +43 new tests:
- `tests/cli/runs-watch-renderer.test.ts` — 29 pure-renderer cases
  covering every event-line variant, the three budget-bar color
  thresholds, ANSI on/off symmetry, and the final-summary block
- `tests/cli/runs-watch.test.ts` — 14 verb-level cases covering
  `--no-follow` snapshot, `--since` replay, `--json` mode, run-not-found
  (exit 2), invalid-ULID, live tail picks up appended events,
  budget rendering with/without `BudgetConfig`, plural `budgets` config
  alias, ANSI behavior, and run-complete short-circuit on
  already-terminated runs

**CLI plumbing.** New sub-verb on the `runs` umbrella: `runs watch <id>`.
Help block surfaces `--since`, `--no-follow`, `--json`, `--no-color`
plus a behavior summary + exit-code key. Exit codes: 0 success / clean
exit, 1 invalid input or stream error, 2 not_found.

## v6.0.9 — wrap `pr` through `runPhaseWithLifecycle` (2026-05-06)

**First side-effecting phase wrapped.** v6.0.1 → v6.0.5 wrapped read-only
verbs (`scan`, `costs`, `fix`, `brainstorm`, `spec`, `plan`, `review`,
`validate`); v6.0.6 extracted the lifecycle helper. v6.0.9 wraps `pr` —
the first verb that mutates state on the platform of record (GitHub
issue comments + PR reviews). This proves the helper's `ctx.emitExternalRef`
plumbing for genuinely side-effecting phases without any helper-shape
changes.

**Declarations.** Match the v6 spec table exactly:

- `idempotent: false` — re-running posts a NEW PR review ID each time
  (`postReviewComments` dismisses prior + creates new). PR comment
  posting (`postPrComment`) is marker-deduped on the body but the
  underlying `gh` API call is still mutating.
- `hasSideEffects: true` — posts to GitHub via the `gh` CLI inside the
  inner `runCommand` invocation.
- `externalRefs: github-pr` — recorded BEFORE the inner `runCommand`
  runs so a crash mid-pipeline still leaves a breadcrumb pointing at
  the PR. The engine path's Phase 6 resume logic can `gh pr view <id>`
  to confirm the PR is still open before deciding whether a replay
  is safe.

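The ordering in the last declaration (record the ref, then run the command) can be sketched as follows. The interfaces here are simplified stand-ins, not the package's real `RunPhase` or context types; `RunPhaseSketch`, `PhaseContext`, `PrInput`, and `PrOutput` are hypothetical names, and `runCommand` is injected so the example runs without `gh`.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface ExternalRef {
  kind: string;
  id: string;
  provider?: string;
}
interface PhaseContext {
  emitExternalRef(ref: ExternalRef): void;
}
interface RunPhaseSketch<I, O> {
  name: string;
  idempotent: boolean;
  hasSideEffects: boolean;
  run(input: I, ctx: PhaseContext): Promise<O>;
}
interface PrInput {
  prNumber: number;
  runCommand: () => Promise<number>; // stand-in for the inner pipeline
}
interface PrOutput {
  exitCode: number;
}

const prPhase: RunPhaseSketch<PrInput, PrOutput> = {
  name: "pr",
  idempotent: false,
  hasSideEffects: true,
  async run(input, ctx) {
    // Leave the breadcrumb first so a crash inside runCommand still
    // leaves a pointer to the PR behind.
    ctx.emitExternalRef({ kind: "github-pr", id: String(input.prNumber), provider: "github" });
    // A non-zero pipeline result is returned, not thrown: it is a verb
    // exit code rather than a phase failure.
    const exitCode = await input.runCommand();
    return { exitCode };
  },
};

const recorded: string[] = [];
const sketchCtx: PhaseContext = {
  emitExternalRef: (ref) => {
    recorded.push(`ref:${ref.kind}#${ref.id}`);
  },
};
const pending = prPhase.run(
  {
    prNumber: 42,
    runCommand: async () => {
      recorded.push("runCommand");
      return 1;
    },
  },
  sketchCtx,
);
pending.then((out) => console.log(recorded, out.exitCode));
```
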
**Engine-off byte-for-byte unchanged.** All `gh pr view` + `git fetch` +
`runCommand` behavior preserved. The wrap adds two test seams
(`__testPrMeta` to short-circuit PR metadata lookup, `__testRunCommand`
to stub the inner pipeline) so the smoke test exercises the engine
lifecycle without `gh` or a real review pipeline. Production callers
must not pass these — they're documented "test only" with a comment
mirroring scan / fix's `__testReviewEngine` precedent.

**CLI plumbing.** The `pr` dispatcher arm now threads `cliEngine` from
`parseEngineCliFlag()` and `envEngine` from
`process.env.CLAUDE_AUTOPILOT_ENGINE`, mirroring every other wrapped
verb. The per-verb help block (`claude-autopilot help pr`) gains
`--engine` / `--no-engine` lines plus a side-effects note (engine-on
records a `github-pr` externalRef; future replays gate on the spec's
"side-effect readback" rule). `GLOBAL_FLAGS_BLOCK` adds "v6.0.9: wired
for `pr`" to its breadcrumb list.

**Smoke test.** New `tests/cli/pr-engine-smoke.test.ts`, 6 cases:
- engine off (default): no run dir / no engine artifacts; runCommand
  still invoked
- engine off (`cliEngine: false`): no run dir
- engine on (`--engine`): state.json + events.ndjson + lifecycle in
  order (run.start → phase.start → phase.externalRef → phase.success
  → run.complete); externalRef recorded with kind=`github-pr`,
  id=`42`, provider=`github`; `idempotent: false, hasSideEffects: true`
  reflected on the phase
- env precedence (`CLAUDE_AUTOPILOT_ENGINE=on` without CLI flag)
- CLI override (`--no-engine` beats env on)
- runCommand returning 1 surfaces as verb exit 1 WITHOUT marking the
  engine phase as failed (pipeline result ≠ phase failure, same
  precedent as scan)

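The "lifecycle in order" case amounts to a subsequence check over recorded event names. A sketch of one way to write that check; `occursInOrder` is a hypothetical helper, not something the smoke test is known to use.

```typescript
// Returns true when every name in `expected` appears in `actual` in the
// same relative order. Other events may be interleaved between them.
function occursInOrder(actual: string[], expected: string[]): boolean {
  let next = 0;
  for (const name of actual) {
    if (next < expected.length && name === expected[next]) {
      next += 1;
    }
  }
  return next === expected.length;
}

// Expected engine-on order for `pr`, taken from the smoke-test description.
const prLifecycle = [
  "run.start",
  "phase.start",
  "phase.externalRef",
  "phase.success",
  "run.complete",
];

console.log(occursInOrder(prLifecycle, prLifecycle));
```

A subsequence check keeps the assertion stable if extra events, such as `phase.cost`, are emitted between the lifecycle markers.
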
**Why no follow-up `github-comment` externalRef yet.** A potential
extension is to record one externalRef per posted comment / review
(`github-comment`). That requires plumbing the post-comment URL out
of `runCommand` (currently only logged) — deferred to a follow-up PR.
For v6.0.9 the `github-pr` ref is sufficient for the spec's readback
rule: a Phase 6 resume can verify the PR is still open before
deciding whether to retry.

**Files changed.** `src/cli/pr.ts` (270 insertions / 22 deletions),
`src/cli/index.ts` (+12 lines for engine knob plumbing),
`src/cli/help-text.ts` (+8 lines for the per-verb Options block +
breadcrumb), `tests/cli/pr-engine-smoke.test.ts` (new, 306 lines),
`docs/v6/wrapping-pipeline-phases.md` (status header + table row +
deviation note), `docs/v6/migration-guide.md` ("what works today" list
adds `pr`), `docs/specs/v6-run-state-engine.md` (reconciliation block
appended). Total: ~600 lines added, ~25 lines removed.

**Status after v6.0.9.** Nine of 10 phases wrapped. Remaining:
`implement` (v6.0.7) and `migrate` (v6.0.8) — both side-effecting,
both wrapped concurrently with this PR by parallel agents.

- **Bundled UI polish skills** — ships `/ui`, `/simplify-ui`, `/ui-ux-pro-max`,
  `/make-interfaces-feel-better` so consumers get them via `npm install` instead
  of needing user-level skill installs. `/ui` runs the chained pass (audit →
  simplify → align → polish); the other three are individual lenses.
  Auto-discovered via the existing `skills/` directory in the package `files`
  allowlist. Pairs with the design context loader
  (`src/core/ui/design-context-loader.ts`) — both gate on the same
  `hasFrontendFiles()` predicate so they only fire when frontend files change.

## v6.0.7 — wrap `implement` through `runPhaseWithLifecycle` (2026-05-07)

**Wraps the ninth pipeline phase.** Mechanical wrap following the v6.0.6
helper recipe. Engine-off path is byte-for-byte unchanged (advisory print
pointing at the Claude Code `claude-autopilot` skill); engine-on path
creates a run dir + emits run.start / phase.start / phase.success /
run.complete events. Concurrent dispatch — landed alongside v6.0.8
(`migrate`) and v6.0.9 (`pr`).

- New `src/cli/implement.ts` — `RunPhase<ImplementInput, ImplementOutput>`
  with `idempotent: true, hasSideEffects: false`. **Documented deviation
  from spec table:** the spec at line 159 of
  `docs/specs/v6-run-state-engine.md` lists `implement` with
  `idempotent: partial, hasSideEffects: yes, externalRefs: git-remote-push`.
  That declaration assumes the verb itself writes commits and pushes them
  to a remote. The v6.0.7 CLI verb does **not** write code, run tests,
  commit, or push to a remote — all of that lives in the Claude Code
  `claude-autopilot` skill (and its delegates: `subagent-driven-development`,
  `commit-push-pr`, `using-git-worktrees`). The CLI verb is the engine-wrap
  shell — its only side effect is writing the local
  `.guardrail-cache/implement/<ts>-implement.md` log stub. If a future PR
  inlines the implement loop into the CLI verb, the declarations flip to
  match the spec table and a
  `ctx.emitExternalRef({ kind: 'git-remote-push', id: '<commit-sha>' })`
  call lands after each push.
- CLI dispatcher in `src/cli/index.ts` — wires `--engine` / `--no-engine` /
  `--context` / `--plan` / `--output` / `--config` through the helper
  alongside `process.env.CLAUDE_AUTOPILOT_ENGINE`. Mirrors the validate /
  review / plan dispatcher shape.
- Help text in `src/cli/help-text.ts` — adds `implement` to the Pipeline
  group + per-verb Options block. Bumps `GLOBAL_FLAGS_BLOCK` to cite
  v6.0.7 alongside v6.0.1 → v6.0.5.
- New smoke test `tests/cli/implement-engine-smoke.test.ts` (6 cases) —
  asserts state.json + events.ndjson lifecycle, idempotent /
  hasSideEffects flags, env / CLI precedence, log file location.
- Test count: 1408 → 1414 (+6). `npm test` clean. `npx tsc --noEmit`
  clean except pre-existing fixture errors.

## v6.0.8 — wrap `migrate` through `runPhaseWithLifecycle` (2026-05-06)

**First side-effecting phase under the engine.** v6.0.1 → v6.0.6 wrapped
eight read-only / advisory verbs (`scan`, `costs`, `fix`, `brainstorm`,
`spec`, `plan`, `review`, `validate`). v6.0.8 wraps `migrate` — the
first verb that mutates external state (database schema). Builds on the
`runPhaseWithLifecycle` helper landed in v6.0.6 plus
`ctx.emitExternalRef()` from inside the phase body for the
`migration-version` ledger. No helper-shape changes needed.

**Phase declarations** match the spec table at line 162 of
`docs/specs/v6-run-state-engine.md`:

```
idempotent:     false — dispatcher output varies by ledger state
                (N applied on attempt 1, 0 on attempt 2 even
                though both are operationally safe)
hasSideEffects: true — applies migrations, writes audit log,
                regenerates types, refreshes schema cache
externalRefs:   migration-version, scoped `<env>:<name>` per applied
                migration. Phase 6's resume gate will read these back
                against the live `migration_state` to decide
                skip-already-applied vs retry vs needs-human.
```

**Why `idempotent: false` even though the underlying Delegance migrate
skill is ledger-guarded against double-apply:** at the *engine
semantics* layer, `idempotent: true` means "re-running the phase against
the same input produces equivalent output." A dispatch invocation that
previously applied N migrations on attempt 1 and applies 0 on attempt 2
(everything already in the ledger) DOES produce different output
(different `appliedMigrations` list, different `status`). The spec's
`idempotent: false` is correct.

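The distinction between "safe to re-run" and "same output on re-run" is easy to see with a toy ledger. This is a sketch under assumptions: `dispatchSketch`, `MigrateOutputSketch`, the `"applied"` / `"skipped"` status values, and the migration names are all hypothetical, and an in-memory `Set` stands in for the real ledger.

```typescript
// Hypothetical output shape; the real MigrateOutput is not shown in this changelog.
interface MigrateOutputSketch {
  status: "applied" | "skipped";
  appliedMigrations: string[];
}

// Ledger-guarded dispatch: a migration already in the ledger is never re-applied.
function dispatchSketch(pendingNames: string[], ledger: Set<string>): MigrateOutputSketch {
  const applied: string[] = [];
  for (const name of pendingNames) {
    if (ledger.has(name)) continue; // double-apply guard
    ledger.add(name);
    applied.push(name);
  }
  return {
    status: applied.length > 0 ? "applied" : "skipped",
    appliedMigrations: applied,
  };
}

const ledger = new Set<string>();
const migrations = ["001_init", "002_add_index"];

const attempt1 = dispatchSketch(migrations, ledger);
const attempt2 = dispatchSketch(migrations, ledger);

// One externalRef id per applied migration, scoped `<env>:<name>`.
const refIds = attempt1.appliedMigrations.map((name) => `staging:${name}`);

console.log(attempt1, attempt2, refIds);
```

Both attempts leave the ledger in the same state, so the re-run is operationally safe, yet the two outputs differ. That difference is what the `idempotent: false` declaration describes.
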
**Engine-off path is byte-for-byte identical to v6.0.7.** Same dispatch
shape (`src/core/migrate/dispatcher.ts` unchanged), same render lines,
same `--json` payload callback. CI / scripts that don't pass `--engine`
are unaffected.

| File | Role |
|---|---|
| `src/cli/migrate.ts` (new) | Engine-wrap shell calling `runMigrate(opts) → { exitCode, result }`. Defines `MigrateInput` / `MigrateOutput` (JSON-serializable), `RunPhase<MigrateInput, MigrateOutput>` with `name: 'migrate'`, `idempotent: false`, `hasSideEffects: true`. Phase body invokes the dispatcher and emits one `migration-version` externalRef per applied migration via `ctx.emitExternalRef({ kind: 'migration-version', id: '<env>:<name>' })`. Test seam: `__testDispatch` injects a fake dispatcher so smoke tests can exercise the engine-wrap path without spawning a child process or hitting a real database |
| `src/cli/index.ts` | dispatcher case for `migrate` routes through `runMigrate` instead of inlining `runMigrateDispatch`; threads `cliEngine` + `envEngine`. Engine-off byte-for-byte unchanged — same `--json` payload callback, same render |
| `src/cli/help-text.ts` | per-verb Options block for `migrate` documents `--engine` / `--no-engine` + `--config`; GLOBAL_FLAGS_BLOCK breadcrumb cites v6.0.8 |
| `tests/cli/migrate-engine-smoke.test.ts` (new) | 6 cases: engine off (default — no run dir), engine on (lifecycle events, state.json shape, idempotent: false + hasSideEffects: true declaration), externalRef emission per applied migration scoped by env, skipped status (zero externalRefs), dispatcher error → exit 1 + engine still records phase.success (domain failure ≠ engine failure), CLI `--no-engine` beats env on |
| `docs/v6/wrapping-pipeline-phases.md` | phase-status table flips `migrate` to "WRAPPED in v6.0.8"; status line at top moves to "NINE phases wrapped"; new deviation note documents the ledger-vs-engine-semantics rationale |
| `docs/v6/migration-guide.md` | "What works today" updated — three knobs now honored by `scan`, `costs`, `fix`, `brainstorm`, `spec`, `plan`, `review`, `validate`, `migrate` |
| `docs/specs/v6-run-state-engine.md` | new "What was actually built (v6.0.8)" reconciliation block |

**Test delta:** 1408 → 1414 (+6). Typecheck clean. All 1408 existing
tests pass unchanged — the engine-off path for `migrate` is
byte-for-byte identical to v6.0.7 (same dispatch shape, same render).

**Concurrency note.** v6.0.7 (`implement`) and v6.0.9 (`pr`) are in
flight on parallel worktrees, both targeting shared docs (CHANGELOG,
recipe table, migration-guide) and `src/cli/{index,help-text}.ts`. The
rebase contract: on push rejection, fetch + rebase + resolve conflicts
keeping all wraps' contributions, re-test, push with `--force-with-lease`.

**Not done in v6.0.8 — explicit non-goals:**
- Wrapping `implement` and `pr`. Continues across v6.0.7 / v6.0.9
  using the same helper plus `ctx.emitExternalRef()` for
  `git-remote-push` (implement) and `github-pr` (pr).
- Wiring Phase 6's `migration_state` read-back. The engine PERSISTS
  `migration-version` externalRefs in v6.0.8; consulting them on
  resume ships in Phase 6+. Until then, retries on side-effecting
  phases require `--force-replay`.
- Multi-phase pipeline orchestrator (autopilot's full
  `brainstorm → spec → plan → ... → migrate → ...` flow under one runId).
- Flipping the v6.0 built-in default to ON. v6.1 territory.

## v6.0.6 — `runPhaseWithLifecycle` helper (2026-05-06)

**Tech-debt refactor, no behavior change.** v6.0.1 → v6.0.5 wrapped eight
CLI verbs (`scan`, `costs`, `fix`, `brainstorm`, `spec`, `plan`, `review`,
`validate`) by hand-rolling the same ~100-line lifecycle pattern in each
file: `createRun → optional run.warning → runPhase → run.complete →
state.json refresh → best-effort lock release in finally`. Bugbot caught
the duplication on PR #97 (LOW severity, deferred) with the explicit
note: "extracting from 5 of 10 examples risks getting the abstraction
wrong; from 10 of 10 the pattern is fully evidenced." At 8 of 10, the
pattern is sufficiently evidenced that the remaining three side-effecting
phases (`implement`, `migrate`, `pr`) can use the same helper plus
`ctx.emitExternalRef()` from inside their phase body — no helper-shape
changes needed.

**The helper.** New `src/core/run-state/run-phase-with-lifecycle.ts` sits
on top of the existing `runPhase()` API (which is unchanged). Callers
continue to define their own `RunPhase<I, O>` with per-phase
`idempotent` / `hasSideEffects` / `run`, and pass it in alongside the
input, the loaded config, the engine knobs, and a `runEngineOff`
escape-hatch callback. The helper:

- Resolves engine on/off via the canonical CLI > env > config > default
  precedence
- On engine-off: invokes `runEngineOff()` and returns its result with
  `runId/runDir: null`
- On engine-on: creates a run dir, optionally emits `run.warning` for
  invalid env, runs the phase, emits `run.complete` (success or failed),
  refreshes `state.json` from replayed events, releases the lock in
  `finally` (idempotent), and returns `{ output, runId, runDir }`
- On phase failure: emits `run.complete` with `status: 'failed'`, prints
  the legacy `[<phase>] engine: phase failed — <msg>` banner to stderr
  byte-for-byte, releases the lock, and re-throws

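The control flow in the bullets above can be sketched in a few lines. This is not the helper's real signature: `resolveEngine`, `runWithLifecycleSketch`, the placeholder run id, and the `lock.released` marker are hypothetical, and events are collected into a plain array instead of a run dir. The default of `false` follows the changelog's note that flipping the built-in default to ON is deferred to v6.1.

```typescript
type Knob = boolean | undefined;

// Precedence: CLI > env > config > built-in default (off in v6.0).
function resolveEngine(cli: Knob, env: Knob, config: Knob): boolean {
  if (cli !== undefined) return cli;
  if (env !== undefined) return env;
  if (config !== undefined) return config;
  return false;
}

async function runWithLifecycleSketch<O>(
  engineOn: boolean,
  events: string[],
  phaseBody: () => Promise<O>,
  runEngineOff: () => Promise<O>,
): Promise<{ output: O; runId: string | null }> {
  // Engine off: run the legacy path untouched and report no run id.
  if (!engineOn) {
    return { output: await runEngineOff(), runId: null };
  }
  events.push("run.start", "phase.start");
  try {
    const output = await phaseBody();
    events.push("phase.success", "run.complete:success");
    return { output, runId: "run-id-placeholder" };
  } catch (err) {
    events.push("phase.failed", "run.complete:failed");
    throw err; // re-throw so the caller still sees the original error
  } finally {
    events.push("lock.released"); // runs on both the success and failure paths
  }
}

const successEvents: string[] = [];
const successRun = runWithLifecycleSketch(true, successEvents, async () => 7, async () => 0);
successRun.then((result) => console.log(result, successEvents));
```

Keeping the lock release in `finally` is what lets the failure path re-throw without leaking the lock.
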
**Migrated phases.** All eight wrapped verbs reduced. Each `runX(opts)`
function shrinks: keep the per-phase `RunPhase<I, O>` definition + the
engine-off path body; delete the lifecycle boilerplate; call
`runPhaseWithLifecycle` once. Total reduction across `src/cli/`:

- `scan.ts` 498 → 429 lines (-69)
- `costs.ts` 297 → 231 lines (-66)
- `fix.ts` 473 → 415 lines (-58)
- `brainstorm.ts` 251 → 189 lines (-62)
- `spec.ts` 216 → 159 lines (-57)
- `plan.ts` 269 → 199 lines (-70)
- `review.ts` 256 → 189 lines (-67)
- `validate.ts` 262 → 196 lines (-66)
- **Total: 2522 → 2007 lines (515 lines saved)**

**Engine-off path is byte-for-byte unchanged.** All eight existing
`tests/cli/<verb>-engine-smoke.test.ts` smokes pass without modification
(44 cases). The helper accepts a `runEngineOff` callback so the legacy
code path stays intact even when the phase body's call shape would
otherwise pin it.

### Test count

After v6.0.5 baseline: 1396 → 1408 (+12). +12 cases for the new
`tests/run-state/run-phase-with-lifecycle.test.ts` covering: engine-off
(default + CLI > env > config precedence); engine-on success (lifecycle
events, state.json shape, env / config resolution, costUSD pass-through,
costUSD-absent fallback to 0); engine-on failure (run.complete failed,
state.json refresh, error re-thrown with original message preserved,
lock released through finally); invalid env value falling through to
config-resolved engine-on with `run.warning`. Existing 44 phase smokes
unchanged. Typecheck clean. Bugbot LOW from PR #97 addressed.

### Deliberately deferred

- Wrapping the remaining pipeline phases (`implement`, `migrate`, `pr`).
  Side-effecting phases need careful externalRef plumbing — they will
  build against `runPhaseWithLifecycle` plus `ctx.emitExternalRef()`
  from inside their phase body. Helper signature does not need to grow
  for them; documented in the helper's header comment.
- Multi-phase pipeline orchestrator (autopilot's full
  `brainstorm → spec → plan → ...` flow under one runId). The
  single-phase shape stays — multi-phase wrapping is a separate v6.x lift.
- Flipping the v6.0 built-in default to ON. v6.1 territory.

## v6.0.5 — Engine wire-up Part E (2026-05-06)

**The headline.** v6.0.4 wrapped `plan` and `review`. v6.0.5 continues the
mechanical wrap pattern from the recipe at
[`docs/v6/wrapping-pipeline-phases.md`](docs/v6/wrapping-pipeline-phases.md)
with one more single-shot, read-only verb:

- **`validate`** — new CLI verb. Engine-wrap shell for the validate
  pipeline phase. Writes a validate log stub under
  `.guardrail-cache/validate/`; the actual validation work (static
  checks, auto-fix, tests, Codex review with auto-fix, bugbot triage) is
  owned by the Claude Code `/validate` skill. Declared
  `idempotent: true, hasSideEffects: false` (local file write only; no
  provider calls, no git push, no PR comment, no SARIF upload).

**Documented deviation from the spec table.** The v6 spec
([docs/specs/v6-run-state-engine.md](docs/specs/v6-run-state-engine.md),
line 161) lists `validate` with externalRefs `sarif-artifact`. The
v6.0.5 wrap matches the `idempotent: true, hasSideEffects: false`
declaration but does **not** plumb a `sarif-artifact` externalRef — the
v6.0.5 `validate` CLI verb does not emit a SARIF artifact. SARIF
emission lives in `claude-autopilot run --format sarif --output <path>`
(a separate verb). The SARIF reference is local-only file output (no
remote upload), so the engine doesn't need a readback rule for it on
resume — `idempotent: true` covers replay safety. If a future PR adds
SARIF emission directly to this verb, the wrap can add a
`ctx.emitExternalRef({ kind: 'sarif-artifact', ... })` call after the
file write lands. Documented inline in `src/cli/validate.ts` and in the
wrapping recipe's deviation note.

The engine-off code path is byte-for-byte unchanged; the `validate`
verb is brand new in v6.0.5 (validation previously lived only as a
Claude Code skill).

### Test count

After v6.0.4 baseline: 1390 → 1396 (+6). +6 cases for
`validate-engine-smoke.test.ts`, mirroring the
`review-engine-smoke.test.ts` shape: engine off → no run dir + log
written; engine off (cliEngine: false); engine on → state.json +
events.ndjson with the right lifecycle (`run.start` →
`phase.start` → `phase.success` → `run.complete`); engine on with
explicit `--context`; env-resolved; CLI override beats env. Typecheck
clean.

### Deliberately deferred

- Wrapping the remaining pipeline phases (`implement`, `migrate`,
  `pr`). Side-effecting phases need careful externalRef plumbing per
  the recipe's "side effects" gate; wrap them last.
- Adding SARIF emission directly to the `validate` verb. Lives in
  `claude-autopilot run --format sarif` (separate verb).
- Extracting a shared `runPhaseWithLifecycle` helper across the eight
  wrapped verbs. Separate refactor PR — out of scope for v6.0.5.
- Flipping the v6.0 built-in default to ON. v6.1 territory.

## v6.0.4 — Engine wire-up Part D (2026-05-06)

**The headline.** v6.0.3 wrapped `brainstorm` and `spec`. v6.0.4 continues
the mechanical wrap pattern from the recipe at
[`docs/v6/wrapping-pipeline-phases.md`](docs/v6/wrapping-pipeline-phases.md)
with two more single-shot verbs:

- **`plan`** ([#98](https://github.com/axledbetter/claude-autopilot/pull/98)) —
  new CLI verb. Engine-wrap shell for the plan pipeline phase. Writes a
  plan markdown stub under `.guardrail-cache/plans/`; the actual
  LLM-driven planning content is owned by the Claude Code
  `superpowers:writing-plans` skill. Declared
  `idempotent: true, hasSideEffects: false` (local file write only; no
  provider calls, no git push, no PR comment).
- **`review`** ([#98](https://github.com/axledbetter/claude-autopilot/pull/98)) —
  new CLI verb. Engine-wrap shell for the review pipeline phase. Writes
  a review log stub under `.guardrail-cache/reviews/`; the actual
  LLM-driven review content is owned by the Claude Code review skills
  (`/review`, `/review-2pass`, `pr-review-toolkit:review-pr`). Declared
  `idempotent: true, hasSideEffects: false`.

**Documented deviation from the spec table.** The v6 spec
([docs/specs/v6-run-state-engine.md](docs/specs/v6-run-state-engine.md))
lists `review` with externalRefs `review-comments`, implying PR-side
comment posting (which would force `hasSideEffects: true`). The v6.0.4
`review` verb does **not** post anywhere — PR-side comment posting
lives in `claude-autopilot pr --inline-comments` /
`--post-comments` (a separate verb). If a future PR adds platform-side
comment posting to this verb, both declarations will need to flip and
the readback rules will need to plumb a `review-comments` externalRef.
Documented inline in `src/cli/review.ts`.

**Backward-compat — `review` grouping prefix preserved.**
`claude-autopilot review` (no args) still prints the alpha.2 prefix
help banner per the V16 v4-compat test. Flat-verb invocation requires
at least one flag, e.g. `claude-autopilot review --engine`.
`claude-autopilot help review` continues to surface the flat-verb
Options block via `buildCommandHelpText`.

Engine-off code paths are unchanged for both verbs.

### Test count

After v6.0.3 baseline: 1378 → 1390 (+12). +6 cases for
`plan-engine-smoke.test.ts`, +6 cases for `review-engine-smoke.test.ts`.
Both mirror `costs-engine-smoke.test.ts`: engine off → no run dir;
engine on → state.json + events.ndjson with the right lifecycle
(`run.start` → `phase.start` → `phase.success` → `run.complete`);
env-resolved; CLI override beats env. Typecheck clean.

### Deliberately deferred

- Wrapping the remaining pipeline phases (`implement`, `migrate`,
  `validate`, `pr`). Side-effecting phases (`implement`, `migrate`,
  `pr`) need careful externalRef plumbing per the recipe's "side
  effects" gate; wrap them last.
- Flipping the v6.0 built-in default to ON. v6.1 territory.

## v6.0.3 — Wrap brainstorm + spec through runPhase (2026-05-05)

**The headline.** v6.0.3 continues the mechanical phase-wrap pattern from
the recipe at
[`docs/v6/wrapping-pipeline-phases.md`](docs/v6/wrapping-pipeline-phases.md)
with two more pipeline verbs:

- **`brainstorm`** — the pipeline entry point. Implemented primarily as
  a Claude Code skill (`/brainstorm` → `superpowers:brainstorming`); the
  CLI verb is an advisory shim pointing the user there. The wrap declares
  `idempotent: true, hasSideEffects: false`. Engine-off path is
  byte-for-byte identical to v6.0.2 (the same advisory banner). Engine-on
  path creates a run dir + emits `run.start` / `phase.start` /
  `phase.success` / `run.complete`. `--json` envelope shape is preserved
  for back-compat with the WS7 welcome regression guard and
  `json-channel-discipline.test.ts`.
- **`spec`** — same shape as brainstorm. New top-level subcommand (it
  was previously absent from `SUBCOMMANDS`); the CLI verb is an advisory
  shim pointing at the autopilot/brainstorm Claude Code flow. Same wrap
  flags + same engine lifecycle.

**Documented deviation from the spec table.** The
[v6 spec table](docs/specs/v6-run-state-engine.md) declares both
`brainstorm` and `spec` `idempotent: no` because the LLM dialogue
produces new content each invocation. v6.0.3 declares `idempotent: true`
because the CLI verbs themselves are static advisory prints with no LLM
call and no externalRefs to reconcile — the engine's idempotency check
is "safe to retry without reconciliation," not "produces byte-identical
output." Justified inline at the top of `src/cli/brainstorm.ts` and
`src/cli/spec.ts` plus a deviation block in the recipe. Once the CLI
verbs grow real LLM bodies (a future v6.x lift), the declaration may
flip and a `spec-file` externalRef will land on every successful run.

Engine-off code paths are unchanged for both verbs; existing tests pass
without modification.

### Test count

1367 → 1378 (+11). +5 cases for `brainstorm-engine-smoke.test.ts`, +5
cases for `spec-engine-smoke.test.ts`, +1 case for `spec` joining
`MIGRATED_VERBS` in `json-channel-discipline.test.ts`. Both new smoke
files mirror `costs-engine-smoke.test.ts`: engine off → no run dir;
engine on → state.json + events.ndjson with the right lifecycle
(`run.start` → `phase.start` → `phase.success` → `run.complete`);
env-resolved; CLI override beats env. Typecheck clean.

### Deliberately deferred

- Wrapping the six remaining pipeline phases (`plan`, `implement`,
  `migrate`, `validate`, `pr`, `review`). One or two per release across
  v6.0.4+. A parallel agent works `plan` + `review` for v6.0.4.
- Promoting `brainstorm`/`spec` from advisory shims to full LLM-bearing
  CLI verbs. The Claude Code skill remains the user-facing entry point;
  the CLI wraps exist so the engine has a place to record run-state for
  future multi-phase orchestration.

|
|
699
|
+
## v6.0.2 — Engine wire-up Part B (2026-05-06)
|
|
700
|
+
|
|
701
|
+
**The headline.** v6.0.1 wrapped the first pipeline phase (`scan`) through
|
|
702
|
+
`runPhase`. v6.0.2 continues the mechanical wrap pattern from the recipe at
|
|
703
|
+
[`docs/v6/wrapping-pipeline-phases.md`](docs/v6/wrapping-pipeline-phases.md)
|
|
704
|
+
with two more single-shot verbs:
|
|
705
|
+
|
|
706
|
+
- **`costs`** ([#96](https://github.com/axledbetter/claude-autopilot/pull/96)) —
|
|
707
|
+
pure read-only summary of the local cost ledger. The cleanest possible
|
|
708
|
+
wrap: `idempotent: true, hasSideEffects: false`, no provider, no LLM,
|
|
709
|
+
no file writes. CLI dispatcher passes `cliEngine` + `envEngine` through;
|
|
710
|
+
`--config` flag also wired since the engine resolver consults config.
|
|
711
|
+
- **`fix`** ([#96](https://github.com/axledbetter/claude-autopilot/pull/96)) —
|
|
712
|
+
applies LLM-generated patches to local files. Declared
|
|
713
|
+
`idempotent: true` (same finding + same file content → same patch) and
|
|
714
|
+
`hasSideEffects: false` (no remote / git push / PR creation in the
|
|
715
|
+
existing flow — purely local file edits, which the recipe defines as
|
|
716
|
+
platform-side-effect-free). If/when fix grows a `--push` mode it will
|
|
717
|
+
flip to `hasSideEffects: true` with a `git-remote-push` externalRef.
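
As a rough illustration of the wrap pattern for a read-only verb such as `costs`, the sketch below declares a phase with `idempotent: true, hasSideEffects: false` and runs it through a toy orchestrator that records the lifecycle events. Everything here is an assumption for illustration: `runWithEngineSketch`, the `LedgerEntry` shape, and the synchronous `run` signature are hypothetical and are not the package's actual `runPhase` API.

```typescript
// Hypothetical, simplified shapes. The real engine types are not shown in this changelog.
type LifecycleEvent =
  | "run.start"
  | "phase.start"
  | "phase.success"
  | "phase.failed"
  | "run.complete";

interface RunPhaseSketch<I, O> {
  name: string;
  idempotent: boolean;
  hasSideEffects: boolean;
  run(input: I): O; // assumption: synchronous, to keep the sketch small
}

// Toy stand-in for the orchestrator: records lifecycle events in memory.
function runWithEngineSketch<I, O>(
  phase: RunPhaseSketch<I, O>,
  input: I,
  events: LifecycleEvent[],
): O {
  events.push("run.start", "phase.start");
  try {
    const output = phase.run(input);
    events.push("phase.success", "run.complete");
    return output;
  } catch (err) {
    events.push("phase.failed");
    throw err;
  }
}

interface LedgerEntry {
  costUSD: number;
}

// A read-only summary phase: no provider, no LLM, no file writes.
const costsPhaseSketch: RunPhaseSketch<LedgerEntry[], { totalUSD: number }> = {
  name: "costs",
  idempotent: true,
  hasSideEffects: false,
  run: (entries) => ({
    totalUSD: entries.reduce((sum, e) => sum + e.costUSD, 0),
  }),
};

const ledger: LedgerEntry[] = [{ costUSD: 1.5 }, { costUSD: 0.25 }];
const recordedEvents: LifecycleEvent[] = [];

// Engine on: same phase body, plus recorded lifecycle events.
const engineOnResult = runWithEngineSketch(costsPhaseSketch, ledger, recordedEvents);
// Engine off: the phase body is called directly and nothing is recorded.
const engineOffResult = costsPhaseSketch.run(ledger);
```

The point of the shape is that the engine-off path calls the same phase body directly, which is how the existing behavior can stay unchanged.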
|
|
718
|
+
|
|
719
|
+
**Documented deviation from the recipe.** Both wraps follow the recipe
|
|
720
|
+
mechanically. `fix` adds one explicit deviation: its phase body emits
|
|
721
|
+
per-finding console output and reads a [y/n/q] confirmation via
|
|
722
|
+
`readline`. Pure side-effect-free phase bodies are the recipe default,
|
|
723
|
+
but interactive verbs are an explicit exception (same precedent as
|
|
724
|
+
`scan` keeping its LLM call inside `executeScanPhase`). The summary line
|
|
725
|
+
and exit-code logic still lives in `renderFixOutput` so the engine path's
|
|
726
|
+
idempotency isn't coupled to the final stdout shape. See the new "Note
|
|
727
|
+
on interactive verbs" section at the bottom of the wrapping recipe.
|
|
728
|
+
|
|
729
|
+
Engine-off code paths are byte-for-byte unchanged for both verbs;
|
|
730
|
+
existing tests pass without modification.
|
|
731
|
+
|
|
732
|
+
### Test count
|
|
733
|
+
|
|
734
|
+
1356 → 1367 (+11). +6 cases for `costs-engine-smoke.test.ts`, +5 cases
|
|
735
|
+
for `fix-engine-smoke.test.ts`. Both mirror `scan-engine-smoke.test.ts`:
|
|
736
|
+
engine off → no run dir; engine on → state.json + events.ndjson with
|
|
737
|
+
the right lifecycle (`run.start` → `phase.start` → `phase.success` →
|
|
738
|
+
`run.complete`); env-resolved; CLI override beats env. Typecheck clean.
|
|
739
|
+
|
|
740
|
+
### Deliberately deferred
|
|
741
|
+
|
|
742
|
+
- Wrapping the seven remaining pipeline phases (`brainstorm`, `plan`,
|
|
743
|
+
`implement`, `migrate`, `validate`, `pr`, `review`). One or two per
|
|
744
|
+
release across v6.0.3+.
|
|
745
|
+
- Flipping the v6.0 built-in default to ON. v6.1 territory.
|
|
746
|
+
|
|
747
|
+
## v6.0.1 — Engine wire-up Part A (2026-05-05)
|
|
748
|
+
|
|
749
|
+
**The headline.** v6.0 shipped the engine modules but left the user-facing
|
|
750
|
+
knobs un-wired. This release lights up the three knobs (`--engine` /
|
|
751
|
+
`--no-engine` CLI flag, `CLAUDE_AUTOPILOT_ENGINE` env var,
|
|
752
|
+
`engine.enabled` config key) with explicit precedence (CLI > env > config >
|
|
753
|
+
built-in default) and wraps the **first** pipeline phase — `scan` —
|
|
754
|
+
through `runPhase`. Every other pipeline phase still bypasses the engine;
|
|
755
|
+
those land one or two per PR across subsequent v6.0.x releases following
|
|
756
|
+
the recipe at [`docs/v6/wrapping-pipeline-phases.md`](docs/v6/wrapping-pipeline-phases.md).
|
|
757
|
+
|
|
758
|
+
The engine still ships **OFF** by default in v6.0.x. The default flip to
|
|
759
|
+
**ON** lands in v6.1 per [`docs/specs/v6.1-default-flip.md`](docs/specs/v6.1-default-flip.md).
|
|
760
|
+
|
|
761
|
+
### What landed (PR #95)
|
|
762
|
+
|
|
763
|
+
- **`resolveEngineEnabled()` precedence resolver.** Pure / no-IO function
|
|
764
|
+
in `src/core/run-state/resolve-engine.ts`. Inputs:
|
|
765
|
+
`{cliEngine?, envValue?, configEnabled?, builtInDefault?}`. Outputs:
|
|
766
|
+
`{enabled, source, reason, invalidEnvValue?}`. Accepts case-insensitive
|
|
767
|
+
env values `on/off/true/false/1/0/yes/no` (plus whitespace tolerance);
|
|
768
|
+
invalid values fall through to the next-lowest precedence layer and
|
|
769
|
+
surface the raw string in `invalidEnvValue` so the caller can emit a
|
|
770
|
+
`run.warning`. **+45 unit tests** covering every precedence layer, every
|
|
771
|
+
accepted env form, the conflict rules, and the invalid-env fallthrough.
|
|
772
|
+
- **CLI flag parsing in `src/cli/index.ts`.** New `parseEngineCliFlag()`
|
|
773
|
+
helper rejects the conflict case (both `--engine` AND `--no-engine`)
|
|
774
|
+
with `invalid_config` exit 1. Wired into the `scan` case to pass
|
|
775
|
+
`cliEngine` + `envEngine` (from `process.env.CLAUDE_AUTOPILOT_ENGINE`)
|
|
776
|
+
through to `runScan`.
|
|
777
|
+
- **Config schema** (`src/core/config/types.ts` + `schema.ts`). New
|
|
778
|
+
optional `engine.enabled: boolean` knob; schema rejects unknown
|
|
779
|
+
sub-keys (`additionalProperties: false`).
|
|
780
|
+
- **Help text** (`src/cli/help-text.ts`). New `GLOBAL_FLAGS_BLOCK`
|
|
781
|
+
documents `--json` / `--engine` / `--no-engine` + the precedence
|
|
782
|
+
matrix + scope (scan only in v6.0.1; rest follows the recipe). Per-verb
|
|
783
|
+
`scan` Options block adds the new flags so `claude-autopilot help scan`
|
|
784
|
+
is self-contained.
|
|
785
|
+
- **`scan` pilot phase wrapping** (`src/cli/scan.ts`). Refactored the
|
|
786
|
+
LLM-call-and-finding-processing portion into `executeScanPhase(input)`
|
|
787
|
+
→ `ScanOutput` (pure, no console output, no exit-code logic). Defined
|
|
788
|
+
`RunPhase<ScanInput, ScanOutput>` with `name: 'scan'`,
|
|
789
|
+
`idempotent: true`, `hasSideEffects: false`. Engine-on path:
|
|
790
|
+
`createRun()` → `runPhase()` → `run.complete` event +
|
|
791
|
+
`replayState`/`writeStateSnapshot` refresh + best-effort lock release
|
|
792
|
+
in `finally`. Engine-off path: `executeScanPhase(input)` directly,
|
|
793
|
+
byte-for-byte unchanged from v6.0. Rendering extracted into
|
|
794
|
+
`renderScanOutput()` so the engine path's idempotency isn't coupled
|
|
795
|
+
to console output. Test seam (`__testReviewEngine`) lets the smoke test
|
|
796
|
+
inject a fake without an LLM key.
|
|
797
|
+
- **End-to-end smoke test** (`tests/cli/scan-engine-smoke.test.ts`).
|
|
798
|
+
Drives `runScan` with the engine on against a tmp project; asserts
|
|
799
|
+
`state.status === 'success'`, single `scan` phase with the right
|
|
800
|
+
`idempotent` / `hasSideEffects` flags, monotonic seq numbers, and the
|
|
801
|
+
full lifecycle (`run.start` → `phase.start` → `phase.success` →
|
|
802
|
+
`run.complete`). Five cases including engine-off (no run dir),
|
|
803
|
+
env-resolved, CLI override, and invalid-env-fallthrough warning.
|
|
804
|
+
- **Wrapping recipe doc** (`docs/v6/wrapping-pipeline-phases.md`).
|
|
805
|
+
Six-step recipe + phase-status table + idempotency decision tree +
|
|
806
|
+
worked example (scan) + a checklist subsequent v6.0.x PRs follow when
|
|
807
|
+
wrapping the remaining nine pipeline phases (`brainstorm`, `plan`,
|
|
808
|
+
`implement`, `migrate`, `validate`, `pr`, `review`, `fix`, `costs`).
|
|
809
|
+
- **Migration guide** (`docs/v6/migration-guide.md`). "What works today"
|
|
810
|
+
list updated — three knobs move from "wiring pending" to "wired (limited
|
|
811
|
+
to scan)". Other phases still tracked under "wiring pending."
|
|
812
|
+
- **Spec reconciliation** (`docs/specs/v6-run-state-engine.md`). New "What
|
|
813
|
+
was actually built (v6.0.1 — Part A)" block.
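
The precedence rules described for `resolveEngineEnabled()` can be sketched roughly as follows. This is a minimal illustration and not the code in `src/core/run-state/resolve-engine.ts`: the input and output field names come from the entry above, while the function name `resolveEngineEnabledSketch`, the `source` string values, the `reason` wording, and the treatment of an empty env string as invalid are assumptions.

```typescript
type EngineSourceSketch = "cli" | "env" | "config" | "default"; // assumed labels

interface ResolveInputSketch {
  cliEngine?: boolean;
  envValue?: string;
  configEnabled?: boolean;
  builtInDefault?: boolean;
}

interface ResolveResultSketch {
  enabled: boolean;
  source: EngineSourceSketch;
  reason: string;
  invalidEnvValue?: string;
}

const TRUTHY = new Set(["on", "true", "1", "yes"]);
const FALSY = new Set(["off", "false", "0", "no"]);

// Case-insensitive, whitespace-tolerant parse. Returns undefined for invalid input.
function parseEnvValueSketch(raw: string): boolean | undefined {
  const value = raw.trim().toLowerCase();
  if (TRUTHY.has(value)) return true;
  if (FALSY.has(value)) return false;
  return undefined; // assumption: an empty string also counts as invalid
}

// Pure function, no IO: CLI > env > config > built-in default.
function resolveEngineEnabledSketch(input: ResolveInputSketch): ResolveResultSketch {
  if (input.cliEngine !== undefined) {
    return { enabled: input.cliEngine, source: "cli", reason: "CLI flag given" };
  }

  let invalid: { invalidEnvValue?: string } = {};
  if (input.envValue !== undefined) {
    const parsed = parseEnvValueSketch(input.envValue);
    if (parsed !== undefined) {
      return { enabled: parsed, source: "env", reason: "env var parsed" };
    }
    // Invalid env value: fall through, but surface the raw string to the caller.
    invalid = { invalidEnvValue: input.envValue };
  }

  if (input.configEnabled !== undefined) {
    return { enabled: input.configEnabled, source: "config", reason: "config key set", ...invalid };
  }

  return {
    enabled: input.builtInDefault ?? false, // engine ships OFF by default in v6.0.x
    source: "default",
    reason: "built-in default",
    ...invalid,
  };
}

const cliWins = resolveEngineEnabledSketch({ cliEngine: false, envValue: "on", configEnabled: true });
const envParsed = resolveEngineEnabledSketch({ envValue: "  YES " });
const envInvalid = resolveEngineEnabledSketch({ envValue: "maybe", configEnabled: true });
const nothingSet = resolveEngineEnabledSketch({});
```

Keeping the resolver free of IO is what makes each precedence layer easy to unit-test in isolation; the caller decides what to do with `invalidEnvValue`.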
|
|
814
|
+
|
|
815
|
+
### Test count
|
|
816
|
+
|
|
817
|
+
1306 → 1356 (+50). Typecheck clean. Existing 1306 tests continue to pass
|
|
818
|
+
unchanged — the engine-off code path for `scan` is byte-for-byte
|
|
819
|
+
identical to v6.0.
|
|
820
|
+
|
|
821
|
+
### Deliberately deferred
|
|
822
|
+
|
|
823
|
+
- Wrapping of any other pipeline phase. Lands one or two per PR across
|
|
824
|
+
v6.0.2+ following the recipe.
|
|
825
|
+
- Flipping the v6.0 built-in default to ON. v6.1 territory.
|
|
826
|
+
- Removing `--no-engine`. v7 territory.
|
|
827
|
+
|
|
828
|
+
## v6.0 — Run State Engine (2026-05-05)
|
|
829
|
+
|
|
830
|
+
**The headline.** Autopilot moves from a stateless command-stream to a
|
|
831
|
+
checkpointed, resumable, budget-bounded, observable pipeline. Every run gets
|
|
832
|
+
a ULID and a per-project directory at `.guardrail-cache/runs/<ulid>/`.
|
|
833
|
+
Every state transition appends a typed event to `events.ndjson` and updates
|
|
834
|
+
`state.json` atomically. Two-layer budget enforcement (advisory `estimateCost`
|
|
835
|
+
preflight + mandatory runtime guard) hard-stops runaway spend before it
|
|
836
|
+
happens. Every CLI verb grows a `--json` flag with strict stdout/stderr
|
|
837
|
+
channel discipline so CI consumers can drive the pipeline programmatically.
|
|
838
|
+
Side-effect phase replay decisions consult persisted `externalRefs` plus a
|
|
839
|
+
live provider read-back so resume is safe by construction. **v6.0 ships
|
|
840
|
+
with the engine OFF by default — opt-in via `engine.enabled: true` (config
|
|
841
|
+
wiring across 6.0.x point releases). Default flips to ON in v6.1.** See
|
|
842
|
+
[`docs/v6/migration-guide.md`](docs/v6/migration-guide.md) for the v5.x → v6
|
|
843
|
+
walkthrough and [`docs/v6/quickstart.md`](docs/v6/quickstart.md) for the
|
|
844
|
+
five-minute version.
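
To make "appends a typed event to `events.ndjson` and updates `state.json` atomically" more concrete, here is a minimal sketch using Node's `fs` module. It follows the order of operations that the Phase 1 entry in this release describes (append, fsync, close per event; write to a temp file, fsync, rename, then fsync the directory). The function names, the event fields, and the `.tmp` suffix handling are assumptions for illustration; the `.seq` sidecar and locking are left out.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Durable append: open in append mode, write one NDJSON line, fsync, close.
function appendEventSketch(runDir: string, event: Record<string, unknown>): void {
  const fd = fs.openSync(path.join(runDir, "events.ndjson"), "a");
  try {
    fs.writeSync(fd, JSON.stringify(event) + "\n");
    fs.fsyncSync(fd);
  } finally {
    fs.closeSync(fd);
  }
}

// Atomic snapshot: write a temp file, fsync it, rename over the target,
// then fsync the directory so the rename itself is durable.
function writeStateSnapshotSketch(runDir: string, state: unknown): void {
  const finalPath = path.join(runDir, "state.json");
  const tmpPath = finalPath + ".tmp";

  const fd = fs.openSync(tmpPath, "w");
  try {
    fs.writeSync(fd, JSON.stringify(state));
    fs.fsyncSync(fd);
  } finally {
    fs.closeSync(fd);
  }
  fs.renameSync(tmpPath, finalPath);

  try {
    const dirFd = fs.openSync(runDir, "r");
    try {
      fs.fsyncSync(dirFd);
    } finally {
      fs.closeSync(dirFd);
    }
  } catch (err) {
    // Some filesystems do not allow fsync on a directory; tolerate those cases only.
    const code = (err as { code?: string }).code;
    if (code !== "EISDIR" && code !== "EPERM" && code !== "ENOTSUP") throw err;
  }
}

// Demo against a throwaway directory.
const demoRunDir = fs.mkdtempSync(path.join(os.tmpdir(), "run-state-sketch-"));
appendEventSketch(demoRunDir, { seq: 1, type: "run.start" });
appendEventSketch(demoRunDir, { seq: 2, type: "phase.start" });
writeStateSnapshotSketch(demoRunDir, { status: "running", lastSeq: 2 });
```

Because readers only ever see either the old `state.json` or the fully written new one, a crash mid-write leaves at worst a stray temp file, and the event log remains available for replay.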
|
|
845
|
+
|
|
846
|
+
### Per-phase landings
|
|
847
|
+
|
|
848
|
+
- **Phase 1 — Run State Engine persistence layer ([#86](https://github.com/axledbetter/claude-autopilot/pull/86)).** `RunState` / `RunEvent` / `PhaseSnapshot` / `ExternalRef` / `WriterId` types in `src/core/run-state/types.ts`. Pure-TS 26-char Crockford Base32 ULID generator (`ulid.ts`). Per-run advisory lock via `proper-lockfile` + `.lock-meta.json` sidecar with PID + SHA-256-hashed hostname; off-host writers default to alive (fail closed) so a network-mounted lock can't be stolen. Durable append protocol for `events.ndjson` (`open(O_APPEND)` → `write` → `fsync(fd)` → `close` per event) with monotonic `seq` via `.seq` sidecar. Truncated last-line detection emits `run.recovery(reason: 'recovered-from-partial-write')` and continues; mid-file corruption throws `partial_write` immediately. Atomic snapshot writer for `state.json` (`open(.tmp)` → `fsync(fd)` → `rename` → `fsync(dirfd)`; tmpfs/SMB compatibility via swallowed EISDIR/EPERM/ENOTSUP on the dir-fsync). `recoverState` falls back to events replay when `state.json` is missing/corrupt. `createRun` / `listRuns` / `gcRuns` lifecycle helpers; symlink-safe GC. New `ErrorCode` variants: `lock_held`, `corrupted_state`, `partial_write`. **+56 tests.**
|
|
849
|
+
- **Phase 2 — Phase wrapper + lifecycle ([#87](https://github.com/axledbetter/claude-autopilot/pull/87)).** `RunPhase<I, O>` interface (`idempotent` / `hasSideEffects` / `estimateCost?` / `run` / `onResume?`). `runPhase` orchestrator emits `phase.start` → `phase.success`/`failed` and gates idempotent short-circuit + side-effecting replay. Atomic per-phase snapshot writer (`writePhaseSnapshot` with path-traversal rejection on phase names). Hidden CLI verb `claude-autopilot internal log-phase-event` exposed via `cli-internal.ts` so markdown-driven skills can append events without importing the engine. Sub-phase nesting via synthetic `phaseIdx` encoding (`parentIdx * 1000 + childOrdinal`). **+27 tests.** Spec deviation: idempotent-replay short-circuit emits `run.warning(details.reason: 'idempotent-replay')` instead of a new `phase.skipped` event variant — durable log doesn't need a new shape since the snapshot is identical.
|
|
850
|
+
- **Phase 3 — `runs` / `run resume` CLI ([#88](https://github.com/axledbetter/claude-autopilot/pull/88)).** Six verbs: `runs list` (newest-first, `--status` filter), `runs show <id>` (state + optional events tail), `runs gc` (default 30-day cutoff, confirmation gate), `runs delete <id>` (terminal-status guard + lock acquisition), `runs doctor` (replay vs snapshot drift; `--fix` rewrites), `run resume <id>` (**lookup-only** in v6.0 — identifies next phase + decision rationale; live execution wires in 6.1+). Every verb supports `--json` envelope output (v1 schema). New `Engine` group in `HELP_GROUPS`. Decision vocabulary (`retry` / `skip-idempotent` / `needs-human` / `already-complete`) preserved as a thin wrapper around the canonical `decideReplay` matrix introduced in Phase 6. **No changes to existing CLI verbs.**
|
|
851
|
+
- **Phase 4 — Budget enforcement ([#89](https://github.com/axledbetter/claude-autopilot/pull/89)).** `BudgetConfig` (`perRunUSD`, `perPhaseUSD?`, `councilMaxRecursionDepth?`, `bgAutopilotMaxRoundsPerSelfEat?`, `conservativePhaseReserveUSD?`). `checkPhaseBudget` pure decision function with two-layer policy: (1) advisory — uses `estimateCost.high` if the phase declares one; (2) mandatory — runs regardless, enforces `actualSoFar + conservativePhaseReserveUSD <= perRunUSD` so phases without `estimateCost` still trigger budget gates. `runPhase` emits a `budget.check` event with full decision rationale (`{phase, phaseIdx, estimatedHigh, actualSoFar, reserveApplied, capRemaining, decision, reason}`) before every spawn; throws `GuardrailError(budget_exceeded)` on hard-fail. Council synthesizer recursion bounded via `councilMaxRecursionDepth` — exceeded calls return `status: 'partial'` rather than continuing. **+25-30 tests.**
|
|
852
|
+
- **Phase 5 — Typed JSON events + strict `--json` channel discipline ([#90](https://github.com/axledbetter/claude-autopilot/pull/90)).** `--json` flag now lives on every Review / Pipeline / Deploy / Migrate / Diagnostics verb. Strict channel contract enforced by a dispatcher-level wrapper (`runUnderJsonMode` in `src/cli/json-envelope.ts`): exactly **one** JSON envelope on stdout per invocation; **only** NDJSON event lines on stderr (synthetic `run.warning` for legacy text via `installJsonModeChannelDiscipline` console-wrap); ANSI color codes stripped; interactive prompts hard-fail with `EXIT_NEEDS_HUMAN = 78` and the envelope's `nextActions` field carries the resume hint. Text-mode behavior unchanged. **`tests/cli/json-channel-discipline.test.ts` asserts the invariants per migrated verb.**
|
|
853
|
+
- **Phase 6 — Idempotency contracts + provider read-back ([#91](https://github.com/axledbetter/claude-autopilot/pull/91)).** `decideReplay` pure decision matrix in `replay-decision.ts` maps `(priorSuccess, idempotent, hasSideEffects, refs, readbacks, forceReplay)` → `'retry' | 'skip-already-applied' | 'needs-human' | 'abort'`. Pluggable `ProviderReadback` registry in `provider-readback.ts` with built-in read-backs for `github` (via `gh` CLI), `vercel` / `fly` / `render` (via the deploy adapters), `supabase` (via `migration_state`). All read-backs **fail closed** — any throw, parse failure, or unrecognized state collapses to `existsOnPlatform=false, currentState='unknown'` so the matrix routes to `needs-human` instead of a silent skip. `runPhase` wires `decideReplay` (replaces Phase 2's hard-coded throw). New `replay.override` event variant emitted when `--force-replay` flips a refusal into a retry; `foldEvents` records overrides on `phase.meta.replayOverrides`. `PhaseSnapshot.result` field added so `skip-already-applied` returns the prior output without re-execution. CLI lookup (`runRunResume`) delegates to the same `decideReplay` so prediction matches live execution. **+55 tests.**
|
|
854
|
+
- **Phase 7 — Live adapter certification suite ([#92](https://github.com/axledbetter/claude-autopilot/pull/92)).** Five live assertions × three providers (Vercel + Fly + Render): deploy success, auth failure, 404, rollback, log streaming with redaction-on-planted-secret. Env-gated via `resolveProviderEnv()` — runs report `skipped` until the operator adds the seven `*_TEST` GitHub Secrets per `docs/adapters/cert-suite.md`. Flake-control harness (`tests/adapters/live/_harness.ts`) implements per-provider 3-attempt retry budget with exp backoff (1s / 4s / 16s) on transient categories, hard-fail (no retry) on auth/404/schema-mismatch, soft-fail with 3-strike escalation on rollout/log-streaming flakes; **+42 unit tests** for the harness alone (run under regular `npm test`, no live creds required). Nightly CI workflow at `.github/workflows/adapter-cert.yml` (09:00 UTC + manual `workflow_dispatch`); uploads `events.ndjson` + `log-tail.txt` artifacts on every run. **Spec deviation:** Fly cert needs a third env var (`FLY_IMAGE_TEST`) since the Fly adapter doesn't build images per the v5.6 design.
|
|
855
|
+
- **Phase 8 — Docs + migration guide ([#94](https://github.com/axledbetter/claude-autopilot/pull/94), this PR).** `docs/v6/migration-guide.md` walks v5.x users through the opt-in flow with a precedence matrix, troubleshooting recipes, the per-phase idempotency table, and the v6.0 → v6.1 default-flip plan. `docs/v6/quickstart.md` is the five-minute version. README gains a "Run State Engine (v6)" section. CHANGELOG (this entry) bundles every phase. Spec gets a Phase 8 reconciliation block + a Status column on the implementation phases table. New `docs/specs/v6.1-default-flip.md` outlines the stabilization criteria for flipping `engine.enabled` to `true` by default and removing `--no-engine`.
|
|
856
|
+
- **Spec — Codex-reviewed twice ([#85](https://github.com/axledbetter/claude-autopilot/pull/85)).** Two passes through Codex 5.3 hardened the persistence protocol (durable append + atomic snapshot ordering), promoted `events.ndjson` to source-of-truth with `state.json` as a derived cache, mandated copy-not-symlink for artifacts, added the two-layer budget policy with a mandatory runtime guard, formalized the strict `--json` channel discipline, defined the external-operation ledger for replay safety (`ExternalRef` + provider read-back), pinned the precedence matrix, and added flake-control parameters for the live adapter cert suite.
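
The two-layer budget policy from the Phase 4 entry above can be sketched as a pure decision function. This is an illustration under stated assumptions, not the package's `checkPhaseBudget`: the mandatory inequality `actualSoFar + conservativePhaseReserveUSD <= perRunUSD` comes from the entry, while the function name, the `decision` values, the `advisoryExceeded` flag, the default reserve of 0, and the choice to make the advisory layer non-blocking are all hypothetical.

```typescript
interface BudgetConfigSketch {
  perRunUSD: number;
  conservativePhaseReserveUSD?: number;
}

interface BudgetDecisionSketch {
  decision: "proceed" | "hard-fail"; // assumed vocabulary
  reason: string;
  actualSoFar: number;
  reserveApplied: number;
  capRemaining: number;
  estimatedHigh?: number;
  advisoryExceeded: boolean; // assumption: advisory layer only flags, never blocks
}

function checkPhaseBudgetSketch(
  cfg: BudgetConfigSketch,
  actualSoFar: number,
  estimatedHigh?: number,
): BudgetDecisionSketch {
  const reserveApplied = cfg.conservativePhaseReserveUSD ?? 0; // assumed default
  const capRemaining = cfg.perRunUSD - actualSoFar;
  const base = {
    actualSoFar,
    reserveApplied,
    capRemaining,
    ...(estimatedHigh !== undefined ? { estimatedHigh } : {}),
  };

  // Layer 2 (mandatory): runs even when the phase declares no estimate.
  if (actualSoFar + reserveApplied > cfg.perRunUSD) {
    return {
      ...base,
      decision: "hard-fail",
      reason: "spend so far plus reserve exceeds the per-run cap",
      advisoryExceeded: false,
    };
  }

  // Layer 1 (advisory): only applies when an estimate is available.
  const advisoryExceeded =
    estimatedHigh !== undefined && actualSoFar + estimatedHigh > cfg.perRunUSD;

  return {
    ...base,
    decision: "proceed",
    reason: advisoryExceeded ? "within cap, but high estimate would exceed it" : "within cap",
    advisoryExceeded,
  };
}

const budget: BudgetConfigSketch = { perRunUSD: 10, conservativePhaseReserveUSD: 2 };
const overCap = checkPhaseBudgetSketch(budget, 9); // 9 + 2 > 10
const atCap = checkPhaseBudgetSketch(budget, 8); // 8 + 2 <= 10
const flagged = checkPhaseBudgetSketch(budget, 5, 6); // estimate pushes past the cap
```

The reserve is what lets a phase with no `estimateCost` still trip the gate before it starts spending.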
|
|
857
|
+
|
|
858
|
+
### Codex / council pricing — from the GPT-5.5 swap ([#93](https://github.com/axledbetter/claude-autopilot/pull/93))
|
|
859
|
+
|
|
860
|
+
- **Default codex/council model bumped `gpt-5.3-codex` → `gpt-5.5`.** OpenAI
|
|
861
|
+
released GPT-5.5 (codename Spud) on 2026-04-23 — better at coding than 5.4
|
|
862
|
+
with fewer tokens, available via standard Responses/Chat Completions API
|
|
863
|
+
at `gpt-5.5` (no `-codex` suffix). Pricing **doubles** to $5/1M input +
|
|
864
|
+
$30/1M output, so the per-adapter `COST_PER_M_INPUT/OUTPUT` defaults moved
|
|
865
|
+
in lockstep — without this, every cost-ledger entry would silently halve.
|
|
866
|
+
New canonical pricing table at `src/adapters/pricing.ts` keeps the legacy
|
|
867
|
+
`gpt-5.3-codex` and `gpt-5.4` entries for back-compat with pinned
|
|
868
|
+
`CODEX_MODEL`/`council.models[].model` configs. Override via env vars
|
|
869
|
+
(`CODEX_MODEL`, `CODEX_COST_INPUT_PER_M`, `CODEX_COST_OUTPUT_PER_M`).
|
|
870
|
+
|
|
871
|
+
## v5.6.0 — Fly.io + Render deploy adapters (2026-05-04)
|
|
872
|
+
|
|
873
|
+
### Added
|
|
874
|
+
|
|
875
|
+
- **`@delegance/claude-autopilot deploy --adapter fly`** — first-class Fly.io adapter. Image-based releases via the Machines API (image must be pre-pushed via `fly deploy --build-only --push`), polling-based status, **WebSocket log streaming**, **native rollback** with simulated fallback when the API endpoint is unavailable. `FLY_API_TOKEN` env var; auth doctor warns when missing.
|
|
876
|
+
- **`@delegance/claude-autopilot deploy --adapter render`** — first-class Render adapter. REST API deploys (with optional `clearCache`), service-scoped status polling at `GET /v1/services/{serviceId}/deploys/{deployId}`, REST-polling log stream with `(timestamp, logId)` cursor dedup, **simulated rollback** by re-deploying the previous successful commit. `RENDER_API_KEY` env var; auth doctor warns when missing.
|
|
877
|
+
- **`DeployAdapterCapabilities` interface** — adapters declare `streamMode: 'websocket' | 'polling' | 'none'` and `nativeRollback: boolean`. CLI prints a one-line stderr notice for polling-mode adapters under `--watch` so users understand why log lines arrive in batches.
|
|
878
|
+
- **Bounded auto-rollback orchestration in `src/cli/deploy.ts`** — when health check fails after deploy and `rollbackOn: [healthCheckFailure]` is configured, the CLI fires exactly one rollback (no chains), with `runHealthCheck` capped at 5 attempts × 6s backoff (~30s window). New terminal `DeployResult.status` values: `fail_rolled_back` and `fail_rollback_failed`.
|
|
879
|
+
- **HTTP-status error taxonomy** — new `not_found` `ErrorCode` joins the union; per-adapter mapping: 401/403→`auth`, 404→`not_found`, 422/400→`invalid_config`, 5xx→`transient_network` (retryable). Provider request-id headers (`Fly-Request-Id`, `x-request-id`) captured into `error.details` for support tickets.
|
|
880
|
+
- **Mandatory log redaction across all adapters** — every log line surfaced into `DeployResult.output` or PR-comment bodies runs through `redactLogLines()` (defaults: `AKIA…`, `sk-…`, `eyJ…`, `ghp_`, `xoxb-`, plus user-configurable `config.persistence.redactionPatterns`). Closes a real existing security hazard in the v5.4 Vercel adapter that was emitting unredacted logs into PR comments.
|
|
881
|
+
- **Shared `src/adapters/deploy/_http.ts`** — extracted `fetchWithRetry` + `safeReadBody` helpers used by Vercel, Fly, and Render adapters; one canonical retry implementation to maintain.
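
The HTTP-status taxonomy listed above maps cleanly onto a small lookup function. The sketch below is illustrative only: the status-to-code pairs come from the entry, but `mapHttpStatusSketch`, the `retryable: false` value for non-5xx codes, and returning `undefined` for unmapped statuses are assumptions.

```typescript
type DeployErrorCodeSketch = "auth" | "not_found" | "invalid_config" | "transient_network";

interface MappedErrorSketch {
  code: DeployErrorCodeSketch;
  retryable: boolean;
}

function mapHttpStatusSketch(status: number): MappedErrorSketch | undefined {
  if (status === 401 || status === 403) return { code: "auth", retryable: false };
  if (status === 404) return { code: "not_found", retryable: false };
  if (status === 400 || status === 422) return { code: "invalid_config", retryable: false };
  if (status >= 500 && status <= 599) return { code: "transient_network", retryable: true };
  // The changelog does not say how other statuses are classified.
  return undefined;
}

const mappedSamples = [401, 404, 422, 503].map((s) => mapHttpStatusSketch(s)?.code);
```

Only the 5xx bucket is marked retryable here, matching the note in the entry; treating the other buckets as non-retryable is a guess on my part.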
|
|
882
|
+
|
|
883
|
+
### Fixed
|
|
884
|
+
|
|
885
|
+
- **Bugbot caught + autopilot fixed 4 real bugs across the v5.6 self-eat phases.** HIGH on Phase 2 (Render service-scoped URL — `pollUntilTerminal` and `status()` were using shorthand `/v1/deploys/{id}` which doesn't exist on Render's API). MEDIUM on Phase 3 (Render cursor dedup wasn't sorting same-ms entries by id, silently dropping out-of-order siblings). LOW on Phase 4 (`printAutoRollback` hardcoded "failed 3x" but the constant is now 5). LOW on Phase 5 (`getPreviousFileContent` was being called for `.sql` files where `previousContent` is ignored, wasting a `git show` spawn per migration).
|
|
886
|
+
- **Schema-alignment diff-aware Prisma parsing (PR #44, schema-alignment cleanup)** — `getPreviousFileContent` now defaults to a CI-aware base ref (`GITHUB_BASE_REF` → `origin/<base>`, then `CI_MERGE_REQUEST_TARGET_BRANCH_NAME`, fallback `HEAD~1`) instead of always reading from `HEAD` (which gave empty diffs in CI). Dropped models now emit `drop_column` for every field of the removed model.
|
|
887
|
+
- **Tombstone CLI no longer crashes with a stack trace when presets are missing (PR #82)** — schema-validator was running file IO at module load time, so every `claude-autopilot --version` call eagerly read `presets/aliases.lock.json` + `presets/schemas/migrate.schema.json`; missing presets crashed the CLI before it could format an error. Now lazy-init via memoized `getValidator()`.
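
The CI-aware base-ref fallback chain in the schema-alignment fix can be sketched as a pure function over the environment. This is a hedged illustration, not the code behind `getPreviousFileContent`: the variable order and the `HEAD~1` fallback come from the entry, while `resolveBaseRefSketch` and the `origin/` prefix for the GitLab-style variable are assumptions.

```typescript
function resolveBaseRefSketch(env: Record<string, string | undefined>): string {
  const githubBase = env["GITHUB_BASE_REF"];
  if (githubBase) return `origin/${githubBase}`;

  const mergeRequestTarget = env["CI_MERGE_REQUEST_TARGET_BRANCH_NAME"];
  // Assumption: the same origin/ prefix applies; the changelog does not spell this out.
  if (mergeRequestTarget) return `origin/${mergeRequestTarget}`;

  return "HEAD~1";
}

const fromGithub = resolveBaseRefSketch({ GITHUB_BASE_REF: "main" });
const fromMergeRequest = resolveBaseRefSketch({ CI_MERGE_REQUEST_TARGET_BRANCH_NAME: "develop" });
const localFallback = resolveBaseRefSketch({});
```

In practice a caller would pass `process.env`; taking the environment as a parameter keeps the function easy to test.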
|
|
888
|
+
|
|
889
|
+
## v5.5.2 — Framework-agnostic /migrate (2026-04-30)
|
|
890
|
+
|
|
891
|
+
### Added
|
|
892
|
+
|
|
893
|
+
- **Working examples for Rails, Alembic, Django, golang-migrate, Prisma, Drizzle, dbmate, Flyway, supabase-cli, custom scripts** in `skills/migrate/SKILL.md`. The dispatcher was always framework-agnostic, but the prior doc text only described the Supabase path.
|
|
894
|
+
- **Detector `defaultCommand` fills** for `prisma-push`, `drizzle-push`, `golang-migrate`, `typeorm` so `claude-autopilot init` produces a working `stack.md` on first try for these toolchains.
|
|
895
|
+
|
|
896
|
+
### Fixed
|
|
897
|
+
|
|
898
|
+
- **`/migrate` skill description rewritten** as a generic dispatcher description with a "when to use migrate-supabase instead" callout. Anyone running `migrate@1` in a non-Supabase repo no longer sees Supabase-specific instructions.
|
|
899
|
+
|
|
900
|
+
## v5.5.1 — `openai` SDK now optional (2026-04-30)
|
|
901
|
+
|
|
902
|
+
### Changed
|
|
903
|
+
|
|
904
|
+
- **`openai` moved to `optionalDependencies`** alongside `@anthropic-ai/sdk`, `@google/generative-ai`, `@modelcontextprotocol/sdk`. All four LLM SDKs are now optional. The `npm install --omit=optional` savings grow to **~26 MB** (was ~13 MB after v5.5.0). `scripts/autoregress.ts` migrated to `loadOpenAI()`, which was the last direct `import OpenAI` outside the adapter layer.
|
|
905
|
+
|
|
906
|
+
### Notes
|
|
907
|
+
|
|
908
|
+
- Council runner already handles missing-synth-SDK gracefully — returns `status: 'partial'` with the friendly install hint surfaced via the synthesis error field. Users with only `ANTHROPIC_API_KEY` get a partial result with model responses preserved.
|
|
909
|
+
|
|
910
|
+
## v5.5.0 — Lazy-load LLM SDKs + Vercel auth doctor (2026-04-30)
|
|
911
|
+
|
|
912
|
+
### Added
|
|
913
|
+
|
|
914
|
+
- **`src/adapters/sdk-loader.ts`** with `loadAnthropic` / `loadOpenAI` / `loadGoogleGenerativeAI` + `isSdkInstalled` helper. Friendly `GuardrailError` on `MODULE_NOT_FOUND` points at the exact `npm install` command.
|
|
915
|
+
- **Phase 6 of v5.4 spec — Vercel auth doctor.** `claude-autopilot doctor` detects `deploy.adapter: vercel` in `guardrail.config.yaml` and warns when `VERCEL_TOKEN` is missing.
|
|
916
|
+
- **LLM SDK install-state surface in doctor** — shows which optional LLM SDKs are actually installed.
|
|
917
|
+
|
|
918
|
+
### Changed
|
|
919
|
+
|
|
920
|
+
- **`@anthropic-ai/sdk`, `@google/generative-ai`, `@modelcontextprotocol/sdk` moved to `optionalDependencies`**. Six adapters converted from top-level import to dynamic load. Users with `--omit=optional` shed ~13 MB and only need the SDK matching their API key.
|
|
921
|
+
|
|
922
|
+
## v5.4.0 — Vercel first-class deploy adapter (2026-04-30)
|
|
923
|
+
|
|
924
|
+
### Added
|
|
925
|
+
|
|
926
|
+
- **`@delegance/claude-autopilot deploy --adapter vercel`** — first-class Vercel adapter via the v13 deployments API. Returns `dpl_xxx` IDs, polls status until terminal, populates `deployUrl` / `buildLogsUrl` / `output`. Auth via `VERCEL_TOKEN`.
|
|
927
|
+
- **`--watch` SSE+NDJSON log streaming** — subscribes to `/v2/deployments/<id>/events?builds=1`, prints to stderr in real time. Reconnects once with exp backoff on disconnect.
|
|
928
|
+
- **`claude-autopilot deploy rollback` + `deploy status`** — CLI subverbs over the adapter's `rollback()` / `status()` methods. `--to <id>` overrides "previous prod deploy" lookup.
|
|
929
|
+
- **Auto-rollback on health-check failure** — when `rollbackOn: [healthCheckFailure]` is set in config, the CLI promotes the previous prod deploy if the post-deploy health check fails. PR comment shows both URLs (new + rolled-back-to).
|
|
930
|
+
- **`<!-- claude-autopilot-deploy -->` upserting PR comment** — single comment is updated in place across deploy → log-stream → health-check → rollback, instead of spamming the PR with multiple comments.
|
|
931
|
+
|
|
932
|
+
### Fixed
|
|
933
|
+
|
|
934
|
+
- **Bugbot caught that an explicit `--config <missing>` was silently ignored on PR #63 (Phase 3)** — autopilot fixed it with a regression test in 4 minutes.
|
|
935
|
+
- **Phase 4 introduced a regression in Phase 2's `--watch` test surface; caught via `npm test` before PR opened**, autopilot adapted spec interpretation (made health-check opt-in instead of falling back to deployUrl) and documented the deviation.
|
|
936
|
+
|
|
937
|
+
### Notes
|
|
938
|
+
|
|
939
|
+
- This release was **shipped as four self-eat PRs** (#59, #61, #63, #64) where autopilot implemented its own next phase end-to-end. Cumulative cost ~\$17.50, wall clock ~82 min, 47 new tests. See [DEMO.md](DEMO.md) for the full proof set.
|
|
940
|
+
- v5.3 "deploy phase" was superseded by v5.4 — the adapter pattern subsumed the generic-command-only design from the in-flight v5.3 spec.
|
|
941
|
+
|
|
942
|
+
## v5.2.2 — Demo polish
|
|
943
|
+
|
|
944
|
+
### Fixed
|
|
945
|
+
|
|
946
|
+
- **Cost log skips zero-token entries.** Setup-flow scans, dry-runs, and no-findings paths were polluting the log with empty rows that drowned real review entries in `claude-autopilot costs` output.
|
|
947
|
+
- **`costs` shows scope.** Output now explicitly notes "per-project — scoped to `<cwd>/.guardrail-cache/costs.jsonl`" so users understand it's not a global aggregate.
|
|
948
|
+
- **`pr` no longer hard-fails on missing config.** First-run on a fresh repo now auto-detects + prints a remediation line pointing at `setup`.
|
|
949
|
+
|
|
950
|
+
### Added
|
|
951
|
+
|
|
952
|
+
- **DEMO.md committed at repo root.** Real end-to-end pipeline run on randai-johnson (multi-file Python integration, 12 min wall clock, $2.20 spend, 5 new tests, zero manual intervention). Linkable from external docs / pitch material.
|
|
953
|
+
|
|
954
|
+
## v5.2.1 — Stress-test polish
|
|
955
|
+
|
|
956
|
+
### Fixed
|
|
957
|
+
|
|
958
|
+
- **venv detection in tests phase.** `pytest -q` now resolves to `<project>/.venv/bin/pytest` (or `venv/`, `env/`) when present, so `claude-autopilot pr` no longer reports "tests failed" on Python repos with venv-installed pytest.
|
|
959
|
+
- **`autoregress` 100% broken on global install** — the bridge resolved `SCRIPT` to `dist/scripts/autoregress.ts` under the compiled layout, but `scripts/` ships at the package root. Every invocation threw `ERR_MODULE_NOT_FOUND`. Now uses `findPackageRoot` + existence check.
|
|
960
|
+
- **Council in python preset.** Python preset now ships a commented `council:` template (mirrors the generic preset). Out-of-the-box `init --preset python` no longer requires manual schema discovery.
|
|
961
|
+
- **Regression-lane fixture top-level await.** The CI workflow's `npx tsx -e "..."` blocks are now wrapped in `async () => {...}` so esbuild's CJS output accepts them. `expected-ledger.json` was also updated to match v5.2.0's new version format.
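
The venv lookup described in the first fix above amounts to probing a few conventional directories before falling back to the bare command. A minimal sketch, assuming a POSIX-style `bin/` layout and a hypothetical helper name `resolvePytestSketch` (the package's actual resolution logic is not shown in this changelog):

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Probe .venv/, venv/, env/ in that order; fall back to whatever is on PATH.
function resolvePytestSketch(projectDir: string): string {
  for (const venvDir of [".venv", "venv", "env"]) {
    const candidate = path.join(projectDir, venvDir, "bin", "pytest");
    if (fs.existsSync(candidate)) return candidate;
  }
  return "pytest";
}

// Demo against a throwaway project directory.
const demoProject = fs.mkdtempSync(path.join(os.tmpdir(), "venv-sketch-"));
const withoutVenv = resolvePytestSketch(demoProject);

fs.mkdirSync(path.join(demoProject, "venv", "bin"), { recursive: true });
fs.writeFileSync(path.join(demoProject, "venv", "bin", "pytest"), "");
const withVenv = resolvePytestSketch(demoProject);
```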
|
|
962
|
+
|
|
963
|
+
## v5.0.8 — Line extraction + fix gate
|
|
964
|
+
|
|
965
|
+
### Fixed
|
|
966
|
+
|
|
967
|
+
- **Parser extracts "line N" / "on line N" / "at line N" from prose** when not adjacent to a file ref. Previously findings shipped with file but no line, so `fix --dry-run` reported "no fixable findings" on a non-empty findings list.
|
|
968
|
+
- **`fix` distinguishes actionable (file present) from fixable (file + line).** Dry-run surfaces actionable findings even when line-less, with a clear message about why the LLM-fix loop can't act on them.
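
Both fixes above can be illustrated with a small sketch: one helper pulls a line number out of prose, and another separates actionable findings (file present) from fixable ones (file plus line). The regular expression, the `FindingSketch` shape, and the function names are assumptions for illustration; the actual parser is not shown in this changelog.

```typescript
interface FindingSketch {
  file?: string;
  line?: number;
  message: string;
}

// "line N", "on line N", and "at line N" all contain the token "line N".
function extractLineNumberSketch(prose: string): number | undefined {
  const match = /\bline\s+(\d+)\b/i.exec(prose);
  return match ? Number(match[1]) : undefined;
}

function classifyFindingSketch(finding: FindingSketch): "fixable" | "actionable" | "neither" {
  if (finding.file === undefined) return "neither";
  return finding.line !== undefined ? "fixable" : "actionable";
}

const proseLine = extractLineNumberSketch("Possible null dereference on line 42 of the handler");
const noLine = extractLineNumberSketch("The outline 5 section is unclear");

const fixableKind = classifyFindingSketch({ file: "src/app.ts", line: 42, message: "null deref" });
const actionableKind = classifyFindingSketch({ file: "src/app.ts", message: "unclear naming" });
```

The word boundary before `line` keeps words such as "outline" from producing a false match.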
|
|
969
|
+
|
|
970
|
+
## v5.0.7 — File backfill + cost ledger consolidation
|
|
971
|
+
|
|
972
|
+
### Fixed
|
|
973
|
+
|
|
974
|
+
- **Single-file scan unconditionally backfills the file path.** The 5.0.6 fallback only triggered on `<unspecified>`, so prose-noise like `"n.r"` slipped through and broke `fix`.
|
|
975
|
+
- **`pr-desc` and `council` now persist to the cost ledger.** Previously only `scan` and `run` were tracked, so `claude-autopilot costs` showed misleadingly low totals after multi-call sessions.
|
|
976
|
+
- **Single-letter code extensions removed from bare-reference parser** (c/d/h/m/r/s) — they still match when backtick-wrapped, but bare matches like "n.r" no longer slip through.
|
|
977
|
+
- **`appendCostLog` swallows write errors centrally.** Cost log is observability, not a contract — a read-only FS or full disk no longer crashes commands that already succeeded.
|
|
978
|
+
|
|
979
|
+
## v5.0.6 — Setup YAML + branch fallback
|
|
980
|
+
|
|
981
|
+
### Fixed
|
|
982
|
+
|
|
983
|
+
- **`setup` no longer writes duplicate `testCommand` keys.** Several presets (go, python, python-fastapi, rails-postgres) ship with their own `testCommand:` line; `cli/setup.ts` was unconditionally appending another, producing invalid YAML that hard-failed every command after `setup` on those stacks.
|
|
984
|
+
- **Single-file scan backfills file path** (initial fix; superseded by v5.0.7's unconditional version).
|
|
985
|
+
- **Branch-derived PR titles default to `chore:` for unknown prefixes.** `autopilot-test/validate-weights` → `chore: validate weights` instead of `autopilot test validate weights` (which fails commitlint).
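
The branch-to-title rule in the last fix can be sketched as below. The two example mappings (`autopilot-test/validate-weights` and, from the v5.0.5 notes, `fix/cost-tracker`) come from this changelog; the function name and the exact set of recognized prefixes are assumptions based on common conventional-commit types.

```typescript
// Assumed set of recognized prefixes; the package's actual list is not shown here.
const KNOWN_COMMIT_TYPES = new Set([
  "feat", "fix", "chore", "docs", "refactor", "test", "perf", "ci", "build",
]);

function titleFromBranchSketch(branch: string): string {
  const slash = branch.indexOf("/");
  const prefix = slash === -1 ? "" : branch.slice(0, slash);
  const rest = slash === -1 ? branch : branch.slice(slash + 1);

  // Unknown prefixes default to chore: so the title still passes commitlint.
  const type = KNOWN_COMMIT_TYPES.has(prefix) ? prefix : "chore";
  const subject = rest.replace(/[-_/]+/g, " ").trim();
  return `${type}: ${subject}`;
}

const unknownPrefixTitle = titleFromBranchSketch("autopilot-test/validate-weights");
const knownPrefixTitle = titleFromBranchSketch("fix/cost-tracker");
```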
|
|
986
|
+
|
|
987
|
+
## v5.0.5 — Python detect + parser hardening
|
|
988
|
+
|
|
989
|
+
### Added
|
|
990
|
+
|
|
991
|
+
- **`presets/python/`** — general Python config (pytest, ruff, hardcoded-secrets, common protected paths). Detector now picks it for any `pyproject.toml` or `requirements.txt` without FastAPI signals (was falling through to the JS/Generic preset).
### Fixed
- **Parser rejects "e.g" / "i.e" / "etc" prose as file refs.** The prior regex `\.[a-z]{1,6}` accepted any 1-6 letter suffix, so prose like "(e.g. dict, list)" was matched. Bare references now require a known code-file extension.
- **`pr-desc` generates real titles.** The prompt now explicitly asks for a Title line. If the model omits one, the parser falls back to a branch-derived conventional-commit title (`fix/cost-tracker` → `fix: cost tracker`), then to the first summary bullet, and uses `chore: update` only as a last resort.
- **`runReviewOnTestFail` default flipped to `true`.** Failed/missing test commands no longer silently kill the LLM review phase. Strict gating still available via explicit `false`.
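To illustrate the parser hardening, here is a small sketch of matching bare file references only when they end in a known code-file extension. The extension allowlist, the regex, and the name `extractBareFileRefs` are assumptions for illustration, not the package's actual parser.

```typescript
// Assumed allowlist; the package's real extension list is not shown in the changelog.
const CODE_EXTENSIONS = [
  "ts", "tsx", "js", "jsx", "py", "go", "rb", "rs", "java", "json", "yaml", "yml", "md",
];

// A bare reference must start at a token boundary and end in an allowlisted extension.
const BARE_REF_SOURCE =
  "(?:^|[\\s(])([\\w./-]+\\.(?:" + CODE_EXTENSIONS.join("|") + "))(?!\\w)";

function extractBareFileRefs(text: string): string[] {
  const re = new RegExp(BARE_REF_SOURCE, "g");
  const refs: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = re.exec(text)) !== null) {
    const ref = match[1];
    if (ref !== undefined) refs.push(ref);
  }
  return refs;
}

const sample = "Handle inputs (e.g. dict, list) in src/app.py and see n.r, i.e. utils/parse.ts.";
console.log(extractBareFileRefs(sample)); // [ 'src/app.py', 'utils/parse.ts' ]
```

With the looser `\.[a-z]{1,6}` suffix, tokens such as `e.g` or `n.r` would have qualified; restricting the suffix to an allowlist is what rejects them here.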
## v5.0.4 — Council Responses API
### Fixed
- **Council 404s on `gpt-5.3-codex` resolved.** Codex variants and o-series reasoning models are Responses-API-only, but the council adapter only called `client.chat.completions`. It now branches on the model name (`/codex|^o[1-9]|^gpt-5\.3-/`) and uses `client.responses.create()` for those models. This restores multi-model council runs for users who have only `OPENAI_API_KEY` set.
- **Generic preset ships a working council template.**
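The model-name branching in the council fix can be sketched as a small routing helper. The regex is quoted from the entry above; the helper name `endpointFor` and the `OpenAIEndpoint` type are hypothetical, and the actual SDK calls are only named in comments.

```typescript
// Pattern quoted from the changelog entry: Codex variants, o-series, and gpt-5.3-* models.
const RESPONSES_ONLY = /codex|^o[1-9]|^gpt-5\.3-/;

type OpenAIEndpoint = "responses" | "chat.completions";

// Picks which OpenAI endpoint family a model should be sent to.
function endpointFor(model: string): OpenAIEndpoint {
  // "responses" would map to client.responses.create(),
  // "chat.completions" to client.chat.completions.create().
  return RESPONSES_ONLY.test(model) ? "responses" : "chat.completions";
}

console.log(endpointFor("gpt-5.3-codex")); // responses
console.log(endpointFor("o3-mini"));       // responses
console.log(endpointFor("gpt-4o"));        // chat.completions
```

Note that `^o[1-9]` is anchored, so a name like `gpt-4o` that merely contains an "o" is not routed to the Responses API.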
## v5.0.3 — Cost tracker
### Fixed
- **Codex adapter computes `costUSD`** (was returning `usage` without a cost field, so every codex run logged $0).
- **`scan` now persists to cost log** (was only `run` writing entries).
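As a rough illustration of deriving `costUSD` from a `usage` object, the sketch below multiplies token counts by per-million-token rates. The field names, the `Rates` type, and the rate values in the example are placeholders, not the package's pricing table or real prices.

```typescript
// Hypothetical usage shape; the adapter's real field names are not shown in the changelog.
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Rates in USD per million tokens (placeholder type for illustration).
interface Rates {
  inputPerMTok: number;
  outputPerMTok: number;
}

function computeCostUSD(usage: Usage, rates: Rates): number {
  const total =
    usage.inputTokens * rates.inputPerMTok +
    usage.outputTokens * rates.outputPerMTok;
  return total / 1e6;
}

// Example with made-up rates: 1000 input and 500 output tokens.
const cost = computeCostUSD(
  { inputTokens: 1000, outputTokens: 500 },
  { inputPerMTok: 2, outputPerMTok: 8 },
);
console.log(cost.toFixed(4)); // 0.0060
```

Returning `usage` without this step is what would leave every logged run at $0, as the first bullet describes.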
## v5.0.2 — Post-install friction
### Fixed
- **Preflight `tsx` false positive eliminated.** Every fresh global install reported a `✗ tsx available` blocker because the bundled `tsx` wasn't checked. The check now resolves it via `findPackageRoot(import.meta.url)`.
- **Top-level `unhandledRejection` + `uncaughtException` handlers** format `GuardrailError` as a single-line red message instead of a Node stack trace. `CLAUDE_AUTOPILOT_DEBUG=1` re-enables the stack trace.
- **Tarball trimmed:** dropped `src/` + `*.map` from `files` array → 319 files / 182 kB packed (was 726 / 382 kB), -56% / -52%.
- **Stale strings:** `@alpha` install hint → `@latest`; `npx guardrail run` blocker text → `claude-autopilot run`; init deprecation banner removed (both verbs work).
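The top-level error handling could look roughly like the sketch below. The `GuardrailError` class here is a stand-in for the package's own class, and `formatFatal`, `installHandlers`, the ANSI color codes, and the exit code of 1 are assumptions made for illustration.

```typescript
// Stand-in for the package's GuardrailError class.
class GuardrailError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "GuardrailError";
  }
}

// Known errors become one red line; everything else (or debug mode) keeps the stack.
function formatFatal(err: unknown, debug: boolean): string {
  if (err instanceof GuardrailError && !debug) {
    return `\x1b[31m${err.message}\x1b[0m`;
  }
  if (err instanceof Error) return err.stack ?? err.message;
  return String(err);
}

// Registers both process-level handlers with the same formatter.
function installHandlers(): void {
  const debug = process.env.CLAUDE_AUTOPILOT_DEBUG === "1";
  const onFatal = (err: unknown): void => {
    console.error(formatFatal(err, debug));
    process.exit(1); // assumed exit code
  };
  process.on("unhandledRejection", onFatal);
  process.on("uncaughtException", onFatal);
}
```

Calling `installHandlers()` once at CLI startup would cover both rejected promises and synchronous throws with the same formatting path.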
## v5.0.1 — Types + tombstone
### Fixed
- **Ships `dist/src/index.d.ts`** for TypeScript consumers.
- **Tombstone `@delegance/guardrail` package** publishes a forwarder pointing at the renamed package; pre-rename versions deprecated with migration message.
## v5.2.0 — Migrate skill generalization