@delegance/claude-autopilot 7.9.0 → 7.9.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -2,6 +2,17 @@
2
2
 
3
3
  - v5.6 Phase 7 (docs reconciliation) — pending.
4
4
 
5
+ ## 7.9.1 — 2026-05-13 (correctness hotfix)
6
+
7
+ ### Fixed
8
+ - **`skills/autopilot/SKILL.md` ran migrate BEFORE validate.** On stacks that auto-promote (Supabase-script-specific), this could leave production with new schema and no working code if validate or PR review later failed. Resequenced: validate is now Step 4, migrate-dev is Step 5. PR + Codex + bugbot follow. Production migration is explicitly handed off to the user's CI/CD pipeline.
9
+ - **Removed misleading "dev → QA → prod auto-promote" claim.** That behavior is Supabase-stack-specific, not a generic CLI capability. The skill now references the four real `migrate.policy` keys (`allow_prod_in_ci`, `require_clean_git`, `require_manual_approval`, `require_dry_run_first`) and explains how to wire them in `.autopilot/stack.md`.
10
+
11
+ ### Out of scope (filed as v7.10.0 + v7.11.0 candidates)
12
+ - Expand/contract migration classification (additive vs destructive enforcement)
13
+ - v6 run-state engine integration into the autopilot skill (4,873 LOC of checkpoint/resume infra currently unused by the skill)
14
+ - Retry-loop sameness detector ("same fingerprint twice → escalate to human")
15
+
5
16
  ## 7.9.0 — 2026-05-12
6
17
 
7
18
  ### Changed
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@delegance/claude-autopilot",
3
- "version": "7.9.0",
3
+ "version": "7.9.1",
4
4
  "type": "module",
5
5
  "publishConfig": {
6
6
  "tag": "next"
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: autopilot
3
- description: End-to-end pipeline — brainstorm → spec → plan → implement → migratevalidate → PR → Codex review → bugbot. Risk-tiered. No manual intervention after spec approval.
3
+ description: End-to-end pipeline — brainstorm → spec → plan → implement → validatemigrate dev → PR → Codex review → bugbot → merge. Risk-tiered. No manual intervention after spec approval until PR merge; production deploy/migration gates are handled by the user's CI/CD pipeline.
4
4
  ---
5
5
 
6
6
  # Autopilot — Idea to Merged PR Pipeline
@@ -209,26 +209,16 @@ For each task:
209
209
  - Skip formal spec/quality review to maintain speed (the validate step catches issues)
210
210
  - If subagent fails to write to worktree: implement directly
211
211
 
212
- ### Step 4: Auto-migrate
212
+ ### Step 4: Validate
213
213
 
214
- For any `.sql` files created in `data/deltas/` during implementation:
214
+ Run both checks in order. **Autofix runs first** so the static review sees the final post-autofix diff (otherwise the LLM review is stale on autofix mutations):
215
215
 
216
216
  ```bash
217
- /migrate
218
- ```
219
-
220
- Run against dev → QA → prod with auto-promote. If migration fails, fix the SQL and retry.
221
-
222
- ### Step 5: Validate
223
-
224
- Run both checks in order:
217
+ # 1. Full project validation FIRST (autofix, tests, codex, gate) — mutates files
218
+ npx tsx scripts/validate.ts --commit-autofix --allow-dirty
225
219
 
226
- ```bash
227
- # 1. Static rules + LLM review on changed files
220
+ # 2. THEN static rules + LLM review on the final diff
228
221
  npx autopilot run --base main
229
-
230
- # 2. Full project validation (autofix, tests, codex, gate)
231
- npx tsx scripts/validate.ts --commit-autofix --allow-dirty
232
222
  ```
233
223
 
234
224
  The `validate.ts` Phase 1 includes a **tsc regression check**: it runs `npx tsc --noEmit` against both the PR and the merge-base (cached at `.claude/.tsc-baseline-cache.json`) and surfaces only files where the PR introduces *new* TypeScript errors versus the baseline. Forward-pressure check — type errors are warnings, not blockers.
@@ -239,7 +229,73 @@ If either FAIL:
239
229
  - Re-run the failing check
240
230
  - Max 3 retry iterations
241
231
 
242
- If both PASS: proceed to PR.
232
+ If both PASS: proceed to Step 5.
233
+
234
+ ### Step 5: Migrate dev-only (verify SQL parses + applies)
235
+
236
+ For any `.sql` files created in `data/deltas/` during implementation, run the migration against the dev database ONLY. Always invoke explicitly — do not rely on the CLI default:
237
+
238
+ ```bash
239
+ /migrate --env=dev
240
+ ```
241
+
242
+ This verifies the SQL parses and applies against the real schema shape before the PR opens. Prod migration is **deferred to the user's CI/CD pipeline** after the PR merges — autopilot does NOT orchestrate prod deploys or migrations directly, because deployment ordering varies wildly across user infrastructure (ECS, CodeBuild, blue/green, manual approvals).
243
+
244
+ **Post-migration re-validate** (catches type/test failures that only surface after the schema actually changes):
245
+
246
+ ```bash
247
+ # Regenerate any stack-specific DB types if migration introduced schema changes
248
+ # (e.g. Supabase: scripts/gen-types.ts). Stack-specific — skip if N/A.
249
+ # REQUIRED for any stack with generated DB-typed code: missing regeneration leaves
250
+ # stale types and validate.ts may not catch runtime/schema mismatches if the PR
251
+ # does not directly touch the changed columns.
252
+
253
+ # Re-run validation against the post-migration dev state (no autofix this time):
254
+ npx tsx scripts/validate.ts --allow-dirty
255
+
256
+ # If migration or type generation produced new file changes, re-run the
257
+ # static/LLM review against the final diff — the Step 4 review is now stale.
258
+ git diff --quiet HEAD -- || npx autopilot run --base main
259
+ ```
260
+
261
+ > **v7.10+ candidate:** automate detection by reading a `post_migrate_dev: [...]` hook list from `.autopilot/stack.md` and failing Step 5 if schema changes are detected but no type-generation hook is configured. Today this is advisory — operator responsibility.
262
+
263
+ **Dev database drift** — `/migrate --env=dev` applies schema changes to dev BEFORE the PR is merged. If the PR is later rejected, substantially changed, or abandoned, dev will contain unmerged schema. Mitigations:
264
+
265
+ - **Preferred:** target an isolated/ephemeral dev database per branch (per-PR Supabase project, Postgres schema namespace, or local container).
266
+ - **Shared dev DB fallback:** before running Step 5, capture the current migration_state head; on PR abandonment, run a corrective migration that brings dev back to that head. Document the policy in `.autopilot/stack.md` so the team knows the cleanup contract.
267
+
268
+ If migration fails:
269
+ - **Before retrying:** confirm the dev migration rolled back cleanly. Some migration tools leave partial state on failure (non-transactional DDL, drift in migration_state table). If partial state exists, reset the dev database to the pre-migration baseline OR write a corrective migration that brings dev back to a known state.
270
+ - Fix the SQL
271
+ - Re-run Step 4 (validate) against the corrected code
272
+ - Re-run Step 5 (migrate dev)
273
+ - Max 3 retry iterations
274
+
275
+ **For prod-safety policy**, projects should set in `.autopilot/stack.md`:
276
+
277
+ ```yaml
278
+ migrate:
279
+ skill: <your-skill>@1
280
+ policy:
281
+ require_dry_run_first: true # always preview before apply
282
+ require_manual_approval: true # CI gates prod on human approval
283
+ require_clean_git: true # never apply against dirty tree
284
+ ```
285
+
286
+ As of v7.9.1, the valid keys per `presets/schemas/migrate.schema.json` are: `allow_prod_in_ci`, `require_clean_git`, `require_manual_approval`, `require_dry_run_first`. If the schema changes in a future release, that file is the source of truth.
287
+
288
+ These keys are **declarative policy inputs**, not autopilot enforcement. Your CI/CD pipeline must explicitly invoke a migrate runner that reads `.autopilot/stack.md` AND enforces equivalent checks (e.g. require-clean-git, dry-run-before-apply, manual-approval-for-prod). Setting them in stack.md alone does not protect production unless your pipeline reads them.
289
+
290
+ **Minimum CI/CD migrate-runner contract** (fail closed on all of these):
291
+
292
+ 1. Refuse to apply if `.autopilot/stack.md` cannot be read or parsed.
293
+ 2. Refuse to apply if the policy block contains unknown keys (schema-validate against `presets/schemas/migrate.schema.json`).
294
+ 3. If `require_dry_run_first: true`: refuse apply without a matching dry-run artifact for the current git head + target env.
295
+ 4. If `require_manual_approval: true` and `env != dev`: require an explicit human approval signal (CI approval gate, signed commit, etc.) before apply.
296
+ 5. If `require_clean_git: true`: refuse to apply against a dirty working tree (untracked or unstaged changes). This is intentionally limited to working-tree cleanliness — commit topology (squash vs rebase vs merge vs tag) is a separate concern not enforced by this key.
297
+ 6. If `allow_prod_in_ci: false` (the default): refuse prod apply from any CI context. **Note:** teams whose intended prod migration path is CI/CD with manual approval must explicitly set `allow_prod_in_ci: true` alongside `require_manual_approval: true` and `require_dry_run_first: true`. `allow_prod_in_ci: false` is for manual/operator-run production migration workflows only.
298
+ 7. On any policy-read or schema-validation failure, exit non-zero and surface the specific check that failed.
243
299
 
244
300
  ### Step 6: Push + create PR
245
301
 
@@ -289,7 +345,8 @@ Tell the user:
289
345
  - **Preflight failure:** Surface the specific check, exit. Do not partially run.
290
346
  - **Missing skill/credential:** Exit with install/auth hint.
291
347
  - **Subagent failure:** Re-dispatch with more context or implement directly.
292
- - **Migration failure:** Fix SQL, re-run `/migrate`.
348
+ - **Migration failure (Step 5 — dev only):** Fix SQL, re-run Step 4 (validate) against corrected code, then re-run Step 5 (migrate dev). Prod stays untouched. Max 3 retries.
349
+ - **Prod migration:** Out of scope for autopilot. Handed off to user's CI/CD pipeline via `migrate.policy` (require_manual_approval, require_dry_run_first).
293
350
  - **Validate failure:** Fix issues, re-run (max 3 retries).
294
351
  - **Codex CRITICAL findings:** REMEDIATE (apply fix), push, re-review (max 2 retries). Do NOT continue past unremediated CRITICALs.
295
352
  - **Bugbot findings:** `/bugbot` handles triage + fix automatically (max 3 rounds).