mustflow 2.107.3 → 2.108.0

Files changed (42)
  1. package/README.md +1 -0
  2. package/dist/cli/commands/init.js +49 -1
  3. package/dist/cli/commands/run/execution.js +7 -0
  4. package/dist/cli/commands/run/executor.js +7 -0
  5. package/dist/cli/commands/verify.js +14 -0
  6. package/dist/cli/commands/workspace.js +106 -16
  7. package/dist/cli/i18n/en.js +6 -1
  8. package/dist/cli/i18n/es.js +6 -1
  9. package/dist/cli/i18n/fr.js +6 -1
  10. package/dist/cli/i18n/hi.js +6 -1
  11. package/dist/cli/i18n/ko.js +6 -1
  12. package/dist/cli/i18n/zh.js +6 -1
  13. package/dist/cli/index.js +8 -0
  14. package/dist/cli/lib/agent-context.js +7 -0
  15. package/dist/cli/lib/repo-map.js +14 -0
  16. package/dist/cli/lib/run-plan.js +7 -0
  17. package/dist/core/change-verification.js +7 -0
  18. package/dist/core/verification-scheduler.js +7 -0
  19. package/package.json +1 -1
  20. package/schemas/README.md +3 -3
  21. package/schemas/workspace-status.schema.json +4 -2
  22. package/templates/default/common/.mustflow/config/mustflow.toml +3 -3
  23. package/templates/default/i18n.toml +61 -7
  24. package/templates/default/locales/en/.mustflow/docs/agent-workflow.md +24 -1
  25. package/templates/default/locales/en/.mustflow/skills/INDEX.md +51 -5
  26. package/templates/default/locales/en/.mustflow/skills/admin-control-plane-safety-review/SKILL.md +200 -0
  27. package/templates/default/locales/en/.mustflow/skills/ai-product-readiness-review/SKILL.md +158 -0
  28. package/templates/default/locales/en/.mustflow/skills/auth-permission-change/SKILL.md +91 -28
  29. package/templates/default/locales/en/.mustflow/skills/browser-automation-reliability-review/SKILL.md +279 -0
  30. package/templates/default/locales/en/.mustflow/skills/cli-option-contract-review/SKILL.md +147 -0
  31. package/templates/default/locales/en/.mustflow/skills/database-change-safety/SKILL.md +21 -2
  32. package/templates/default/locales/en/.mustflow/skills/database-migration-change/SKILL.md +25 -7
  33. package/templates/default/locales/en/.mustflow/skills/deployment-rollout-safety-review/SKILL.md +117 -43
  34. package/templates/default/locales/en/.mustflow/skills/frontend-component-library-review/SKILL.md +299 -0
  35. package/templates/default/locales/en/.mustflow/skills/frontend-localization-review/SKILL.md +128 -36
  36. package/templates/default/locales/en/.mustflow/skills/notification-delivery-integrity-review/SKILL.md +226 -0
  37. package/templates/default/locales/en/.mustflow/skills/payment-integrity-review/SKILL.md +34 -14
  38. package/templates/default/locales/en/.mustflow/skills/routes.toml +54 -0
  39. package/templates/default/locales/en/.mustflow/skills/small-service-platform-architecture-review/SKILL.md +273 -0
  40. package/templates/default/locales/en/.mustflow/skills/third-party-api-integration-review/SKILL.md +188 -0
  41. package/templates/default/locales/en/.mustflow/skills/website-task-friction-review/SKILL.md +139 -0
  42. package/templates/default/manifest.toml +60 -1
@@ -2,11 +2,11 @@
2
2
  mustflow_doc: skill.database-migration-change
3
3
  locale: en
4
4
  canonical: true
5
- revision: 3
5
+ revision: 4
6
6
  lifecycle: mustflow-owned
7
7
  authority: procedure
8
8
  name: database-migration-change
9
- description: Apply this skill when database migration files, schema migration history, ORM schema migrations, generated clients, schema dumps, SQL snapshots, online DDL, large indexes, constraints, state-dependent CHECK constraints, backfills, rolling deploy compatibility, expand-and-contract changes, destructive database changes, migration rollback or roll-forward claims, cut-over plans, lock or timeout policy, replication lag risk, migration observability, or production database migration procedures are created, changed, reviewed, or reported.
9
+ description: Apply this skill when database migration files, schema migration history, ORM schema migrations, generated clients, schema dumps, SQL snapshots, online DDL, large indexes, constraints, state-dependent CHECK constraints, background-job backfills, zero-downtime migration claims, rolling deploy compatibility, expand-and-contract changes, destructive database changes, migration rollback or roll-forward claims, cut-over plans, feature-flagged read/write switches, lock or timeout policy, replication lag risk, migration observability, or production database migration procedures are created, changed, reviewed, or reported.
10
10
  metadata:
11
11
  mustflow_schema: "1"
12
12
  mustflow_kind: procedure
@@ -33,13 +33,14 @@ Keep database migrations safe for running systems by checking deploy compatibili
33
33
 
34
34
  Do not treat migration authoring as "make a file that applies locally." Treat it as "old code and new code must survive the same database during rollout."
35
35
  Migration incidents usually happen in the interval where old code, new code, old data, and new data are all alive at once. Design that interval first.
36
+ Do not collapse schema expansion, data backfill, read or write cut-over, and destructive cleanup into one deploy-time migration just because that worked on a developer database.
36
37
 
37
38
  <!-- mustflow-section: use-when -->
38
39
  ## Use When
39
40
 
40
41
  - A database migration file, migration history entry, schema dump, ORM schema, SQL snapshot, generated client, seed, fixture, schema validator, or migration documentation is created or changed.
41
42
  - A change adds, removes, renames, splits, merges, backfills, rewrites, validates, constrains, indexes, foreign-keys, type-changes, defaults, nullable rules, enum values, tables, columns, generated columns, triggers, views, functions, row-level policies, or data migrations.
42
- - A task mentions rolling deploy, expand-and-contract, online migration, backfill, production schema change, rollback, roll-forward, down migration, migration lock, lock timeout, statement timeout, DDL transaction, `CREATE INDEX CONCURRENTLY`, MySQL `ALGORITHM=INSTANT`, MySQL `LOCK=NONE`, generated ORM client, migration drift, schema drift, or database migration safety.
43
+ - A task mentions rolling deploy, zero-downtime migration, expand-and-contract, online migration, long-running migration, background job backfill, feature flag migration, dual-write, dual-read, compatibility read fallback, production schema change, rollback, roll-forward, down migration, migration lock, lock timeout, statement timeout, DDL transaction, `CREATE INDEX CONCURRENTLY`, MySQL `ALGORITHM=INSTANT`, MySQL `LOCK=NONE`, generated ORM client, migration drift, schema drift, or database migration safety.
43
44
  - Prisma, Drizzle, TypeORM, Rails Active Record, Django migrations, Alembic, Diesel, Ecto, Flyway, Liquibase, Knex, Sequelize, SQLx, or another migration tool changes schema, generated output, migration metadata, or deployment behavior.
44
45
  - A final report claims a database migration is safe, reversible, applied, validated, production-ready, no-downtime, rollback-safe, or tested from an old schema.
45
46
 
@@ -58,6 +59,8 @@ Migration incidents usually happen in the interval where old code, new code, old
58
59
  - Deployment shape: single-step deploy, rolling deploy, blue-green, multiple app versions, background workers, read replicas, multiple services, serverless functions, mobile clients, or external integrations.
59
60
  - Database engine and operational surface: PostgreSQL, MySQL, SQLite, SQL Server, managed database, migration lock behavior, DDL transaction behavior, online DDL options, table size, write load, long-running transactions, replication or CDC topology, expected lock time, statement timeout, lock timeout, and restore capability when known.
60
61
  - Data preservation needs, compatibility window, backfill size, batch strategy, cursor or checkpoint marker, validation query, observability query, rollback or roll-forward type, cut-over control, and whether old code can run after the new schema lands.
62
+ - Application transition controls: feature flags, tenant gates, read fallback, dual-write window, old-write cutoff, old-read cutoff, worker rollout order, admin/reporting/BI dependency review, and how to disable the new path without restoring the database.
63
+ - Production runbook boundary: execution owner, intended window, expected lock time, expected replication lag, metrics to watch, stop or pause thresholds, retry policy, partial-apply handling, customer-impact communication trigger, and manual approval points when relevant.
61
64
  - State and timestamp invariant matrix when a migration introduces lifecycle statuses, terminal
62
65
  timestamps, retry or dead-letter states, delivery states, soft-delete states, approval states, or
63
66
  other columns whose valid nullability depends on status.
@@ -78,6 +81,7 @@ Migration incidents usually happen in the interval where old code, new code, old
78
81
 
79
82
  - Update migration files, ORM schema files, generated client expectations, schema dumps, SQL snapshots, seeds, fixtures, compatibility code, backfill code, validation checks, docs, and tests directly required by the migration.
80
83
  - Prefer expand-and-contract for live systems: add compatible shape, dual-write or compatibility-read where needed, backfill safely, switch reads and writes, then contract only after compatibility is proven.
84
+ - Move long-running data rewrites out of deploy-time schema migrations into bounded, restartable background jobs when production-sized data or live traffic can be affected.
81
85
  - Keep destructive cleanup separate from expansion unless the repository explicitly proves a single-step deployment is safe.
82
86
  - Do not weaken tests, delete migration history, hand-edit generated client output, suppress migration drift, or claim rollback safety for lossy changes.
83
87
 
@@ -93,19 +97,21 @@ Migration incidents usually happen in the interval where old code, new code, old
93
97
  - Django: migration files, state operations, historical models, schema editor behavior, generated SQL when relevant, and data migration functions.
94
98
  - Alembic or SQLAlchemy: migration revisions, autogenerate output, branch heads, model metadata, downgrade functions, naming conventions, and generated SQL.
95
99
  - Diesel, Ecto, Flyway, Liquibase, Knex, Sequelize, SQLx, and raw SQL: migration history, checked-in SQL, generated metadata, compile-time query checks, rollback files, and schema dumps.
96
- 3. Build a migration ledger: old shape, new shape, rows affected, old code behavior, new code behavior, rollback expectation, generated artifact changes, dependent callers, and validation query.
100
+ 3. Build a migration ledger: old shape, new shape, rows affected, old code behavior, new code behavior, worker and batch behavior, admin/reporting/BI behavior, rollback expectation, generated artifact changes, dependent callers, and validation query.
97
101
  4. Classify compatibility.
98
102
  - Old code on old schema.
99
103
  - Old code on expanded schema.
100
104
  - New code on expanded schema.
101
105
  - New code after backfill.
102
106
  - New code after contract.
107
+ - Old background workers, cron jobs, admin tools, reporting queries, and external integrations during the same window.
103
108
  If any required state fails, the migration is not rolling-deploy safe.
104
109
  5. Split the deployment plan into expand, backfill, switch, and contract phases.
105
110
  - Expansion adds shapes old code can ignore and new code can start writing.
106
- - Backfill is bounded, restartable, idempotent, observable, and separately validated.
107
- - Switch changes read paths through a feature flag, rollout gate, tenant gate, or compatible deploy step where possible.
111
+ - Backfill is bounded, restartable, idempotent, observable, separately validated, and separated from the deployment pipeline when it can run long.
112
+ - Switch changes read and write paths through a feature flag, rollout gate, tenant gate, or compatible deploy step where possible.
108
113
  - Contract removes old shapes only after at least one compatibility window proves no code, job, report, or manual SQL still depends on them.
114
+ - A single migration file that expands, rewrites data, flips reads, and drops old structures is not zero-downtime evidence unless the repository proves the single-step path is safe for its deployment model.
109
115
  6. For column add, decide nullability, default behavior, backfill strategy, write path, read fallback, index need, and when a future `NOT NULL` or constraint can be enforced.
110
116
  - Add nullable first unless a proven engine/version/table-size path makes the non-null default safe.
111
117
  - Do not assume a database default backfills existing rows or matches ORM, API, batch, or application defaults.
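
The compatibility classification in step 4 of this hunk can be sketched as a small check. This is a minimal illustration, not part of the mustflow package: the state names, `REQUIRED_STATES`, and `classify_rolling_deploy` are hypothetical, and treating an unclassified state as a failure is an assumption made here.

```python
# Hypothetical state names mirroring the compatibility list in step 4.
REQUIRED_STATES = (
    "old_code_old_schema",
    "old_code_expanded_schema",
    "new_code_expanded_schema",
    "new_code_after_backfill",
    "new_code_after_contract",
    "old_workers_and_integrations_same_window",
)


def classify_rolling_deploy(results: dict[str, bool]) -> tuple[bool, list[str]]:
    """Return (safe, blocking_states).

    Assumption: a state that was never classified counts as failing,
    so an incomplete review cannot be reported as rolling-deploy safe.
    """
    blocking = [state for state in REQUIRED_STATES if not results.get(state, False)]
    return (not blocking, blocking)
```

A caller would fill `results` from its own review evidence and report the blocking states instead of a bare pass or fail.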
@@ -139,6 +145,10 @@ Migration incidents usually happen in the interval where old code, new code, old
139
145
  - Partition attach can scan existing rows unless a suitable `CHECK` constraint proves the range first.
140
146
  - Table split, table merge, or relationship rewrite must preserve stable identifiers, foreign keys, audit references, external IDs, permissions, search documents, exports, and old-to-new mapping until all callers switch.
141
147
  15. For backfills, make them bounded, restartable, observable, and validated. Define batch size, cursor-based ordering key such as `id > last_id`, checkpoint, retry behavior, idempotency, timeout, lock expectation, throttle or pause/resume control, dead-letter or manual review behavior, and validation queries.
148
+ - Keep long-running data rewrites out of deploy-time migrations unless the affected row count, lock behavior, WAL/binlog or undo impact, replication lag, and timeout behavior prove the operation is short and bounded.
149
+ - Commit in small batches instead of one huge transaction when live data volume can be large.
150
+ - Process only rows that still need work, so reruns and retries cannot corrupt already migrated rows.
151
+ - Track progress with a durable cursor or checkpoint; do not rely on offset pagination for mutable production tables.
142
152
  16. Do not run or recommend full-table updates on production-sized data without measured volume, lock expectation, WAL or undo impact, replication lag risk, batch plan, timeout policy, and recovery plan.
143
153
  17. Review replication, CDC, and long-running transaction interactions.
144
154
  - Online DDL can leave replicas, read traffic, backups, CDC connectors, or failover readiness behind even when the primary looks healthy.
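
The backfill guidance in step 15 of this hunk (small committed batches, an `id > last_id` cursor, a durable checkpoint, and touching only rows that still need work) can be sketched as follows. This is a minimal illustration using Python's standard `sqlite3` module, not code from the mustflow package. The `users` and `backfill_checkpoint` tables, the `display_name` column, and all function names are hypothetical. A production job would also need throttling, timeouts, and retry handling.

```python
import sqlite3

JOB = "users_display_name"  # hypothetical checkpoint key


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with a half-migrated table (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT, display_name TEXT);
        CREATE TABLE backfill_checkpoint (job TEXT PRIMARY KEY, last_id INTEGER NOT NULL);
        INSERT INTO users (id, full_name, display_name) VALUES
            (1, 'Ada', NULL), (2, 'Bo', NULL), (3, 'Cy', 'custom'),
            (4, 'Di', NULL), (5, 'Ed', NULL);
        INSERT INTO backfill_checkpoint (job, last_id) VALUES ('users_display_name', 0);
        """
    )
    return conn


def backfill_batch(conn, last_id, batch_size):
    """Process one bounded batch; return the new cursor, or None when nothing is left."""
    rows = conn.execute(
        "SELECT id, full_name FROM users "
        "WHERE id > ? AND display_name IS NULL ORDER BY id LIMIT ?",
        (last_id, batch_size),
    ).fetchall()
    if not rows:
        return None
    with conn:  # one small transaction per batch, committed on success
        for row_id, full_name in rows:
            # The IS NULL guard keeps reruns from overwriting migrated rows.
            conn.execute(
                "UPDATE users SET display_name = ? WHERE id = ? AND display_name IS NULL",
                (full_name, row_id),
            )
        new_cursor = rows[-1][0]
        # The checkpoint commits together with the batch it describes.
        conn.execute(
            "UPDATE backfill_checkpoint SET last_id = ? WHERE job = ?",
            (new_cursor, JOB),
        )
    return new_cursor


def run_backfill(conn, batch_size=2):
    """Resume from the stored checkpoint; return the number of batches processed."""
    last_id = conn.execute(
        "SELECT last_id FROM backfill_checkpoint WHERE job = ?", (JOB,)
    ).fetchone()[0]
    batches = 0
    while True:
        new_cursor = backfill_batch(conn, last_id, batch_size)
        if new_cursor is None:
            return batches
        last_id = new_cursor
        batches += 1
```

Because the checkpoint is updated in the same transaction as the batch, a crash between batches leaves the cursor pointing at the last committed row, and a rerun picks up from there without rewriting finished rows.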
@@ -150,14 +160,17 @@ Migration incidents usually happen in the interval where old code, new code, old
150
160
  - Monitor dual-write mismatch and sample old/new values during the compatibility window; code intent is not proof that every path writes both sides.
151
161
  19. Prepare observability before apply.
152
162
  - Pair the migration with read-only progress and safety queries for lock waits, index build progress, replication lag, backfill cursor, skipped rows, failed rows, duplicate rows, missing rows, dead tuples, or estimated remaining range when the engine supports them.
163
+ - Watch application error rate, p95 or p99 latency, connection pressure, fallback-read rate, dual-write mismatch rate, and critical business event failures when the migration changes a live request path.
153
164
  - Log or report dry-run selection counts, apply counts, skip reasons, batch durations, and recovery handles.
154
165
  - A final `done` line is not enough evidence for a live migration.
166
+ - Prepare a runbook before apply. It should name the operator, execution window, expected duration, expected lock and replication behavior, stop thresholds, pause or abort action, partial-apply behavior, code rollback order, feature-flag fallback, validation queries, and customer-impact communication trigger.
155
167
  20. Decide rollback honestly and prefer roll-forward for partial live changes.
156
168
  - Reversible: schema-only and data-preserving.
157
169
  - App rollback: old and new code both tolerate the expanded shape, so the read path can move back without losing new writes.
158
170
  - Forward-fix preferred: partial live migration can be corrected without restoring.
159
171
  - Restore required: deletes, table merges, generated IDs, hashing, encryption, irreversible type conversions, external side effects, or lossy transforms.
160
172
  Do not promise rollback for changes that cannot reconstruct old values.
173
+ Treat backups as disaster recovery evidence, not ordinary deploy rollback, unless a restore drill proves that restoring the database would not lose acceptable live writes, external side effects, or dependent service state.
161
174
  21. Keep external side effects out of database migrations unless the repository has an explicit recovery model. Sending emails, calling payment APIs, deleting files, or mutating external providers from a migration usually breaks rollback.
162
175
  22. Check generated surfaces after schema changes: ORM clients, types, SQL snapshots, schema dumps, OpenAPI or GraphQL projections, API mocks, fixtures, seeds, admin screens, analytics, ETL, BI queries, and docs examples.
163
176
  23. Review ORM-specific traps.
@@ -180,11 +193,13 @@ Migration incidents usually happen in the interval where old code, new code, old
180
193
  - Source schema, target schema, migration files, generated artifacts, schema dumps, seeds, fixtures, and dependent code agree.
181
194
  - Expand, backfill, switch, and contract phases are separated or explicitly proven unnecessary.
182
195
  - Old-code/new-schema and new-code/expanded-schema compatibility is classified.
196
+ - Read-path fallback, write-path transition, dual-write mismatch detection, feature-flag control, and old worker/admin/reporting dependency review are explicit when a live rollout can overlap versions.
183
197
  - Backfill and validation behavior is cursor-based or otherwise bounded, restartable, idempotent, observable, and checkable where relevant.
184
198
  - State-dependent CHECK constraints, terminal timestamp exclusivity, and valid nullability matrices
185
199
  are explicit where status columns can otherwise contradict timestamp or reason columns.
186
200
  - Lock levels, online DDL support, long-running transaction waits, replication lag, cut-over control, timeout policy, and observability queries are explicit where production data may be affected.
187
201
  - Rollback claims distinguish schema rollback, data rollback, app rollback, roll-forward, forward-fix, and restore-required cases.
202
+ - Production runbook stop thresholds, pause or abort behavior, partial-apply handling, and communication triggers are explicit where the migration can affect live service behavior.
188
203
  - Destructive changes and production lock risks are either deferred, measured, guarded, or reported as remaining risk.
189
204
 
190
205
  <!-- mustflow-section: verification -->
@@ -212,6 +227,8 @@ Prefer configured migration dry-run, generated-output, schema-diff, or database
212
227
  - If online DDL support, long-running transaction behavior, replication lag, or cut-over control is unknown, report the migration as operationally unproven.
213
228
  - If an autogenerator proposes drop/create for a rename, stop and rewrite the migration plan.
214
229
  - If a migration is lossy, do not claim rollback beyond restore or forward corrective migration.
230
+ - If rollback depends only on a backup restore, label it disaster recovery instead of deploy rollback and report live-write loss or external-state reconciliation risk.
231
+ - If the migration plan lacks feature-flag fallback, read/write cut-over order, stop thresholds, or partial-apply handling for a live rollout, do not call it zero-downtime.
215
232
  - If a backfill is not idempotent, restartable, observable, and throttled or bounded, keep it out of a production migration claim.
216
233
  - If generated clients or schema dumps drift, fix the source of truth and regenerated surfaces together.
217
234
  - If configured verification is missing, report the missing command intent instead of inferring package-manager, ORM, or migration-tool commands.
@@ -224,7 +241,8 @@ Prefer configured migration dry-run, generated-output, schema-diff, or database
224
241
  - Source schema, target schema, and migration phase
225
242
  - Old-code/new-schema and new-code/expanded-schema compatibility
226
243
  - Expand/backfill/switch/contract plan and destructive cleanup timing
227
- - Backfill cursor, idempotency, throttle, pause/resume, validation, lock, timeout, replication, cut-over, and observability classification
244
+ - Read/write transition, feature-flag fallback, dual-write or compatibility-read window, old worker/admin/reporting dependency review
245
+ - Backfill cursor, idempotency, throttle, pause/resume, validation, lock, timeout, replication, cut-over, runbook stop threshold, and observability classification
228
246
  - Status, timestamp, CHECK constraint, and existing-row validation matrix where relevant
229
247
  - Rollback, app rollback, roll-forward, forward-fix, and restore-required classification
230
248
  - ORM/generated client/schema dump/snapshot surfaces synchronized
@@ -2,11 +2,11 @@
2
2
  mustflow_doc: skill.deployment-rollout-safety-review
3
3
  locale: en
4
4
  canonical: true
5
- revision: 2
5
+ revision: 3
6
6
  lifecycle: mustflow-owned
7
7
  authority: procedure
8
8
  name: deployment-rollout-safety-review
9
- description: Apply this skill when server, backend, worker, scheduler, queue consumer, cron, container, VM, serverless, migration, config, feature-flag, cache, deployment pipeline, canary, rollback, release envelope, image digest, deployment history, traffic rollback, health check, readiness/liveness/startup probe, graceful shutdown, artifact promotion, release observability, or post-deploy smoke behavior is created, changed, reviewed, or reported and the risk is whether a deployment can be rolled out, stopped, observed, or rolled back safely.
9
+ description: Apply this skill when CI/CD pipeline gates, required checks, preview deploys, migration checks, server, backend, worker, scheduler, queue consumer, cron, container, VM, serverless, IaC, config, secret handling, feature-flag, cache, canary, rollback, release envelope, image digest, deployment history, traffic rollback, health check, readiness/liveness/startup probe, graceful shutdown, artifact promotion, release observability, or post-deploy smoke behavior is created, changed, reviewed, or reported and the risk is whether a deployment can be rolled out, stopped, observed, or rolled back safely.
10
10
  metadata:
11
11
  mustflow_schema: "1"
12
12
  mustflow_kind: procedure
@@ -29,13 +29,15 @@ metadata:
29
29
  <!-- mustflow-section: purpose -->
30
30
  ## Purpose
31
31
 
32
- Review a server deployment as a runtime state transition, not as "the build passed".
32
+ Review a server deployment as a runtime state transition, not as "the build passed" or
33
+ "CI is green".
33
34
 
34
35
  Deployment failures usually come from changed ordering, config, data, cache, queues, rollback shape,
35
- permissions, and observability. Rollback is not just restarting an older container. It is the older
36
- release surviving today's data, config, cache, queue messages, external side effects, and traffic
37
- state. This skill makes the agent prove that a wrong deploy has small blast radius, is detected
38
- quickly, can be stopped quickly, and can be rolled back without inventing a recovery plan during the
36
+ permissions, secrets, infrastructure policy, human approvals, and observability. Rollback is not
37
+ just restarting an older container. It is the older release surviving today's data, config, cache,
38
+ queue messages, external side effects, secret versions, and traffic state. This skill makes the
39
+ agent prove that a wrong deploy has small blast radius, is detected quickly, can be stopped quickly,
40
+ and can be rolled back or safely rolled forward without inventing a recovery plan during the
39
41
  incident.
40
42
 
41
43
  <!-- mustflow-section: use-when -->
@@ -43,10 +45,12 @@ incident.
43
45
 
44
46
  - A change touches server deployment, backend runtime behavior, workers, schedulers, cron, queue
45
47
  consumers, containers, VMs, serverless functions, Kubernetes manifests, process managers,
46
- deployment pipelines, release gates, canaries, release envelopes, or rollback procedures.
48
+ CI/CD workflows, preview environments, deployment pipelines, release gates, canaries, release
49
+ envelopes, or rollback procedures.
47
50
  - A change touches DB migration order, config or env vars, feature flags, cache keys, queue or topic
48
51
  message formats, external API dependencies, storage paths, startup probes, readiness probes,
49
- liveness probes, graceful shutdown, worker drain, deployment locks, or post-deploy smoke checks.
52
+ liveness probes, graceful shutdown, worker drain, deployment locks, secret scope, IaC policy, or
53
+ post-deploy smoke checks.
50
54
  - A review needs to decide whether code, config, database, cache, queue, and scheduler changes can
51
55
  coexist across old and new versions while traffic is still moving.
52
56
 
@@ -77,11 +81,26 @@ incident.
77
81
  - Deployment model: environment order, artifact promotion path, rolling/blue-green/canary strategy,
78
82
  traffic rollback path, old-version retention, deployment history retention, deployment
79
83
  concurrency, and deployment lock owner.
84
+ - Pipeline gate model: required status checks, branch protection or merge gate, reviewer or
85
+ CODEOWNER gate, fast-fail ordering, clean CI environment, test data setup, flaky-test quarantine
86
+ owner, artifact retention, and security scan coverage.
87
+ - Preview model: preview URL, isolated namespace or schema, ephemeral database or seeded data,
88
+ teardown policy, cost guardrail, deployment protection, and proof that production secrets and
89
+ production PII are not exposed.
90
+ - Migration check model: dry-run or shadow database evidence, destructive-change detection,
91
+ lock or table-rewrite risk, online DDL assumptions, backfill separation, N-1 app compatibility,
92
+ and rollback or roll-forward boundary.
93
+ - Secret and approval model: environment-scoped secrets, production approval gate, OIDC or
94
+ short-lived credential path, log masking, rotation or revocation path, external secret store or
95
+ encryption-at-rest boundary, and least-privilege deploy job permissions.
80
96
  - Compatibility model: old code with old data, old code with new data, new code with old data, new
81
97
  code with new data, N-1 message compatibility, cache key version, and rollback survivability.
82
98
  - Runtime control model: startup/liveness/readiness probes, graceful shutdown behavior, load
83
99
  balancer connection draining, worker drain, cron duplicate guard, kill switch, safe flag defaults,
84
100
  automatic stop conditions, synthetic transactions, and post-deploy observation window.
101
+ - Infrastructure and observability model: rendered manifests or plan output, policy checks, image
102
+ digest pinning, resource requests and limits, probe presence, privileged or hostPath use,
103
+ plaintext-secret checks, release-labeled logs, metrics, traces, alerts, and dashboard slices.
85
104
 
86
105
  <!-- mustflow-section: preconditions -->
87
106
  ## Preconditions
@@ -99,6 +118,9 @@ incident.
99
118
  - Add or update deploy runbooks, release checklists, pipeline metadata, smoke tests, probe tests,
100
119
  config validation, feature-flag defaults, rollback notes, canary gates, and deployment safety
101
120
  tests when they match the repository style.
121
+ - Add or update preview-deploy guardrails, migration preflight notes, manifest or IaC validation
122
+ fixtures, secret-scope checks, required-check documentation, and release evidence templates when
123
+ they are already within the task scope.
102
124
  - Add local code guards for startup config validation, readiness separation, shutdown handling,
103
125
  worker drain, cache-key versioning, and deployment attribution when the task scope includes the
104
126
  affected runtime code.
@@ -115,131 +137,171 @@ incident.
115
137
  external APIs, permissions, config, and feature flags. If the ledger is unknown, report that the
116
138
  deployment blast radius is unknown instead of guessing.
117
139
 
118
- 2. Build the release envelope.
140
+ 2. Check the merge and CI gate before the deploy gate.
141
+ Required status checks, protected branches, review or CODEOWNER gates, and stale-review handling
142
+ should stop unreviewed or partially verified production changes. Order checks so cheap failures
143
+ happen before slow checks: format, lint, typecheck, unit tests, build, security scans,
144
+ integration or contract tests, then focused end-to-end smoke. Require a clean reproducible CI
145
+ environment, explicit test database setup, seeded data, dependency and container scan coverage,
146
+ IaC or configuration scans when those files changed, secret scanning, and a named owner or SLA for
147
+ flaky-test quarantine. A flaky test silently bypassed by the merge gate is a deployment-safety
148
+ risk, not a testing nuisance.
149
+
150
+ 3. Build the release envelope.
119
151
  Rollback needs more than an image tag. Bind image digest, chart or manifest revision, values
120
152
  file hash, ConfigMap name or version, Secret version, migration range, feature flag snapshot,
121
153
  ingress or router weight, deployer, and deployment time under one `release_id` when the platform
122
154
  exposes those facts. If only a mutable tag such as `latest` identifies production, report that
123
155
  rollback identity is not reproducible.
124
156
 
125
- 3. Separate artifact promotion from environment rebuilds.
157
+ 4. Make preview deployment a rehearsal, not just a link.
158
+ Preview deploys should use an isolated namespace, schema, or ephemeral database with explicit
159
+ seed data and teardown. Do not allow production secrets, production OAuth callbacks, production
160
+ webhooks, or production PII in preview. Name deployment protection, access control, teardown
161
+ failure alerts, and cost cleanup. Run or identify the preview checks that matter for the changed
162
+ flow: smoke, selected E2E, API compatibility, accessibility, visual regression, and thin DAST or
163
+ security probes where the repository owns them.
164
+
165
+ 5. Separate artifact promotion from environment rebuilds.
126
166
  Verify whether staging and production receive the same artifact identity. Prefer promoting one
127
- built artifact over rebuilding per environment. If per-environment rebuild is unavoidable, name
128
- the drift risk and require commit SHA, dependency lock, image digest, and build input evidence.
129
- Treat image tags as human labels and image digests as the rollback proof.
167
+ built immutable artifact over rebuilding per environment. If per-environment rebuild is
168
+ unavoidable, name the drift risk and require commit SHA, dependency lock, build run, SBOM or
169
+ package inventory when available, image digest, and build input evidence. Treat image tags as
170
+ human labels and image digests as the rollback proof.
130
171
 
131
- 4. Preserve rollback history and warm capacity.
172
+ 6. Preserve rollback history and warm capacity.
132
173
  For platform-managed rollbacks, check whether old revisions, ReplicaSets, Helm release history,
133
174
  blue/green environments, or rollout controller history are retained long enough to be useful.
134
175
  Do not treat a rollback path as safe if the old version is immediately scaled to zero, cold, or
135
176
  deleted. Prefer traffic rollback to an already-warm old version before rebuilding or pulling a
136
177
  replacement during an incident.
137
178
 
138
- 5. Split deploy order from migration order.
179
+ 7. Split deploy order from migration order.
139
180
  For DB changes, check expand/migrate/read-write/switch/contract sequencing. New code must tolerate
140
181
  old data, old code must tolerate expanded schema or new data, and contraction must wait until old
141
182
  code is impossible. Check migration lock timeout, batch size, retryability, partial progress
142
183
  markers, rollback preview evidence, point-in-time recovery practice, database config backup, and
143
- rollback limits. Treat destructive rollback SQL as a data-loss risk, not a recovery guarantee.
184
+ rollback limits. Require dry-run or shadow database evidence when the repository has a migration
185
+ surface. Treat destructive rollback SQL as a data-loss risk, not a recovery guarantee.
144
186
 
145
- 6. Treat config changes as code changes.
187
+ 8. Treat config and IaC changes as code changes.
146
188
  Diff config and environment variables. Require startup validation for missing, misspelled, empty,
147
189
  malformed, or incompatible values. A service should fail fast before accepting traffic when a
148
190
  required setting is unsafe. Prefer versioned or immutable config names over in-place config
149
191
  mutation, and name whether env-var, mounted-file, or subpath-style config updates require pod or
150
- process restart.
151
-
152
- 7. Separate startup, liveness, and readiness.
192
+ process restart. For infrastructure files, review rendered manifests or plans rather than only
193
+ source snippets. Check resource requests and limits, probes, privileged mode, hostPath mounts,
194
+ plaintext secrets, IAM or service account scope, image digest pinning, and policy validation
195
+ coverage.
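As an illustration of the startup validation that step 8 asks for, here is a minimal Python sketch. The setting names (`DATABASE_URL`, `PORT`, `LOG_LEVEL`) and the allowed log levels are hypothetical and do not come from the package; the point is only that missing, empty, and malformed values are all reported before the service accepts traffic.

```python
from urllib.parse import urlparse

# Hypothetical settings used only for illustration.
REQUIRED_SETTINGS = ("DATABASE_URL", "PORT", "LOG_LEVEL")
ALLOWED_LOG_LEVELS = ("debug", "info", "warning", "error")


def validate_config(env):
    """Return a list of problems; an empty list means the config is usable."""
    problems = []

    # Missing or empty required values.
    for name in REQUIRED_SETTINGS:
        if not env.get(name, "").strip():
            problems.append(f"{name} is missing or empty")

    # Malformed values are only checked when a value is present.
    url = env.get("DATABASE_URL", "").strip()
    if url:
        parsed = urlparse(url)
        if not (parsed.scheme and parsed.netloc):
            problems.append("DATABASE_URL is not a valid URL")

    port = env.get("PORT", "").strip()
    if port and not (port.isdecimal() and 1 <= int(port) <= 65535):
        problems.append("PORT must be an integer between 1 and 65535")

    level = env.get("LOG_LEVEL", "").strip()
    if level and level.lower() not in ALLOWED_LOG_LEVELS:
        problems.append(f"LOG_LEVEL {level!r} is not recognized")

    return problems
```

A service could call `validate_config(dict(os.environ))` as its first startup action and exit non-zero when the list is not empty, so the readiness probe never passes with an unsafe setting.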
196
+
197
+ 9. Keep secrets out of the wrong stage.
198
+ Test, preview, staging, and production jobs should use environment-scoped secrets with the
199
+ smallest permission set that can complete that job. Prefer OIDC or short-lived credentials over
200
+ long-lived cloud keys in CI. Production secrets should require the production environment gate,
201
+ and preview jobs should not be able to read them. Check secret masking, rotation or revocation
202
+ notes, external secret store or encryption-at-rest boundary, and whether platform-native secrets
203
+ also need RBAC, namespace, and audit controls.
204
+
205
+ 10. Separate startup, liveness, and readiness.
153
206
  Startup should protect slow boots, liveness should detect stuck processes, and readiness should
154
207
  gate whether the instance can serve real dependencies. Do not count "process is running" as
155
208
  deployment readiness. Keep liveness conservative enough that overload or long GC pauses do not
156
209
  turn a recoverable incident into a restart loop.
157
210
 
158
- 8. Check graceful shutdown before rolling traffic.
211
+ 11. Check graceful shutdown before rolling traffic.
159
212
  Verify SIGTERM or platform shutdown handling for in-flight requests, DB transactions, uploads,
160
213
  payment or webhook callbacks, and streaming responses. The app shutdown timeout must be shorter
161
214
  than load balancer connection draining.
162
215
 
163
- 9. Drain workers deliberately.
216
+ 12. Drain workers deliberately.
164
217
  Queue workers need a stop-accepting-new-work phase, a current-work completion or checkpoint phase,
165
218
  and an idempotent retry path for interrupted work. Ack, delete, offset commit, and visibility
166
219
  timeout behavior must match the shutdown path. Rollback should name consumer pause, in-flight
167
220
  work completion, idempotency-key checks, and dead-letter or quarantine inspection for worker and
168
221
  scheduler surfaces.
169
222
 
170
- 10. Prove message compatibility.
223
+ 13. Prove message compatibility.
171
224
  Producers and consumers are rarely deployed at the exact same instant. Message format changes need
172
225
  N-1 message compatibility, tolerant readers, versioned fields, defaults for missing data, and a
173
226
  plan for queued old messages. Unknown event types should not crash old consumers; require
174
227
  quarantine, dead-letter, or ignore policy that preserves investigation evidence.
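The tolerant-reader behavior in step 13 can be sketched in Python as follows. The event types, field names, and the in-memory `quarantine` list are hypothetical stand-ins; a real consumer would more likely write to a dead-letter queue or quarantine table.

```python
import json

# Hypothetical event types known to this (older) consumer.
KNOWN_EVENT_TYPES = ("order_created", "order_cancelled")


def read_event(raw, quarantine):
    """Parse one message tolerantly; return a normalized dict or None."""
    try:
        message = json.loads(raw)
    except json.JSONDecodeError:
        quarantine.append(raw)  # keep the evidence, do not crash
        return None
    if not isinstance(message, dict):
        quarantine.append(raw)
        return None

    event_type = message.get("type")
    if not isinstance(event_type, str) or event_type not in KNOWN_EVENT_TYPES:
        # Unknown types from newer producers are preserved for investigation.
        quarantine.append(raw)
        return None

    # Only the fields this consumer understands are read; extra fields
    # added by newer producers are ignored rather than rejected.
    return {
        "type": event_type,
        # Older producers may omit these fields, so defaults are applied.
        "schema_version": message.get("schema_version", 1),
        "order_id": message.get("order_id"),
        "note": message.get("note", ""),
    }
```

The original payload is quarantined unchanged, so an investigator can still see exactly what the producer sent.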
175
228
 
176
- 11. Design external side effects as compensation.
229
+ 14. Design external side effects as compensation.
177
230
  Emails, payment authorizations, third-party webhooks, provider state, and object-storage writes
178
231
  do not disappear when code rolls back. Require outbox, idempotency key, state machine,
179
232
  reconciliation, or compensation notes for side effects that cannot be undone by reverting code.
180
233
 
181
- 12. Keep API compatibility wider than the deploy window.
234
+ 15. Keep API compatibility wider than the deploy window.
182
235
  Server versions and clients do not change at the same instant. Check N-1 and N+1 compatibility
183
236
  for request fields, response fields, enum values, error codes, API versions, and mobile or SDK
184
237
  clients. Removing fields, narrowing enums, or adding required inputs can make rollback fail even
185
238
  when the old container starts.
186
239
 
187
- 13. Add a kill switch, not just a flag.
240
+ 16. Add a kill switch, not just a flag.
188
241
  Feature flags that only select a new path are not enough. Require a fast disable path, a safe
189
- default when flag lookup fails, and an owner who can flip it during the observation window.
190
-
191
- 14. Define the canary cohort.
242
+ default when flag lookup fails, an owner who can flip it during the observation window, rollout
243
+ percentages or cohorts, and an expiry or removal plan so release flags do not become permanent
244
+ hidden branches.
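One way to read step 16 is that a failed flag lookup must land on the old, known-good path. The Python sketch below assumes a hypothetical `lookup` callable and hypothetical flag names (`checkout_v2_kill_switch`, `checkout_v2_enabled`); it is not the API of any particular flag service.

```python
def is_enabled(flag_name, lookup, safe_default=False):
    """Return the flag value, falling back to safe_default if lookup fails."""
    try:
        value = lookup(flag_name)
    except Exception:
        # Flag service down, timing out, or flag missing: take the safe path.
        return safe_default
    if not isinstance(value, bool):
        # Malformed flag values are treated like a failed lookup.
        return safe_default
    return value


def choose_checkout_path(lookup):
    """Pick the code path; the kill switch overrides the rollout flag."""
    # If the kill switch cannot be read, assume it is on.
    if is_enabled("checkout_v2_kill_switch", lookup, safe_default=True):
        return "old"
    # If the rollout flag cannot be read, assume it is off.
    if is_enabled("checkout_v2_enabled", lookup, safe_default=False):
        return "new"
    return "old"
```

The two flags use opposite defaults on purpose: both defaults resolve to the old path, so an outage of the flag lookup cannot enable the new behavior.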
245
+
246
+ 17. Run production preflight before touching traffic.
247
+ Preflight should name the active incident status, deployment window, error budget or SLO burn,
248
+ DB replication lag, queue depth, cache health, dependency health, cloud quota, feature flag
249
+ defaults, rollback target digest, migration compatibility, alert mute state, and on-call owner.
250
+ If the system is already degraded, default to blocking or narrowing the deploy rather than adding
251
+ another variable to an incident.
252
+
253
+ 18. Define the canary cohort.
192
254
  Percent-only canaries can hide the exact risky traffic. Name which users, tenants, regions,
193
255
  routes, worker partitions, queues, payment methods, or dependency paths receive the new version.
194
256
  Prefer a cohort that exercises the changed behavior without making the blast radius vague.
195
257
 
196
- 15. Measure canaries by version and by user harm.
258
+ 19. Measure canaries by version and by user harm.
197
259
  A small canary can disappear in global averages. Require metrics split by service version,
198
260
  release id, route, cohort, or worker partition where safe. Prefer user-impact signals such as
199
261
  5xx, p95 or p99 latency, login failure, payment failure, order failure, queue backlog, or retry
200
262
  explosion over CPU-only rollback triggers.
201
263
 
202
- 16. Make automatic stop conditions numeric.
264
+ 20. Make automatic stop conditions numeric.
203
265
  Before rollout, define stop or rollback thresholds for error rate, p95 or p99 latency, login or
204
266
  payment failure, queue backlog, retry rate, dependency timeout, and saturation. "Watch it" is not
205
267
  a stop condition.
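To make "numeric" concrete, the following Python sketch compares canary metrics against fixed thresholds. The metric names and limits are illustrative assumptions, not values from the package, and treating a missing metric as a breach is a design choice of this sketch rather than something the skill text prescribes.

```python
# Hypothetical thresholds; real values belong in the rollout plan.
STOP_THRESHOLDS = {
    "error_rate": 0.02,            # fraction of requests
    "p99_latency_ms": 1200,
    "payment_failure_rate": 0.01,  # fraction of payment attempts
    "queue_backlog": 5000,         # messages waiting
}


def breached_conditions(canary_metrics, thresholds=STOP_THRESHOLDS):
    """Return the sorted names of thresholds the canary has exceeded."""
    breached = []
    for name, limit in thresholds.items():
        value = canary_metrics.get(name)
        if value is None:
            # A missing metric cannot prove the rollout is healthy.
            breached.append(name)
        elif value > limit:
            breached.append(name)
    return sorted(breached)


def should_stop(canary_metrics, thresholds=STOP_THRESHOLDS):
    """True when at least one stop condition is breached."""
    return bool(breached_conditions(canary_metrics, thresholds))
```

The metrics passed in should already be split by version or cohort, as step 19 describes; feeding global averages into this check would hide a small canary.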
206
268
 
207
- 17. Verify read and write paths separately.
269
+ 21. Verify read and write paths separately.
208
270
  A read-only health check can pass while writes are broken. Add or identify smoke checks and
209
271
  synthetic transactions that cover the changed read path, changed write path, and visible business
210
272
  result without corrupting production data.
211
273
 
212
- 18. Preserve release attribution in logs and telemetry.
274
+ 22. Preserve release attribution in logs and telemetry.
213
275
  Deployment logs and runtime logs should expose commit SHA, image tag or digest, config version,
214
276
  chart or manifest revision, migration version, feature flag state, deployment
215
277
  environment, deployment id, service version, and instance. Log format changes must not silently
216
278
  break alerts, dashboards, or search queries.
217
279
 
218
- 19. Version cache keys and narrow invalidation.
280
+ 23. Version cache keys and narrow invalidation.
219
281
  Cache changes need cache key versioning or compatibility. Avoid blanket flushes unless the cold
220
282
  DB load has been budgeted. Prefer narrow, gradual invalidation with fallback behavior named.
221
283
  Old code must not deserialize new cache payloads without a compatibility plan.
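A small Python sketch of cache key versioning, assuming a dict-like cache and a hypothetical `profile` entity. The version constant and key layout are illustrative only.

```python
import json

# Hypothetical version, bumped whenever the cached payload shape changes.
CACHE_SCHEMA_VERSION = 2


def cache_key(entity, entity_id, version=CACHE_SCHEMA_VERSION):
    """Build a key that releases with different payload shapes never share."""
    return f"{entity}:v{version}:{entity_id}"


def read_profile(cache, user_id, load_from_db):
    """Read through the cache using the current version's key only."""
    key = cache_key("profile", user_id)
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    # Miss: fall back to the source of truth and repopulate this key only.
    profile = load_from_db(user_id)
    cache[key] = json.dumps(profile)
    return profile
```

Because the old release keeps reading and writing its own `v1` keys, a rollback finds payloads it can still deserialize and no blanket flush is needed. The cost is a gradual cold-miss load on the new keys, which still has to be budgeted.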
222
284
 
223
- 20. Guard scheduler duplication.
285
+ 24. Guard scheduler duplication.
224
286
  Cron and scheduled jobs can overlap during rolling deploys. Check singleton locks, leader
225
287
  election, idempotency keys, deployment locks, and duplicate execution behavior for old and new
226
288
  versions.
227
289
 
228
- 21. Treat CRDs and operators as schema rollouts.
290
+ 25. Treat CRDs and operators as schema rollouts.
229
291
  Custom resources, operators, storage versions, conversion strategies, and controller downgrade
230
292
  behavior can block rollback even when application pods are healthy. When those surfaces change,
231
293
  require stored-object migration, old-client compatibility, and operator downgrade notes.
232
294
 
233
- 22. Use deployment locks per service and environment.
295
+ 26. Use deployment locks per service and environment.
234
296
  Two deploys, migrations, or production commands against the same service/environment need an
235
297
  explicit conflict rule. Name the lock scope and the operator-visible owner.
236
298
 
237
- 23. Make production commands boring.
299
+ 27. Make production commands boring.
238
300
  Any production command touched by the change should have dry-run output, bounded target scope,
239
301
  explicit confirmation or ticket reference, and refusal behavior for ambiguous environment,
240
302
  tenant, region, or time range.
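The refusal behavior in step 27 might look like the Python sketch below. The command (`plan_purge`), the environment names, and the ticket rule are hypothetical; the function only validates and describes the target, and changes nothing itself.

```python
# Hypothetical environment names for illustration.
KNOWN_ENVIRONMENTS = ("staging", "production")
AMBIGUOUS_TENANTS = ("*", "all")


class RefusedCommand(ValueError):
    """Raised when the target of a production command is ambiguous."""


def plan_purge(environment, tenant, ticket=None, execute=False):
    """Validate the target and return a plan; nothing is changed here."""
    if environment not in KNOWN_ENVIRONMENTS:
        # Abbreviations such as "prod" are refused rather than guessed.
        raise RefusedCommand(f"unknown environment: {environment!r}")
    if not tenant or tenant in AMBIGUOUS_TENANTS:
        raise RefusedCommand("tenant must name exactly one tenant")
    if execute and environment == "production" and not ticket:
        raise RefusedCommand("production execution needs a ticket reference")
    return {
        "environment": environment,
        "tenant": tenant,
        "ticket": ticket,
        # Dry run is the default; callers must opt in to real execution.
        "mode": "execute" if execute else "dry-run",
    }
```

For example, `plan_purge("production", "acme")` returns a dry-run plan, while `plan_purge("prod", "acme")` raises `RefusedCommand` instead of guessing the environment.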
241
303
 
242
- 24. Reserve post-deploy observation time.
304
+ 28. Reserve post-deploy observation time.
243
305
  Deployment is not done when traffic flips. Require a post-deploy observation window with owners
244
306
  available, synthetic transaction results, dashboards, logs, queue backlog, dependency health, and
245
307
  stop-condition status checked.
@@ -257,13 +319,22 @@ incident.
257
319
  ## Review Checklist
258
320
 
259
321
  - Deployment resource ledger is complete enough to name the blast radius.
322
+ - Merge and CI gates include required checks, branch protection or equivalent merge policy,
323
+ fast-fail ordering, clean test data, flaky-test ownership, and security scan coverage.
260
324
  - Release envelope binds image digest, config, Secret, migration, flag, traffic, deployer, and time
261
325
  under a stable release identity where those platform concepts exist.
326
+ - Preview deploys are isolated, seeded, protected, cleaned up, and kept away from production secrets
327
+ and production PII.
262
328
  - Same artifact is promoted across environments or rebuild drift is explicitly controlled.
263
329
  - Rollback history is retained, the old version can stay warm when needed, and traffic rollback is
264
330
  separated from rebuilding or repulling code.
265
- - DB migration, code deploy, data backfill, cache, queue, and rollback order are separated.
266
- - Required config is validated at startup and deployment config diff is inspectable.
331
+ - DB migration, migration dry-run or shadow evidence, code deploy, data backfill, cache, queue, and
332
+ rollback order are separated.
333
+ - Required config is validated at startup, deployment config diff is inspectable, and rendered
334
+ manifests or IaC plans are policy-checked for resources, probes, privileges, secrets, and digest
335
+ pinning.
336
+ - Secrets are environment-scoped, production-gated, least-privilege, short-lived where possible,
337
+ masked in logs, and rotatable.
267
338
  - Startup, liveness, and readiness probes answer different questions.
268
339
  - Graceful shutdown, load balancer connection draining, worker drain, and interrupted work retry are
269
340
  compatible.
@@ -272,6 +343,8 @@ incident.
272
343
  - External side effects have idempotency, reconciliation, or compensation instead of rollback
273
344
  overclaim.
274
345
  - Feature flag lookup failure chooses the safe default and a kill switch can stop the change.
346
+ - Production preflight checks active incidents, error budget, queue depth, DB lag, dependency
347
+ health, quota, rollback target, feature defaults, alert mute state, and owner availability.
275
348
  - Canary cohort, version-split telemetry, and automatic stop condition are concrete, numeric, and
276
349
  tied to the changed flow.
277
350
  - Read smoke, write smoke, synthetic transaction, deployment logs, and post-deploy metrics are
@@ -315,7 +388,8 @@ When reporting a review or change, include:
315
388
  - Skills used.
316
389
  - Deployment resources touched.
317
390
  - Rollout and rollback risks found or guarded.
318
- - Config, migration, cache, queue, probe, shutdown, canary, and observation decisions.
391
+ - CI, preview, artifact promotion, config, secret, migration, IaC, cache, queue, probe, shutdown,
392
+ canary, rollback, and observation decisions.
319
393
  - Verification commands run and their result.
320
394
  - Remaining manual deployment checks, especially production-only smoke, canary, and observation
321
395
  steps that cannot be executed locally.