mustflow 2.107.9 → 2.108.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (26)
  1. package/dist/cli/commands/api/serve.js +73 -10
  2. package/dist/core/run-receipt-state.js +23 -2
  3. package/dist/core/secret-redaction.js +6 -1
  4. package/package.json +1 -1
  5. package/schemas/api-serve-response.schema.json +1 -0
  6. package/templates/default/i18n.toml +48 -12
  7. package/templates/default/locales/en/.mustflow/docs/agent-workflow.md +24 -1
  8. package/templates/default/locales/en/.mustflow/skills/INDEX.md +52 -14
  9. package/templates/default/locales/en/.mustflow/skills/admin-control-plane-safety-review/SKILL.md +200 -0
  10. package/templates/default/locales/en/.mustflow/skills/ai-product-readiness-review/SKILL.md +158 -0
  11. package/templates/default/locales/en/.mustflow/skills/auth-permission-change/SKILL.md +91 -28
  12. package/templates/default/locales/en/.mustflow/skills/browser-automation-reliability-review/SKILL.md +279 -0
  13. package/templates/default/locales/en/.mustflow/skills/ci-pipeline-triage/SKILL.md +39 -11
  14. package/templates/default/locales/en/.mustflow/skills/cloud-cost-guardrail-review/SKILL.md +4 -1
  15. package/templates/default/locales/en/.mustflow/skills/database-change-safety/SKILL.md +21 -2
  16. package/templates/default/locales/en/.mustflow/skills/database-migration-change/SKILL.md +25 -7
  17. package/templates/default/locales/en/.mustflow/skills/deployment-rollout-safety-review/SKILL.md +117 -43
  18. package/templates/default/locales/en/.mustflow/skills/frontend-component-library-review/SKILL.md +299 -0
  19. package/templates/default/locales/en/.mustflow/skills/frontend-localization-review/SKILL.md +128 -36
  20. package/templates/default/locales/en/.mustflow/skills/notification-delivery-integrity-review/SKILL.md +226 -0
  21. package/templates/default/locales/en/.mustflow/skills/payment-integrity-review/SKILL.md +34 -14
  22. package/templates/default/locales/en/.mustflow/skills/routes.toml +36 -0
  23. package/templates/default/locales/en/.mustflow/skills/small-service-platform-architecture-review/SKILL.md +273 -0
  24. package/templates/default/locales/en/.mustflow/skills/tauri-code-change/SKILL.md +41 -3
  25. package/templates/default/locales/en/.mustflow/skills/wails-code-change/SKILL.md +34 -4
  26. package/templates/default/manifest.toml +43 -1
@@ -0,0 +1,279 @@
+ ---
+ mustflow_doc: skill.browser-automation-reliability-review
+ locale: en
+ canonical: true
+ revision: 1
+ lifecycle: mustflow-owned
+ authority: procedure
+ name: browser-automation-reliability-review
+ description: Apply this skill when browser automation, UI automation, Playwright, Selenium, Puppeteer, WebDriver, computer-use/browser-driving agents, visual browser verification, flaky selectors, page readiness, authentication state, CAPTCHA or anti-bot handling, rate limits, screenshot checks, retry, timeout, human approval, or browser automation observability is created, changed, reviewed, triaged, or reported.
+ metadata:
+ mustflow_schema: "1"
+ mustflow_kind: procedure
+ pack_id: mustflow.core
+ skill_id: mustflow.core.browser-automation-reliability-review
+ command_intents:
+ - changes_status
+ - changes_diff_summary
+ - lint
+ - build
+ - test_related
+ - test
+ - docs_validate_fast
+ - test_release
+ - mustflow_check
+ ---
+
+ # Browser Automation Reliability Review
+
+ <!-- mustflow-section: purpose -->
+ ## Purpose
+
+ Review browser automation as a stateful, evidence-producing system, not as a sequence of clicks.
+
+ The core question is: "Does the automation know what state the browser, user, page, network,
+ session, target data, and approval gate are in before it acts and before it claims success?" If not,
+ the flow will look fine in a demo and then fail under rerenders, slow CI, auth drift, anti-bot
+ gates, rate limits, visual noise, stale approvals, or agent hallucination.
+
+ <!-- mustflow-section: use-when -->
+ ## Use When
+
+ - Code, tests, docs, templates, or reviews touch browser automation, UI automation, end-to-end
+ harnesses, Playwright, Selenium, Puppeteer, WebDriver, browser contexts, remote browsers,
+ screenshots, videos, traces, HAR files, synthetic user flows, or computer-use browser agents.
+ - A task mentions flaky selectors, unstable locators, actionability, stale elements, rerenders,
+ page readiness, `networkidle`, sleeps, waits, timeouts, retries, screenshot diffs, visual checks,
+ popups, downloads, native dialogs, iframes, shadow DOM, virtualized lists, or input typing.
+ - Automation logs into a product, reuses storage state, shares accounts across workers, handles SSO,
+ OAuth, MFA, passkeys, cookies, localStorage, sessionStorage, IndexedDB, account lockout, or
+ permission changes.
+ - Browser automation touches third-party sites, CAPTCHA, anti-bot or WAF challenges, rate limits,
+ robots or terms boundaries, IP reputation, headless fingerprints, provider throttling, or manual
+ fallback paths.
+ - A browser-driving agent reads page content, follows page instructions, clicks by screenshot or
+ coordinates, extracts table data visually, enters forms, sends messages, purchases, deletes,
+ mutates external state, or asks for human approval before continuing.
+
+ <!-- mustflow-section: do-not-use-when -->
+ ## Do Not Use When
+
+ - The task is a pure LLM agent control-flow change with no browser or UI automation surface. Use
+ `agent-execution-control-review`.
+ - The task is only prompt, RAG, model, tool schema, cost, latency, hallucination, or eval behavior
+ without browser execution. Use the matching LLM or agent specialist skill.
+ - The task is only a product auth bug that is not being automated through a browser. Use
+ `auth-flow-triage` or `auth-permission-change`.
+ - The task is only a browser request, CORS, CDN, API, or provider failure before the browser
+ automation layer is relevant. Use `api-failure-triage`.
+ - The task is only frontend UI quality, layout resilience, accessibility, render stability, or web
+ performance for human users rather than automation harness reliability. Use the matching frontend
+ skill first.
+ - The task is only test-suite runtime optimization, shard balance, retry policy, or flaky-test
+ handling without browser-specific failure modes. Use `test-suite-performance-review` or
+ `test-maintenance`.
+
+ <!-- mustflow-section: required-inputs -->
+ ## Required Inputs
+
+ - Automation intent ledger: target site or app, owner, internal versus third-party boundary,
+ allowed actions, forbidden actions, expected user role, data class, write risk, and whether the
+ browser path is the right tool rather than an API, fixture, or deterministic adapter.
+ - State ledger: current URL, frame, page, route, modal, popup, selected account, auth storage,
+ browser context, viewport, locale, timezone, permissions, feature flags, test data, worker ID,
+ correlation ID, and previous step result.
+ - Readiness ledger: page-ready signal, data-ready signal, actionable-control signal, business-ready
+ signal, network and background-work assumptions, and any waits or assertions that prove them.
+ - Selector and action ledger: locators, user-facing roles or labels, test IDs or automation
+ contracts, shadow DOM and iframe boundaries, virtualized list handling, click target, keyboard and
+ focus path, input acceptance proof, and actionability override use.
+ - Auth and identity ledger: login strategy, storage owner, token or cookie storage surface, session
+ expiry, refresh behavior, per-worker account isolation, SSO or MFA gates, CAPTCHA policy, account
+ lockout policy, and logout or cleanup behavior.
+ - External pressure ledger: rate limit unit, retry budget, anti-bot or challenge detection,
+ provider terms boundary, manual fallback, backoff behavior, and circuit-breaker threshold.
+ - Verification ledger: success criteria, API or database confirmation when available, screenshot
+ or visual artifact role, trace/video/HAR policy, console and network capture, redaction,
+ retention, and failure artifact sampling.
+ - Agent and approval ledger: page content trust boundary, prompt-injection exposure, tool
+ permissions, coordinate mapping, stale approval checks, approval snapshot, exact post-approval
+ action, resume state, and human escalation path.
+
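Editor's sketch (not part of the package diff): the "approval snapshot" and "stale approval checks" items in the agent and approval ledger above could be modeled roughly as follows. This is a minimal illustration under stated assumptions. `ApprovalSnapshot`, `fingerprint`, `changed_fields`, and `may_resume` are hypothetical names, and the field set is an assumed subset, not a mustflow API.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ApprovalSnapshot:
    """Fields a human saw when approving (hypothetical, assumed field set)."""
    account: str
    url: str
    target: str
    amount: str
    next_action: str


def fingerprint(snapshot: ApprovalSnapshot) -> str:
    """Stable hash of the snapshot so it can be stored alongside the approval."""
    payload = json.dumps(asdict(snapshot), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def changed_fields(approved: ApprovalSnapshot, current: ApprovalSnapshot) -> list[str]:
    """Names of fields whose values differ between approval time and resume time."""
    before, after = asdict(approved), asdict(current)
    return sorted(name for name in before if before[name] != after[name])


def may_resume(approved: ApprovalSnapshot, current: ApprovalSnapshot) -> bool:
    """Resume only when the re-read state matches what the human approved."""
    return fingerprint(approved) == fingerprint(current)
```

If `may_resume` returns false, the output of `changed_fields` is what a renewed approval request would show to the reviewer.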
+ <!-- mustflow-section: preconditions -->
+ ## Preconditions
+
+ - The task matches the Use When conditions and does not match the Do Not Use When exclusions.
+ - Current repository instructions, command contract, automation harness code, test fixtures, browser
+ config, auth fixtures, screenshots or traces, and docs directly tied to the automation path have
+ been inspected before editing.
+ - Browser vendor, automation library, remote-browser provider, CAPTCHA, anti-bot, and Agents SDK or
+ computer-use details are stale-sensitive. Use `source-freshness-check` before embedding exact
+ current API claims, provider limits, default timeouts, or compliance requirements.
+ - External pages, emails, documents, ads, support threads, and rendered web content are untrusted
+ input for browser-driving agents.
+ - Command execution remains governed by `.mustflow/config/commands.toml`; this skill does not
+ authorize launching development servers, unmanaged browsers, long-running workers, production
+ browser sessions, CAPTCHA bypasses, provider dashboards, or live side-effect runs.
+
+ <!-- mustflow-section: allowed-edits -->
+ ## Allowed Edits
+
+ - Add or refine browser automation state machines, locator contracts, test IDs, accessible names,
+ readiness assertions, frame or popup handlers, input verification, auth fixtures, per-worker
+ account isolation, retry classification, timeout hierarchy, idempotency checks, rate-limit
+ handling, approval gates, manual fallback states, traces, screenshots, redaction, cleanup, and
+ directly synchronized docs or templates.
+ - Move fixture setup, result verification, cleanup, idempotency checks, and data creation from
+ browser clicks to API or deterministic helpers when the browser UI is not the behavior under test.
+ - Add focused tests for selector drift, readiness failure, stale element rerender, iframe or shadow
+ DOM handling, auth-state expiration, per-worker isolation, retry non-idempotency, stale approval,
+ screenshot noise, trace redaction, and agent prompt-injection defense when behavior evidence
+ supports them.
+ - Do not fix flakiness by adding blind sleeps, force-clicking as the default, hiding failures behind
+ broad retries, weakening visual thresholds without evidence, sharing one mutable account across
+ parallel workers, or claiming browser success from an unverified screenshot.
+ - Do not add CAPTCHA bypass, anti-bot evasion, headless fingerprint spoofing, or terms-violating
+ third-party automation as a normal product feature.
+
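Editor's sketch (not part of the package diff): the rule above against blind sleeps can be illustrated with a deadline-bounded readiness poll. This is a library-agnostic illustration, assuming the caller supplies a predicate that checks a domain signal such as a loaded row count. `wait_until` and `ReadinessTimeout` are hypothetical names, and real automation libraries ship their own waiting primitives that should be preferred where available.

```python
import time
from typing import Callable


class ReadinessTimeout(Exception):
    """Raised when a readiness condition is not met within its deadline."""


def wait_until(
    condition: Callable[[], bool],
    description: str,
    timeout_s: float = 5.0,
    interval_s: float = 0.05,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Poll `condition` until it is true and return the elapsed seconds.

    Unlike a blind sleep, this returns as soon as the evidence appears and
    fails loudly, naming the missing signal, when it never does.
    """
    start = clock()
    while True:
        if condition():
            return clock() - start
        if clock() - start >= timeout_s:
            raise ReadinessTimeout(f"not ready after {timeout_s}s: {description}")
        sleep(interval_s)
```

The `clock` and `sleep` parameters exist only so the helper can be tested deterministically without real waiting.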
138
+ <!-- mustflow-section: procedure -->
139
+ ## Procedure
140
+
141
+ 1. Decide whether the browser is the right boundary. Use API, fixtures, or service adapters for data
142
+ setup, teardown, and result verification when the browser UI itself is not the behavior being
143
+ tested or automated.
144
+ 2. Classify the automation owner: internal app E2E, internal operations tool, third-party site
145
+ workflow, browser-driving LLM agent, visual regression, scraping-like extraction, support tool,
146
+ or production user-assistance flow.
147
+ 3. Define a state machine before actions. Name the states such as unauthenticated, authenticated,
148
+ searching, selecting target, filling form, awaiting approval, submitting, verifying result,
149
+ retrying, blocked by challenge, manual fallback, succeeded, and failed.
150
+ 4. Replace sleeps with readiness evidence. For each step, define what proves the page is ready, the
151
+ data is ready, the target control is actionable, and the business state is safe to advance.
152
+ 5. Treat `networkidle` and selector-visible waits as weak signals. Prefer domain assertions such as
153
+ expected row identity, enabled submit state, loaded data count, settled validation, known URL,
154
+ confirmation ID, provider event, or backend result.
155
+ 6. Review locator contracts. Prefer stable user-facing roles, labels, names, and explicit test IDs
156
+ over CSS layout paths, generated classes, index-based XPath, translated prose only, or first-match
157
+ selectors.
158
+ 7. Check ambiguous DOM. Handle hidden duplicate controls, responsive desktop and mobile DOM at the
159
+ same time, skeletons that resemble real content, virtualized rows, portals, sticky overlays,
160
+ cookie banners, focus traps, iframes, cross-origin frames, shadow DOM, and custom components.
161
+ 8. Avoid stale element handles. Re-resolve locators at action time, and keep find-check-act-verify
162
+ close together so rerenders cannot invalidate old DOM references silently.
163
+ 9. Review actionability honestly. A forced click, coordinate click, JS-dispatched event, or disabled
164
+ actionability check must be exceptional, documented, and followed by proof that a real user path
165
+ is not being bypassed.
166
+ 10. Verify input acceptance. After typing, pasting, selecting dates, entering currency, using IME,
167
+ triggering autocomplete, or blurring a field, confirm the stored value, validation state, submit
168
+ readiness, or outbound payload rather than assuming keystrokes were accepted.
169
+ 11. Make auth state explicit. Identify whether auth lives in cookies, localStorage, sessionStorage,
170
+ IndexedDB, memory, or provider redirects; isolate accounts by worker; avoid shared mutable user
171
+ state; and handle expiry, rotation, SSO, MFA, passkeys, lockout, and logout contamination.
172
+ 12. Treat CAPTCHA and anti-bot as product states. In test or staging, use allowed test keys,
173
+ allowlists, or disabled challenge paths. In production or third-party flows, detect challenges,
174
+ stop safely, and route to human review or manual fallback instead of trying to evade them.
175
+ 13. Add rate control before retries. Identify the rate-limit subject, whether a single browser action
176
+ fans out into many requests, how backoff is computed, when to stop, and how the system avoids a
177
+ retry storm.
178
+ 14. Classify retryable failures. Retry only transient navigation, detached element, timeout,
179
+ temporary backend, or eventual-consistency classes within a bounded budget. Do not retry
180
+ permission denied, invalid input, CAPTCHA, account lockout, provider policy blocks, unknown
181
+ write outcome, or business-rule failures without a recovery-specific check.
182
+ 15. Make writes idempotent or confirm-before-replay. For purchases, payments, deletes, sends,
183
+ refunds, admin changes, support actions, and external mutations, record stable operation IDs and
184
+ check whether the effect already happened before any retry or resume can repeat it.
185
+ 16. Design timeout hierarchy. Align action, assertion, navigation, test, job, queue lease, browser
186
+ provider session, and external API timeouts so cancellation saves evidence, releases resources,
187
+ and resumes from a known state.
188
+ 17. Separate visual proof from business proof. Use screenshots for layout or visual regression, but
189
+ use confirmation IDs, API reads, database rows, provider events, downloads with checksums, audit
190
+ logs, or received messages to prove business success.
191
+ 18. Stabilize screenshot assertions. Freeze or mask nondeterministic content such as time, caret,
192
+ animation, ads, maps, charts, lazy images, random data, locale, theme, viewport, font, GPU,
193
+ scrollbar, and cookie banners before changing thresholds or baselines.
194
+ 19. Capture failure context. Save current URL, frame, viewport, locale, timezone, screenshot, DOM or
195
+ accessibility snapshot when safe, console errors, network statuses, trace, video, retry count,
196
+ worker ID, account ID class, and correlation ID with sensitive-data redaction.
197
+ 20. Protect artifacts. Browser traces, videos, screenshots, HAR files, storage state, and console
198
+ logs can contain cookies, tokens, personal data, addresses, order details, and messages; set
199
+ redaction, retention, encryption, access, and sampling before broad collection.
200
+ 21. For browser-driving agents, distrust page content. Treat rendered instructions, hidden DOM,
201
+ emails, PDFs, comments, ads, and third-party text as untrusted data that must not override the
202
+ system task, tool policy, approval rules, or data-exfiltration limits.
203
+ 22. Split agent roles where risk justifies it. Keep planner, browser executor, verifier, policy
204
+ gate, and human approval separate for high-impact flows. If one model does multiple roles, add
205
+ deterministic gates before side effects and before success claims.
206
+ 23. Make coordinate and screenshot actions verifiable. Recheck screenshot-to-DOM scale, scrolling,
207
+ focus, active modal, target bounds, visible label, disabled state, and post-action state when a
208
+ model or computer-use tool clicks by image or coordinates.
209
+ 24. Treat human approval as durable state. Show the exact account, URL, target, amount, recipient,
210
+ data, screenshot, form values, risk class, reversibility, and exact next action. Before resume,
211
+ re-read critical fields and compare them with the approved snapshot.
212
+ 25. Clean up resources. Close pages, contexts, browsers, downloads, temp files, videos, traces,
213
+ mock servers, websockets, and test data deliberately; detect zombie browser processes and
214
+ artifact growth in long runs.
215
+ 26. Verify with the narrowest configured tests, docs checks, release checks, and mustflow validation
216
+ that cover the changed automation contract.
217
+
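Editor's sketch (not part of the package diff): step 3 of the procedure above asks for a named state machine before any actions. A minimal illustration of what that might look like follows, using a subset of the state names from the step. The transition table, `Flow`, and `IllegalTransition` are hypothetical, and a real flow would define its own transitions.

```python
from enum import Enum


class State(Enum):
    UNAUTHENTICATED = "unauthenticated"
    AUTHENTICATED = "authenticated"
    FILLING_FORM = "filling_form"
    AWAITING_APPROVAL = "awaiting_approval"
    SUBMITTING = "submitting"
    VERIFYING_RESULT = "verifying_result"
    BLOCKED_BY_CHALLENGE = "blocked_by_challenge"
    MANUAL_FALLBACK = "manual_fallback"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


# Allowed transitions (illustrative assumption). States missing from the
# table are terminal. SUCCEEDED is reachable only from VERIFYING_RESULT.
TRANSITIONS: dict[State, set[State]] = {
    State.UNAUTHENTICATED: {State.AUTHENTICATED, State.BLOCKED_BY_CHALLENGE, State.FAILED},
    State.AUTHENTICATED: {State.FILLING_FORM, State.FAILED},
    State.FILLING_FORM: {State.AWAITING_APPROVAL, State.FAILED},
    State.AWAITING_APPROVAL: {State.SUBMITTING, State.MANUAL_FALLBACK, State.FAILED},
    State.SUBMITTING: {State.VERIFYING_RESULT, State.BLOCKED_BY_CHALLENGE, State.FAILED},
    State.VERIFYING_RESULT: {State.SUCCEEDED, State.FAILED},
    State.BLOCKED_BY_CHALLENGE: {State.MANUAL_FALLBACK, State.FAILED},
}


class IllegalTransition(Exception):
    """Raised when a step tries to skip a required state."""


class Flow:
    def __init__(self) -> None:
        self.state = State.UNAUTHENTICATED
        self.history: list[State] = [self.state]

    def advance(self, target: State) -> None:
        """Move to `target` only if the transition table allows it."""
        if target not in TRANSITIONS.get(self.state, set()):
            raise IllegalTransition(f"{self.state.value} -> {target.value}")
        self.state = target
        self.history.append(target)
```

The design choice worth noting is that a success claim cannot be recorded straight from `SUBMITTING`. The flow has to pass through `VERIFYING_RESULT`, which is where business proof (step 17) would be checked.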
+ <!-- mustflow-section: postconditions -->
+ ## Postconditions
+
+ - The automation has explicit states, readiness signals, locator contracts, auth isolation, retry
+ classes, timeout hierarchy, and success evidence.
+ - Browser-only proof is separated from business-result proof.
+ - CAPTCHA, anti-bot, rate-limit, human-approval, prompt-injection, and third-party boundary risks
+ are detected, stopped, or routed to manual fallback instead of hidden behind retries.
+ - Failure artifacts are useful enough to debug and constrained enough not to leak secrets or
+ personal data.
+
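Editor's sketch (not part of the package diff): the last postcondition above, artifacts that are debuggable without leaking secrets, could be approached with a redaction pass before an artifact is stored. This is a hypothetical illustration and not the package's own `secret-redaction.js`. The key list and the bearer-token pattern are assumptions and are far from exhaustive.

```python
import re

# Assumed set of sensitive keys; a real policy would be broader and reviewed.
SENSITIVE_KEYS = {"authorization", "cookie", "set-cookie", "password", "token"}

# Matches "Bearer <token>" inside free text such as console lines.
BEARER_RE = re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/=-]+")


def redact(value):
    """Return a redacted copy of a nested artifact (dicts, lists, strings)."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if str(key).lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        return BEARER_RE.sub("Bearer [REDACTED]", value)
    return value
```

Non-sensitive context such as URL, retry count, and plain headers passes through unchanged, so the artifact stays useful for debugging.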
229
+ <!-- mustflow-section: verification -->
230
+ ## Verification
231
+
232
+ Use configured oneshot command intents when available:
233
+
234
+ - `changes_status`
235
+ - `changes_diff_summary`
236
+ - `lint`
237
+ - `build`
238
+ - `test_related`
239
+ - `test`
240
+ - `docs_validate_fast`
241
+ - `test_release`
242
+ - `mustflow_check`
243
+
244
+ Use the narrowest configured fixture, unit, integration, docs, package, or release check that proves
245
+ the changed browser automation contract. Do not infer raw browser launches, dev servers, headed
246
+ browsers, provider dashboards, CAPTCHA-solving services, or production automation runs from local
247
+ files.
248
+
249
+ <!-- mustflow-section: failure-handling -->
250
+ ## Failure Handling
251
+
252
+ - If the failure is not localized to browser automation, use `api-failure-triage`,
253
+ `auth-flow-triage`, `frontend-render-stability`, `test-maintenance`, or another narrower skill
254
+ first.
255
+ - If a selector is flaky, do not patch only the selector string until locator ownership, duplicate
256
+ DOM, responsive DOM, skeletons, frames, shadow DOM, and readiness have been checked.
257
+ - If a retry would replay an unknown write, stop and add idempotency or effect-confirmation before
258
+ enabling retry.
259
+ - If CAPTCHA, anti-bot, account lockout, provider policy, or terms boundaries are detected, stop the
260
+ automation path and report the manual or contractual fallback instead of bypassing it.
261
+ - If human approval resumes after state changed, expire the approval or request a new approval with
262
+ the changed fields.
263
+ - If artifacts would leak secrets or personal data, collect a smaller redacted evidence set and
264
+ report the observability gap.
265
+ - If a configured command fails, use `failure-triage` before continuing.
266
+
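Editor's sketch (not part of the package diff): the retry rules in the failure-handling list above, together with procedure step 14, can be read as a small decision function. This is a minimal illustration, assuming failures have already been classified. `Failure`, `Decision`, and `decide` are hypothetical names, and the default budget of 3 is an arbitrary example value.

```python
from enum import Enum


class Failure(Enum):
    NAVIGATION_TIMEOUT = "navigation_timeout"
    DETACHED_ELEMENT = "detached_element"
    TEMPORARY_BACKEND = "temporary_backend"
    PERMISSION_DENIED = "permission_denied"
    INVALID_INPUT = "invalid_input"
    CAPTCHA = "captcha"
    ACCOUNT_LOCKOUT = "account_lockout"
    UNKNOWN_WRITE_OUTCOME = "unknown_write_outcome"


class Decision(Enum):
    RETRY = "retry"
    STOP = "stop"
    CONFIRM_EFFECT_FIRST = "confirm_effect_first"
    MANUAL_FALLBACK = "manual_fallback"


TRANSIENT = {
    Failure.NAVIGATION_TIMEOUT,
    Failure.DETACHED_ELEMENT,
    Failure.TEMPORARY_BACKEND,
}
HUMAN_GATED = {Failure.CAPTCHA, Failure.ACCOUNT_LOCKOUT}


def decide(failure: Failure, attempts_used: int, budget: int = 3) -> Decision:
    """Pick the next step for a classified failure within a bounded budget."""
    if failure is Failure.UNKNOWN_WRITE_OUTCOME:
        # Never replay a write whose effect is unknown; check for the effect first.
        return Decision.CONFIRM_EFFECT_FIRST
    if failure in HUMAN_GATED:
        return Decision.MANUAL_FALLBACK
    if failure in TRANSIENT and attempts_used < budget:
        return Decision.RETRY
    # Permission, input, and exhausted-budget cases all stop.
    return Decision.STOP
```

The unknown-write check comes first so that no budget setting can turn it into a plain retry.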
267
+ <!-- mustflow-section: output-format -->
268
+ ## Output Format
269
+
270
+ - Browser automation surface reviewed
271
+ - Browser-versus-API boundary and automation owner
272
+ - State machine, readiness, locator, actionability, auth, rate-limit, retry, timeout, and
273
+ idempotency decisions
274
+ - Screenshot, trace, artifact, redaction, and business-success evidence
275
+ - Agent page-content trust, coordinate action, tool permission, approval, and resume checks
276
+ - Files changed
277
+ - Command intents run
278
+ - Skipped checks and reasons
279
+ - Remaining browser automation reliability risk
@@ -2,11 +2,11 @@
  mustflow_doc: skill.ci-pipeline-triage
  locale: en
  canonical: true
- revision: 1
+ revision: 2
  lifecycle: mustflow-owned
  authority: procedure
  name: ci-pipeline-triage
- description: Apply this skill when a CI/CD workflow, pipeline, job, runner, matrix, trigger, cache, artifact, deployment job, required check, or post-deploy verification is failing, skipped, queued, flaky, slow, green despite broken output, or not yet localized to trigger, runner, environment, build, test, artifact, deploy, or verification boundaries.
+ description: Apply this skill when a CI/CD workflow, pipeline, job, runner, matrix, trigger, cache, artifact, runner-minute billing, artifact storage or retention, deployment job, required check, or post-deploy verification is failing, skipped, queued, flaky, slow, unexpectedly expensive, green despite broken output, or not yet localized to trigger, runner, environment, build, test, cache, artifact, billing, deploy, or verification boundaries.
  metadata:
  mustflow_schema: "1"
  mustflow_kind: procedure
@@ -46,6 +46,9 @@ changed from the last known-good run, and what evidence would disprove each boun
  deployment permissions, rollout completion, or post-deploy verification.
  - A pipeline suddenly breaks without application-code changes, or only fails on forks, protected
  branches, specific runners, specific regions, specific matrix entries, or reruns.
+ - A CI workflow becomes unexpectedly expensive, burns private-repository minutes too quickly,
+ exhausts artifact storage, keeps long-lived test artifacts, or needs a release matrix cost review
+ before the expensive boundary is known.
 
  <!-- mustflow-section: do-not-use-when -->
  ## Do Not Use When
@@ -66,6 +69,10 @@ changed from the last known-good run, and what evidence would disprove each boun
  - Run identity ledger: commit SHA, branch or tag, trigger event, workflow file revision, matrix
  entry, runner label and image, architecture, region, toolchain versions, package-manager version,
  execution time, and run or job id.
+ - CI billing ledger when cost is in scope: public versus private repository behavior, plan or
+ allowance snapshot, provider billing page or docs date, runner OS and size, job count, matrix
+ shape, per-job rounding behavior, queue versus execution time, artifact retention days, cache
+ retention or quota, and release asset handoff.
  - Last-good comparison: last successful commit and first failing commit, including workflow files,
  lockfiles, base images, shared scripts, secrets or permission scopes, runner labels, cache keys,
  feature flags, deployment config, and required-check settings.
@@ -88,9 +95,9 @@ changed from the last known-good run, and what evidence would disprove each boun
  ## Allowed Edits
 
  - Add or tighten workflow triggers, path filters, matrix guards, version pinning, cache keys,
- artifact manifests, status aggregation, debug evidence collection, secret-safe diagnostics,
- timeout classification, runner labels, concurrency locks, environment validation, smoke checks,
- test isolation, docs, and focused fixtures.
+ artifact manifests, artifact retention, release-asset promotion, status aggregation, debug
+ evidence collection, secret-safe diagnostics, timeout classification, runner labels, concurrency
+ locks, environment validation, smoke checks, test isolation, docs, and focused fixtures.
  - Add tests or docs that prove workflow contract behavior, package metadata, template output,
  release checks, artifact identity, or command-contract mapping when the repository owns those
  surfaces.
@@ -134,21 +141,37 @@ changed from the last known-good run, and what evidence would disprove each boun
  dimensions. Artifacts need file list, size, hash, build SHA, and download verification.
  14. Verify that the tested artifact is the deployed artifact. Rebuilding during deploy can make CI
  test one thing and production receive another.
- 15. Check auth and permissions by execution context. Fork PRs, protected branches, environments,
+ 15. For CI cost or quota questions, split the bill before optimizing:
+ - runner execution minutes, not artifact bytes, usually dominate native app release cost;
+ - macOS or other premium runners can dominate a matrix even when Linux jobs are longer;
+ - job-level minimum billing or rounding can make many tiny split jobs cost more than one
+ grouped job;
+ - public repository standard-runner rules can differ from private repository included minutes;
+ - billing pages may display currency spend while plan allowances are minute or storage quotas,
+ so confirm the unit before comparing options.
+ 16. Separate Actions artifacts, caches, package registries, and release assets. Short-lived test
+ bundles should use short retention. Long-lived distributables should be promoted through the
+ repository's release or package channel when that is the intended public artifact. Do not treat
+ cache quota as artifact storage or release assets as CI retention.
+ 17. For native desktop matrices, avoid full bundles on every PR unless the repository explicitly
+ requires it. Prefer PR checks that prove frontend build plus native compile or type contracts on
+ the cheapest adequate runner, then run signed or full OS package matrices only on release tags,
+ release branches, or protected manual gates.
+ 18. Check auth and permissions by execution context. Fork PRs, protected branches, environments,
  OIDC identity, package publishing identity, cloud role, and repository token scopes can differ
  across otherwise similar runs.
- 16. For deployment jobs, require rollout evidence, readiness, smoke checks, error and latency
+ 19. For deployment jobs, require rollout evidence, readiness, smoke checks, error and latency
  thresholds, and environment concurrency locks instead of treating a zero exit code as success.
- 17. Preserve evidence before cleanup. Do not delete runners, caches, artifacts, temporary dirs, or
+ 20. Preserve evidence before cleanup. Do not delete runners, caches, artifacts, temporary dirs, or
  diagnostic logs until the boundary and redaction plan are clear.
- 18. Apply the smallest localized fix and verify with the narrowest configured intent that covers the
+ 21. Apply the smallest localized fix and verify with the narrowest configured intent that covers the
  changed workflow, package, docs, template, or test surface.
 
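Editor's sketch (not part of the package diff): the new step 15 above says that per-job rounding and premium runners can dominate a CI bill. The arithmetic can be made concrete with a toy model. Everything here is an assumption for illustration: `billed_minutes` is a hypothetical helper, the round-up-to-a-whole-minute rule is assumed, and the multiplier of 10 is a made-up value rather than any provider's actual rate, which should be confirmed against current billing documentation.

```python
import math


def billed_minutes(job_seconds: list[int], multiplier: int = 1) -> int:
    """Sum per-job minutes, rounding each job up to a whole minute (assumed rule)."""
    return sum(math.ceil(seconds / 60) for seconds in job_seconds) * multiplier


# Twelve 20-second jobs versus one grouped job doing the same 240 seconds of work.
split = billed_minutes([20] * 12)    # each job rounds up to 1 minute
grouped = billed_minutes([240])      # 240 s is exactly 4 minutes

# A shorter job on a hypothetical 10x runner versus a longer job on a 1x runner.
premium = billed_minutes([300], multiplier=10)
standard = billed_minutes([900], multiplier=1)

print(split, grouped, premium, standard)  # prints: 12 4 50 15
```

Under these assumptions the split layout bills three times the grouped one for identical work, and the 5-minute premium job outweighs the 15-minute standard job.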
  <!-- mustflow-section: postconditions -->
  ## Postconditions
 
- - The pipeline failure is localized to trigger, runner, environment, build, test, artifact, deploy,
- verification, or a named evidence gap.
+ - The pipeline failure is localized to trigger, runner, environment, build, test, artifact, billing
+ or storage quota, deploy, verification, or a named evidence gap.
  - Last-good versus first-failure comparison, run identity, false-green risk, cache and artifact
  behavior, permission scope, and rerun determinism are explicit where relevant.
  - Follow-up deployment, test performance, security, command-contract, or package-release work is
@@ -178,6 +201,9 @@ CI reruns, deploys, cloud shell commands, or provider dashboard writes outside t
 
  - If run identity, last-good comparison, trigger graph, runner, cache, artifact, or permission
  evidence is missing, report the missing field instead of guessing.
+ - If CI pricing, included minutes, storage quotas, or runner rates are time-sensitive and not
+ locally available, avoid exact price claims and name the provider billing evidence that must be
+ checked.
  - If debug logs contain secrets or private data, stop copying raw output and summarize safely.
  - If CI evidence requires remote provider access that is unavailable or unconfigured, report the
  manual evidence boundary and continue with local workflow or static evidence.
@@ -191,6 +217,8 @@ CI reruns, deploys, cloud shell commands, or provider dashboard writes outside t
  - Failure shape and localized boundary
  - Run identity and last-good comparison
  - Trigger, runner, environment, build, test, cache, artifact, deploy, and verification findings
+ - Billing unit, runner-minute, matrix rounding, artifact retention, cache quota, and release asset
+ findings when cost is in scope
  - Hypotheses killed, still open, and selected follow-up boundary
  - Fix applied or recommended
  - Evidence level: provider run evidence, configured-test evidence, static review risk, manual-only,
@@ -2,7 +2,7 @@
  mustflow_doc: skill.cloud-cost-guardrail-review
  locale: en
  canonical: true
- revision: 1
+ revision: 2
  lifecycle: mustflow-owned
  authority: procedure
  name: cloud-cost-guardrail-review
@@ -65,6 +65,9 @@ lifecycle cleanup, and service-specific caps before the bill becomes the first a
  narrower security skill first, then use this skill for spend blast radius.
  - The task only changes local development code with no cloud, provider, telemetry, storage,
  network, external API, or deployable infrastructure surface.
+ - The task is primarily CI runner minutes, workflow matrix cost, Actions artifact retention,
+ build-cache quota, release asset handoff, or CI job billing; use `ci-pipeline-triage` first, then
+ return here only when broader cloud, SaaS, or provider spend guardrails remain.
 
  <!-- mustflow-section: required-inputs -->
  ## Required Inputs
@@ -2,7 +2,7 @@
  mustflow_doc: skill.database-change-safety
  locale: en
  canonical: true
- revision: 16
+ revision: 17
  lifecycle: mustflow-owned
  authority: procedure
  name: database-change-safety
@@ -79,6 +79,7 @@ Use the smallest persistence boundary that proves the risk. Do not introduce rep
  - Event role: operational event, audit log, behavior analytics event, integration outbox message, reporting aggregate, or replayable domain event.
  - Data owner and affected tables, collections, stores, indexes, caches, generated files, or read models.
  - Entity identity rules, including stable ids, external provider ids, mutable slugs, titles, locale-specific addresses, redirects, and public API identifiers when content or user-facing resources are involved.
+ - Regret-prone schema shape rules, including internal versus public ids, normalized unique keys, tenant-scoped uniqueness, foreign keys, join tables, enum or lookup-table ownership, nullable-field meaning, JSON promotion criteria, custom-field boundaries, status history, optimistic locking, and operational trace fields.
  - Exit and restore rules, including whether exported data preserves relationships, permissions, files, versions, events, audit history, automation rules, provider id mappings, schema metadata, and enough import or restore evidence to reconstruct product state.
  - Identifier ownership rules, including which ids are product-owned, which ids are public, which ids are provider mappings, and whether external auth, payment, CRM, analytics, storage, or CMS ids can change without breaking internal references.
  - Authentication identity rules, including app-owned user id, provider subject records, email-as-attribute behavior, social provider subject preservation, account merge or relink policy, session migration expectations, and whether memberships, roles, permissions, and entitlements live in product-owned tables rather than only provider metadata.
@@ -173,8 +174,25 @@ Use the smallest persistence boundary that proves the risk. Do not introduce rep
  - External-service core facts, such as current entitlement, subscription or plan state, processed payment event id, email consent state, customer lifecycle state, file identity and ownership, search source document metadata, job processing state, and audit evidence. Do not let a provider dashboard be the only place that can explain these facts.
  - Search and queue reconstruction records, such as index document builders, ranking or synonym policy versions, search logs, queue message schema versions, job idempotency keys, retry state, dead-letter state, and manual replay markers.
175
176
  4. Check schema shape: primary keys, foreign keys, unique constraints, nullable fields, defaults, check constraints, status values, timestamps, soft delete fields, tenant scope, audit fields, and retention rules.
177
+ - Use immutable internal primary keys for joins and separate public identifiers for URLs and APIs. Do not make email, slug, username, external provider id, or mutable display code the primary key for product-owned rows.
178
+ - Enforce uniqueness in the database, not only in application prechecks. Normalize comparison keys such as email, slug, provider id, and idempotency key explicitly, preserve the display value separately when needed, and name the unique constraint or index so operations can diagnose failures.
179
+ - Scope unique constraints to the real owner. Tenant-owned slugs, emails, invitations, memberships, idempotency keys, and external references usually need `tenant_id`, `workspace_id`, `operation_type`, or `provider` in the key. Global uniqueness should be a deliberate product rule, not an accident.
180
+ - Design soft-delete uniqueness before shipping. Active-only uniqueness, nullable unique behavior, restore conflicts, deleted-id reuse, and tombstone requirements must be explicit; otherwise deleted rows either block valid new records or allow duplicate active records.
181
+ - Prefer database foreign keys for core ownership and reference integrity. If an FK is intentionally omitted for scale, import staging, sharding, or asynchronous reconciliation, name the replacement invariant, cleanup path, and orphan-detection evidence. Index FK columns when joins, parent deletion checks, or tenant deletion depend on them.
182
+ - Treat `ON DELETE CASCADE` as a lifecycle promise, not cleanup convenience. Use it only when child rows truly share the parent's lifetime and audit, retention, restore, and legal obligations do not require separate survival.
183
+ - Model many-to-many relationships with join tables that can own role, status, order, source, timestamps, and actor fields. Avoid comma-separated ids, arrays of ids, or JSON lists for relationships that need joins, uniqueness, permissions, deletes, or audit.
184
+ - Treat polymorphic `entity_type` plus `entity_id` relations as integrity debt for core data because ordinary FKs cannot prove the target exists. Prefer target-specific tables, a shared parent table, or explicit constraint and cleanup machinery when the relation is business-critical.
185
+ - Choose enum, lookup table, or state machine based on change behavior. Stable technical codes may be enums; operator-managed values, values with display or sort metadata, plan or category catalogs, roles, and jurisdiction-specific rules usually need lookup tables. Workflow status needs allowed transitions and history, not only a value list.
186
+ - Avoid boolean state soup such as several independent `is_*` flags for one lifecycle. Use one current status plus timestamps or event history when states are mutually exclusive, ordered, reversible, or policy-driven.
187
+ - Give nullable fields exactly one meaning. Separate unknown, not applicable, not entered yet, deleted, failed, and pending states with explicit status or reason fields when queries or reports depend on the distinction.
188
+ - Avoid EAV or generic `entities`/`attributes`/`values` tables for core domain facts. If customer-defined fields are required, keep them in a bounded custom-field area with definitions, type validation, quotas, ownership, export semantics, and a promotion path once values drive search, sort, permission, billing, or reporting.
189
+ - Do not hide behavior-driving data in JSON. Keys used for filters, ordering, joins, uniqueness, permissions, tenant scope, status, retention, money, dates, quotas, indexes, or operational dashboards should be typed columns, child tables, or generated/computed columns with a migration path. Use `database-json-modeling-review` when JSON is part of the diff.
190
+ - Keep tenant ownership close to the owned row when tenant-scoped operations, billing, audit, export, restore, delete, or performance matter. B2B products should usually separate global users from tenant memberships, roles, invitations, entitlements, and billing records.
176
191
  - Treat deletion as lifecycle when recovery, audit, search behavior, support handling, or retention matters. Consider `deleted_at`, `deleted_by`, `delete_reason`, `restored_at`, `restored_by`, and `purge_after` instead of a lone boolean or timestamp.
177
192
  - Separate business records that should be soft-deleted or archived from personal data that should be anonymized, purged, or retained under a narrower legal rule.
193
+ - Keep status history for states that affect money, access, fulfillment, support, compliance, or user-visible commitments. A current status alone rarely explains who changed it, why, under which request, and whether a late webhook, retry, or admin action should still apply.
194
+ - Add optimistic versioning or conditional updates when two users, admins, workers, or webhooks can edit the same important row. Last-write-wins is usually data loss unless the product explicitly accepts it.
195
+ - Add operational trace fields where incident response will need them: server timestamps, actor ids, `created_by`, `updated_by`, `request_id`, `source`, import or provider reference, and safe reason codes. Do not add them blindly to every table, but do not leave high-value rows untraceable.
178
196
  - Treat mutable high-value records as versioned when reproducibility matters, such as AI prompts, documents, contracts, price policies, experiment configs, comparison data, permission policies, automation rules, and model settings. Prefer a stable parent row with a current-version pointer plus immutable version rows.
179
197
  - Use ledgers for money-like or quota-like balances, such as points, credits, inventory reservations, refunds, coupon issuance, entitlement grants, and manual adjustments. Treat cached balances as derived from ledger entries unless the local design proves otherwise.
180
198
  - For audit logs, store actor type, actor id when safe, action, target type and id, bounded before and after values, reason, request id, idempotency key, and timestamp in the same local transaction as the audited change when possible. Audit logs should be append-only to normal operators and should redact or omit personal data that is not needed to explain the change.
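
The uniqueness and versioning bullets added in this hunk can be sketched as follows. This is a minimal illustration, assuming SQLite through Python's standard `sqlite3` module; the `projects` table, the `slug_key` column, the index name, and the helper functions are hypothetical and do not come from the package.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (
    id         INTEGER PRIMARY KEY,        -- immutable internal key used for joins
    tenant_id  INTEGER NOT NULL,           -- the real owner of the slug
    slug       TEXT NOT NULL,              -- display value, kept as entered
    slug_key   TEXT NOT NULL,              -- normalized comparison key
    version    INTEGER NOT NULL DEFAULT 1, -- optimistic-locking counter
    deleted_at TEXT                        -- NULL means the row is active
);
-- Named, tenant-scoped, active-only uniqueness enforced by the database.
CREATE UNIQUE INDEX uq_projects_tenant_slug_active
    ON projects (tenant_id, slug_key)
    WHERE deleted_at IS NULL;
""")


def normalize(slug):
    """Explicit normalization rule for the comparison key (an assumption)."""
    return slug.strip().lower()


def create_project(tenant_id, slug):
    """Return the new id, or None when an active row already owns the key."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO projects (tenant_id, slug, slug_key) VALUES (?, ?, ?)",
                (tenant_id, slug, normalize(slug)),
            )
        return cur.lastrowid
    except sqlite3.IntegrityError:
        return None


def soft_delete(project_id):
    """Mark the row deleted so it leaves the active-only unique index."""
    with conn:
        conn.execute(
            "UPDATE projects SET deleted_at = datetime('now'), version = version + 1 "
            "WHERE id = ? AND deleted_at IS NULL",
            (project_id,),
        )


def rename_project(project_id, expected_version, new_slug):
    """Conditional update: applies only if the row is unchanged since it was read.

    A slug collision still raises sqlite3.IntegrityError; it is not caught here.
    """
    with conn:
        cur = conn.execute(
            "UPDATE projects SET slug = ?, slug_key = ?, version = version + 1 "
            "WHERE id = ? AND version = ? AND deleted_at IS NULL",
            (new_slug, normalize(new_slug), project_id, expected_version),
        )
    return cur.rowcount == 1
```

In this sketch a duplicate slug inside one tenant is rejected by the index rather than by an application precheck, and the same slug stays available to other tenants and to the original tenant once the earlier row is soft-deleted. A stale `expected_version` makes `rename_project` return `False` instead of silently overwriting a concurrent edit.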
@@ -318,6 +336,7 @@ Use the smallest persistence boundary that proves the risk. Do not introduce rep
318
336
  ## Postconditions
319
337
 
320
338
  - The database role and source of truth are explicit.
339
+ - Regret-prone schema shortcuts such as mutable primary keys, app-only uniqueness, unscoped tenant uniqueness, missing FK or cascade ownership, ambiguous nulls, boolean state soup, polymorphic core relations, EAV core facts, behavior-driving JSON, and user-as-tenant coupling are fixed, explicitly accepted, or reported.
321
340
  - Database rows, ORM models, generated caches, and read models do not leak into domain truth unless the local architecture intentionally owns that boundary.
322
341
  - Queries preserve authorization, tenant or user scope, deterministic ordering, expected absence behavior, and retention rules.
323
342
  - Content and resource models separate stable identity from mutable titles, slugs, URLs, translations, display fields, revisions, facts, sources, projections, and analytics dimensions when those concerns exist.
@@ -375,7 +394,7 @@ Prefer the narrowest configured test, build, docs, release, or mustflow intent t
375
394
 
376
395
  - Database role and owner
377
396
  - Affected read and write paths
378
- - Schema, constraint, and query semantics reviewed
397
+ - Schema-regret, constraint, relation, enum, JSON, custom-field, status-history, traceability, and query semantics reviewed
379
398
  - Identity, slug, lifecycle, asset, body block, taxonomy, relationship, attribute, filter URL, landing-page, translation, locale, country, currency, timezone, local-date, money, price snapshot, revision, claim, fact, source, collection, verification, comparison methodology, affiliate link, data-ownership, behavior analytics, audit log, API projection, public identifier, backup or restore, bulk update, admin audit, user-state, aggregate, cache-key, projection, and cache-invalidation checks where relevant
380
399
  - Export, import, product-owned id, provider-id mapping, relationship, permission, file, automation, event-history, and reconstruction checks where relevant
381
400
  - Authorization, tenant scope, retention, and privacy checks
@@ -2,11 +2,11 @@
2
2
  mustflow_doc: skill.database-migration-change
3
3
  locale: en
4
4
  canonical: true
5
- revision: 3
5
+ revision: 4
6
6
  lifecycle: mustflow-owned
7
7
  authority: procedure
8
8
  name: database-migration-change
9
- description: Apply this skill when database migration files, schema migration history, ORM schema migrations, generated clients, schema dumps, SQL snapshots, online DDL, large indexes, constraints, state-dependent CHECK constraints, backfills, rolling deploy compatibility, expand-and-contract changes, destructive database changes, migration rollback or roll-forward claims, cut-over plans, lock or timeout policy, replication lag risk, migration observability, or production database migration procedures are created, changed, reviewed, or reported.
9
+ description: Apply this skill when database migration files, schema migration history, ORM schema migrations, generated clients, schema dumps, SQL snapshots, online DDL, large indexes, constraints, state-dependent CHECK constraints, background-job backfills, zero-downtime migration claims, rolling deploy compatibility, expand-and-contract changes, destructive database changes, migration rollback or roll-forward claims, cut-over plans, feature-flagged read/write switches, lock or timeout policy, replication lag risk, migration observability, or production database migration procedures are created, changed, reviewed, or reported.
10
10
  metadata:
11
11
  mustflow_schema: "1"
12
12
  mustflow_kind: procedure
@@ -33,13 +33,14 @@ Keep database migrations safe for running systems by checking deploy compatibili
33
33
 
34
34
  Do not treat migration authoring as "make a file that applies locally." Treat it as "old code and new code must survive the same database during rollout."
35
35
  Migration incidents usually happen in the interval where old code, new code, old data, and new data are all alive at once. Design that interval first.
36
+ Do not collapse schema expansion, data backfill, read or write cut-over, and destructive cleanup into one deploy-time migration just because that worked on a developer database.
36
37
 
37
38
  <!-- mustflow-section: use-when -->
38
39
  ## Use When
39
40
 
40
41
  - A database migration file, migration history entry, schema dump, ORM schema, SQL snapshot, generated client, seed, fixture, schema validator, or migration documentation is created or changed.
41
42
  - A change adds, removes, renames, splits, merges, backfills, rewrites, validates, constrains, indexes, foreign-keys, type-changes, defaults, nullable rules, enum values, tables, columns, generated columns, triggers, views, functions, row-level policies, or data migrations.
42
- - A task mentions rolling deploy, expand-and-contract, online migration, backfill, production schema change, rollback, roll-forward, down migration, migration lock, lock timeout, statement timeout, DDL transaction, `CREATE INDEX CONCURRENTLY`, MySQL `ALGORITHM=INSTANT`, MySQL `LOCK=NONE`, generated ORM client, migration drift, schema drift, or database migration safety.
43
+ - A task mentions rolling deploy, zero-downtime migration, expand-and-contract, online migration, long-running migration, background job backfill, feature flag migration, dual-write, dual-read, compatibility read fallback, production schema change, rollback, roll-forward, down migration, migration lock, lock timeout, statement timeout, DDL transaction, `CREATE INDEX CONCURRENTLY`, MySQL `ALGORITHM=INSTANT`, MySQL `LOCK=NONE`, generated ORM client, migration drift, schema drift, or database migration safety.
43
44
  - Prisma, Drizzle, TypeORM, Rails Active Record, Django migrations, Alembic, Diesel, Ecto, Flyway, Liquibase, Knex, Sequelize, SQLx, or another migration tool changes schema, generated output, migration metadata, or deployment behavior.
44
45
  - A final report claims a database migration is safe, reversible, applied, validated, production-ready, no-downtime, rollback-safe, or tested from an old schema.
45
46
 
@@ -58,6 +59,8 @@ Migration incidents usually happen in the interval where old code, new code, old
58
59
  - Deployment shape: single-step deploy, rolling deploy, blue-green, multiple app versions, background workers, read replicas, multiple services, serverless functions, mobile clients, or external integrations.
59
60
  - Database engine and operational surface: PostgreSQL, MySQL, SQLite, SQL Server, managed database, migration lock behavior, DDL transaction behavior, online DDL options, table size, write load, long-running transactions, replication or CDC topology, expected lock time, statement timeout, lock timeout, and restore capability when known.
60
61
  - Data preservation needs, compatibility window, backfill size, batch strategy, cursor or checkpoint marker, validation query, observability query, rollback or roll-forward type, cut-over control, and whether old code can run after the new schema lands.
62
+ - Application transition controls: feature flags, tenant gates, read fallback, dual-write window, old-write cutoff, old-read cutoff, worker rollout order, admin/reporting/BI dependency review, and how to disable the new path without restoring the database.
63
+ - Production runbook boundary: execution owner, intended window, expected lock time, expected replication lag, metrics to watch, stop or pause thresholds, retry policy, partial-apply handling, customer-impact communication trigger, and manual approval points when relevant.
61
64
  - State and timestamp invariant matrix when a migration introduces lifecycle statuses, terminal
62
65
  timestamps, retry or dead-letter states, delivery states, soft-delete states, approval states, or
63
66
  other columns whose valid nullability depends on status.
@@ -78,6 +81,7 @@ Migration incidents usually happen in the interval where old code, new code, old
78
81
 
79
82
  - Update migration files, ORM schema files, generated client expectations, schema dumps, SQL snapshots, seeds, fixtures, compatibility code, backfill code, validation checks, docs, and tests directly required by the migration.
80
83
  - Prefer expand-and-contract for live systems: add compatible shape, dual-write or compatibility-read where needed, backfill safely, switch reads and writes, then contract only after compatibility is proven.
84
+ - Move long-running data rewrites out of deploy-time schema migrations into bounded, restartable background jobs when production-sized data or live traffic can be affected.
81
85
  - Keep destructive cleanup separate from expansion unless the repository explicitly proves a single-step deployment is safe.
82
86
  - Do not weaken tests, delete migration history, hand-edit generated client output, suppress migration drift, or claim rollback safety for lossy changes.
83
87
 
@@ -93,19 +97,21 @@ Migration incidents usually happen in the interval where old code, new code, old
93
97
  - Django: migration files, state operations, historical models, schema editor behavior, generated SQL when relevant, and data migration functions.
94
98
  - Alembic or SQLAlchemy: migration revisions, autogenerate output, branch heads, model metadata, downgrade functions, naming conventions, and generated SQL.
95
99
  - Diesel, Ecto, Flyway, Liquibase, Knex, Sequelize, SQLx, and raw SQL: migration history, checked-in SQL, generated metadata, compile-time query checks, rollback files, and schema dumps.
96
- 3. Build a migration ledger: old shape, new shape, rows affected, old code behavior, new code behavior, rollback expectation, generated artifact changes, dependent callers, and validation query.
100
+ 3. Build a migration ledger: old shape, new shape, rows affected, old code behavior, new code behavior, worker and batch behavior, admin/reporting/BI behavior, rollback expectation, generated artifact changes, dependent callers, and validation query.
97
101
  4. Classify compatibility.
98
102
  - Old code on old schema.
99
103
  - Old code on expanded schema.
100
104
  - New code on expanded schema.
101
105
  - New code after backfill.
102
106
  - New code after contract.
107
+ - Old background workers, cron jobs, admin tools, reporting queries, and external integrations during the same window.
103
108
  If any required state fails, the migration is not rolling-deploy safe.
104
109
  5. Split the deployment plan into expand, backfill, switch, and contract phases.
105
110
  - Expansion adds shapes old code can ignore and new code can start writing.
106
- - Backfill is bounded, restartable, idempotent, observable, and separately validated.
107
- - Switch changes read paths through a feature flag, rollout gate, tenant gate, or compatible deploy step where possible.
111
+ - Backfill is bounded, restartable, idempotent, observable, separately validated, and separated from the deployment pipeline when it can run long.
112
+ - Switch changes read and write paths through a feature flag, rollout gate, tenant gate, or compatible deploy step where possible.
108
113
  - Contract removes old shapes only after at least one compatibility window proves no code, job, report, or manual SQL still depends on them.
114
+ - A single migration file that expands, rewrites data, flips reads, and drops old structures is not zero-downtime evidence unless the repository proves the single-step path is safe for its deployment model.
109
115
  6. For column add, decide nullability, default behavior, backfill strategy, write path, read fallback, index need, and when a future `NOT NULL` or constraint can be enforced.
110
116
  - Add nullable first unless a proven engine/version/table-size path makes the non-null default safe.
111
117
  - Do not assume a database default backfills existing rows or matches ORM, API, batch, or application defaults.
@@ -139,6 +145,10 @@ Migration incidents usually happen in the interval where old code, new code, old
139
145
  - Partition attach can scan existing rows unless a suitable `CHECK` constraint proves the range first.
140
146
  - Table split, table merge, or relationship rewrite must preserve stable identifiers, foreign keys, audit references, external IDs, permissions, search documents, exports, and old-to-new mapping until all callers switch.
141
147
  15. For backfills, make them bounded, restartable, observable, and validated. Define batch size, cursor-based ordering key such as `id > last_id`, checkpoint, retry behavior, idempotency, timeout, lock expectation, throttle or pause/resume control, dead-letter or manual review behavior, and validation queries.
148
+ - Keep long-running data rewrites out of deploy-time migrations unless the affected row count, lock behavior, WAL/binlog or undo impact, replication lag, and timeout behavior prove the operation is short and bounded.
149
+ - Commit in small batches instead of one huge transaction when live data volume can be large.
150
+ - Process only rows that still need work, so reruns and retries cannot corrupt already migrated rows.
151
+ - Track progress with a durable cursor or checkpoint; do not rely on offset pagination for mutable production tables.
142
152
  16. Do not run or recommend full-table updates on production-sized data without measured volume, lock expectation, WAL or undo impact, replication lag risk, batch plan, timeout policy, and recovery plan.
143
153
  17. Review replication, CDC, and long-running transaction interactions.
144
154
  - Online DDL can leave replicas, read traffic, backups, CDC connectors, or failover readiness behind even when the primary looks healthy.
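
The backfill rules in step 15 (small committed batches, an `id > last_id` cursor, a durable checkpoint, and touching only rows that still need work) can be sketched as below. This is a rough illustration, assuming SQLite through Python's `sqlite3` module; the `users` table, the `email_key` column, and the `backfill_checkpoint` table are hypothetical names, and throttling, timeouts, and failure reporting are left out.

```python
import sqlite3


def backfill_email_key(conn, batch_size=1000):
    """Fill users.email_key in small batches; return the number of rows updated."""
    # Durable checkpoint so a restarted job resumes instead of starting over.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS backfill_checkpoint ("
        "name TEXT PRIMARY KEY, last_id INTEGER NOT NULL)"
    )
    row = conn.execute(
        "SELECT last_id FROM backfill_checkpoint WHERE name = 'email_key'"
    ).fetchone()
    last_id = row[0] if row else 0
    updated = 0

    while True:
        # Keyset cursor (id > last_id) instead of OFFSET pagination.
        ids = [
            r[0]
            for r in conn.execute(
                "SELECT id FROM users WHERE id > ? ORDER BY id LIMIT ?",
                (last_id, batch_size),
            )
        ]
        if not ids:
            return updated

        with conn:  # one short transaction per batch
            cur = conn.execute(
                "UPDATE users SET email_key = lower(trim(email)) "
                "WHERE id > ? AND id <= ? AND email_key IS NULL",  # only rows still needing work
                (last_id, ids[-1]),
            )
            updated += cur.rowcount
            last_id = ids[-1]
            # The checkpoint commits together with the batch it describes.
            conn.execute(
                "INSERT OR REPLACE INTO backfill_checkpoint (name, last_id) "
                "VALUES ('email_key', ?)",
                (last_id,),
            )
```

Because each batch commits with its checkpoint and the `UPDATE` skips rows that already have a value, rerunning the function after a crash or a pause should not rewrite migrated rows. A production job would also need the pause/resume control, batch timing logs, and validation queries that the step calls for.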
@@ -150,14 +160,17 @@ Migration incidents usually happen in the interval where old code, new code, old
150
160
  - Monitor dual-write mismatch and sample old/new values during the compatibility window; code intent is not proof that every path writes both sides.
151
161
  19. Prepare observability before apply.
152
162
  - Pair the migration with read-only progress and safety queries for lock waits, index build progress, replication lag, backfill cursor, skipped rows, failed rows, duplicate rows, missing rows, dead tuples, or estimated remaining range when the engine supports them.
163
+ - Watch application error rate, p95 or p99 latency, connection pressure, fallback-read rate, dual-write mismatch rate, and critical business event failures when the migration changes a live request path.
153
164
  - Log or report dry-run selection counts, apply counts, skip reasons, batch durations, and recovery handles.
154
165
  - A final `done` line is not enough evidence for a live migration.
166
+ - Prepare a runbook before apply. It should name the operator, execution window, expected duration, expected lock and replication behavior, stop thresholds, pause or abort action, partial-apply behavior, code rollback order, feature-flag fallback, validation queries, and customer-impact communication trigger.
155
167
  20. Decide rollback honestly and prefer roll-forward for partial live changes.
156
168
  - Reversible: schema-only and data-preserving.
157
169
  - App rollback: old and new code both tolerate the expanded shape, so the read path can move back without losing new writes.
158
170
  - Forward-fix preferred: partial live migration can be corrected without restoring.
159
171
  - Restore required: deletes, table merges, generated IDs, hashing, encryption, irreversible type conversions, external side effects, or lossy transforms.
160
172
  Do not promise rollback for changes that cannot reconstruct old values.
173
+ Treat backups as disaster recovery evidence, not ordinary deploy rollback, unless a restore drill proves that restoring the database would lose no more live writes, external side effects, or dependent service state than the product can accept.
161
174
  21. Keep external side effects out of database migrations unless the repository has an explicit recovery model. Sending emails, calling payment APIs, deleting files, or mutating external providers from a migration usually breaks rollback.
162
175
  22. Check generated surfaces after schema changes: ORM clients, types, SQL snapshots, schema dumps, OpenAPI or GraphQL projections, API mocks, fixtures, seeds, admin screens, analytics, ETL, BI queries, and docs examples.
163
176
  23. Review ORM-specific traps.
@@ -180,11 +193,13 @@ Migration incidents usually happen in the interval where old code, new code, old
180
193
  - Source schema, target schema, migration files, generated artifacts, schema dumps, seeds, fixtures, and dependent code agree.
181
194
  - Expand, backfill, switch, and contract phases are separated or explicitly proven unnecessary.
182
195
  - Old-code/new-schema and new-code/expanded-schema compatibility is classified.
196
+ - Read-path fallback, write-path transition, dual-write mismatch detection, feature-flag control, and old worker/admin/reporting dependency review are explicit when a live rollout can overlap versions.
183
197
  - Backfill and validation behavior is cursor-based or otherwise bounded, restartable, idempotent, observable, and checkable where relevant.
184
198
  - State-dependent CHECK constraints, terminal timestamp exclusivity, and valid nullability matrices
185
199
  are explicit where status columns can otherwise contradict timestamp or reason columns.
186
200
  - Lock levels, online DDL support, long-running transaction waits, replication lag, cut-over control, timeout policy, and observability queries are explicit where production data may be affected.
187
201
  - Rollback claims distinguish schema rollback, data rollback, app rollback, roll-forward, forward-fix, and restore-required cases.
202
+ - Production runbook stop thresholds, pause or abort behavior, partial-apply handling, and communication triggers are explicit where the migration can affect live service behavior.
188
203
  - Destructive changes and production lock risks are either deferred, measured, guarded, or reported as remaining risk.
189
204
 
190
205
  <!-- mustflow-section: verification -->
@@ -212,6 +227,8 @@ Prefer configured migration dry-run, generated-output, schema-diff, or database
212
227
  - If online DDL support, long-running transaction behavior, replication lag, or cut-over control is unknown, report the migration as operationally unproven.
213
228
  - If an autogenerator proposes drop/create for a rename, stop and rewrite the migration plan.
214
229
  - If a migration is lossy, do not claim rollback beyond restore or forward corrective migration.
230
+ - If rollback depends only on a backup restore, label it disaster recovery instead of deploy rollback and report live-write loss or external-state reconciliation risk.
231
+ - If the migration plan lacks feature-flag fallback, read/write cut-over order, stop thresholds, or partial-apply handling for a live rollout, do not call it zero-downtime.
215
232
  - If a backfill is not idempotent, restartable, observable, and throttled or bounded, keep it out of a production migration claim.
216
233
  - If generated clients or schema dumps drift, fix the source of truth and regenerated surfaces together.
217
234
  - If configured verification is missing, report the missing command intent instead of inferring package-manager, ORM, or migration-tool commands.
@@ -224,7 +241,8 @@ Prefer configured migration dry-run, generated-output, schema-diff, or database
224
241
  - Source schema, target schema, and migration phase
225
242
  - Old-code/new-schema and new-code/expanded-schema compatibility
226
243
  - Expand/backfill/switch/contract plan and destructive cleanup timing
227
- - Backfill cursor, idempotency, throttle, pause/resume, validation, lock, timeout, replication, cut-over, and observability classification
244
+ - Read/write transition, feature-flag fallback, dual-write or compatibility-read window, and old worker/admin/reporting dependency review
245
+ - Backfill cursor, idempotency, throttle, pause/resume, validation, lock, timeout, replication, cut-over, runbook stop threshold, and observability classification
228
246
  - Status, timestamp, CHECK constraint, and existing-row validation matrix where relevant
229
247
  - Rollback, app rollback, roll-forward, forward-fix, and restore-required classification
230
248
  - ORM/generated client/schema dump/snapshot surfaces synchronized