mustflow 2.107.3 → 2.108.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (42) hide show
  1. package/README.md +1 -0
  2. package/dist/cli/commands/init.js +49 -1
  3. package/dist/cli/commands/run/execution.js +7 -0
  4. package/dist/cli/commands/run/executor.js +7 -0
  5. package/dist/cli/commands/verify.js +14 -0
  6. package/dist/cli/commands/workspace.js +106 -16
  7. package/dist/cli/i18n/en.js +6 -1
  8. package/dist/cli/i18n/es.js +6 -1
  9. package/dist/cli/i18n/fr.js +6 -1
  10. package/dist/cli/i18n/hi.js +6 -1
  11. package/dist/cli/i18n/ko.js +6 -1
  12. package/dist/cli/i18n/zh.js +6 -1
  13. package/dist/cli/index.js +8 -0
  14. package/dist/cli/lib/agent-context.js +7 -0
  15. package/dist/cli/lib/repo-map.js +14 -0
  16. package/dist/cli/lib/run-plan.js +7 -0
  17. package/dist/core/change-verification.js +7 -0
  18. package/dist/core/verification-scheduler.js +7 -0
  19. package/package.json +1 -1
  20. package/schemas/README.md +3 -3
  21. package/schemas/workspace-status.schema.json +4 -2
  22. package/templates/default/common/.mustflow/config/mustflow.toml +3 -3
  23. package/templates/default/i18n.toml +61 -7
  24. package/templates/default/locales/en/.mustflow/docs/agent-workflow.md +24 -1
  25. package/templates/default/locales/en/.mustflow/skills/INDEX.md +51 -5
  26. package/templates/default/locales/en/.mustflow/skills/admin-control-plane-safety-review/SKILL.md +200 -0
  27. package/templates/default/locales/en/.mustflow/skills/ai-product-readiness-review/SKILL.md +158 -0
  28. package/templates/default/locales/en/.mustflow/skills/auth-permission-change/SKILL.md +91 -28
  29. package/templates/default/locales/en/.mustflow/skills/browser-automation-reliability-review/SKILL.md +279 -0
  30. package/templates/default/locales/en/.mustflow/skills/cli-option-contract-review/SKILL.md +147 -0
  31. package/templates/default/locales/en/.mustflow/skills/database-change-safety/SKILL.md +21 -2
  32. package/templates/default/locales/en/.mustflow/skills/database-migration-change/SKILL.md +25 -7
  33. package/templates/default/locales/en/.mustflow/skills/deployment-rollout-safety-review/SKILL.md +117 -43
  34. package/templates/default/locales/en/.mustflow/skills/frontend-component-library-review/SKILL.md +299 -0
  35. package/templates/default/locales/en/.mustflow/skills/frontend-localization-review/SKILL.md +128 -36
  36. package/templates/default/locales/en/.mustflow/skills/notification-delivery-integrity-review/SKILL.md +226 -0
  37. package/templates/default/locales/en/.mustflow/skills/payment-integrity-review/SKILL.md +34 -14
  38. package/templates/default/locales/en/.mustflow/skills/routes.toml +54 -0
  39. package/templates/default/locales/en/.mustflow/skills/small-service-platform-architecture-review/SKILL.md +273 -0
  40. package/templates/default/locales/en/.mustflow/skills/third-party-api-integration-review/SKILL.md +188 -0
  41. package/templates/default/locales/en/.mustflow/skills/website-task-friction-review/SKILL.md +139 -0
  42. package/templates/default/manifest.toml +60 -1
@@ -0,0 +1,279 @@
1
+ ---
2
+ mustflow_doc: skill.browser-automation-reliability-review
3
+ locale: en
4
+ canonical: true
5
+ revision: 1
6
+ lifecycle: mustflow-owned
7
+ authority: procedure
8
+ name: browser-automation-reliability-review
9
+ description: Apply this skill when browser automation, UI automation, Playwright, Selenium, Puppeteer, WebDriver, computer-use/browser-driving agents, visual browser verification, flaky selectors, page readiness, authentication state, CAPTCHA or anti-bot handling, rate limits, screenshot checks, retry, timeout, human approval, or browser automation observability is created, changed, reviewed, triaged, or reported.
10
+ metadata:
11
+ mustflow_schema: "1"
12
+ mustflow_kind: procedure
13
+ pack_id: mustflow.core
14
+ skill_id: mustflow.core.browser-automation-reliability-review
15
+ command_intents:
16
+ - changes_status
17
+ - changes_diff_summary
18
+ - lint
19
+ - build
20
+ - test_related
21
+ - test
22
+ - docs_validate_fast
23
+ - test_release
24
+ - mustflow_check
25
+ ---
26
+
27
+ # Browser Automation Reliability Review
28
+
29
+ <!-- mustflow-section: purpose -->
30
+ ## Purpose
31
+
32
+ Review browser automation as a stateful, evidence-producing system, not as a sequence of clicks.
33
+
34
+ The core question is: "Does the automation know what state the browser, user, page, network,
35
+ session, target data, and approval gate are in before it acts and before it claims success?" If not,
36
+ the flow will look fine in a demo and then fail under rerenders, slow CI, auth drift, anti-bot
37
+ gates, rate limits, visual noise, stale approvals, or agent hallucination.
38
+
39
+ <!-- mustflow-section: use-when -->
40
+ ## Use When
41
+
42
+ - Code, tests, docs, templates, or reviews touch browser automation, UI automation, end-to-end
43
+ harnesses, Playwright, Selenium, Puppeteer, WebDriver, browser contexts, remote browsers,
44
+ screenshots, videos, traces, HAR files, synthetic user flows, or computer-use browser agents.
45
+ - A task mentions flaky selectors, unstable locators, actionability, stale elements, rerenders,
46
+ page readiness, `networkidle`, sleeps, waits, timeouts, retries, screenshot diffs, visual checks,
47
+ popups, downloads, native dialogs, iframes, shadow DOM, virtualized lists, or input typing.
48
+ - Automation logs into a product, reuses storage state, shares accounts across workers, handles SSO,
49
+ OAuth, MFA, passkeys, cookies, localStorage, sessionStorage, IndexedDB, account lockout, or
50
+ permission changes.
51
+ - Browser automation touches third-party sites, CAPTCHA, anti-bot or WAF challenges, rate limits,
52
+ robots or terms boundaries, IP reputation, headless fingerprints, provider throttling, or manual
53
+ fallback paths.
54
+ - A browser-driving agent reads page content, follows page instructions, clicks by screenshot or
55
+ coordinates, extracts table data visually, enters forms, sends messages, purchases, deletes,
56
+ mutates external state, or asks for human approval before continuing.
57
+
58
+ <!-- mustflow-section: do-not-use-when -->
59
+ ## Do Not Use When
60
+
61
+ - The task is a pure LLM agent control-flow change with no browser or UI automation surface. Use
62
+ `agent-execution-control-review`.
63
+ - The task is only prompt, RAG, model, tool schema, cost, latency, hallucination, or eval behavior
64
+ without browser execution. Use the matching LLM or agent specialist skill.
65
+ - The task is only a product auth bug that is not being automated through a browser. Use
66
+ `auth-flow-triage` or `auth-permission-change`.
67
+ - The task is only a browser request, CORS, CDN, API, or provider failure before the browser
68
+ automation layer is relevant. Use `api-failure-triage`.
69
+ - The task is only frontend UI quality, layout resilience, accessibility, render stability, or web
70
+ performance for human users rather than automation harness reliability. Use the matching frontend
71
+ skill first.
72
+ - The task is only test-suite runtime optimization, shard balance, retry policy, or flaky-test
73
+ handling without browser-specific failure modes. Use `test-suite-performance-review` or
74
+ `test-maintenance`.
75
+
76
+ <!-- mustflow-section: required-inputs -->
77
+ ## Required Inputs
78
+
79
+ - Automation intent ledger: target site or app, owner, internal versus third-party boundary,
80
+ allowed actions, forbidden actions, expected user role, data class, write risk, and whether the
81
+ browser path is the right tool rather than an API, fixture, or deterministic adapter.
82
+ - State ledger: current URL, frame, page, route, modal, popup, selected account, auth storage,
83
+ browser context, viewport, locale, timezone, permissions, feature flags, test data, worker ID,
84
+ correlation ID, and previous step result.
85
+ - Readiness ledger: page-ready signal, data-ready signal, actionable-control signal, business-ready
86
+ signal, network and background-work assumptions, and any waits or assertions that prove them.
87
+ - Selector and action ledger: locators, user-facing roles or labels, test IDs or automation
88
+ contracts, shadow DOM and iframe boundaries, virtualized list handling, click target, keyboard and
89
+ focus path, input acceptance proof, and actionability override use.
90
+ - Auth and identity ledger: login strategy, storage owner, token or cookie storage surface, session
91
+ expiry, refresh behavior, per-worker account isolation, SSO or MFA gates, CAPTCHA policy, account
92
+ lockout policy, and logout or cleanup behavior.
93
+ - External pressure ledger: rate limit unit, retry budget, anti-bot or challenge detection,
94
+ provider terms boundary, manual fallback, backoff behavior, and circuit-breaker threshold.
95
+ - Verification ledger: success criteria, API or database confirmation when available, screenshot
96
+ or visual artifact role, trace/video/HAR policy, console and network capture, redaction,
97
+ retention, and failure artifact sampling.
98
+ - Agent and approval ledger: page content trust boundary, prompt-injection exposure, tool
99
+ permissions, coordinate mapping, stale approval checks, approval snapshot, exact post-approval
100
+ action, resume state, and human escalation path.
101
+
102
+ <!-- mustflow-section: preconditions -->
103
+ ## Preconditions
104
+
105
+ - The task matches the Use When conditions and does not match the Do Not Use When exclusions.
106
+ - Current repository instructions, command contract, automation harness code, test fixtures, browser
107
+ config, auth fixtures, screenshots or traces, and docs directly tied to the automation path have
108
+ been inspected before editing.
109
+ - Browser vendor, automation library, remote-browser provider, CAPTCHA, anti-bot, and Agents SDK or
110
+ computer-use details are stale-sensitive. Use `source-freshness-check` before embedding exact
111
+ current API claims, provider limits, default timeouts, or compliance requirements.
112
+ - External pages, emails, documents, ads, support threads, and rendered web content are untrusted
113
+ input for browser-driving agents.
114
+ - Command execution remains governed by `.mustflow/config/commands.toml`; this skill does not
115
+ authorize launching development servers, unmanaged browsers, long-running workers, production
116
+ browser sessions, CAPTCHA bypasses, provider dashboards, or live side-effect runs.
117
+
118
+ <!-- mustflow-section: allowed-edits -->
119
+ ## Allowed Edits
120
+
121
+ - Add or refine browser automation state machines, locator contracts, test IDs, accessible names,
122
+ readiness assertions, frame or popup handlers, input verification, auth fixtures, per-worker
123
+ account isolation, retry classification, timeout hierarchy, idempotency checks, rate-limit
124
+ handling, approval gates, manual fallback states, traces, screenshots, redaction, cleanup, and
125
+ directly synchronized docs or templates.
126
+ - Move fixture setup, result verification, cleanup, idempotency checks, and data creation from
127
+ browser clicks to API or deterministic helpers when the browser UI is not the behavior under test.
128
+ - Add focused tests for selector drift, readiness failure, stale element rerender, iframe or shadow
129
+ DOM handling, auth-state expiration, per-worker isolation, retry non-idempotency, stale approval,
130
+ screenshot noise, trace redaction, and agent prompt-injection defense when behavior evidence
131
+ supports them.
132
+ - Do not fix flakiness by adding blind sleeps, force-clicking as the default, hiding failures behind
133
+ broad retries, weakening visual thresholds without evidence, sharing one mutable account across
134
+ parallel workers, or claiming browser success from an unverified screenshot.
135
+ - Do not add CAPTCHA bypass, anti-bot evasion, headless fingerprint spoofing, or terms-violating
136
+ third-party automation as a normal product feature.
137
+
138
+ <!-- mustflow-section: procedure -->
139
+ ## Procedure
140
+
141
+ 1. Decide whether the browser is the right boundary. Use API, fixtures, or service adapters for data
142
+ setup, teardown, and result verification when the browser UI itself is not the behavior being
143
+ tested or automated.
144
+ 2. Classify the automation owner: internal app E2E, internal operations tool, third-party site
145
+ workflow, browser-driving LLM agent, visual regression, scraping-like extraction, support tool,
146
+ or production user-assistance flow.
147
+ 3. Define a state machine before actions. Name the states such as unauthenticated, authenticated,
148
+ searching, selecting target, filling form, awaiting approval, submitting, verifying result,
149
+ retrying, blocked by challenge, manual fallback, succeeded, and failed.
150
+ 4. Replace sleeps with readiness evidence. For each step, define what proves the page is ready, the
151
+ data is ready, the target control is actionable, and the business state is safe to advance.
152
+ 5. Treat `networkidle` and selector-visible waits as weak signals. Prefer domain assertions such as
153
+ expected row identity, enabled submit state, loaded data count, settled validation, known URL,
154
+ confirmation ID, provider event, or backend result.
155
+ 6. Review locator contracts. Prefer stable user-facing roles, labels, names, and explicit test IDs
156
+ over CSS layout paths, generated classes, index-based XPath, translated prose only, or first-match
157
+ selectors.
158
+ 7. Check ambiguous DOM. Handle hidden duplicate controls, responsive desktop and mobile DOM at the
159
+ same time, skeletons that resemble real content, virtualized rows, portals, sticky overlays,
160
+ cookie banners, focus traps, iframes, cross-origin frames, shadow DOM, and custom components.
161
+ 8. Avoid stale element handles. Re-resolve locators at action time, and keep find-check-act-verify
162
+ close together so rerenders cannot invalidate old DOM references silently.
163
+ 9. Review actionability honestly. A forced click, coordinate click, JS-dispatched event, or disabled
164
+ actionability check must be exceptional, documented, and followed by proof that a real user path
165
+ is not being bypassed.
166
+ 10. Verify input acceptance. After typing, pasting, selecting dates, entering currency, using IME,
167
+ triggering autocomplete, or blurring a field, confirm the stored value, validation state, submit
168
+ readiness, or outbound payload rather than assuming keystrokes were accepted.
169
+ 11. Make auth state explicit. Identify whether auth lives in cookies, localStorage, sessionStorage,
170
+ IndexedDB, memory, or provider redirects; isolate accounts by worker; avoid shared mutable user
171
+ state; and handle expiry, rotation, SSO, MFA, passkeys, lockout, and logout contamination.
172
+ 12. Treat CAPTCHA and anti-bot as product states. In test or staging, use allowed test keys,
173
+ allowlists, or disabled challenge paths. In production or third-party flows, detect challenges,
174
+ stop safely, and route to human review or manual fallback instead of trying to evade them.
175
+ 13. Add rate control before retries. Identify the rate-limit subject, whether a single browser action
176
+ fans out into many requests, how backoff is computed, when to stop, and how the system avoids a
177
+ retry storm.
178
+ 14. Classify retryable failures. Retry only transient navigation, detached element, timeout,
179
+ temporary backend, or eventual-consistency classes within a bounded budget. Do not retry
180
+ permission denied, invalid input, CAPTCHA, account lockout, provider policy blocks, unknown
181
+ write outcome, or business-rule failures without a recovery-specific check.
182
+ 15. Make writes idempotent or confirm-before-replay. For purchases, payments, deletes, sends,
183
+ refunds, admin changes, support actions, and external mutations, record stable operation IDs and
184
+ check whether the effect already happened before any retry or resume can repeat it.
185
+ 16. Design timeout hierarchy. Align action, assertion, navigation, test, job, queue lease, browser
186
+ provider session, and external API timeouts so cancellation saves evidence, releases resources,
187
+ and resumes from a known state.
188
+ 17. Separate visual proof from business proof. Use screenshots for layout or visual regression, but
189
+ use confirmation IDs, API reads, database rows, provider events, downloads with checksums, audit
190
+ logs, or received messages to prove business success.
191
+ 18. Stabilize screenshot assertions. Freeze or mask nondeterministic content such as time, caret,
192
+ animation, ads, maps, charts, lazy images, random data, locale, theme, viewport, font, GPU,
193
+ scrollbar, and cookie banners before changing thresholds or baselines.
194
+ 19. Capture failure context. Save current URL, frame, viewport, locale, timezone, screenshot, DOM or
195
+ accessibility snapshot when safe, console errors, network statuses, trace, video, retry count,
196
+ worker ID, account ID class, and correlation ID with sensitive-data redaction.
197
+ 20. Protect artifacts. Browser traces, videos, screenshots, HAR files, storage state, and console
198
+ logs can contain cookies, tokens, personal data, addresses, order details, and messages; set
199
+ redaction, retention, encryption, access, and sampling before broad collection.
200
+ 21. For browser-driving agents, distrust page content. Treat rendered instructions, hidden DOM,
201
+ emails, PDFs, comments, ads, and third-party text as untrusted data that must not override the
202
+ system task, tool policy, approval rules, or data-exfiltration limits.
203
+ 22. Split agent roles where risk justifies it. Keep planner, browser executor, verifier, policy
204
+ gate, and human approval separate for high-impact flows. If one model does multiple roles, add
205
+ deterministic gates before side effects and before success claims.
206
+ 23. Make coordinate and screenshot actions verifiable. Recheck screenshot-to-DOM scale, scrolling,
207
+ focus, active modal, target bounds, visible label, disabled state, and post-action state when a
208
+ model or computer-use tool clicks by image or coordinates.
209
+ 24. Treat human approval as durable state. Show the exact account, URL, target, amount, recipient,
210
+ data, screenshot, form values, risk class, reversibility, and exact next action. Before resume,
211
+ re-read critical fields and compare them with the approved snapshot.
212
+ 25. Clean up resources. Close pages, contexts, browsers, downloads, temp files, videos, traces,
213
+ mock servers, websockets, and test data deliberately; detect zombie browser processes and
214
+ artifact growth in long runs.
215
+ 26. Verify with the narrowest configured tests, docs checks, release checks, and mustflow validation
216
+ that cover the changed automation contract.
217
+
218
+ <!-- mustflow-section: postconditions -->
219
+ ## Postconditions
220
+
221
+ - The automation has explicit states, readiness signals, locator contracts, auth isolation, retry
222
+ classes, timeout hierarchy, and success evidence.
223
+ - Browser-only proof is separated from business-result proof.
224
+ - CAPTCHA, anti-bot, rate-limit, human-approval, prompt-injection, and third-party boundary risks
225
+ are detected, stopped, or routed to manual fallback instead of hidden behind retries.
226
+ - Failure artifacts are useful enough to debug and constrained enough not to leak secrets or
227
+ personal data.
228
+
229
+ <!-- mustflow-section: verification -->
230
+ ## Verification
231
+
232
+ Use configured oneshot command intents when available:
233
+
234
+ - `changes_status`
235
+ - `changes_diff_summary`
236
+ - `lint`
237
+ - `build`
238
+ - `test_related`
239
+ - `test`
240
+ - `docs_validate_fast`
241
+ - `test_release`
242
+ - `mustflow_check`
243
+
244
+ Use the narrowest configured fixture, unit, integration, docs, package, or release check that proves
245
+ the changed browser automation contract. Do not infer raw browser launches, dev servers, headed
246
+ browsers, provider dashboards, CAPTCHA-solving services, or production automation runs from local
247
+ files.
248
+
249
+ <!-- mustflow-section: failure-handling -->
250
+ ## Failure Handling
251
+
252
+ - If the failure is not localized to browser automation, use `api-failure-triage`,
253
+ `auth-flow-triage`, `frontend-render-stability`, `test-maintenance`, or another narrower skill
254
+ first.
255
+ - If a selector is flaky, do not patch only the selector string until locator ownership, duplicate
256
+ DOM, responsive DOM, skeletons, frames, shadow DOM, and readiness have been checked.
257
+ - If a retry would replay an unknown write, stop and add idempotency or effect-confirmation before
258
+ enabling retry.
259
+ - If CAPTCHA, anti-bot, account lockout, provider policy, or terms boundaries are detected, stop the
260
+ automation path and report the manual or contractual fallback instead of bypassing it.
261
+ - If human approval resumes after state changed, expire the approval or request a new approval with
262
+ the changed fields.
263
+ - If artifacts would leak secrets or personal data, collect a smaller redacted evidence set and
264
+ report the observability gap.
265
+ - If a configured command fails, use `failure-triage` before continuing.
266
+
267
+ <!-- mustflow-section: output-format -->
268
+ ## Output Format
269
+
270
+ - Browser automation surface reviewed
271
+ - Browser-versus-API boundary and automation owner
272
+ - State machine, readiness, locator, actionability, auth, rate-limit, retry, timeout, and
273
+ idempotency decisions
274
+ - Screenshot, trace, artifact, redaction, and business-success evidence
275
+ - Agent page-content trust, coordinate action, tool permission, approval, and resume checks
276
+ - Files changed
277
+ - Command intents run
278
+ - Skipped checks and reasons
279
+ - Remaining browser automation reliability risk
@@ -0,0 +1,147 @@
1
+ ---
2
+ mustflow_doc: skill.cli-option-contract-review
3
+ locale: en
4
+ canonical: true
5
+ revision: 1
6
+ lifecycle: mustflow-owned
7
+ authority: procedure
8
+ name: cli-option-contract-review
9
+ description: Apply this skill when CLI options, flags, positional arguments, aliases, defaults, parser behavior, prompt controls, config or environment precedence, or automation-facing argument contracts are created, changed, reviewed, or reported.
10
+ metadata:
11
+ mustflow_schema: "1"
12
+ mustflow_kind: procedure
13
+ pack_id: mustflow.core
14
+ skill_id: mustflow.core.cli-option-contract-review
15
+ command_intents:
16
+ - changes_status
17
+ - changes_diff_summary
18
+ - test_related
19
+ - docs_validate_fast
20
+ - test_release
21
+ - mustflow_check
22
+ ---
23
+
24
+ # CLI Option Contract Review
25
+
26
+ <!-- mustflow-section: purpose -->
27
+ ## Purpose
28
+
29
+ Preserve the contract between CLI syntax and the humans, scripts, CI jobs, shells, terminals, config files, and docs that depend on it.
30
+
31
+ CLI options are public API. A convenient flag can still be unsafe if it collides with existing shorthand, hides destructive behavior behind a vague name, prompts in CI, writes to stdout when scripts expect JSON, or turns a path, format, selector, or environment into an ambiguous value.
32
+
33
+ <!-- mustflow-section: use-when -->
34
+ ## Use When
35
+
36
+ - A command adds, removes, renames, aliases, deprecates, validates, or changes a flag, option, positional argument, variadic argument, default value, inherited global flag, or option parser rule.
37
+ - A task designs or reviews standard CLI controls such as dry-run, check, plan, diff, yes, force, confirm, no-input, interactive, verbose, quiet, debug, format, output, color, pager, progress, config, profile, env, timeout, retry, jobs, cache, stdin, token, endpoint, region, project, pagination, target, prune, rollback, or AI-agent permission flags.
38
+ - A command changes prompt behavior, TTY behavior, non-interactive behavior, CI behavior, option terminator support, repeated flags, boolean negation, duration or size parsing, path handling, glob handling, stdin handling, or list parsing.
39
+ - A final report claims that CLI options are safe, automatable, compatible, conventional, discoverable, or aligned with docs and tests.
40
+
41
+ <!-- mustflow-section: do-not-use-when -->
42
+ ## Do Not Use When
43
+
44
+ - The task changes only stdout, stderr, JSON fields, JSONL packets, exit codes, color rendering, progress output, warning text, error text, or help wording without changing option or argument semantics. Use `cli-output-contract-review`.
45
+ - The task changes only public JSON, JSONL, schema-backed reports, or machine-readable stdout and stderr contracts. Use `public-json-contract-change`.
46
+ - The task changes only `.mustflow/config/commands.toml` command intents or command authority. Use `command-contract-authoring`.
47
+ - The task changes only environment variables, secrets, config keys, feature flags, or runtime/build-time exposure. Use `config-env-change`.
48
+ - The task changes only docs prose that mentions an unchanged command syntax. Use the matching docs skill.
49
+
50
+ <!-- mustflow-section: required-inputs -->
51
+ ## Required Inputs
52
+
53
+ - The affected command, command tree, parser library or command router, inherited global flags, positional arguments, variadic arguments, current aliases, defaults, validation rules, and help metadata.
54
+ - Existing docs, README snippets, examples, tests, snapshots, fixtures, shell completions, schemas, template copies, package tests, and release notes that mention the syntax.
55
+ - The operation type: read-only, planning, validation, write, destructive write, remote write, deploy, migration, deletion, cleanup, generated-file write, or AI-agent action.
56
+ - The intended consumers: humans at a TTY, scripts, CI jobs, package tests, shell completion users, remote APIs, installed templates, release automation, or downstream wrappers.
57
+ - Current config and environment precedence, including config files, profiles, env vars, CLI flags, defaults, and explicit override rules.
58
+ - Current non-interactive, prompt, color, pager, progress, timeout, retry, cache, lock, and exit-code expectations when they exist.
59
+ - Relevant command-intent entries for related tests, docs validation, release checks, and mustflow validation.
60
+
61
+ <!-- mustflow-section: preconditions -->
62
+ ## Preconditions
63
+
64
+ - The task matches the Use When conditions and does not match the Do Not Use When exclusions.
65
+ - Existing command syntax, aliases, docs examples, tests, and parser behavior have been inspected before changing or recommending a flag.
66
+ - Short flags are treated as scarce public API. Do not assign them from generic CLI advice without checking collisions, command frequency, and established project conventions.
67
+ - External articles, AI summaries, package defaults, and other CLIs are evidence only. The repository's current parser, command contract, compatibility policy, and user instructions remain authoritative.
68
+ - Command execution remains governed by `.mustflow/config/commands.toml`; this skill does not authorize raw command execution.
69
+
70
+ <!-- mustflow-section: allowed-edits -->
71
+ ## Allowed Edits
72
+
73
+ - Update CLI parser code, command metadata, help text, completions, docs examples, tests, fixtures, schemas, template copies, and release-sensitive package metadata that describe the same option contract.
74
+ - Add explicit long flags, validation errors, compatibility aliases, deprecation notices, negative tests, or parser edge-case tests when they reduce ambiguity.
75
+ - Prefer clear long options over clever short aliases. Add a short option only when it is frequent, unambiguous, and consistent with existing command conventions.
76
+ - Do not merge different safety meanings into one flag. For example, prompt acceptance, safety bypass, preview, destructive overwrite, and non-interactive failure should remain separable.
77
+ - Do not introduce unsafe defaults, vague automation flags, broad bypass flags, hidden prompts, or silent output-mode changes.
78
+ - Do not add parser behavior that breaks paths beginning with a dash, negative numbers, option terminators, repeated values, or non-interactive scripts unless that incompatibility is intentional and documented.
79
+
80
+ <!-- mustflow-section: procedure -->
81
+ ## Procedure
82
+
83
+ 1. Inventory the command syntax: subcommands, positional arguments, variadic arguments, options, inherited global flags, aliases, defaults, environment variables, config files, and generated completions.
84
+ 2. Classify each option by role: safety and preview, confirmation and prompts, output and formatting, logging and diagnostics, config and environment, selection and filtering, file input and output, remote endpoint and auth, performance and cache, concurrency and locking, CI automation, destructive lifecycle, or AI-agent authority.
85
+ 3. Decide whether the behavior belongs in a subcommand, positional argument, option, config key, environment variable, or separate command. Destructive lifecycle changes often deserve explicit verbs rather than a broad boolean flag.
86
+ 4. Review naming collisions before adding names. Pay special attention to common conflicts such as verbose versus version, force versus file, dry-run versus debug or delete or directory, output format versus output path, interactive versus input, and shorthand reused differently across subcommands.
87
+ 5. Separate near-neighbor semantics. `--yes` accepts prompts; `--force` bypasses a safety guard; `--dry-run` avoids writes; `--check` reports whether change is needed; `--diff` shows the proposed change; `--output` should mean a destination only if format uses another name such as `--format`.
88
+ 6. Prefer explicit paired controls for risky workflows: dry-run, plan, diff, check, validate, no-input, confirm, yes, force, no-clobber, overwrite, backup, rollback, atomic, lock-timeout, fail-fast, and continue-on-error.
89
+ 7. Check non-interactive behavior. Prompts should be TTY-only; `--no-input` should fail instead of waiting; CI-oriented paths should be compatible with quiet, JSON, no-color, no-progress, no-pager, timeout, wait, and detailed exit-code behavior when the repository supports those controls.
90
+ 8. Check human and machine output interaction. If an option changes output format, route machine-readable results and diagnostics consistently, and use `cli-output-contract-review` or `public-json-contract-change` for the output contract details.
91
+ 9. Define config and environment precedence. Document and test whether CLI flags override environment variables, profiles, config files, defaults, and inline `--set` style overrides.
92
+ 10. Review parser edge cases: `--` option terminator, paths beginning with `-`, negative numbers, repeated flags, comma-separated lists versus repeated values, boolean negation with `--no-*`, optional values, duration and size units, shell quoting, globs, symlinks, hidden files, recursive flags, and stdin markers.
93
+ 11. Check file and generation behavior. Separate input path, output path, output directory, create-dirs, overwrite, no-clobber, backup, atomic write, recursive traversal, hidden files, symlink following, ignore files, and validation-only modes.
94
+ 12. Check remote and SaaS behavior when relevant. Separate endpoint URL, region, account, project, token source, token stdin, CA or proxy settings, connect timeout, read timeout, pagination, query filters, and retries.
95
+ 13. Check infra or deploy behavior when relevant. Separate plan, apply, refresh, target, replace, prune, rollback, lock, lock-timeout, wait, parallelism, and detailed-exit-code semantics.
96
+ 14. Check AI-agent behavior when relevant. Separate model, prompt source, context include or exclude, max files, max bytes, write permissions, command permissions, network permissions, approval policy, checkpoint, dry-run, diff, and apply.
97
+ 15. Preserve compatibility. For renamed or split flags, consider aliases, deprecation warnings, migration help, and tests before removing old syntax. Treat breaking option removals, changed defaults, changed prompt behavior, and changed parser grammar as public API changes.
98
+ 16. Synchronize every surface that teaches or consumes the syntax: parser code, help text, completions, docs, README, examples, tests, fixtures, schemas, templates, package metadata, and release notes when applicable.
99
+ 17. Verify with the narrowest configured related tests first, then docs, release, template, and mustflow checks when syntax, docs, profiles, templates, or package metadata changed.
100
+
101
+ <!-- mustflow-section: postconditions -->
102
+ ## Postconditions
103
+
104
+ - Option names, aliases, defaults, parser behavior, config precedence, prompt behavior, and non-interactive behavior are explicit and synchronized.
105
+ - Short flags have a documented reason or are omitted in favor of clear long flags.
106
+ - Destructive, write, preview, confirmation, force, and non-interactive controls are not conflated.
107
+ - Automation-facing use has stable output-mode, no-prompt, no-color, no-progress, no-pager, timeout, retry, and exit-code behavior when relevant.
108
+ - Parser edge cases are covered by tests or reported as remaining risk.
109
+
110
+ <!-- mustflow-section: verification -->
111
+ ## Verification
112
+
113
+ Use configured oneshot command intents when available:
114
+
115
+ - `changes_status`
116
+ - `changes_diff_summary`
117
+ - `test_related`
118
+ - `docs_validate_fast`
119
+ - `test_release`
120
+ - `mustflow_check`
121
+
122
+ Use broader configured tests when option parsing is cross-cutting or no narrower related test covers the syntax.
123
+
124
+ <!-- mustflow-section: failure-handling -->
125
+ ## Failure Handling
126
+
127
+ - If an option name conflicts with existing syntax, keep the old contract and choose a clearer long option unless a breaking change is intentionally routed through compatibility and versioning.
128
+ - If a parser edge case cannot be verified directly, add focused coverage or report the missing coverage before claiming safety.
129
+ - If docs, help text, completions, or templates cannot be synchronized in the same change, avoid claiming the option contract is installed or documented.
130
+ - If non-interactive behavior is unclear, default to failing safely rather than prompting, writing, deleting, or assuming consent.
131
+ - If an external recommendation conflicts with repository conventions, document the rejected recommendation and the repository-specific reason.
132
+ - If a breaking option change is intentional, route the version impact through the repository versioning policy and report affected consumers.
133
+
134
+ <!-- mustflow-section: output-format -->
135
+ ## Output Format
136
+
137
+ - CLI command and options reviewed
138
+ - Option role classification and naming decision
139
+ - Short and long flag collision review
140
+ - Safety, preview, destructive, prompt, and non-interactive controls
141
+ - Parser edge cases checked or reported missing
142
+ - Config and environment precedence
143
+ - Human, machine, CI, color, pager, progress, timeout, retry, and exit-code interaction
144
+ - Docs, help, completions, tests, schemas, templates, and package metadata synchronized
145
+ - Command intents run
146
+ - Skipped checks and reasons
147
+ - Remaining CLI-option contract risk
@@ -2,7 +2,7 @@
2
2
  mustflow_doc: skill.database-change-safety
3
3
  locale: en
4
4
  canonical: true
5
- revision: 16
5
+ revision: 17
6
6
  lifecycle: mustflow-owned
7
7
  authority: procedure
8
8
  name: database-change-safety
@@ -79,6 +79,7 @@ Use the smallest persistence boundary that proves the risk. Do not introduce rep
79
79
  - Event role: operational event, audit log, behavior analytics event, integration outbox message, reporting aggregate, or replayable domain event.
80
80
  - Data owner and affected tables, collections, stores, indexes, caches, generated files, or read models.
81
81
  - Entity identity rules, including stable ids, external provider ids, mutable slugs, titles, locale-specific addresses, redirects, and public API identifiers when content or user-facing resources are involved.
82
+ - Regret-prone schema shape rules, including internal versus public ids, normalized unique keys, tenant-scoped uniqueness, foreign keys, join tables, enum or lookup-table ownership, nullable-field meaning, JSON promotion criteria, custom-field boundaries, status history, optimistic locking, and operational trace fields.
82
83
  - Exit and restore rules, including whether exported data preserves relationships, permissions, files, versions, events, audit history, automation rules, provider id mappings, schema metadata, and enough import or restore evidence to reconstruct product state.
83
84
  - Identifier ownership rules, including which ids are product-owned, which ids are public, which ids are provider mappings, and whether external auth, payment, CRM, analytics, storage, or CMS ids can change without breaking internal references.
84
85
  - Authentication identity rules, including app-owned user id, provider subject records, email-as-attribute behavior, social provider subject preservation, account merge or relink policy, session migration expectations, and whether memberships, roles, permissions, and entitlements live in product-owned tables rather than only provider metadata.
@@ -173,8 +174,25 @@ Use the smallest persistence boundary that proves the risk. Do not introduce rep
173
174
  - External-service core facts, such as current entitlement, subscription or plan state, processed payment event id, email consent state, customer lifecycle state, file identity and ownership, search source document metadata, job processing state, and audit evidence. Do not let a provider dashboard be the only place that can explain these facts.
174
175
  - Search and queue reconstruction records, such as index document builders, ranking or synonym policy versions, search logs, queue message schema versions, job idempotency keys, retry state, dead-letter state, and manual replay markers.
175
176
  4. Check schema shape: primary keys, foreign keys, unique constraints, nullable fields, defaults, check constraints, status values, timestamps, soft delete fields, tenant scope, audit fields, and retention rules.
177
+ - Use immutable internal primary keys for joins and separate public identifiers for URLs and APIs. Do not make email, slug, username, external provider id, or mutable display code the primary key for product-owned rows.
178
+ - Enforce uniqueness in the database, not only in application prechecks. Normalize comparison keys such as email, slug, provider id, and idempotency key explicitly, preserve the display value separately when needed, and name the unique constraint or index so operations can diagnose failures.
179
+ - Scope unique constraints to the real owner. Tenant-owned slugs, emails, invitations, memberships, idempotency keys, and external references usually need `tenant_id`, `workspace_id`, `operation_type`, or `provider` in the key. Global uniqueness should be a deliberate product rule, not an accident.
180
+ - Design soft-delete uniqueness before shipping. Active-only uniqueness, nullable unique behavior, restore conflicts, deleted-id reuse, and tombstone requirements must be explicit; otherwise deleted rows either block valid new records or allow duplicate active records.
181
+ - Prefer database foreign keys for core ownership and reference integrity. If an FK is intentionally omitted for scale, import staging, sharding, or asynchronous reconciliation, name the replacement invariant, cleanup path, and orphan-detection evidence. Index FK columns when joins, parent deletion checks, or tenant deletion depend on them.
182
+ - Treat `ON DELETE CASCADE` as a lifecycle promise, not cleanup convenience. Use it only when child rows truly share the parent's lifetime and audit, retention, restore, and legal obligations do not require separate survival.
183
+ - Model many-to-many relationships with join tables that can own role, status, order, source, timestamps, and actor fields. Avoid comma-separated ids, arrays of ids, or JSON lists for relationships that need joins, uniqueness, permissions, deletes, or audit.
184
+ - Treat polymorphic `entity_type` plus `entity_id` relations as integrity debt for core data because ordinary FKs cannot prove the target exists. Prefer target-specific tables, a shared parent table, or explicit constraint and cleanup machinery when the relation is business-critical.
185
+ - Choose enum, lookup table, or state machine based on change behavior. Stable technical codes may be enums; operator-managed values, values with display or sort metadata, plan or category catalogs, roles, and jurisdiction-specific rules usually need lookup tables. Workflow status needs allowed transitions and history, not only a value list.
186
+ - Avoid boolean state soup such as several independent `is_*` flags for one lifecycle. Use one current status plus timestamps or event history when states are mutually exclusive, ordered, reversible, or policy-driven.
187
+ - Give nullable fields exactly one meaning. Separate unknown, not applicable, not entered yet, deleted, failed, and pending states with explicit status or reason fields when queries or reports depend on the distinction.
188
+ - Avoid EAV or generic `entities`/`attributes`/`values` tables for core domain facts. If customer-defined fields are required, keep them in a bounded custom-field area with definitions, type validation, quotas, ownership, export semantics, and a promotion path once values drive search, sort, permission, billing, or reporting.
189
+ - Do not hide behavior-driving data in JSON. Keys used for filters, ordering, joins, uniqueness, permissions, tenant scope, status, retention, money, dates, quotas, indexes, or operational dashboards should be typed columns, child tables, or generated/computed columns with a migration path. Use `database-json-modeling-review` when JSON is part of the diff.
190
+ - Keep tenant ownership close to the owned row when tenant-scoped operations, billing, audit, export, restore, delete, or performance matter. B2B products should usually separate global users from tenant memberships, roles, invitations, entitlements, and billing records.
176
191
  - Treat deletion as lifecycle when recovery, audit, search behavior, support handling, or retention matters. Consider `deleted_at`, `deleted_by`, `delete_reason`, `restored_at`, `restored_by`, and `purge_after` instead of a lone boolean or timestamp.
177
192
  - Separate business records that should be soft-deleted or archived from personal data that should be anonymized, purged, or retained under a narrower legal rule.
193
+ - Keep status history for states that affect money, access, fulfillment, support, compliance, or user-visible commitments. A current status alone rarely explains who changed it, why, under which request, and whether a late webhook, retry, or admin action should still apply.
194
+ - Add optimistic versioning or conditional updates when two users, admins, workers, or webhooks can edit the same important row. Last-write-wins is usually data loss unless the product explicitly accepts it.
195
+ - Add operational trace fields where incident response will need them: server timestamps, actor ids, `created_by`, `updated_by`, `request_id`, `source`, import or provider reference, and safe reason codes. Do not add them blindly to every table, but do not leave high-value rows untraceable.
178
196
  - Treat mutable high-value records as versioned when reproducibility matters, such as AI prompts, documents, contracts, price policies, experiment configs, comparison data, permission policies, automation rules, and model settings. Prefer a stable parent row with a current-version pointer plus immutable version rows.
179
197
  - Use ledgers for money-like or quota-like balances, such as points, credits, inventory reservations, refunds, coupon issuance, entitlement grants, and manual adjustments. Treat cached balances as derived from ledger entries unless the local design proves otherwise.
180
198
  - For audit logs, store actor type, actor id when safe, action, target type and id, bounded before and after values, reason, request id, idempotency key, and timestamp in the same local transaction as the audited change when possible. Audit logs should be append-only to normal operators and should redact or omit personal data that is not needed to explain the change.
@@ -318,6 +336,7 @@ Use the smallest persistence boundary that proves the risk. Do not introduce rep
318
336
  ## Postconditions
319
337
 
320
338
  - The database role and source of truth are explicit.
339
+ - Regret-prone schema shortcuts such as mutable primary keys, app-only uniqueness, unscoped tenant uniqueness, missing FK or cascade ownership, ambiguous nulls, boolean state soup, polymorphic core relations, EAV core facts, behavior-driving JSON, and user-as-tenant coupling are fixed, explicitly accepted, or reported.
321
340
  - Database rows, ORM models, generated caches, and read models do not leak into domain truth unless the local architecture intentionally owns that boundary.
322
341
  - Queries preserve authorization, tenant or user scope, deterministic ordering, expected absence behavior, and retention rules.
323
342
  - Content and resource models separate stable identity from mutable titles, slugs, URLs, translations, display fields, revisions, facts, sources, projections, and analytics dimensions when those concerns exist.
@@ -375,7 +394,7 @@ Prefer the narrowest configured test, build, docs, release, or mustflow intent t
375
394
 
376
395
  - Database role and owner
377
396
  - Affected read and write paths
378
- - Schema, constraint, and query semantics reviewed
397
+ - Schema-regret, constraint, relation, enum, JSON, custom-field, status-history, traceability, and query semantics reviewed
379
398
  - Identity, slug, lifecycle, asset, body block, taxonomy, relationship, attribute, filter URL, landing-page, translation, locale, country, currency, timezone, local-date, money, price snapshot, revision, claim, fact, source, collection, verification, comparison methodology, affiliate link, data-ownership, behavior analytics, audit log, API projection, public identifier, backup or restore, bulk update, admin audit, user-state, aggregate, cache-key, projection, and cache-invalidation checks where relevant
380
399
  - Export, import, product-owned id, provider-id mapping, relationship, permission, file, automation, event-history, and reconstruction checks where relevant
381
400
  - Authorization, tenant scope, retention, and privacy checks