npm - mustflow - Versions diffs - 2.18.21 → 2.21.2 - Mend

mustflow 2.18.21 → 2.21.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (46)

package/dist/cli/commands/classify.js +2 -3
package/dist/cli/commands/doctor.js +46 -6
package/dist/cli/commands/run/output.js +1 -1
package/dist/cli/commands/run/receipt.js +1 -0
package/dist/cli/commands/run.js +3 -1
package/dist/cli/commands/verify.js +15 -10
package/dist/cli/i18n/en.js +1 -0
package/dist/cli/i18n/es.js +1 -0
package/dist/cli/i18n/fr.js +1 -0
package/dist/cli/i18n/hi.js +1 -0
package/dist/cli/i18n/ko.js +1 -0
package/dist/cli/i18n/zh.js +1 -0
package/dist/cli/lib/filesystem.js +3 -96
package/dist/cli/lib/local-index/index.js +4 -4
package/dist/cli/lib/repo-map.js +3 -2
package/dist/cli/lib/run-plan.js +8 -4
package/dist/core/check-issues.js +1 -1
package/dist/core/command-contract-validation.js +24 -10
package/dist/core/command-effects.js +3 -4
package/dist/core/command-output-limits.js +2 -1
package/dist/core/line-endings.js +12 -4
package/dist/core/repeated-failure.js +3 -3
package/dist/core/run-performance-history.js +4 -4
package/dist/core/run-profile.js +2 -3
package/dist/core/run-receipt.js +11 -3
package/dist/core/run-write-drift.js +67 -15
package/dist/core/safe-filesystem.js +158 -0
package/package.json +1 -1
package/schemas/commands.schema.json +1 -0
package/schemas/doctor-report.schema.json +23 -1
package/schemas/run-receipt.schema.json +6 -2
package/templates/default/i18n.toml +13 -13
package/templates/default/locales/en/.mustflow/skills/INDEX.md +13 -13
package/templates/default/locales/en/.mustflow/skills/adapter-boundary/SKILL.md +72 -4
package/templates/default/locales/en/.mustflow/skills/command-contract-authoring/SKILL.md +16 -10
package/templates/default/locales/en/.mustflow/skills/command-pattern/SKILL.md +64 -7
package/templates/default/locales/en/.mustflow/skills/database-change-safety/SKILL.md +249 -16
package/templates/default/locales/en/.mustflow/skills/dependency-reality-check/SKILL.md +37 -7
package/templates/default/locales/en/.mustflow/skills/migration-safety-check/SKILL.md +74 -10
package/templates/default/locales/en/.mustflow/skills/performance-budget-check/SKILL.md +132 -5
package/templates/default/locales/en/.mustflow/skills/pure-core-imperative-shell/SKILL.md +12 -5
package/templates/default/locales/en/.mustflow/skills/result-option/SKILL.md +4 -2
package/templates/default/locales/en/.mustflow/skills/security-privacy-review/SKILL.md +112 -29
package/templates/default/locales/en/.mustflow/skills/state-machine-pattern/SKILL.md +17 -4
package/templates/default/locales/en/.mustflow/skills/structure-discovery-gate/SKILL.md +193 -2
package/templates/default/manifest.toml +1 -1

package/templates/default/locales/en/.mustflow/skills/dependency-reality-check/SKILL.md CHANGED

@@ -2,11 +2,11 @@
 mustflow_doc: skill.dependency-reality-check
 locale: en
 canonical: true
-revision: 3
+revision: 6
 lifecycle: mustflow-owned
 authority: procedure
 name: dependency-reality-check
-description: Apply this skill when a task assumes, adds, removes, imports, invokes, installs, audits, or documents a package, runtime, tool, command, service, or platform capability, especially for AI-suggested dependencies or supply-chain-sensitive changes.
+description: Apply this skill when a task assumes, adds, removes, imports, invokes, installs, audits, or documents a package, runtime, framework, tool, command, service, platform capability, supported-version policy, security patch path, ecosystem maturity claim, maintainer-risk assumption, runtime portability claim, edge or serverless compatibility claim, critical-path dependency choice, or experimental technology placement, especially for AI-suggested dependencies, core backend stack choices, or supply-chain-sensitive changes.
 metadata:
   mustflow_schema: "1"
   mustflow_kind: procedure
@@ -34,6 +34,12 @@ Prevent code, docs, tests, and final reports from assuming unavailable packages,
 - An AI-generated patch, assistant suggestion, copied snippet, or generated docs introduce a package name that could be hallucinated, misspelled, abandoned, lookalike, or unnecessary.
 - A change adds package-manager scripts, package lifecycle hooks, build downloads, binary installers, lockfile changes, audit suppression, vulnerability scanner output, or CI dependency gates.
 - A solution relies on a package manager, binary, environment variable, browser API, operating-system command, hosted service, or optional integration.
+- A dependency or platform is proposed for authentication, payment, database, migrations, authorization, cryptography, deployment, queueing, file storage, email, or another survival path where failure can stop the product.
+- A small, experimental, single-maintainer, fast-moving, or platform-specific dependency is proposed and the task needs to decide whether it belongs in a differentiating feature or a core operating path.
+- A language runtime or web framework choice needs to prove supported or LTS version policy, end-of-life avoidance, security advisory response, dependency lock behavior, smoke-test coverage, deployment path, and rollback path.
+- A JavaScript or TypeScript runtime claim treats Node, Bun, Deno, serverless functions, edge runtime, Web-standard adapters, or Node compatibility modes as interchangeable without checking which APIs, packages, native modules, filesystem access, connection reuse, or platform features the code actually uses.
+- A framework feature such as server actions, route handlers, edge middleware, framework cache, ORM relation helpers, or hosted platform storage is proposed for core business logic rather than for delivery, persistence, or infrastructure glue.
+- Documentation or design claims that a technology has enough ecosystem support, production use, migration path, failure examples, security response, or maintainer coverage.
 - A generated instruction tells another agent or user to run a tool that may not be declared in the repository.
 - A failure may be caused by a missing install, mismatched version, unsupported runtime, or unavailable command.
@@ -51,6 +57,10 @@ Prevent code, docs, tests, and final reports from assuming unavailable packages,
 - Package, lock, config, import, script, command-intent, or documentation files that declare or reference it.
 - The minimum version, capability, or availability claim if one is required.
 - Registry name, package scope, lockfile entry, provenance or maintainer expectation, install script risk, and whether the dependency is runtime, development, fixture-only, transitive, or optional.
+- Dependency role criticality: decorative utility, product-experience feature, operational support, or survival path such as identity, money, durable data, permissions, security, migrations, deployment, queues, or file ownership.
+- Runtime and framework patchability: supported-version or LTS expectation, end-of-life status, security advisory channel, update cadence, dependency-lock behavior, deployment artifact, smoke-test surface, rollback route, and whether the choice is experimental, regulated, or core-path-facing.
+- Runtime compatibility boundary: whether code imports Node-specific APIs, Bun or Deno globals, edge-only APIs, native modules, filesystem access, framework request objects, environment reads, ORM clients, or platform storage and queue objects outside delivery or infrastructure layers.
+- Ecosystem and maintainer context when available from approved tooling or existing metadata: maintainer count, organization or foundation backing, release history, issue handling, security policy, migration docs, failure cases, license clarity, tests, alternatives, and hiring or support availability.
 - Vulnerability, license, audit, lifecycle-script, binary-download, package-age, maintainer-change, and fork-or-replacement context when those details are available from approved repository tooling or existing metadata.
 - Relevant command-intent contract entries for build, package, test, or documentation verification.
@@ -81,16 +91,33 @@ Prevent code, docs, tests, and final reports from assuming unavailable packages,
 6. If absent, prefer an existing local alternative. Add a new dependency only when it is necessary, within the task scope, and reflected in the package metadata and lockfile policy.
 7. Treat package scripts and lifecycle hooks as executable code. Review `preinstall`, `install`, `postinstall`, `prepare`, build-time downloads, generated binaries, and shell-spawning scripts before accepting them.
 8. Check supply-chain-sensitive metadata when available through approved tooling or existing files: package scope, maintainer or organization expectation, package age, maintainer changes, install scripts, binary downloads, transitive dependency impact, license constraints, and fixture-only versus runtime use.
-9. For vulnerability or audit output, separate runtime dependencies from fixture-only or intentionally vulnerable samples. Do not weaken audit gates, delete lockfiles, or add broad suppressions without a repository-owned reason.
-10. For new dependencies, prefer pinned or lockfile-backed versions according to project policy. Avoid widening ranges or removing lockfiles to satisfy generated code.
-11. Do not introduce new package-manager wrappers, vulnerability scanners, registry queries, or install commands inside this skill. Use configured command intents or report the missing verification surface.
-12. Keep all dependency-facing surfaces aligned: package metadata, lockfiles when intentionally updated, command contract, docs, tests, and installation notes.
-13. Run the narrowest configured verification that proves the dependency path used by the change.
+9. Classify the dependency by product criticality.
+   - Decorative or utility dependencies can be small when they are pure, replaceable, locked, and easy to remove.
+   - Product-experience dependencies such as charts, editors, calendars, or drag-and-drop should be wrapped or localized when they can spread through the UI.
+   - Operational dependencies such as logs, queues, caches, search, file processing, and email need maturity, observability, failure handling, and a replacement path.
+   - Survival-path dependencies such as authentication, payment, database access, migrations, authorization, cryptography, deployment, and security should avoid fragile single-maintainer or experimental choices unless the project explicitly accepts the risk.
+10. Place new or experimental technology intentionally. Prefer proven, boring dependencies for the survival path; reserve experimental technology for differentiating product areas such as UX, AI workflow, search experience, visualization, or internal tooling, and keep it isolated enough to replace.
+    - For runtime and framework choices, judge the stack by whether security patches can be applied, tested, deployed, and rolled back quickly. Do not treat performance benchmarks or developer excitement as a substitute for a supported-version policy and a working patch circuit.
+    - Keep experimental runtimes and fast-moving frameworks away from authentication, payment, authorization, database, migration, cryptography, and security-critical deployment paths unless the project explicitly accepts the risk and has a rollback plan.
+11. For runtime portability claims, inspect dependency leakage by layer before accepting the claim.
+    - Core and application code should not depend on Node-only APIs, Bun or Deno globals, edge-only globals, framework request objects, ORM clients, environment variables, or platform-specific queue and storage clients.
+    - Delivery and infrastructure layers may use runtime or framework capabilities, but the choice should be visible as an adapter decision rather than scattered through use cases.
+    - Treat Web-standard frameworks or adapters as a portability aid, not as proof that domain code is portable if provider SDKs, ORM clients, filesystem calls, or platform bindings still leak inward.
+12. Evaluate ecosystem maturity when the task depends on it: production examples, searchable error solutions, migration notes, security-response history, version stability, plugin ecosystem, older issue answers, support availability, and whether the direction is likely to hold over the next release cycles.
+13. For small libraries, accept them only when they are small in role as well as code: simple, understandable, peripheral, safe to fork or reimplement, and not directly touching secrets, personal data, money, permissions, migrations, durable state, or security.
+14. For vulnerability or audit output, separate runtime dependencies from fixture-only or intentionally vulnerable samples. Do not weaken audit gates, delete lockfiles, or add broad suppressions without a repository-owned reason.
+15. For new dependencies, prefer pinned or lockfile-backed versions according to project policy. Avoid widening ranges or removing lockfiles to satisfy generated code.
+16. Do not introduce new package-manager wrappers, vulnerability scanners, registry queries, or install commands inside this skill. Use configured command intents or report the missing verification surface.
+17. Keep all dependency-facing surfaces aligned: package metadata, lockfiles when intentionally updated, command contract, docs, tests, and installation notes.
+18. Run the narrowest configured verification that proves the dependency path used by the change.
 <!-- mustflow-section: postconditions -->
 ## Postconditions
 - Every dependency or tool claim is backed by a repository declaration, configured command, host boundary, or explicit unverified-risk note.
+- Critical-path dependency choices identify role criticality, maintainer concentration, ecosystem maturity, replacement path, and whether the dependency belongs in a survival path or a differentiating feature.
+- Runtime and framework choices identify supported-version policy, end-of-life exposure, security patch path, smoke-test surface, deployment and rollback route, and whether experimental choices are isolated from survival paths.
+- Runtime portability claims identify which APIs are confined to delivery or infrastructure and which would force core or application code to change when moving between Node, Bun, Deno, serverless functions, or edge runtime.
 - New dependency requirements are reflected in the appropriate metadata and public documentation.
 - The final report states whether the dependency was existing, added, optional, unavailable, or intentionally not verified.
@@ -122,6 +149,9 @@ Use a narrower configured test, package, or docs intent when it better proves th
 - Dependency or capability checked
 - Repository declaration or absence
+- Role criticality, ecosystem maturity, maintainer-risk, and replacement-path boundary when relevant
+- Supported-version, patchability, smoke-test, rollback, and experimental-placement boundary when relevant
+- Runtime portability, framework feature leakage, and layer-containment boundary when relevant
 - Surfaces synchronized
 - Command intents run
 - Skipped dependency checks and reasons
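
Illustration (not part of the package): the criticality classification added in steps 9 and 10 of the diff above can be sketched as a small check. This is a minimal sketch under stated assumptions, not mustflow's implementation. The names `Criticality`, `Dependency`, `classify`, and `placement_findings`, the keyword sets, and the `maintainers <= 1` threshold are all hypothetical. The role keywords are copied from the examples in step 9.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Criticality(Enum):
    """The four role tiers named in the added 'Dependency role criticality' input."""
    DECORATIVE = "decorative utility"
    PRODUCT_EXPERIENCE = "product-experience feature"
    OPERATIONAL = "operational support"
    SURVIVAL = "survival path"


# Hypothetical keyword sets, taken from the examples listed in step 9.
SURVIVAL_ROLES = {"authentication", "payment", "database access", "migrations",
                  "authorization", "cryptography", "deployment", "security"}
OPERATIONAL_ROLES = {"logs", "queues", "caches", "search", "file processing", "email"}
PRODUCT_ROLES = {"charts", "editors", "calendars", "drag-and-drop"}


@dataclass(frozen=True)
class Dependency:
    name: str
    role: str                    # what the dependency does in the product
    maintainers: int             # assumed to come from existing metadata
    experimental: bool = False
    risk_accepted: bool = False  # the project explicitly accepted the risk


def classify(dep: Dependency) -> Criticality:
    """Map a dependency's role to a tier; unknown roles fall back to decorative."""
    if dep.role in SURVIVAL_ROLES:
        return Criticality.SURVIVAL
    if dep.role in OPERATIONAL_ROLES:
        return Criticality.OPERATIONAL
    if dep.role in PRODUCT_ROLES:
        return Criticality.PRODUCT_EXPERIENCE
    return Criticality.DECORATIVE


def placement_findings(dep: Dependency) -> List[str]:
    """Report fragile choices on a survival path unless the risk was accepted."""
    findings: List[str] = []
    if classify(dep) is Criticality.SURVIVAL and not dep.risk_accepted:
        if dep.experimental:
            findings.append("experimental dependency on a survival path")
        if dep.maintainers <= 1:
            findings.append("single-maintainer dependency on a survival path")
    return findings


if __name__ == "__main__":
    # Hypothetical package name, used only for illustration.
    candidate = Dependency("new-auth", "authentication", maintainers=1, experimental=True)
    print(classify(candidate).value, placement_findings(candidate))
```

The sketch only reports findings and leaves the decision to the reviewer, which matches the skill's wording that fragile choices are acceptable when "the project explicitly accepts the risk". A real check would take the role from the reviewer instead of a keyword table.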

package/templates/default/locales/en/.mustflow/skills/migration-safety-check/SKILL.md CHANGED

@@ -2,11 +2,11 @@
 mustflow_doc: skill.migration-safety-check
 locale: en
 canonical: true
-revision: 1
+revision: 8
 lifecycle: mustflow-owned
 authority: procedure
 name: migration-safety-check
-description: Apply this skill when code, data, schema, configuration, file layout, template, or generated-state migrations are planned, edited, documented, or reported.
+description: Apply this skill when code, data, schema, configuration, file layout, template, content frontmatter, file-to-database, SQLite-to-PostgreSQL, local-disk-to-object-storage, tenant-boundary, URL, slug, content lifecycle, asset path or variant, claim or fact extraction, global-ready time money locale or currency backfills, API projection compatibility, public identifier changes, provider id mappings, event-schema changes, search index or ranking migration, queue message or retry-policy migration, log or analytics migration, observability identifier continuity, deployment-state reproduction, generated-state, backup or restore proof, expand-migrate-contract deployment, destructive schema rollback, semantic export, import, platform exit, or cache migrations are planned, edited, documented, or reported.
 metadata:
   mustflow_schema: "1"
   mustflow_kind: procedure
@@ -32,6 +32,16 @@ Keep migrations reversible, scoped, and verified before they affect data, schema
 - A change moves, renames, deletes, transforms, backfills, or rewrites files, data, schemas, configuration, template state, generated state, or persisted metadata.
 - A task mentions migration, upgrade, conversion, import, export, reindexing, backfill, cleanup, lock refresh, or baseline regeneration.
+- Content moves between inline code, Markdown/MDX files, CMS documents, database rows, generated indexes, site-specific overrides, central facts, redirects, or reusable content blocks.
+- Existing content must gain lifecycle states, redirects, topic aliases, relationship records, asset metadata, immutable originals, variants, claim or fact references, comparison methodology records, affiliate link records, or exportable ownership metadata.
+- Existing APIs, mobile clients, integrations, analytics events, cache keys, public identifiers, read models, generated indexes, or reporting tables must survive a schema or storage shape change.
+- Existing product state must survive leaving a CMS, hosted database, authentication provider, payment provider, file store, analytics tool, observability backend, deployment platform, or closed automation tool.
+- Existing search quality, search ranking policy, search documents, synonyms, query logs, click logs, queue messages, retry policy, dead-letter history, product events, audit events, billing events, job events, or operator troubleshooting records must survive leaving a search engine, queue, log backend, analytics SaaS, or dashboard-driven automation.
+- Existing request ids, trace ids, job ids, webhook ids, event ids, user ids, organization ids, or provider id mappings must survive a log, trace, metrics, event, job, webhook, export, or import migration.
+- Existing local SQLite files, local upload directories, object-storage keys, public file URLs, tenant or workspace fields, server-side authorization scope, or API response shapes must survive a persistence or deployment change.
+- Existing timestamp, date-only, timezone, currency, price, exchange-rate, tax, locale, country, account-default, user-preference, or AI-pricing fields must be split, backfilled, snapshotted, or made globally safe.
+- A migration touches destructive schema changes such as dropping columns, changing column types, setting `NOT NULL`, adding constraints, creating large indexes, rewriting many rows, or claiming that a `down` migration makes rollback safe.
+- Documentation or final reports claim that backup, restore, cache rebuild, index rebuild, or analytics migration is safe, tested, repeatable, or complete.
 - Documentation or final reports claim that a migration is safe, complete, reversible, idempotent, or already applied.
 - A change could make older installed projects, existing lock files, generated files, caches, or user-edited documents incompatible.
@@ -47,6 +57,17 @@ Keep migrations reversible, scoped, and verified before they affect data, schema
 - The source state, target state, and the files or data that would change.
 - The owner of the migration surface: code, schema, template, lock file, generated state, cache, package, or docs.
+- Content identity, slug, URL, frontmatter schema, relationship, fact, source, search index, sitemap, analytics, and cache dependency expectations when content or site migrations are involved.
+- Lifecycle state mapping, asset path mapping, original versus variant ownership, topic or tag merge history, policy or fact extraction, comparison methodology backfill, affiliate link classification, export shape, and deletion or anonymization expectations when those surfaces are involved.
+- API response compatibility, public id mapping, client version support, analytics event versioning, cache key versioning, search index rebuild, reporting aggregate rebuild, and projection backfill expectations when those surfaces are involved.
+- Database engine mapping, local file path mapping, object-storage key mapping, tenant or workspace ownership mapping, private download URL behavior, and API response mapper compatibility when those surfaces are involved.
+- Restore-test evidence or restore gap, including database, files, secrets or environment configuration, migration history, external service settings, queue or job state, and cache storage when the migration depends on recovery guarantees.
+- Export/import reconstruction evidence or gap, including relationships, permissions, files, versions, event history, audit history, automation rules, external integration mappings, provider id mappings, schemas, and whether imported data can rebuild a working service elsewhere.
+- Search, queue, log, and analytics reconstruction evidence or gap, including search document structure, ranking or boost policy, representative query expectations, queue message envelope and schema versions, idempotency keys, retry and dead-letter rules, internal event list, retention windows, and raw event export or replay paths.
+- Deployment-state migration evidence or gap, including environment variable schema, secret names, DNS records, domains, SSL assumptions, redirects, cron schedules, runtime versions, build commands, regions, storage, queues, worker settings, observability routing, and rollback behavior.
+- Rollback type expected: schema rollback, data rollback, application rollback, operational restore, forward corrective migration, or explicit no-rollback boundary.
+- Deployment compatibility rule: whether old application code can run on the new schema, whether new application code can run while old fields remain, and when destructive cleanup is safe.
+- Operational safety limits for live databases: rehearsal dataset, expected lock time, statement timeout, lock timeout, batch size, restart marker, validation query, and point-in-time restore availability when applicable.
 - Idempotency, rollback, backup, dry-run, compatibility, and failure behavior expectations.
 - Relevant command-intent contract entries for status, diff, docs, release, build, or mustflow validation.
@@ -69,19 +90,58 @@ Keep migrations reversible, scoped, and verified before they affect data, schema
 ## Procedure
 1. Identify the migration surface and classify it as code, schema, data, configuration, template, generated state, cache, package metadata, or documentation.
+   For content systems, also classify whether it touches inline page content, body files, frontmatter, lifecycle states, slug history, redirects, taxonomy, assets, facts, claims, sources, site exposure, search index, feeds, sitemaps, analytics events, exports, or public API projections.
 2. Record the source state, target state, expected affected paths, and whether the change must support old and new states during transition.
-3. Check whether the migration is idempotent: a second run should either do nothing or report an already-applied state without extra diffs.
-4. Check rollback or recovery expectations: backup, restore path, manual fallback, or explicit "not reversible" report.
-5. Prefer dry-run or read-only inspection before apply behavior when a command or workflow exists for it.
-6. Keep compatibility claims tied to fixtures, lock metadata, tests, generated output, or documented command results.
-7. If the migration changes public docs, installed templates, package contents, or lock files, synchronize the related metadata and version surfaces.
-8. Run the narrowest configured verification that proves the migrated surface and its metadata still agree.
+   - Prefer an expand, backfill, switch, and shrink sequence for live schema and API changes: add the new shape, support old and new reads or writes, migrate data, switch readers, then remove the old shape only after compatibility is proven.
+   - Treat rollback as more than `down` migration. Distinguish schema rollback, data rollback, app rollback, and operational restore. A `down` file does not recover deleted or overwritten data unless the original values were preserved.
+   - Prefer forward-only recovery for live systems: if a change fails after partial application, use a corrective migration or compatibility patch unless a tested restore path proves a rewind is safer.
+   - Delay destructive cleanup such as `DROP COLUMN`, type changes, `SET NOT NULL`, irreversible rewrites, and old-field removal until at least one compatible application version and validation window have proven the new shape.
+   - For large tables, separate schema expansion from data backfill. Make backfills restartable, bounded by batches, and validated with queries that count missing, malformed, duplicate, or conflicting rows.
+   - For globally safe data fixes, backfill time, money, locale, country, currency, and timezone from explicit evidence only. If a value was previously inferred, mark confidence or require review instead of silently treating inference as fact.
+3. Check identity preservation. Stable ids, content groups, entity ids, author ids, asset ids, and API ids should survive title, slug, file path, locale, category, site, or provider changes.
+   - If internal numeric ids are replaced or hidden by public ids, preserve mapping, redirects or aliases, audit references, API compatibility, and analytics continuity.
+4. Check whether the migration is idempotent: a second run should either do nothing or report an already-applied state without extra diffs.
+5. Check rollback or recovery expectations: backup, restore path, manual fallback, redirect fallback, old/new schema compatibility, or explicit "not reversible" report.
+   - Before risky live database work, require recent backup or point-in-time restore capability, rehearsal on production-like data when possible, expected lock behavior, statement and lock timeouts, and a documented stop condition.
+6. Prefer dry-run or read-only inspection before apply behavior when a command or workflow exists for it.
+7. For file-to-database or database-to-file content migration, verify that strict metadata is preserved or intentionally introduced: id, type, status, locale, slug, summary, author, created/updated actors, category, tags, related entities, SEO fields, canonical group, site exposure, and access level where relevant.
+8. For lifecycle migration, map old booleans or ad hoc flags into explicit states such as draft, scheduled, published, unlisted, private, archived, deprecated, redirected, gone, and soft-deleted without losing search, access, retention, or recovery semantics.
+9. For URL or slug migration, verify canonical targets, old-to-new redirects, duplicate handling, sitemap and canonical updates, analytics continuity, and cache invalidation.
+10. For asset migration, preserve immutable originals, introduce stable asset ids, avoid storage keys tied to mutable slugs, rebuild variants from originals, carry alt text, license, credit, dimensions, hash, focal point, and usage references, and report any missing metadata that cannot be inferred safely.
+11. For taxonomy or relationship migration, preserve tag aliases, merge history, topic-hub indexability, relationship type, direction, order, confidence, manual or automatic source, reason, and creator when those fields affect search, navigation, analytics, or SEO.
+12. For fact, claim, policy, comparison, or affiliate extraction, verify that changing facts move to typed entity, claim, source, methodology, result, or affiliate records with source, observed or effective dates, jurisdiction, risk tier, reviewer, relationship policy, and usage mappings while historical prose remains intentionally preserved or annotated.
+13. For export or platform-exit migration, verify that content, assets, redirects, tag merges, claim references, source references, revisions, and page dependencies can be exported or rebuilt without relying on the current CMS screen layout.
+   - Treat CSV-only export as insufficient for product-state migration when relationships, permissions, comments, files, history, automations, provider mappings, or audit trails define the service meaning.
+   - Preserve product-owned stable ids and map provider ids separately. Payment, authentication, CRM, storage, analytics, observability, and CMS ids should be replaceable without breaking internal references.
+   - Verify import or restore shape, not only download shape. If data cannot be loaded into another environment or self-hosted replacement, report the export as partial.
+   - Treat provider dashboards and no-code screens as migration inputs only when their hidden rules can be represented as data, configuration, code, or documented operator procedure. Manual refund, permission repair, file deletion, customer verification, and email automation habits need an owner before tool replacement.
+   - For search migration, rebuild the index from source records and compare representative queries or expected top results before claiming quality continuity. Search settings changed only through a hosted dashboard need a captured policy or change log.
+   - For queue migration, preserve a versioned message envelope, job type list, retry policy, timeout, dead-letter state, ordering requirement, idempotency key, and manual replay procedure. Do not claim queue replacement is safe from message transport alone.
+   - For log or analytics migration, separate raw historical events from dashboards. Keep event names, schema versions, request or trace ids, retention rules, and core billing, entitlement, file, security, job, search, and support events available outside the SaaS.
+14. For deployment-platform migration, verify that operating state is reproducible from code and docs rather than dashboard memory. Preserve or recreate environment variable schemas, secret names, domain and DNS records, redirects, SSL assumptions, cron jobs, build commands, runtime versions, regions, storage buckets, queues, worker settings, observability routing, deployment hooks, and rollback procedure.
+15. For observability migration, preserve request id, trace id, span id, user or anonymous id, tenant or organization id, command id, job run id, webhook event id, and event schema version continuity. Do not migrate logs or traces in a way that replaces internal ids with email, token, payment customer id, or other sensitive provider identifiers.
+16. For database-engine migration, such as SQLite to PostgreSQL, preserve schema constraints, transaction semantics, public ids, timestamps, JSON handling, unique indexes, idempotency keys, audit references, and backup or restore expectations. Check write-concurrency behavior, locking assumptions, and any code that depended on local file paths or process-local database access.
+17. For local-disk to object-storage migration, preserve stable asset ids, owner or workspace scope, original filenames as metadata, storage keys independent from mutable user input, visibility, status, checksums, variants, signed download behavior, and cleanup of stale local files or orphaned remote objects.
+18. For tenant-boundary migration, verify that every affected read, list, search, mutation, upload, download, admin operation, audit log, idempotency key, cache key, and analytics event maps to the correct workspace, organization, team, or account before enforcing the new scope.
+19. Keep compatibility claims tied to fixtures, lock metadata, tests, generated output, or documented command results.
+20. For API or projection migration, verify that database table splits, column renames, status remaps, relationship moves, storage-key changes, and internal identifier changes are absorbed by response mappers or versioned contracts before they break clients.
+21. For analytics or event migration, keep event names and schema versions explicit. Do not mix old and new event payload meanings under one unversioned JSON shape.
+22. For cache migration, version cache keys when response shape, visibility, filter normalization, ranking formula, or source-of-truth rules change. Define whether old keys can expire naturally or need an explicit purge.
+23. For backup or restore claims, require evidence from a clean-environment restore or report the missing drill. A backup file, snapshot setting, or managed-service checkbox is not enough to claim restore readiness.
+24. For AI usage or model-pricing migrations, preserve request identity, provider-call identity, feature key, pricing snapshot, retry grouping, cache-hit type, and historical cost units. Do not recalculate past AI costs from only the current provider price sheet.
+25. If the migration changes public docs, installed templates, package contents, or lock files, synchronize the related metadata and version surfaces.
+26. Run the narrowest configured verification that proves the migrated surface and its metadata still agree.
 <!-- mustflow-section: postconditions -->
 ## Postconditions
 - The migration surface, source state, target state, and compatibility boundary are named.
-- Idempotency, rollback, backup, dry-run, and verification status are either proven or explicitly left as remaining risk.
+- Content identity, lifecycle, URL, slug, asset, metadata, taxonomy, relationship, claim, fact, search, cache, sitemap, export, and public API continuity are preserved or explicitly marked out of scope when relevant.
+- Semantic export or import, provider id mappings, self-hosted or replacement restore shape, deployment-state reproduction, and observability identifier continuity are preserved or explicitly marked out of scope when relevant.
+- Idempotency, rollback, backup, restore, dry-run, old/new compatibility, and verification status are either proven or explicitly left as remaining risk.
+- Rollback claims distinguish schema, data, app, and operational recovery, and destructive changes are delayed or explicitly marked as requiring restore/manual recovery.
+- API projections, public identifiers, analytics event versions, cache keys, read models, and generated indexes keep compatibility or carry an explicit migration risk.
+- Database engine changes, local-disk to object-storage moves, tenant-boundary retrofits, and API response reshapes preserve compatibility or carry an explicit migration risk.
 - Final reports do not imply that a live or destructive migration ran unless configured evidence proves it.
 <!-- mustflow-section: verification -->
@@ -110,7 +170,11 @@ Use a narrower configured test, build, migration dry-run, or documentation inten
 - Migration surface reviewed
 - Source and target state
-- Idempotency and rollback status
+- Identity, lifecycle, slug, URL, asset, metadata, taxonomy, relationship, claim, fact, index, cache, export, and redirect continuity where relevant
+- API projection, public id, event schema, restore, and generated-state continuity where relevant
+- Semantic export/import, provider-id mapping, deployment-state reproduction, observability identifier, and platform-exit continuity where relevant
+- Search ranking, search quality set, queue message contract, dead-letter or replay procedure, log or analytics event ownership, and operator procedure continuity where relevant
+- Idempotency, expand-migrate-contract compatibility, rollback type, backup, restore, rehearsal, timeout, backfill, and validation-query status
 - Compatibility or lock metadata updated
 - Command intents run
 - Skipped checks and reasons

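The AI usage and model-pricing step in the migration diff above (preserve pricing snapshot, retry grouping, and historical cost units; do not recalculate past costs from the current price sheet) can be illustrated with a minimal sketch. Everything here is an assumption for illustration and does not come from the mustflow package: the `PricingSnapshot` and `UsageRecord` shapes, the micro-unit cost representation, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PricingSnapshot:
    """Hypothetical price sheet captured at call time (integer micro-units)."""
    version: str
    input_micros_per_1k: int
    output_micros_per_1k: int


@dataclass(frozen=True)
class UsageRecord:
    """Hypothetical usage row; the cost is frozen when the row is written."""
    request_id: str        # one user-visible request
    provider_call_id: str  # one provider call; each retry gets its own id
    input_tokens: int
    output_tokens: int
    pricing_version: str   # which snapshot produced cost_micros
    cost_micros: int


def _ceil_div(a: int, b: int) -> int:
    # Integer ceiling division, so costs never pass through floats.
    return -(-a // b)


def record_usage(request_id, provider_call_id, input_tokens, output_tokens, pricing):
    """Price a call once, using the snapshot in force at call time."""
    cost = (_ceil_div(input_tokens * pricing.input_micros_per_1k, 1000)
            + _ceil_div(output_tokens * pricing.output_micros_per_1k, 1000))
    return UsageRecord(request_id, provider_call_id, input_tokens,
                       output_tokens, pricing.version, cost)


def migrate_record(old_row: dict) -> UsageRecord:
    """Copy a legacy row into the new shape without re-pricing it."""
    return UsageRecord(
        request_id=old_row["request_id"],
        provider_call_id=old_row["provider_call_id"],
        input_tokens=old_row["input_tokens"],
        output_tokens=old_row["output_tokens"],
        pricing_version=old_row["pricing_version"],
        cost_micros=old_row["cost_micros"],  # carried over, never recomputed
    )


def cost_by_request(records) -> dict:
    """Group provider calls (including retries) under one user request."""
    totals = {}
    for record in records:
        totals[record.request_id] = totals.get(record.request_id, 0) + record.cost_micros
    return totals
```

In this sketch a migrated row keeps the cost and pricing version it was written with, so a later price change only affects newly recorded calls, and retries stay grouped under a single request id.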
package/templates/default/locales/en/.mustflow/skills/performance-budget-check/SKILL.md CHANGED

@@ -2,11 +2,11 @@
 mustflow_doc: skill.performance-budget-check
 locale: en
 canonical: true
-revision: 1
+revision: 12
 lifecycle: mustflow-owned
 authority: procedure
 name: performance-budget-check
-description: Apply this skill when performance budgets, bundle size, page weight, startup time, command duration, memory use, asset size, throughput, latency, benchmark output, or performance claims are planned, edited, reviewed, or reported.
+description: Apply this skill when performance budgets, query-count budgets, N+1 risk, read/write workload shape, database concurrency pressure, app-server scaling, vertical versus horizontal scaling, process count, connection-pool pressure, read-model cost, operational database reporting load, analytics-query isolation, cache strategy, cache keys, cache invalidation, cache stampede, hot keys, stale fallback, ranking snapshots, search API cost, search index rebuild cost, search quality regression set, file upload bandwidth, external-dependency timeout cost, retry storms, worker queue starvation, provider rate limits, queue backlog or dead-letter growth, pricing-growth cost, vendor free-tier limits, value-pricing units, internal cost units, tenant usage limits, user-action fan-out, contribution margin, P50/P90/P99 heavy-user costs, AI usage cost budgets, AI gateway hard limits, provider budget guardrails, agent loop caps, model-call retries, token-cost tracking, bundle size, page weight, startup time, command duration, memory use, asset size, throughput, latency, benchmark output, or performance claims are planned, edited, reviewed, or reported.
 metadata:
   mustflow_schema: "1"
   mustflow_kind: procedure
@@ -34,6 +34,24 @@ Keep performance claims and budgets tied to declared thresholds, reproducible me
 - A task changes or reports performance budgets, bundle size, page weight, startup time, command duration, memory use, asset size, throughput, latency, search index size, build time, or benchmark output.
 - A change adds heavier dependencies, generated assets, static pages, search indexes, startup work, file scanning, command fan-out, or repeated process spawning.
+- A change introduces or reports caching, cache-control headers, cache keys, cache tags, purge rules, stale fallback, precomputed ranking, search-result caching, faceted-filter caching, CDN caching, or private versus shared cache behavior.
+- A list, feed, search, admin table, dashboard, or API response introduces relation loading, ORM includes, lazy loading, per-row counts, viewer-specific flags, aggregate counters, or read models that can hide N+1 queries or unbounded query cost.
+- A behavior analytics, dashboard, reporting, search ranking, event log, or experiment analysis path may scan operational tables, consume the same connection pool as user requests, or run grouped aggregates on high-growth data.
+- A database or storage choice is justified by "read-heavy", "write-heavy", "SQLite is enough", "PostgreSQL is safer", "cache it", "direct upload", or "local file upload" performance assumptions.
+- A performance issue is framed as "scale up", "add servers", "move to serverless", "move to edge", "add workers", or "use a larger instance" before CPU, database, external dependency, regional latency, and process-state bottlenecks are separated.
+- App servers may be multiplied and could increase database connections, queue load, retry volume, cron duplication, cache pressure, or external API calls instead of improving throughput.
+- A file upload, download, resize, conversion, object-storage, CDN, or app-server streaming path could consume request time, memory, bandwidth, or worker capacity.
+- A cache, queue, search service, analytics store, AI provider, email service, or other auxiliary dependency might cause core user requests to wait, retry, stampede, or fail.
+- HTTP requests perform AI, email, embedding, statistics, webhook follow-up, import, export, file conversion, or other slow work inline instead of accepting work and handing it to a bounded worker path.
+- A retry policy, worker pool, or provider integration can create retry storms, rate-limit feedback loops, dead-letter buildup, or queue starvation across unrelated work.
+- Search ranking, query behavior, search index rebuild, queue partitioning, job retry policy, dead-letter retention, log volume, or analytics event volume may affect latency, worker capacity, provider cost, storage cost, or operational visibility.
+- AI, embedding, reranking, image, audio, or tool-call features can create provider cost, token cost, retry cost, cache savings, rate-limit pressure, free-plan abuse, or worker starvation that needs a budget, usage ledger, or limit.
+- AI requests need a gateway-level cost stop before provider calls, including estimated-cost checks, hard budget decisions, model downgrade rules, request-size caps, token caps, tool-call caps, agent-step caps, timeout caps, or an emergency disable switch.
+- A third-party tool, hosted platform, analytics service, observability vendor, automation provider, database, file store, email provider, authentication provider, or AI provider has a free tier, seat price, API-call price, event price, storage price, bandwidth price, workspace price, audit-log price, export price, or usage limit that can become a product margin or growth bottleneck.
+- A pricing or plan design must compare the value unit users understand with the cost units the system consumes, such as seats, workspaces, requests, storage, bandwidth, AI tokens, search queries, image conversions, automation runs, events, realtime connections, or support.
+- A user action can fan out into several internal jobs such as thumbnail generation, OCR, AI summary, embeddings, search indexing, notifications, logs, analytics events, or webhook calls.
+- Free, unlimited, or generous plan limits touch high-cost surfaces such as AI calls, media conversion, file storage, download traffic, search, automation, webhooks, realtime connections, or log retention.
+- A margin claim depends on average customers but could be dominated by heavy users, high-volume tenants, or P90/P99 usage.
 - A report claims that a path is faster, slower, lightweight, optimized, cached, parallelized, cheap, expensive, within budget, or over budget.
 - A failure or slowdown suggests that measurement scope, command selection, concurrency, caching, or generated output size needs review.
@@ -50,6 +68,23 @@ Keep performance claims and budgets tied to declared thresholds, reproducible me
 - The performance surface, such as command, page, asset, bundle, startup path, query, build, or generated output.
 - The budget source, if one exists: repository config, documented threshold, user-provided limit, benchmark baseline, package metadata, or current command result.
 - Measurement method, environment boundary, warm or cold run expectation, and whether the result is deterministic, sampled, local-only, or approximate.
+- Cache layer, cache key source, cache version, TTL or freshness rule, invalidation trigger, stale fallback, private or no-store boundary, and rebuild source when a cache or precomputed read model is involved.
+- Cache failure behavior, hot-key risk, stampede risk, TTL jitter or lock strategy, cache flush tolerance, and whether the cache is disposable or runtime storage.
+- Expected query count, row count, relation loading shape, aggregate strategy, read-model owner, and whether the measurement can detect query growth when a database-backed read path is involved.
+- Read/write workload profile, including repeated reads, freshness requirement, write bursts, same-row contention, index maintenance cost, ledger or audit write amplification, and whether a read projection can replace per-request calculation.
+- Operational database versus analytics or reporting boundary, including read replica, precomputed aggregate, queue, event store, separate connection pool, or external analytics system when available.
+- Timeout, retry, circuit-breaker, stale-response, feature-flag, and degraded-mode policy when an auxiliary dependency can affect the critical path.
+- Worker and provider capacity boundary, including queue separation, concurrency limits, retry delay, backoff with jitter, circuit-breaker threshold, dead-letter behavior, and whether one provider can consume shared worker or database resources.
+- Scaling boundary, including current process count, CPU and memory pressure, connection-pool limits, database maximum connections, serverless or edge timeout limits, worker concurrency, cron ownership, and whether adding app servers would increase pressure on the real bottleneck.
+- Search capacity and quality boundary, including index rebuild time, partial reindex trigger, query log volume, ranking snapshot cost, representative query set, and whether relevance changes are measured or only observed anecdotally.
+- Log and analytics volume boundary, including which events must be retained internally, which can be sampled or dropped, retention window, storage cost, and whether analysis scans are isolated from core user requests.
+- AI cost boundary, including feature key, account or workspace scope, request count, input and output token limits, cached-input treatment, provider price snapshot, retry grouping, cache-hit type, model tier, plan limit, and whether failed or cancelled provider calls can still cost money.
+- AI gateway boundary, including preflight estimated cost, hard limit decision, remaining budget, model downgrade, feature policy, provider budget role, maximum tool calls, maximum agent steps, maximum total tokens, timeout, and emergency kill switch.
+- Vendor cost boundary, including whether cost grows by users, seats, workspaces, API calls, events, storage, bandwidth, active users, projects, advanced permissions, audit logs, exports, AI tokens, or support tier, and whether that growth follows the product's revenue model.
+- Pricing and margin boundary, including the user-facing value unit, internal cost unit, included quota or credit pool, overuse policy, tenant-level limit, free-plan maximum loss, and customer contribution margin formula.
+- Usage metering boundary, including workspace or organization id, user id, feature key, request type, input size, output size, processing time, external API usage, retries, failures, plan, and whether one user action can create multiple billable or cost-bearing internal operations.
+- Heavy-user boundary, including P50, P90, and P99 customer cost, whether a few users can dominate provider or infrastructure bills, and which high-cost actions require hard limits instead of only reporting.
+- Free-to-paid transition boundary, including which operationally required features are outside the free tier, what usage cliffs exist, and whether growth creates a gradual cost curve or a sudden platform migration or plan upgrade.
 - Relevant command-intent contract entries for status, diff, build, tests, docs, release, or mustflow validation.
 <!-- mustflow-section: preconditions -->
@@ -75,14 +110,96 @@ Keep performance claims and budgets tied to declared thresholds, reproducible me
 3. Check nearby code, docs, templates, tests, and command metadata for duplicated performance statements or stale thresholds.
 4. Classify the measurement as deterministic, sampled, local-only, externally dependent, or unmeasured.
 5. If the change adds dependencies, generated output, or repeated work, identify the likely cost path and whether an existing alternative is available.
-6. Keep claims conservative: state the command, input scope, and whether caching, warm runs, parallelism, or generated files influenced the result.
-7. If a budget is exceeded, report the affected surface, budget source, measured value or unavailable measurement, likely cause, and smallest follow-up.
-8. Run the narrowest configured verification that proves the changed performance, package, docs, or mustflow surface.
+6. For database-backed lists, feeds, search, dashboards, or admin tables, define the intended query shape before accepting a performance claim.
+   - Count queries separately from returned rows when the local tooling supports it.
+   - Watch for per-row author, tag, attachment, permission, count, reaction, bookmark, or viewer-state lookups.
+   - Prefer joins for small required one-to-one data, batch queries for one-to-many data, aggregate or cached counters for counts, and read models or projections for complex feeds.
+   - Treat ORM lazy loading as a performance risk until the query count is bounded or measured.
+   - Treat repeated `GROUP BY`, `COUNT`, `SUM`, large date windows, free-form filtering, and dashboard scans on high-growth tables as reporting load. Prefer precomputed aggregates, read replicas, analytics stores, or bounded query windows over user-request database resources.
+   - Protect core user requests from analytics or reporting load with separate connection pools, read-only replicas, queued jobs, cached summaries, or explicit rate limits when the architecture has those tools.
+7. For read-heavy and write-heavy workload claims, check the ordering of mitigations before accepting the design.
+   - For read-heavy paths, first stabilize query patterns, then indexes, then precomputed read tables or projections, then caches, then replicas or separate search engines. Do not add cache first when invalidation is unclear.
+   - For write-heavy paths, account for index write cost, audit-log amplification, ledger writes, lock contention, hot counters, same-row balance or inventory updates, and retry or idempotency overhead.
+   - Treat current-value fields such as balances, counts, or rankings as derived when a ledger, event, or snapshot is the real evidence source.
+8. For caching work, classify the cache layer: browser, CDN, server response, query-result cache, search index, precomputed ranking or statistics, generated page, or generated API projection.
+9. Check whether the cache key comes from normalized input rather than raw URL order, casing, default values, arbitrary range values, or temporary UI state. Include a cache version when the response shape, filter logic, ranking formula, or visibility rule can change.
+10. Check invalidation before accepting a cache: name the source data, affected cache tags or dependencies, purge trigger, rebuild source, stale-response behavior, and whether failures degrade safely.
+   - Ask whether the cache can be flushed. If flushing only increases latency, report it as cache; if it destroys sessions, queues, locks, rate-limit state, user state, or permissions, report it as runtime storage and require a different durability budget.
+   - Check hot keys such as global home feeds, popular lists, pricing data, and common search terms. Report sharding, replication, request coalescing, local memoization, or CDN strategy when one key can receive disproportionate traffic.
+   - Check stampede behavior. Prefer TTL jitter, single-flight refresh, stale-while-revalidate, background refresh, or prewarming over letting simultaneous misses hit the origin together.
+11. For ranking, trending, search, and faceted-list APIs, prefer precomputed snapshots, generated indexes, or bounded caches over per-request full aggregation when traffic can spike.
+12. For file upload and download paths, identify whether the app server handles raw bytes or only issues signed object-storage URLs.
+   - Large uploads, image processing, document conversion, video conversion, and archive extraction should not monopolize request memory or bandwidth when they can be direct-to-storage or worker-driven.
+   - Treat app-local file serving as a scalability and failure-isolation risk once user files are a product feature, especially with multiple servers, redeploys, or CDN needs.
+13. Ensure admin, private, authenticated, or personalized responses use no shared cache. Require `no-store` or private-cache behavior where leaking data would be worse than serving slower responses.
+14. For external or auxiliary dependencies on a critical path, check timeout, retry, backoff, circuit-breaker, fallback, and feature-flag behavior. A slow AI, search, analytics, email, or statistics dependency should not consume the whole request budget unless the user-visible operation truly depends on it.
+15. For scaling choices, locate the bottleneck before accepting the mitigation.
+   - If CPU is the bottleneck, consider a larger instance, more processes, worker processes, or worker threads for CPU-heavy work before distributing state across many app hosts.
+   - If the database is the bottleneck, check query shape, indexes, slow queries, transaction length, connection pooling, and N+1 behavior before adding app servers that may exhaust connections faster.
+   - If an external API is the bottleneck, use queueing, timeout budgets, limited retries, circuit breakers, degraded behavior, and rate limits rather than letting user requests wait indefinitely.
+   - If regional latency is the bottleneck, consider edge or regional routing only for short, independent paths. Do not move database-write-heavy or dependency-heavy business logic to edge runtime only because the edge is faster for simple responses.
+   - Treat serverless and edge scaling as capacity tools with their own limits: cold starts, timeouts, connection reuse, provider compatibility, bundle size, and cost cliffs still need budgets.
+16. For worker and retry paths, check whether retryable work is bounded.
+   - Prefer accepting work quickly, persisting a job or outbox record, and returning a queued or processing status over making HTTP wait for slow external completion.
+   - Use backoff with jitter so many failing jobs or clients do not retry at the same time.
+   - Separate queues, worker pools, rate limits, or concurrency budgets when AI, email, analytics, embeddings, webhooks, billing, and imports have different urgency or failure policies.
+   - Report dead-letter growth, retry exhaustion, provider rate limits, and unknown provider outcomes as capacity and reliability risks, not just error-handling details.
+   - Check that queues with different urgency do not share an unbounded worker pool when one backlog can delay payments, entitlement grants, password resets, webhook processing, or other critical work.
+   - Treat manual replay and dead-letter review as operational capacity. A dead-letter queue that no one watches is a delayed outage, not a solved failure.
+17. For search and analytics volume, check whether derived systems can be rebuilt and observed without overwhelming the core path.
+   - Search indexes should be rebuildable from source records, and full or partial reindex cost should be bounded before relying on provider search as the only serving path.
+   - Search relevance claims should cite a representative query set, expected top results, or explicit unmeasured status instead of relying on a dashboard impression.
+   - Logs and analytics events should not grow without retention, sampling, aggregation, export, or cold-storage policy when storage, query, or SaaS event pricing can become the bottleneck.
+18. For AI cost and provider-usage budgets, treat cost as a first-class performance and product limit.
+   - Do not rely on provider dashboards as the only source for user, workspace, feature, model, cache, retry, or plan-level cost decisions.
+   - Prefer a single AI call boundary that records request-level usage before cost is summarized. Scattered direct SDK calls hide feature economics and retry amplification.
+   - Track user request id separately from provider call id so one user action with retries, fallbacks, embeddings, tool calls, or evaluations can be costed without being counted as multiple user actions.
+   - Store usage in integer cost units plus a pricing snapshot or version reference. Do not recompute historical costs from the current provider price sheet.
+   - Distinguish app response cache, provider prompt cache, embedding cache, and search-result cache. A cache hit that avoids a provider call is not the same as a discounted provider input.
+   - Apply preflight limits for plan, account, request length, model tier, monthly cost, request count, input tokens, and output tokens; record actual usage afterward and update rollups or limits.
+   - Treat provider console budgets, account-level spend caps, and rate-limit headers as secondary guardrails unless they are proven hard stops. Product-owned limits should block, downgrade, queue, or reject high-cost work before the provider call.
+   - For agentic or multi-call AI work, cap steps, tool calls, total tokens, total estimated cost, and total time. One visible user request can create many provider calls, so request-count limits alone are not enough.
+   - Keep budget decisions inspectable. Record allow, block, downgrade, or emergency-disable decisions with safe identifiers, estimated cost, remaining budget, selected model, and blocked reason.
+   - Include failed, timed-out, cancelled, and retried calls in the budget review when they may consume provider quota or money.
+19. For vendor pricing and free-tier claims, compare the tool's pricing unit with the product's revenue unit.
+   - Check whether the product earns by customer, workspace, seat, transaction, storage, content item, automation run, active user, or AI usage, then compare that with how the vendor charges.
+   - Treat user-seat, monthly-active-user, API-call, event, storage, bandwidth, workspace, project, advanced-permission, audit-log, export, and overage pricing as structural risk when the product can grow in a different direction.
+   - Identify operationally required features that are plan-gated, such as backups, audit logs, SSO, role management, webhooks, API limits, data export, retention, monitoring, or support. A generous free tier can still be risky when the paid cliff lands on a feature that is hard to replace later.
+   - Report pricing cliffs and unverified provider terms as margin risk rather than performance risk alone. "Cheap now" is not evidence that the tool remains cheap at the product's next scale.
+20. For pricing and internal metering claims, separate user-perceived value from system cost.
+   - Identify the value unit: seat, workspace, project, document, transaction, plan, or another unit customers can understand.
+   - Identify the cost units: storage, transfer, database usage, search, AI or external API calls, log or analytics volume, email or notification sends, automation runs, file conversions, queue work, payment fees, and support load.
+   - Prefer simple external plans plus internal limits for cost-bearing resources. A seat or workspace plan can include storage, AI credits, search quotas, automation runs, and shared tenant pools without exposing every raw request count to the customer.
+   - Treat "unlimited" as a claim that must have a natural human limit, fair-use policy, rate limit, abuse detection, or hard internal cap. Do not let unlimited AI, media conversion, storage, traffic, search, automation, realtime, webhook, or log retention become an unbounded liability.
+   - Model contribution margin as customer revenue minus customer variable cost. Report which variable costs are included and which are unmeasured.
+   - Compare P50, P90, and P99 users or tenants when possible. Averages can hide a small number of heavy users who destroy margin.
+   - Meter by workspace or organization as well as user when team usage is pooled. Seat-level credits may be sold, but shared tenant pools often better match real usage.
+21. For user-action fan-out, count internal work rather than only the visible request.
+   - Name the jobs triggered by one action, such as uploads, transforms, OCR, AI calls, embeddings, search indexing, notification sends, event writes, log writes, analytics exports, and webhook deliveries.
+   - Identify which fan-out work is synchronous, queued, retryable, deduplicated, rate-limited, or skipped under load.
+   - Treat hidden retries, failed calls, and duplicate worker execution as cost multipliers when they consume provider quota or infrastructure.
+22. Keep claims conservative: state the command, input scope, query-count boundary, cache boundary, worker boundary, search rebuild or quality boundary, log and analytics volume boundary, AI cost boundary, vendor cost boundary, pricing value/cost boundary, critical-path dependency boundary, and whether caching, warm runs, parallelism, stale responses, precomputed snapshots, generated files, queues, provider limits, pricing cliffs, user-action fan-out, or external services influenced the result.
+23. If a budget is exceeded, report the affected surface, budget source, measured value or unavailable measurement, likely cause, and smallest follow-up.
+24. Run the narrowest configured verification that proves the changed performance, package, docs, or mustflow surface.
 <!-- mustflow-section: postconditions -->
 ## Postconditions
 - Performance claims have a budget source, measurement method, or explicit unverified status.
+- Database-backed read paths have an explicit query-count, row-count, relation-loading, or unmeasured-risk note when N+1 or aggregate cost is plausible.
+- Read-heavy and write-heavy claims identify query patterns, indexes, projections, cache invalidation, write contention, audit or ledger amplification, and retry overhead before claiming a store or cache is sufficient.
+- File upload and download paths identify app-server bandwidth, memory, conversion, object-storage, CDN, and worker boundaries when those costs are plausible.
+- Cache behavior has an owner, key source, freshness rule, invalidation path, private/shared boundary, and rebuild or fallback story when cache is part of the claim.
+- Analytics, dashboard, and reporting paths do not silently share unbounded operational query cost with core user requests, or the remaining risk is reported.
+- Critical-path external dependencies have timeout, retry, fallback, feature-flag, or degraded-mode boundaries when performance or availability can affect core use.
+- Vertical scaling, horizontal scaling, serverless, edge, worker, and process-count claims identify the actual bottleneck and the state, connection, cron, queue, or provider limits that could make the chosen scaling path worse.
+- Worker queues, retry policies, provider rate limits, and dead-letter paths have capacity boundaries when auxiliary work can starve core flows.
+- Search index rebuilds, search quality checks, log volume, analytics event volume, and queue dead-letter review have explicit measured or unmeasured status when they affect latency, cost, or operational visibility.
+- AI usage and cost claims have request, provider-call, feature, model, cache, retry, pricing-snapshot, and plan-limit boundaries when model calls can affect cost or quota.
+- AI gateway claims have preflight hard-limit, provider-budget, downgrade, agent-step, tool-call, timeout, and emergency-disable boundaries when autonomous or high-cost model work can affect margin.
+- Vendor pricing, free-tier, plan-gated feature, and usage-growth claims are tied to the product's revenue unit or reported as unverified margin risk.
+- Pricing claims separate customer-visible value units from internal cost units, and identify included limits, credit pools, overuse behavior, tenant-level controls, free-plan loss budget, and unverified margin risk.
+- Usage-cost claims account for user-action fan-out, hidden retries, P50/P90/P99 heavy-user shape, and contribution margin when high-cost actions can dominate customer economics.
 - Thresholds and benchmark-facing docs, tests, package metadata, generated output notes, and command contracts are synchronized where they overlap.
 - Final reports separate measured evidence from estimates, local observations, and suggested follow-up work.
@@ -115,6 +232,16 @@ Use a narrower configured benchmark, asset, build, docs, or test intent when it
 - Performance surface reviewed
 - Budget source or missing budget
 - Measurement method and boundary
+- Query-count, N+1, read-model, and aggregate-cost boundary when relevant
+- Operational versus analytics query boundary when relevant
+- Cache layer, key, freshness, invalidation, hot-key, stampede, flush-tolerance, and private/shared boundary when relevant
+- Critical-path external dependency timeout, retry, fallback, worker, queue, rate-limit, and dead-letter boundary when relevant
+- Scaling bottleneck, process-count, database-connection, serverless, edge, worker, and cron-ownership boundary when relevant
+- Search rebuild, search quality, log volume, analytics retention, queue backlog, and dead-letter review boundary when relevant
+- AI usage, token, provider-call, model-tier, retry-cost, cache-hit, pricing-snapshot, and plan-limit boundary when relevant
+- AI gateway hard limit, provider budget guardrail, model downgrade, agent loop, tool-call, timeout, and emergency kill-switch boundary when relevant
+- Vendor pricing unit, customer value unit, internal cost unit, tenant limit, free-tier cliff, plan-gated operations feature, contribution-margin, P50/P90/P99 heavy-user, and revenue-alignment boundary when relevant
+- User-action fan-out, hidden retry, and internal work amplification when relevant
 - Thresholds, claims, or metadata synchronized
 - Command intents run
 - Skipped measurements and reasons

package/templates/default/locales/en/.mustflow/skills/pure-core-imperative-shell/SKILL.md

@@ -2,11 +2,11 @@
 mustflow_doc: skill.pure-core-imperative-shell
 locale: en
 canonical: true
-revision: 6
+revision: 7
 lifecycle: mustflow-owned
 authority: procedure
 name: pure-core-imperative-shell
-description: Apply this skill when business decisions, validation, authorization, pricing, eligibility, state transitions, domain events, effect descriptions, or calculations are mixed with I/O such as databases, HTTP handlers, repositories, SDK calls, files, queues, logs, metrics, clocks, randomness, environment reads, payments, emails, or framework request/response objects.
+description: Apply this skill when business decisions, validation, authorization, pricing, discounts, credits, permissions, eligibility, state transitions, domain events, effect descriptions, or calculations are mixed with I/O such as databases, ORM entities, HTTP handlers, repositories, SDK calls, files, queues, logs, metrics, clocks, randomness, environment reads, payments, emails, or framework request/response objects.
 metadata:
   mustflow_schema: "1"
   mustflow_kind: procedure
@@ -41,6 +41,7 @@ Core decides. Shell does.
 - Business rules are mixed with database access, HTTP handlers, repositories, external SDK calls, framework objects, logs, metrics, clocks, randomness, generated identifiers, environment reads, payments, emails, files, queues, or caches.
 - Code contains meaningful `if`, `switch`, pricing, permission, eligibility, expiration, quota, scoring, matching, validation, or state-transition logic and also performs side effects.
 - Several pricing, discount, permission, scoring, matching, recommendation, or provider-choice policies need to remain pure while being selected at runtime.
+- ORM models, entity hooks, lifecycle hooks, decorators, lazy-loaded relations, or active-record methods contain pricing, permissions, discounts, credits, entitlement, subscription, point, or state-transition decisions.
 - Core tests require database mocks, HTTP mocks, SDK mocks, clock mocks, logger mocks, or framework request objects.
 - A handler, repository, adapter, worker, or event consumer hides business policy.
 - A state change must produce domain events or effect descriptions without executing those effects immediately.
@@ -64,6 +65,7 @@ Core decides. Shell does.
 - The business action, command, workflow, or state change being implemented or refactored.
 - The decision the domain must make and the facts needed to make it.
 - The current side effects, including persistence, external calls, messages, logs, metrics, generated identifiers, time, randomness, and environment reads.
+- ORM-specific behavior involved in the current decision, such as relation includes, lazy loading, model methods, hooks, transactions, repository calls, and generated database row types.
 - Local patterns for result types, domain errors, events, effects, outbox messages, repositories, adapters, mappers, and tests.
 - Existing behavior evidence when refactoring code that already runs.
 - Relevant command-intent contract entries for verification.
@@ -97,7 +99,7 @@ Core decides. Shell does.
 1. Locate the mixed responsibility.
    - Decision signals: `if`, `switch`, status checks, role checks, amount calculations, eligibility checks, validation rules, state transitions, deadline rules, quota rules, and domain error choices.
-   - Execution signals: `await`, database access, external SDK calls, HTTP clients, file access, logging, metrics, email sending, message publishing, cache access, `new Date()`, `Date.now()`, generated identifiers, randomness, and environment reads.
+   - Execution signals: `await`, database access, ORM relation access, active-record model methods, ORM hooks, external SDK calls, HTTP clients, file access, logging, metrics, email sending, message publishing, cache access, `new Date()`, `Date.now()`, generated identifiers, randomness, and environment reads.
 2. Name the pure decision.
    - Prefer verbs such as `decide`, `calculate`, `derive`, `validate`, `transition`, `classify`, `price`, `score`, `select`, `can`, `is`, or `has`.
    - Avoid naming the core after a route, ORM model, SDK method, provider, or transport operation.
@@ -137,6 +139,9 @@ Core decides. Shell does.
    - Map decisions to persistence commands after core returns.
    - Database constraints can protect integrity, but they must not be the only place where business policy exists.
    - Use optimistic locking, version checks, unique constraints, and transactions in the shell when stale decisions or duplicates are possible.
+   - Keep ORM syntax, eager-loading choices, lazy-loading behavior, model hooks, decorators, and generated entity types out of business rules. Treat the ORM as a persistence tool, not the owner of domain policy.
+   - Do not hide notifications, payments, credit grants, permission changes, audit writes, or other business effects in ORM create, update, or delete hooks.
+   - For complex reads, allow a query service, projection, or explicit SQL-style read model instead of forcing all screens through the write-domain model.
 10. Keep external side effects outside local transactions.
     - Do not hold a database transaction open while calling slow network services.
     - When local state and external messages must both be reliable, save state and outbox messages in one transaction, then publish after commit.
@@ -145,8 +150,9 @@ Core decides. Shell does.
     - If status, state, phase, step, or stage controls allowed actions, use `state-machine-pattern` to define the transition table, event names, guards, terminal states, effect descriptions, invalid transitions, and tests.
     - Keep the transition function pure and let the shell persist state, transition history, idempotency records, and outbox rows.
 12. Use strategies for interchangeable pure policies when needed.
-    - If pricing, discount, scoring, ranking, matching, permission, recommendation, or provider-choice logic has several methods with one shared purpose, use `strategy-pattern`.
-    - Keep strategy selection in a selector, resolver, or shell boundary and keep strategy execution behind a shared pure contract when possible.
+   - If pricing, discount, scoring, ranking, matching, permission, recommendation, or provider-choice logic has several methods with one shared purpose, use `strategy-pattern`.
+   - Keep strategy selection in a selector, resolver, or shell boundary and keep strategy execution behind a shared pure contract when possible.
+   - Return explainable policy results for pricing, discounts, credits, entitlements, and permissions, such as original amount, applied rules, rejected rules, final amount, tax, rounding, and reason codes, so UI, receipts, refunds, support, and analytics do not recalculate the rule independently.
 13. Use command structure for state-changing shell units when needed.
     - If one user or system intent needs explicit payload, context, authorization, transaction, idempotency, outbox, audit, retry, concurrency, or queue and worker reuse, use `command-pattern` to shape the shell execution unit.
     - Keep the pure core as the decision maker and the command handler as the orchestrator.
@@ -167,6 +173,7 @@ Core decides. Shell does.
 - Given the same input, the core returns the same output.
 - The core can run without a database, network, file system, queue, cache, server, framework, logger, clock, environment variables, random generator, or generated identifier service.
 - Business rules are visible in core functions, not hidden inside handlers, repositories, adapters, or database queries.
+- Business rules are not hidden inside ORM models, relation loading, lifecycle hooks, decorators, or generated entity methods.
 - The shell owns all I/O, boundary mapping, persistence, transactions, retries, idempotency, logs, metrics, and side-effect execution.
 - Business rule tests do not require mocks.
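
As an illustration of the split described in this skill, here is a minimal TypeScript sketch of a pure pricing decision that returns an explainable result, with a thin shell around it. Every name here (`Rule`, `PriceDecision`, `decidePrice`, `OrderStore`, `priceOrder`, and the reason code) is hypothetical and not part of mustflow; the rounding policy is also an assumption chosen only to keep the example deterministic.

```typescript
// Hypothetical types for illustration only; none of these names come from mustflow.
type Rule = { code: string; percentOff: number; minSubtotalCents: number };

type PriceDecision = {
  originalCents: number;
  appliedRules: string[];
  rejectedRules: { code: string; reason: string }[];
  finalCents: number;
};

// Pure core: no I/O, no clock, no ORM types. Same input, same output.
function decidePrice(subtotalCents: number, rules: Rule[]): PriceDecision {
  const appliedRules: string[] = [];
  const rejectedRules: { code: string; reason: string }[] = [];
  let finalCents = subtotalCents;
  for (const rule of rules) {
    if (subtotalCents < rule.minSubtotalCents) {
      // Record why the rule was skipped so callers do not recalculate it.
      rejectedRules.push({ code: rule.code, reason: "SUBTOTAL_BELOW_MINIMUM" });
      continue;
    }
    // Assumed policy: round each discount to whole cents.
    finalCents -= Math.round((finalCents * rule.percentOff) / 100);
    appliedRules.push(rule.code);
  }
  return { originalCents: subtotalCents, appliedRules, rejectedRules, finalCents };
}

// Shell boundary: the only place that touches persistence.
interface OrderStore {
  loadSubtotalCents(orderId: string): Promise<number>;
  savePrice(orderId: string, decision: PriceDecision): Promise<void>;
}

// Shell: gathers facts, calls the core, then maps the decision to a persistence command.
async function priceOrder(
  store: OrderStore,
  orderId: string,
  rules: Rule[],
): Promise<PriceDecision> {
  const subtotalCents = await store.loadSubtotalCents(orderId);
  const decision = decidePrice(subtotalCents, rules);
  await store.savePrice(orderId, decision);
  return decision;
}
```

Because `decidePrice` takes plain values and returns plain data, it can be tested by calling it directly, without database, ORM, or clock mocks. Only `priceOrder` would need a stand-in `OrderStore`.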

package/templates/default/locales/en/.mustflow/skills/result-option/SKILL.md

@@ -2,11 +2,11 @@
 mustflow_doc: skill.result-option
 locale: en
 canonical: true
-revision: 2
+revision: 3
 lifecycle: mustflow-owned
 authority: procedure
 name: result-option
-description: Apply this skill when expected failures, meaningful absence, null or undefined returns, thrown business errors, boolean success flags, raw string errors, repository lookups, validation, parsing, external adapter errors, or boundary error mapping need explicit Result and Option handling.
+description: Apply this skill when expected failures, meaningful absence, null or undefined returns, thrown business errors, boolean success flags, raw string errors, repository lookups, validation, parsing, external adapter errors, API error response contracts, or boundary error mapping need explicit Result and Option handling.
 metadata:
   mustflow_schema: "1"
   mustflow_kind: procedure
@@ -44,6 +44,7 @@ Expected failure must be data. Meaningful absence must be data. Exceptions are o
 - A repository lookup can fail due to persistence and can also legitimately find no record.
 - External SDK, database, HTTP, payment, email, filesystem, or framework exceptions leak into business logic.
 - A controller, adapter, or command handler must convert typed failures into HTTP, UI, CLI, or queue responses.
+- API responses need stable machine-readable error codes, safe messages, request identifiers, and details that do not expose raw causes, provider payloads, or sensitive data.
 - Tests need stable success, failure, and absence cases without relying on thrown exceptions.
 <!-- mustflow-section: do-not-use-when -->
@@ -127,6 +128,7 @@ Expected failure must be data. Meaningful absence must be data. Exceptions are o
    - Repositories that can fail and may not find data should return `Result<Option<T>, E>`.
    - Services may convert an `Option` into a domain error when the value is required.
    - Controllers, CLI handlers, queue consumers, and UI boundary code should convert `Result` into protocol responses.
+   - Public and integration APIs should expose a consistent error envelope with stable code, safe message, optional safe details, and request id when available; they should not leak internal `Result` names, stack traces, storage keys, SQL, provider responses, or sensitive raw causes.
    - Do not serialize internal `Result` or `Option` shapes as public API responses unless that is the explicit public contract.
 9. Log once at the outer boundary.
    - Do not log the same error at every layer.