npm - mustflow - Versions diffs - 2.85.4 → 2.99.0 - Mend

mustflow 2.85.4 → 2.99.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (78)

package/templates/default/locales/en/.mustflow/skills/INDEX.md CHANGED

@@ -2,7 +2,7 @@
 mustflow_doc: skills.index
 locale: en
 canonical: true
-revision: 177
+revision: 182
 authority: router
 lifecycle: mustflow-owned
 ---
@@ -132,6 +132,19 @@ refer to `AGENTS.md` and `.mustflow/config/commands.toml` to implement the most
   per-request latency review for I/O counts, DB/ORM/Redis/external-call fan-out,
   pagination/count/index fit, transactions, pool wait, cache miss paths, serialization, response
   size, request-path CPU, or trace/span observability rather than only general hot-path repetition.
+- Use `api-failure-triage` as a primary route when an API failure, SDK failure, browser request
+  failure, CORS preflight, gateway/CDN/proxy issue, wrong status/body, auth failure, retry, rate
+  limit, cache, OpenAPI drift, or deployment-config difference is not yet localized to one boundary.
+- Use `auth-flow-triage` as a primary route when login, OAuth, OIDC, cookie, session, token,
+  passkey, MFA, refresh, logout, or authorization-after-login failure is not yet localized to the
+  identity, browser, provider, proxy, token, session-store, clock, rate-limit, or permission boundary.
+- Use `docker-runtime-triage` as a primary route when Docker daemon, context, Compose, container
+  start, crash loop, health, image pull, port, DNS, network, volume, storage, cgroup, OOM, signal,
+  registry, or runtime failure is not yet localized to host, daemon, image, app, network, storage,
+  resource, or registry boundaries.
+- Use `ci-pipeline-triage` as a primary route when CI/CD workflow, pipeline, job, runner, trigger,
+  cache, artifact, deployment job, required check, or post-deploy verification failure is not yet
+  localized to trigger, runner, environment, build, test, artifact, deploy, or verification.
 - Use `web-render-performance-review` as an adjunct when web frontend routes need first-render,
   Core Web Vitals, LCP, CLS, FCP, TTFB, critical CSS, font, image, iframe, third-party script,
   hydration, first-view data, resource-hint, CDN/cache, route-prefetch, or long-task review.
@@ -150,6 +163,9 @@ refer to `AGENTS.md` and `.mustflow/config/commands.toml` to implement the most
   per-frame browser work, INP delay, scroll or animation jank, style recalculation, layout, paint,
   compositing, DOM size, selector cost, event scheduling, long tasks, framework rerender cost, or
   hydration cost.
+- Use `motion-system-contract-review` as an adjunct when UI motion needs state-transition, event,
+  timeline-track, interruption, settlement, reduced-motion, channel-collision, role-binding, or
+  async-feedback review.
 - Use `app-startup-performance-review` as an adjunct when installed app startup needs review from
   icon tap or process launch to first frame and fully usable state, including TTID, TTFD,
   `reportFullyDrawn()`, Application or AppDelegate work, auto-init, DI graph creation, SDK
@@ -288,6 +304,18 @@ refer to `AGENTS.md` and `.mustflow/config/commands.toml` to implement the most
 - Use `database-query-bottleneck-review` as an adjunct when database review needs to catch query
   bottlenecks from cardinality explosion, N+1 access, overfetching, unstable pagination,
   index-defeating predicates, plan skew, or long transaction scope before live plan evidence.
+- Use `search-index-integrity-review` as a primary route when keyword search, full-text search,
+  Elasticsearch, OpenSearch, Lucene-style indexing, aliases, bulk ingestion, refresh visibility,
+  analyzer, synonym, autocomplete, pagination, shard failure, search quality, or search
+  performance needs source-to-search and query-contract evidence.
+- Use `vector-search-integrity-review` as a primary route when vector search, semantic search, RAG
+  retrieval, embeddings, ANN indexes, exact-versus-approximate search, filters, metadata,
+  namespaces, tenants, hybrid search, reranking, recall, latency, or golden-set behavior needs
+  retrieval-contract review.
+- Use `rag-pipeline-triage` as a primary route when a RAG, knowledge-base answer, grounded chat,
+  citation answer, document QA, or support bot failure is not yet localized to ingestion, parsing,
+  chunking, retrieval, filtering, reranking, context assembly, prompt construction, generation,
+  citation validation, answerability, access control, latency, or cost.
 - Use `database-json-modeling-review` as an adjunct when database review needs to decide whether
   JSON, jsonb, metadata, settings, raw payload, or dynamic keys should become typed columns,
   child tables, generated/computed columns, JSON indexes, schema versions, or key registries
@@ -370,6 +398,10 @@ routes. Event routes stay inactive until their event occurs.
 | A configured command intent or verification step fails | `.mustflow/skills/failure-triage/SKILL.md` | Failing intent and output tail | Failure cause only | misdiagnosis | `mustflow_check`; original failing intent | Root cause, fix, rerun result |
 | The same read, list, search, path, or review observation repeats without new evidence; a duplicate-call guard appears; or a review/completion claim would rely on stale, failed, truncated, directory-only, or missing evidence | `.mustflow/skills/evidence-stall-breaker/SKILL.md` | Repeated tool call signature, prior result or warning, claim at risk, inspected sources, and next different observation strategy | Investigation path, review wording, completion evidence, and the smallest in-scope skill or workflow wording when preserving the failure mode | hallucinated codebase claim, fake review finding, exhausted tool budget, or false completion | `changes_status`, `changes_diff_summary`, `mustflow_check` | Stalled observation, evidence ledger, changed strategy or stopped branch, downgraded claims, verification, and remaining evidence gaps |
 | A bug or confusing failure needs a fix before the smallest deterministic reproduction or cause is clear | `.mustflow/skills/repro-first-debug/SKILL.md` | Symptom, expected behavior, observed output, failing intent or action, likely changed files, and known flakiness or environment limits | Diagnostic reads, focused reproduction, temporary instrumentation, smallest fix, and symptom-tied regression guard | speculative fix, flaky reproduction, lingering debug output, broad unrelated test, or over-testing | `test_related`, `test_fast`, `mustflow_check` | Symptom, reproduction path or gap, hypotheses, observations, fix, original reproduction rerun, verification, and remaining risk |
+| Reported API, SDK, browser, mobile, webhook, gateway, CDN, load balancer, provider, wrong-status, wrong-body, CORS preflight, auth, rate-limit, cache, OpenAPI, or deployment-config failure is not yet localized to the client, network, proxy, app, database, cache, provider, or deployment boundary | `.mustflow/skills/api-failure-triage/SKILL.md` | Failing request packet, success comparator, boundary ledger, timing ledger, contract ledger, auth ledger, change ledger, redaction constraints, and configured command intents | Request/response evidence preservation, success/failure wire comparison, boundary localization, timing decomposition, status/body/content-type mapping, CORS/preflight split, redirect and proxy header checks, authn/authz split, retry/timeout/rate-limit/idempotency classification, cache and OpenAPI drift checks, focused reproduction fixtures, and directly synchronized docs or templates | log-first debugging, SDK argument theater, missing failing packet, success-only comparison, CORS blamed for server-to-server calls, redirect losing auth or method, proxy stripping idempotency or trace headers, `200` error body, HTML body with JSON content type, authn/authz collapse, object-auth incident missed, clock-skew flake, retry storm, non-idempotent replay, 429 hidden as 500, stale CDN or browser cache, OpenAPI drift, deployment config drift, or unfalsifiable log reading | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | API failure triaged, request packet and comparator, boundary and timing ledger, localized cause or evidence gap, hypotheses killed or open, fix or recommendation, evidence level, verification, and remaining API-failure risk |
+| Login, signup, logout, refresh, password reset, magic link, passkey, MFA, OAuth, OIDC, JWT, cookie, session, token exchange, provider callback, account-linking, or authorization-after-login behavior is failing or intermittent before the failing identity boundary is known | `.mustflow/skills/auth-flow-triage/SKILL.md` | Auth attempt packet, stage ledger, token and session ledger, browser and proxy ledger, provider ledger, denial and privacy ledger, redaction constraints, and configured command intents | Auth stage localization, sanitized success/failure comparison, cookie and CORS credential checks, proxy trust and redirect URI checks, state, nonce, PKCE, issuer and subject checks, token and JWKS validation, session refresh and logout checks, passkey and MFA checks, account-linking checks, focused denial tests, and directly synchronized docs or templates | login-as-one-bucket debugging, token or cookie logging, account enumeration, loose redirect matching, state or nonce bypass, PKCE mismatch hidden, issuer plus subject ignored, SameSite or Secure drift, forwarded-header trust bug, refresh-token race, session fixation, email-only account linking, stale token claims, clock-skew flake, broad CORS wildcard, or unverified provider-console assumption | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `test_audit`, `docs_validate_fast`, `test_release`, `mustflow_check` | Auth flow triaged, failing stage and comparator, cookie/proxy/provider/token/session/passkey/MFA findings, fix or recommendation, evidence level, verification, and remaining auth-flow risk |
+| Docker Engine, Docker Desktop, daemon, context, Compose, container start, crash loop, health check, image pull, build cache, port mapping, DNS, network, volume, bind mount, storage, proxy, registry, cgroup, OOM, signal handling, PID 1, or runtime behavior is failing before the failing container boundary is known | `.mustflow/skills/docker-runtime-triage/SKILL.md` | Runtime packet, container ledger, actual config ledger, host resource ledger, network ledger, storage ledger, evidence-preservation constraints, and configured command intents | Host, daemon, context, image, container, Compose, process, resource, storage, network, proxy, registry, and build boundary localization; evidence preservation before cleanup; focused Dockerfile, Compose, health, entrypoint, network, volume, resource, docs, fixture, or test edits only after localization | prune-before-evidence, restart loop hiding first error, app blame before daemon proof, logs-only diagnosis, exit code 137 treated as automatic OOM, PID 1 signal loss, container localhost confusion, bind mount hiding image files, Compose variable drift, tag identity confusion, stale build cache, broad firewall reset, volume deletion, or unbounded raw Docker command | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Docker runtime triaged, boundary findings, evidence preserved and missing, fix or recommendation, evidence level, verification, and remaining Docker runtime risk |
+| CI/CD workflow, pipeline, job, matrix, trigger, required check, runner, cache, artifact, deployment step, or post-deploy verification is failing, skipped, queued, flaky, slow, or green despite broken output before the failing pipeline boundary is known | `.mustflow/skills/ci-pipeline-triage/SKILL.md` | Failure classification, run identity ledger, last-good comparison, boundary ledger, redaction constraints, and configured command intents | Trigger, parsed graph, queue, runner, environment, dependency, build, test, cache, artifact, deploy, smoke, and final status localization; false-green checks; safe diagnostic evidence; focused workflow, package, docs, fixture, or test edits only after localization | last-red-line debugging, latest-code comparison, rerun-green treated as fixed, skipped required check, path-filter pending state, hidden `continue-on-error`, queue time mistaken for build time, floating `latest`, secret logging, sleep-based service readiness, cache-as-artifact confusion, deploying untested rebuilt artifacts, fork token scope surprise, unguarded environment concurrency, or zero-exit deploy without smoke evidence | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | CI pipeline triaged, failure shape and localized boundary, run identity and last-good comparison, trigger/runner/environment/build/test/cache/artifact/deploy/verification findings, verification, and remaining CI pipeline risk |
 ### General Code Change
@@ -382,12 +414,13 @@ routes. Event routes stay inactive until their event occurs.
 | Quality metrics, line-count limits, complexity budgets, lint/type/test gates, or assistant-authored changes may be gamed through long-line stuffing, multiple statements per line, new suppressions, test bypass markers, type escapes, placeholder implementations, generated/vendor logic, giant config blobs, dispatch maps, or helper/util/manager/common containers | `.mustflow/skills/quality-gaming-guard/SKILL.md` | User goal, intended quality outcome behind the metric, current diff, quality gates, formatter/lint/type/test rules, generated/vendor policy, helper naming conventions, suppression baseline, and command contract entries | Real responsibility split, removal of gaming patterns, focused tests or quality-gate evidence, bounded `quality_gaming_check` command contract, and directly synchronized docs or templates | cosmetic metric compliance, line stuffing, validation suppression, test bypass, broad type escape, placeholder success, generated/vendor hiding, junk-drawer helper extraction, or legacy baseline confused with new regression | `quality_gaming_check`, `changes_status`, `changes_diff_summary`, `test_related`, `lint`, `build`, `mustflow_check` | Quality goal, gaming patterns inspected, patterns removed or intentionally left, baseline versus new-regression decision, verification, and remaining quality-gaming risk |
 | Prompts, prompt builders, system or developer messages, RAG prompt assembly, few-shot examples, structured outputs, tool-use instructions, model selection, reasoning-effort settings, eval sets, refusal or fallback handling, prompt versioning, or AI feature completion criteria are created, changed, reviewed, or reported | `.mustflow/skills/prompt-contract-quality-review/SKILL.md` | Prompt contract ledger, input ledger, authority ledger, output schema, tool policy, model/runtime ledger, RAG evidence ledger, eval ledger, changed files, and command contract entries | Prompt builders, prompt templates, schemas, validators, eval fixtures, boundary examples, tool policies, fallback states, completion definitions, docs, tests, and directly synchronized templates | prompt-as-function gap, user input treated as authority, buried RAG evidence, happy-path-only examples, JSON-parse theater, unpinned production model, hidden prompt storage, raw chain-of-thought request, guessed tool parameters, missing failure state, unbounded reasoning/token cost, or vibe-based prompt improvement claim | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Prompt contract reviewed, function boundary, authority and source separation, eval and semantic validation status, model/runtime policy, RAG/tool/failure/completion states, verification, and remaining prompt-contract risk |
 | LLM answers, RAG responses, citations, source grounding, claim extraction, evidence IDs, answerability states, abstain behavior, retrieval thresholds, tool-backed facts, output validators, LLM judges, or hallucination-control metrics are created, changed, reviewed, or reported | `.mustflow/skills/llm-hallucination-control-review/SKILL.md` | Answer contract ledger, evidence ledger, claim ledger, tool ledger, validator ledger, eval ledger, observability ledger, changed files, and command contract entries | Answerability states, abstain states, missing-information states, source-coverage gates, claim maps, evidence-ID requirements, citation validators, retrieval thresholds, chunk metadata, tool-parameter ownership, deterministic calculators, domain validators, eval fixtures, tests, docs, route metadata, and directly synchronized templates | unsupported factual claim, fabricated citation, source ID invention, weak retrieval gate, noisy semantic-only retrieval, chunk context loss, summary-on-summary hallucination, guessed tool parameter, model arithmetic, source-priority conflict, LLM judge overtrust, low-temperature theater, missing abstain path, missing dirty eval, false citation metric gap, or unobservable grounding drift | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Hallucination-control surface reviewed, answerability and abstain states, evidence IDs, claim map, citations, source coverage, validators, retrieval thresholds, tool ownership, evals, metrics, verification, and remaining hallucination-control risk |
+| RAG, knowledge-base answer, grounded chat, citation answer, retrieval-augmented support bot, or document QA flow is wrong, stale, unsupported, slow, leaking data, over-refusing, or not yet localized to ingestion, parsing, chunking, retrieval, filtering, reranking, context assembly, prompt construction, generation, citation validation, or answerability boundaries | `.mustflow/skills/rag-pipeline-triage/SKILL.md` | Symptom classification, trace ledger, source ledger, comparison ledger, eval ledger, privacy ledger, changed files, and command contract entries | End-to-end trace preservation, source availability and parsed-text checks, chunk metadata, duplicate and stale source checks, no-retrieval/current-context/gold-context comparison, keyword/vector/hybrid/retriever/reranker/context/prompt/generation/citation/answerability localization, safe synthetic fixtures, metrics, docs, and directly synchronized templates | model scapegoating, tuning top-k before source proof, original-document theater while parsed text is broken, stale source mixing, filter blamed as vector failure, reranker candidate starvation, critical evidence truncated or buried, retrieved text treated as instruction, citation decoration, answerability missing state, private corpus dump, access filter bypass, or single satisfaction score hiding layer failure | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | RAG pipeline triaged, localized boundary, trace/source/comparison/eval/metric/privacy ledgers, layer findings, fix or recommendation, evidence level, verification, and remaining RAG pipeline risk |
 | LLM API calls, prompt assembly, chat history, RAG context, tool schemas, structured output schemas, model routing, reasoning settings, token budgets, provider prompt caching, app-level response caching, retries, batch or flex processing, predicted outputs, image or file inputs, or LLM cost metrics are created, changed, reviewed, or reported | `.mustflow/skills/llm-token-cost-control-review/SKILL.md` | Cost surface ledger, request ledger, cache ledger, context ledger, output ledger, routing ledger, observability ledger, changed files, and command contract entries | Request builders, prompt prefix ordering, canonical serialization, prompt hashes, cache keys, token counters, budget guards, model routers, context trimming, RAG packing, tool and schema payloads, output patch formats, retry repair paths, metrics, logs, tests, docs, route metadata, and directly synchronized templates | prompt-cache prefix drift, volatile field before stable prefix, unmeasured token count, full transcript replay, RAG chunk bloat, oversized tool or JSON schema payload, expensive model default, unbounded reasoning, no visible output after token cap, full-output regeneration, full-context retry replay, app cache key leak, predicted-output cost confusion, image or file token surprise, or per-call cost hiding cost-per-success regression | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `prompt_cache_audit`, `docs_validate_fast`, `test_release`, `mustflow_check` | LLM token-cost surface reviewed, cost unit and measurement source, stable prefix and cache behavior, app cache/history/RAG/tool/schema/input choices, routing/reasoning/output/retry/Batch/Flex/prediction choices, observability, verification, and remaining token-cost risk |
 | LLM response latency, time to first token, first useful output, streaming, output length, LLM round trips, tool-call wait, prompt-cache latency, model routing, speculative or parallel execution, realtime continuation, priority tiers, predicted outputs, or user-perceived AI speed are created, changed, reviewed, or reported | `.mustflow/skills/llm-response-latency-review/SKILL.md` | Latency target ledger, request timeline ledger, call graph ledger, output ledger, cache ledger, routing ledger, observability ledger, changed files, and command contract entries | Streaming paths, first-useful-output contracts, request timeline metrics, call graph simplification, parallel or speculative work, model routers, fallback cascades, output caps, schema shortening, prompt-cache prefix ordering, cache keys, realtime continuation, priority-tier routing, timeout and cancellation behavior, tests, docs, route metadata, and directly synchronized templates | slow first token, useless streamed preamble, extra sequential LLM round trip, tool wait blocking first output, cache-prefix drift, unmeasured cache miss, verbose output drift, long JSON key or enum overhead, RAG chunk bloat, router escalation loop, prediction-token mismatch, priority-tier misuse, unsafe speculative work, missing cancellation, or raw prompt telemetry leak | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | LLM response-latency surface reviewed, latency unit and request timeline, round trips, parallel/tool/stream/cancel behavior, output/schema/cache/history/RAG/routing/fallback/prediction/realtime/priority choices, observability, verification, and remaining response-latency risk |
 | Autonomous or semi-autonomous LLM agents, agentic workflows, planners, executors, verifiers, tool contracts, tool-call gates, human approval or interrupt flows, durable agent state, handoffs, guardrails, loop budgets, retry policies, trace evaluation, or agent outcome metrics are created, changed, reviewed, or reported | `.mustflow/skills/agent-execution-control-review/SKILL.md` | Autonomy ledger, stage gate ledger, role separation ledger, tool contract ledger, effect ledger, state and resume ledger, memory and context ledger, handoff and guardrail ledger, loop, retry, and budget ledger, trace and eval outcome ledger, changed files, and command contract entries | Workflow-versus-agent routing, stage gates, planner/executor/verifier boundaries, tool contracts, tool argument ownership, draft/execute separation, idempotency keys, approval records, durable checkpoints, state schema versions, memory partitions, handoff filters, guardrails, loop budgets, retry classification, trace spans, eval fixtures, tests, docs, route metadata, and directly synchronized templates | unnecessary autonomous agent, one model self-certifying success, ungated bad plan, ambiguous tool contract, guessed tool argument, relative-path trap, external effect before approval, missing idempotency key, interrupt replay side effect, stale state schema, over-shared handoff, misplaced guardrail, repeated tool loop, blind retry, final-answer-only eval, or unsafe trace data | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Agent execution-control surface reviewed, workflow-versus-agent decision, autonomy envelope, stage gates, role separation, tool contracts, approval and side-effect replay safety, state/resume/memory/handoff/guardrail/loop/retry/trace/eval checks, verification, and remaining agent execution-control risk |
 | Agent evaluation loops, trace or trajectory grading, LLM judges, verifier agents, outcome scoring, tool-call prechecks or postchecks, eval datasets, golden or dirty sets, pass@k or pass^k metrics, shadow environments, production-monitoring-to-eval pipelines, or agent regression gates are created, changed, reviewed, or reported | `.mustflow/skills/agent-eval-integrity-review/SKILL.md` | Outcome ledger, trace ledger, oracle ledger, tool-boundary ledger, dataset ledger, metric ledger, environment ledger, monitoring ledger, privacy ledger, changed files, and command contract entries | Outcome oracles, trace schemas, trajectory graders, deterministic checkers, model-judge rubrics, human-review sampling, tool prechecks and postchecks, tool-result evidence packets, eval fixtures, golden and dirty sets, shadow-environment adapters, monitoring-to-eval candidate flows, tests, docs, route metadata, and directly synchronized templates | final-answer-only scoring, LLM judge as sole oracle, reasoning claim treated as evidence, self-reflection certifying success, missing final environment state, ungraded unsafe trajectory, missing tool precheck or postcheck, brittle exact tool-order assertion, uncalibrated judge drift, pass@k masking unreliable pass^k, dirty set gating flakiness, raw trace data leak, or production failure not entering evals | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Agent eval-integrity surface reviewed, outcome oracle, final-state and trajectory checks, deterministic/model/human oracle split, tool boundary evidence, golden/dirty/capability/regression metrics, shadow environment, monitoring-to-eval loop, trace privacy, verification, and remaining agent eval-integrity risk |
 | Code review or implementation needs module-boundary triage for change spread, co-change clusters, data ownership, policy ownership, failure ownership, import direction, circular dependencies, DTO leakage, shared/common/utils growth, mock-heavy tests, repeated policy conditions, enum interpretation, repository business logic, anemic domain, domain-to-I/O leakage, transaction boundary mismatch, technical event names, public module API bloat, caller sequencing, premature common helpers, bug/fix distance, config ownership, log responsibility, exception translation, cache invalidation ownership, repeated authorization checks, frontend/backend policy leakage, time policy, batch or worker bypass, or temporary-code accumulation | `.mustflow/skills/module-boundary-review/SKILL.md` | Change reason, changed-file spread, co-change evidence, module graph evidence, ownership evidence, test evidence, and configured command intents | Policy ownership, DTO boundaries, mapper boundaries, module public APIs, import direction, config injection, exception translation, cache invalidation ownership, authorization checks, event facts, worker entrypoints, focused tests, and directly synchronized docs or templates | layer-name theater, role-sliced files that always change together, lower-level modules knowing high-level policy, circular dependency, DTO infection, ownerless shared helper, broad noun module, mock-heavy rule tests, copied policy condition, enum rule scatter, repository-owned business rule, service-only domain, domain I/O coupling, cross-owner transaction, table-change event, exposed internals, caller order dependency, unsafe reuse, repeated temporary branch, bug/fix distance, config chaos, misleading logs, leaked provider or DB errors, stale cache owner drift, copied auth checks, frontend reconstructing backend policy, inconsistent time rules, worker bypass, or random capability loss when a module is removed | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Module boundary reviewed, change-spread and co-change evidence, owner and import findings, boundary fixes or recommendation, evidence level, verification, and remaining module-boundary risk |
-| Code review or implementation needs change-blast-radius triage for maintainability risk from unpredictable next-change spread, one change reason scattered across files, multiple change reasons in one file, controller workflow leakage, junk-drawer service names, boolean mode flags, trash-can option objects, scattered domain rules, scattered authorization, hidden state transitions, direct time or randomness, unclear transaction boundaries, external API and DB coupling, retry without idempotency, cache-as-truth decisions, config flag combinations, tenant or partner hardcoding, legacy branches in the core path, DTO/entity/view model mixing, ambiguous nullable values, swallowed exceptions, low-context logs, implementation-coupled tests, mock-heavy tests, decorative abstractions, premature DRY, hidden ordering dependency, invisible event contracts, migration/runtime compatibility, or hard-to-delete features | `.mustflow/skills/change-blast-radius-review/SKILL.md` | User goal, current diff or target files, change-reason ledger, blast-radius ledger, ownership ledger, deleteability ledger, test and operations evidence, and configured command intents | Policy ownership, workflow boundaries, explicit modes, option-object narrowing, authorization owner, state-transition owner, transaction and retry/idempotency boundaries, cache truth boundary, config and tenant variation isolation, legacy adapters, DTO mapping ownership, result types, traceable logs, behavior-focused tests, event contracts, migration compatibility notes, and directly synchronized docs or templates | clean-code theater, unpredictable edit spread, unrelated reasons in one file, controller as boss, junk-drawer service, boolean maze, option combinatorics, copied policy, copied auth, scattered status writes, hidden time or random, partial data after failure, duplicate side effect, stale cache authority, untested flag product, hardcoded customer branch, legacy core pollution, object-language mixing, nullable ambiguity, false success, useless logs, brittle process tests, five-mock class, decorative interface, wrong DRY, order-sensitive line shuffle, ghost event coupling, deploy gamble, or feature that cannot be deleted cleanly | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `test_audit`, `docs_validate_fast`, `test_release`, `mustflow_check` | Change blast radius reviewed, next likely change and owner, spread and deletion path, maintainability findings, fixes or recommendation, verification, and remaining change-spread risk |
+| Code review or implementation needs change-blast-radius triage for maintainability risk from unpredictable next-change spread, historical co-change spread, one change reason scattered across files, multiple change reasons in one file, controller workflow leakage, junk-drawer service names, boolean mode flags, trash-can option objects, scattered domain rules, scattered authorization, hidden state transitions, direct time or randomness, unclear transaction boundaries, external API and DB coupling, retry without idempotency, cache-as-truth decisions, config flag combinations, tenant or partner hardcoding, legacy branches in the core path, DTO/entity/view model mixing, ambiguous nullable values, swallowed exceptions, low-context logs, implementation-coupled tests, mock-heavy tests, decorative abstractions, premature DRY, hidden ordering dependency, invisible event contracts, migration/runtime compatibility, or hard-to-delete features | `.mustflow/skills/change-blast-radius-review/SKILL.md` | User goal, current diff or target files, change-reason ledger, historical co-change ledger, blast-radius ledger, ownership ledger, deleteability ledger, test and operations evidence, and configured command intents | Policy ownership, workflow boundaries, explicit modes, option-object narrowing, authorization owner, state-transition owner, transaction and retry/idempotency boundaries, cache truth boundary, config and tenant variation isolation, legacy adapters, DTO mapping ownership, result types, traceable logs, behavior-focused tests, event contracts, migration compatibility notes, and directly synchronized docs or templates | clean-code theater, unpredictable edit spread, repeated cross-repo co-change, unrelated reasons in one file, controller as boss, junk-drawer service, boolean maze, option combinatorics, copied policy, copied auth, scattered status writes, hidden time or random, partial data after failure, duplicate side effect, stale cache authority, untested flag product, hardcoded customer branch, legacy core pollution, object-language mixing, nullable ambiguity, false success, useless logs, brittle process tests, five-mock class, decorative interface, wrong DRY, order-sensitive line shuffle, ghost event coupling, deploy gamble, or feature that cannot be deleted cleanly | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `test_audit`, `docs_validate_fast`, `test_release`, `mustflow_check` | Change blast radius reviewed, next likely change and owner, historical co-change evidence, spread and deletion path, maintainability findings, fixes or recommendation, verification, and remaining change-spread risk |
 | Code review or implementation needs business-rule-leakage triage for money, permission, ownership, state, settlement, discount, coupon, refund, inventory, notification, subscription, visibility, eligibility, expiry, price, tax, fee, points, reports, tenant scope, UI-only guards, controller eligibility checks, direct status assignment, query predicates as policy, list/detail scope mismatch, admin path bypass, batch hidden policy, tests at the wrong layer, duplicated business constants, date or timezone policy drift, ambiguous `isActive` or `canUse`, authentication/authorization/eligibility mixing, ownership boundary drift, broad update DTOs, PATCH null semantics, mapper logic, default-value drift, misleading error messages, swallowed business failures, transaction/action mismatch, event timing, duplicate requests, webhook trust, out-of-order events, cache-as-rule, search index drift, report or settlement SQL, public text drift, or other bypass entrances | `.mustflow/skills/business-rule-leakage-review/SKILL.md` | User goal, current diff or target files, rule ledger, entrypoint ledger, enforcement ledger, consistency ledger, state/transaction/event/idempotency/default/error evidence, and configured command intents | Shared policy/domain/application/database rule owner, server-side enforcement, reusable query scopes, list/detail consistency, update field restrictions, PATCH intent models, mechanical mappers, default ownership, visible failures, transaction and event boundaries, idempotency, webhook verification, search/detail rechecks, report calculation owners, focused tests, and directly synchronized docs or templates | clean code with leaked money rule, client-only rule, controller judge, scattered status assignment, missing query predicate, URL detail bypass, admin bypass, batch bypass, misplaced test, magic policy number, timezone cutoff drift, vague helper meaning, auth/eligibility blur, user-vs-tenant ownership hole, mass update, PATCH null corruption, 
mapper policy, default mismatch, error-text lie, false success, half-committed business action, pre-commit event, duplicate refund or coupon use, trusted webhook, stale event transition, wrong cache viewer, dead search result, settlement query drift, support macro drift, or uninspected entrypoint | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `test_audit`, `docs_validate_fast`, `test_release`, `mustflow_check` | Business rule leakage reviewed, source of truth and entrypoints, enforcement and bypass findings, consistency notes, fixes or recommendation, verification, and remaining rule-leakage risk |
 | Payment, checkout, authorization, capture, refund, partial refund, subscription, invoice, trial, grace period, coupon, promotion, inventory reservation, fulfillment, entitlement, settlement, fee, chargeback, dispute, provider webhook, payment session, payment link, payment-provider integration, admin manual payment change, payment log, PCI-sensitive data handling, or payment-related test needs payment-integrity triage for duplicate, late, out-of-order, wrong-actor, wrong-amount, wrong-currency, timeout, retry, idempotency, ledger, reconciliation, or audit risk | `.mustflow/skills/payment-integrity-review/SKILL.md` | User goal, current diff or target files, money-event ledger, provider interaction ledger, state-transition ledger, idempotency and uniqueness ledger, amount and currency ledger, ownership ledger, fulfillment and entitlement ledger, webhook and retry ledger, audit and sensitive-data ledger, existing tests, and configured command intents | Payment state machines, server-side amount calculation, minor-unit money handling, object ownership checks, idempotency keys, provider ID uniqueness, webhook raw-body signature verification, webhook event dedupe, queue handoff, one-time fulfillment, async payment handling, authorization/capture distinctions, refund/dispute/subscription transitions, inventory and coupon reservation, timeout and retry classification, append-only ledgers, secret and payment-data redaction, admin audit trails, stale payment endpoint cleanup notes, focused nightmare-path tests, and directly synchronized docs or templates | paid-boolean shortcut, client-trusted amount, wrong-owner order/payment/refund/subscription ID, amount drift, float money math, missing idempotency, per-retry UUID idempotency, missing provider uniqueness, duplicate webhook, out-of-order webhook, JSON-parsed webhook signature breakage, disabled webhook signature, success-page-as-proof, double fulfillment, async payment premature fulfillment, 
authorized-treated-as-captured, double refund, missing dispute handling, subscription period-only active check, inventory oversell, coupon double spend or lost coupon, timeout-treated-as-failure, blind retry, mutable ledger, card data logging, test/live secret mix, unaudited admin override, stale payment API, or happy-path-only payment tests | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `test_audit`, `docs_validate_fast`, `test_release`, `mustflow_check` | Payment integrity reviewed, money-event/provider/state/idempotency/amount/ownership/fulfillment/webhook/audit map, findings, fixes or recommendation, nightmare-path evidence, verification, and remaining payment-integrity risk |
 | Credit, point, wallet balance, reward point, prepaid credit, usage credit, bonus credit, loyalty point, stored-value balance, balance deduction, accrual, refund, reversal, expiration, reservation, capture, release, admin adjustment, transfer, ledger table, balance cache, reconciliation job, settlement report, or credit-related test needs credit-ledger-integrity triage for ledger identity, idempotency, atomic balance changes, concurrency, ordering, ownership, amount precision, rounding, policy snapshots, expiry lots, reservation state, failure recovery, audit evidence, or reconciliation risk | `.mustflow/skills/credit-ledger-integrity-review/SKILL.md` | User goal, current diff or target files, balance surface ledger, ledger-entry ledger, source identity ledger, atomicity ledger, amount and unit ledger, ownership ledger, expiry and lot ledger, reservation ledger, queue and cache ledger, audit and reconciliation ledger, existing tests, and configured command intents | Ledger-entry models, source identifiers, idempotency comparison, conditional balance updates, database constraints, transaction boundaries, row-lock targets, optimistic-lock retry classification, amount validation, rounding policy, non-negative invariants, refund and reversal modeling, partial-use handling, expiry lot allocation, reservation/capture/release transitions, queue idempotency and ordering, cache invalidation, replica-read routing, admin adjustment audit trails, reconciliation checks, evidence logs, focused concurrency and failure tests, and directly synchronized docs or templates | mutable-balance-only path, missing source key, weak idempotency comparison, read-then-update subtraction, unchecked affected rows, split transaction, wrong lock target, optimistic retry double-spend, float credit math, hidden rounding rule, negative or zero amount abuse, app-only non-negative promise, missing unique ledger key, generic balance increase refund, partial refund blind spot, expiry lot loss, expiry 
batch race, missing reservation state, arbitrary status update, queue reorder damage, duplicate consumer delivery, cache-trusted deduction, replica stale balance, unaudited admin adjustment, request-body wallet ownership, current-price recalculation, missing policy snapshot, no failure injection, no balance-vs-ledger reconciliation, vague deduction log, or happy-path-only credit tests | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `test_audit`, `docs_validate_fast`, `test_release`, `mustflow_check` | Credit ledger integrity reviewed, balance/ledger/source/atomicity/amount/ownership/expiry/reservation/queue/cache/audit/reconciliation map, findings, fixes or recommendation, nightmare-path evidence, verification, and remaining credit-ledger integrity risk |
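
Reviewer note on the credit-ledger row above: its "conditional balance updates", "unchecked affected rows", and "missing unique ledger key" items can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not part of mustflow: the `wallets` and `ledger` tables, the `deduct` helper, and the choice of SQLite are all hypothetical.

```python
import sqlite3


def open_db():
    """Create a throwaway in-memory schema (hypothetical tables, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE wallets (
            id INTEGER PRIMARY KEY,
            balance INTEGER NOT NULL CHECK (balance >= 0)  -- DB-level non-negative invariant
        );
        CREATE TABLE ledger (
            source_key TEXT PRIMARY KEY,  -- unique key makes a replayed request detectable
            wallet_id INTEGER NOT NULL,
            amount INTEGER NOT NULL       -- integer minor units, never floats
        );
        """
    )
    return conn


def deduct(conn, wallet_id, amount, source_key):
    """Deduct credits once per source_key; returns 'applied', 'duplicate', or 'rejected'."""
    if amount <= 0:
        raise ValueError("amount must be a positive integer")
    try:
        with conn:  # single transaction: commits on success, rolls back on any exception
            conn.execute(
                "INSERT INTO ledger (source_key, wallet_id, amount) VALUES (?, ?, ?)",
                (source_key, wallet_id, -amount),
            )
            # Conditional update instead of read-then-update subtraction.
            cur = conn.execute(
                "UPDATE wallets SET balance = balance - ? WHERE id = ? AND balance >= ?",
                (amount, wallet_id, amount),
            )
            if cur.rowcount != 1:  # always check affected rows
                raise LookupError("insufficient balance or unknown wallet")
    except sqlite3.IntegrityError:
        return "duplicate"  # same source_key already recorded
    except LookupError:
        return "rejected"  # ledger insert was rolled back with the transaction
    return "applied"
```

A fuller version would also compare the stored amount and wallet against the replayed request before treating it as a duplicate, which is the "weak idempotency comparison" item in the same row.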
@@ -403,8 +436,8 @@ routes. Event routes stay inactive until their event occurs.
 | Code review or implementation needs race-condition triage for shared state observed across interleaving execution flows, including check-then-act, read-modify-write, stale reads after `await`, I/O, callbacks or events, lock scope or global lock order, `tryLock`, timeout, retry, cache miss fill, lazy initialization, double-checked locking, atomics, memory ordering, DB transaction isolation, conditional updates, unique constraints, distributed locks, idempotency, filesystem exists/open, atomic create or rename, outbox ordering, queue duplicates or reordering, concurrent same-user work, shutdown, cancellation, timers, close/send races, shared collections, iterator snapshots, object reuse, fake immutability, sleep-based race tests, log ordering, or status values without transitions | `.mustflow/skills/race-condition-review/SKILL.md` | Shared state surface, invariant, interleaving points, synchronization or transaction boundary, retry and idempotency policy, event, queue, timer, cancellation and shutdown paths, collection or object ownership, evidence level, and configured command intents | Atomic conditional update, atomic create, compare-and-swap, lock scope or lock-order fix, unique constraint, row lock, idempotency guard, singleflight, outbox or inbox guard, state transition guard, snapshot iteration, ownership split, focused concurrency tests, and directly synchronized docs or templates | check-then-act, lost update, stale read after await, torn invariant, callback under lock, deadlock, retry duplication, cache stampede, double init, unsafe atomic assumption, isolation mismatch, app-only uniqueness, broken distributed lock, duplicate side effect, event/state split brain, queue duplicate or out-of-order damage, shutdown drop, cancellation completion race, old timer update, double close, send after close, shared collection mutation, pooled object corruption, fake immutable mutation, sleep-test false confidence, log-order lie, or state value race | 
`changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Shared state and invariant reviewed, interleaving ledger, atomicity and synchronization findings, stale-read and ordering checks, tests or evidence level, verification, and remaining race-condition risk |
 | Code review or implementation needs concurrency-invariant triage for shared ownership and primitive discipline when correctness depends on time-order changes, including hidden writes in getters, lazy initialization, check-then-act, read-modify-write, exact lock identity, lock scope, lock order, callbacks under lock, condition-variable `while` predicates, lost notifications, atomics mixed with ordinary state, CAS ABA, double-checked locking, object publication before construction completes, fake immutability, concurrent collection iteration, cache stampede, application-only uniqueness, transaction isolation, distributed lock lease expiry, fencing tokens, idempotency keys, duplicate queue delivery, explicit state-machine transitions, scheduler overlap, shutdown drain, resource release after acquire, thread-local leakage, async `await` interleavings, or sleep-based concurrency tests | `.mustflow/skills/concurrency-invariant-review/SKILL.md` | Shared state inventory, owner decision, invariant, time-order table, lock identity and order, condition predicate, atomic and memory-visibility story, transaction and distributed lease boundary, duplicate execution rule, shutdown and thread-local context, test evidence, and configured command intents | Single-writer ownership, immutable snapshot, scoped lock, global lock order, condition predicate loop, atomic conditional write, transaction or row lock, unique constraint, idempotency record, fencing token, queue dedupe, state transition guard, scheduler ownership, shutdown drain, context cleanup, deterministic interleaving test, and directly synchronized docs or templates | ownerless shared state, read-only helper hidden write, torn invariant, different locks guarding one fact, too-narrow lock, too-wide lock, deadlock order, callback under lock, lost notification, spurious wakeup, atomic-only cover-up, ABA, unsafe publication, half-constructed object, fake immutable mutation, collection mutation during iteration, cache 
stampede, duplicate insert across service instances, isolation mismatch, expired distributed lock owner, duplicate side effect, queue redelivery damage, status backtracking, scheduler double-run, dropped in-flight work, leaked permit, thread-local tenant leak, stale value after await, or sleep-test false confidence | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Shared inventory and owners reviewed, invariant and time-order table, primitive-discipline findings, fixes or recommendations, deterministic evidence level, verification, and remaining concurrency-invariant risk |
 | Code review or implementation needs failure-integrity triage for exception or failure handling that can produce false success, swallowed exceptions, log-and-continue paths, ambiguous `null`, `false`, or empty defaults, `finally` masking, transaction commit after caught failure, external side-effect ordering bugs, retry without idempotency, missing timeouts, cancellation swallowing, unobserved async failures, queue ack/nack mistakes, lost causes, leaked internal errors, mixed business and system failures, partial state, lock or resource leaks on failure paths, unsafe parsing defaults, fail-open authorization, unsafe cache or fallback defaults, unstable public error messages, or missing failure-path observability | `.mustflow/skills/failure-integrity-review/SKILL.md` | Failure surface, truth surface, state-change ledger, error classification, transaction and side-effect boundary, retry, timeout, cancellation, queue, cleanup, public error, redaction, observability, and configured command intents | Failure propagation, typed error value, rollback or compensation, idempotency guard, timeout and retry budget, cancellation propagation, ack/nack and dead-letter policy, cause preservation, stable public error code, safe logging or metrics, fail-closed behavior, resource cleanup, focused failure-path tests, and directly synchronized docs or templates | broad catch, swallowed exception, false success, false empty data, cleanup masking original error, partial commit, unknown provider outcome, duplicate side effect, retry storm, hung dependency, ignored cancellation, unobserved background failure, dropped queue message, poison message loop, lost stack cause, internal error leak, client string-branching, business/system failure confusion, stuck processing state, unreleased lock, unclosed handle, dangerous default value, fail-open permission, unsafe fallback, invisible compensation failure, or no operator signal | `changes_status`, `changes_diff_summary`, `lint`, `build`, 
`test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Failure surface and lie reviewed, state-change and truth ledger, swallowed or false-success findings, rollback/retry/fallback/observability decisions, tests or evidence level, verification, and remaining failure-integrity risk |
-| Code review or implementation needs backend-log-evidence triage for backend request, worker, scheduler, webhook, migration, script, service, repository, or external adapter logs that must explain why a request, job, or data change reached its final state, including event names, schema versions, start and finish logs, trace/span/request IDs, correlation and causation IDs, outcome and reason fields, error causes and stacks, external API before and after logs, DB affected rows, transaction begin/commit/rollback, state transitions, silent early returns, retries, timeouts, queue enqueue and consume, async context propagation, batch summaries, audit events, auth or validation failures, cache hits or misses, lock acquisition, idempotency outcomes, feature flags, release/config startup summaries, migration dry-run and apply logs, severity levels, duplicate logs, structured fields, safe identifiers, sampling, cardinality, log-injection safety, or redaction | `.mustflow/skills/backend-log-evidence-review/SKILL.md` | Backend path, event contract, correlation and causation model, request lifecycle evidence, error evidence, decision evidence, side-effect evidence, sampling and safety constraints, local logger conventions, tests or fixtures, and configured command intents | Structured log events, stable event names, schema versions, safe identifiers, trace/span/request/correlation/causation IDs, request start and finish summaries, result type, outcome, reason code, duration, deployment/resource attributes, error object logging, cause preservation, dependency request IDs, affected-row counts, transition fields, retry attempts, timeout classes, queue and batch IDs, audit fields, auth and validation reason codes, cache and lock result fields, idempotency classifications, feature flag variants, release and config summaries, cardinality controls, redaction guards, focused tests, and directly synchronized docs or templates | route-only start log, no finish log, missing duration, 
message-based dashboard, missing schema version, missing trace or span id, missing causation id, string-only error, lost cause or stack, external API logged only after failure, raw provider body log, missing affected-row count, invisible transaction boundary, status assignment without from/to state, silent guard return, attempt-free retry log, timeout without actual duration, enqueue without consume evidence, broken async request id, batch started/finished only, audit event mixed with debug log, missing auth reason, validation 400 with no safe field summary, cache blind spot, lock wait hidden, idempotency ambiguity, feature flag opacity, release or config opacity, secret-bearing config log, migration `done`, swallowed background error, all-info severity, duplicate error spam, prose-only log, high-cardinality indexed field, log injection exposure, unsafe sampling, missing domain identifier, or sink-side-only masking | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Backend log boundary reviewed, reconstruction question, event/lifecycle/correlation/causation/error/side-effect/decision/cardinality/sampling/redaction findings, evidence level, verification, and remaining backend-log-evidence risk |
-| Code review or implementation needs observability-debuggability triage for logs, metrics, traces, spans, structured events, telemetry context, dashboards, alerts, runbooks, sampling, redaction, dependency calls, queues, batch jobs, caches, pools, rate limits, feature flags, releases, migrations, or partial-success paths where operators need to narrow incidents quickly without high-cardinality metric explosions, missing denominator counters, lost trace context, or sensitive telemetry leakage | `.mustflow/skills/observability-debuggability-review/SKILL.md` | Incident question, signal inventory, request or job identity, metric model, trace and event model, log model, operational domain, privacy and retention constraints, verification evidence, and configured command intents | Structured event names, safe reason codes, total and failure counters, latency distributions, low-cardinality labels, trace and span context, dependency and operation names, async propagation, per-attempt telemetry, queue or batch lag signals, pool saturation metrics, release and feature attribution, telemetry self-metrics, redaction, focused tests, and directly synchronized docs or templates | success-only log, no denominator, average-only latency, mixed success and error latency, raw URL label, raw user label, raw SQL telemetry, high-cardinality metric label, missing trace or span id, broken async trace propagation, attempt and operation collapse, generic timeout bucket, missing dependency name, missing idempotency or message evidence, missing queue age, missing batch last-success timestamp, missing pool saturation, missing release attribution, decorative metric, alert without action, dropped telemetry invisible, sampling drops errors, unsafe baggage, or sink-side-only masking | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Observability boundary reviewed, incident question and signal ledger, 
metric/trace/log/cardinality/privacy findings, evidence level, verification, and remaining observability-debuggability risk |
+| Code review or implementation needs backend-log-evidence triage for backend request, worker, scheduler, webhook, migration, script, service, repository, or external adapter logs that must explain why a request, job, or data change reached its final state, including event names, schema versions, start and finish logs, trace/span/request IDs, correlation and causation IDs, outcome and reason fields, error causes and stacks, external API before and after logs, DB affected rows, transaction begin/commit/rollback, state transitions, silent early returns, retries, timeouts, queue enqueue and consume, async context propagation, batch summaries, audit events, auth or validation failures, cache hits or misses, lock acquisition, idempotency outcomes, feature flags, release/config startup summaries, migration dry-run and apply logs, log pipeline canaries, generated/accepted/sent/stored/searchable counts, timestamp versus observed timestamp, parser or mapping failures, severity levels, duplicate logs, structured fields, safe identifiers, sampling, cardinality, log-injection safety, or redaction | `.mustflow/skills/backend-log-evidence-review/SKILL.md` | Backend path, event contract, correlation and causation model, request lifecycle evidence, error evidence, decision evidence, side-effect evidence, pipeline integrity evidence, sampling and safety constraints, local logger conventions, tests or fixtures, and configured command intents | Structured log events, stable event names, schema versions, safe identifiers, trace/span/request/correlation/causation IDs, request start and finish summaries, result type, outcome, reason code, duration, deployment/resource attributes, error object logging, cause preservation, dependency request IDs, affected-row counts, transition fields, retry attempts, timeout classes, queue and batch IDs, audit fields, auth and validation reason codes, cache and lock result fields, idempotency classifications, feature flag variants, release and config 
summaries, log canary and pipeline survival checks, cardinality controls, redaction guards, focused tests, and directly synchronized docs or templates | route-only start log, no finish log, missing duration, message-based dashboard, missing schema version, missing trace or span id, missing causation id, string-only error, lost cause or stack, external API logged only after failure, raw provider body log, missing affected-row count, invisible transaction boundary, status assignment without from/to state, silent guard return, attempt-free retry log, timeout without actual duration, enqueue without consume evidence, broken async request id, batch started/finished only, audit event mixed with debug log, missing auth reason, validation 400 with no safe field summary, cache blind spot, lock wait hidden, idempotency ambiguity, feature flag opacity, release or config opacity, secret-bearing config log, migration `done`, swallowed background error, all-info severity, duplicate error spam, prose-only log, high-cardinality indexed field, log pipeline silently dropping evidence, log injection exposure, unsafe sampling, missing domain identifier, or sink-side-only masking | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Backend log boundary reviewed, reconstruction question, event/lifecycle/correlation/causation/error/side-effect/decision/pipeline/cardinality/sampling/redaction findings, evidence level, verification, and remaining backend-log-evidence risk |
+| Code review or implementation needs observability-debuggability triage for logs, metrics, traces, spans, structured events, telemetry context, collectors, exporters, telemetry queues, dashboards, alerts, runbooks, sampling, redaction, dependency calls, queues, batch jobs, caches, pools, rate limits, feature flags, releases, migrations, or partial-success paths where operators need to narrow incidents quickly without high-cardinality metric explosions, missing denominator counters, lost trace context, silent telemetry loss, or sensitive telemetry leakage | `.mustflow/skills/observability-debuggability-review/SKILL.md` | Incident question, signal inventory, request or job identity, metric model, trace and event model, log model, operational domain, telemetry pipeline evidence, privacy and retention constraints, verification evidence, and configured command intents | Structured event names, safe reason codes, total and failure counters, latency distributions, low-cardinality labels, trace and span context, dependency and operation names, async propagation, per-attempt telemetry, queue or batch lag signals, pool saturation metrics, release and feature attribution, telemetry self-metrics, signal pipeline survival checks, redaction, focused tests, and directly synchronized docs or templates | success-only log, no denominator, average-only latency, mixed success and error latency, raw URL label, raw user label, raw SQL telemetry, high-cardinality metric label, missing trace or span id, broken async trace propagation, attempt and operation collapse, generic timeout bucket, missing dependency name, missing idempotency or message evidence, missing queue age, missing batch last-success timestamp, missing pool saturation, missing release attribution, decorative metric, alert without action, dropped telemetry invisible, read-path visibility blind spot, sampling drops errors, unsafe baggage, or sink-side-only masking | `changes_status`, `changes_diff_summary`, `lint`, 
`build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Observability boundary reviewed, incident question and signal ledger, metric/trace/log/pipeline/cardinality/privacy findings, evidence level, verification, and remaining observability-debuggability risk |
 | Code review, runbook work, or incident report needs incident-triage review for outages, degradations, timeout spikes, p95 or p99 latency spikes, queue backlog, pool saturation, CPU-idle slowness, memory pressure, OOM, disk or inode pressure, DNS or network failure, load balancer 5xx, Kubernetes node or pod issues, deployment regression, cache stampede, cron or batch spikes, Redis slowdown, DB lock waits, connection leaks, ephemeral-port exhaustion, conntrack saturation, or log floods where operators need to narrow the first bad time, affected slice, recent change, wait class, dependency, and manual-only diagnostics before reading every log | `.mustflow/skills/incident-triage-review/SKILL.md` | Incident frame, time evidence, scope axes, saturation and wait evidence, dependency evidence, change evidence, safety constraints, repository runbook or telemetry evidence, and configured command intents | Runbook steps, alert metadata, incident evidence checklists, telemetry contract notes, dashboard descriptions, test fixtures, docs, and directly synchronized templates that preserve first-bad-time, scope split, change ledger, wait classification, dependency split, success-versus-failure comparison, and manual-only diagnostic boundaries | average-only latency, all-logs-first triage, deployment dismissal, success-only comparison, proxy/app 5xx mixing, app-log-only OOM review, CPU-idle slowness ambiguity, DB-index reflex, pool-wait blindness, queue-lag understatement, cache-hit-rate overtrust, ping-only network checks, pod-only Kubernetes review, disk-capacity-only checks, log-volume blind spots, private incident-log capture, or raw live diagnostic commands treated as agent-authorized | `changes_status`, `changes_diff_summary`, `docs_validate_fast`, `test_release`, `mustflow_check` | Incident boundary reviewed, first bad time and affected scope, change/success-failure/latency/resource/wait/dependency evidence, elimination ledger, manual-only diagnostics, verification, and 
remaining incident-triage risk |
 | Code review, implementation, runbook work, or release preparation needs deployment-rollout safety review for server, backend, worker, scheduler, queue, cron, container, VM, serverless, DB migration, config, feature flag, cache, deployment pipeline, release envelope, image digest, deployment history, traffic rollback, canary, rollback, health check, readiness/liveness/startup probe, graceful shutdown, artifact promotion, release observability, or post-deploy smoke behavior where the deploy must be rolled out, stopped, observed, and rolled back safely | `.mustflow/skills/deployment-rollout-safety-review/SKILL.md` | Deployment resource ledger, release envelope, artifact identity, environment promotion path, deployment model, compatibility matrix, config diff, migration order, rollback history, traffic rollback path, cache and message compatibility, probe model, shutdown and drain behavior, canary cohort, version-split telemetry, stop conditions, rollback limits, synthetic transactions, post-deploy metrics, and configured command intents | Runbooks, release checklists, pipeline metadata, smoke tests, probe tests, config validation, feature-flag defaults, cache-key versions, worker-drain handling, deployment attribution, rollback compatibility notes, focused tests, and directly synchronized templates | unknown blast radius, missing release id, mutable latest tag, tag without digest, per-environment rebuild drift, deleted rollout history, cold old version, traffic rollback tied to rebuild, code and migration lockstep, destructive rollback SQL overclaim, missing PITR practice, config in-place mutation, missing startup config validation, process-only health check, readiness/liveness/startup probe collapse, liveness restart loop, ungraceful shutdown, load balancer drain shorter than app shutdown, worker work loss, non-idempotent queue retry, N-1 message incompatibility, unknown event poison message, missing external compensation, API N-1 or N+1 break, missing kill 
switch, unsafe flag fallback, vague canary cohort, global-average canary metrics, no automatic stop condition, read-only smoke, log format alert breakage, blanket cache flush, scheduler duplicate execution, CRD or operator downgrade break, missing deployment lock, production command without dry-run, or code-only rollback overclaim | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Deployment rollout boundary reviewed, resource ledger and release envelope, artifact identity, config/migration/cache/queue/API/probe/shutdown/canary/rollback/observation findings, verification, and remaining deployment-rollout risk |
 | Code review, implementation, runbook work, or infrastructure review needs cloud-cost-guardrail review for cloud accounts, projects, subscriptions, environments, Kubernetes namespaces, serverless, databases, object storage, block storage, snapshots, NAT, private endpoints, public IPs, egress, CDN, logs, metrics, traces, autoscaling, quotas, budgets, tags, temporary resources, container registries, Marketplace, LLM APIs, external APIs, or third-party SaaS where spend must be attributed, capped, lifecycle-managed, alerted, and safely stoppable before a silent bill explosion | `.mustflow/skills/cloud-cost-guardrail-review/SKILL.md` | Cost surface ledger, budget actual and forecast thresholds, automated non-production action path, account or project isolation, quota and cap model, tag taxonomy, temporary resource expiration, network cost model, telemetry cost model, storage lifecycle model, commitment baseline, Marketplace or LLM usage limits, and configured command intents | Cost guardrail docs, infrastructure policy files, review checklists, tag schemas, quota notes, budget-action runbooks, cleanup rules, retention defaults, autoscale caps, Kubernetes ResourceQuota and LimitRange notes, registry lifecycle policies, provider usage caps, focused tests, and directly synchronized templates | notification-only budget, imagined hard spending limit, mixed prod and dev account, over-wide service quota, missing owner tag, tag-key chaos, no expires_at, stopped VM with NAT or DB still running, unbounded autoscale, missing Kubernetes ResourceQuota, inflated requests growing nodes, cloud-native service through NAT, untracked egress, cross-AZ surprise, idle public IPv4, no CDN cache cost control, log ingest flood, infinite retention, high-cardinality metric label, unbounded flow or audit logs, object lifecycle missing, cold-storage minimum-duration trap, stale block volume type, snapshot landfill, sticky DB storage growth, unbounded registry images, premature commitment, stateful spot misuse, unmonitored Marketplace or LLM spend, or no safe cost stop runbook | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Cloud cost boundary reviewed, cost surface ledger, budget and action model, isolation/quota/tag/autoscale/Kubernetes/network/telemetry/storage/registry/commitment/spot/Marketplace/LLM/SaaS guardrail findings, manual-only provider checks, verification, and remaining cloud-cost risk |
@@ -414,7 +447,7 @@ routes. Event routes stay inactive until their event occurs.
 | Code review or implementation needs queue-processing integrity triage for queues, streams, pub/sub handlers, workers, task runners, consumers, producers, webhook handoffs, DLQ replayers, retry workers, ack, nack, reject, delete, visibility timeout, offset commit, publisher confirm, prefetch, batch commit, rebalance, FIFO message group, deduplication, or worker-loss behavior that can lose messages, duplicate side effects, hide poison messages, reorder state, exhaust consumers, or falsely claim processing success | `.mustflow/skills/queue-processing-integrity-review/SKILL.md` | Broker and delivery model, success boundary, producer boundary, consumer state ledger, failure and retry policy, concurrency and ordering evidence, observability evidence, test evidence, and configured command intents | Settlement timing, publisher confirmation, outbox or inbox record, durable message key, conditional state transition, bounded retry and DLQ policy, visibility extension, prefetch and concurrency bounds, rebalance handling, shutdown drain, focused queue replay tests, and directly synchronized docs or templates | ack-before-work, catch-and-ack, finally-ack, auto-ack, stale receipt handle, premature offset commit, batch partial-failure skip, unbounded requeue, poison-message loop, unsafe visibility timeout, in-flight saturation, FIFO group misuse, worker-loss false success, producer publish split, side-effect duplication, missing rebalance fence, unlimited parallelism, DLQ bucket without replay policy, or missing decision-point observability | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Queue processing boundary reviewed, broker model and settlement evidence, producer/consumer/retry/DLQ/order/concurrency findings, evidence level, verification, and remaining queue-processing-integrity risk |
 | Code review or implementation needs transaction-boundary integrity triage for transactions, ORM atomic blocks, unit-of-work code, database write paths, service workflows, command handlers, webhook processors, queue consumers, framework transaction annotations, isolation levels, lock usage, rollback behavior, after-commit side effects, outbox patterns, retry handling, or transactional tests that can break a business invariant even when a transaction exists | `.mustflow/skills/transaction-boundary-integrity-review/SKILL.md` | Business invariant, transaction boundary, decision ledger, durable guard evidence, framework behavior, side-effect ledger, failure and retry evidence, test evidence, and configured command intents | Whole read-decision-write transaction boundaries, durable constraints, atomic upsert or conditional update, affected-row checks, version checks, correct lock target, transaction narrowing, after-commit or outbox side effects, idempotent retry classification, focused tests, and directly synchronized docs or templates | app-only `exists()` or `count` guard, stale read-decision-write, absent-row `FOR UPDATE` gap, `SKIP LOCKED` consistency misuse, READ COMMITTED snapshot myth, REPEATABLE READ engine mismatch, SERIALIZABLE without full retry, deadlock or serialization failure treated as plain 500, swallowed rollback trigger, Spring `rollbackFor` miss, self-invocation bypass, `readOnly` write assumption, inner rollback-only surprise, `REQUIRES_NEW` pool pressure, `NESTED` savepoint confusion, Django nested `atomic()` durability myth, pre-commit email/cache/queue side effect, HTTP API inside transaction, Hibernate flush-as-commit confusion, SQLAlchemy implicit transaction surprise, missing optimistic lock, advisory lock scope leak, wrong transaction manager, or transactional test hiding commit behavior | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Transaction boundary reviewed, invariant and decision ledger, durable-guard/lock/isolation/retry/rollback/side-effect findings, evidence level, verification, and remaining transaction-boundary integrity risk |
 | Code review or implementation needs testability-boundary triage for hidden decision inputs, direct time or randomness, direct I/O, constructor side effects, static or singleton state, oversized private logic, branch policy sprawl, boolean mode flags, broad option objects, implicit environment or request context, void side effects, swallowed errors, log-only outcomes, cache order dependence, ORM lazy loading, transaction and external-call coupling, hidden event publication, fire-and-forget async work, real-time retry waits, nondeterministic collection order, hidden defaults, mixed validation and policy, scattered authorization, framework magic, smart controllers, conditional DTO mapping, mock-heavy classes, call-order assertions, inheritance coupling, or reflection-only tests | `.mustflow/skills/testability-boundary-review/SKILL.md` | User goal, current diff or target files, behavior or decision under test, decision-input ledger, side-effect ledger, observability ledger, test friction evidence, and configured command intents | Explicit inputs, fixed clocks or generators, dependency seams, pure decision cores, shell and adapter boundaries, policy or strategy tables, observable results, deterministic ordering, focused tests, and directly synchronized docs or templates | hidden time/random, constructor work, global state, fat private method, boolean mode flags, broad option object, void side effect, swallowed or log-only failure, cache order dependence, ORM lazy loading, transaction/external call coupling, hidden event publication, fire-and-forget sleeps, real-time retry wait, nondeterministic order, hidden defaults, mixed validation/policy, scattered auth, framework magic, smart controller, conditional DTO policy, five or more mocks, fragile call-order tests, deep inheritance, or reflection-only tests | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `test_audit`, `docs_validate_fast`, `test_release`, `mustflow_check` | Testability boundary reviewed, decision inputs and side effects exposed, observability fixed or recommended, test friction evidence, verification, and remaining testability risk |
-| A coding task has missing intent, scope, domain, data, security, UX, dependency, architecture, or verification decisions that cannot be safely inferred from repository evidence | `.mustflow/skills/clarifying-question-gate/SKILL.md` | User request, inspected repository evidence, unresolved decisions, reversibility classification, recommended option, and tradeoffs | Blocking questions, safe assumptions, and the smallest safe implementation boundary | over-questioning, lazy questions, expensive wrong assumptions, user-owned decision drift, data loss, auth bypass, public-contract drift, dependency bloat, or unverifiable completion | `changes_status`, `changes_diff_summary`, `mustflow_check` | Repository evidence inspected, blocking questions with recommendations, safe assumptions, selected scope, verification, and remaining ambiguity |
+| A coding task needs request-contract repair because missing intent, scope, completion evidence, domain, data, security, UX, dependency, architecture, or verification decisions cannot be safely inferred from repository evidence | `.mustflow/skills/clarifying-question-gate/SKILL.md` | User request, inspected repository evidence, unresolved decisions, reversibility classification, request state, normalized task contract source tags, recommended option, and tradeoffs | Blocking questions, safe assumptions, normalized request contract when needed, and the smallest safe implementation boundary | over-questioning, lazy questions, expensive wrong assumptions, user-owned decision drift, prompt-rewrite churn, source-tag laundering, data loss, auth bypass, public-contract drift, dependency bloat, or unverifiable completion | `changes_status`, `changes_diff_summary`, `mustflow_check` | Request state, repository evidence inspected, normalized contract when needed, blocking questions with recommendations, safe assumptions, selected scope, verification, and remaining ambiguity |
 | Product, app, service, CLI, API, SDK, library, desktop app, automation tool, or developer tool work needs a decision about which user, developer, operator, automation, integration, recovery, upgrade, documentation, or observability surfaces are supported now, deferred, explicitly unsupported, or internal-only | `.mustflow/skills/support-surface-advisor/SKILL.md` | Product stage, primary actors, main usage path, integration need, maintenance capacity, public-contract willingness, explicit non-goals, recovery and observability expectations, and current repository evidence | Support-surface plan, selected implementation boundaries, docs, tests, route metadata, core-engine boundary, and directly synchronized templates when installed | support-contract bloat, accidental public API, UI/CLI/API duplicate core logic, hidden integration promise, unsupported automation route, unowned recovery path, stale compatibility promise, or implementation explanation leaking into user-facing UI copy | `changes_status`, `changes_diff_summary`, `docs_validate_fast`, `test_release`, `mustflow_check` | Product stage, actors, recommended surfaces, deferred and unsupported surfaces, blocking questions, maintenance and compatibility risks, core engine versus shell boundary, staged plan, verification, and remaining support-surface risk |
 | A task chooses, migrates, rewrites, or justifies a primary language, runtime, framework, compile target, or execution environment | `.mustflow/skills/runtime-target-selection/SKILL.md` | Current runtime surfaces, target options, product or system need, environment constraints, migration boundary, smoke targets, and performance or reliability claims | Decision records, skill procedures, route metadata, migration plans, command-contract proposals, tests, fixtures, docs, and smallest selected migration scaffold | language-preference rewrite, unsupported runtime target, unusable build loop, cache or artifact blowup, missing smoke target, deployment drift, or false performance claim | `changes_status`, `changes_diff_summary`, `docs_validate_fast`, `test_related`, `test_release`, `mustflow_check` | Decision boundary, candidate targets, environment and build-loop evidence, smoke targets, migration boundary, calibrated claims, verification, and remaining runtime-target risk |
 | Non-trivial code work needs early structure decisions around domain rules, public contracts, external I/O, operational safety, failure handling, concurrency, data flow, or future change cost | `.mustflow/skills/structure-first-engineering/SKILL.md` | User request, target files, project context, core boundary, data flow, expected failures, public contracts, I/O surfaces, and verification contract | Risk block, focused boundaries, DTOs, adapters, pure functions, error models, tests, and directly synchronized docs or contracts | under-designed hard boundary, speculative abstraction, vague service layer, mixed I/O and domain rules, hidden partial failure, or untestable behavior | `changes_status`, `changes_diff_summary`, `test_related`, `test`, `lint`, `build`, `docs_validate_fast`, `test_release`, `mustflow_check` | Work risk, structure decision, data flow, failure model, I/O and concurrency boundaries, tests, verification, and remaining structure risk |
@@ -471,8 +504,8 @@ routes. Event routes stay inactive until their event occurs.
 | Trigger | Skill Document | Required Input | Edit Scope | Risk | Verification Intents | Expected Output |
 | --- | --- | --- | --- | --- | --- | --- |
 | Environment variables, config keys, secrets, public env prefixes, build-time or runtime config, config schemas or parsers, feature flags, deployment variables, CI secrets, Docker or Compose env, Kubernetes ConfigMaps or Secrets, Cloudflare bindings, Vite, Next.js, Astro, SvelteKit, Tauri, Node, Bun, generated env types, `.env` examples, config docs, or config validation behavior are created, changed, reviewed, or reported | `.mustflow/skills/config-env-change/SKILL.md` | Key name, value meaning, sensitivity, visibility, timing, required environments, owner, default, validation shape, config source of truth, read-first surfaces, platform timing, deployment surfaces, generated types, docs, tests, and command contract entries | Config schemas, parser code, runtime loader wiring, generated type expectations, fake-value env examples, deployment docs, tests, CI or deployment variable names, feature flag defaults, redacted validation errors, and deprecation notes | secret leak, public-prefix misuse, build-time/runtime confusion, stale deploy config, missing `.env.example`, unchecked raw env read, boolean truthiness bug, unredacted error, stale feature flag, production fallback from local/test, or missing restart/rebuild/rollout note | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Keys or flags changed, sensitivity, visibility, timing, required action after value change, source of truth, synchronized surfaces, public/private boundary, redaction notes, build/runtime classification, feature flag behavior, verification, and remaining config/env risk |
-| Authentication, authorization, permission, role, tenant, session, JWT, OAuth/OIDC, API key, route guard, admin, impersonation, database policy, object-level access control, or permission cache behavior is created or changed | `.mustflow/skills/auth-permission-change/SKILL.md` | Actors, principals, tenants, resources, actions, context, auth middleware, sessions, tokens, API keys, route guards, server policy, DB policy, role matrix, audit, and tests | Auth middleware, policy functions, controllers, services, jobs, webhooks, database queries, RLS, UI guards, audit logs, docs, migrations, and tests | authentication treated as authorization, client guard trusted as security, object-level authorization bypass, cross-tenant leak, stale token or cache permission, over-broad admin/API-key scope, or missing denial tests | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Auth/authz boundary, principal/tenant/resource/action/context, policy source of truth, server/database enforcement, client UX-only guards, denial coverage, verification, and remaining permission risk |
-| API security review needs api-access-control triage for BOLA or IDOR, broken authentication, object-level authorization, object-property authorization, function-level authorization, request-supplied user, tenant, role, or owner identifiers, tenant isolation, scoped admin, list/detail mismatch, write permissions, mass assignment, DTO exposure, client-only admin, temporary public holes, router order, GraphQL resolvers, batch APIs, exports, downloads, previews, signed storage URLs, cache keys, async job revalidation, webhook ownership mapping, OAuth or OIDC confusion, JWT verification, stale token claims, session rotation, cookie flags, reauthentication, reset tokens, account enumeration, automation defense, internal identity planes, or denial-case matrices | `.mustflow/skills/api-access-control-review/SKILL.md` | User goal, current diff or target files, subject-object-action-context ledger, object authorization ledger, property authorization ledger, function authorization ledger, authentication proof ledger, denial evidence, and configured command intents | Server-side object checks, tenant-scoped lookups, relationship checks, function-level checks, property allowlists, DTO mappers, signed URL scoping, cache-key dimensions, worker revalidation, webhook ownership mapping, token/session/cookie hardening, reauthentication gates, enumeration-safe responses, rate limits, audit logs, focused denial tests, and directly synchronized docs or templates | login-as-authorization, request-owned identity trust, findById before owner check, role-only admin, list/detail authorization drift, read/write permission confusion, mass assignment, entity DTO leak, client-only admin, permitAll leak, route shadowing, resolver bypass, batch item bypass, export/download bypass, signed URL bypass, tenantless query or cache, stale queue permission, webhook ownership confusion, OAuth/OIDC token confusion, decoded-but-unverified JWT, stale token claim, session fixation, weak cookie, missing reauth, reusable reset token, account enumeration, automation abuse, internal account exposure, or happy-path-only auth tests | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `test_audit`, `docs_validate_fast`, `test_release`, `mustflow_check` | API access control reviewed, subject/object/action/field/tenant map, object/property/function/auth findings, fixes or recommendation, denial evidence, verification, and remaining API access-control risk |
+| Authentication, authorization, permission, effective-permission decision, role, tenant, session, JWT, OAuth/OIDC, API key, route guard, admin, impersonation, database policy, object-level access control, or permission cache behavior is created or changed | `.mustflow/skills/auth-permission-change/SKILL.md` | Actors, principals, permission decision tuple, effective permissions, tenants, resources, actions, context, auth middleware, sessions, tokens, API keys, route guards, server policy, DB policy, role matrix, audit, and tests | Auth middleware, policy functions, controllers, services, jobs, webhooks, database queries, RLS, UI guards, audit logs, docs, migrations, and tests | authentication treated as authorization, role column mistaken for effective permission, unexplained allow or deny, client guard trusted as security, object-level authorization bypass, cross-tenant leak, stale token or cache permission, over-broad admin/API-key scope, or missing denial tests | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Auth/authz boundary, principal/tenant/resource/action/context, effective permission, policy version, decision explanation, revocation window, policy source of truth, server/database enforcement, client UX-only guards, denial coverage, verification, and remaining permission risk |
+| API security review needs api-access-control triage for BOLA or IDOR, effective permission decisions, broken authentication, object-level authorization, object-property authorization, function-level authorization, request-supplied user, tenant, role, or owner identifiers, tenant isolation, scoped admin, list/detail mismatch, write permissions, mass assignment, DTO exposure, client-only admin, temporary public holes, router order, GraphQL resolvers, batch APIs, exports, downloads, previews, signed storage URLs, cache keys, async job revalidation, webhook ownership mapping, OAuth or OIDC confusion, JWT verification, stale token claims, session rotation, cookie flags, reauthentication, reset tokens, account enumeration, automation defense, internal identity planes, or denial-case matrices | `.mustflow/skills/api-access-control-review/SKILL.md` | User goal, current diff or target files, subject-object-action-context ledger, decision explanation ledger, object authorization ledger, property authorization ledger, function authorization ledger, authentication proof ledger, denial evidence, and configured command intents | Server-side object checks, tenant-scoped lookups, relationship checks, function-level checks, property allowlists, DTO mappers, signed URL scoping, cache-key dimensions, worker revalidation, webhook ownership mapping, token/session/cookie hardening, reauthentication gates, enumeration-safe responses, rate limits, audit logs, focused denial tests, and directly synchronized docs or templates | login-as-authorization, request-owned identity trust, findById before owner check, role-only admin, unexplainable allow/deny decision, list/detail authorization drift, read/write permission confusion, mass assignment, entity DTO leak, client-only admin, permitAll leak, route shadowing, resolver bypass, batch item bypass, export/download bypass, signed URL bypass, tenantless query or cache, stale queue permission, webhook ownership confusion, OAuth/OIDC token confusion, decoded-but-unverified JWT, stale token claim, session fixation, weak cookie, missing reauth, reusable reset token, account enumeration, automation abuse, internal account exposure, or happy-path-only auth tests | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `test_audit`, `docs_validate_fast`, `test_release`, `mustflow_check` | API access control reviewed, subject/object/action/field/tenant map, effective permission and decision-explanation findings, object/property/function/auth findings, fixes or recommendation, denial evidence, verification, and remaining API access-control risk |
 | File upload, import, attachment, direct-to-storage upload, remote file fetch, archive extraction, thumbnail, preview, image resize, OCR, media transcode, document conversion, antivirus scan, CDR, file metadata, storage key, public file URL, signed upload URL, signed download URL, CDN delivery, file download, uploaded filename display, or file lifecycle test needs file-upload-security triage across validation, storage, processing, serving, and cleanup | `.mustflow/skills/file-upload-security-review/SKILL.md` | User goal, current diff or target files, upload entrypoint ledger, file identity ledger, validation ledger, processing ledger, serving ledger, existing tests or abuse cases, storage policy evidence, and configured command intents | Server-side type allowlists, decoded and normalized filename checks, generated storage keys, path containment checks, overwrite protection, upload size and count limits, quota checks, quarantine states, scanner gates, parser sandboxing, archive extraction guards, CSV formula neutralization, metadata stripping, image rewriting, signed URL constraints, download authorization, response headers, filename encoding, audit logs, cleanup rules, focused denial tests, and directly synchronized docs or templates | client-only file restriction, trusted MIME label, extension check before normalization, blocklist-only type policy, original-name-only validation, user-controlled storage key, path traversal, long-name exception leak, overwrite or key guessing, executable web-root storage, web-server handler execution, magic-byte theater, polyglot file, metadata payload, active SVG or HTML, unsafe PDF or Office preview, Zip Slip, decompression bomb, CSV formula injection, remote fetch SSRF, scan-after-publication gap, unsandboxed parser, weak presigned URL policy, direct-to-storage pre-scan publication, tenant key collision, missing download authorization, unsafe response header, filename XSS or header injection, chunk assembly bypass, quota bypass, orphan file leak, or happy-path-only upload tests | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `test_audit`, `docs_validate_fast`, `test_release`, `mustflow_check` | File upload security reviewed, upload/file identity/storage/validation/processing/serving/cleanup map, findings, fixes or recommendation, denial evidence, verification, and remaining file-upload security risk |
 | Code review or implementation needs security-flow triage by tracing values from source to sink across authorization, object ownership, tenant scoping, IDOR or BOLA risk, list/detail/export scope, state-changing permissions, mass assignment, admin-only surfaces, cache keys, sort/filter/field injection inputs, ORM raw paths, shell wrappers, SSRF, file upload and extraction, path traversal, XSS, CSRF, OAuth, reset tokens, JWTs, cookies, cryptography, logs, fail-open error handling, queued work, race conditions, or supply-chain and CI/CD paths | `.mustflow/skills/security-flow-review/SKILL.md` | User goal, current diff or target files, security claim, source-to-sink map, actor and resource map, read and write surfaces, framework escape hatches, existing tests or scanner findings, and configured command intents | Server-side ownership checks, tenant-scoped queries, allowlisted updates, cache-key dimensions, URL and path validation, parser or renderer boundaries, upload and extraction limits, token validation, cookie flags, fail-closed error handling, queue revalidation, idempotency, focused tests, and directly synchronized docs or templates | dangerous-keyword theater, authentication treated as authorization, UUID-as-lock assumption, list/detail scope drift, export over-disclosure, state change via raw body status, mass assignment, frontend-only admin gate, cache viewer leak, ORDER BY injection, ORM raw escape hatch, shell wrapper injection, SSRF, unsafe upload preview, Zip Slip, decompression bomb, path traversal, XSS, CSRF, weak OAuth state, reset-token reuse, stale JWT claims, broad cookie domain, custom crypto, token or PII logging, missing security logs, fail-open permission, stale queued permission, duplicate money or entitlement effect, postinstall or CI secret exposure | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Security flow reviewed, source/sink/actor/resource/tenant map, authorization and ownership notes, input/file/network/browser/token/cookie/crypto/log/fail-open/async/race/cache/supply-chain findings, fixes or recommendation, tests or invariant evidence, verification, and remaining security-flow risk |
 | Code, configuration, docs, templates, logs, telemetry, traces, baggage, behavior analytics, credentials, data flows, data residency policy, region or processing-location claims, AI-generated code, authentication, authorization, client-only permission checks, admin operations, audit logs, cache policy, cache-as-authority decisions, claim or policy data, comparison or affiliate data, user-generated content, sessions, tokens, uploads, downloads, signed URLs, API responses, webhooks, job queues, external API call records, external requests, third-party data-use terms, runtime security patch policy, deployment settings, dependencies, cryptography, secure transport, scanner gates, security invariants, or agent configuration affect secrets, personal data, retention, access control, vendor disclosure, or external disclosure | `.mustflow/skills/security-privacy-review/SKILL.md` | Changed files, sensitive surfaces, actor and resource owner, data-owner boundary, data residency and processing-location boundary, runtime patch boundary, AI gateway or budget boundary, server-side authorization rule, file upload/download boundary, API response field boundary, behavior analytics surface, trace or baggage surface, webhook or external-call record surface, admin operation surface, audit-log surface, cache visibility and authority policy, claim or affiliate policy surface, session or token surface, external target, dependency source, third-party data-use or terms surface, cryptography or transport surface, scanner evidence, agent-tool permission, deployment setting, project secret and privacy rules, public or packaged surfaces, and command contract entries | Sensitive data handling, authorization, admin operations, data residency, runtime patchability, AI budget records, behavior analytics, observability identifiers, webhook receipts, external-call records, dead-letter records, audit logs, shared-cache behavior, cache-authority behavior, claim and affiliate disclosure, sessions, tokens, inputs, files, signed URLs, API responses, logs, receipts, generated state, docs, templates, package metadata, deployment settings, and reports | secret leak, personal-data exposure, access-control bypass, client-trusted role or owner value, unsafe admin action, private file exposure, over-broad API response, shared-cache leak, unsafe cache authority, unprovable data location, unpatchable runtime, privacy-heavy telemetry, unsafe baggage propagation, unsafe webhook payload retention, unsafe external request, supply-chain drift, weak cryptography, insecure transport, over-privileged agent, risky third-party terms, or misleading privacy claim | `changes_status`, `changes_diff_summary`, `docs_validate_fast`, `test_release`, `mustflow_check` | Sensitive surfaces reviewed, data residency, runtime patchability, AI hard-limit, behavior analytics, observability, and audit boundaries, webhook, external-call, and dead-letter boundaries, cache authority and disclosure boundaries, assumptions checked, disclosure and retention paths, authorization, file, API response, third-party terms, and external-boundary notes, verification, and remaining security or privacy risk |
@@ -493,6 +526,8 @@ routes. Event routes stay inactive until their event occurs.
 | Database lock contention review needs to catch blocking visible in the diff, including hot rows, mutable counter caches, balance or stock updates, reservation flows, queue table claiming, `SELECT ... FOR UPDATE`, weaker row-lock choices, optimistic version checks, conditional updates, lock order, deadlock retry, MySQL/InnoDB gap or next-key locks, PostgreSQL row-lock variants, SQL Server lock escalation, long transactions, external calls inside transactions, DDL or metadata lock waits, idle-in-transaction blockers, lock timeout policy, connection-pool waits, or lock observability | `.mustflow/skills/database-lock-contention-review/SKILL.md` | Contended resource, workload concentration, database engine and isolation, lock path, index and predicate shape, transaction width, queue claim model, batch size, timeout and retry policy, observability evidence, and configured command intents | Data-shape changes such as ledgers, reservations, sharded counters, materialized summaries, conditional updates, weaker locks, stable lock order, chunked batches, queue shards, timeout policy, focused tests, docs, and directly synchronized templates | hot-row serialization, parent-counter bottleneck, select-then-update race, over-strong `FOR UPDATE`, missing lock-footprint index, gap-lock insert block, metadata-lock surprise, unordered multi-row deadlock, unchunked write outage, queue head contention, hidden FK parent lock, idle transaction blocking DDL, infinite lock wait, pool starvation, unsafe deadlock retry, or missing blocker/waiter evidence | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Lock-contention surface reviewed, contended resource and workload ledger, lock strength/order/index/queue/batch/DDL/timeout/pool/observability findings, evidence level, verification, and remaining database lock-contention risk |
 | SQLite-specific schema, query, transaction, migration, indexing, extension, WAL, local-file persistence, embedded database, mobile database, browser OPFS/WASM SQLite, cache index, or SQLite runtime behavior is created, changed, reviewed, or reported | `.mustflow/skills/sqlite-code-change/SKILL.md` | SQLite role, runtime and binding, file ownership, storage medium, concurrency shape, schema/type rules, query/index evidence, migration and recovery needs, changed files, and command contract entries | SQLite schema, queries, connection setup, transactions, pragmas, indexes, migrations, fixtures, tests, docs, and directly synchronized templates | wrong runtime assumption, file-lock surprise, WAL overclaim, network filesystem risk, disabled foreign keys, weak type constraints, unsafe raw SQL, query-plan overclaim, sidecar-file data loss, failed migration rebuild, or unverified backup/restore | `changes_status`, `changes_diff_summary`, `test_related`, `test`, `lint`, `build`, `docs_validate_fast`, `test_release`, `mustflow_check` | SQLite runtime, storage, WAL/concurrency, schema/type/constraint, query/index, migration, backup/restore, verification, and remaining SQLite risk |
 | PostgreSQL-specific schema, query, transaction, migration, indexing, extension, role, row-level security, connection pooling, replication, backup, restore, managed Postgres, or Postgres runtime behavior is created, changed, reviewed, or reported | `.mustflow/skills/postgresql-code-change/SKILL.md` | PostgreSQL role, version, provider, extension inventory, topology, pooler, schema/type rules, query-plan evidence, transaction/retry rules, migration and recovery needs, changed files, and command contract entries | PostgreSQL schema, queries, migrations, generated SQL, connection setup, pool settings, roles, RLS policies, extensions, tests, docs, and directly synchronized templates | version drift, provider constraint miss, connection storm, lock or rewrite surprise, unsafe online DDL claim, bad pooler assumption, RLS bypass, search-path risk, extension drift, stale replica read, query-plan overclaim, or unverified restore | `changes_status`, `changes_diff_summary`, `test_related`, `test`, `lint`, `build`, `docs_validate_fast`, `test_release`, `mustflow_check` | PostgreSQL version/topology, pooling, lock/transaction, schema/type/RLS/role, query/index/statistics, backup/restore, verification, and remaining PostgreSQL risk |
+| Keyword search, full-text search, Elasticsearch, OpenSearch, Lucene-style indexing, search APIs, indexing pipelines, aliases, bulk indexing, refresh visibility, analyzers, mappings, synonyms, autocomplete, pagination, shard failures, search quality, or search performance are created, changed, reviewed, or failing | `.mustflow/skills/search-index-integrity-review/SKILL.md` | Symptom classification, source-to-search ledger, query contract ledger, index contract ledger, quality ledger, performance ledger, privacy ledger, changed files, and command contract entries | Search canaries, indexing ledgers, bulk item error handling, alias checks, mapping and analyzer fixtures, exact-versus-full-text tests, tenant and permission filters, golden-set tests, synonym regression tests, pagination guards, query metrics, docs, and directly synchronized templates | cluster-green theater, batch-level bulk success, source/index count illusion, write alias drift, partial shard result, direct/API/UI mismatch, wrong keyword/text field, analyzer drift, synonym regression, rank eyeballing, profile misuse, query fingerprint leak, shard fan-out, cache-only benchmark, refresh overuse, segment merge backlog, disk watermark write block, deep pagination, oversized fetch, or private query/document leak | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Search index integrity reviewed, source-to-search/query/index/quality/performance/privacy ledgers, search findings, fix or recommendation, evidence level, verification, and remaining search-index risk |
+| Vector search, semantic search, RAG retrieval, embedding generation, preprocessing, chunking, vector schema, collection, namespace, tenant, named vector, metadata payload, filter, ANN index, exact-versus-approximate search, hybrid search, reranking, recall, latency, quantization, HNSW, IVF, pgvector, Qdrant, Milvus, Weaviate, OpenSearch kNN, or retrieval golden-set behavior is created, changed, reviewed, or failing | `.mustflow/skills/vector-search-integrity-review/SKILL.md` | Retrieval symptom, query contract ledger, ingestion ledger, quality ledger, performance ledger, privacy ledger, changed files, and command contract entries | Embedding and preprocessing versioning, vector validation, deterministic ids, namespace and tenant selection, metadata indexes, filter construction, exact-search checks, ANN parameters, reranker candidates, golden-set tests, synthetic fixtures, metrics, docs, and directly synchronized templates | vector-DB scapegoating, wrong embedding dimension, model revision drift, filter post-candidate loss, metadata type drift, tenant leak, duplicate chunk ids, stale deletes, metric or normalization mismatch, ANN tuning before exact-search proof, quantization recall loss, reranker candidate starvation, hybrid score misuse, deep ANN pagination, raw vector or document leak, or unmeasured p95 latency | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Vector search integrity reviewed, retrieval/query/ingestion/quality/performance/privacy ledgers, exact-versus-ANN and filter findings, fix or recommendation, evidence level, verification, and remaining vector-search risk |
 | Dependency versions, lockfiles, package-manager metadata, workspace constraints, runtime engines, peer dependencies, optional dependencies, security advisory fixes, generated dependency output, framework plugins, TypeScript compiler tracks, CI actions, Docker base images, package manager behavior, or toolchain versions are upgraded, downgraded, pinned, widened, regenerated, reviewed, or reported | `.mustflow/skills/dependency-upgrade-review/SKILL.md` | Dependency name, old and new versions or ranges, direct or transitive path, ecosystem and package manager, declaration files, lockfiles, runtime or toolchain files, advisory or release-note evidence, generated outputs, callers, docs, package output, Docker, CI, or TypeScript compiler-track surfaces, and command contract entries | Package declarations, lockfiles, generated outputs, compatibility code, tests, docs, package metadata, Docker or CI files, TypeScript compiler-track notes, and directly synchronized examples | lockfile churn, hidden transitive replacement, peer or engine break, module-format drift, native or optional package break, framework or generator output drift, unsafe broad security update, weakened tests, Docker or CI runtime drift, TS7 RC over-adoption, TS7 nightly over-adoption, or unreviewed supply-chain change | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Upgrade reason, ecosystem surface, direct and transitive graph changes, compatibility classification, runtime/peer/engine/module/feature/platform/generated-output/compiler-track risks, synchronized surfaces, verification, and remaining dependency-upgrade risk |
 | Dependency, package, runtime, framework, tool, command, plugin, service, platform capability, supported-version policy, security patch path, ecosystem maturity claim, maintainer-risk assumption, runtime portability claim, edge or serverless compatibility claim, critical-path library choice, package script, lifecycle hook, binary download, lockfile, audit result, or supply-chain-sensitive dependency surface is assumed, added, removed, imported, invoked, installed, audited, or documented | `.mustflow/skills/dependency-reality-check/SKILL.md` | Assumed dependency or capability, declaration files, version or feature expectation, role criticality, supported-version or end-of-life evidence, patchability expectation, runtime compatibility boundary, maintainer and ecosystem evidence when available, lockfile entry, package script or lifecycle hook, audit or provenance evidence, and relevant command intents | Package metadata, lockfiles, imports, scripts, command contracts, docs, tests, runtime policy notes, portability notes, and reports | unavailable dependency, hallucinated or lookalike package, fragile single-maintainer core dependency, experimental technology in a survival path, unsupported runtime, unclear security patch path, runtime-specific API leakage into core logic, stale version claim, lifecycle script risk, audit suppression, lockfile drift, or install guidance mismatch | `changes_status`, `changes_diff_summary`, `build`, `test_release`, `mustflow_check` | Dependency checked, ecosystem and maintainer-risk boundary reviewed, supported-version, patchability, and runtime-portability boundary reviewed, supply-chain surface reviewed, declarations synchronized, verification, and remaining dependency risk |
 | Generated or edited code, configuration, CI workflows, package metadata, install instructions, examples, Docker images, framework setup, runtime declarations, toolchain declarations, TypeScript compiler-track references, Rust release or MSRV references, or migration-sensitive snippets introduce explicit external version references, action refs, package ranges, runtime versions, framework majors, Docker image tags, or scaffold commands that may be stale | `.mustflow/skills/version-freshness-check/SKILL.md` | Versioned reference, owning files, repository version policy, approved freshness source, compatibility context, migration risk, TypeScript compiler track or Rust MSRV/toolchain track when relevant, and command contract entries | Package metadata, lockfiles, CI workflows, Dockerfiles, runtime files, framework config, docs, examples, templates, tests, and version-decision reports | stale default version, false latest claim, accidental major migration, repository policy mismatch, unsupported generated example, TypeScript RC/nightly/API-track confusion, Rust stable/nightly/MSRV confusion, floating-tag drift, or unverified security/support claim | `changes_status`, `changes_diff_summary`, `build`, `test_related`, `docs_validate_fast`, `test_release`, `mustflow_check` | Versioned surfaces checked, repository policy and freshness source, selected version track, compatibility classification, TypeScript stable/RC/nightly/API-track and Rust stable/nightly/MSRV split when relevant, approval need, synchronized surfaces, verification, and remaining version-freshness risk |
@@ -516,6 +551,7 @@ routes. Event routes stay inactive until their event occurs.
 | Web image, hero image, LCP image, product image, feed image, gallery, avatar, thumbnail, carousel image, CSS background hero, `img`, `picture`, `source`, `srcset`, `sizes`, framework image component, responsive preload, `imagesrcset`, `imagesizes`, `fetchpriority`, `loading`, `decoding`, intrinsic dimensions, DPR bucket, width bucket, AVIF/WebP/JPEG/PNG/SVG fallback, quality budget, blur placeholder, base64 inline image, image CDN transformation, derivative cache key, content-hash URL, `Accept` negotiation, image proxy allowlist, EXIF orientation, ICC profile, uploaded SVG handling, `elementtiming`, Resource Timing, DevTools image waterfall, or RUM image evidence needs image-delivery-performance triage for discovery, priority, candidate size, layout stability, cacheability, quality, or abuse risk | `.mustflow/skills/image-delivery-performance-review/SKILL.md` | User goal, current diff or target files, image role ledger, discovery and priority ledger, responsive candidate ledger, layout stability ledger, format and quality ledger, pipeline and metadata ledger, CDN and cache ledger, safety and abuse ledger, image measurement evidence or gap, and configured command intents | Image markup, `picture` source order, `srcset`, `sizes`, `loading`, `decoding`, `fetchpriority`, responsive preload, `imagesrcset`, `imagesizes`, intrinsic dimensions, `aspect-ratio`, placeholders, CSS background preload hints, framework image props, width and DPR buckets, format fallback policy, quality settings, metadata handling, cache headers, derivative cache key rules, `Accept` forwarding, remote image allowlists, transform limits, focused tests, and directly synchronized docs or templates | lazy LCP image, priority inflation, hidden background image discovery, responsive preload missing `imagesrcset` or `imagesizes`, multi-format preload, wrong `sizes`, framework fill without sizes, lazy image missing dimensions, unbounded 3x variants, arbitrary width bucket, CDN cache-key confetti, wrong format by content type, weak JPEG fallback, global quality constant, missing byte budget, EXIF rotation bug, color-profile loss, unsafe SVG serving, meaningful image hidden as CSS background, first-view lazy loading, giant lazy gallery, wrong `decoding` hint, oversized blur placeholder, base64 HTML bloat, missing content-hash URL, lost original, dropped `Accept` header, public image proxy, unbounded remote transform, Lighthouse-only claim, or duplicate image download | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Image delivery performance reviewed, image role/discovery/priority/candidate/layout/format/quality/cache/pipeline/safety map, findings, fixes or recommendation, measurement or static image-delivery evidence, verification, and remaining image-delivery performance risk |
 | Frontend app, client package, route, design system, component library, import graph, package entrypoint, bundler config, shared vendor chunk, first-route JS, tree shaking, ESM/CJS dependency, barrel file, package exports, sideEffects metadata, PURE annotation, Next use-client boundary, Server Component split, dynamic import, React lazy, Angular defer, Vue route lazy loading, import modularization, icon import, date locale, syntax highlighter, markdown renderer, code editor, Node polyfill, browser target, Babel polyfill, dev-only branch, console stripping, Rollup manualChunks, Vite modulepreload, Tailwind extraction, safelist, or inline asset rule needs client-bundle-pruning triage for dead-code elimination, dependency bloat, initial JS, shared vendor, or route chunk risk | `.mustflow/skills/client-bundle-pruning-review/SKILL.md` | User goal, current diff or target files, bundle target ledger, entry and import graph ledger, dependency format ledger, framework boundary ledger, heavy-feature ledger, polyfill and target ledger, bundle budget or analyzer evidence, and configured command intents | Import paths, package entrypoints, subpath exports, barrel usage, ESM package choices, package `sideEffects`, Rollup `moduleSideEffects`, PURE annotations, client/server boundaries, dynamic imports, framework deferral boundaries, package import modularization, icon/date/highlighter/editor imports, Node polyfill fallbacks, browser and Babel targets, dev-only constants, logging wrapper behavior, manual chunk rules, modulepreload policy, Tailwind extraction shapes, asset inline thresholds, focused tests, and directly synchronized docs or templates | total-dist theater, no initial-JS budget, CJS package on client path, broad utility import, hot-path barrel, package-root import dragging design system, missing subpath export, unsafe `sideEffects: false`, CSS or polyfill shaken away, global `moduleSideEffects: false`, missing PURE hint, false PURE annotation, page-level `use client`, server-safe parser in client bundle, dynamic import with variable path, lazy component declared in render, eager modal/chart/editor/map/search/PDF import, library import before user intent, Angular defer leak, Vue eager route, unverified import modularization, icon catalog import, all date locales, all highlighter languages, Node polyfill bundle, old browser-target helper bloat, whole `core-js` import, dev branch not folded, unsafe console drop, one giant vendor chunk, modulepreload spam, dynamic Tailwind class miss, broad safelist, large inline SVG/font/image, or unmeasured bundle claim | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Client bundle pruning reviewed, first-route/shared-vendor/import/dependency/client-boundary/polyfill/chunk/CSS/asset map, findings, fixes or recommendation, measurement or test evidence, verification, and remaining client-bundle pruning risk |
 | Frontend route, component, animation, scroll path, input path, list, table, chart, map, canvas, media slot, modal, drawer, hydration boundary, DOM read/write path, CSS selector, class toggle, CSS custom property, containment, content-visibility, virtualization, observer, event listener, requestAnimationFrame loop, long task, worker boundary, ResizeObserver path, runtime CSS injection, React memo boundary, context provider, deferred update, transition, or DevTools rendering trace needs frame-render-performance triage for INP, animation smoothness, scroll responsiveness, style recalculation, layout, paint, compositing, main-thread, or hydration risk | `.mustflow/skills/frame-render-performance-review/SKILL.md` | User goal, current diff or target files, interaction and frame ledger, DOM and layout ledger, style and CSS ledger, paint and compositing ledger, event and scheduling ledger, framework render ledger, rendering evidence or measurement gap, and configured command intents | DOM read/write batching, layout-affecting writes, transform/opacity animations, will-change scope, containment, content-visibility and contain-intrinsic-size, virtualization, selector simplification, state-class scope, CSS variable scope, media geometry reservation, native lazy loading, IntersectionObserver, passive listeners, overscroll-behavior, requestAnimationFrame scheduling, long-task chunking, worker and OffscreenCanvas boundaries, ResizeObserver, runtime CSS rule reduction, React prop and context stability, deferred and transition updates, hydration narrowing, focused tests, and directly synchronized docs or templates | forced synchronous layout, layout thrashing, width/height/top/left animation, stale will-change, missing containment, unsafe contain side effect, content-visibility scroll jump, offscreen chart or canvas work, oversized DOM, deep wrapper tree, expensive selector, body/html state blast, root CSS variable churn, unreserved media slot, LCP concern misrouted as frame fix, JS lazy loader overhead, scroll polling, non-passive wheel/touch handler, JS scroll lock, setTimeout frame clock, long task, main-thread heavy compute, canvas blocking input, resize measurement loop, runtime style injection, ineffective memo, broad context rerender, urgent heavy result render, full hydration INP cost, Lighthouse-score-only claim, or unmeasured rendering win | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Frame render performance reviewed, interaction/DOM/style/layout/paint/compositing/event/framework map, findings, fixes or recommendation, measurement or static frame-risk evidence, verification, and remaining frame-render performance risk |
+| UI motion, animation, transition, microinteraction, motion recipe, motion design system, CSS animation or transition, WAAPI, Framer Motion, GSAP, View Transition, hover, press, focus, drag, viewport entry, loading, async success, async failure, reduced motion, interruption, cancellation, settlement, timeline track, transform, opacity, filter, layout animation, or additive composition is planned, edited, reviewed, or reported | `.mustflow/skills/motion-system-contract-review/SKILL.md` | User goal, current diff or target files, motion slot, source and target roles, semantic event class, logical from-state and to-state, timeline tracks, interruption policy, settlement policy, reduced-motion policy, binding approach, async signal owner, evidence level, and configured command intents | Motion recipes, component motion props, CSS keyframes and transitions, animation lifecycle handlers, reduced-motion rules, state and signal policies, role/ref/slot/data binding, story fixtures, focused tests, and directly synchronized docs or templates | motion owns product state, false success or failure feedback, timer pretending to be a signal, missing from-state or to-state, same target and channel collision, unsupported additive composition, layout-channel animation, `animation-fill-mode` state lie, missing reduced motion, hover-only access, brittle selector binding, production animation failure blocking core action, or unverified visual proof | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Motion contract reviewed, state/event/track/interruption/settlement/reduced-motion/binding ledgers, async and collision findings, evidence level, verification, and remaining motion contract risk |
 | Frontend component, route, store, query, form, router state, context provider, persisted store, external subscription, optimistic mutation, search/filter/pagination interaction, selected item, list key, or hydration path can duplicate, derive, overwrite, or race the same value across props, local state, server cache, URL params, form drafts, global app context, selectors, storage, or external stores | `.mustflow/skills/frontend-state-ownership-review/SKILL.md` | User goal, current diff or target files, framework and state-library signals, state owner ledger, state class map, synchronization surfaces, identity and collection surfaces, evidence level, and configured command intents | State owner cleanup, derived selectors, nearest-owner move, status or mode union, grouped action, selected ID lookup, query key dimensions, invalidation scope, request cancellation, optimistic rollback, URL-state routing, form draft reset, context split or memoization, persisted-state versioning, reset keys, external subscription wrapper, focused tests, and directly synchronized docs or templates | props-to-state drift, duplicated derived state, effect-derived one-render lag, contradictory booleans, partial grouped-state tear, selected object staleness, server data copied into global store, URL state fork, form draft overwrite, optimistic update without rollback, stale request overwrite, incomplete query key, broad invalidation, index-key local-state swap, raw setter sprawl, context value rerender storm, state too high or too low, non-serializable persisted store, hydration mismatch, unsafe external subscription snapshot, or unverified state owner | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Frontend state surface reviewed, owner ledger and state class map, duplicate or derived state findings, query/URL/form/optimistic/race/context/persistence decisions, tests or evidence level, verification, and remaining state-ownership risk |
 | Frontend UI, design system component, dashboard, form, card, list, table, chart, media slot, modal, drawer, toast, bottom CTA, portal, or responsive surface needs stress-layout review against hostile content, narrow parent containers, async media, skeletons, empty or error states, permission variants, scrollbars, mobile viewport and keyboard behavior, safe areas, line clamps, i18n or RTL, touch input, reduced motion, observer loops, portal edge placement, z-index layers, browser zoom, cascade layers, or reproducible break conditions | `.mustflow/skills/frontend-stress-layout-review/SKILL.md` | User goal, current diff or target files, framework and styling signals, stress fixture ledger, parent container ledger, geometry contract ledger, interaction and state ledger, evidence level, and configured command intents | Stress fixtures, stories, tests, parent-container-aware constraints, container queries, `min-width: 0`, `minmax(0, 1fr)`, `overflow-wrap: anywhere`, reserved media dimensions, `aspect-ratio`, skeleton geometry, empty and error states, permission variants, stable scroll containers, `scrollbar-gutter: stable`, mobile viewport and keyboard constraints, `safe-area-inset-*`, explicit `line-height`, logical properties, touch-accessible affordances, `prefers-reduced-motion`, observer scope, portal placement, z-index tokens, table and chart stress handling, zoom-safe geometry, cascade layer fixes, and directly synchronized docs or templates | happy-path fixture blindness, parent-width overflow, flex or grid min-content blowout, unbroken text overflow, async media or font layout shift, skeleton mismatch, collapsed empty state, error-state overlap, permission action wrapping, late `display: none` layout jump, scrollbar width wrap, fragile `100vh`, keyboard-covered CTA, unsafe-area overlap, line-clamp/action collision, localization or RTL breakage, hover-only control, layout-affecting hover or animation, ResizeObserver loop, clipped portal, z-index arms race, unusable wide table, chart zero-width mount, browser zoom clipping, CSS specificity loss, or vague non-reproducible visual complaint | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Frontend stress layout reviewed, stress fixture and parent-container ledgers, reproducible break conditions, fixes or recommendation, evidence level, verification, and remaining stress-layout risk |
 | Frontend UI, design-system component, form, dialog, menu, tab, combobox, custom select, table, card, media, icon button, image, toast, live update, drag interaction, focus style, keyboard handler, `onClick`, `role`, `tabIndex`, `aria-*`, `alt`, hidden content, visually hidden text, or automated accessibility claim needs accessibility-tree review for native semantics, accessible names, visible label consistency, keyboard navigation, focus order and return, forms, errors, status messages, ARIA references, icon or image alternatives, custom widget contracts, non-text contrast, target size, drag alternatives, or a11y evidence limits | `.mustflow/skills/frontend-accessibility-tree-review/SKILL.md` | User goal, current diff or target files, framework and component-library signals, semantic ledger, keyboard ledger, assistive-technology ledger, form ledger, interaction ledger, evidence level, and configured command intents | Native HTML element selection, button/link semantics, `href` cleanup, keyboard parity, tabindex cleanup, focus-visible styling, obscured focus fixes, dialog focus management, icon-only accessible names, visible-label-aligned names, `aria-labelledby` and `aria-describedby` id references, `aria-hidden` cleanup, SVG icon defaults, image `alt`, label and fieldset wiring, `aria-invalid`, error descriptions, submit-failure focus, live regions, ARIA pattern keyboard behavior, custom select constraints, non-text contrast, target-size fixes, drag alternatives, focused tests, accessibility snapshots, and directly synchronized docs or templates | ARIA costume over broken semantics, clickable div, fake link, `href="#"`, missing Enter or Space behavior, tabIndex sprawl, positive tabindex, invisible focus, focus hidden behind sticky layers, modal focus leak, unnamed icon button, visible text fighting `aria-label`, broken `aria-labelledby`, interactive child hidden by `aria-hidden`, duplicate SVG announcement, useless image alt, placeholder-only field, missing legend, color-only error, disconnected error text, submit failure silence, unannounced async status, menu or combobox keyboard mismatch, unnecessary custom select, offscreen focus trap, non-text contrast failure, tiny pointer target, drag-only operation, axe-only proof, or accessibility-tree evidence gap | `changes_status`, `changes_diff_summary`, `lint`, `build`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Frontend accessibility tree reviewed, semantic/keyboard/focus/name/form/status/widget evidence, findings, fixes or recommendation, automated-evidence limits, verification, and remaining accessibility-tree risk |
@@ -536,7 +572,7 @@ routes. Event routes stay inactive until their event occurs.
 | Trigger | Skill Document | Required Input | Edit Scope | Risk | Verification Intents | Expected Output |
 | --- | --- | --- | --- | --- | --- | --- |
 | Architecture, module boundaries, codebase structure, structural improvement, codebase deepening, or testability needs review before choosing a refactor or abstraction | `.mustflow/skills/architecture-deepening-review/SKILL.md` | Target area, structural pain, local patterns, behavior evidence, current changed files, and command contract entries | Review notes, ranked structure candidates, and at most one scoped structural follow-up when requested | speculative abstraction, broad rewrite, pattern-first design, hidden behavior change, or unverified structure claim | `changes_status`, `changes_diff_summary`, `test_related`, `test`, `lint`, `build`, `docs_validate_fast`, `test_release`, `mustflow_check` | Review target, evidence, candidate scores, selected next action, narrower skill choice, verification, and remaining architecture risk |
-| Service boundaries, modular-monolith boundaries, bounded contexts, team ownership, data ownership, source-of-truth maps, event or queue boundaries, multi-tenant isolation, failure flows, operational recovery, or large-scale architecture split decisions are designed, reviewed, or changed | `.mustflow/skills/service-boundary-architecture/SKILL.md` | Candidate domains, owners, data truth map, communication paths, shared database or cache coupling, failure flows, idempotency, queue/retry/dead-letter behavior, cache consistency, tenant/auth/audit boundaries, observability, deployment, migration, retention, operations tools, and command contract entries | Architecture docs, decision records, context files, boundary source, API/event/queue/cache/read-model contracts, operational runbooks, tests, and directly synchronized docs or templates | noun-first service split, shared database coupling, unknown data owner, happy-path-only design, retry storm, queue backlog with no owner, cache as accidental authority, tenant leak, command-like events, missing observability, unsafe migration, or manual recovery without audit | `changes_status`, `changes_diff_summary`, `test_related`, `test`, `lint`, `build`, `docs_validate_fast`, `test_release`, `mustflow_check` | Boundary checked, data owners, failure/idempotency/queue/cache/event notes, tenant/auth/retention/observability/deployment/operations notes, verification, and remaining service-boundary risk |
+| Service boundaries, modular-monolith boundaries, bounded contexts, team ownership, data ownership, source-of-truth maps, event or queue boundaries, multi-tenant isolation, failure flows, independent deployment, operational recovery, disaster recovery, cost, toil, or large-scale architecture split decisions are designed, reviewed, or changed | `.mustflow/skills/service-boundary-architecture/SKILL.md` | Candidate domains, owners, data truth map, communication paths, shared database or cache coupling, failure flows, boundary proof ledger, idempotency, queue/retry/dead-letter behavior, cache consistency, tenant/auth/audit boundaries, observability, deployment, migration, retention, operations tools, and command contract entries | Architecture docs, decision records, context files, boundary source, API/event/queue/cache/read-model contracts, operational runbooks, tests, and directly synchronized docs or templates | noun-first service split, shared database coupling, unknown data owner, repeated cross-team co-change, independent-deploy theater, dependency cycle, happy-path-only design, retry storm, queue backlog with no owner, cache as accidental authority, tenant leak, command-like events, missing observability, unsafe migration, weak health probe, untested graceful shutdown, version incompatibility, untested restore or DR, or manual recovery without audit | `changes_status`, `changes_diff_summary`, `test_related`, `test`, `lint`, `build`, `docs_validate_fast`, `test_release`, `mustflow_check` | Boundary checked, data owners, co-change/deploy/dependency proof, failure/idempotency/queue/cache/event notes, tenant/auth/retention/observability/deployment/health/recovery/cost/toil notes, verification, and remaining service-boundary risk |
 | Code is being refactored, reorganized, renamed, deduplicated, simplified, or structurally improved while existing behavior should be preserved | `.mustflow/skills/behavior-preserving-refactor/SKILL.md` | Refactoring goal, target area, behavior evidence, local patterns, current changed files, and command contract entries | Small behavior-preserving refactor steps, related tests, and directly synchronized docs or contracts | hidden behavior change, broad cleanup, misleading abstraction, unsafe deduplication, or unverified legacy change | `changes_status`, `changes_diff_summary`, `test_related`, `test`, `docs_validate_fast`, `test_release`, `mustflow_check` | Goal, behavior evidence, structural risks, refactoring ladder, changes made, excluded behavior changes, verification, and remaining risks |
 | Class inheritance, base classes, abstract classes, template methods, protected state, mixins, framework subclasses, or subtype hierarchies are introduced, reviewed, or refactored, especially for behavior reuse or feature variants | `.mustflow/skills/composition-over-inheritance/SKILL.md` | Inheritance surface, reuse goal, change dimensions, local composition patterns, compatibility constraints, current changed files, and command contract entries | Classes, functions, role interfaces, policies, strategies, adapters, decorators, state machines, tests, wrappers, and directly synchronized docs or templates | fragile parent-child coupling, subclass explosion, broken substitutability, hidden protected state, over-composition, or untested behavior-preserving refactor | `changes_status`, `changes_diff_summary`, `test_related`, `test`, `lint`, `build`, `docs_validate_fast`, `test_release`, `mustflow_check` | Inheritance review, keep-or-replace decision, change dimensions, composition pattern, tests, verification, and remaining hierarchy risk |
 | Multiple interchangeable algorithms, policies, calculations, scoring methods, sorting methods, recommendation methods, pricing rules, discount rules, shipping methods, payment methods, notification methods, permission policies, provider choices, feature-flag variants, or repeated branches choose how to do the same kind of work | `.mustflow/skills/strategy-pattern/SKILL.md` | Stable workflow, variants and shared purpose, current branch locations, common input and output shape, selection criteria, local Result, dependency injection, decorator, registry, and test patterns, current changed files, and command contract entries | Strategy function types, interfaces, concrete strategies, selectors, resolvers, registries, decorators, context wiring, tests, and directly synchronized docs or templates | over-abstracted small branch, wrong use-case grouping, context knowing concrete strategies, silent fallback, unsafe user-selected strategy, request-stateful strategy, strategy combination explosion, or untested selector behavior | `changes_status`, `changes_diff_summary`, `test_related`, `test`, `lint`, `build`, `docs_validate_fast`, `test_release`, `mustflow_check` | Strategy classification, shared contract, strategy registry, selector or resolver, default and unsupported-key behavior, tests, verification, and remaining strategy risk |
@@ -557,6 +593,7 @@ routes. Event routes stay inactive until their event occurs.
 | Repository improvement, audit, prioritization, stabilization, polish, onboarding, contributor-readiness, production-readiness, or iterative improvement is requested without a single predetermined edit | `.mustflow/skills/repo-improvement-loop/SKILL.md` | User goal, improvement mode, repository evidence, candidate risks, current changed files, and command contract entries | Repository diagnosis, ranked candidates, and at most one scoped improvement cycle unless the user explicitly requests analysis-only | idea spam, ungrounded prioritization, autonomous loop drift, broad rewrite, or unverified improvement claim | `changes_status`, `changes_diff_summary`, `docs_validate_fast`, `test_release`, `mustflow_check` | Mode, evidence inspected, scored candidates, selected improvement, files changed or analysis-only note, verification, next improvement question, and stop reason |
 | Current repository evidence reveals a scope-adjacent bug, missing test, stale synchronized surface, public-contract drift, security or privacy exposure, data-loss risk, brittle error handling, concurrency risk, operational risk, or UX inconsistency outside the literal request | `.mustflow/skills/proactive-risk-surfacing/SKILL.md` | Literal user request, current evidence, risk relationship, severity, expected edit size, authority boundary, and verification options | Fix-or-report decision, small related fixes, focused tests or synchronized surfaces, and final proactive risk notes | scope creep, speculative cleanup, hidden broad refactor, ignored high-severity risk, or false completion claim | `changes_status`, `changes_diff_summary`, `test_related`, `docs_validate_fast`, `test_release`, `mustflow_check` | Candidate decisions: fix now, report only, ask first, or ignore; files changed, verification, and remaining proactive risks |
 | A final report or completion claim needs current evidence for changed files, requirements, command receipts, skipped checks, synchronized surfaces, or remaining risks | `.mustflow/skills/completion-evidence-gate/SKILL.md` | User goal, changed-file evidence, skills used, verification results, skipped checks, synchronized surfaces, and remaining risks | Final report evidence and the smallest missing in-scope evidence surface only | false completion, stale receipts, hidden skipped checks, unsupported readiness claim, or contract drift | `changes_status`, `changes_diff_summary`, `test_related`, `test`, `test_audit`, `lint`, `build`, `docs_validate_fast`, `docs_validate`, `test_release`, `mustflow_check` | Completion status, requirement evidence map, changed and synchronized surfaces, commands run, skipped checks, and final wording boundary |
+| A final report, completion note, repository improvement loop, or follow-up workflow should offer a bounded numbered next-action menu that a user can select with a single digit in the next turn | `.mustflow/skills/next-action-menu/SKILL.md` | Completed or paused task evidence, candidate follow-ups, verification status, skipped checks, gates, and active-menu freshness boundary | Final-report next-action menu, digit-selection interpretation, and directly synchronized workflow docs, templates, or tests when menu behavior changes | stale menu selection, approval bypass, fabricated filler work, broad backlog drift, high-risk action laundering, or command-contract bypass | `changes_status`, `changes_diff_summary`, `docs_validate_fast`, `test_release`, `mustflow_check` | Menu included or omitted, numbered rows, recommendation scores, gate labels, selected digit handling, verification, and remaining selection risk |
 | A task is incomplete, blocked, paused, resumed, handed off, context-compacted, or needs bounded restart evidence without storing raw logs, secrets, hidden reasoning, transcripts, or authority-changing summaries | `.mustflow/skills/restricted-handoff-resume/SKILL.md` | Current goal, latest controlling instruction, changed files, command intents run or skipped, verification evidence, blocker or next safe action, and handoff or retention policy | Final report handoff evidence or explicitly configured handoff surface only | stale summary treated as authority, hidden reasoning leak, secret leak, raw log storage, unrelated work history, or missing restart point | `changes_status`, `changes_diff_summary`, `mustflow_check` | Task status, files touched, commands run/skipped, stale-summary check, next safe action or blocker, excluded raw content, and remaining resume risk |
 | A Codex or Hermes local session ID needs read-only reference for task evidence, restart context, failure diagnosis, or continuation planning across agent applications | `.mustflow/skills/cross-agent-session-reference/SKILL.md` | Session ID, source app evidence, current repository root, user goal, redaction requirements, available official session tools or read-only local storage evidence | Bounded session evidence summaries, continuation prompts, current-repository follow-up work, and directly synchronized reports only | foreign session mutation, transcript-as-authority drift, secret exposure, unrelated history dump, stale storage schema, or dispatching work into another app | `changes_status`, `changes_diff_summary`, `mustflow_check` | Source application confidence, read-only access method, extracted evidence, redactions, current verification, next safe action or ambiguity, and remaining stale-session or privacy risk |
 | Declared behavior must stay aligned across code, schemas, templates, tests, and docs | `.mustflow/skills/contract-sync-check/SKILL.md` | Changed files, intended behavior, source of truth, derived surfaces, and command contract entries | Contract source and required synchronized surfaces | contract drift | `changes_status`, `changes_diff_summary`, `docs_validate_fast`, `test_release`, `mustflow_check` | Contract source, synchronized surfaces, deferred surfaces, verification, and drift risk |