npm - mustflow - Versions diffs - 2.85.4 → 2.99.0 - Mend

mustflow 2.85.4 → 2.99.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (78)

package/templates/default/locales/en/.mustflow/skills/ci-pipeline-triage/SKILL.md ADDED

@@ -0,0 +1,200 @@
+---
+mustflow_doc: skill.ci-pipeline-triage
+locale: en
+canonical: true
+revision: 1
+lifecycle: mustflow-owned
+authority: procedure
+name: ci-pipeline-triage
+description: Apply this skill when a CI/CD workflow, pipeline, job, runner, matrix, trigger, cache, artifact, deployment job, required check, or post-deploy verification is failing, skipped, queued, flaky, slow, green despite broken output, or not yet localized to trigger, runner, environment, build, test, artifact, deploy, or verification boundaries.
+metadata:
+  mustflow_schema: "1"
+  mustflow_kind: procedure
+  pack_id: mustflow.core
+  skill_id: mustflow.core.ci-pipeline-triage
+  command_intents:
+    - changes_status
+    - changes_diff_summary
+    - lint
+    - build
+    - test_related
+    - test
+    - docs_validate_fast
+    - test_release
+    - mustflow_check
+---
+# CI Pipeline Triage
+<!-- mustflow-section: purpose -->
+## Purpose
+Localize CI/CD failures by splitting trigger, runner, environment, build, test, artifact, deploy,
+and verification boundaries before editing code or workflow files.
+The first question is not "what is the last red log line?" It is "which pipeline boundary first
+changed from the last known-good run, and what evidence would disprove each boundary hypothesis?"
+<!-- mustflow-section: use-when -->
+## Use When
+- A CI workflow, pipeline, job, matrix, required check, runner, cache, artifact, deployment step, or
+  smoke check fails, hangs, is skipped, is queued too long, passes while output is broken, or becomes
+  flaky.
+- A failure is not yet localized to trigger filters, workflow parsing, runner selection, environment
+  setup, tool versions, dependency cache, build output, test isolation, artifact transfer,
+  deployment permissions, rollout completion, or post-deploy verification.
+- A pipeline suddenly breaks without application-code changes, or only fails on forks, protected
+  branches, specific runners, specific regions, specific matrix entries, or reruns.
+<!-- mustflow-section: do-not-use-when -->
+## Do Not Use When
+- The failing command is a local configured intent and CI is not involved; use `failure-triage`.
+- The deployment is already localized and the risk is rollout, rollback, probes, migrations, or
+  runtime safety; use `deployment-rollout-safety-review`.
+- The task is only test-suite speed after the CI boundary is known; use
+  `test-suite-performance-review`.
+- The task requires live production secrets, destructive deploys, cloud-console writes, or
+  unconfigured remote commands. Preserve static evidence and report the manual boundary.
+<!-- mustflow-section: required-inputs -->
+## Required Inputs
+- Failure classification: pipeline not created, queued, job failed, flaky rerun, succeeded with bad
+  service output, deployment failed, or post-deploy verification failed.
+- Run identity ledger: commit SHA, branch or tag, trigger event, workflow file revision, matrix
+  entry, runner label and image, architecture, region, toolchain versions, package-manager version,
+  execution time, and run or job id.
+- Last-good comparison: last successful commit and first failing commit, including workflow files,
+  lockfiles, base images, shared scripts, secrets or permission scopes, runner labels, cache keys,
+  feature flags, deployment config, and required-check settings.
+- Boundary ledger: trigger, parsed job graph, matrix expansion, queue time, runner assignment,
+  checkout, environment variables, tool setup, dependency restore, build, tests, cache, artifacts,
+  deploy, smoke, and final status aggregation.
+- Evidence constraints: redaction needs for secrets, tokens, private URLs, environment values,
+  debug logs, artifacts, and diagnostic files.
+<!-- mustflow-section: preconditions -->
+## Preconditions
+- The task matches the Use When conditions and does not match the Do Not Use When exclusions.
+- Higher-priority instructions and `.mustflow/config/commands.toml` have been checked.
+- Required CI evidence is available, or missing evidence can be reported without guessing.
+- Secrets and private data are summarized as presence, length, hash, key name, or permission scope;
+  never copy raw secret values into logs, fixtures, docs, commits, or final reports.
+<!-- mustflow-section: allowed-edits -->
+## Allowed Edits
+- Add or tighten workflow triggers, path filters, matrix guards, version pinning, cache keys,
+  artifact manifests, status aggregation, debug evidence collection, secret-safe diagnostics,
+  timeout classification, runner labels, concurrency locks, environment validation, smoke checks,
+  test isolation, docs, and focused fixtures.
+- Add tests or docs that prove workflow contract behavior, package metadata, template output,
+  release checks, artifact identity, or command-contract mapping when the repository owns those
+  surfaces.
+- Do not add broad reruns, `continue-on-error`, `allow_failure`, `|| true`, blanket cache wipes,
+  floating `latest` references, unbounded debug logging, live deploy commands, or workflow rewrites
+  before the failing boundary is localized.
+<!-- mustflow-section: procedure -->
+## Procedure
+1. Classify the failure shape: not created, queued, job failed, flaky, green-but-bad-output,
+   deployment failed, or verification failed.
+2. Compare the last success with the first failure. Include workflow, lockfile, base image, shared
+   script, secret scope, runner, matrix, cache, environment, feature flag, and deployment changes.
+3. Preserve run identity before reruns overwrite the evidence. Record safe run id, commit, trigger,
+   runner, matrix, tool versions, queue time, start time, and artifact identity.
+4. Rerun only to test determinism. If the same commit and inputs produce different outcomes, treat
+   cache, time, order, network, shared resources, or runner state as first-class suspects.
+5. Check trigger and graph before job logs. Path filters, branch or tag filters, skipped required
+   checks, inherited workflows, matrix expansion, `needs`, and conditional steps can prevent the
+   intended job from existing.
+6. Check false green paths. Look for `continue-on-error`, allowed failures, shell pipelines that
+   ignore non-zero exits, status aggregation that only reads the final notification step, and tests
+   that upload failures as artifacts but return success.
+7. Split queue wait from execution time. Long queue time points to runner labels, concurrency
+   limits, unavailable images, resource quotas, or protected environment approvals, not build code.
+8. Reproduce in a clean environment only after the boundary is known. Prefer the same image,
+   architecture, tool versions, env shape, and lockfile over a developer machine with hidden global
+   state.
+9. Pin floating execution inputs. Base images, actions, plugins, package managers, runtime versions,
+   and shared script refs need stable identities or an explicit freshness policy.
+10. Inspect environment without leaking values. Compare variable presence, safe hashes, lengths,
+    names, permission scopes, timezone, locale, charset, clock, disk, inode, file descriptor,
+    process, and memory limits.
+11. Treat external calls as boundary evidence. Separate DNS, proxy, certificate, HTTP status,
+    retry count, response time, and credential scope, with secrets redacted.
+12. Replace sleeps with readiness evidence. Service containers, databases, queues, and app servers
+    should prove readiness through real health, query, or protocol checks.
+13. Classify cache and artifact separately. Cache is disposable acceleration; artifact is the built
+    output passed forward. Cache keys need lockfile, OS, architecture, runtime, and package-manager
+    dimensions. Artifacts need file list, size, hash, build SHA, and download verification.
+14. Verify that the tested artifact is the deployed artifact. Rebuilding during deploy can make CI
+    test one thing and production receive another.
+15. Check auth and permissions by execution context. Fork PRs, protected branches, environments,
+    OIDC identity, package publishing identity, cloud role, and repository token scopes can differ
+    across otherwise similar runs.
+16. For deployment jobs, require rollout evidence, readiness, smoke checks, error and latency
+    thresholds, and environment concurrency locks instead of treating a zero exit code as success.
+17. Preserve evidence before cleanup. Do not delete runners, caches, artifacts, temporary dirs, or
+    diagnostic logs until the boundary and redaction plan are clear.
+18. Apply the smallest localized fix and verify with the narrowest configured intent that covers the
+    changed workflow, package, docs, template, or test surface.
+<!-- mustflow-section: postconditions -->
+## Postconditions
+- The pipeline failure is localized to trigger, runner, environment, build, test, artifact, deploy,
+  verification, or a named evidence gap.
+- Last-good versus first-failure comparison, run identity, false-green risk, cache and artifact
+  behavior, permission scope, and rerun determinism are explicit where relevant.
+- Follow-up deployment, test performance, security, command-contract, or package-release work is
+  selected only after the CI boundary is localized.
+<!-- mustflow-section: verification -->
+## Verification
+Use configured oneshot command intents when available:
+- `changes_status`
+- `changes_diff_summary`
+- `lint`
+- `build`
+- `test_related`
+- `test`
+- `docs_validate_fast`
+- `test_release`
+- `mustflow_check`
+Prefer the narrowest configured intent that covers workflow docs, package metadata, template output,
+test fixtures, local reproduced behavior, or release-sensitive pipeline surfaces. Do not infer raw
+CI reruns, deploys, cloud shell commands, or provider dashboard writes outside the command contract.
+<!-- mustflow-section: failure-handling -->
+## Failure Handling
+- If run identity, last-good comparison, trigger graph, runner, cache, artifact, or permission
+  evidence is missing, report the missing field instead of guessing.
+- If debug logs contain secrets or private data, stop copying raw output and summarize safely.
+- If CI evidence requires remote provider access that is unavailable or unconfigured, report the
+  manual evidence boundary and continue with local workflow or static evidence.
+- If the boundary points to tests, deployment, secrets, permissions, artifacts, or command contracts,
+  switch to the narrower matching skill before editing that part.
+<!-- mustflow-section: output-format -->
+## Output Format
+- CI pipeline triaged
+- Failure shape and localized boundary
+- Run identity and last-good comparison
+- Trigger, runner, environment, build, test, cache, artifact, deploy, and verification findings
+- Hypotheses killed, still open, and selected follow-up boundary
+- Fix applied or recommended
+- Evidence level: provider run evidence, configured-test evidence, static review risk, manual-only,
+  missing, or not applicable
+- Command intents run
+- Skipped diagnostics and reasons
+- Remaining CI pipeline risk
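
Step 13 of the procedure above separates cache (disposable acceleration) from artifact (the built output passed forward). The following is a minimal Python sketch of that split, not part of the mustflow package. The function names `cache_key`, `artifact_manifest`, and `verify_download`, the `deps-` key prefix, and the manifest layout are all hypothetical and chosen only for illustration.

```python
import hashlib


def cache_key(lockfile_bytes, os_name, arch, runtime, package_manager):
    """Build a cache key that changes whenever any listed dimension changes."""
    # Hypothetical key layout: the skill names the dimensions, not a format.
    lock_hash = hashlib.sha256(lockfile_bytes).hexdigest()[:16]
    return "-".join(["deps", os_name, arch, runtime, package_manager, lock_hash])


def artifact_manifest(files, build_sha):
    """Record file list, size, hash, and build SHA for a built artifact.

    `files` maps a relative path to the file's bytes.
    """
    return {
        "build_sha": build_sha,
        "files": {
            name: {"size": len(data), "sha256": hashlib.sha256(data).hexdigest()}
            for name, data in sorted(files.items())
        },
    }


def verify_download(manifest, downloaded):
    """Return a list of mismatches; an empty list means the download matches."""
    problems = []
    for name, meta in manifest["files"].items():
        data = downloaded.get(name)
        if data is None:
            problems.append(f"missing: {name}")
        elif (len(data) != meta["size"]
              or hashlib.sha256(data).hexdigest() != meta["sha256"]):
            problems.append(f"changed: {name}")
    for name in downloaded:
        if name not in manifest["files"]:
            problems.append(f"unexpected: {name}")
    return problems
```

Under these assumptions, a stale cache only costs time because a changed lockfile yields a new key, while a deploy job that calls `verify_download` can refuse an artifact that differs from the one the test job produced (step 14).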

package/templates/default/locales/en/.mustflow/skills/clarifying-question-gate/SKILL.md CHANGED

@@ -2,11 +2,11 @@
 mustflow_doc: skill.clarifying-question-gate
 locale: en
 canonical: true
-revision: 1
+revision: 2
 lifecycle: mustflow-owned
 authority: procedure
 name: clarifying-question-gate
-description: Apply this skill when a coding task has missing intent, scope, domain, data, security, UX, dependency, architecture, or verification decisions that cannot be safely inferred from current repository evidence.
+description: Apply this skill when a coding task needs request-contract repair: missing intent, scope, completion evidence, domain, data, security, UX, dependency, architecture, or verification decisions cannot be safely inferred from current repository evidence. Use it to proceed with safe assumptions, ask bounded confirmation questions, or reroute conflicts without becoming a general prompt-writing skill.
 metadata:
   mustflow_schema: "1"
   mustflow_kind: procedure
@@ -23,12 +23,17 @@ metadata:
 <!-- mustflow-section: purpose -->
 ## Purpose
-Ask only the questions that protect the work from expensive wrong assumptions.
+Repair an ambiguous request into an executable task contract, and ask only the questions that
+protect the work from expensive wrong assumptions.
 Good agent work is not maximally autonomous and not maximally interrogative. It moves forward on
 cheap, reversible, repository-evident decisions, and stops before choices that are costly to undo or
 whose correct answer belongs to the user, product owner, security owner, or operations owner.
+The goal is not to make the user rewrite the prompt. Normalize the request inside the current task,
+state the interpretation when it matters, and continue unless a high-cost decision still needs
+confirmation.
 <!-- mustflow-section: use-when -->
 ## Use When
@@ -44,6 +49,7 @@ whose correct answer belongs to the user, product owner, security owner, or oper
   maintenance burden.
 - You are about to add a new dependency, service, folder boundary, storage model, framework pattern,
   persistent state, or broad refactor that the current files do not already require.
+- The request can be safely clarified by a short normalized contract instead of a long back-and-forth.
 <!-- mustflow-section: do-not-use-when -->
 ## Do Not Use When
@@ -54,9 +60,21 @@ whose correct answer belongs to the user, product owner, security owner, or oper
 - A more specific skill already requires a blocking question for the same risk and covers the whole
   decision, such as `structure-discovery-gate`, `auth-permission-change`, `database-migration-change`,
   `dependency-upgrade-review`, or `release-publish-change`.
+- The request is mainly to draft a task prompt, work order, issue, PR instruction, or handoff for
+  another agent; use `task-instruction-authoring`.
+- The work is a production prompt, prompt builder, RAG prompt, structured output, eval, or model/tool
+  policy; use `prompt-contract-quality-review`.
+- Repository, host, user, nested-project, command-contract, or generated instruction sources
+  conflict; use `instruction-conflict-scope-check`.
+- Hidden structural decisions dominate the task, such as a new data model, service boundary, storage
+  strategy, provider, public URL contract, or long-lived architecture choice; use
+  `structure-discovery-gate`.
 - Asking would only delegate ordinary engineering responsibility, such as "should I add tests?",
   "should I handle errors?", "what stack is this?", or "what style should I use?" when the repository
   already answers it.
+- The only useful output would be "copy this rewritten prompt and send it again." Produce a
+  normalized contract and proceed in the current conversation unless the user explicitly requested a
+  reusable prompt artifact or the request is too broken to execute.
 <!-- mustflow-section: required-inputs -->
 ## Required Inputs
@@ -68,6 +86,13 @@ whose correct answer belongs to the user, product owner, security owner, or oper
 - Reversibility classification for each decision: cheap/reversible, moderate, or expensive/hard to
   roll back.
 - A recommended option for each blocking question, with the tradeoff of at least one alternative.
+- A request-state decision: `ready`, `ready_with_assumptions`, `needs_confirmation`,
+  `blocked_by_conflict`, or `insufficient_evidence`.
+- A normalized task contract when the original request is vague enough to risk drift: goal, current
+  context, change scope, excluded scope, user-visible behavior, constraints, completion evidence,
+  verification, report format, and remaining risks.
+- Source tags for contract entries: `user_confirmed`, `repository_derived`, `safe_assumption`, or
+  `unresolved`.
 <!-- mustflow-section: preconditions -->
 ## Preconditions
@@ -79,6 +104,8 @@ whose correct answer belongs to the user, product owner, security owner, or oper
   scope.
 - Questions are limited to decisions that block safe implementation, not curiosity, preference
   collection, or broad product discovery.
+- Product decisions are separated from engineering responsibilities. Do not ask whether to preserve
+  existing style, avoid swallowed errors, add appropriate tests, or follow command contracts.
 <!-- mustflow-section: allowed-edits -->
 ## Allowed Edits
@@ -105,32 +132,68 @@ whose correct answer belongs to the user, product owner, security owner, or oper
      working;
    - `blocking_question`: stop before implementation because the wrong choice would be expensive,
      user-visible, security-sensitive, data-affecting, dependency-affecting, or hard to roll back.
-4. Ask about observable completion before feature shape when success is unclear:
+4. Choose exactly one request state:
+   - `ready`: no material ambiguity remains; proceed normally.
+   - `ready_with_assumptions`: only narrow reversible assumptions remain; proceed and report them.
+   - `needs_confirmation`: one or more user-owned, high-cost, or hard-to-reverse decisions must be
+     confirmed before implementation.
+   - `blocked_by_conflict`: instructions or command authority conflict; reroute to
+     `instruction-conflict-scope-check`.
+   - `insufficient_evidence`: more repository reading, reproduction, or scoped analysis is needed
+     before asking or implementing.
+5. Build a normalized task contract when the user request is underspecified but executable:
+   - goal;
+   - current context;
+   - change scope;
+   - excluded scope;
+   - user-visible behavior;
+   - constraints;
+   - completion evidence;
+   - verification;
+   - report format;
+   - remaining risks.
+   Tag each non-obvious contract entry as `user_confirmed`, `repository_derived`,
+   `safe_assumption`, or `unresolved`. Do not add new product requirements while normalizing.
+6. Ask about observable completion before feature shape when success is unclear:
    - what behavior proves the task is done;
    - which user path, command, test, screenshot, migration state, or registry/release state closes it.
-5. Ask about scope only when plausible scopes have different cost or risk:
+7. Ask about scope only when plausible scopes have different cost or risk:
    - minimal symptom fix, root-cause fix, or broader cleanup;
    - prototype, maintainable production path, or release-ready path.
-6. Ask about existing users and data before changing persistence, lifecycle, deletion, migration,
+8. Ask about existing users and data before changing persistence, lifecycle, deletion, migration,
    retention, cache, API compatibility, or old-client behavior.
-7. Ask about failure UX before implementing user-visible success flows where failure handling is a
+9. Ask about failure UX before implementing user-visible success flows where failure handling is a
    product decision: retry, queue, message, audit/log-only, rollback, partial success, or manual
    recovery.
-8. Ask about security and authorization before relying on UI hiding, client-side checks, roles,
+10. Ask about security and authorization before relying on UI hiding, client-side checks, roles,
    invites, team boundaries, file access, billing state, or admin features.
-9. Ask before adding or swapping dependencies, services, queues, databases, auth providers, design
+11. Ask before adding or swapping dependencies, services, queues, databases, auth providers, design
    systems, state managers, or major folder boundaries.
-10. Ask about verification when there is no declared command intent or when the user expects a
+12. Ask about verification when there is no declared command intent or when the user expects a
     specific proof beyond the repository's configured checks.
-11. Keep the question set short:
+13. Keep the question set short:
     - ask at most three questions at once;
+    - ask only one question when its answer may make later questions irrelevant;
     - each question must name the decision, the recommended choice, the consequence of that choice,
       and one meaningful alternative;
     - avoid open-ended prompts like "how should I implement this?" unless no responsible options can
       be framed from repository evidence.
-12. If no blocking question remains, proceed without ceremony. State only the assumptions that matter
+14. Do not ask bad engineering-delegation questions:
+    - "Should I add tests?"
+    - "Should I handle errors?"
+    - "Should I follow existing style?"
+    - "Should I check current files?"
+    - "Should I preserve existing behavior?"
+15. Use prompt rewriting only as an exception:
+    - the user explicitly asks for a prompt, issue, PR body, work order, or handoff for another
+      agent;
+    - the current request is too broken to execute and a normalized contract plus confirmation is the
+      smallest safe next step.
+    Otherwise, show the normalized contract only when it materially reduces drift, then proceed in
+    the same conversation.
+16. If no blocking question remains, proceed without ceremony. State only the assumptions that matter
     to review or rollback.
-13. If a blocking question remains unanswered, do not implement around it. Offer the smallest safe
+17. If a blocking question remains unanswered, do not implement around it. Offer the smallest safe
     non-blocked action, such as read-only analysis, a plan, a reproduction, or a narrow preparatory
     refactor when another selected skill supports it.
@@ -142,6 +205,9 @@ whose correct answer belongs to the user, product owner, security owner, or oper
 - Expensive, user-owned, security-sensitive, data-affecting, dependency-affecting, and public-contract
   decisions are resolved before implementation.
 - Safe assumptions are narrow, reversible, and reported.
+- Any normalized contract preserves the user's original request separately from repository-derived
+  facts and safe assumptions.
+- Prompt rewriting is not used as a substitute for proceeding in the current task.
 - The final work can be judged against observable success criteria or a reported verification gap.
 <!-- mustflow-section: verification -->
@@ -165,6 +231,10 @@ run the specific configured verification intents required by the selected implem
   the evidence if it affects the final report.
 - If a blocking question reveals a larger feature, switch to the relevant skill before editing that
   new scope.
+- If the issue is an instruction conflict rather than missing detail, switch to
+  `instruction-conflict-scope-check` instead of negotiating the conflict as a preference question.
+- If structural design owns the decision, switch to `structure-discovery-gate`; if a prompt artifact
+  or work order owns it, switch to `task-instruction-authoring` or `prompt-contract-quality-review`.
 - If the task becomes over-scoped, reduce the next action to the smallest safe slice with explicit
   acceptance evidence.
 - If verification intent is missing, report the missing command contract instead of inventing a raw
@@ -174,6 +244,10 @@ run the specific configured verification intents required by the selected implem
 ## Output Format
 - Repository evidence inspected
+- Request state: `ready`, `ready_with_assumptions`, `needs_confirmation`, `blocked_by_conflict`, or
+  `insufficient_evidence`
+- Normalized task contract, only when needed, with `user_confirmed`, `repository_derived`,
+  `safe_assumption`, and `unresolved` source tags
 - Blocking questions asked, with recommendation and tradeoff
 - Safe assumptions made
 - Decisions intentionally deferred
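
Revision 2 of this skill introduces five request states and four source tags for contract entries. The sketch below models them in Python to make the vocabulary concrete. It is not taken from the package: the `ContractEntry` class, the `choose_state` function, its parameters, and in particular the precedence order between states are assumptions, since the skill only says to choose exactly one state.

```python
from dataclasses import dataclass
from enum import Enum


class RequestState(Enum):
    READY = "ready"
    READY_WITH_ASSUMPTIONS = "ready_with_assumptions"
    NEEDS_CONFIRMATION = "needs_confirmation"
    BLOCKED_BY_CONFLICT = "blocked_by_conflict"
    INSUFFICIENT_EVIDENCE = "insufficient_evidence"


class Source(Enum):
    USER_CONFIRMED = "user_confirmed"
    REPOSITORY_DERIVED = "repository_derived"
    SAFE_ASSUMPTION = "safe_assumption"
    UNRESOLVED = "unresolved"


@dataclass
class ContractEntry:
    name: str       # e.g. "goal", "change scope", "completion evidence"
    value: str
    source: Source


def choose_state(entries, instruction_conflict=False, needs_more_evidence=False):
    """Pick exactly one request state for a normalized task contract.

    Assumed precedence (not stated in the skill): conflict first, then
    missing evidence, then unresolved entries, then safe assumptions.
    """
    if instruction_conflict:
        return RequestState.BLOCKED_BY_CONFLICT
    if needs_more_evidence:
        return RequestState.INSUFFICIENT_EVIDENCE
    sources = {entry.source for entry in entries}
    if Source.UNRESOLVED in sources:
        # Assumption: an unresolved entry is one that could not be safely
        # assumed, so it needs confirmation before implementation.
        return RequestState.NEEDS_CONFIRMATION
    if Source.SAFE_ASSUMPTION in sources:
        return RequestState.READY_WITH_ASSUMPTIONS
    return RequestState.READY
```

For example, a contract whose goal is `user_confirmed` and whose verification is a `safe_assumption` would resolve to `ready_with_assumptions` in this sketch, which matches the skill's instruction to proceed and report the assumptions.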

package/templates/default/locales/en/.mustflow/skills/docker-runtime-triage/SKILL.md ADDED

@@ -0,0 +1,191 @@
+---
+mustflow_doc: skill.docker-runtime-triage
+locale: en
+canonical: true
+revision: 1
+lifecycle: mustflow-owned
+authority: procedure
+name: docker-runtime-triage
+description: Apply this skill when a Docker Engine, Docker Desktop, Docker Compose, container start, crash loop, health check, image pull, build cache, port mapping, DNS, network, volume, bind mount, storage, proxy, registry, Docker context, daemon, cgroup, OOM, signal handling, PID 1, or container runtime symptom is failing, slow, intermittent, or not yet localized to host, daemon, image, Compose config, app process, network, storage, resource, or registry boundaries.
+metadata:
+  mustflow_schema: "1"
+  mustflow_kind: procedure
+  pack_id: mustflow.core
+  skill_id: mustflow.core.docker-runtime-triage
+  command_intents:
+    - changes_status
+    - changes_diff_summary
+    - lint
+    - build
+    - test_related
+    - test
+    - docs_validate_fast
+    - test_release
+    - mustflow_check
+---
+# Docker Runtime Triage
+<!-- mustflow-section: purpose -->
+## Purpose
+Localize Docker and container runtime failures before blaming application code, Docker itself, or
+the most recent Dockerfile edit.
+<!-- mustflow-section: use-when -->
+## Use When
+- A container fails to start, exits immediately, restarts repeatedly, is unhealthy, cannot pull or
+  find an image, cannot bind a port, cannot resolve DNS, cannot reach another service, loses data,
+  grows disk usage, OOMs, receives wrong signals, or behaves differently under Compose.
+- The task is to diagnose Docker Engine, Docker Desktop, daemon, context, image store, registry,
+  proxy, network, mount, volume, resource, health, Compose, build, or runtime behavior.
+- Evidence may be lost by pruning, rebuilding, restarting, or forcing recreation before the current
+  container, image, event, and daemon state are captured.
+<!-- mustflow-section: do-not-use-when -->
+## Do Not Use When
+- The task only edits Dockerfiles, Compose files, CI image builds, SBOM, provenance, image tags, or
+  container security posture; use `docker-code-change`.
+- The task is already localized to an application-level API, database, cache, queue, auth, or
+  performance bug inside the running container; use the narrower owning skill.
+- The user asks for destructive cleanup, prune, image deletion, volume deletion, or daemon reset
+  without explicit approval and preserved evidence.
+<!-- mustflow-section: required-inputs -->
+## Required Inputs
+- Runtime packet: current time, Docker client/server versions, active Docker context, relevant
+  environment variables, daemon warnings, host OS, storage driver, cgroup mode, and Docker Desktop
+  or Engine boundary.
+- Container ledger: stopped and running containers, full command, image id, state, restart policy,
+  exit code, OOMKilled flag, health status, start and finish times, logs around the failure window,
+  and recent runtime events.
+- Actual config ledger: image, entrypoint, command, environment, user, working directory, mounts,
+  networks, published ports, exposed ports, labels, resource limits, health check, and restart
+  policy from the running container or rendered Compose config.
+- Host resource ledger: CPU, memory, swap, disk bytes, inode use, Docker system usage, image store
+  mode, build cache, volume usage, and kernel OOM or storage errors when available.
+- Network ledger: container network, aliases, container IP, route, resolver config, DNS result,
+  port listener address, host port mapping, proxy settings, MTU or VPN suspicion, and firewall
+  boundary.
+- Storage ledger: bind mounts, named volumes, writable layer changes, missing files hidden by
+  mounts, generated host paths, persistent data location, and cleanup risk.
+<!-- mustflow-section: preconditions -->
+## Preconditions
+- The task matches the Use When conditions and does not match the Do Not Use When exclusions.
+- Higher-priority instructions and `.mustflow/config/commands.toml` have been checked.
+- Evidence capture comes before destructive cleanup, prune, rebuild, restart loops, volume deletion,
+  forced recreation, or broad firewall changes.
+<!-- mustflow-section: allowed-edits -->
+## Allowed Edits
+- Add or tighten Dockerfile, Compose, health check, entrypoint, signal handling, port binding,
+  network, volume, resource-limit, `.dockerignore`, docs, fixtures, and tests only after the failing
+  boundary is localized.
+- Add focused tests or docs that preserve the corrected runtime contract.
+- Do not run or document inferred long-running servers, background containers, destructive prune
+  actions, broad firewall resets, registry pushes, or credentialed image pulls outside configured
+  command intents.
+<!-- mustflow-section: procedure -->
+## Procedure
+1. Capture the runtime packet before cleanup. Separate Docker client, server, context, daemon,
+   Desktop, host OS, storage driver, cgroup, image store, and proxy evidence.
+2. Prove whether the host and daemon can run any known-small container before blaming the
+   application image. If that boundary fails, classify the issue as host, daemon, registry, or
+   runtime setup rather than app code.
+3. Compare image pull, image existence, container creation, process start, health, and app readiness
+   as separate phases. A successful pull does not prove runtime start, and a started process does
+   not prove readiness.
+4. Inspect stopped containers and full state, not only currently running containers. Preserve exit
+   code, OOMKilled, restart count, error, health, started and finished times, and recent events.
+5. Treat restart policy as evidence mutator. If a loop hides the first error, report the need to
+   pause or disable restart behavior before drawing conclusions.
+6. Separate container logs from daemon logs. Empty app logs can mean the process never started,
+   logged elsewhere, used a nonstandard logging driver, or failed before stdout and stderr existed.
+7. Do not treat exit code 137 as automatic OOM. Compare OOMKilled, kernel evidence, manual kill,
+   stop timeout, and signal handling before deciding.
+8. Check PID 1 and signal behavior when stops are slow or children survive. Prefer exec-form
+   entrypoints, init handling, and graceful shutdown evidence when the localized fix owns the image.
+9. Compare resource usage against limits. CPU, memory, I/O, and network numbers are meaningless
+   without container and host limits, pool pressure, and restart history.
+10. Split disk bytes from inode exhaustion and writable-layer growth. Do not prune before naming
+    whether images, containers, volumes, build cache, logs, or bind mounts own the growth.
+11. Check actual mounts before trusting image contents. Bind mounts can hide files built into the
+    image, and mistaken host paths can create directories where files were expected.
+12. Split network failures into DNS, route, TCP connect, TLS, HTTP, listener address, port mapping,
+    Docker network membership, proxy, firewall, MTU, and VPN boundaries.
+13. Remember that container `localhost` is the same container. For Compose-style service calls,
+    verify service names, aliases, networks, and whether the target process listens on an external
+    interface instead of loopback only.
+14. Render Compose config before interpreting it. Variable substitution, `.env`, shell environment,
+    overrides, profiles, relative paths, and service health conditions can change the actual
+    container contract.
+15. Separate start order from readiness. `depends_on`-style sequencing needs health or application
+    retry evidence before it is treated as a working dependency contract.
+16. Separate tag names from image identity. Compare image id, digest, architecture, pull timing, and
+    forced recreation behavior when "new image deployed" is part of the claim.
+17. For build failures, separate context content, ignored files, base-image pull, cache reuse,
+    stage-specific cache invalidation, native dependencies, and final runtime contents.
+18. Once the boundary is localized, switch to `docker-code-change`, language-specific skills,
+    network, storage, process, API, database, cache, or observability skills for the owning fix.
+<!-- mustflow-section: postconditions -->
+## Postconditions
+- Host, daemon, context, image, container, Compose, app process, network, storage, resource, proxy,
+  registry, and build boundaries are localized or named as evidence gaps.
+- Destructive cleanup, broad firewall reset, rebuild, restart, force recreate, or prune was not used
+  as a substitute for evidence.
+- Any source edit is tied to the localized runtime boundary.
+<!-- mustflow-section: verification -->
+## Verification
+Use configured oneshot command intents when available:
+- `changes_status`
+- `changes_diff_summary`
+- `lint`
+- `build`
+- `test_related`
+- `test`
+- `docs_validate_fast`
+- `test_release`
+- `mustflow_check`
+Report missing Docker daemon, Compose rendering, image build, runtime smoke, health, network,
+volume, inspect, event, vulnerability, SBOM, provenance, registry, or Desktop diagnostic evidence
+instead of inventing raw Docker commands.
+<!-- mustflow-section: failure-handling -->
+## Failure Handling
+- If the container or daemon evidence was already destroyed, report the missing evidence and use the
+  next reproducible packet rather than reconstructing from memory.
+- If a destructive cleanup appears necessary, stop and ask for explicit approval after naming the
+  evidence that will be lost.
+- If credentials, registry tokens, private environment variables, host paths, or user data appear in
+  evidence, redact before storing or reporting.
+- If configured verification fails, preserve the failing intent and output tail, then fix only the
+  localized boundary.
+<!-- mustflow-section: output-format -->
+## Output Format
+- Docker runtime triaged
+- Host, daemon, context, image, container, Compose, process, resource, storage, network, proxy,
+  registry, and build findings
+- Evidence preserved and evidence missing
+- Fix applied or recommended
+- Evidence level: configured-test evidence, static review risk, manual-only, missing, or not
+  applicable
+- Command intents run
+- Skipped Docker diagnostics and reasons
+- Remaining Docker runtime risk
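
Step 7 of the procedure above warns against reading exit code 137 as an automatic OOM. Exit code 137 is 128 + 9, meaning the process ended on SIGKILL, and SIGKILL has several possible senders. The Python sketch below illustrates that reasoning over a dictionary shaped like the `State` object of `docker inspect` (which carries `ExitCode` and `OOMKilled`). The function name `classify_exit`, the `stop_requested` parameter, and the returned labels are hypothetical, and the skill itself asks for configured command intents rather than raw Docker commands, so this only shows how captured evidence might be interpreted.

```python
def classify_exit(state, stop_requested=False):
    """Classify a container exit from captured state evidence.

    `state` is a dict shaped like the `State` object of `docker inspect`.
    `stop_requested` is a hypothetical flag: True when event evidence shows
    a stop was issued shortly before the exit.
    """
    code = state.get("ExitCode")
    if state.get("OOMKilled"):
        # The runtime itself reported an out-of-memory kill.
        return "oom_killed"
    if code == 137:  # 128 + 9: the process received SIGKILL
        if stop_requested:
            # A stop sends a termination signal first and SIGKILL only
            # after the grace period, so this points at slow shutdown.
            return "killed_after_stop_timeout"
        # Could be a manual kill or a host-level kill; kernel evidence
        # is needed before deciding.
        return "sigkill_unattributed"
    if code is not None and code > 128:
        # Conventionally 128 + signal number, though an application can
        # also choose such a code on its own.
        return f"likely_signal_{code - 128}"
    if code == 0:
        return "clean_exit"
    return "app_error"
```

The `sigkill_unattributed` branch is deliberately a non-answer: it names the evidence gap instead of guessing, in line with the skill's failure handling.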