mustflow 2.85.4 → 2.99.0

Files changed (78)
  1. package/dist/cli/commands/script-pack.js +10 -0
  2. package/dist/cli/i18n/en.js +183 -0
  3. package/dist/cli/i18n/es.js +183 -0
  4. package/dist/cli/i18n/fr.js +183 -0
  5. package/dist/cli/i18n/hi.js +183 -0
  6. package/dist/cli/i18n/ko.js +183 -0
  7. package/dist/cli/i18n/zh.js +183 -0
  8. package/dist/cli/lib/script-pack-registry.js +284 -1
  9. package/dist/cli/script-packs/code-change-impact.js +6 -0
  10. package/dist/cli/script-packs/code-import-cycle.js +193 -0
  11. package/dist/cli/script-packs/docs-link-integrity.js +145 -0
  12. package/dist/cli/script-packs/repo-approval-gate.js +100 -0
  13. package/dist/cli/script-packs/repo-git-ignore-audit.js +119 -0
  14. package/dist/cli/script-packs/repo-manifest-lock-drift.js +122 -0
  15. package/dist/cli/script-packs/repo-merge-conflict-scan.js +123 -0
  16. package/dist/cli/script-packs/repo-skill-route-audit.js +86 -0
  17. package/dist/cli/script-packs/repo-version-source.js +92 -0
  18. package/dist/cli/script-packs/test-performance-report.js +247 -0
  19. package/dist/cli/script-packs/test-regression-selector.js +167 -0
  20. package/dist/core/change-impact.js +23 -51
  21. package/dist/core/change-surface-classification.js +198 -0
  22. package/dist/core/docs-link-integrity.js +443 -0
  23. package/dist/core/import-cycle.js +152 -0
  24. package/dist/core/public-json-contracts.js +116 -0
  25. package/dist/core/repo-approval-gate.js +116 -0
  26. package/dist/core/repo-git-ignore-audit.js +302 -0
  27. package/dist/core/repo-manifest-lock-drift.js +321 -0
  28. package/dist/core/repo-merge-conflict-scan.js +335 -0
  29. package/dist/core/repo-version-source.js +82 -0
  30. package/dist/core/script-pack-suggestions.js +77 -1
  31. package/dist/core/skill-route-audit.js +354 -0
  32. package/dist/core/test-performance-report.js +697 -0
  33. package/dist/core/test-regression-selector.js +335 -0
  34. package/package.json +1 -1
  35. package/schemas/README.md +40 -2
  36. package/schemas/change-impact-report.schema.json +35 -1
  37. package/schemas/import-cycle-report.schema.json +157 -0
  38. package/schemas/link-integrity-report.schema.json +176 -0
  39. package/schemas/repo-approval-gate-report.schema.json +115 -0
  40. package/schemas/repo-git-ignore-audit-report.schema.json +201 -0
  41. package/schemas/repo-manifest-lock-drift-report.schema.json +202 -0
  42. package/schemas/repo-merge-conflict-scan-report.schema.json +169 -0
  43. package/schemas/repo-version-source-report.schema.json +127 -0
  44. package/schemas/skill-route-audit-report.schema.json +144 -0
  45. package/schemas/test-performance-report.schema.json +319 -0
  46. package/schemas/test-regression-selector-report.schema.json +187 -0
  47. package/templates/default/i18n.toml +66 -18
  48. package/templates/default/locales/en/.mustflow/skills/INDEX.md +45 -8
  49. package/templates/default/locales/en/.mustflow/skills/api-access-control-review/SKILL.md +48 -27
  50. package/templates/default/locales/en/.mustflow/skills/api-failure-triage/SKILL.md +270 -0
  51. package/templates/default/locales/en/.mustflow/skills/auth-flow-triage/SKILL.md +192 -0
  52. package/templates/default/locales/en/.mustflow/skills/auth-permission-change/SKILL.md +59 -13
  53. package/templates/default/locales/en/.mustflow/skills/backend-log-evidence-review/SKILL.md +14 -5
  54. package/templates/default/locales/en/.mustflow/skills/cache-integrity-review/SKILL.md +30 -15
  55. package/templates/default/locales/en/.mustflow/skills/change-blast-radius-review/SKILL.md +45 -32
  56. package/templates/default/locales/en/.mustflow/skills/ci-pipeline-triage/SKILL.md +200 -0
  57. package/templates/default/locales/en/.mustflow/skills/clarifying-question-gate/SKILL.md +87 -13
  58. package/templates/default/locales/en/.mustflow/skills/docker-runtime-triage/SKILL.md +191 -0
  59. package/templates/default/locales/en/.mustflow/skills/go-code-change/SKILL.md +18 -13
  60. package/templates/default/locales/en/.mustflow/skills/line-ending-hygiene/SKILL.md +18 -10
  61. package/templates/default/locales/en/.mustflow/skills/llm-hallucination-control-review/SKILL.md +4 -1
  62. package/templates/default/locales/en/.mustflow/skills/motion-system-contract-review/SKILL.md +155 -0
  63. package/templates/default/locales/en/.mustflow/skills/next-action-menu/SKILL.md +177 -0
  64. package/templates/default/locales/en/.mustflow/skills/observability-debuggability-review/SKILL.md +15 -7
  65. package/templates/default/locales/en/.mustflow/skills/payment-integrity-review/SKILL.md +59 -35
  66. package/templates/default/locales/en/.mustflow/skills/powershell-code-change/SKILL.md +16 -6
  67. package/templates/default/locales/en/.mustflow/skills/prompt-contract-quality-review/SKILL.md +4 -1
  68. package/templates/default/locales/en/.mustflow/skills/python-code-change/SKILL.md +19 -10
  69. package/templates/default/locales/en/.mustflow/skills/rag-pipeline-triage/SKILL.md +206 -0
  70. package/templates/default/locales/en/.mustflow/skills/routes.toml +54 -0
  71. package/templates/default/locales/en/.mustflow/skills/rust-code-change/SKILL.md +10 -4
  72. package/templates/default/locales/en/.mustflow/skills/search-index-integrity-review/SKILL.md +181 -0
  73. package/templates/default/locales/en/.mustflow/skills/service-boundary-architecture/SKILL.md +37 -23
  74. package/templates/default/locales/en/.mustflow/skills/test-suite-performance-review/SKILL.md +9 -0
  75. package/templates/default/locales/en/.mustflow/skills/typescript-code-change/SKILL.md +14 -9
  76. package/templates/default/locales/en/.mustflow/skills/vector-search-integrity-review/SKILL.md +209 -0
  77. package/templates/default/locales/en/.mustflow/skills/version-freshness-check/SKILL.md +16 -14
  78. package/templates/default/manifest.toml +64 -1
@@ -0,0 +1,200 @@
1
+ ---
2
+ mustflow_doc: skill.ci-pipeline-triage
3
+ locale: en
4
+ canonical: true
5
+ revision: 1
6
+ lifecycle: mustflow-owned
7
+ authority: procedure
8
+ name: ci-pipeline-triage
9
+ description: Apply this skill when a CI/CD workflow, pipeline, job, runner, matrix, trigger, cache, artifact, deployment job, required check, or post-deploy verification is failing, skipped, queued, flaky, slow, green despite broken output, or not yet localized to trigger, runner, environment, build, test, artifact, deploy, or verification boundaries.
10
+ metadata:
11
+ mustflow_schema: "1"
12
+ mustflow_kind: procedure
13
+ pack_id: mustflow.core
14
+ skill_id: mustflow.core.ci-pipeline-triage
15
+ command_intents:
16
+ - changes_status
17
+ - changes_diff_summary
18
+ - lint
19
+ - build
20
+ - test_related
21
+ - test
22
+ - docs_validate_fast
23
+ - test_release
24
+ - mustflow_check
25
+ ---
26
+
27
+ # CI Pipeline Triage
28
+
29
+ <!-- mustflow-section: purpose -->
30
+ ## Purpose
31
+
32
+ Localize CI/CD failures by splitting trigger, runner, environment, build, test, artifact, deploy,
33
+ and verification boundaries before editing code or workflow files.
34
+
35
+ The first question is not "what is the last red log line?" It is "which pipeline boundary first
36
+ changed from the last known-good run, and what evidence would disprove each boundary hypothesis?"
37
+
38
+ <!-- mustflow-section: use-when -->
39
+ ## Use When
40
+
41
+ - A CI workflow, pipeline, job, matrix, required check, runner, cache, artifact, deployment step, or
42
+ smoke check fails, hangs, is skipped, is queued too long, passes while output is broken, or becomes
43
+ flaky.
44
+ - A failure is not yet localized to trigger filters, workflow parsing, runner selection, environment
45
+ setup, tool versions, dependency cache, build output, test isolation, artifact transfer,
46
+ deployment permissions, rollout completion, or post-deploy verification.
47
+ - A pipeline suddenly breaks without application-code changes, or only fails on forks, protected
48
+ branches, specific runners, specific regions, specific matrix entries, or reruns.
49
+
50
+ <!-- mustflow-section: do-not-use-when -->
51
+ ## Do Not Use When
52
+
53
+ - The failing command is a local configured intent and CI is not involved; use `failure-triage`.
54
+ - The deployment is already localized and the risk is rollout, rollback, probes, migrations, or
55
+ runtime safety; use `deployment-rollout-safety-review`.
56
+ - The task is only test-suite speed after the CI boundary is known; use
57
+ `test-suite-performance-review`.
58
+ - The task requires live production secrets, destructive deploys, cloud-console writes, or
59
+ unconfigured remote commands. Preserve static evidence and report the manual boundary.
60
+
61
+ <!-- mustflow-section: required-inputs -->
62
+ ## Required Inputs
63
+
64
+ - Failure classification: pipeline not created, queued, job failed, flaky rerun, succeeded with bad
65
+ service output, deployment failed, or post-deploy verification failed.
66
+ - Run identity ledger: commit SHA, branch or tag, trigger event, workflow file revision, matrix
67
+ entry, runner label and image, architecture, region, toolchain versions, package-manager version,
68
+ execution time, and run or job id.
69
+ - Last-good comparison: last successful commit and first failing commit, including workflow files,
70
+ lockfiles, base images, shared scripts, secrets or permission scopes, runner labels, cache keys,
71
+ feature flags, deployment config, and required-check settings.
72
+ - Boundary ledger: trigger, parsed job graph, matrix expansion, queue time, runner assignment,
73
+ checkout, environment variables, tool setup, dependency restore, build, tests, cache, artifacts,
74
+ deploy, smoke, and final status aggregation.
75
+ - Evidence constraints: redaction needs for secrets, tokens, private URLs, environment values,
76
+ debug logs, artifacts, and diagnostic files.
77
+
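A run identity ledger and last-good comparison can be sketched as a field-by-field diff of two records. This is a minimal illustration, not part of the skill: `RunIdentity`, its field subset, and `changed_fields` are hypothetical names, and the values are made-up placeholders.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunIdentity:
    """Hypothetical subset of the run identity ledger fields."""
    commit_sha: str
    ref: str
    trigger_event: str
    workflow_revision: str
    runner_image: str
    architecture: str
    toolchain: str
    lockfile_hash: str


def changed_fields(last_good: RunIdentity, first_bad: RunIdentity) -> dict[str, tuple[str, str]]:
    """Return {field: (last_good_value, first_bad_value)} for every field that differs."""
    good, bad = asdict(last_good), asdict(first_bad)
    return {name: (good[name], bad[name]) for name in good if good[name] != bad[name]}


# Placeholder values for illustration only.
last_good = RunIdentity("a1b2c3", "refs/heads/main", "push", "wf-rev-7",
                        "ubuntu-22.04", "x64", "node-20.11.0", "lock-111")
first_bad = RunIdentity("d4e5f6", "refs/heads/main", "push", "wf-rev-7",
                        "ubuntu-24.04", "x64", "node-20.11.0", "lock-111")

for name, (before, after) in changed_fields(last_good, first_bad).items():
    print(f"{name}: {before} -> {after}")
```

Here the diff reports `commit_sha` and `runner_image`, so the runner image change becomes a boundary hypothesis alongside the code change rather than an afterthought.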
78
+ <!-- mustflow-section: preconditions -->
79
+ ## Preconditions
80
+
81
+ - The task matches the Use When conditions and does not match the Do Not Use When exclusions.
82
+ - Higher-priority instructions and `.mustflow/config/commands.toml` have been checked.
83
+ - Required CI evidence is available, or missing evidence can be reported without guessing.
84
+ - Secrets and private data are summarized as presence, length, hash, key name, or permission scope;
85
+ never copy raw secret values into logs, fixtures, docs, commits, or final reports.
86
+
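One way to summarize a secret as presence, length, hash, and key name without copying its value is sketched below. `summarize_secret` and the 12-character digest prefix are assumptions made for illustration. A plain SHA-256 prefix can still be brute-forced for short or low-entropy values, so a keyed hash may be the safer choice in practice.

```python
import hashlib
from typing import Optional


def summarize_secret(name: str, value: Optional[str]) -> dict:
    """Describe a secret by key name, presence, length, and a short hash prefix.

    The raw value is never included in the returned summary.
    """
    if not value:
        return {"name": name, "present": False, "length": 0, "sha256_prefix": None}
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return {
        "name": name,
        "present": True,
        "length": len(value),
        "sha256_prefix": digest[:12],  # enough to compare two runs, not to recover the value
    }


# Placeholder value; not a real credential.
print(summarize_secret("DEPLOY_TOKEN", "example-not-a-real-token"))
print(summarize_secret("MISSING_TOKEN", None))
```

Comparing the summaries from a passing and a failing run shows whether a secret was absent, truncated, or rotated, which is usually all the triage needs.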
87
+ <!-- mustflow-section: allowed-edits -->
88
+ ## Allowed Edits
89
+
90
+ - Add or tighten workflow triggers, path filters, matrix guards, version pinning, cache keys,
91
+ artifact manifests, status aggregation, debug evidence collection, secret-safe diagnostics,
92
+ timeout classification, runner labels, concurrency locks, environment validation, smoke checks,
93
+ test isolation, docs, and focused fixtures.
94
+ - Add tests or docs that prove workflow contract behavior, package metadata, template output,
95
+ release checks, artifact identity, or command-contract mapping when the repository owns those
96
+ surfaces.
97
+ - Do not add broad reruns, `continue-on-error`, `allow_failure`, `|| true`, blanket cache wipes,
98
+ floating `latest` references, unbounded debug logging, live deploy commands, or workflow rewrites
99
+ before the failing boundary is localized.
100
+
101
+ <!-- mustflow-section: procedure -->
102
+ ## Procedure
103
+
104
+ 1. Classify the failure shape: not created, queued, job failed, flaky, green-but-bad-output,
105
+ deployment failed, or verification failed.
106
+ 2. Compare the last success with the first failure. Include workflow, lockfile, base image, shared
107
+ script, secret scope, runner, matrix, cache, environment, feature flag, and deployment changes.
108
+ 3. Preserve run identity before reruns overwrite the evidence. Record safe run id, commit, trigger,
109
+ runner, matrix, tool versions, queue time, start time, and artifact identity.
110
+ 4. Rerun only to test determinism. If the same commit and inputs produce different outcomes, treat
111
+ cache, time, order, network, shared resources, or runner state as first-class suspects.
112
+ 5. Check trigger and graph before job logs. Path filters, branch or tag filters, skipped required
113
+ checks, inherited workflows, matrix expansion, `needs`, and conditional steps can prevent the
114
+ intended job from existing.
115
+ 6. Check false green paths. Look for `continue-on-error`, allowed failures, shell pipelines that
116
+ ignore non-zero exits, status aggregation that only reads the final notification step, and tests
117
+ that upload failures as artifacts but return success.
118
+ 7. Split queue wait from execution time. Long queue time points to runner labels, concurrency
119
+ limits, unavailable images, resource quotas, or protected environment approvals, not build code.
120
+ 8. Reproduce in a clean environment only after the boundary is known. Prefer the same image,
121
+ architecture, tool versions, env shape, and lockfile over a developer machine with hidden global
122
+ state.
123
+ 9. Pin floating execution inputs. Base images, actions, plugins, package managers, runtime versions,
124
+ and shared script refs need stable identities or an explicit freshness policy.
125
+ 10. Inspect environment without leaking values. Compare variable presence, safe hashes, lengths,
126
+ names, permission scopes, timezone, locale, charset, clock, disk, inode, file descriptor,
127
+ process, and memory limits.
128
+ 11. Treat external calls as boundary evidence. Separate DNS, proxy, certificate, HTTP status,
129
+ retry count, response time, and credential scope, with secrets redacted.
130
+ 12. Replace sleeps with readiness evidence. Service containers, databases, queues, and app servers
131
+ should prove readiness through real health, query, or protocol checks.
132
+ 13. Classify cache and artifact separately. Cache is disposable acceleration; artifact is the built
133
+ output passed forward. Cache keys need lockfile, OS, architecture, runtime, and package-manager
134
+ dimensions. Artifacts need file list, size, hash, build SHA, and download verification.
135
+ 14. Verify that the tested artifact is the deployed artifact. Rebuilding during deploy can make CI
136
+ test one thing and production receive another.
137
+ 15. Check auth and permissions by execution context. Fork PRs, protected branches, environments,
138
+ OIDC identity, package publishing identity, cloud role, and repository token scopes can differ
139
+ across otherwise similar runs.
140
+ 16. For deployment jobs, require rollout evidence, readiness, smoke checks, error and latency
141
+ thresholds, and environment concurrency locks instead of treating a zero exit code as success.
142
+ 17. Preserve evidence before cleanup. Do not delete runners, caches, artifacts, temporary dirs, or
143
+ diagnostic logs until the boundary and redaction plan are clear.
144
+ 18. Apply the smallest localized fix and verify with the narrowest configured intent that covers the
145
+ changed workflow, package, docs, template, or test surface.
146
+
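Steps 13 and 14 of the procedure above can be sketched as two small helpers: a cache key built from the named dimensions, and an artifact manifest that is re-checked after download. This is a hedged sketch under stated assumptions: `cache_key`, `artifact_manifest`, and `verify_artifact` are hypothetical names, the key layout is arbitrary, and real CI providers expose their own cache-key and artifact mechanisms.

```python
import hashlib
import platform
import sys
import tempfile
from pathlib import Path


def cache_key(lockfile: Path, package_manager: str) -> str:
    """Build a cache key from OS, architecture, runtime, package manager, and lockfile hash."""
    lock_digest = hashlib.sha256(lockfile.read_bytes()).hexdigest()[:16]
    runtime = f"py{sys.version_info.major}.{sys.version_info.minor}"
    parts = ["deps", platform.system().lower(), platform.machine().lower(),
             runtime, package_manager, lock_digest]
    return "-".join(parts)


def artifact_manifest(root: Path, build_sha: str) -> dict:
    """List every file under root with its size and SHA-256, tagged with the build SHA."""
    files = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        data = path.read_bytes()
        files.append({
            "path": path.relative_to(root).as_posix(),
            "size": len(data),
            "sha256": hashlib.sha256(data).hexdigest(),
        })
    return {"build_sha": build_sha, "files": files}


def verify_artifact(root: Path, manifest: dict) -> bool:
    """Recompute the manifest for a downloaded artifact and compare it with the recorded one."""
    return artifact_manifest(root, manifest["build_sha"]) == manifest


# Demo with throwaway files; names and contents are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "dist"
    out.mkdir()
    (out / "app.js").write_text("console.log('ok')\n")
    recorded = artifact_manifest(out, build_sha="abc123")
    print(verify_artifact(out, recorded))
```

Keeping the cache key and the manifest separate mirrors the distinction in step 13: a cache miss only costs time, while a manifest mismatch means the deployed output is not the tested output.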
147
+ <!-- mustflow-section: postconditions -->
148
+ ## Postconditions
149
+
150
+ - The pipeline failure is localized to trigger, runner, environment, build, test, artifact, deploy,
151
+ verification, or a named evidence gap.
152
+ - Last-good versus first-failure comparison, run identity, false-green risk, cache and artifact
153
+ behavior, permission scope, and rerun determinism are explicit where relevant.
154
+ - Follow-up deployment, test performance, security, command-contract, or package-release work is
155
+ selected only after the CI boundary is localized.
156
+
157
+ <!-- mustflow-section: verification -->
158
+ ## Verification
159
+
160
+ Use configured oneshot command intents when available:
161
+
162
+ - `changes_status`
163
+ - `changes_diff_summary`
164
+ - `lint`
165
+ - `build`
166
+ - `test_related`
167
+ - `test`
168
+ - `docs_validate_fast`
169
+ - `test_release`
170
+ - `mustflow_check`
171
+
172
+ Prefer the narrowest configured intent that covers workflow docs, package metadata, template output,
173
+ test fixtures, local reproduced behavior, or release-sensitive pipeline surfaces. Do not infer raw
174
+ CI reruns, deploys, cloud shell commands, or provider dashboard writes outside the command contract.
175
+
176
+ <!-- mustflow-section: failure-handling -->
177
+ ## Failure Handling
178
+
179
+ - If run identity, last-good comparison, trigger graph, runner, cache, artifact, or permission
180
+ evidence is missing, report the missing field instead of guessing.
181
+ - If debug logs contain secrets or private data, stop copying raw output and summarize safely.
182
+ - If CI evidence requires remote provider access that is unavailable or unconfigured, report the
183
+ manual evidence boundary and continue with local workflow or static evidence.
184
+ - If the boundary points to tests, deployment, secrets, permissions, artifacts, or command contracts,
185
+ switch to the narrower matching skill before editing that part.
186
+
187
+ <!-- mustflow-section: output-format -->
188
+ ## Output Format
189
+
190
+ - CI pipeline triaged
191
+ - Failure shape and localized boundary
192
+ - Run identity and last-good comparison
193
+ - Trigger, runner, environment, build, test, cache, artifact, deploy, and verification findings
194
+ - Hypotheses killed, still open, and selected follow-up boundary
195
+ - Fix applied or recommended
196
+ - Evidence level: provider run evidence, configured-test evidence, static review risk, manual-only,
197
+ missing, or not applicable
198
+ - Command intents run
199
+ - Skipped diagnostics and reasons
200
+ - Remaining CI pipeline risk
@@ -2,11 +2,11 @@
2
2
  mustflow_doc: skill.clarifying-question-gate
3
3
  locale: en
4
4
  canonical: true
5
- revision: 1
5
+ revision: 2
6
6
  lifecycle: mustflow-owned
7
7
  authority: procedure
8
8
  name: clarifying-question-gate
9
- description: Apply this skill when a coding task has missing intent, scope, domain, data, security, UX, dependency, architecture, or verification decisions that cannot be safely inferred from current repository evidence.
9
+ description: Apply this skill when a coding task needs request-contract repair: missing intent, scope, completion evidence, domain, data, security, UX, dependency, architecture, or verification decisions cannot be safely inferred from current repository evidence. Use it to proceed with safe assumptions, ask bounded confirmation questions, or reroute conflicts without becoming a general prompt-writing skill.
10
10
  metadata:
11
11
  mustflow_schema: "1"
12
12
  mustflow_kind: procedure
@@ -23,12 +23,17 @@ metadata:
23
23
  <!-- mustflow-section: purpose -->
24
24
  ## Purpose
25
25
 
26
- Ask only the questions that protect the work from expensive wrong assumptions.
26
+ Repair an ambiguous request into an executable task contract, and ask only the questions that
27
+ protect the work from expensive wrong assumptions.
27
28
 
28
29
  Good agent work is not maximally autonomous and not maximally interrogative. It moves forward on
29
30
  cheap, reversible, repository-evident decisions, and stops before choices that are costly to undo or
30
31
  whose correct answer belongs to the user, product owner, security owner, or operations owner.
31
32
 
33
+ The goal is not to make the user rewrite the prompt. Normalize the request inside the current task,
34
+ state the interpretation when it matters, and continue unless a high-cost decision still needs
35
+ confirmation.
36
+
32
37
  <!-- mustflow-section: use-when -->
33
38
  ## Use When
34
39
 
@@ -44,6 +49,7 @@ whose correct answer belongs to the user, product owner, security owner, or oper
44
49
  maintenance burden.
45
50
  - You are about to add a new dependency, service, folder boundary, storage model, framework pattern,
46
51
  persistent state, or broad refactor that the current files do not already require.
52
+ - The request can be safely clarified by a short normalized contract instead of a long back-and-forth.
47
53
 
48
54
  <!-- mustflow-section: do-not-use-when -->
49
55
  ## Do Not Use When
@@ -54,9 +60,21 @@ whose correct answer belongs to the user, product owner, security owner, or oper
54
60
  - A more specific skill already requires a blocking question for the same risk and covers the whole
55
61
  decision, such as `structure-discovery-gate`, `auth-permission-change`, `database-migration-change`,
56
62
  `dependency-upgrade-review`, or `release-publish-change`.
63
+ - The request is mainly to draft a task prompt, work order, issue, PR instruction, or handoff for
64
+ another agent; use `task-instruction-authoring`.
65
+ - The work is a production prompt, prompt builder, RAG prompt, structured output, eval, or model/tool
66
+ policy; use `prompt-contract-quality-review`.
67
+ - Repository, host, user, nested-project, command-contract, or generated instruction sources
68
+ conflict; use `instruction-conflict-scope-check`.
69
+ - Hidden structural decisions dominate the task, such as a new data model, service boundary, storage
70
+ strategy, provider, public URL contract, or long-lived architecture choice; use
71
+ `structure-discovery-gate`.
57
72
  - Asking would only delegate ordinary engineering responsibility, such as "should I add tests?",
58
73
  "should I handle errors?", "what stack is this?", or "what style should I use?" when the repository
59
74
  already answers it.
75
+ - The only useful output would be "copy this rewritten prompt and send it again." Produce a
76
+ normalized contract and proceed in the current conversation unless the user explicitly requested a
77
+ reusable prompt artifact or the request is too broken to execute.
60
78
 
61
79
  <!-- mustflow-section: required-inputs -->
62
80
  ## Required Inputs
@@ -68,6 +86,13 @@ whose correct answer belongs to the user, product owner, security owner, or oper
68
86
  - Reversibility classification for each decision: cheap/reversible, moderate, or expensive/hard to
69
87
  roll back.
70
88
  - A recommended option for each blocking question, with the tradeoff of at least one alternative.
89
+ - A request-state decision: `ready`, `ready_with_assumptions`, `needs_confirmation`,
90
+ `blocked_by_conflict`, or `insufficient_evidence`.
91
+ - A normalized task contract when the original request is vague enough to risk drift: goal, current
92
+ context, change scope, excluded scope, user-visible behavior, constraints, completion evidence,
93
+ verification, report format, and remaining risks.
94
+ - Source tags for contract entries: `user_confirmed`, `repository_derived`, `safe_assumption`, or
95
+ `unresolved`.
71
96
 
72
97
  <!-- mustflow-section: preconditions -->
73
98
  ## Preconditions
@@ -79,6 +104,8 @@ whose correct answer belongs to the user, product owner, security owner, or oper
79
104
  scope.
80
105
  - Questions are limited to decisions that block safe implementation, not curiosity, preference
81
106
  collection, or broad product discovery.
107
+ - Product decisions are separated from engineering responsibilities. Do not ask whether to preserve
108
+ existing style, avoid swallowed errors, add appropriate tests, or follow command contracts.
82
109
 
83
110
  <!-- mustflow-section: allowed-edits -->
84
111
  ## Allowed Edits
@@ -105,32 +132,68 @@ whose correct answer belongs to the user, product owner, security owner, or oper
105
132
  working;
106
133
  - `blocking_question`: stop before implementation because the wrong choice would be expensive,
107
134
  user-visible, security-sensitive, data-affecting, dependency-affecting, or hard to roll back.
108
- 4. Ask about observable completion before feature shape when success is unclear:
135
+ 4. Choose exactly one request state:
136
+ - `ready`: no material ambiguity remains; proceed normally.
137
+ - `ready_with_assumptions`: only narrow reversible assumptions remain; proceed and report them.
138
+ - `needs_confirmation`: one or more user-owned, high-cost, or hard-to-reverse decisions must be
139
+ confirmed before implementation.
140
+ - `blocked_by_conflict`: instructions or command authority conflict; reroute to
141
+ `instruction-conflict-scope-check`.
142
+ - `insufficient_evidence`: more repository reading, reproduction, or scoped analysis is needed
143
+ before asking or implementing.
144
+ 5. Build a normalized task contract when the user request is underspecified but executable:
145
+ - goal;
146
+ - current context;
147
+ - change scope;
148
+ - excluded scope;
149
+ - user-visible behavior;
150
+ - constraints;
151
+ - completion evidence;
152
+ - verification;
153
+ - report format;
154
+ - remaining risks.
155
+ Tag each non-obvious contract entry as `user_confirmed`, `repository_derived`,
156
+ `safe_assumption`, or `unresolved`. Do not add new product requirements while normalizing.
157
+ 6. Ask about observable completion before feature shape when success is unclear:
109
158
  - what behavior proves the task is done;
110
159
  - which user path, command, test, screenshot, migration state, or registry/release state closes it.
111
- 5. Ask about scope only when plausible scopes have different cost or risk:
160
+ 7. Ask about scope only when plausible scopes have different cost or risk:
112
161
  - minimal symptom fix, root-cause fix, or broader cleanup;
113
162
  - prototype, maintainable production path, or release-ready path.
114
- 6. Ask about existing users and data before changing persistence, lifecycle, deletion, migration,
163
+ 8. Ask about existing users and data before changing persistence, lifecycle, deletion, migration,
115
164
  retention, cache, API compatibility, or old-client behavior.
116
- 7. Ask about failure UX before implementing user-visible success flows where failure handling is a
165
+ 9. Ask about failure UX before implementing user-visible success flows where failure handling is a
117
166
  product decision: retry, queue, message, audit/log-only, rollback, partial success, or manual
118
167
  recovery.
119
- 8. Ask about security and authorization before relying on UI hiding, client-side checks, roles,
168
+ 10. Ask about security and authorization before relying on UI hiding, client-side checks, roles,
120
169
  invites, team boundaries, file access, billing state, or admin features.
121
- 9. Ask before adding or swapping dependencies, services, queues, databases, auth providers, design
170
+ 11. Ask before adding or swapping dependencies, services, queues, databases, auth providers, design
122
171
  systems, state managers, or major folder boundaries.
123
- 10. Ask about verification when there is no declared command intent or when the user expects a
172
+ 12. Ask about verification when there is no declared command intent or when the user expects a
124
173
  specific proof beyond the repository's configured checks.
125
- 11. Keep the question set short:
174
+ 13. Keep the question set short:
126
175
  - ask at most three questions at once;
176
+ - ask only one question when its answer may make later questions irrelevant;
127
177
  - each question must name the decision, the recommended choice, the consequence of that choice,
128
178
  and one meaningful alternative;
129
179
  - avoid open-ended prompts like "how should I implement this?" unless no responsible options can
130
180
  be framed from repository evidence.
131
- 12. If no blocking question remains, proceed without ceremony. State only the assumptions that matter
181
+ 14. Do not ask bad engineering-delegation questions:
182
+ - "Should I add tests?"
183
+ - "Should I handle errors?"
184
+ - "Should I follow existing style?"
185
+ - "Should I check current files?"
186
+ - "Should I preserve existing behavior?"
187
+ 15. Use prompt rewriting only as an exception:
188
+ - the user explicitly asks for a prompt, issue, PR body, work order, or handoff for another
189
+ agent;
190
+ - the current request is too broken to execute and a normalized contract plus confirmation is the
191
+ smallest safe next step.
192
+ Otherwise, show the normalized contract only when it materially reduces drift, then proceed in
193
+ the same conversation.
194
+ 16. If no blocking question remains, proceed without ceremony. State only the assumptions that matter
132
195
  to review or rollback.
133
- 13. If a blocking question remains unanswered, do not implement around it. Offer the smallest safe
196
+ 17. If a blocking question remains unanswered, do not implement around it. Offer the smallest safe
134
197
  non-blocked action, such as read-only analysis, a plan, a reproduction, or a narrow preparatory
135
198
  refactor when another selected skill supports it.
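The request states and source tags named in the procedure above could be modeled as enumerations with a small decision helper. This is only a sketch: the state and tag strings come from the skill text, while `ContractEntry`, `decide_state`, its two boolean parameters, and the precedence order are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class RequestState(Enum):
    READY = "ready"
    READY_WITH_ASSUMPTIONS = "ready_with_assumptions"
    NEEDS_CONFIRMATION = "needs_confirmation"
    BLOCKED_BY_CONFLICT = "blocked_by_conflict"
    INSUFFICIENT_EVIDENCE = "insufficient_evidence"


class Source(Enum):
    USER_CONFIRMED = "user_confirmed"
    REPOSITORY_DERIVED = "repository_derived"
    SAFE_ASSUMPTION = "safe_assumption"
    UNRESOLVED = "unresolved"


@dataclass(frozen=True)
class ContractEntry:
    """One line of a normalized task contract (hypothetical shape)."""
    field: str
    text: str
    source: Source


def decide_state(entries: Iterable[ContractEntry],
                 has_instruction_conflict: bool = False,
                 needs_more_evidence: bool = False) -> RequestState:
    """Pick exactly one request state; the precedence used here is an assumption."""
    entries = list(entries)
    if has_instruction_conflict:
        return RequestState.BLOCKED_BY_CONFLICT
    if needs_more_evidence:
        return RequestState.INSUFFICIENT_EVIDENCE
    if any(e.source is Source.UNRESOLVED for e in entries):
        return RequestState.NEEDS_CONFIRMATION
    if any(e.source is Source.SAFE_ASSUMPTION for e in entries):
        return RequestState.READY_WITH_ASSUMPTIONS
    return RequestState.READY


contract = [
    ContractEntry("goal", "Fix the failing export button", Source.USER_CONFIRMED),
    ContractEntry("verification", "Run the configured related tests", Source.REPOSITORY_DERIVED),
    ContractEntry("excluded scope", "No redesign of the export dialog", Source.SAFE_ASSUMPTION),
]
print(decide_state(contract).value)
```

Tagging each entry keeps the user's own words separate from repository-derived facts and assumptions, so the final report can list exactly which assumptions were made.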
136
199
 
@@ -142,6 +205,9 @@ whose correct answer belongs to the user, product owner, security owner, or oper
142
205
  - Expensive, user-owned, security-sensitive, data-affecting, dependency-affecting, and public-contract
143
206
  decisions are resolved before implementation.
144
207
  - Safe assumptions are narrow, reversible, and reported.
208
+ - Any normalized contract preserves the user's original request separately from repository-derived
209
+ facts and safe assumptions.
210
+ - Prompt rewriting is not used as a substitute for proceeding in the current task.
145
211
  - The final work can be judged against observable success criteria or a reported verification gap.
146
212
 
147
213
  <!-- mustflow-section: verification -->
@@ -165,6 +231,10 @@ run the specific configured verification intents required by the selected implem
165
231
  the evidence if it affects the final report.
166
232
  - If a blocking question reveals a larger feature, switch to the relevant skill before editing that
167
233
  new scope.
234
+ - If the issue is an instruction conflict rather than missing detail, switch to
235
+ `instruction-conflict-scope-check` instead of negotiating the conflict as a preference question.
236
+ - If structural design owns the decision, switch to `structure-discovery-gate`; if a prompt artifact
237
+ or work order owns it, switch to `task-instruction-authoring` or `prompt-contract-quality-review`.
168
238
  - If the task becomes over-scoped, reduce the next action to the smallest safe slice with explicit
169
239
  acceptance evidence.
170
240
  - If verification intent is missing, report the missing command contract instead of inventing a raw
@@ -174,6 +244,10 @@ run the specific configured verification intents required by the selected implem
174
244
  ## Output Format
175
245
 
176
246
  - Repository evidence inspected
247
+ - Request state: `ready`, `ready_with_assumptions`, `needs_confirmation`, `blocked_by_conflict`, or
248
+ `insufficient_evidence`
249
+ - Normalized task contract, only when needed, with `user_confirmed`, `repository_derived`,
250
+ `safe_assumption`, and `unresolved` source tags
177
251
  - Blocking questions asked, with recommendation and tradeoff
178
252
  - Safe assumptions made
179
253
  - Decisions intentionally deferred
@@ -0,0 +1,191 @@
1
+ ---
2
+ mustflow_doc: skill.docker-runtime-triage
3
+ locale: en
4
+ canonical: true
5
+ revision: 1
6
+ lifecycle: mustflow-owned
7
+ authority: procedure
8
+ name: docker-runtime-triage
9
+ description: Apply this skill when a Docker Engine, Docker Desktop, Docker Compose, container start, crash loop, health check, image pull, build cache, port mapping, DNS, network, volume, bind mount, storage, proxy, registry, Docker context, daemon, cgroup, OOM, signal handling, PID 1, or container runtime symptom is failing, slow, intermittent, or not yet localized to host, daemon, image, Compose config, app process, network, storage, resource, or registry boundaries.
10
+ metadata:
11
+ mustflow_schema: "1"
12
+ mustflow_kind: procedure
13
+ pack_id: mustflow.core
14
+ skill_id: mustflow.core.docker-runtime-triage
15
+ command_intents:
16
+ - changes_status
17
+ - changes_diff_summary
18
+ - lint
19
+ - build
20
+ - test_related
21
+ - test
22
+ - docs_validate_fast
23
+ - test_release
24
+ - mustflow_check
25
+ ---
26
+
27
+ # Docker Runtime Triage
28
+
29
+ <!-- mustflow-section: purpose -->
30
+ ## Purpose
31
+
32
+ Localize Docker and container runtime failures before blaming application code, Docker itself, or
33
+ the most recent Dockerfile edit.
34
+
35
+ <!-- mustflow-section: use-when -->
36
+ ## Use When
37
+
38
+ - A container fails to start, exits immediately, restarts repeatedly, is unhealthy, cannot pull or
39
+ find an image, cannot bind a port, cannot resolve DNS, cannot reach another service, loses data,
40
+ grows disk usage, OOMs, receives wrong signals, or behaves differently under Compose.
41
+ - The task is to diagnose Docker Engine, Docker Desktop, daemon, context, image store, registry,
42
+ proxy, network, mount, volume, resource, health, Compose, build, or runtime behavior.
43
+ - Evidence may be lost by pruning, rebuilding, restarting, or forcing recreation before the current
44
+ container, image, event, and daemon state are captured.
45
+
46
+ <!-- mustflow-section: do-not-use-when -->
47
+ ## Do Not Use When
48
+
49
+ - The task only edits Dockerfiles, Compose files, CI image builds, SBOM, provenance, image tags, or
50
+ container security posture; use `docker-code-change`.
51
+ - The task is already localized to an application-level API, database, cache, queue, auth, or
52
+ performance bug inside the running container; use the narrower owning skill.
53
+ - The user asks for destructive cleanup, prune, image deletion, volume deletion, or daemon reset
54
+ without explicit approval and preserved evidence.
55
+
56
+ <!-- mustflow-section: required-inputs -->
57
+ ## Required Inputs
58
+
59
+ - Runtime packet: current time, Docker client/server versions, active Docker context, relevant
60
+ environment variables, daemon warnings, host OS, storage driver, cgroup mode, and Docker Desktop
61
+ or Engine boundary.
62
+ - Container ledger: stopped and running containers, full command, image id, state, restart policy,
63
+ exit code, OOMKilled flag, health status, start and finish times, logs around the failure window,
64
+ and recent runtime events.
65
+ - Actual config ledger: image, entrypoint, command, environment, user, working directory, mounts,
66
+ networks, published ports, exposed ports, labels, resource limits, health check, and restart
67
+ policy from the running container or rendered Compose config.
68
+ - Host resource ledger: CPU, memory, swap, disk bytes, inode use, Docker system usage, image store
69
+ mode, build cache, volume usage, and kernel OOM or storage errors when available.
70
+ - Network ledger: container network, aliases, container IP, route, resolver config, DNS result,
71
+ port listener address, host port mapping, proxy settings, MTU or VPN suspicion, and firewall
72
+ boundary.
73
+ - Storage ledger: bind mounts, named volumes, writable layer changes, missing files hidden by
74
+ mounts, generated host paths, persistent data location, and cleanup risk.
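As a rough illustration of the container ledger described above, the sketch below pulls the listed fields out of one already-captured, already-parsed `docker inspect` entry. It assumes the JSON was collected through a configured command intent. The function name `container_ledger` and the output key names are hypothetical and not part of mustflow.

```python
from typing import Any


def container_ledger(inspect_entry: dict[str, Any]) -> dict[str, Any]:
    """Extract container-ledger fields from one parsed `docker inspect` entry."""
    state = inspect_entry.get("State") or {}
    host_config = inspect_entry.get("HostConfig") or {}
    health = state.get("Health") or {}
    return {
        "image_id": inspect_entry.get("Image"),
        # Path plus Args is the full command the runtime actually executed.
        "full_command": [inspect_entry.get("Path"), *(inspect_entry.get("Args") or [])],
        "status": state.get("Status"),
        "exit_code": state.get("ExitCode"),
        "oom_killed": state.get("OOMKilled"),
        "error": state.get("Error"),
        "started_at": state.get("StartedAt"),
        "finished_at": state.get("FinishedAt"),
        "restart_count": inspect_entry.get("RestartCount"),
        "restart_policy": (host_config.get("RestartPolicy") or {}).get("Name"),
        # The Health block is absent when no health check is defined.
        "health_status": health.get("Status", "none"),
    }
```

Missing fields come back as `None` rather than being guessed, so they can be reported as evidence gaps.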
75
+
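The host resource ledger asks for disk bytes and inode use as separate numbers. A minimal sketch of that split, assuming a Unix host where Python's `os.statvfs` is available, could look like the following. The path to inspect is an assumption and should be whichever filesystem actually holds the Docker data. The function name is hypothetical.

```python
import os


def disk_and_inode_usage(path: str = "/") -> dict[str, float]:
    """Report byte and inode usage percentages for the filesystem holding `path`."""
    st = os.statvfs(path)
    total_bytes = st.f_blocks * st.f_frsize
    free_bytes = st.f_bfree * st.f_frsize
    bytes_used_pct = 100.0 * (total_bytes - free_bytes) / total_bytes if total_bytes else 0.0
    # Some filesystems report zero total inodes; this sketch reports 0.0 in that case.
    inodes_used_pct = 100.0 * (st.f_files - st.f_ffree) / st.f_files if st.f_files else 0.0
    return {"bytes_used_pct": bytes_used_pct, "inodes_used_pct": inodes_used_pct}
```

A filesystem can show plenty of free bytes while its inodes are nearly exhausted, which is why the two percentages are kept apart.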
76
+ <!-- mustflow-section: preconditions -->
77
+ ## Preconditions
78
+
79
+ - The task matches the Use When conditions and does not match the Do Not Use When exclusions.
80
+ - Higher-priority instructions and `.mustflow/config/commands.toml` have been checked.
81
+ - Evidence capture comes before destructive cleanup, prune, rebuild, restart loops, volume deletion,
82
+ forced recreation, or broad firewall changes.
83
+
84
+ <!-- mustflow-section: allowed-edits -->
85
+ ## Allowed Edits
86
+
87
+ - Add or tighten Dockerfile, Compose, health check, entrypoint, signal handling, port binding,
88
+ network, volume, resource-limit, `.dockerignore`, docs, fixtures, and tests only after the failing
89
+ boundary is localized.
90
+ - Add focused tests or docs that preserve the corrected runtime contract.
91
+ - Do not run or document inferred long-running servers, background containers, destructive prune
92
+ actions, broad firewall resets, registry pushes, or credentialed image pulls outside configured
93
+ command intents.
94
+
95
+ <!-- mustflow-section: procedure -->
96
+ ## Procedure
97
+
98
+ 1. Capture the runtime packet before cleanup. Separate Docker client, server, context, daemon,
99
+ Desktop, host OS, storage driver, cgroup, image store, and proxy evidence.
100
+ 2. Prove whether the host and daemon can run any known-small container before blaming the
101
+ application image. If that boundary fails, classify the issue as host, daemon, registry, or
102
+ runtime setup rather than app code.
103
+ 3. Compare image pull, image existence, container creation, process start, health, and app readiness
104
+ as separate phases. A successful pull does not prove runtime start, and a started process does
105
+ not prove readiness.
106
+ 4. Inspect stopped containers and full state, not only currently running containers. Preserve exit
107
+ code, OOMKilled, restart count, error, health, started and finished times, and recent events.
108
+ 5. Treat the restart policy as an evidence mutator. If a loop hides the first error, report the need to
109
+ pause or disable restart behavior before drawing conclusions.
110
+ 6. Separate container logs from daemon logs. Empty app logs can mean the process never started,
111
+ logged elsewhere, used a nonstandard logging driver, or failed before stdout and stderr existed.
112
+ 7. Do not treat exit code 137 as automatic OOM. Compare OOMKilled, kernel evidence, manual kill,
113
+ stop timeout, and signal handling before deciding.
114
+ 8. Check PID 1 and signal behavior when stops are slow or children survive. Prefer exec-form
115
+ entrypoints, init handling, and graceful shutdown evidence when the localized fix owns the image.
116
+ 9. Compare resource usage against limits. CPU, memory, I/O, and network numbers are meaningless
117
+ without container and host limits, pool pressure, and restart history.
118
+ 10. Split disk bytes from inode exhaustion and writable-layer growth. Do not prune before naming
119
+ whether images, containers, volumes, build cache, logs, or bind mounts own the growth.
120
+ 11. Check actual mounts before trusting image contents. Bind mounts can hide files built into the
121
+ image, and mistaken host paths can create directories where files were expected.
122
+ 12. Split network failures into DNS, route, TCP connect, TLS, HTTP, listener address, port mapping,
123
+ Docker network membership, proxy, firewall, MTU, and VPN boundaries.
124
+ 13. Remember that `localhost` inside a container refers to that same container. For Compose-style service calls,
125
+ verify service names, aliases, networks, and whether the target process listens on an external
126
+ interface instead of loopback only.
127
+ 14. Render Compose config before interpreting it. Variable substitution, `.env`, shell environment,
128
+ overrides, profiles, relative paths, and service health conditions can change the actual
129
+ container contract.
130
+ 15. Separate start order from readiness. `depends_on`-style sequencing needs health or application
131
+ retry evidence before it is treated as a working dependency contract.
132
+ 16. Separate tag names from image identity. Compare image id, digest, architecture, pull timing, and
133
+ forced recreation behavior when "new image deployed" is part of the claim.
134
+ 17. For build failures, separate context content, ignored files, base-image pull, cache reuse,
135
+ stage-specific cache invalidation, native dependencies, and final runtime contents.
136
+ 18. Once the boundary is localized, switch to `docker-code-change`, language-specific skills,
137
+ network, storage, process, API, database, cache, or observability skills for the owning fix.
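Step 7 can be sketched as a small decision function. Exit code 137 is 128 + 9, meaning the process ended on SIGKILL, which by itself does not say who sent the signal. The sketch below takes the `State` block of a parsed `docker inspect` entry. The two boolean evidence inputs and the returned labels are hypothetical names chosen for illustration.

```python
def classify_exit_137(
    state: dict,
    kernel_oom_evidence: bool = False,
    manual_kill_or_stop_timeout: bool = False,
) -> str:
    """Name the best-supported cause of exit code 137 from the evidence supplied."""
    if state.get("ExitCode") != 137:
        return "not_137"
    if state.get("OOMKilled"):
        # The runtime itself recorded an OOM kill for this container.
        return "oom_killed_flag_set"
    if kernel_oom_evidence:
        # Kernel logs show an OOM kill even though the container flag is unset.
        return "kernel_oom_without_flag"
    if manual_kill_or_stop_timeout:
        # SIGKILL from an operator or from a stop that outlived its timeout.
        return "sigkill_from_kill_or_stop_timeout"
    return "sigkill_cause_unknown"
```

Returning `sigkill_cause_unknown` keeps the evidence gap visible instead of defaulting to an OOM conclusion.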
138
+
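For the listener-address check in the procedure, a minimal sketch is shown below. It only answers whether a captured bind address is limited to loopback. It says nothing about network membership, port mapping, or firewalls, which remain separate boundaries. The function name is hypothetical.

```python
import ipaddress


def listens_beyond_loopback(listen_address: str) -> bool:
    """Return False when the listener is bound to a loopback address only."""
    if listen_address == "localhost":
        return False
    # Raises ValueError for anything that is not a literal IPv4 or IPv6 address.
    return not ipaddress.ip_address(listen_address).is_loopback
```

A service bound to `127.0.0.1` returns `False` here, which matches the case where other containers on the same network cannot connect to it.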
139
+ <!-- mustflow-section: postconditions -->
140
+ ## Postconditions
141
+
142
+ - Host, daemon, context, image, container, Compose, app process, network, storage, resource, proxy,
143
+ registry, and build boundaries are localized or named as evidence gaps.
144
+ - Destructive cleanup, broad firewall reset, rebuild, restart, force recreate, or prune was not used
145
+ as a substitute for evidence.
146
+ - Any source edit is tied to the localized runtime boundary.
147
+
148
+ <!-- mustflow-section: verification -->
149
+ ## Verification
150
+
151
+ Use configured oneshot command intents when available:
152
+
153
+ - `changes_status`
154
+ - `changes_diff_summary`
155
+ - `lint`
156
+ - `build`
157
+ - `test_related`
158
+ - `test`
159
+ - `docs_validate_fast`
160
+ - `test_release`
161
+ - `mustflow_check`
162
+
163
+ Report missing Docker daemon, Compose rendering, image build, runtime smoke, health, network,
164
+ volume, inspect, event, vulnerability, SBOM, provenance, registry, or Desktop diagnostic evidence
165
+ instead of inventing raw Docker commands.
166
+
167
+ <!-- mustflow-section: failure-handling -->
168
+ ## Failure Handling
169
+
170
+ - If the container or daemon evidence was already destroyed, report the missing evidence and use the
171
+ next reproducible packet rather than reconstructing from memory.
172
+ - If a destructive cleanup appears necessary, stop and ask for explicit approval after naming the
173
+ evidence that will be lost.
174
+ - If credentials, registry tokens, private environment variables, host paths, or user data appear in
175
+ evidence, redact before storing or reporting.
176
+ - If configured verification fails, preserve the failing intent and output tail, then fix only the
177
+ localized boundary.
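The redaction rule above could be approximated for captured environment entries as follows. This is a sketch under stated assumptions: the key-name pattern is illustrative and incomplete, and host paths or user data need their own handling. `SENSITIVE_KEY` and `redact_env` are hypothetical names.

```python
import re

# Illustrative pattern only; extend it to match the project's own secret names.
SENSITIVE_KEY = re.compile(
    r"(TOKEN|SECRET|PASSWORD|PASSWD|CREDENTIAL|API_?KEY|AUTH)", re.IGNORECASE
)


def redact_env(env: list[str]) -> list[str]:
    """Replace values of sensitive-looking KEY=value entries with a placeholder."""
    redacted = []
    for entry in env:
        # Split on the first "=" only, so values containing "=" stay intact.
        key, sep, _value = entry.partition("=")
        if sep and SENSITIVE_KEY.search(key):
            redacted.append(f"{key}=<redacted>")
        else:
            redacted.append(entry)
    return redacted
```

Keeping the key name while dropping the value preserves the fact that the variable was set, which is often the evidence that matters.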
178
+
179
+ <!-- mustflow-section: output-format -->
180
+ ## Output Format
181
+
182
+ - Docker runtime triaged
183
+ - Host, daemon, context, image, container, Compose, process, resource, storage, network, proxy,
184
+ registry, and build findings
185
+ - Evidence preserved and evidence missing
186
+ - Fix applied or recommended
187
+ - Evidence level: configured-test evidence, static review risk, manual-only, missing, or not
188
+ applicable
189
+ - Command intents run
190
+ - Skipped Docker diagnostics and reasons
191
+ - Remaining Docker runtime risk