@zhixuan92/multi-model-agent 4.9.1 → 5.0.1

Files changed (251)
  1. package/README.md +4 -3
  2. package/bin/mmagent.mjs +47 -0
  3. package/package.json +24 -43
  4. package/postinstall.mjs +8 -0
  5. package/dist/cli/index.d.ts +0 -62
  6. package/dist/cli/index.d.ts.map +0 -1
  7. package/dist/cli/index.js +0 -345
  8. package/dist/cli/index.js.map +0 -1
  9. package/dist/cli/info.d.ts +0 -22
  10. package/dist/cli/info.d.ts.map +0 -1
  11. package/dist/cli/info.js +0 -100
  12. package/dist/cli/info.js.map +0 -1
  13. package/dist/cli/logs.d.ts +0 -15
  14. package/dist/cli/logs.d.ts.map +0 -1
  15. package/dist/cli/logs.js +0 -102
  16. package/dist/cli/logs.js.map +0 -1
  17. package/dist/cli/print-token.d.ts +0 -18
  18. package/dist/cli/print-token.d.ts.map +0 -1
  19. package/dist/cli/print-token.js +0 -60
  20. package/dist/cli/print-token.js.map +0 -1
  21. package/dist/cli/serve.d.ts +0 -28
  22. package/dist/cli/serve.d.ts.map +0 -1
  23. package/dist/cli/serve.js +0 -405
  24. package/dist/cli/serve.js.map +0 -1
  25. package/dist/cli/status.d.ts +0 -49
  26. package/dist/cli/status.d.ts.map +0 -1
  27. package/dist/cli/status.js +0 -155
  28. package/dist/cli/status.js.map +0 -1
  29. package/dist/cli/sync-skills.d.ts +0 -58
  30. package/dist/cli/sync-skills.d.ts.map +0 -1
  31. package/dist/cli/sync-skills.js +0 -266
  32. package/dist/cli/sync-skills.js.map +0 -1
  33. package/dist/cli/telemetry.d.ts +0 -10
  34. package/dist/cli/telemetry.d.ts.map +0 -1
  35. package/dist/cli/telemetry.js +0 -161
  36. package/dist/cli/telemetry.js.map +0 -1
  37. package/dist/cli/toggle.d.ts +0 -26
  38. package/dist/cli/toggle.d.ts.map +0 -1
  39. package/dist/cli/toggle.js +0 -185
  40. package/dist/cli/toggle.js.map +0 -1
  41. package/dist/http/async-dispatch.d.ts +0 -44
  42. package/dist/http/async-dispatch.d.ts.map +0 -1
  43. package/dist/http/async-dispatch.js +0 -175
  44. package/dist/http/async-dispatch.js.map +0 -1
  45. package/dist/http/auth.d.ts +0 -20
  46. package/dist/http/auth.d.ts.map +0 -1
  47. package/dist/http/auth.js +0 -56
  48. package/dist/http/auth.js.map +0 -1
  49. package/dist/http/canonicalize-file-paths.d.ts +0 -8
  50. package/dist/http/canonicalize-file-paths.d.ts.map +0 -1
  51. package/dist/http/canonicalize-file-paths.js +0 -43
  52. package/dist/http/canonicalize-file-paths.js.map +0 -1
  53. package/dist/http/cwd-validator.d.ts +0 -11
  54. package/dist/http/cwd-validator.d.ts.map +0 -1
  55. package/dist/http/cwd-validator.js +0 -130
  56. package/dist/http/cwd-validator.js.map +0 -1
  57. package/dist/http/errors.d.ts +0 -4
  58. package/dist/http/errors.d.ts.map +0 -1
  59. package/dist/http/errors.js +0 -9
  60. package/dist/http/errors.js.map +0 -1
  61. package/dist/http/execution-context.d.ts +0 -18
  62. package/dist/http/execution-context.d.ts.map +0 -1
  63. package/dist/http/execution-context.js +0 -61
  64. package/dist/http/execution-context.js.map +0 -1
  65. package/dist/http/handler-deps.d.ts +0 -19
  66. package/dist/http/handler-deps.d.ts.map +0 -1
  67. package/dist/http/handler-deps.js +0 -2
  68. package/dist/http/handler-deps.js.map +0 -1
  69. package/dist/http/handlers/control/batch-slice.d.ts +0 -4
  70. package/dist/http/handlers/control/batch-slice.d.ts.map +0 -1
  71. package/dist/http/handlers/control/batch-slice.js +0 -40
  72. package/dist/http/handlers/control/batch-slice.js.map +0 -1
  73. package/dist/http/handlers/control/batch.d.ts +0 -23
  74. package/dist/http/handlers/control/batch.d.ts.map +0 -1
  75. package/dist/http/handlers/control/batch.js +0 -332
  76. package/dist/http/handlers/control/batch.js.map +0 -1
  77. package/dist/http/handlers/control/context-blocks.d.ts +0 -22
  78. package/dist/http/handlers/control/context-blocks.d.ts.map +0 -1
  79. package/dist/http/handlers/control/context-blocks.js +0 -111
  80. package/dist/http/handlers/control/context-blocks.js.map +0 -1
  81. package/dist/http/handlers/introspection/health.d.ts +0 -20
  82. package/dist/http/handlers/introspection/health.d.ts.map +0 -1
  83. package/dist/http/handlers/introspection/health.js +0 -18
  84. package/dist/http/handlers/introspection/health.js.map +0 -1
  85. package/dist/http/handlers/introspection/status.d.ts +0 -26
  86. package/dist/http/handlers/introspection/status.d.ts.map +0 -1
  87. package/dist/http/handlers/introspection/status.js +0 -136
  88. package/dist/http/handlers/introspection/status.js.map +0 -1
  89. package/dist/http/handlers/tools/audit.d.ts +0 -4
  90. package/dist/http/handlers/tools/audit.d.ts.map +0 -1
  91. package/dist/http/handlers/tools/audit.js +0 -43
  92. package/dist/http/handlers/tools/audit.js.map +0 -1
  93. package/dist/http/handlers/tools/debug.d.ts +0 -4
  94. package/dist/http/handlers/tools/debug.d.ts.map +0 -1
  95. package/dist/http/handlers/tools/debug.js +0 -43
  96. package/dist/http/handlers/tools/debug.js.map +0 -1
  97. package/dist/http/handlers/tools/delegate.d.ts +0 -4
  98. package/dist/http/handlers/tools/delegate.d.ts.map +0 -1
  99. package/dist/http/handlers/tools/delegate.js +0 -43
  100. package/dist/http/handlers/tools/delegate.js.map +0 -1
  101. package/dist/http/handlers/tools/execute-plan.d.ts +0 -4
  102. package/dist/http/handlers/tools/execute-plan.d.ts.map +0 -1
  103. package/dist/http/handlers/tools/execute-plan.js +0 -45
  104. package/dist/http/handlers/tools/execute-plan.js.map +0 -1
  105. package/dist/http/handlers/tools/investigate.d.ts +0 -4
  106. package/dist/http/handlers/tools/investigate.d.ts.map +0 -1
  107. package/dist/http/handlers/tools/investigate.js +0 -64
  108. package/dist/http/handlers/tools/investigate.js.map +0 -1
  109. package/dist/http/handlers/tools/journal-recall.d.ts +0 -4
  110. package/dist/http/handlers/tools/journal-recall.d.ts.map +0 -1
  111. package/dist/http/handlers/tools/journal-recall.js +0 -40
  112. package/dist/http/handlers/tools/journal-recall.js.map +0 -1
  113. package/dist/http/handlers/tools/journal-record.d.ts +0 -4
  114. package/dist/http/handlers/tools/journal-record.d.ts.map +0 -1
  115. package/dist/http/handlers/tools/journal-record.js +0 -35
  116. package/dist/http/handlers/tools/journal-record.js.map +0 -1
  117. package/dist/http/handlers/tools/research.d.ts +0 -4
  118. package/dist/http/handlers/tools/research.d.ts.map +0 -1
  119. package/dist/http/handlers/tools/research.js +0 -64
  120. package/dist/http/handlers/tools/research.js.map +0 -1
  121. package/dist/http/handlers/tools/retry.d.ts +0 -4
  122. package/dist/http/handlers/tools/retry.d.ts.map +0 -1
  123. package/dist/http/handlers/tools/retry.js +0 -73
  124. package/dist/http/handlers/tools/retry.js.map +0 -1
  125. package/dist/http/handlers/tools/review.d.ts +0 -4
  126. package/dist/http/handlers/tools/review.d.ts.map +0 -1
  127. package/dist/http/handlers/tools/review.js +0 -43
  128. package/dist/http/handlers/tools/review.js.map +0 -1
  129. package/dist/http/middleware/body-reader.d.ts +0 -16
  130. package/dist/http/middleware/body-reader.d.ts.map +0 -1
  131. package/dist/http/middleware/body-reader.js +0 -44
  132. package/dist/http/middleware/body-reader.js.map +0 -1
  133. package/dist/http/middleware/caller-identity.d.ts +0 -16
  134. package/dist/http/middleware/caller-identity.d.ts.map +0 -1
  135. package/dist/http/middleware/caller-identity.js +0 -16
  136. package/dist/http/middleware/caller-identity.js.map +0 -1
  137. package/dist/http/middleware/decompress.d.ts +0 -14
  138. package/dist/http/middleware/decompress.d.ts.map +0 -1
  139. package/dist/http/middleware/decompress.js +0 -51
  140. package/dist/http/middleware/decompress.js.map +0 -1
  141. package/dist/http/project-registry.d.ts +0 -54
  142. package/dist/http/project-registry.d.ts.map +0 -1
  143. package/dist/http/project-registry.js +0 -130
  144. package/dist/http/project-registry.js.map +0 -1
  145. package/dist/http/request-observability.d.ts +0 -8
  146. package/dist/http/request-observability.d.ts.map +0 -1
  147. package/dist/http/request-observability.js +0 -20
  148. package/dist/http/request-observability.js.map +0 -1
  149. package/dist/http/request-pipeline.d.ts +0 -16
  150. package/dist/http/request-pipeline.d.ts.map +0 -1
  151. package/dist/http/request-pipeline.js +0 -144
  152. package/dist/http/request-pipeline.js.map +0 -1
  153. package/dist/http/server.d.ts +0 -17
  154. package/dist/http/server.d.ts.map +0 -1
  155. package/dist/http/server.js +0 -300
  156. package/dist/http/server.js.map +0 -1
  157. package/dist/http/types.d.ts +0 -20
  158. package/dist/http/types.d.ts.map +0 -1
  159. package/dist/http/types.js +0 -2
  160. package/dist/http/types.js.map +0 -1
  161. package/dist/skill-install/disabled-state.d.ts +0 -35
  162. package/dist/skill-install/disabled-state.d.ts.map +0 -1
  163. package/dist/skill-install/disabled-state.js +0 -96
  164. package/dist/skill-install/disabled-state.js.map +0 -1
  165. package/dist/skill-install/discover.d.ts +0 -29
  166. package/dist/skill-install/discover.d.ts.map +0 -1
  167. package/dist/skill-install/discover.js +0 -104
  168. package/dist/skill-install/discover.js.map +0 -1
  169. package/dist/skill-install/include-utils.d.ts +0 -27
  170. package/dist/skill-install/include-utils.d.ts.map +0 -1
  171. package/dist/skill-install/include-utils.js +0 -90
  172. package/dist/skill-install/include-utils.js.map +0 -1
  173. package/dist/skill-install/manifest.d.ts +0 -82
  174. package/dist/skill-install/manifest.d.ts.map +0 -1
  175. package/dist/skill-install/manifest.js +0 -215
  176. package/dist/skill-install/manifest.js.map +0 -1
  177. package/dist/skill-install/skill-installer-common.d.ts +0 -26
  178. package/dist/skill-install/skill-installer-common.d.ts.map +0 -1
  179. package/dist/skill-install/skill-installer-common.js +0 -139
  180. package/dist/skill-install/skill-installer-common.js.map +0 -1
  181. package/dist/skill-install/skill-installers/claude-code.d.ts +0 -43
  182. package/dist/skill-install/skill-installers/claude-code.d.ts.map +0 -1
  183. package/dist/skill-install/skill-installers/claude-code.js +0 -65
  184. package/dist/skill-install/skill-installers/claude-code.js.map +0 -1
  185. package/dist/skill-install/skill-installers/codex-cli.d.ts +0 -27
  186. package/dist/skill-install/skill-installers/codex-cli.d.ts.map +0 -1
  187. package/dist/skill-install/skill-installers/codex-cli.js +0 -84
  188. package/dist/skill-install/skill-installers/codex-cli.js.map +0 -1
  189. package/dist/skill-install/skill-installers/cursor.d.ts +0 -72
  190. package/dist/skill-install/skill-installers/cursor.d.ts.map +0 -1
  191. package/dist/skill-install/skill-installers/cursor.js +0 -81
  192. package/dist/skill-install/skill-installers/cursor.js.map +0 -1
  193. package/dist/skill-install/skill-installers/gemini-cli.d.ts +0 -50
  194. package/dist/skill-install/skill-installers/gemini-cli.d.ts.map +0 -1
  195. package/dist/skill-install/skill-installers/gemini-cli.js +0 -72
  196. package/dist/skill-install/skill-installers/gemini-cli.js.map +0 -1
  197. package/dist/skill-install/skill-manifest-sync.d.ts +0 -11
  198. package/dist/skill-install/skill-manifest-sync.d.ts.map +0 -1
  199. package/dist/skill-install/skill-manifest-sync.js +0 -65
  200. package/dist/skill-install/skill-manifest-sync.js.map +0 -1
  201. package/dist/skills/_shared/auth.md +0 -41
  202. package/dist/skills/_shared/error-handling.md +0 -31
  203. package/dist/skills/_shared/polling.md +0 -88
  204. package/dist/skills/_shared/response-shape.md +0 -55
  205. package/dist/skills/_shared/review-policy.md +0 -15
  206. package/dist/skills/mma-audit/SKILL.md +0 -270
  207. package/dist/skills/mma-context-blocks/SKILL.md +0 -148
  208. package/dist/skills/mma-debug/SKILL.md +0 -208
  209. package/dist/skills/mma-delegate/SKILL.md +0 -216
  210. package/dist/skills/mma-execute-plan/SKILL.md +0 -214
  211. package/dist/skills/mma-explore/SKILL.md +0 -190
  212. package/dist/skills/mma-investigate/SKILL.md +0 -258
  213. package/dist/skills/mma-journal-recall/SKILL.md +0 -242
  214. package/dist/skills/mma-journal-record/SKILL.md +0 -189
  215. package/dist/skills/mma-research/SKILL.md +0 -223
  216. package/dist/skills/mma-retry/SKILL.md +0 -221
  217. package/dist/skills/mma-review/SKILL.md +0 -209
  218. package/dist/skills/multi-model-agent/SKILL.md +0 -206
  219. package/dist/telemetry/consent.d.ts +0 -4
  220. package/dist/telemetry/consent.d.ts.map +0 -1
  221. package/dist/telemetry/consent.js +0 -40
  222. package/dist/telemetry/consent.js.map +0 -1
  223. package/dist/telemetry/flusher.d.ts +0 -19
  224. package/dist/telemetry/flusher.d.ts.map +0 -1
  225. package/dist/telemetry/flusher.js +0 -277
  226. package/dist/telemetry/flusher.js.map +0 -1
  227. package/dist/telemetry/generation.d.ts +0 -9
  228. package/dist/telemetry/generation.d.ts.map +0 -1
  229. package/dist/telemetry/generation.js +0 -33
  230. package/dist/telemetry/generation.js.map +0 -1
  231. package/dist/telemetry/identity.d.ts +0 -9
  232. package/dist/telemetry/identity.d.ts.map +0 -1
  233. package/dist/telemetry/identity.js +0 -35
  234. package/dist/telemetry/identity.js.map +0 -1
  235. package/dist/telemetry/install-id.d.ts +0 -13
  236. package/dist/telemetry/install-id.d.ts.map +0 -1
  237. package/dist/telemetry/install-id.js +0 -49
  238. package/dist/telemetry/install-id.js.map +0 -1
  239. package/dist/telemetry/install-meta.d.ts +0 -10
  240. package/dist/telemetry/install-meta.d.ts.map +0 -1
  241. package/dist/telemetry/install-meta.js +0 -15
  242. package/dist/telemetry/install-meta.js.map +0 -1
  243. package/dist/telemetry/queue.d.ts +0 -35
  244. package/dist/telemetry/queue.d.ts.map +0 -1
  245. package/dist/telemetry/queue.js +0 -287
  246. package/dist/telemetry/queue.js.map +0 -1
  247. package/dist/telemetry/recorder.d.ts +0 -39
  248. package/dist/telemetry/recorder.d.ts.map +0 -1
  249. package/dist/telemetry/recorder.js +0 -173
  250. package/dist/telemetry/recorder.js.map +0 -1
  251. package/scripts/postinstall.js +0 -36
@@ -1,270 +0,0 @@
- ---
- name: mma-audit
- description: >-
-   Use when the user asks to audit a spec / plan / design doc / skill file. The
-   `subtype` field picks the criteria set. `default` (prose-coherence) is the
-   general doc auditor. `plan` verifies a code-execution plan against the actual
-   codebase — run this before any `mma-execute-plan` dispatch. `spec` audits
-   requirement prose for testability and decision-trace. `skill` audits a
-   SKILL.md against reader-effectiveness criteria.
- when_to_use: >-
-   User asks for a doc / spec / plan / skill audit OR a methodology skill
-   (superpowers:dispatching-parallel-agents, /security-review) points at one AND
-   mmagent is running. Audit on PROSE/SPEC docs — use mma-review for source code.
-   Audit a CODE-EXECUTION PLAN against the codebase — use subtype=plan.
- version: 4.9.1
- ---
-
- # mma-audit
-
- ## Overview
-
- `mma-audit` sends a prose artifact to workers for structured auditing. The `subtype` field picks WHICH criteria set the workers apply — every subtype runs through the same sequential-criteria read-only lifecycle, but each one carries its own criteria list, semantics, and prompt scaffolding.
-
- **Four subtypes — picked by the kind of artifact, not by the lens you want:**
-
- | You're auditing… | Use… | What it checks |
- |---|---|---|
- | A general prose artifact (design doc, recommendation, post-mortem, README) | `subtype: 'default'` | Comprehensive prose-coherence — would a literal-following worker produce the right outcome from this prose alone? Catches ambiguity, contradictions, missing branches, drift, scope-creep. **Does NOT verify against any codebase.** |
- | A **code-execution PLAN** (`docs/superpowers/plans/*.md` or similar) before running it via `mma-execute-plan` | `subtype: 'plan'` | Plan-vs-codebase coherence — for every method / type / file path / signature / import / verify command the plan names, the codebase actually contains it as described. Catches the bug class the prose-coherence audit cannot see (e.g. plan says `registerBlock` but actual interface is `register`). |
- | A **requirement spec** (what we want, why; success criteria) | `subtype: 'spec'` | Requirement-prose executability across 9 criteria — testability, scope explicitness AND decomposability, acceptance-criteria coverage, non-functional capture, requirement conflicts, decision-trace, assumption exposure, placeholder scan, and design-decomposition presence (architecture / components / data flow / error handling / testing). |
- | A **SKILL.md** for an `mma-*` skill or comparable agent-facing playbook | `subtype: 'skill'` | Skill-file reader-effectiveness — when-to-use specificity, endpoint contract integrity, example correctness, anti-pattern coverage, link integrity. |
-
- If you want to bias workers toward a narrow lens (security only, performance only, accessibility only), put that in the free-text `background` portion of the prompt — `subtype` is criteria machinery, not a lens selector.
-
- ## When to Use
-
- - `subtype: 'default'` — a general prose artifact needs a critical read for internal executability (the artifact will be acted on by a worker reading the prose alone).
- - `subtype: 'plan'` — you have a written code-execution plan on disk and you're about to dispatch tasks from it via `mma-execute-plan`. This is the ONLY subtype that grounds findings against real source files.
- - `subtype: 'spec'` — you have a requirement / brainstorming-output spec and want to verify every requirement is testable, traceable, and unambiguous BEFORE writing the plan. Typical predecessor to `writing-plans`.
- - `subtype: 'skill'` — you're authoring or revising an `mma-*` skill or comparable SKILL.md and want to know whether agents will actually read it the right way.
-
- **Don't use mma-audit when:** the thing being audited is source code (→ `mma-review`); a 30-second `Read` would answer it; or you want to verify a plan that hasn't been written yet (write the plan first).
-
- ## Endpoint
-
- `POST /audit?cwd=<abs-path>`
-
- @include _shared/auth.md
-
- ## Request body
-
- ```json
- {
-   "document": "inline content to audit (optional if filePaths given)",
-   "subtype": "default",
-   "filePaths": ["/project/docs/spec.md"],
-   "contextBlockIds": []
- }
- ```
-
- | Field | Type | Required | Notes |
- |---|---|---|---|
- | `document` | string | no | Inline document content |
- | `subtype` | `'default' \| 'plan' \| 'spec' \| 'skill'` | no (defaults to `'default'`) | See "Picking subtype" below. |
- | `filePaths` | string[] | no | Files to audit (one worker per file, parallel) |
- | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |
-
- Either `document` or `filePaths` (or both) must be provided.
-
- > Worker tier for `mma-audit` is hardcoded to `complex` and is not caller-configurable. Sending `agentType` is rejected with HTTP 400.
-
- ### Picking subtype
-
- | Value | When to use |
- |---|---|
- | `default` (or omit the field) | **General prose — design doc, recommendation, post-mortem, README, brief.** Comprehensive prose-coherence audit. Does NOT verify against any codebase. |
- | `plan` | **Code-execution plans being audited against a real codebase.** Single-file input (the plan markdown). Workers grep / read source files under `cwd` to verify every named symbol / path / signature / import / verify command. Use this BEFORE every `mma-execute-plan` dispatch. |
- | `spec` | **Requirement spec / brainstorming-output / what-we-want prose.** 9 criteria target testability, scope explicitness + decomposability, acceptance-criteria coverage, non-functional capture, requirement conflicts, decision-trace, assumption exposure, placeholder scan, and design-decomposition presence. |
- | `skill` | **`SKILL.md` or comparable agent-facing playbook.** Criteria target when-to-use specificity, endpoint contract integrity, example correctness, anti-pattern coverage, link integrity. |
-
- You can run BOTH on a plan: first `spec` or `default` (prose quality), then `plan` (does the plan match the codebase?). They cover orthogonal failure modes.
-
- The legacy `auditType` field and its `correctness` / `style` / `general` / `security` / `performance` values no longer exist. Sending `auditType` returns `400 invalid_request`. Sending unknown `subtype` values returns `400 invalid_request` with the allowed enum.
-
- ### Plan-audit specifics
-
- When `subtype: 'plan'`:
-
- - `filePaths` MUST contain exactly **one entry** — the plan markdown. Sending zero or 2+ entries → `400 invalid_request` with the message: *"Plan audit takes exactly one filePath (the plan markdown). The worker discovers and verifies source files itself via its tool surface — do not pre-list source files."*
- - `document` (inline content) is not used in plan mode — the plan must be on disk so workers can reference it by `?cwd=`-relative path.
- - The worker runs the sequential-criteria loop with the plan-audit criteria set across 12 perspectives in three groups: **EXTERNAL CODEBASE COHERENCE** (1 PATH EXISTENCE, 2 SYMBOL EXISTENCE, 3 SIGNATURE MATCH, 4 IMPORT GRAPH, 5 TEST HARNESS AVAILABILITY, 6 STEP SEQUENCE WITHIN TASK, 7 CROSS-TASK DEPENDENCIES, 8 VERIFICATION COMMAND VALIDITY), **INTRA-PLAN STRUCTURE** (9 TASK GRANULARITY, 11 PLACEHOLDER LANGUAGE, 12 PLAN SKELETON), and **SPEC ALIGNMENT** (10 SPEC COVERAGE).
- - To enable perspective 10 (SPEC COVERAGE), register the upstream spec as a context block via `mma-context-blocks` and pass its `blockId` in `contextBlockIds`. Without a spec in context, perspective 10 emits "No findings for this criterion." and the other 11 still run.
- - Read the findings list. Fix the plan and re-audit if any `critical` or `high` plan-audit findings remain.
-
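The request rules stated in the removed skill file (one of `document` / `filePaths` required, `auditType` and `agentType` rejected, exactly one `filePaths` entry in plan mode) can be checked on the client before a request is sent. The following is only a sketch under stated assumptions: `validateAuditRequest` and its error strings are hypothetical and are not part of the package, and treating an empty `document` string as missing is a guess rather than documented behavior.

```typescript
// Hypothetical client-side pre-flight check; not the server's actual validator.
const ALLOWED_SUBTYPES: readonly string[] = ["default", "plan", "spec", "skill"];

function validateAuditRequest(body: Record<string, unknown>): string[] {
  const errors: string[] = [];

  // Fields the skill file says are rejected with HTTP 400.
  if ("auditType" in body) errors.push("auditType no longer exists; use subtype");
  if ("agentType" in body) errors.push("agentType is not caller-configurable for audit");

  // Omitting subtype is equivalent to 'default'.
  const subtype = body.subtype ?? "default";
  if (typeof subtype !== "string" || ALLOWED_SUBTYPES.indexOf(subtype) === -1) {
    errors.push(`unknown subtype: ${String(subtype)}`);
  }

  const filePaths = Array.isArray(body.filePaths) ? body.filePaths : [];
  // Assumption: an empty inline document counts as "not provided".
  const hasDocument = typeof body.document === "string" && body.document.length > 0;

  if (subtype === "plan") {
    // Plan mode ignores `document` and needs exactly one file: the plan markdown.
    if (filePaths.length !== 1) errors.push("plan audit takes exactly one filePath");
  } else if (!hasDocument && filePaths.length === 0) {
    errors.push("either document or filePaths must be provided");
  }
  return errors;
}
```

An empty array means the body passes these local checks; the server remains the authority on what it accepts.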
- ## Full example
-
- ### Default audit (general prose)
-
- ```bash
- BATCH=$(curl -f --show-error -s -X POST \
-   -H "X-MMA-Client: $MMA_CLIENT" \
-   -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
-   -H "Authorization: Bearer $TOKEN" \
-   -H "Content-Type: application/json" \
-   -d '{"subtype":"default","filePaths":["/project/docs/api-spec.md"]}' \
-   "http://localhost:$PORT/audit?cwd=/project")
- BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
- ```
-
- ### Spec audit (requirement prose)
-
- ```bash
- BATCH=$(curl -f --show-error -s -X POST \
-   -H "X-MMA-Client: $MMA_CLIENT" \
-   -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
-   -H "Authorization: Bearer $TOKEN" \
-   -H "Content-Type: application/json" \
-   -d '{"subtype":"spec","filePaths":["/project/docs/superpowers/specs/2026-05-12-feature-design.md"]}' \
-   "http://localhost:$PORT/audit?cwd=/project")
- ```
-
- ### Skill audit (SKILL.md)
-
- ```bash
- BATCH=$(curl -f --show-error -s -X POST \
-   -H "X-MMA-Client: $MMA_CLIENT" \
-   -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
-   -H "Authorization: Bearer $TOKEN" \
-   -H "Content-Type: application/json" \
-   -d '{"subtype":"skill","filePaths":["/project/packages/server/src/skills/mma-audit/SKILL.md"]}' \
-   "http://localhost:$PORT/audit?cwd=/project")
- ```
-
- ### Plan audit (verify a code-execution plan against the codebase)
-
- ```bash
- BATCH=$(curl -f --show-error -s -X POST \
-   -H "X-MMA-Client: $MMA_CLIENT" \
-   -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
-   -H "Authorization: Bearer $TOKEN" \
-   -H "Content-Type: application/json" \
-   -d '{"subtype":"plan","filePaths":["/project/docs/superpowers/plans/2026-05-10-feature.md"]}' \
-   "http://localhost:$PORT/audit?cwd=/project")
- ```
-
- @include _shared/polling.md
-
- @include _shared/response-shape.md
-
- ## Reading the findings
-
- The main agent reads `completed` + `message` + `findings` — the findings are the answer. For
- read-only routes, `filesChanged` is always `[]` and `commitSha` is always `null`.
-
- ```json
- {
-   "completed": true,
-   "message": "Plan audit complete; 2 findings.",
-   "findings": [
-     { "id": "F1", "severity": "high", "category": "path-existence",
-       "claim": "Step 3 names `src/utils/foo.ts` which does not exist.",
-       "evidence": "Worker grepped for the file under cwd — no match found.",
-       "suggestion": "Use `src/utils/bar.ts` instead.",
-       "source": "implementer" }
-   ],
-   "filesChanged": [],
-   "commitSha": null,
-   "summary": "...",
-   "telemetry": { ... }
- }
- ```
-
- ### Finding shape
-
- Every finding has this shape:
-
- | Field | Type | Notes |
- |---|---|---|
- | `id` | string | Worker-assigned, e.g. `F1`, `F2`. Stable across chain. |
- | `severity` | `'critical' \| 'high' \| 'medium' \| 'low'` | 4-tier. |
- | `category` | string | Topical bucket, e.g. `path-existence`, `prose-coherence`. |
- | `claim` | string | One-sentence summary. |
- | `evidence` | string ≥20 chars | Verbatim from source when grounded. |
- | `suggestion?` | string | Optional fix recommendation. |
- | `source` | `'implementer' \| 'reviewer'` | Who produced the finding. |
-
- `annotatorConfidence` and `evidenceGrounded` are retired — they were v4 fields with no producers.
-
- ### Recommended rendering by the main agent
-
- 1. Show ALL findings — never silently drop. Severity and grounding are soft
-    signals, not gates.
- 2. Default sort: severity (critical → low), then `id` ascending.
- 3. `severity` is the authoritative value — use it directly.
- 4. Mark findings with `evidence` shorter than 30 chars as "low-evidence"
-    (lighter color or `(low evidence)` annotation). User decides what to do.
- 5. Severity-tier counts feed the dashboard.
-
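The rendering rules above (show everything, sort by severity then `id`, flag evidence shorter than 30 characters) could be applied roughly as follows. This is an illustrative sketch, not code from the package: `renderOrder` and `compareIds` are hypothetical names, and comparing the numeric suffix of ids such as `F2` and `F10` is an assumption about what "`id` ascending" means.

```typescript
type Severity = "critical" | "high" | "medium" | "low";

interface Finding {
  id: string;
  severity: Severity;
  category: string;
  claim: string;
  evidence: string;
  suggestion?: string;
  source: "implementer" | "reviewer";
}

const SEVERITY_RANK: Record<Severity, number> = { critical: 0, high: 1, medium: 2, low: 3 };

// Assumption: ids look like "F1", "F10"; compare the numeric part so F2 sorts before F10,
// falling back to plain string order when no distinct numbers are present.
function compareIds(a: string, b: string): number {
  const na = parseInt(a.replace(/\D+/g, ""), 10);
  const nb = parseInt(b.replace(/\D+/g, ""), 10);
  if (!isNaN(na) && !isNaN(nb) && na !== nb) return na - nb;
  return a < b ? -1 : a > b ? 1 : 0;
}

function renderOrder(findings: Finding[]): { finding: Finding; lowEvidence: boolean }[] {
  return findings
    .slice() // copy: nothing is dropped and the caller's array is not mutated
    .sort((a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity] || compareIds(a.id, b.id))
    .map((finding) => ({ finding, lowEvidence: finding.evidence.length < 30 }));
}
```

The `lowEvidence` flag only annotates a finding; consistent with rule 1, it never filters one out.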
- ## Best practices
-
- This skill is one step in the larger flow described in `multi-model-agent` → "Best practices". Recipes that involve `mma-audit`:
-
- - **Recipe A — Audit-iterate-clean.** `mma-audit` → fix → `mma-audit` again. Sequential rounds. Register the doc via `mma-context-blocks` before round 1 and reuse the same ID across all rounds — avoids re-inlining the same content into every audit call.
-
- - **Recipe E — Plan-validate-execute.** Before any `mma-execute-plan` batch, run `mma-audit` with `subtype: 'plan'` on the plan file. Read the findings. If any `critical` / `high` finding survives, fix the plan and re-audit. This catches the bug class where the plan's named methods/files don't actually exist in the codebase — symbols a prose-coherence audit cannot see.
-
- - **Recipe F — Spec-then-plan-then-execute (the canonical flow).** When working from a brainstorming spec: `mma-audit` (`subtype: 'spec'`) → fix → `writing-plans` → register the spec as a context block via `mma-context-blocks` → `mma-audit` (`subtype: 'plan'`, `contextBlockIds: [specBlockId]`) → fix → `mma-execute-plan`. Spec audit covers requirement-prose executability; plan audit covers BOTH plan-vs-codebase coherence AND plan-vs-spec coverage (perspective 10 fires only when the spec is in context, which is why the context-block step is load-bearing in this recipe).
-
- Anti-pattern alert: **`parallel-rounds-same-target`** (AP1). Three parallel audits on the same document re-flag the same issues without seeing each other's fixes. Run rounds sequentially with a fix between each.
-
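Recipes A and E describe a strictly sequential audit, fix, re-audit loop that stops once no `critical` or `high` finding remains. A minimal sketch of that control flow is shown here. Everything in it is hypothetical: `auditIterateClean`, the `audit` and `fix` callbacks, and the `maxRounds` cap are illustrative stand-ins, and the callbacks are synchronous only to keep the sketch short (a real audit call would be an asynchronous HTTP request followed by polling).

```typescript
type Severity = "critical" | "high" | "medium" | "low";

interface Finding {
  id: string;
  severity: Severity;
}

// True when a round still contains findings the recipes say must be fixed first.
function hasBlockingFindings(findings: Finding[]): boolean {
  return findings.some((f) => f.severity === "critical" || f.severity === "high");
}

// Runs audit -> fix -> audit one round at a time, never in parallel (see AP1).
// `audit` and `fix` are caller-supplied placeholders for the real dispatch and edit steps.
function auditIterateClean(
  audit: (round: number) => Finding[],
  fix: (findings: Finding[]) => void,
  maxRounds: number,
): { rounds: number; ready: boolean } {
  for (let round = 1; round <= maxRounds; round++) {
    const findings = audit(round);
    if (!hasBlockingFindings(findings)) return { rounds: round, ready: true };
    fix(findings); // apply fixes before the next round sees the document
  }
  return { rounds: maxRounds, ready: false };
}
```

Capping the number of rounds is a design choice of this sketch, not something the skill file prescribes; it simply keeps a loop that never converges from running forever.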
- ## Common pitfalls
-
- ❌ **Auditing source code with `mma-audit`**
- The auditor lacks codebase context (no type info, no call-site lookup, no test awareness). Findings are speculative. **Fix:** use `mma-review` — it pulls in surrounding source context and validates against the actual types.
-
- ❌ **Single huge `document` string instead of `filePaths`**
- Inline docs lose the file boundary, so the per-file parallel split degenerates to one worker. **Fix:** save to disk first, pass `filePaths`.
-
- ❌ **Sending the legacy `auditType` field**
- The field was renamed to `subtype` and the value set was narrowed. **Fix:** use `subtype` with one of `default` / `plan` / `spec` / `skill`. For "security only" / "performance only" lenses, put the bias in the free-text prompt — there is no narrow-lens subtype.
-
- ❌ **Re-auditing the same files round after round without delta context**
- Round 2 worker has no idea what round 1 found. **Fix:** register the round 1 findings as a context block (`mma-context-blocks`) and pass `contextBlockIds` to round 2.
-
- ## Terminal context block
-
- Every completed **read-route** task (audit / review / debug / investigate / research) auto-registers a reusable terminal context block containing its report (headline + findings). The block id is returned on each per-task result as **`contextBlockId`**. Write routes (delegate / execute-plan / retry) return `contextBlockId: null` — their record is the commit, not a block. This block is immutable, lives for the session duration, and counts against the project's `maxEntries` quota (default 500).
-
- Use it for delta follow-ups — feed prior results' block ids into a later call's `contextBlockIds`, filtering out nulls:
-
-     contextBlockIds: priorResults.map(r => r.contextBlockId).filter((id) => id !== null)
-
- **Use cases:**
- - Pass round-N audit findings to round N+1 via `contextBlockIds`
- - Feed audit results into a downstream `mma-delegate` fix step
- - Accumulate findings across iterative audit rounds
-
- The block is registered server-side at task completion; no caller action is needed to create it. Delete it explicitly via `DELETE /context-blocks/:id` when no longer needed, or let it expire on session teardown.
-
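In TypeScript, the null-filtering one-liner quoted above yields `(string | null)[]` unless the filter callback is written as a type guard. The following sketch shows one way to get a plain `string[]`; `TaskResult` and `collectContextBlockIds` are hypothetical names, and only the `contextBlockId` field is taken from the skill file.

```typescript
// Hypothetical minimal shape: only the field the skill file names is modelled.
interface TaskResult {
  contextBlockId: string | null;
}

// Collects block ids from prior results, skipping write-route results that carry null.
function collectContextBlockIds(priorResults: TaskResult[]): string[] {
  return priorResults
    .map((r) => r.contextBlockId)
    .filter((id): id is string => id !== null); // type guard narrows away null
}
```

The returned array can be passed as `contextBlockIds` on a follow-up call.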
- ## Outcome semantics
-
- Every task result carries outcome fields that describe the audit's conclusion status:
-
- | Field | Type | Meaning |
- |---|---|---|
- | `findingsOutcome` | `'found' \| 'clean' \| 'not_applicable'` | Answers the question: did the audit uncover issues? |
- | `findingsOutcomeReason` | `string \| null` | When `findingsOutcome` is set, this explains why (e.g. "3 critical findings: broken paths, missing symbols, mismatched signatures" or "Document is clean across all audit criteria"). |
- | `outcomeInferred` | `boolean` | `true` if the system inferred the outcome from findings count; `false` if the auditor explicitly stated it. |
- | `outcomeMalformed` | `boolean` | `true` if the outcome line was malformed and had to be repaired; `false` otherwise. |
-
- ### Enum values
-
- - **`found`** — the audit surfaced one or more issues (findings) in the artifact across one or more criteria. This indicates the artifact needs rework before downstream use.
- - **`clean`** — the audit completed and found zero issues. The artifact is clear across all audit criteria and ready for downstream use.
- - **`not_applicable`** — the audit could not proceed (e.g., wrong input type, missing preconditions, or system error). This is rare; most audits resolve to `found` or `clean`.
-
- ### Empty findings ≠ failure
-
- A crucial semantic: **empty findings does NOT mean `completed: false` or a failed task.** Finding nothing wrong is a successful audit outcome — it means the document passed the bar. An audit with zero findings is `completed: true` with `findingsOutcome: 'clean'`.
-
- ### Per-route legal outcomes
-
- The legal outcomes for this route are: `['found', 'clean']`
-
- - **`found`** — one or more issues were detected across the audit criteria.
- - **`clean`** — zero issues were detected; the artifact is ready for downstream use.
-
- The outcome `not_applicable` is not legal for `mma-audit` (except on actual precondition failures) because an audit always produces a verdict: either issues found or clean.
-
270
- @include _shared/error-handling.md
@@ -1,148 +0,0 @@
1
- ---
2
- name: mma-context-blocks
3
- description: >-
4
- Use when a document larger than ~2 KB will be referenced by 2+ subsequent
5
- mma-* calls — register once, pass the returned ID to each call instead of
6
- re-uploading the same content. OR a spec / plan / error log was already
7
- inlined into one task and is about to be inlined into a second — register on
8
- the second reference, never the third.
9
- when_to_use: >-
10
- A document (spec, plan, codebase summary, prior round's findings, error log)
11
- larger than ~2 KB will be referenced by two or more mma-* calls in a row.
12
- Register once here, then pass the ID via `contextBlockIds` on mma-delegate /
13
- mma-execute-plan / mma-audit / mma-review / mma-debug / mma-investigate.
14
- Cheaper and faster than inlining the same content N times.
15
- version: 4.9.1
16
- ---
17
-
18
- # mma-context-blocks
19
-
20
- ## Overview
21
-
22
- Store large documents once; reference them by ID in subsequent `mma-*` calls via `contextBlockIds`. The service prepends the block content to each task prompt that references the ID — content is transmitted ONCE to the daemon, then reused server-side.
23
-
24
- **Core principle:** Without context blocks, the same document is sent N times for N tasks. Blocks transmit once. The savings compound on shared specs, prior-round findings, and codebase summaries.
25
-
26
- ## When to Use
27
-
28
- **Use when:**
29
- - A doc >2 KB will be referenced by ≥2 mma-* calls
30
- - You're running iterative audit/review rounds (round 2 references round 1's findings)
31
- - A spec or design doc is the shared input across N parallel tasks
32
- - A long error log is the context for debug + delegate calls
33
-
34
- **Don't use when:**
35
- - The doc is <2 KB and used once → just inline it (registration overhead exceeds savings)
36
- - The doc changes between calls → context blocks are immutable; register a new one
37
- - Single task that doesn't reference any large shared content → no benefit
38
-
39
- ## Endpoints
40
-
41
- ### Register a context block
42
-
43
- `POST /context-blocks?cwd=<abs-path>`
44
-
45
- @include _shared/auth.md
46
-
47
- #### Request body
48
-
49
- ```json
50
- {
51
- "content": "# Project spec\n...",
52
- "ttlMs": 3600000
53
- }
54
- ```
55
-
56
- | Field | Type | Required | Notes |
57
- |---|---|---|---|
58
- | `content` | string | yes | Document content (min 1 char, max 50 MiB) |
59
- | `ttlMs` | number | no | Time-to-live in ms; omit for idle-expiry (default 24 h idle). A block that is not referenced by any active batch for 24 h is eligible for eviction. |
60
-
61
- #### Response (201)
62
-
63
- ```json
64
- { "id": "cb_abc123" }
65
- ```
66
-
67
- Use this `id` as a `contextBlockIds` entry in any `mma-*` skill that supports it.
68
-
69
- ### Delete a context block
70
-
71
- `DELETE /context-blocks/:id?cwd=<abs-path>`
72
-
73
- Returns `200 { ok: true }` on success. Returns `409 pinned` if the block is held by one or more active batches — wait for those batches to complete before deleting.
74
-
75
- ## Full example
76
-
77
- ```bash
78
- # Register spec document once
79
- ID=$(curl -f --show-error -s -X POST \
80
- -H "X-MMA-Client: $MMA_CLIENT" \
81
- -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
82
- -H "Authorization: Bearer $TOKEN" \
83
- -H "Content-Type: application/json" \
84
- -d "{\"content\":$(jq -Rs . < /project/docs/spec.md)}" \
85
- "http://localhost:$PORT/context-blocks?cwd=/project" | jq -r '.id')
86
-
87
- # Reference from N delegate tasks
88
- curl -f --show-error -s -X POST \
89
- -H "X-MMA-Client: $MMA_CLIENT" \
90
- -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
91
- -H "Authorization: Bearer $TOKEN" \
92
- -H "Content-Type: application/json" \
93
- -d "{\"tasks\":[
94
- {\"prompt\":\"Implement section 3 per spec\",\"contextBlockIds\":[\"$ID\"]},
95
- {\"prompt\":\"Implement section 4 per spec\",\"contextBlockIds\":[\"$ID\"]}
96
- ]}" \
97
- "http://localhost:$PORT/delegate?cwd=/project"
98
- ```
99
-
100
- ## v5 wire shape (register-context-block route)
101
-
102
- Every task result is a `ComposePayload`. For the `register-context-block` route, the envelope has one additional field beyond the standard seven:
103
-
104
- ```json
105
- {
106
- "completed": true,
107
- "message": "Context block cb_abc123 registered (12345 bytes)",
108
- "findings": [],
109
- "summary": "",
110
- "filesChanged": [],
111
- "commitSha": null,
112
- "blockId": "cb_abc123",
113
- "telemetry": { ... }
114
- }
115
- ```
116
-
117
- `blockId` is **non-null only for the `register-context-block` route**. For every other route (`delegate`, `execute-plan`, `investigate`, etc.), `blockId` is `null`. This is the only signal that distinguishes a register-context-block result from any other route — no route-keyed discriminated union, just one extra nullable field on the shared shape.
118
-
119
- The terminal context block (per-task, auto-registered) uses a different ID format and is separate from the `blockId` in the wire envelope.
120
-
121
- ## Best practices
122
-
123
- This skill is the cross-cutting state mechanism described in `multi-model-agent` → "Best practices". Recipes that use context blocks:
124
-
125
- - **Recipe A — Audit-iterate-clean.** Register the doc once before round 1; pass round-N's findings block ID into round N+1.
126
- - **Recipe B — Debug-fix-verify.** Register the failing test output / reproduction log before the debug call; reuse on verify.
127
- - **Recipe C — Investigate-plan-execute.** Register the plan file before `mma-execute-plan`.
128
- - **Recipe D — Plan-execute-retry.** No new registration needed — `mma-retry` inherits the original batch's `contextBlockIds`.
129
-
130
- Anti-pattern alert: **`re-inlined-shared-content`** (AP3). Pasting the same spec into 5 task prompts costs N× tokens. Register once; pass `contextBlockIds`.
131
-
132
- ## Common pitfalls
133
-
134
- ❌ **Inlining the same 50KB spec into every task prompt**
135
- > tasks: [{prompt: "Implement section 3:\n[50KB spec]"}, {prompt: "Implement section 4:\n[50KB spec]"}]
136
-
137
- N×50KB transmissions; main context burns through tokens. **Fix:** register the spec once, pass `contextBlockIds: ["cb_xxx"]` to each task.
138
-
139
- ❌ **Forgetting to delete unused blocks**
140
- Blocks count against the project's context-block quota (`maxEntries` 500). **Fix:** explicitly `DELETE` after the dependent batches finish — or let idle expiry (24 h) evict them.
141
-
142
- ❌ **Trying to update a block's content**
143
- Blocks are immutable. **Fix:** register a new block with the new content; switch the `contextBlockIds` to the new ID.
144
-
145
- ❌ **Deleting a block while a batch still references it**
146
- Returns `409 pinned`. **Fix:** poll the dependent batches to terminal first, then delete.
147
-
148
- @include _shared/error-handling.md
@@ -1,208 +0,0 @@
1
- ---
2
- name: mma-debug
3
- description: >-
4
- Use when a test fails, a build breaks, or behavior is unexpected AND narrowing
5
- the root cause requires reading files, reproducing the failure, or tracing
6
- across multiple modules — the worker investigates so the main agent stays on
7
- the hypothesis
8
- when_to_use: >-
9
- A failure has surfaced (test/build/runtime) AND you need investigation work —
10
- read files, reproduce, trace — OR a methodology skill
11
- (superpowers:systematic-debugging) points at the investigation step. Delegate
12
- the read/reproduce/trace; the main agent stays on the hypothesis and the fix.
13
- version: 4.9.1
14
- ---
15
-
16
- # mma-debug
17
-
18
- ## Overview
19
-
20
- Submit a problem, context, and hypothesis to a worker for focused debugging. Unlike `mma-audit` and `mma-review`, all `filePaths` are investigated TOGETHER in a single task (not parallelized per file) — debugging needs cross-file reasoning.
21
-
22
- **Core principle:** The hypothesis is judgment (your job). Reading files and reproducing the failure is labor (the worker's job). Pass the hypothesis as input; receive structured findings.
23
-
24
- ## When to Use
25
-
26
- **Use when:**
27
- - A test fails / build breaks / runtime behavior is unexpected
28
- - The root cause likely spans 2+ files
29
- - You have a hypothesis to test (or want the worker to suggest one)
30
- - A methodology skill (`superpowers:systematic-debugging`) routed here
31
-
32
- **Don't use when:**
33
- - The error message points at one file you can read in 30 seconds → just `Read`
34
- - You don't know what's broken yet → use `mma-investigate` first to map the area
35
- - You already know the fix → skip debug, dispatch `mma-delegate` with the fix
36
-
37
- ## Endpoint
38
-
39
- `POST /debug?cwd=<abs-path>`
40
-
41
- @include _shared/auth.md
42
-
43
- ## Request body
44
-
45
- ```json
46
- {
47
- "problem": "POST /login returns 500 when password contains special characters",
48
- "context": "Regression introduced in commit abc123; only affects production config",
49
- "hypothesis": "The bcrypt binding fails on non-ASCII input in the Docker image",
50
- "subtype": "default",
51
- "filePaths": [
52
- "/project/src/auth/login.ts",
53
- "/project/src/auth/password.ts"
54
- ],
55
- "contextBlockIds": []
56
- }
57
- ```
58
-
59
- | Field | Type | Required | Notes |
60
- |---|---|---|---|
61
- | `problem` | string | yes | What is broken (one sentence; concrete symptom) |
62
- | `context` | string | no | Background — what changed recently, what works, what doesn't |
63
- | `hypothesis` | string | no | Your initial theory; worker tests it first, then explores |
64
- | `subtype` | `'default'` | no (defaults to `'default'`) | Reserved for future criteria sets; only `default` is wired today. |
65
- | `filePaths` | string[] | no | All files investigated together (cross-file reasoning) |
66
- | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` (e.g. error logs, traces) |
67
-
68
- > Worker tier for `mma-debug` is hardcoded to `complex` and is not caller-configurable. Sending `agentType` is rejected with HTTP 400.
69
-
70
- ## Full example
71
-
72
- ```bash
73
- BATCH=$(curl -f --show-error -s -X POST \
74
- -H "X-MMA-Client: $MMA_CLIENT" \
75
- -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
76
- -H "Authorization: Bearer $TOKEN" \
77
- -H "Content-Type: application/json" \
78
- -d '{"problem":"Tests fail on CI only","hypothesis":"Missing env var","filePaths":["/project/src/config.ts"]}' \
79
- "http://localhost:$PORT/debug?cwd=/project")
80
- BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
81
- ```
82
-
83
- @include _shared/polling.md
84
-
85
- @include _shared/response-shape.md
86
-
87
- ## Reading the findings
88
-
89
- The main agent reads `completed` + `message` + `findings` — the findings are the answer. For
90
- read-only routes, `filesChanged` is always `[]` and `commitSha` is always `null`.
91
-
92
- ```json
93
- {
94
- "completed": true,
95
- "message": "Investigation complete; 1 finding.",
96
- "findings": [
97
- { "id": "F1", "severity": "high", "category": "root-cause",
98
- "claim": "bcrypt binding fails on non-ASCII input in the Docker image.",
99
- "evidence": "Worker reproduced the failure with `pass='café'`; strace shows EINVAL on encode call.",
100
- "suggestion": "Normalize input to NFC form before calling bcrypt.",
101
- "source": "implementer" }
102
- ],
103
- "filesChanged": [],
104
- "commitSha": null,
105
- "summary": "...",
106
- "telemetry": { ... }
107
- }
108
- ```
109
-
110
- ### Finding shape
111
-
112
- Every finding has this shape:
113
-
114
- | Field | Type | Notes |
115
- |---|---|---|
116
- | `id` | string | Worker-assigned, e.g. `F1`, `F2`. Stable across chain. |
117
- | `severity` | `'critical' \| 'high' \| 'medium' \| 'low'` | 4-tier. |
118
- | `category` | string | Topical bucket, e.g. `root-cause`, `reproduction`. |
119
- | `claim` | string | One-sentence summary. |
120
- | `evidence` | string ≥20 chars | Verbatim from source when grounded. |
121
- | `suggestion?` | string | Optional fix recommendation. |
122
- | `source` | `'implementer' \| 'reviewer'` | Who produced the finding. |
123
-
124
- `annotatorConfidence` and `evidenceGrounded` are retired — they were v4 fields with no producers.
125
-
126
- ### Recommended rendering by the main agent
127
-
128
- 1. Show ALL findings — never silently drop. Severity and grounding are soft
129
- signals, not gates.
130
- 2. Default sort: severity (critical → low), then `id` ascending.
131
- 3. `severity` is the authoritative value — use it directly.
132
- 4. Mark findings with `evidence` shorter than 30 chars as "low-evidence"
133
- (lighter color or `(low evidence)` annotation). User decides what to do.
134
- 5. Severity-tier counts feed the dashboard.
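The rendering rules above can be sketched in TypeScript as follows. The `Finding` type mirrors the finding-shape table; `orderFindings`, `isLowEvidence`, and `countBySeverity` are hypothetical names used for illustration only. Comparing ids by their trailing number (so `F2` sorts before `F10`) is an assumption about what "ascending" means for ids like `F1`, `F2`.

```typescript
type Severity = "critical" | "high" | "medium" | "low";

interface Finding {
  id: string;
  severity: Severity;
  category: string;
  claim: string;
  evidence: string;
  suggestion?: string;
  source: "implementer" | "reviewer";
}

const SEVERITY_RANK: Record<Severity, number> = {
  critical: 0,
  high: 1,
  medium: 2,
  low: 3,
};

// Assumption: ids look like "F1", "F10"; compare by trailing number when both have one.
function compareIds(a: string, b: string): number {
  const ma = /(\d+)$/.exec(a);
  const mb = /(\d+)$/.exec(b);
  if (ma && mb) {
    const diff = parseInt(ma[1], 10) - parseInt(mb[1], 10);
    if (diff !== 0) return diff;
  }
  return a < b ? -1 : a > b ? 1 : 0;
}

// Rule 2: sort by severity (critical first), then by id ascending.
// Rule 1: nothing is filtered out; every finding is returned.
function orderFindings(findings: Finding[]): Finding[] {
  return findings
    .slice()
    .sort(
      (a, b) =>
        SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity] ||
        compareIds(a.id, b.id)
    );
}

// Rule 4: evidence shorter than 30 characters is annotated, not dropped.
function isLowEvidence(f: Finding): boolean {
  return f.evidence.length < 30;
}

// Rule 5: per-tier counts for a dashboard.
function countBySeverity(findings: Finding[]): Record<Severity, number> {
  const counts: Record<Severity, number> = { critical: 0, high: 0, medium: 0, low: 0 };
  for (const f of findings) counts[f.severity] += 1;
  return counts;
}
```

The sort works on a copy, so the original `findings` array from the task result is left untouched.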
135
-
136
- ## Best practices
137
-
138
- This skill is one step in the larger flow described in `multi-model-agent` → "Best practices". Recipes that involve `mma-debug`:
139
-
140
- - **Recipe B — Debug-fix-review.** `mma-debug` → `mma-delegate` (apply fix) → `mma-review` with the acceptance criteria in the brief. Strict order. Register the failing test output / reproduction log as a context block before the debug call; reuse it on the review call.
141
-
142
- Anti-pattern alert: **`inline-labor-leakage`** (AP2). If you're about to read 3+ files in main context to "understand the bug," that's the labor we delegate — call `mma-debug` with the hypothesis instead.
143
-
144
- ## Common pitfalls
145
-
146
- ❌ **Vague `problem`**
147
- > "The login is broken"
148
-
149
- Worker has no symptom to chase. **Fix:** specific reproducer — `"POST /login with body {user:'a@b.c', pass:'café'} returns 500 with 'invalid character' in stderr"`.
150
-
151
- ❌ **No `hypothesis`**
152
- The worker explores blindly and often investigates the wrong area first. **Fix:** even a weak hypothesis ("might be encoding-related") narrows the search space.
153
-
154
- ❌ **Splitting one bug across multiple `mma-debug` calls**
155
- Debug intentionally bundles `filePaths` for cross-file reasoning. Splitting defeats this. **Fix:** one call with all suspect files; if you really have N independent failures, use `mma-delegate` with N tasks.
156
-
157
- ❌ **Treating `mma-debug` as the fix step**
158
- Debug investigates and proposes; it doesn't necessarily write the fix. **Fix:** if the worker identifies a fix, dispatch `mma-delegate` to implement it (or write it inline if you understand it).
159
-
160
- ❌ **Skipping when an error message looks self-explanatory**
161
- Often the obvious cause isn't the real one. **Fix:** a 30-second debug pass costs less than a wrong fix that breaks something else.
162
-
163
- ## Terminal context block
164
-
165
- Every completed **read-route** task (audit / review / debug / investigate / research) auto-registers a reusable terminal context block containing its report (headline + findings). The block id is returned on each per-task result as **`contextBlockId`**. Write routes (delegate / execute-plan / retry) return `contextBlockId: null` — their record is the commit, not a block. This block is immutable, lives for the session duration, and counts against the project's `maxEntries` quota (default 500).
166
-
167
- Use it for delta follow-ups — feed prior results' block ids into a later call's `contextBlockIds`, filtering out nulls:
168
-
169
- contextBlockIds: priorResults.map(r => r.contextBlockId).filter((id) => id !== null)
170
-
171
- **Use cases:**
172
- - Pass debug findings to a downstream `mma-delegate` fix step
173
- - Feed the root-cause analysis into a follow-up `mma-review` with acceptance criteria in the brief
174
- - Carry debug context forward through the debug → fix → review chain
175
-
176
- The block is registered server-side at task completion; no caller action is needed to create it. Delete it explicitly via `DELETE /context-blocks/:id` when no longer needed, or let it expire on session teardown.
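The one-line filter shown earlier in this section can be written as a typed helper. This is a minimal sketch: `TaskResult` models only the `contextBlockId` field, `collectContextBlockIds` is a hypothetical name, and the de-duplication step is an addition that the original text does not require.

```typescript
// Only the field relevant here is modeled; real task results carry more.
interface TaskResult {
  contextBlockId: string | null;
}

function collectContextBlockIds(priorResults: TaskResult[]): string[] {
  const ids = priorResults
    .map((r) => r.contextBlockId)
    // The type guard narrows (string | null)[] to string[].
    .filter((id): id is string => id !== null);
  // Assumption: sending the same id twice is pointless, so keep first occurrences only.
  return ids.filter((id, i) => ids.indexOf(id) === i);
}
```

Write-route results (which return `contextBlockId: null`) drop out of the list, so the result can be passed directly as `contextBlockIds` on a follow-up call.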
177
-
178
- ## Outcome semantics
179
-
180
- Every task result carries outcome fields that describe the debugging investigation's conclusion status:
181
-
182
- | Field | Type | Meaning |
183
- |---|---|---|
184
- | `findingsOutcome` | `'found' \| 'clean' \| 'not_applicable'` | Answers the question: did the investigation identify a root cause? |
185
- | `findingsOutcomeReason` | `string \| null` | When `findingsOutcome` is set, this explains why (e.g. "Root cause identified with high confidence: bcrypt binding fails on non-ASCII input" or "No evidence supports the hypothesis; root cause remains unknown"). |
186
- | `outcomeInferred` | `boolean` | `true` if the system inferred the outcome from findings count; `false` if the investigator explicitly stated it. |
187
- | `outcomeMalformed` | `boolean` | `true` if the outcome line was malformed and had to be repaired; `false` otherwise. |
188
-
189
- ### Enum values
190
-
191
- - **`found`** — the investigation identified one or more root-cause hypotheses (findings) with supporting evidence. This indicates the problem has a diagnosed cause.
192
- - **`clean`** — the investigation completed but found zero root causes. This is rare for debug and indicates the failure remains unexplained despite thorough investigation.
193
- - **`not_applicable`** — the investigation could not proceed (e.g., inability to reproduce the failure, missing context, or out of scope). This is the "unable to diagnose" state.
194
-
195
- ### Empty findings ≠ failure
196
-
197
- A crucial semantic: **empty findings do NOT mean `completed: false` or a failed debug session.** An investigation that proceeds thoroughly and produces zero root-cause candidates is a valid `completed: true` outcome; it means "I looked hard and found nothing." For debug, this usually surfaces as a `not_applicable` outcome (the root cause is elsewhere), but zero findings is still a success.
198
-
199
- ### Per-route legal outcomes
200
-
201
- The legal outcomes for this route are: `['found', 'not_applicable']`
202
-
203
- - **`found`** — one or more root-cause hypotheses were identified across the investigation criteria.
204
- - **`not_applicable`** — the failure could not be diagnosed (reproduction failed, wrong area, or scope issue).
205
-
206
- The outcome `clean` (zero findings + success) is not legal for `mma-debug` because a debug session always either identifies a root cause or cannot proceed.
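A caller that wants to sanity-check results against the per-route legal outcomes could use a lookup like the one below. This is a sketch only: the route keys `audit` and `debug` and the names `LEGAL_OUTCOMES` and `isLegalOutcome` are illustrative, and the audit entry does not model the precondition-failure exception noted for `mma-audit`.

```typescript
type FindingsOutcome = "found" | "clean" | "not_applicable";
type Route = "audit" | "debug";

// Legal outcome sets as listed in the "Per-route legal outcomes" sections.
const LEGAL_OUTCOMES: Record<Route, FindingsOutcome[]> = {
  audit: ["found", "clean"],
  debug: ["found", "not_applicable"],
};

function isLegalOutcome(route: Route, outcome: FindingsOutcome): boolean {
  return LEGAL_OUTCOMES[route].indexOf(outcome) !== -1;
}
```

An unexpected outcome is better surfaced to the user than silently dropped, in keeping with the "show ALL findings" guidance.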
207
-
208
- @include _shared/error-handling.md