@zhixuan92/multi-model-agent 3.3.0 → 3.5.0

Files changed (38)
  1. package/README.md +76 -33
  2. package/dist/http/canonicalize-file-paths.d.ts +8 -0
  3. package/dist/http/canonicalize-file-paths.d.ts.map +1 -0
  4. package/dist/http/canonicalize-file-paths.js +43 -0
  5. package/dist/http/canonicalize-file-paths.js.map +1 -0
  6. package/dist/http/execution-context.d.ts.map +1 -1
  7. package/dist/http/execution-context.js +0 -14
  8. package/dist/http/execution-context.js.map +1 -1
  9. package/dist/http/handlers/tools/execute-plan.d.ts.map +1 -1
  10. package/dist/http/handlers/tools/execute-plan.js +21 -3
  11. package/dist/http/handlers/tools/execute-plan.js.map +1 -1
  12. package/dist/http/handlers/tools/investigate.d.ts +4 -0
  13. package/dist/http/handlers/tools/investigate.d.ts.map +1 -0
  14. package/dist/http/handlers/tools/investigate.js +81 -0
  15. package/dist/http/handlers/tools/investigate.js.map +1 -0
  16. package/dist/http/server.d.ts.map +1 -1
  17. package/dist/http/server.js +5 -2
  18. package/dist/http/server.js.map +1 -1
  19. package/dist/install/discover.d.ts +1 -1
  20. package/dist/install/discover.d.ts.map +1 -1
  21. package/dist/install/discover.js +1 -0
  22. package/dist/install/discover.js.map +1 -1
  23. package/dist/openapi.d.ts.map +1 -1
  24. package/dist/openapi.js +6 -0
  25. package/dist/openapi.js.map +1 -1
  26. package/dist/skills/_shared/verify-and-review.md +12 -0
  27. package/dist/skills/mma-audit/SKILL.md +45 -18
  28. package/dist/skills/mma-clarifications/SKILL.md +73 -29
  29. package/dist/skills/mma-context-blocks/SKILL.md +56 -24
  30. package/dist/skills/mma-debug/SKILL.md +54 -22
  31. package/dist/skills/mma-delegate/SKILL.md +58 -26
  32. package/dist/skills/mma-execute-plan/SKILL.md +55 -29
  33. package/dist/skills/mma-investigate/SKILL.md +137 -0
  34. package/dist/skills/mma-retry/SKILL.md +65 -22
  35. package/dist/skills/mma-review/SKILL.md +49 -20
  36. package/dist/skills/mma-verify/SKILL.md +49 -18
  37. package/dist/skills/multi-model-agent/SKILL.md +84 -46
  38. package/package.json +2 -2
@@ -0,0 +1,12 @@
1
+ ### `verifyCommand` — local verification after each task
2
+
3
+ Set when the worker can run a deterministic local check after editing — `npm test`, `npm run lint`, a focused package test. Commands run in order; each must be non-empty after trimming. Output is fed back to the reviewer. Omit when no reliable command exists.
4
+
5
+ ### `reviewPolicy` — review lifecycle per task
6
+
7
+ | Value | Behavior | Use when |
8
+ |---|---|---|
9
+ | `"full"` | Spec review + quality review (default) | Default for new code or risky edits |
10
+ | `"spec_only"` | Spec review only | Quality review unnecessary (mechanical work) |
11
+ | `"diff_only"` | Single-pass review of the produced diff | Cheap mechanical refactors (file moves, renames, import-path updates) |
12
+ | `"off"` | Skip review entirely | Trusted low-risk tasks where `verifyCommand` is enough |
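A pre-flight check for the `verifyCommand` rule above (each entry must be non-empty after trimming) might look like the sketch below. The function name `is_valid_verify_command` is hypothetical and not part of the package; the service's own validation is not shown in this diff.

```shell
# Hypothetical pre-flight check; NOT the service's validator.
# Returns 0 if the command is non-empty after trimming whitespace, 1 otherwise.
is_valid_verify_command() {
  # Deleting all whitespace leaves an empty string exactly when the entry is blank.
  trimmed=$(printf '%s' "$1" | tr -d '[:space:]')
  [ -n "$trimmed" ]
}

# Walk a candidate verifyCommand list in order, as the service is described to do.
for cmd in "npm test" "   " "npm run lint"; do
  if is_valid_verify_command "$cmd"; then
    echo "ok: $cmd"
  else
    echo "rejected: blank entry"
  fi
done
```

Running the check before dispatch avoids a round trip for a request that the rule above says would be rejected.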
@@ -1,30 +1,45 @@
1
1
  ---
2
2
  name: mma-audit
3
3
  description: >-
4
- Audit a document, spec, or config for security, performance, correctness, or
5
- style issues via the local mmagent HTTP service. Sub-agents run in parallel
6
- per file, with no context pollution in the main model.
4
+ Use when the user asks to audit a document, spec, config, or PR description
5
+ for security, correctness, performance, or style issues and the audit can
6
+ run in parallel per file with no context pollution
7
7
  when_to_use: >-
8
- The user asks to audit a document, spec, or config (for security, correctness,
9
- performance, or style) OR a methodology skill
10
- (superpowers:dispatching-parallel-agents, /security-review) points at an audit
11
- task. Delegate via mmagent so the audit runs on independent workers — your
12
- main context stays free to synthesize findings.
13
- version: 3.3.0
8
+ User asks for a doc/spec/config audit OR a methodology skill
9
+ (superpowers:dispatching-parallel-agents, /security-review) points at one AND
10
+ mmagent is running. Delegate so each file audits on its own worker; the main
11
+ agent only synthesizes findings. Audit is for PROSE/SPEC docs; use mma-review for
12
+ source code.
13
+ version: 3.5.0
14
14
  ---
15
15
 
16
- ## mma-audit
16
+ # mma-audit
17
17
 
18
- Send a document or set of files to sub-agents for structured auditing. Each
19
- file is audited independently in parallel; results are indexed by file.
18
+ ## Overview
20
19
 
21
- ### Endpoint
20
+ Send a document or set of files to workers for structured auditing. Each file is audited independently in parallel; per-file results are indexed by path in the terminal envelope.
21
+
22
+ **Core principle:** One worker per file = no cross-file context pollution. The aggregator (you) decides what to do with the findings.
23
+
24
+ ## When to Use
25
+
26
+ **Use when:**
27
+ - A spec / design doc / API contract / config file needs a critical read
28
+ - The audit type is `security`, `performance`, `correctness`, or `style` (or a combination)
29
+ - 2+ files would benefit from parallel audit
30
+
31
+ **Don't use when:**
32
+ - The thing being audited is source code → `mma-review` (knows about types, call sites, test coverage)
33
+ - You want a quick look ("does this look right?") → just `Read` and use your judgment
34
+ - The doc references many other files the auditor must cross-reference → consider `mma-review` instead (it pulls in source context)
35
+
36
+ ## Endpoint
22
37
 
23
38
  `POST /audit?cwd=<abs-path>`
24
39
 
25
40
  @include _shared/auth.md
26
41
 
27
- ### Request body
42
+ ## Request body
28
43
 
29
44
  ```json
30
45
  {
@@ -39,12 +54,12 @@ file is audited independently in parallel; results are indexed by file.
39
54
  |---|---|---|---|
40
55
  | `document` | string | no | Inline document content |
41
56
  | `auditType` | string \| string[] | yes | `security`, `performance`, `correctness`, `style`, or `general`; or an array of the first four |
42
- | `filePaths` | string[] | no | Files to audit (parallel) |
57
+ | `filePaths` | string[] | no | Files to audit (one worker per file, parallel) |
43
58
  | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |
44
59
 
45
60
  Either `document` or `filePaths` (or both) must be provided.
46
61
 
47
- ### Full example
62
+ ## Full example
48
63
 
49
64
  ```bash
50
65
  BATCH=$(curl -f --show-error -s -X POST \
@@ -55,10 +70,22 @@ BATCH=$(curl -f --show-error -s -X POST \
55
70
  BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
56
71
  ```
57
72
 
58
- Then poll until complete:
59
-
60
73
  @include _shared/polling.md
61
74
 
62
75
  @include _shared/response-shape.md
63
76
 
77
+ ## Common pitfalls
78
+
79
+ ❌ **Auditing source code with `mma-audit`**
80
+ The auditor lacks codebase context (no type info, no call-site lookup, no test awareness). Findings are speculative. **Fix:** use `mma-review` — it pulls in surrounding source context and validates against the actual types.
81
+
82
+ ❌ **Single huge `document` string instead of `filePaths`**
83
+ Inline docs lose the file boundary, so the per-file parallel split degenerates to one worker. **Fix:** save to disk first, pass `filePaths`.
84
+
85
+ ❌ **Asking for `auditType: "general"` when you mean something specific**
86
+ `"general"` is a catch-all that produces watery findings. **Fix:** pick the dimension you actually care about (`"correctness"` for spec gaps, `"security"` for threat models, etc.).
87
+
88
+ ❌ **Re-auditing the same files round after round without delta context**
89
+ The round 2 worker has no idea what round 1 found. **Fix:** register the round 1 findings as a context block (`mma-context-blocks`) and pass `contextBlockIds` to round 2.
90
+
64
91
  @include _shared/error-handling.md
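The `auditType` rule in the field table (one of five values, or an array drawn from the first four) can be checked client-side before posting. This is a minimal sketch; `valid_audit_type` and its `single`/`array` argument are illustrative names, not part of mmagent.

```shell
# Hypothetical client-side check mirroring the field table; NOT the service's validator.
# $1 = "single" (auditType is a string) or "array" (auditType is a string[])
# $2 = one audit type name
valid_audit_type() {
  case "$2" in
    security|performance|correctness|style) return 0 ;;
    # "general" is only accepted as a single value, never as an array member.
    general) [ "$1" = "single" ] ;;
    *) return 1 ;;
  esac
}

if valid_audit_type array general; then
  echo "accepted"
else
  echo "rejected: general cannot appear inside an array"
fi
```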
@@ -1,33 +1,65 @@
1
1
  ---
2
2
  name: mma-clarifications
3
3
  description: >-
4
- Confirm or correct mmagent's proposed interpretation when a batch is awaiting
5
- clarification before it can proceed. Paired skill to every mma-* task
6
- dispatcher.
4
+ Use when a previous mma-* batch's terminal envelope has
5
+ `proposedInterpretation` as a string (not the `not_applicable` sentinel): the
6
+ service paused waiting for you to confirm or correct its read of the task
7
7
  when_to_use: >-
8
- A previous mma-delegate / mma-audit / mma-review / mma-execute-plan / etc.
9
- terminal envelope has `proposedInterpretation` as a string (not a
10
- NotApplicable sentinel). Read the proposal and call this skill to accept or
11
- correct it. The batch resumes after the POST returns.
12
- version: 3.3.0
8
+ A previous mma-delegate / mma-audit / mma-review / mma-execute-plan /
9
+ mma-debug / mma-investigate terminal envelope has `proposedInterpretation` as
10
+ a string. Read the proposal, decide whether to accept or correct it, then call
11
+ this skill. The batch resumes immediately after the POST returns.
12
+ version: 3.5.0
13
13
  ---
14
14
 
15
- ## mma-clarifications
15
+ # mma-clarifications
16
16
 
17
- When a batch pauses with `state: 'awaiting_clarification'`, the service has
18
- proposed an interpretation of the task and is waiting for your decision.
19
- Read the proposal, then call `POST /clarifications/confirm` to either accept
20
- or correct it. The batch resumes immediately after confirmation.
17
+ ## Overview
21
18
 
22
- ### Endpoint
19
+ When a batch pauses with `state: 'awaiting_clarification'`, the service has proposed an interpretation of an ambiguous task and is waiting for your decision. Read the proposal, then `POST /clarifications/confirm` with either the proposal verbatim (accept) or a corrected version (override). The batch resumes immediately.
20
+
21
+ **Core principle:** Clarification is a quality gate, not an error. Ambiguous tasks would silently produce the wrong work — the pause forces a deliberate choice.
22
+
23
+ ## When to Use
24
+
25
+ ```dot
26
+ digraph when_to_use {
27
+ "Polling a batch?" [shape=diamond];
28
+ "state == awaiting_clarification?" [shape=diamond];
29
+ "proposedInterpretation is a string?" [shape=diamond];
30
+ "Read proposal" [shape=box];
31
+ "Accept or correct" [shape=diamond];
32
+ "POST proposal verbatim" [shape=box];
33
+ "POST corrected text" [shape=box];
34
+
35
+ "Polling a batch?" -> "state == awaiting_clarification?";
36
+ "state == awaiting_clarification?" -> "proposedInterpretation is a string?" [label="yes"];
37
+ "state == awaiting_clarification?" -> "Continue polling" [label="no"];
38
+ "proposedInterpretation is a string?" -> "Read proposal" [label="yes"];
39
+ "Read proposal" -> "Accept or correct";
40
+ "Accept or correct" -> "POST proposal verbatim" [label="proposal is right"];
41
+ "Accept or correct" -> "POST corrected text" [label="proposal is wrong"];
42
+ }
43
+ ```
44
+
45
+ **Use when:**
46
+ - Polling a batch and the terminal envelope has `proposedInterpretation` as a string
47
+ - The mma-* skill that dispatched explicitly references this skill in its "if awaiting_clarification" line
48
+
49
+ **Don't use when:**
50
+ - `proposedInterpretation` is `{ kind: 'not_applicable', ... }` → batch isn't waiting; just read `results`
51
+ - The batch failed (`error` is a real object) → don't confirm; debug or re-dispatch
52
+ - You don't yet have a `batchId` → this skill resumes existing batches, not new ones
53
+
54
+ ## Endpoint
23
55
 
24
56
  `POST /clarifications/confirm`
25
57
 
26
- Auth required. Not cwd-gated (operates on a `batchId`).
58
+ Auth required. NOT cwd-gated; operates on a `batchId`.
27
59
 
28
60
  @include _shared/auth.md
29
61
 
30
- ### Request body
62
+ ## Request body
31
63
 
32
64
  ```json
33
65
  {
@@ -39,38 +71,50 @@ Auth required. Not cwd-gated (operates on a `batchId`).
39
71
  | Field | Type | Required | Notes |
40
72
  |---|---|---|---|
41
73
  | `batchId` | string (UUID) | yes | Batch in `awaiting_clarification` state |
42
- | `interpretation` | string | yes | Accept proposal verbatim or provide a corrected version |
74
+ | `interpretation` | string | yes | Accept proposal verbatim, OR provide corrected text the worker should follow instead |
43
75
 
44
- ### Response (200)
76
+ ## Response (200)
45
77
 
46
78
  ```json
47
79
  { "batchId": "...", "state": "pending" }
48
80
  ```
49
81
 
50
- `state` is usually `pending` (batch resumes). It may be `complete` if the
51
- executor was already waiting and finishes immediately.
82
+ `state` is usually `pending` (batch resumes). May be `complete` if the executor was already waiting and finishes immediately.
52
83
 
53
- ### Full flow
84
+ ## Full flow
54
85
 
55
86
  ```bash
56
- # 1. Poll until awaiting_clarification
57
- STATE=$(curl -f --show-error -s -H "Authorization: Bearer $TOKEN" \
58
- "http://localhost:$PORT/batch/$BATCH_ID" | jq -r '.state')
87
+ # 1. Poll until terminal
88
+ RESP=$(curl -f --show-error -s -H "Authorization: Bearer $TOKEN" \
89
+ "http://localhost:$PORT/batch/$BATCH_ID")
59
90
 
60
- # 2. Read the proposal
61
- PROPOSAL=$(curl -f --show-error -s -H "Authorization: Bearer $TOKEN" \
62
- "http://localhost:$PORT/batch/$BATCH_ID" | jq -r '.proposedInterpretation')
91
+ # 2. Check for a string proposal (not the not_applicable sentinel)
92
+ PROPOSAL=$(echo "$RESP" | jq -r 'select(.proposedInterpretation | type == "string") | .proposedInterpretation')
63
93
 
64
- # 3. Confirm (accept proposal or supply corrected text)
94
+ # 3. Confirm: accept proposal verbatim, or supply corrected text
65
95
  curl -f --show-error -s -X POST \
66
96
  -H "Authorization: Bearer $TOKEN" \
67
97
  -H "Content-Type: application/json" \
68
98
  -d "{\"batchId\":\"$BATCH_ID\",\"interpretation\":\"$PROPOSAL\"}" \
69
99
  "http://localhost:$PORT/clarifications/confirm"
70
100
 
71
- # 4. Resume polling
101
+ # 4. Resume polling for terminal
72
102
  ```
73
103
 
74
104
  @include _shared/polling.md
75
105
 
106
+ ## Common pitfalls
107
+
108
+ ❌ **Confirming a wrong proposal verbatim because "the service knows best"**
109
+ The service is GUESSING from limited context. If the proposal would do the wrong thing, supply corrected `interpretation` text. **Why:** post-confirmation work is hard to undo.
110
+
111
+ ❌ **Treating the pause as an error**
112
+ `awaiting_clarification` is a SUCCESS path — it caught ambiguity before producing wrong work. Read, decide, confirm.
113
+
114
+ ❌ **Forgetting the `batchId` is the original, not a new one**
115
+ This endpoint mutates the existing batch — it does not create a new one. **Fix:** poll the SAME `batchId` after confirming.
116
+
117
+ ❌ **Polling without checking `proposedInterpretation`'s shape**
118
+ The field is either a `string` (paused) or `{ kind: 'not_applicable' }` (terminal). **Fix:** check the JSON type before treating it as text.
119
+
76
120
  @include _shared/error-handling.md
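Step 3 of the full flow interpolates `$PROPOSAL` directly into a JSON string, which would yield invalid JSON if the proposal contains double quotes or newlines. One possible alternative, assuming `jq` is installed (the examples in this file already rely on it), builds the body with `jq -n --arg` so the text is escaped properly. `build_confirm_body` is a hypothetical helper name, not something the package ships.

```shell
# Hypothetical helper: builds the /clarifications/confirm body with proper JSON escaping.
# $1 = batchId, $2 = interpretation text (accepted verbatim or corrected)
build_confirm_body() {
  jq -cn --arg batchId "$1" --arg interpretation "$2" \
    '{batchId: $batchId, interpretation: $interpretation}'
}

if command -v jq >/dev/null 2>&1; then
  # A proposal containing double quotes survives intact.
  build_confirm_body "example-batch-id" 'Rename "foo" to "bar" in src/'
else
  echo "jq is not installed; skipping demo" >&2
fi
```

The result could then replace the hand-built string in step 3, for example `-d "$(build_confirm_body "$BATCH_ID" "$PROPOSAL")"`.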
@@ -1,23 +1,40 @@
1
1
  ---
2
2
  name: mma-context-blocks
3
3
  description: >-
4
- Register large reused documents (spec, plan, codebase summary) as a context
5
- block the mmagent service caches, then reference it by ID across multiple
6
- mma-* calls. Avoids re-uploading the same content on every task.
4
+ Use when a document larger than ~2 KB will be referenced by 2+ subsequent
5
+ mma-* calls: register once, pass the returned ID to each call instead of
6
+ re-uploading the same content
7
7
  when_to_use: >-
8
- A document larger than ~2 KB will be referenced by two or more mma-* calls in
9
- a row. Register once here, then pass the returned ID via the contextBlockIds
10
- field on mma-delegate / mma-execute-plan / mma-audit / mma-review / mma-verify
11
- / mma-debug. Cheaper and faster than inlining the same content in every
12
- request body.
13
- version: 3.3.0
8
+ A document (spec, plan, codebase summary, prior round's findings, error log)
9
+ larger than ~2 KB will be referenced by two or more mma-* calls in a row.
10
+ Register once here, then pass the ID via `contextBlockIds` on mma-delegate /
11
+ mma-execute-plan / mma-audit / mma-review / mma-verify / mma-debug /
12
+ mma-investigate. Cheaper and faster than inlining the same content N times.
13
+ version: 3.5.0
14
14
  ---
15
15
 
16
- ## mma-context-blocks
16
+ # mma-context-blocks
17
17
 
18
- Store large documents once; reference them by ID in subsequent `mma-*` calls
19
- via `contextBlockIds`. The service prepends the block content to each task
20
- prompt that references it.
18
+ ## Overview
19
+
20
+ Store large documents once; reference them by ID in subsequent `mma-*` calls via `contextBlockIds`. The service prepends the block content to each task prompt that references the ID — content is transmitted ONCE to the daemon, then reused server-side.
21
+
22
+ **Core principle:** Without context blocks, the same document is sent N times for N tasks. Blocks transmit once. The savings compound on shared specs, prior-round findings, and codebase summaries.
23
+
24
+ ## When to Use
25
+
26
+ **Use when:**
27
+ - A doc >2 KB will be referenced by ≥2 mma-* calls
28
+ - You're running iterative audit/review rounds (round 2 references round 1's findings)
29
+ - A spec or design doc is the shared input across N parallel tasks
30
+ - A long error log is the context for debug + delegate calls
31
+
32
+ **Don't use when:**
33
+ - The doc is <2 KB and used once → just inline it (registration overhead exceeds savings)
34
+ - The doc changes between calls → context blocks are immutable; register a new one
35
+ - Single task that doesn't reference any large shared content → no benefit
36
+
37
+ ## Endpoints
21
38
 
22
39
  ### Register a context block
23
40
 
@@ -37,7 +54,7 @@ prompt that references it.
37
54
  | Field | Type | Required | Notes |
38
55
  |---|---|---|---|
39
56
  | `content` | string | yes | Document content (min 1 char) |
40
- | `ttlMs` | number | no | Time-to-live in ms; omit for session-scoped |
57
+ | `ttlMs` | number | no | Time-to-live in ms; omit for session-scoped (default 1h) |
41
58
 
42
59
  #### Response (201)
43
60
 
@@ -45,34 +62,49 @@ prompt that references it.
45
62
  { "id": "cb_abc123" }
46
63
  ```
47
64
 
48
- Use this `id` as a `contextBlockIds` entry in `mma-delegate`, `mma-audit`,
49
- `mma-review`, `mma-verify`, `mma-debug`, or `mma-execute-plan`.
65
+ Use this `id` as a `contextBlockIds` entry in any `mma-*` skill that supports it.
50
66
 
51
67
  ### Delete a context block
52
68
 
53
69
  `DELETE /context-blocks/:id?cwd=<abs-path>`
54
70
 
55
- Returns `200 { ok: true }` on success.
56
-
57
- Returns `409 pinned` if the block is held by one or more active batches —
58
- wait for those batches to complete before deleting.
71
+ Returns `200 { ok: true }` on success. Returns `409 pinned` if the block is held by one or more active batches — wait for those batches to complete before deleting.
59
72
 
60
- ### Example
73
+ ## Full example
61
74
 
62
75
  ```bash
63
- # Register spec document
76
+ # Register spec document once
64
77
  ID=$(curl -f --show-error -s -X POST \
65
78
  -H "Authorization: Bearer $TOKEN" \
66
79
  -H "Content-Type: application/json" \
67
80
  -d "{\"content\":$(jq -Rs . < /project/docs/spec.md)}" \
68
81
  "http://localhost:$PORT/context-blocks?cwd=/project" | jq -r '.id')
69
82
 
70
- # Use in a delegate call
83
+ # Reference from N delegate tasks
71
84
  curl -f --show-error -s -X POST \
72
85
  -H "Authorization: Bearer $TOKEN" \
73
86
  -H "Content-Type: application/json" \
74
- -d "{\"tasks\":[{\"prompt\":\"Implement per spec\",\"contextBlockIds\":[\"$ID\"]}]}" \
87
+ -d "{\"tasks\":[
88
+ {\"prompt\":\"Implement section 3 per spec\",\"contextBlockIds\":[\"$ID\"]},
89
+ {\"prompt\":\"Implement section 4 per spec\",\"contextBlockIds\":[\"$ID\"]}
90
+ ]}" \
75
91
  "http://localhost:$PORT/delegate?cwd=/project"
76
92
  ```
77
93
 
94
+ ## Common pitfalls
95
+
96
+ ❌ **Inlining the same 50KB spec into every task prompt**
97
+ > tasks: [{prompt: "Implement section 3:\n[50KB spec]"}, {prompt: "Implement section 4:\n[50KB spec]"}]
98
+
99
+ N×50KB transmissions; main context burns through tokens. **Fix:** register the spec once, pass `contextBlockIds: ["cb_xxx"]` to each task.
100
+
101
+ ❌ **Forgetting to delete short-TTL blocks**
102
+ Blocks count against the project's context-block quota. **Fix:** explicitly `DELETE` after the dependent batches finish — or set a short `ttlMs` so they self-evict.
103
+
104
+ ❌ **Trying to update a block's content**
105
+ Blocks are immutable. **Fix:** register a new block with the new content; switch the `contextBlockIds` to the new ID.
106
+
107
+ ❌ **Deleting a block while a batch still references it**
108
+ Returns `409 pinned`. **Fix:** poll the dependent batches to terminal first, then delete.
109
+
78
110
  @include _shared/error-handling.md
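The "When to Use" rule for context blocks (larger than ~2 KB and referenced by two or more calls) can be expressed as a small decision helper. This is a sketch only: `should_register_block` is a hypothetical name, and the 2048-byte threshold is an assumed reading of "~2 KB".

```shell
# Hypothetical helper; the 2048-byte threshold is an ASSUMED reading of "~2 KB".
# $1 = path to the document, $2 = number of mma-* calls that will reference it
should_register_block() {
  size=$(wc -c < "$1" | tr -d ' ')
  [ "$size" -gt 2048 ] && [ "$2" -ge 2 ]
}

doc=$(mktemp)
printf '%3000s' '' > "$doc"   # 3000 bytes of padding as a stand-in document
if should_register_block "$doc" 3; then
  echo "register once, pass the ID via contextBlockIds"
else
  echo "inline the content"
fi
rm -f "$doc"
```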
@@ -1,31 +1,46 @@
1
1
  ---
2
2
  name: mma-debug
3
3
  description: >-
4
- Debug a failure using a structured hypothesis via the local mmagent HTTP
5
- service. All provided files are investigated together in a single task on a
6
- worker.
4
+ Use when a test fails, a build breaks, or behavior is unexpected AND narrowing
5
+ the root cause requires reading files, reproducing the failure, or tracing
6
+ across multiple modules — the worker investigates so the main agent stays on
7
+ the hypothesis
7
8
  when_to_use: >-
8
- A test fails, a build breaks, or behavior is unexpected AND you need to read
9
- files, reproduce the failure, or narrow root cause OR a methodology skill
9
+ A failure has surfaced (test/build/runtime) AND you need investigation work —
10
+ read files, reproduce, trace OR a methodology skill
10
11
  (superpowers:systematic-debugging) points at the investigation step. Delegate
11
- the read/reproduce/trace work to a mmagent worker so your main context stays
12
- focused on the hypothesis and the fix.
13
- version: 3.3.0
12
+ the read/reproduce/trace; the main agent stays on the hypothesis and the fix.
13
+ version: 3.5.0
14
14
  ---
15
15
 
16
- ## mma-debug
16
+ # mma-debug
17
17
 
18
- Submit a problem, context, and hypothesis to a sub-agent for focused
19
- debugging. Unlike other tools, all `filePaths` are investigated together
20
- in a single task (not parallelised per file).
18
+ ## Overview
21
19
 
22
- ### Endpoint
20
+ Submit a problem, context, and hypothesis to a worker for focused debugging. Unlike `mma-audit` and `mma-review`, all `filePaths` are investigated TOGETHER in a single task (not parallelized per file) — debugging needs cross-file reasoning.
21
+
22
+ **Core principle:** The hypothesis is judgment (your job). Reading files and reproducing the failure is labor (the worker's job). Pass the hypothesis as input; receive structured findings.
23
+
24
+ ## When to Use
25
+
26
+ **Use when:**
27
+ - A test fails / build breaks / runtime behavior is unexpected
28
+ - The root cause likely spans 2+ files
29
+ - You have a hypothesis to test (or want the worker to suggest one)
30
+ - A methodology skill (`superpowers:systematic-debugging`) routed here
31
+
32
+ **Don't use when:**
33
+ - The error message points at one file you can read in 30 seconds → just `Read`
34
+ - You don't know what's broken yet → use `mma-investigate` first to map the area
35
+ - You already know the fix → skip debug, dispatch `mma-delegate` with the fix
36
+
37
+ ## Endpoint
23
38
 
24
39
  `POST /debug?cwd=<abs-path>`
25
40
 
26
41
  @include _shared/auth.md
27
42
 
28
- ### Request body
43
+ ## Request body
29
44
 
30
45
  ```json
31
46
  {
@@ -42,13 +57,13 @@ in a single task (not parallelised per file).
42
57
 
43
58
  | Field | Type | Required | Notes |
44
59
  |---|---|---|---|
45
- | `problem` | string | yes | What is broken |
46
- | `context` | string | no | Background information |
47
- | `hypothesis` | string | no | Initial theory to test |
48
- | `filePaths` | string[] | no | All files investigated together |
49
- | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |
60
+ | `problem` | string | yes | What is broken (one sentence; concrete symptom) |
61
+ | `context` | string | no | Background: what changed recently, what works, what doesn't |
62
+ | `hypothesis` | string | no | Your initial theory; worker tests it first, then explores |
63
+ | `filePaths` | string[] | no | All files investigated together (cross-file reasoning) |
64
+ | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` (e.g. error logs, traces) |
50
65
 
51
- ### Full example
66
+ ## Full example
52
67
 
53
68
  ```bash
54
69
  BATCH=$(curl -f --show-error -s -X POST \
@@ -59,10 +74,27 @@ BATCH=$(curl -f --show-error -s -X POST \
59
74
  BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
60
75
  ```
61
76
 
62
- Then poll until complete:
63
-
64
77
  @include _shared/polling.md
65
78
 
66
79
  @include _shared/response-shape.md
67
80
 
81
+ ## Common pitfalls
82
+
83
+ ❌ **Vague `problem`**
84
+ > "The login is broken"
85
+
86
+ Worker has no symptom to chase. **Fix:** specific reproducer — `"POST /login with body {user:'a@b.c', pass:'café'} returns 500 with 'invalid character' in stderr"`.
87
+
88
+ ❌ **No `hypothesis`**
89
+ The worker explores blindly and often investigates the wrong area first. **Fix:** even a weak hypothesis ("might be encoding-related") narrows the search space.
90
+
91
+ ❌ **Splitting one bug across multiple `mma-debug` calls**
92
+ Debug intentionally bundles `filePaths` for cross-file reasoning. Splitting defeats this. **Fix:** one call with all suspect files; if you really have N independent failures, use `mma-delegate` with N tasks.
93
+
94
+ ❌ **Treating `mma-debug` as the fix step**
95
+ Debug investigates and proposes; it doesn't necessarily write the fix. If the worker identifies a fix, dispatch `mma-delegate` to implement it (or write it inline if you understand it).
96
+
97
+ ❌ **Skipping when an error message looks self-explanatory**
98
+ Often the obvious cause isn't the real one. A 30-second debug pass costs less than a wrong fix that breaks something else.
99
+
68
100
  @include _shared/error-handling.md
@@ -1,32 +1,47 @@
1
1
  ---
2
2
  name: mma-delegate
3
3
  description: >-
4
- Fan out ad-hoc implementation or research tasks to sub-agents in parallel via
5
- the local mmagent HTTP service. Tasks run on cheap workers that don't consume
6
- your main-model context window.
4
+ Use when you have one or more ad-hoc implementation or research tasks WITHOUT
5
+ a plan file on disk and you want them to run on cheap workers in parallel
6
+ instead of consuming main-context tokens
7
7
  when_to_use: >-
8
- You have one or more ad-hoc implementation or research tasks WITHOUT a plan
9
- file on disk AND mmagent is running. Prefer this over inline Agent dispatches
10
- or superpowers:dispatching-parallel-agents — delegated workers are cheaper,
11
- parallel-safe, and keep main context free. If a plan file exists, use
12
- mma-execute-plan; if the task is an audit/review/verify/debug, prefer the
13
- matching mma-* skill instead.
14
- version: 3.3.0
8
+ You have ad-hoc implementation or research tasks (no plan file on disk) AND
9
+ mmagent is running. Prefer this over inline Agent dispatches or
10
+ superpowers:dispatching-parallel-agents — workers are cheaper, parallel-safe,
11
+ and keep main context free. If a plan file exists, use mma-execute-plan. If
12
+ the task is audit / review / verify / debug / investigate → use the matching
13
+ specialized skill.
14
+ version: 3.5.0
15
15
  ---
16
16
 
17
- ## mma-delegate
17
+ # mma-delegate
18
18
 
19
- Dispatch one or more tasks to sub-agents concurrently. Each task is an
20
- independent instruction with optional file scope, acceptance criteria, and
21
- context block references.
19
+ ## Overview
22
20
 
23
- ### Endpoint
21
+ Dispatch one or more ad-hoc tasks to sub-agents concurrently. Each task is an independent instruction with optional file scope, acceptance criteria, and context blocks.
22
+
23
+ **Core principle:** Workers run on cheap providers; the main agent consumes only the structured per-task report. Parallelize freely as long as tasks don't write the same files.
24
+
25
+ ## When to Use
26
+
27
+ **Use when:**
28
+ - 2+ unrelated implementation tasks (parallel speedup)
29
+ - A research task you'd otherwise spend tokens reading and grepping
30
+ - A focused refactor that fits in one prompt
31
+ - The task does NOT match audit / review / verify / debug / investigate (those have specialized skills)
32
+
33
+ **Don't use when:**
34
+ - A plan file exists on disk → `mma-execute-plan` (descriptors auto-match plan headings)
35
+ - Two tasks write the same file → dispatch sequentially, not in one batch (workers race)
36
+ - The work needs to read across many files for synthesis only → `mma-investigate` is cheaper (read-only)
37
+
38
+ ## Endpoint
24
39
 
25
40
  `POST /delegate?cwd=<abs-path>`
26
41
 
27
42
  @include _shared/auth.md
28
43
 
29
- ### Request body
44
+ ## Request body
30
45
 
31
46
  ```json
32
47
  {
@@ -46,18 +61,16 @@ context block references.
46
61
  |---|---|---|---|
47
62
  | `tasks` | array | yes | At least one task |
48
63
  | `tasks[].prompt` | string | yes | The task instruction |
49
- | `tasks[].agentType` | `"standard"` / `"complex"` | no | Worker tier. Default `"standard"` (cheap). Pick `"complex"` when the task is ambiguous, touches many files, is security-sensitive, or a prior standard run came back with `filesWritten: 0` / ran out of turns. Complex workers cost more but finish bigger jobs. |
50
- | `tasks[].filePaths` | string[] | no | Files the sub-agent focuses on |
64
+ | `tasks[].agentType` | `"standard"` / `"complex"` | no | Worker tier. Default `"standard"`. Pick `"complex"` when the task is ambiguous, security-sensitive, touches many files, or a prior standard run came back with `filesWritten: 0` / hit `incompleteReason: "turn_cap"`. |
65
+ | `tasks[].filePaths` | string[] | no | Files the worker focuses on |
51
66
  | `tasks[].done` | string | no | Acceptance criteria |
52
67
  | `tasks[].contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |
53
- | `tasks[].verifyCommand` | string[] | no | Commands to run after task completion to verify the work |
54
- | `tasks[].reviewPolicy` | `"full"` / `"spec_only"` / `"diff_only"` / `"off"` | no | Review lifecycle policy. Default `"full"` |
55
-
56
- Set `verifyCommand` when the worker can run a deterministic local check after editing, such as `npm test`, `npm run lint`, or a focused package test. Commands run in order after task completion; each string must be non-empty after trimming. Omit it when no reliable command exists.
68
+ | `tasks[].verifyCommand` | string[] | no | See verify-and-review snippet below |
69
+ | `tasks[].reviewPolicy` | `"full"` / `"spec_only"` / `"diff_only"` / `"off"` | no | See verify-and-review snippet below. Default `"full"` |
57
70
 
58
- Set `reviewPolicy: 'diff_only'` when you want a cheaper single-pass review of the produced diff without spec-review rework loops. Use `reviewPolicy: 'full'` for default spec + quality review, `reviewPolicy: 'spec_only'` when quality review is not needed, and `reviewPolicy: 'off'` only for trusted low-risk tasks where verification is enough.
71
+ @include _shared/verify-and-review.md
59
72
 
60
- ### Full example
73
+ ## Full example
61
74
 
62
75
  ```bash
63
76
  BATCH=$(curl -f --show-error -s -X POST \
@@ -68,10 +81,29 @@ BATCH=$(curl -f --show-error -s -X POST \
68
81
  BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
69
82
  ```
70
83
 
71
- Then poll until complete:
72
-
73
84
  @include _shared/polling.md
74
85
 
75
86
  @include _shared/response-shape.md
76
87
 
88
+ ## Common pitfalls
89
+
90
+ ❌ **Two tasks writing the same file in one batch**
91
+ > tasks: [{prompt:"add JWT to login.ts"}, {prompt:"add logging to login.ts"}]
92
+
93
+ Workers run concurrently and race on the file. **Fix:** dispatch sequentially, or merge into one prompt.
94
+
95
+ ❌ **Vague `prompt`, no `done` criterion**
96
+ > "improve the auth module"
97
+
98
+ Worker has no completion signal — likely returns `done_with_concerns`. **Fix:** specific verb + acceptance: `"Add input validation to login.ts so all string fields reject empty/whitespace; tests pass"`.
99
+
100
+ ❌ **Defaulting to `agentType: "complex"` for everything**
101
+ Standard tier is 5–10× cheaper and finishes most edits. Escalate only when standard returns `filesWritten: 0` or `incompleteReason: "turn_cap"`.
102
+
103
+ ❌ **Inlining a 50KB doc into every prompt**
104
+ N tasks × 50KB = N transmissions. **Fix:** register the doc once via `mma-context-blocks`, pass the `contextBlockIds` to each task.
105
+
106
+ ❌ **Reading the worker's diff inline before review**
107
+ The reviewer sees the full diff with the original prompt as context. Reading inline burns main-context tokens for no quality gain.
108
+
77
109
  @include _shared/error-handling.md
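The first pitfall above (two tasks writing the same file in one batch) can be caught before dispatch. The sketch below assumes you can list each task's intended files as `<task-index> <path>` lines and that paths contain no spaces; `conflicting_paths` is a hypothetical helper, not part of mmagent.

```shell
# Hypothetical pre-dispatch check; assumes paths contain no spaces.
# stdin: lines of "<task-index> <path>"
# stdout: each path claimed by two or more distinct tasks, sorted
conflicting_paths() {
  # sort -u drops repeated (task, path) pairs so one task listing a file twice is not a conflict.
  sort -u |
    awk '{ count[$2]++ } END { for (p in count) if (count[p] > 1) print p }' |
    sort
}

# Tasks 0 and 1 both target src/login.ts, so that path is reported.
printf '%s\n' "0 src/login.ts" "1 src/login.ts" "1 src/logger.ts" | conflicting_paths
```

If the helper prints anything, dispatch the affected tasks sequentially or merge them into one prompt, as the pitfall recommends.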