@zhixuan92/multi-model-agent 3.2.0 → 3.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (65)
  1. package/README.md +62 -33
  2. package/dist/http/canonicalize-file-paths.d.ts +8 -0
  3. package/dist/http/canonicalize-file-paths.d.ts.map +1 -0
  4. package/dist/http/canonicalize-file-paths.js +43 -0
  5. package/dist/http/canonicalize-file-paths.js.map +1 -0
  6. package/dist/http/execution-context.d.ts.map +1 -1
  7. package/dist/http/execution-context.js +0 -14
  8. package/dist/http/execution-context.js.map +1 -1
  9. package/dist/http/handlers/control/batch-slice.d.ts +4 -0
  10. package/dist/http/handlers/control/batch-slice.d.ts.map +1 -0
  11. package/dist/http/handlers/control/batch-slice.js +40 -0
  12. package/dist/http/handlers/control/batch-slice.js.map +1 -0
  13. package/dist/http/handlers/control/retry.d.ts +4 -0
  14. package/dist/http/handlers/control/retry.d.ts.map +1 -0
  15. package/dist/http/handlers/control/retry.js +60 -0
  16. package/dist/http/handlers/control/retry.js.map +1 -0
  17. package/dist/http/handlers/tools/audit.d.ts.map +1 -1
  18. package/dist/http/handlers/tools/audit.js +2 -0
  19. package/dist/http/handlers/tools/audit.js.map +1 -1
  20. package/dist/http/handlers/tools/debug.d.ts.map +1 -1
  21. package/dist/http/handlers/tools/debug.js +2 -0
  22. package/dist/http/handlers/tools/debug.js.map +1 -1
  23. package/dist/http/handlers/tools/delegate.d.ts.map +1 -1
  24. package/dist/http/handlers/tools/delegate.js +2 -0
  25. package/dist/http/handlers/tools/delegate.js.map +1 -1
  26. package/dist/http/handlers/tools/execute-plan.d.ts.map +1 -1
  27. package/dist/http/handlers/tools/execute-plan.js +2 -0
  28. package/dist/http/handlers/tools/execute-plan.js.map +1 -1
  29. package/dist/http/handlers/tools/investigate.d.ts +4 -0
  30. package/dist/http/handlers/tools/investigate.d.ts.map +1 -0
  31. package/dist/http/handlers/tools/investigate.js +81 -0
  32. package/dist/http/handlers/tools/investigate.js.map +1 -0
  33. package/dist/http/handlers/tools/review.d.ts.map +1 -1
  34. package/dist/http/handlers/tools/review.js +2 -0
  35. package/dist/http/handlers/tools/review.js.map +1 -1
  36. package/dist/http/handlers/tools/verify.d.ts.map +1 -1
  37. package/dist/http/handlers/tools/verify.js +2 -0
  38. package/dist/http/handlers/tools/verify.js.map +1 -1
  39. package/dist/http/request-observability.d.ts +9 -0
  40. package/dist/http/request-observability.d.ts.map +1 -0
  41. package/dist/http/request-observability.js +36 -0
  42. package/dist/http/request-observability.js.map +1 -0
  43. package/dist/http/server.d.ts.map +1 -1
  44. package/dist/http/server.js +52 -11
  45. package/dist/http/server.js.map +1 -1
  46. package/dist/install/discover.d.ts +1 -1
  47. package/dist/install/discover.d.ts.map +1 -1
  48. package/dist/install/discover.js +1 -0
  49. package/dist/install/discover.js.map +1 -1
  50. package/dist/openapi.d.ts.map +1 -1
  51. package/dist/openapi.js +6 -0
  52. package/dist/openapi.js.map +1 -1
  53. package/dist/skills/_shared/verify-and-review.md +12 -0
  54. package/dist/skills/mma-audit/SKILL.md +45 -18
  55. package/dist/skills/mma-clarifications/SKILL.md +73 -29
  56. package/dist/skills/mma-context-blocks/SKILL.md +56 -24
  57. package/dist/skills/mma-debug/SKILL.md +54 -22
  58. package/dist/skills/mma-delegate/SKILL.md +59 -21
  59. package/dist/skills/mma-execute-plan/SKILL.md +56 -24
  60. package/dist/skills/mma-investigate/SKILL.md +137 -0
  61. package/dist/skills/mma-retry/SKILL.md +65 -22
  62. package/dist/skills/mma-review/SKILL.md +49 -20
  63. package/dist/skills/mma-verify/SKILL.md +49 -18
  64. package/dist/skills/multi-model-agent/SKILL.md +84 -46
  65. package/package.json +2 -2
@@ -1,33 +1,65 @@
1
1
  ---
2
2
  name: mma-clarifications
3
3
  description: >-
4
- Confirm or correct mmagent's proposed interpretation when a batch is awaiting
5
- clarification before it can proceed. Paired skill to every mma-* task
6
- dispatcher.
4
+ Use when a previous mma-* batch's terminal envelope has
5
+ `proposedInterpretation` as a string (not the `not_applicable` sentinel): the
6
+ service paused waiting for you to confirm or correct its read of the task
7
7
  when_to_use: >-
8
- A previous mma-delegate / mma-audit / mma-review / mma-execute-plan / etc.
9
- terminal envelope has `proposedInterpretation` as a string (not a
10
- NotApplicable sentinel). Read the proposal and call this skill to accept or
11
- correct it. The batch resumes after the POST returns.
12
- version: 3.2.0
8
+ A previous mma-delegate / mma-audit / mma-review / mma-execute-plan /
9
+ mma-debug / mma-investigate terminal envelope has `proposedInterpretation` as
10
+ a string. Read the proposal, decide whether to accept or correct it, then call
11
+ this skill. The batch resumes immediately after the POST returns.
12
+ version: 3.4.0
13
13
  ---
14
14
 
15
- ## mma-clarifications
15
+ # mma-clarifications
16
16
 
17
- When a batch pauses with `state: 'awaiting_clarification'`, the service has
18
- proposed an interpretation of the task and is waiting for your decision.
19
- Read the proposal, then call `POST /clarifications/confirm` to either accept
20
- or correct it. The batch resumes immediately after confirmation.
17
+ ## Overview
21
18
 
22
- ### Endpoint
19
+ When a batch pauses with `state: 'awaiting_clarification'`, the service has proposed an interpretation of an ambiguous task and is waiting for your decision. Read the proposal, then `POST /clarifications/confirm` with either the proposal verbatim (accept) or a corrected version (override). The batch resumes immediately.
20
+
21
+ **Core principle:** Clarification is a quality gate, not an error. Ambiguous tasks would silently produce the wrong work — the pause forces a deliberate choice.
22
+
23
+ ## When to Use
24
+
25
+ ```dot
26
+ digraph when_to_use {
27
+ "Polling a batch?" [shape=diamond];
28
+ "state == awaiting_clarification?" [shape=diamond];
29
+ "proposedInterpretation is a string?" [shape=diamond];
30
+ "Read proposal" [shape=box];
31
+ "Accept or correct" [shape=diamond];
32
+ "POST proposal verbatim" [shape=box];
33
+ "POST corrected text" [shape=box];
34
+
35
+ "Polling a batch?" -> "state == awaiting_clarification?";
36
+ "state == awaiting_clarification?" -> "proposedInterpretation is a string?" [label="yes"];
37
+ "state == awaiting_clarification?" -> "Continue polling" [label="no"];
38
+ "proposedInterpretation is a string?" -> "Read proposal" [label="yes"];
39
+ "Read proposal" -> "Accept or correct";
40
+ "Accept or correct" -> "POST proposal verbatim" [label="proposal is right"];
41
+ "Accept or correct" -> "POST corrected text" [label="proposal is wrong"];
42
+ }
43
+ ```
44
+
45
+ **Use when:**
46
+ - Polling a batch and the terminal envelope has `proposedInterpretation` as a string
47
+ - The mma-* skill that dispatched explicitly references this skill in its "if awaiting_clarification" line
48
+
49
+ **Don't use when:**
50
+ - `proposedInterpretation` is `{ kind: 'not_applicable', ... }` → batch isn't waiting; just read `results`
51
+ - The batch failed (`error` is a real object) → don't confirm; debug or re-dispatch
52
+ - You don't yet have a `batchId` → this skill resumes existing batches, not new ones
53
+
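The use / don't-use rules above can be read as a small routing function over a terminal envelope. The sketch below is an illustration only: `next_action` and its two string arguments are hypothetical, and it assumes the caller has already extracted the JSON type of `proposedInterpretation` and whether `error` is a real object.

```bash
# Hypothetical helper, not part of mmagent.
# $1 = JSON type of proposedInterpretation ("string" or "object")
# $2 = "yes" if the envelope's `error` is a real object, otherwise "no"
next_action() {
  if [ "$2" = "yes" ]; then
    echo "debug_or_redispatch"   # failed batch: do not confirm
  elif [ "$1" = "string" ]; then
    echo "confirm"               # read the proposal, then POST /clarifications/confirm
  else
    echo "read_results"          # not_applicable sentinel: the batch is not waiting
  fi
}
```

With `jq`, the first argument could be obtained with `jq -r '.proposedInterpretation | type'` on the polled response.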
54
+ ## Endpoint
23
55
 
24
56
  `POST /clarifications/confirm`
25
57
 
26
- Auth required. Not cwd-gated (operates on a `batchId`).
58
+ Auth required. NOT cwd-gated (operates on a `batchId`).
27
59
 
28
60
  @include _shared/auth.md
29
61
 
30
- ### Request body
62
+ ## Request body
31
63
 
32
64
  ```json
33
65
  {
@@ -39,38 +71,50 @@ Auth required. Not cwd-gated (operates on a `batchId`).
39
71
  | Field | Type | Required | Notes |
40
72
  |---|---|---|---|
41
73
  | `batchId` | string (UUID) | yes | Batch in `awaiting_clarification` state |
42
- | `interpretation` | string | yes | Accept proposal verbatim or provide a corrected version |
74
+ | `interpretation` | string | yes | Accept proposal verbatim, OR provide corrected text the worker should follow instead |
43
75
 
44
- ### Response (200)
76
+ ## Response (200)
45
77
 
46
78
  ```json
47
79
  { "batchId": "...", "state": "pending" }
48
80
  ```
49
81
 
50
- `state` is usually `pending` (batch resumes). It may be `complete` if the
51
- executor was already waiting and finishes immediately.
82
+ `state` is usually `pending` (batch resumes). May be `complete` if the executor was already waiting and finishes immediately.
52
83
 
53
- ### Full flow
84
+ ## Full flow
54
85
 
55
86
  ```bash
56
- # 1. Poll until awaiting_clarification
57
- STATE=$(curl -f --show-error -s -H "Authorization: Bearer $TOKEN" \
58
- "http://localhost:$PORT/batch/$BATCH_ID" | jq -r '.state')
87
+ # 1. Poll until terminal
88
+ RESP=$(curl -f --show-error -s -H "Authorization: Bearer $TOKEN" \
89
+ "http://localhost:$PORT/batch/$BATCH_ID")
59
90
 
60
- # 2. Read the proposal
61
- PROPOSAL=$(curl -f --show-error -s -H "Authorization: Bearer $TOKEN" \
62
- "http://localhost:$PORT/batch/$BATCH_ID" | jq -r '.proposedInterpretation')
91
+ # 2. Check for a string proposal (not the not_applicable sentinel)
92
+ PROPOSAL=$(echo "$RESP" | jq -r 'select(.proposedInterpretation | type == "string") | .proposedInterpretation')
63
93
 
64
- # 3. Confirm (accept proposal or supply corrected text)
94
+ # 3. Confirm: accept proposal verbatim, or supply corrected text
65
95
  curl -f --show-error -s -X POST \
66
96
  -H "Authorization: Bearer $TOKEN" \
67
97
  -H "Content-Type: application/json" \
68
98
  -d "{\"batchId\":\"$BATCH_ID\",\"interpretation\":\"$PROPOSAL\"}" \
69
99
  "http://localhost:$PORT/clarifications/confirm"
70
100
 
71
- # 4. Resume polling
101
+ # 4. Resume polling for terminal
72
102
  ```
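Step 3 above interpolates `$PROPOSAL` directly into a JSON string, which would yield invalid JSON if the proposal contains double quotes, backslashes, or newlines. A minimal sketch of a safer body builder, assuming `jq` is installed (the examples here already rely on it); `confirm_body` is a hypothetical helper name, not part of the package:

```bash
# Hypothetical helper: build the /clarifications/confirm body with jq so that
# quotes, backslashes, and newlines in the interpretation are escaped properly.
# $1 = batchId, $2 = interpretation text
confirm_body() {
  jq -cn --arg batchId "$1" --arg interpretation "$2" \
    '{batchId: $batchId, interpretation: $interpretation}'
}
```

The result could then be passed to curl as `-d "$(confirm_body "$BATCH_ID" "$PROPOSAL")"` in place of the hand-built string.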
73
103
 
74
104
  @include _shared/polling.md
75
105
 
106
+ ## Common pitfalls
107
+
108
+ ❌ **Confirming a wrong proposal verbatim because "the service knows best"**
109
+ The service is GUESSING from limited context. If the proposal would do the wrong thing, supply corrected `interpretation` text. **Why:** post-confirmation work is hard to undo.
110
+
111
+ ❌ **Treating the pause as an error**
112
+ `awaiting_clarification` is a SUCCESS path — it caught ambiguity before producing wrong work. Read, decide, confirm.
113
+
114
+ ❌ **Forgetting the `batchId` is the original, not a new one**
115
+ This endpoint mutates the existing batch — it does not create a new one. **Fix:** poll the SAME `batchId` after confirming.
116
+
117
+ ❌ **Polling without checking `proposedInterpretation`'s shape**
118
+ The field is either a `string` (paused) or `{ kind: 'not_applicable' }` (terminal). **Fix:** check the JSON type before treating it as text.
119
+
76
120
  @include _shared/error-handling.md
@@ -1,23 +1,40 @@
1
1
  ---
2
2
  name: mma-context-blocks
3
3
  description: >-
4
- Register large reused documents (spec, plan, codebase summary) as a context
5
- block the mmagent service caches, then reference it by ID across multiple
6
- mma-* calls. Avoids re-uploading the same content on every task.
4
+ Use when a document larger than ~2 KB will be referenced by 2+ subsequent
5
+ mma-* calls: register once, pass the returned ID to each call instead of
6
+ re-uploading the same content
7
7
  when_to_use: >-
8
- A document larger than ~2 KB will be referenced by two or more mma-* calls in
9
- a row. Register once here, then pass the returned ID via the contextBlockIds
10
- field on mma-delegate / mma-execute-plan / mma-audit / mma-review / mma-verify
11
- / mma-debug. Cheaper and faster than inlining the same content in every
12
- request body.
13
- version: 3.2.0
8
+ A document (spec, plan, codebase summary, prior round's findings, error log)
9
+ larger than ~2 KB will be referenced by two or more mma-* calls in a row.
10
+ Register once here, then pass the ID via `contextBlockIds` on mma-delegate /
11
+ mma-execute-plan / mma-audit / mma-review / mma-verify / mma-debug /
12
+ mma-investigate. Cheaper and faster than inlining the same content N times.
13
+ version: 3.4.0
14
14
  ---
15
15
 
16
- ## mma-context-blocks
16
+ # mma-context-blocks
17
17
 
18
- Store large documents once; reference them by ID in subsequent `mma-*` calls
19
- via `contextBlockIds`. The service prepends the block content to each task
20
- prompt that references it.
18
+ ## Overview
19
+
20
+ Store large documents once; reference them by ID in subsequent `mma-*` calls via `contextBlockIds`. The service prepends the block content to each task prompt that references the ID — content is transmitted ONCE to the daemon, then reused server-side.
21
+
22
+ **Core principle:** Without context blocks, the same document is sent N times for N tasks. Blocks transmit once. The savings compound on shared specs, prior-round findings, and codebase summaries.
23
+
24
+ ## When to Use
25
+
26
+ **Use when:**
27
+ - A doc >2 KB will be referenced by ≥2 mma-* calls
28
+ - You're running iterative audit/review rounds (round 2 references round 1's findings)
29
+ - A spec or design doc is the shared input across N parallel tasks
30
+ - A long error log is the context for debug + delegate calls
31
+
32
+ **Don't use when:**
33
+ - The doc is <2 KB and used once → just inline it (registration overhead exceeds savings)
34
+ - The doc changes between calls → context blocks are immutable; register a new one
35
+ - Single task that doesn't reference any large shared content → no benefit
36
+
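The size and reuse thresholds above can be checked mechanically before registering anything. A minimal sketch, assuming "~2 KB" is taken as 2048 bytes; `should_register_block` is a hypothetical helper, not part of mmagent:

```bash
# Hypothetical helper, not part of mmagent.
# $1 = path to the document, $2 = number of mma-* calls that will reference it
# Succeeds (exit 0) only when the doc is larger than 2048 bytes AND is
# referenced by at least two calls.
should_register_block() {
  size=$(( $(wc -c < "$1") ))   # arithmetic expansion strips any padding from wc
  [ "$size" -gt 2048 ] && [ "$2" -ge 2 ]
}
```

A caller might use it as `should_register_block docs/spec.md 3 && echo "register a context block"`, where the path and count are placeholders.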
37
+ ## Endpoints
21
38
 
22
39
  ### Register a context block
23
40
 
@@ -37,7 +54,7 @@ prompt that references it.
37
54
  | Field | Type | Required | Notes |
38
55
  |---|---|---|---|
39
56
  | `content` | string | yes | Document content (min 1 char) |
40
- | `ttlMs` | number | no | Time-to-live in ms; omit for session-scoped |
57
+ | `ttlMs` | number | no | Time-to-live in ms; omit for session-scoped (default 1h) |
41
58
 
42
59
  #### Response (201)
43
60
 
@@ -45,34 +62,49 @@ prompt that references it.
45
62
  { "id": "cb_abc123" }
46
63
  ```
47
64
 
48
- Use this `id` as a `contextBlockIds` entry in `mma-delegate`, `mma-audit`,
49
- `mma-review`, `mma-verify`, `mma-debug`, or `mma-execute-plan`.
65
+ Use this `id` as a `contextBlockIds` entry in any `mma-*` skill that supports it.
50
66
 
51
67
  ### Delete a context block
52
68
 
53
69
  `DELETE /context-blocks/:id?cwd=<abs-path>`
54
70
 
55
- Returns `200 { ok: true }` on success.
56
-
57
- Returns `409 pinned` if the block is held by one or more active batches —
58
- wait for those batches to complete before deleting.
71
+ Returns `200 { ok: true }` on success. Returns `409 pinned` if the block is held by one or more active batches — wait for those batches to complete before deleting.
59
72
 
60
- ### Example
73
+ ## Full example
61
74
 
62
75
  ```bash
63
- # Register spec document
76
+ # Register spec document once
64
77
  ID=$(curl -f --show-error -s -X POST \
65
78
  -H "Authorization: Bearer $TOKEN" \
66
79
  -H "Content-Type: application/json" \
67
80
  -d "{\"content\":$(jq -Rs . < /project/docs/spec.md)}" \
68
81
  "http://localhost:$PORT/context-blocks?cwd=/project" | jq -r '.id')
69
82
 
70
- # Use in a delegate call
83
+ # Reference from N delegate tasks
71
84
  curl -f --show-error -s -X POST \
72
85
  -H "Authorization: Bearer $TOKEN" \
73
86
  -H "Content-Type: application/json" \
74
- -d "{\"tasks\":[{\"prompt\":\"Implement per spec\",\"contextBlockIds\":[\"$ID\"]}]}" \
87
+ -d "{\"tasks\":[
88
+ {\"prompt\":\"Implement section 3 per spec\",\"contextBlockIds\":[\"$ID\"]},
89
+ {\"prompt\":\"Implement section 4 per spec\",\"contextBlockIds\":[\"$ID\"]}
90
+ ]}" \
75
91
  "http://localhost:$PORT/delegate?cwd=/project"
76
92
  ```
77
93
 
94
+ ## Common pitfalls
95
+
96
+ ❌ **Inlining the same 50KB spec into every task prompt**
97
+ > tasks: [{prompt: "Implement section 3:\n[50KB spec]"}, {prompt: "Implement section 4:\n[50KB spec]"}]
98
+
99
+ N×50KB transmissions; main context burns through tokens. **Fix:** register the spec once, pass `contextBlockIds: ["cb_xxx"]` to each task.
100
+
101
+ ❌ **Forgetting to delete short-TTL blocks**
102
+ Blocks count against the project's context-block quota. **Fix:** explicitly `DELETE` after the dependent batches finish — or set a short `ttlMs` so they self-evict.
103
+
104
+ ❌ **Trying to update a block's content**
105
+ Blocks are immutable. **Fix:** register a new block with the new content; switch the `contextBlockIds` to the new ID.
106
+
107
+ ❌ **Deleting a block while a batch still references it**
108
+ Returns `409 pinned`. **Fix:** poll the dependent batches to terminal first, then delete.
109
+
78
110
  @include _shared/error-handling.md
@@ -1,31 +1,46 @@
1
1
  ---
2
2
  name: mma-debug
3
3
  description: >-
4
- Debug a failure using a structured hypothesis via the local mmagent HTTP
5
- service. All provided files are investigated together in a single task on a
6
- worker.
4
+ Use when a test fails, a build breaks, or behavior is unexpected AND narrowing
5
+ the root cause requires reading files, reproducing the failure, or tracing
6
+ across multiple modules — the worker investigates so the main agent stays on
7
+ the hypothesis
7
8
  when_to_use: >-
8
- A test fails, a build breaks, or behavior is unexpected AND you need to read
9
- files, reproduce the failure, or narrow root cause, OR a methodology skill
9
+ A failure has surfaced (test/build/runtime) AND you need investigation work
10
+ (read files, reproduce, trace), OR a methodology skill
10
11
  (superpowers:systematic-debugging) points at the investigation step. Delegate
11
- the read/reproduce/trace work to a mmagent worker so your main context stays
12
- focused on the hypothesis and the fix.
13
- version: 3.2.0
12
+ the read/reproduce/trace; the main agent stays on the hypothesis and the fix.
13
+ version: 3.4.0
14
14
  ---
15
15
 
16
- ## mma-debug
16
+ # mma-debug
17
17
 
18
- Submit a problem, context, and hypothesis to a sub-agent for focused
19
- debugging. Unlike other tools, all `filePaths` are investigated together
20
- in a single task (not parallelised per file).
18
+ ## Overview
21
19
 
22
- ### Endpoint
20
+ Submit a problem, context, and hypothesis to a worker for focused debugging. Unlike `mma-audit` and `mma-review`, all `filePaths` are investigated TOGETHER in a single task (not parallelized per file) — debugging needs cross-file reasoning.
21
+
22
+ **Core principle:** The hypothesis is judgment (your job). Reading files and reproducing the failure is labor (the worker's job). Pass the hypothesis as input; receive structured findings.
23
+
24
+ ## When to Use
25
+
26
+ **Use when:**
27
+ - A test fails / build breaks / runtime behavior is unexpected
28
+ - The root cause likely spans 2+ files
29
+ - You have a hypothesis to test (or want the worker to suggest one)
30
+ - A methodology skill (`superpowers:systematic-debugging`) routed here
31
+
32
+ **Don't use when:**
33
+ - The error message points at one file you can read in 30 seconds → just `Read`
34
+ - You don't know what's broken yet → use `mma-investigate` first to map the area
35
+ - You already know the fix → skip debug, dispatch `mma-delegate` with the fix
36
+
37
+ ## Endpoint
23
38
 
24
39
  `POST /debug?cwd=<abs-path>`
25
40
 
26
41
  @include _shared/auth.md
27
42
 
28
- ### Request body
43
+ ## Request body
29
44
 
30
45
  ```json
31
46
  {
@@ -42,13 +57,13 @@ in a single task (not parallelised per file).
42
57
 
43
58
  | Field | Type | Required | Notes |
44
59
  |---|---|---|---|
45
- | `problem` | string | yes | What is broken |
46
- | `context` | string | no | Background information |
47
- | `hypothesis` | string | no | Initial theory to test |
48
- | `filePaths` | string[] | no | All files investigated together |
49
- | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |
60
+ | `problem` | string | yes | What is broken (one sentence; concrete symptom) |
61
+ | `context` | string | no | Background: what changed recently, what works, what doesn't |
62
+ | `hypothesis` | string | no | Your initial theory; worker tests it first, then explores |
63
+ | `filePaths` | string[] | no | All files investigated together (cross-file reasoning) |
64
+ | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` (e.g. error logs, traces) |
50
65
 
51
- ### Full example
66
+ ## Full example
52
67
 
53
68
  ```bash
54
69
  BATCH=$(curl -f --show-error -s -X POST \
@@ -59,10 +74,27 @@ BATCH=$(curl -f --show-error -s -X POST \
59
74
  BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
60
75
  ```
61
76
 
62
- Then poll until complete:
63
-
64
77
  @include _shared/polling.md
65
78
 
66
79
  @include _shared/response-shape.md
67
80
 
81
+ ## Common pitfalls
82
+
83
+ ❌ **Vague `problem`**
84
+ > "The login is broken"
85
+
86
+ Worker has no symptom to chase. **Fix:** specific reproducer — `"POST /login with body {user:'a@b.c', pass:'café'} returns 500 with 'invalid character' in stderr"`.
87
+
88
+ ❌ **No `hypothesis`**
89
+ The worker explores blindly and often investigates the wrong area first. **Fix:** even a weak hypothesis ("might be encoding-related") narrows the search space.
90
+
91
+ ❌ **Splitting one bug across multiple `mma-debug` calls**
92
+ Debug intentionally bundles `filePaths` for cross-file reasoning. Splitting defeats this. **Fix:** one call with all suspect files; if you really have N independent failures, use `mma-delegate` with N tasks.
93
+
94
+ ❌ **Treating `mma-debug` as the fix step**
95
+ Debug investigates and proposes; it doesn't necessarily write the fix. If the worker identifies a fix, dispatch `mma-delegate` to implement it (or write it inline if you understand it).
96
+
97
+ ❌ **Skipping when an error message looks self-explanatory**
98
+ Often the obvious cause isn't the real one. A 30-second debug pass costs less than a wrong fix that breaks something else.
99
+
68
100
  @include _shared/error-handling.md
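The mma-debug request fields listed in the table above can be assembled without hand-escaping JSON. This is a sketch under assumptions: it requires `jq` 1.6 or later (for `--args` and `$ARGS.positional`), `debug_body` is a hypothetical helper name, and only `problem`, `hypothesis`, and `filePaths` are populated.

```bash
# Hypothetical helper: assemble a /debug request body from the documented fields.
# $1 = problem, $2 = hypothesis, remaining arguments = filePaths
debug_body() {
  problem=$1
  hypothesis=$2
  shift 2
  # Everything after --args is exposed to the filter as $ARGS.positional.
  jq -cn --arg problem "$problem" --arg hypothesis "$hypothesis" \
    '{problem: $problem, hypothesis: $hypothesis, filePaths: $ARGS.positional}' \
    --args "$@"
}
```

Because all suspect files go into one `filePaths` array, the call stays a single task, which matches the cross-file intent described for this skill.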
@@ -1,32 +1,47 @@
1
1
  ---
2
2
  name: mma-delegate
3
3
  description: >-
4
- Fan out ad-hoc implementation or research tasks to sub-agents in parallel via
5
- the local mmagent HTTP service. Tasks run on cheap workers that don't consume
6
- your main-model context window.
4
+ Use when you have one or more ad-hoc implementation or research tasks WITHOUT
5
+ a plan file on disk and you want them to run on cheap workers in parallel
6
+ instead of consuming main-context tokens
7
7
  when_to_use: >-
8
- You have one or more ad-hoc implementation or research tasks WITHOUT a plan
9
- file on disk AND mmagent is running. Prefer this over inline Agent dispatches
10
- or superpowers:dispatching-parallel-agents — delegated workers are cheaper,
11
- parallel-safe, and keep main context free. If a plan file exists, use
12
- mma-execute-plan; if the task is an audit/review/verify/debug, prefer the
13
- matching mma-* skill instead.
14
- version: 3.2.0
8
+ You have ad-hoc implementation or research tasks (no plan file on disk) AND
9
+ mmagent is running. Prefer this over inline Agent dispatches or
10
+ superpowers:dispatching-parallel-agents — workers are cheaper, parallel-safe,
11
+ and keep main context free. If a plan file exists, use mma-execute-plan. If
12
+ the task is audit / review / verify / debug / investigate → use the matching
13
+ specialized skill.
14
+ version: 3.4.0
15
15
  ---
16
16
 
17
- ## mma-delegate
17
+ # mma-delegate
18
18
 
19
- Dispatch one or more tasks to sub-agents concurrently. Each task is an
20
- independent instruction with optional file scope, acceptance criteria, and
21
- context block references.
19
+ ## Overview
22
20
 
23
- ### Endpoint
21
+ Dispatch one or more ad-hoc tasks to sub-agents concurrently. Each task is an independent instruction with optional file scope, acceptance criteria, and context blocks.
22
+
23
+ **Core principle:** Workers run on cheap providers; the main agent consumes only the structured per-task report. Parallelize freely as long as tasks don't write the same files.
24
+
25
+ ## When to Use
26
+
27
+ **Use when:**
28
+ - 2+ unrelated implementation tasks (parallel speedup)
29
+ - A research task you'd otherwise spend tokens reading and grepping
30
+ - A focused refactor that fits in one prompt
31
+ - The task does NOT match audit / review / verify / debug / investigate (those have specialized skills)
32
+
33
+ **Don't use when:**
34
+ - A plan file exists on disk → `mma-execute-plan` (descriptors auto-match plan headings)
35
+ - Two tasks write the same file → dispatch sequentially, not in one batch (workers race)
36
+ - The work needs to read across many files for synthesis only → `mma-investigate` is cheaper (read-only)
37
+
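The same-file rule above can be approximated with a pre-dispatch check. A minimal sketch, assuming each task's `filePaths` can be listed as tab-separated `task-index<TAB>path` lines; that input format and the `shared_paths` name are hypothetical. Since `filePaths` only names files a worker focuses on, treat a hit as a warning rather than proof of a write conflict.

```bash
# Hypothetical helper, not part of mmagent.
# Reads "TASK_INDEX<TAB>FILE_PATH" lines on stdin and prints every path that
# is listed under more than one task index.
shared_paths() {
  sort -u |                       # drop repeats of the same task/path pair
    awk -F '\t' '{ count[$2]++ }
      END { for (p in count) if (count[p] > 1) print p }' |
    sort                          # stable output order
}
```

Any path it prints is a candidate for sequential dispatch or for merging the two prompts into one.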
38
+ ## Endpoint
24
39
 
25
40
  `POST /delegate?cwd=<abs-path>`
26
41
 
27
42
  @include _shared/auth.md
28
43
 
29
- ### Request body
44
+ ## Request body
30
45
 
31
46
  ```json
32
47
  {
@@ -46,12 +61,16 @@ context block references.
46
61
  |---|---|---|---|
47
62
  | `tasks` | array | yes | At least one task |
48
63
  | `tasks[].prompt` | string | yes | The task instruction |
49
- | `tasks[].agentType` | `"standard"` / `"complex"` | no | Worker tier. Default `"standard"` (cheap). Pick `"complex"` when the task is ambiguous, touches many files, is security-sensitive, or a prior standard run came back with `filesWritten: 0` / ran out of turns. Complex workers cost more but finish bigger jobs. |
50
- | `tasks[].filePaths` | string[] | no | Files the sub-agent focuses on |
64
+ | `tasks[].agentType` | `"standard"` / `"complex"` | no | Worker tier. Default `"standard"`. Pick `"complex"` when the task is ambiguous, security-sensitive, touches many files, or a prior standard run came back with `filesWritten: 0` / hit `incompleteReason: "turn_cap"`. |
65
+ | `tasks[].filePaths` | string[] | no | Files the worker focuses on |
51
66
  | `tasks[].done` | string | no | Acceptance criteria |
52
67
  | `tasks[].contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |
68
+ | `tasks[].verifyCommand` | string[] | no | See verify-and-review snippet below |
69
+ | `tasks[].reviewPolicy` | `"full"` / `"spec_only"` / `"diff_only"` / `"off"` | no | See verify-and-review snippet below. Default `"full"` |
70
+
71
+ @include _shared/verify-and-review.md
53
72
 
54
- ### Full example
73
+ ## Full example
55
74
 
56
75
  ```bash
57
76
  BATCH=$(curl -f --show-error -s -X POST \
@@ -62,10 +81,29 @@ BATCH=$(curl -f --show-error -s -X POST \
62
81
  BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
63
82
  ```
64
83
 
65
- Then poll until complete:
66
-
67
84
  @include _shared/polling.md
68
85
 
69
86
  @include _shared/response-shape.md
70
87
 
88
+ ## Common pitfalls
89
+
90
+ ❌ **Two tasks writing the same file in one batch**
91
+ > tasks: [{prompt:"add JWT to login.ts"}, {prompt:"add logging to login.ts"}]
92
+
93
+ Workers run concurrently and race on the file. **Fix:** dispatch sequentially, or merge into one prompt.
94
+
95
+ ❌ **Vague `prompt`, no `done` criterion**
96
+ > "improve the auth module"
97
+
98
+ Worker has no completion signal — likely returns `done_with_concerns`. **Fix:** specific verb + acceptance: `"Add input validation to login.ts so all string fields reject empty/whitespace; tests pass"`.
99
+
100
+ ❌ **Defaulting to `agentType: "complex"` for everything**
101
+ Standard tier is 5–10× cheaper and finishes most edits. Escalate only when standard returns `filesWritten: 0` or `incompleteReason: "turn_cap"`.
102
+
103
+ ❌ **Inlining a 50KB doc into every prompt**
104
+ N tasks × 50KB = N transmissions. **Fix:** register the doc once via `mma-context-blocks`, pass the `contextBlockIds` to each task.
105
+
106
+ ❌ **Reading the worker's diff inline before review**
107
+ The reviewer sees the full diff with the original prompt as context. Reading inline burns main-context tokens for no quality gain.
108
+
71
109
  @include _shared/error-handling.md
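The escalation rule for `agentType` described in the mma-delegate table and pitfalls can be written as a one-step decision. A minimal sketch: `next_agent_type` is a hypothetical helper, and it assumes you have already read `filesWritten` and `incompleteReason` from the previous standard-tier run's report.

```bash
# Hypothetical helper, not part of mmagent.
# $1 = filesWritten from the previous standard-tier run
# $2 = incompleteReason from that run (empty string if none)
next_agent_type() {
  if [ "$1" -eq 0 ] || [ "$2" = "turn_cap" ]; then
    echo "complex"    # standard tier produced nothing or ran out of turns
  else
    echo "standard"   # stay on the cheaper tier
  fi
}
```

The other triggers named in the table (ambiguous, security-sensitive, or many-file tasks) are judgment calls and are left out of this sketch.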
@@ -1,32 +1,45 @@
1
1
  ---
2
2
  name: mma-execute-plan
3
3
  description: >-
4
- Execute tasks from a plan or spec file on disk via the local mmagent HTTP
5
- service. Delegates to cheap sub-agents that don't consume your main-model
6
- context window. Task descriptors match plan headings; tasks run in parallel.
4
+ Use when a plan or spec file exists on disk (any markdown with numbered task
5
+ headings: docs/superpowers/plans/*.md, a TODO list, a spec doc) and you need
6
+ to implement one or more tasks from it on cheap workers in parallel
7
7
  when_to_use: >-
8
- A plan file exists on disk (any markdown with numbered task headings:
9
- docs/superpowers/plans/*.md, a TODO list, a spec doc) AND you need to
10
- implement one or more tasks from it. Prefer this over inline Agent dispatches
11
- or superpowers:subagent-driven-development / superpowers:executing-plans when
12
- mmagent is running — delegated workers are cheaper and don't pollute main
13
- context. Task descriptors must match the plan headings verbatim.
14
- version: 3.2.0
8
+ A plan file exists on disk AND you need to implement one or more tasks from it
9
+ AND mmagent is running. Prefer this over inline Agent dispatches or
10
+ superpowers:subagent-driven-development / superpowers:executing-plans;
11
+ workers are cheaper and don't pollute main context. Task descriptors must
12
+ match plan headings verbatim.
13
+ version: 3.4.0
15
14
  ---
16
15
 
17
- ## mma-execute-plan
16
+ # mma-execute-plan
18
17
 
19
- Dispatch named tasks from a plan file to sub-agents. Task descriptors must
20
- match plan headings (e.g. `"1. Setup database schema"`). All tasks run in
21
- parallel and duplicate descriptors are rejected.
18
+ ## Overview
22
19
 
23
- ### Endpoint
20
+ Dispatch named tasks from a plan file to workers. Each `tasks` string must match a heading in the plan verbatim (e.g. `"1. Setup database schema"`). All tasks run in parallel; duplicate descriptors are rejected.
21
+
22
+ **Core principle:** The plan IS the prompt. Workers re-read the plan file in-process and find their named task — you don't need to inline the task body.
23
+
24
+ ## When to Use
25
+
26
+ **Use when:**
27
+ - A plan/spec markdown exists with numbered task headings
28
+ - You want to dispatch a subset (or all) of those tasks
29
+ - Tasks are mostly independent (parallel-safe)
30
+
31
+ **Don't use when:**
32
+ - No plan file → `mma-delegate` (pass the prompt directly)
33
+ - Tasks form a hard linear sequence (later tasks depend on earlier ones' outputs) → dispatch in order, one batch each
34
+ - The "plan" is in conversation only, not on disk → write it to disk first, or use `mma-delegate`
35
+
36
+ ## Endpoint
24
37
 
25
38
  `POST /execute-plan?cwd=<abs-path>`
26
39
 
27
40
  @include _shared/auth.md
28
41
 
29
- ### Request body
42
+ ## Request body
30
43
 
31
44
  ```json
32
45
  {
@@ -46,16 +59,19 @@ parallel and duplicate descriptors are rejected.
46
59
 
47
60
  | Field | Type | Required | Notes |
48
61
  |---|---|---|---|
49
- | `tasks` | string[] | yes | At least one; must be unique; match plan headings |
62
+ | `tasks` | string[] \| `{task, reviewPolicy}[]` | yes | At least one; must be unique; each string matches a plan heading |
50
63
  | `context` | string | no | Short additional context not in the plan |
51
- | `filePaths` | string[] | no | Plan file + relevant source files |
64
+ | `filePaths` | string[] | no | Plan file + relevant source files. Required: the plan file itself. |
52
65
  | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |
53
- | `agentType` | `"standard"` / `"complex"` | no | Worker tier. Default `"standard"` (cheap). Switch to `"complex"` for tasks too large for a standard-tier model to finish in the turn budget (reads many files, produces many edits, or the last run came back with `filesWritten: 0`). |
66
+ | `agentType` | `"standard"` / `"complex"` | no | Default `"standard"`. Use `"complex"` for tasks too large for the standard tier (reads many files, produces many edits, or the last run came back with `filesWritten: 0`). |
67
+ | `verifyCommand` | string[] | no | See verify-and-review snippet below |
68
+ | `tasks[].reviewPolicy` | `"full"` / `"spec_only"` / `"diff_only"` / `"off"` | no | See verify-and-review snippet below. Default `"full"`. |
69
+
70
+ @include _shared/verify-and-review.md
54
71
 
55
- If the batch reaches `awaiting_clarification`, use `mma-clarifications`
56
- to confirm or correct the proposed interpretation.
72
+ If the batch reaches `awaiting_clarification`, use `mma-clarifications` to confirm or correct the proposed interpretation.
57
73
 
58
- ### Full example
74
+ ## Full example
59
75
 
60
76
  ```bash
61
77
  BATCH=$(curl -f --show-error -s -X POST \
@@ -66,10 +82,26 @@ BATCH=$(curl -f --show-error -s -X POST \
66
82
  BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
67
83
  ```
68
84
 
69
- Then poll until complete:
70
-
71
85
  @include _shared/polling.md
72
86
 
73
87
  @include _shared/response-shape.md
74
88
 
89
+ ## Common pitfalls
90
+
91
+ ❌ **Task descriptor doesn't match plan heading verbatim**
92
+ > tasks: ["Migrate db schema"] ← plan heading is "3. Migrate database schema"
93
+
94
+ Worker rejects with "no matching task" or matches the wrong one. **Fix:** copy the heading from the plan, including the leading number.
95
+
96
+ ❌ **Forgetting the plan file in `filePaths`**
97
+ > filePaths: ["/project/src/db/schema.sql"] ← no plan file
98
+
99
+ Worker can't read the task body. **Fix:** always include the plan path: `filePaths: ["/project/docs/plan.md", "/project/src/db/schema.sql"]`.
100
+
101
+ ❌ **Dispatching dependent tasks in one batch**
102
+ Task 5 depends on Task 4's output → workers race; Task 5 might run before Task 4 finishes. **Fix:** dispatch Task 4, wait for terminal, then dispatch Task 5.
103
+
104
+ ❌ **Skipping `verifyCommand` when one exists**
105
+ A passing local check is the cheapest signal you're going to get. **Fix:** wire `["npm test"]` or the focused package test.
106
+
75
107
  @include _shared/error-handling.md
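The verbatim-match requirement for mma-execute-plan descriptors can be checked locally before dispatch. A minimal sketch, assuming the plan uses `#`-style markdown headings; `heading_in_plan` is a hypothetical helper, and the service's own matching rules may differ.

```bash
# Hypothetical helper, not part of mmagent.
# $1 = plan file, $2 = task descriptor
# Succeeds (exit 0) when some heading in the plan, with its leading "#" marks
# and trailing whitespace removed, equals the descriptor exactly.
heading_in_plan() {
  sed -n -e 's/[[:space:]]*$//' -e 's/^#\{1,6\}[[:space:]]\{1,\}//p' "$1" |
    grep -Fxq -- "$2"
}
```

For a plan containing the heading `## 3. Migrate database schema`, the descriptor `3. Migrate database schema` passes and `Migrate db schema` does not.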