@zhixuan92/multi-model-agent 3.3.0 → 3.4.0
- package/README.md +62 -33
- package/dist/http/canonicalize-file-paths.d.ts +8 -0
- package/dist/http/canonicalize-file-paths.d.ts.map +1 -0
- package/dist/http/canonicalize-file-paths.js +43 -0
- package/dist/http/canonicalize-file-paths.js.map +1 -0
- package/dist/http/execution-context.d.ts.map +1 -1
- package/dist/http/execution-context.js +0 -14
- package/dist/http/execution-context.js.map +1 -1
- package/dist/http/handlers/tools/investigate.d.ts +4 -0
- package/dist/http/handlers/tools/investigate.d.ts.map +1 -0
- package/dist/http/handlers/tools/investigate.js +81 -0
- package/dist/http/handlers/tools/investigate.js.map +1 -0
- package/dist/http/server.d.ts.map +1 -1
- package/dist/http/server.js +5 -2
- package/dist/http/server.js.map +1 -1
- package/dist/install/discover.d.ts +1 -1
- package/dist/install/discover.d.ts.map +1 -1
- package/dist/install/discover.js +1 -0
- package/dist/install/discover.js.map +1 -1
- package/dist/openapi.d.ts.map +1 -1
- package/dist/openapi.js +6 -0
- package/dist/openapi.js.map +1 -1
- package/dist/skills/_shared/verify-and-review.md +12 -0
- package/dist/skills/mma-audit/SKILL.md +45 -18
- package/dist/skills/mma-clarifications/SKILL.md +73 -29
- package/dist/skills/mma-context-blocks/SKILL.md +56 -24
- package/dist/skills/mma-debug/SKILL.md +54 -22
- package/dist/skills/mma-delegate/SKILL.md +58 -26
- package/dist/skills/mma-execute-plan/SKILL.md +55 -29
- package/dist/skills/mma-investigate/SKILL.md +137 -0
- package/dist/skills/mma-retry/SKILL.md +65 -22
- package/dist/skills/mma-review/SKILL.md +49 -20
- package/dist/skills/mma-verify/SKILL.md +49 -18
- package/dist/skills/multi-model-agent/SKILL.md +84 -46
- package/package.json +2 -2
|
@@ -1,30 +1,45 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: mma-audit
|
|
3
3
|
description: >-
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
per file
|
|
4
|
+
Use when the user asks to audit a document, spec, config, or PR description
|
|
5
|
+
for security, correctness, performance, or style issues — and the audit can
|
|
6
|
+
run in parallel per file with no context pollution
|
|
7
7
|
when_to_use: >-
|
|
8
|
-
|
|
9
|
-
|
|
10
|
-
|
|
11
|
-
|
|
12
|
-
|
|
13
|
-
version: 3.
|
|
8
|
+
User asks for a doc/spec/config audit OR a methodology skill
|
|
9
|
+
(superpowers:dispatching-parallel-agents, /security-review) points at one AND
|
|
10
|
+
mmagent is running. Delegate so each file audits on its own worker; the main
|
|
11
|
+
agent only synthesizes findings. Audit PROSE/SPEC docs here; use mma-review for
|
|
12
|
+
source code.
|
|
13
|
+
version: 3.4.0
|
|
14
14
|
---
|
|
15
15
|
|
|
16
|
-
|
|
16
|
+
# mma-audit
|
|
17
17
|
|
|
18
|
-
|
|
19
|
-
file is audited independently in parallel; results are indexed by file.
|
|
18
|
+
## Overview
|
|
20
19
|
|
|
21
|
-
|
|
20
|
+
Send a document or set of files to workers for structured auditing. Each file is audited independently in parallel; per-file results are indexed by path in the terminal envelope.
|
|
21
|
+
|
|
22
|
+
**Core principle:** One worker per file = no cross-file context pollution. The aggregator (you) decides what to do with the findings.
|
|
23
|
+
|
|
24
|
+
## When to Use
|
|
25
|
+
|
|
26
|
+
**Use when:**
|
|
27
|
+
- A spec / design doc / API contract / config file needs a critical read
|
|
28
|
+
- The audit type is `security`, `performance`, `correctness`, or `style` (or a combination)
|
|
29
|
+
- 2+ files would benefit from parallel audit
|
|
30
|
+
|
|
31
|
+
**Don't use when:**
|
|
32
|
+
- The thing being audited is source code → `mma-review` (knows about types, call sites, test coverage)
|
|
33
|
+
- You want a quick look ("does this look right?") → just `Read` and use your judgment
|
|
34
|
+
- The doc references many other files the auditor must cross-reference → consider `mma-review` instead (it pulls in source context)
|
|
35
|
+
|
|
36
|
+
## Endpoint
|
|
22
37
|
|
|
23
38
|
`POST /audit?cwd=<abs-path>`
|
|
24
39
|
|
|
25
40
|
@include _shared/auth.md
|
|
26
41
|
|
|
27
|
-
|
|
42
|
+
## Request body
|
|
28
43
|
|
|
29
44
|
```json
|
|
30
45
|
{
|
|
@@ -39,12 +54,12 @@ file is audited independently in parallel; results are indexed by file.
|
|
|
39
54
|
|---|---|---|---|
|
|
40
55
|
| `document` | string | no | Inline document content |
|
|
41
56
|
| `auditType` | string \| string[] | yes | `security`, `performance`, `correctness`, `style`, or `general`; or an array of the first four |
|
|
42
|
-
| `filePaths` | string[] | no | Files to audit (parallel) |
|
|
57
|
+
| `filePaths` | string[] | no | Files to audit (one worker per file, parallel) |
|
|
43
58
|
| `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |
|
|
44
59
|
|
|
45
60
|
Either `document` or `filePaths` (or both) must be provided.
|
|
46
61
|
|
|
47
|
-
|
|
62
|
+
## Full example
|
|
48
63
|
|
|
49
64
|
```bash
|
|
50
65
|
BATCH=$(curl -f --show-error -s -X POST \
|
|
@@ -55,10 +70,22 @@ BATCH=$(curl -f --show-error -s -X POST \
|
|
|
55
70
|
BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
|
|
56
71
|
```
|
|
57
72
|
|
|
58
|
-
Then poll until complete:
|
|
59
|
-
|
|
60
73
|
@include _shared/polling.md
|
|
61
74
|
|
|
62
75
|
@include _shared/response-shape.md
|
|
63
76
|
|
|
77
|
+
## Common pitfalls
|
|
78
|
+
|
|
79
|
+
❌ **Auditing source code with `mma-audit`**
|
|
80
|
+
The auditor lacks codebase context (no type info, no call-site lookup, no test awareness). Findings are speculative. **Fix:** use `mma-review` — it pulls in surrounding source context and validates against the actual types.
|
|
81
|
+
|
|
82
|
+
❌ **Single huge `document` string instead of `filePaths`**
|
|
83
|
+
Inline docs lose the file boundary, so the per-file parallel split degenerates to one worker. **Fix:** save to disk first, pass `filePaths`.
|
|
84
|
+
|
|
85
|
+
❌ **Asking for `auditType: "general"` when you mean something specific**
|
|
86
|
+
`"general"` is a catch-all that produces watery findings. **Fix:** pick the dimension you actually care about (`"correctness"` for spec gaps, `"security"` for threat models, etc.).
|
|
87
|
+
|
|
88
|
+
❌ **Re-auditing the same files round after round without delta context**
|
|
89
|
+
The round 2 worker has no idea what round 1 found. **Fix:** register the round 1 findings as a context block (`mma-context-blocks`) and pass `contextBlockIds` to round 2.
|
|
90
|
+
|
|
64
91
|
@include _shared/error-handling.md
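
The preference for `filePaths` over one inline `document` can be sketched as a small helper that assembles the request body. This is a minimal sketch, assuming `jq` 1.6 or later is installed; `build_audit_body` is a hypothetical name, and only the `auditType` and `filePaths` fields from the table above are used.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the package): build a /audit request body
# with one filePaths entry per file, so each file can get its own worker.
# Usage: build_audit_body <auditType> <file>...
build_audit_body() {
  local audit_type=$1
  shift
  # --args exposes the remaining arguments as $ARGS.positional (a JSON array
  # of strings), so paths containing spaces or quotes are escaped correctly.
  jq -cn --arg auditType "$audit_type" \
    '{auditType: $auditType, filePaths: $ARGS.positional}' --args "$@"
}

# Example paths are placeholders; nothing is read from disk here.
build_audit_body security /project/docs/spec.md "/project/docs/api contract.md"
```

The output can be passed to `curl` with `-d "$(build_audit_body ...)"` in place of a hand-escaped JSON string.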
|
|
@@ -1,33 +1,65 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: mma-clarifications
|
|
3
3
|
description: >-
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
|
|
4
|
+
Use when a previous mma-* batch's terminal envelope has
|
|
5
|
+
`proposedInterpretation` as a string (not the `not_applicable` sentinel) — the
|
|
6
|
+
service paused waiting for you to confirm or correct its read of the task
|
|
7
7
|
when_to_use: >-
|
|
8
|
-
A previous mma-delegate / mma-audit / mma-review / mma-execute-plan /
|
|
9
|
-
terminal envelope has `proposedInterpretation` as
|
|
10
|
-
|
|
11
|
-
|
|
12
|
-
version: 3.
|
|
8
|
+
A previous mma-delegate / mma-audit / mma-review / mma-execute-plan /
|
|
9
|
+
mma-debug / mma-investigate terminal envelope has `proposedInterpretation` as
|
|
10
|
+
a string. Read the proposal, decide whether to accept or correct it, then call
|
|
11
|
+
this skill. The batch resumes immediately after the POST returns.
|
|
12
|
+
version: 3.4.0
|
|
13
13
|
---
|
|
14
14
|
|
|
15
|
-
|
|
15
|
+
# mma-clarifications
|
|
16
16
|
|
|
17
|
-
|
|
18
|
-
proposed an interpretation of the task and is waiting for your decision.
|
|
19
|
-
Read the proposal, then call `POST /clarifications/confirm` to either accept
|
|
20
|
-
or correct it. The batch resumes immediately after confirmation.
|
|
17
|
+
## Overview
|
|
21
18
|
|
|
22
|
-
|
|
19
|
+
When a batch pauses with `state: 'awaiting_clarification'`, the service has proposed an interpretation of an ambiguous task and is waiting for your decision. Read the proposal, then `POST /clarifications/confirm` with either the proposal verbatim (accept) or a corrected version (override). The batch resumes immediately.
|
|
20
|
+
|
|
21
|
+
**Core principle:** Clarification is a quality gate, not an error. Ambiguous tasks would silently produce the wrong work — the pause forces a deliberate choice.
|
|
22
|
+
|
|
23
|
+
## When to Use
|
|
24
|
+
|
|
25
|
+
```dot
|
|
26
|
+
digraph when_to_use {
|
|
27
|
+
"Polling a batch?" [shape=diamond];
|
|
28
|
+
"state == awaiting_clarification?" [shape=diamond];
|
|
29
|
+
"proposedInterpretation is a string?" [shape=diamond];
|
|
30
|
+
"Read proposal" [shape=box];
|
|
31
|
+
"Accept or correct" [shape=diamond];
|
|
32
|
+
"POST proposal verbatim" [shape=box];
|
|
33
|
+
"POST corrected text" [shape=box];
|
|
34
|
+
|
|
35
|
+
"Polling a batch?" -> "state == awaiting_clarification?";
|
|
36
|
+
"state == awaiting_clarification?" -> "proposedInterpretation is a string?" [label="yes"];
|
|
37
|
+
"state == awaiting_clarification?" -> "Continue polling" [label="no"];
|
|
38
|
+
"proposedInterpretation is a string?" -> "Read proposal" [label="yes"];
|
|
39
|
+
"Read proposal" -> "Accept or correct";
|
|
40
|
+
"Accept or correct" -> "POST proposal verbatim" [label="proposal is right"];
|
|
41
|
+
"Accept or correct" -> "POST corrected text" [label="proposal is wrong"];
|
|
42
|
+
}
|
|
43
|
+
```
|
|
44
|
+
|
|
45
|
+
**Use when:**
|
|
46
|
+
- Polling a batch and the terminal envelope has `proposedInterpretation` as a string
|
|
47
|
+
- The mma-* skill that dispatched explicitly references this skill in its "if awaiting_clarification" line
|
|
48
|
+
|
|
49
|
+
**Don't use when:**
|
|
50
|
+
- `proposedInterpretation` is `{ kind: 'not_applicable', ... }` → batch isn't waiting; just read `results`
|
|
51
|
+
- The batch failed (`error` is a real object) → don't confirm; debug or re-dispatch
|
|
52
|
+
- You don't yet have a `batchId` → this skill resumes existing batches, not new ones
|
|
53
|
+
|
|
54
|
+
## Endpoint
|
|
23
55
|
|
|
24
56
|
`POST /clarifications/confirm`
|
|
25
57
|
|
|
26
|
-
Auth required.
|
|
58
|
+
Auth required. NOT cwd-gated — operates on a `batchId`.
|
|
27
59
|
|
|
28
60
|
@include _shared/auth.md
|
|
29
61
|
|
|
30
|
-
|
|
62
|
+
## Request body
|
|
31
63
|
|
|
32
64
|
```json
|
|
33
65
|
{
|
|
@@ -39,38 +71,50 @@ Auth required. Not cwd-gated (operates on a `batchId`).
|
|
|
39
71
|
| Field | Type | Required | Notes |
|
|
40
72
|
|---|---|---|---|
|
|
41
73
|
| `batchId` | string (UUID) | yes | Batch in `awaiting_clarification` state |
|
|
42
|
-
| `interpretation` | string | yes | Accept proposal verbatim
|
|
74
|
+
| `interpretation` | string | yes | Accept proposal verbatim, OR provide corrected text the worker should follow instead |
|
|
43
75
|
|
|
44
|
-
|
|
76
|
+
## Response (200)
|
|
45
77
|
|
|
46
78
|
```json
|
|
47
79
|
{ "batchId": "...", "state": "pending" }
|
|
48
80
|
```
|
|
49
81
|
|
|
50
|
-
`state` is usually `pending` (batch resumes).
|
|
51
|
-
executor was already waiting and finishes immediately.
|
|
82
|
+
`state` is usually `pending` (batch resumes). May be `complete` if the executor was already waiting and finishes immediately.
|
|
52
83
|
|
|
53
|
-
|
|
84
|
+
## Full flow
|
|
54
85
|
|
|
55
86
|
```bash
|
|
56
|
-
# 1. Poll until
|
|
57
|
-
|
|
58
|
-
"http://localhost:$PORT/batch/$BATCH_ID"
|
|
87
|
+
# 1. Poll until terminal
|
|
88
|
+
RESP=$(curl -f --show-error -s -H "Authorization: Bearer $TOKEN" \
|
|
89
|
+
"http://localhost:$PORT/batch/$BATCH_ID")
|
|
59
90
|
|
|
60
|
-
# 2.
|
|
61
|
-
PROPOSAL=$(
|
|
62
|
-
"http://localhost:$PORT/batch/$BATCH_ID" | jq -r '.proposedInterpretation')
|
|
91
|
+
# 2. Check for a string proposal (not the not_applicable sentinel)
|
|
92
|
+
PROPOSAL=$(echo "$RESP" | jq -r 'select(.proposedInterpretation | type == "string") | .proposedInterpretation')
|
|
63
93
|
|
|
64
|
-
# 3. Confirm
|
|
94
|
+
# 3. Confirm — accept proposal verbatim, or supply corrected text
|
|
65
95
|
curl -f --show-error -s -X POST \
|
|
66
96
|
-H "Authorization: Bearer $TOKEN" \
|
|
67
97
|
-H "Content-Type: application/json" \
|
|
68
98
|
-d "{\"batchId\":\"$BATCH_ID\",\"interpretation\":\"$PROPOSAL\"}" \
|
|
69
99
|
"http://localhost:$PORT/clarifications/confirm"
|
|
70
100
|
|
|
71
|
-
# 4. Resume polling
|
|
101
|
+
# 4. Resume polling for terminal
|
|
72
102
|
```
|
|
73
103
|
|
|
74
104
|
@include _shared/polling.md
|
|
75
105
|
|
|
106
|
+
## Common pitfalls
|
|
107
|
+
|
|
108
|
+
❌ **Confirming a wrong proposal verbatim because "the service knows best"**
|
|
109
|
+
The service is GUESSING from limited context. If the proposal would do the wrong thing, supply corrected `interpretation` text. **Why:** post-confirmation work is hard to undo.
|
|
110
|
+
|
|
111
|
+
❌ **Treating the pause as an error**
|
|
112
|
+
`awaiting_clarification` is a SUCCESS path — it caught ambiguity before producing wrong work. Read, decide, confirm.
|
|
113
|
+
|
|
114
|
+
❌ **Forgetting the `batchId` is the original, not a new one**
|
|
115
|
+
This endpoint mutates the existing batch — it does not create a new one. **Fix:** poll the SAME `batchId` after confirming.
|
|
116
|
+
|
|
117
|
+
❌ **Polling without checking `proposedInterpretation`'s shape**
|
|
118
|
+
The field is either a `string` (paused) or `{ kind: 'not_applicable' }` (terminal). **Fix:** check the JSON type before treating it as text.
|
|
119
|
+
|
|
76
120
|
@include _shared/error-handling.md
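
A proposal that contains double quotes or newlines cannot be pasted into a JSON string as-is, so the confirm body in step 3 of the flow is safer when `jq` builds it. The following is a minimal sketch, assuming `jq` is installed; `extract_proposal` and `build_confirm_body` are hypothetical helper names, and the sample envelope is made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read an envelope on stdin and print the proposal only
# when proposedInterpretation is a string. Prints nothing for the
# not_applicable sentinel object.
extract_proposal() {
  jq -r 'select(.proposedInterpretation | type == "string") | .proposedInterpretation'
}

# Hypothetical helper: build the /clarifications/confirm body.
# --arg passes each value as a JSON string, so quotes and newlines are escaped.
# Usage: build_confirm_body <batchId> <interpretation>
build_confirm_body() {
  jq -cn --arg batchId "$1" --arg interpretation "$2" \
    '{batchId: $batchId, interpretation: $interpretation}'
}

# Made-up sample envelope, for illustration only.
RESP='{"batchId":"b-1","proposedInterpretation":"Rename \"foo\" to \"bar\" in login.ts"}'
PROPOSAL=$(printf '%s' "$RESP" | extract_proposal)
if [ -n "$PROPOSAL" ]; then
  build_confirm_body "b-1" "$PROPOSAL"
fi
```

To override rather than accept, pass corrected text as the second argument instead of `$PROPOSAL`.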
|
|
@@ -1,23 +1,40 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: mma-context-blocks
|
|
3
3
|
description: >-
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
|
|
4
|
+
Use when a document larger than ~2 KB will be referenced by 2+ subsequent
|
|
5
|
+
mma-* calls — register once, pass the returned ID to each call instead of
|
|
6
|
+
re-uploading the same content
|
|
7
7
|
when_to_use: >-
|
|
8
|
-
A document
|
|
9
|
-
|
|
10
|
-
|
|
11
|
-
/ mma-
|
|
12
|
-
|
|
13
|
-
version: 3.
|
|
8
|
+
A document (spec, plan, codebase summary, prior round's findings, error log)
|
|
9
|
+
larger than ~2 KB will be referenced by two or more mma-* calls in a row.
|
|
10
|
+
Register once here, then pass the ID via `contextBlockIds` on mma-delegate /
|
|
11
|
+
mma-execute-plan / mma-audit / mma-review / mma-verify / mma-debug /
|
|
12
|
+
mma-investigate. Cheaper and faster than inlining the same content N times.
|
|
13
|
+
version: 3.4.0
|
|
14
14
|
---
|
|
15
15
|
|
|
16
|
-
|
|
16
|
+
# mma-context-blocks
|
|
17
17
|
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
prompt that references
|
|
18
|
+
## Overview
|
|
19
|
+
|
|
20
|
+
Store large documents once; reference them by ID in subsequent `mma-*` calls via `contextBlockIds`. The service prepends the block content to each task prompt that references the ID — content is transmitted ONCE to the daemon, then reused server-side.
|
|
21
|
+
|
|
22
|
+
**Core principle:** Without context blocks, the same document is sent N times for N tasks. Blocks transmit once. The savings compound on shared specs, prior-round findings, and codebase summaries.
|
|
23
|
+
|
|
24
|
+
## When to Use
|
|
25
|
+
|
|
26
|
+
**Use when:**
|
|
27
|
+
- A doc >2 KB will be referenced by ≥2 mma-* calls
|
|
28
|
+
- You're running iterative audit/review rounds (round 2 references round 1's findings)
|
|
29
|
+
- A spec or design doc is the shared input across N parallel tasks
|
|
30
|
+
- A long error log is the context for debug + delegate calls
|
|
31
|
+
|
|
32
|
+
**Don't use when:**
|
|
33
|
+
- The doc is <2 KB and used once → just inline it (registration overhead exceeds savings)
|
|
34
|
+
- The doc changes between calls → context blocks are immutable; register a new one
|
|
35
|
+
- Single task that doesn't reference any large shared content → no benefit
|
|
36
|
+
|
|
37
|
+
## Endpoints
|
|
21
38
|
|
|
22
39
|
### Register a context block
|
|
23
40
|
|
|
@@ -37,7 +54,7 @@ prompt that references it.
|
|
|
37
54
|
| Field | Type | Required | Notes |
|
|
38
55
|
|---|---|---|---|
|
|
39
56
|
| `content` | string | yes | Document content (min 1 char) |
|
|
40
|
-
| `ttlMs` | number | no | Time-to-live in ms; omit for session-scoped |
|
|
57
|
+
| `ttlMs` | number | no | Time-to-live in ms; omit for session-scoped (default 1h) |
|
|
41
58
|
|
|
42
59
|
#### Response (201)
|
|
43
60
|
|
|
@@ -45,34 +62,49 @@ prompt that references it.
|
|
|
45
62
|
{ "id": "cb_abc123" }
|
|
46
63
|
```
|
|
47
64
|
|
|
48
|
-
Use this `id` as a `contextBlockIds` entry in `mma
|
|
49
|
-
`mma-review`, `mma-verify`, `mma-debug`, or `mma-execute-plan`.
|
|
65
|
+
Use this `id` as a `contextBlockIds` entry in any `mma-*` skill that supports it.
|
|
50
66
|
|
|
51
67
|
### Delete a context block
|
|
52
68
|
|
|
53
69
|
`DELETE /context-blocks/:id?cwd=<abs-path>`
|
|
54
70
|
|
|
55
|
-
Returns `200 { ok: true }` on success.
|
|
56
|
-
|
|
57
|
-
Returns `409 pinned` if the block is held by one or more active batches —
|
|
58
|
-
wait for those batches to complete before deleting.
|
|
71
|
+
Returns `200 { ok: true }` on success. Returns `409 pinned` if the block is held by one or more active batches — wait for those batches to complete before deleting.
|
|
59
72
|
|
|
60
|
-
|
|
73
|
+
## Full example
|
|
61
74
|
|
|
62
75
|
```bash
|
|
63
|
-
# Register spec document
|
|
76
|
+
# Register spec document once
|
|
64
77
|
ID=$(curl -f --show-error -s -X POST \
|
|
65
78
|
-H "Authorization: Bearer $TOKEN" \
|
|
66
79
|
-H "Content-Type: application/json" \
|
|
67
80
|
-d "{\"content\":$(jq -Rs . < /project/docs/spec.md)}" \
|
|
68
81
|
"http://localhost:$PORT/context-blocks?cwd=/project" | jq -r '.id')
|
|
69
82
|
|
|
70
|
-
#
|
|
83
|
+
# Reference from N delegate tasks
|
|
71
84
|
curl -f --show-error -s -X POST \
|
|
72
85
|
-H "Authorization: Bearer $TOKEN" \
|
|
73
86
|
-H "Content-Type: application/json" \
|
|
74
|
-
-d "{\"tasks\":[
|
|
87
|
+
-d "{\"tasks\":[
|
|
88
|
+
{\"prompt\":\"Implement section 3 per spec\",\"contextBlockIds\":[\"$ID\"]},
|
|
89
|
+
{\"prompt\":\"Implement section 4 per spec\",\"contextBlockIds\":[\"$ID\"]}
|
|
90
|
+
]}" \
|
|
75
91
|
"http://localhost:$PORT/delegate?cwd=/project"
|
|
76
92
|
```
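
The hand-escaped `tasks` array above gets harder to maintain as the number of prompts grows. One alternative is to let `jq` generate it. This is a minimal sketch, assuming `jq` 1.6 or later; `build_tasks_body` is a hypothetical helper, and `cb_abc123` is the sample ID from the response example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build a /delegate body in which every task references
# the same context block.
# Usage: build_tasks_body <contextBlockId> <prompt>...
build_tasks_body() {
  local id=$1
  shift
  # Each positional argument becomes one task object; the block ID is attached
  # to every task instead of inlining the document.
  jq -cn --arg id "$id" \
    '{tasks: [$ARGS.positional[] | {prompt: ., contextBlockIds: [$id]}]}' \
    --args "$@"
}

build_tasks_body cb_abc123 \
  "Implement section 3 per spec" \
  "Implement section 4 per spec"
```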
|
|
77
93
|
|
|
94
|
+
## Common pitfalls
|
|
95
|
+
|
|
96
|
+
❌ **Inlining the same 50KB spec into every task prompt**
|
|
97
|
+
> tasks: [{prompt: "Implement section 3:\n[50KB spec]"}, {prompt: "Implement section 4:\n[50KB spec]"}]
|
|
98
|
+
|
|
99
|
+
N×50KB transmissions; main context burns through tokens. **Fix:** register the spec once, pass `contextBlockIds: ["cb_xxx"]` to each task.
|
|
100
|
+
|
|
101
|
+
❌ **Forgetting to delete blocks that are no longer needed**
|
|
102
|
+
Blocks count against the project's context-block quota. **Fix:** explicitly `DELETE` after the dependent batches finish — or set a short `ttlMs` so they self-evict.
|
|
103
|
+
|
|
104
|
+
❌ **Trying to update a block's content**
|
|
105
|
+
Blocks are immutable. **Fix:** register a new block with the new content; switch the `contextBlockIds` to the new ID.
|
|
106
|
+
|
|
107
|
+
❌ **Deleting a block while a batch still references it**
|
|
108
|
+
Returns `409 pinned`. **Fix:** poll the dependent batches to terminal first, then delete.
|
|
109
|
+
|
|
78
110
|
@include _shared/error-handling.md
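
The `200` / `409 pinned` contract of the delete endpoint can be sketched as a branch on the HTTP status code. This is a minimal sketch; `delete_next_step` is a hypothetical helper, and the labels it prints are illustrative, not values returned by the service.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map the HTTP status of DELETE /context-blocks/:id
# to the next action described in the text above.
# Usage: delete_next_step <http-status>
delete_next_step() {
  case "$1" in
    200) echo "deleted" ;;     # block removed
    409) echo "pinned" ;;      # still held by an active batch: poll, then retry
    *)   echo "unexpected" ;;  # anything else: see the error-handling notes
  esac
}

delete_next_step 409
```

The status code itself can be captured with `curl -s -o /dev/null -w '%{http_code}' -X DELETE ...`, run without `-f` so that a 409 does not make `curl` exit with an error before the branch runs.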
|
|
@@ -1,31 +1,46 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: mma-debug
|
|
3
3
|
description: >-
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
worker
|
|
4
|
+
Use when a test fails, a build breaks, or behavior is unexpected AND narrowing
|
|
5
|
+
the root cause requires reading files, reproducing the failure, or tracing
|
|
6
|
+
across multiple modules — the worker investigates so the main agent stays on
|
|
7
|
+
the hypothesis
|
|
7
8
|
when_to_use: >-
|
|
8
|
-
A
|
|
9
|
-
files, reproduce
|
|
9
|
+
A failure has surfaced (test/build/runtime) AND you need investigation work —
|
|
10
|
+
read files, reproduce, trace — OR a methodology skill
|
|
10
11
|
(superpowers:systematic-debugging) points at the investigation step. Delegate
|
|
11
|
-
the read/reproduce/trace
|
|
12
|
-
|
|
13
|
-
version: 3.3.0
|
|
12
|
+
the read/reproduce/trace; the main agent stays on the hypothesis and the fix.
|
|
13
|
+
version: 3.4.0
|
|
14
14
|
---
|
|
15
15
|
|
|
16
|
-
|
|
16
|
+
# mma-debug
|
|
17
17
|
|
|
18
|
-
|
|
19
|
-
debugging. Unlike other tools, all `filePaths` are investigated together
|
|
20
|
-
in a single task (not parallelised per file).
|
|
18
|
+
## Overview
|
|
21
19
|
|
|
22
|
-
|
|
20
|
+
Submit a problem, context, and hypothesis to a worker for focused debugging. Unlike `mma-audit` and `mma-review`, all `filePaths` are investigated TOGETHER in a single task (not parallelized per file) — debugging needs cross-file reasoning.
|
|
21
|
+
|
|
22
|
+
**Core principle:** The hypothesis is judgment (your job). Reading files and reproducing the failure is labor (the worker's job). Pass the hypothesis as input; receive structured findings.
|
|
23
|
+
|
|
24
|
+
## When to Use
|
|
25
|
+
|
|
26
|
+
**Use when:**
|
|
27
|
+
- A test fails / build breaks / runtime behavior is unexpected
|
|
28
|
+
- The root cause likely spans 2+ files
|
|
29
|
+
- You have a hypothesis to test (or want the worker to suggest one)
|
|
30
|
+
- A methodology skill (`superpowers:systematic-debugging`) routed here
|
|
31
|
+
|
|
32
|
+
**Don't use when:**
|
|
33
|
+
- The error message points at one file you can read in 30 seconds → just `Read`
|
|
34
|
+
- You don't know what's broken yet → use `mma-investigate` first to map the area
|
|
35
|
+
- You already know the fix → skip debug, dispatch `mma-delegate` with the fix
|
|
36
|
+
|
|
37
|
+
## Endpoint
|
|
23
38
|
|
|
24
39
|
`POST /debug?cwd=<abs-path>`
|
|
25
40
|
|
|
26
41
|
@include _shared/auth.md
|
|
27
42
|
|
|
28
|
-
|
|
43
|
+
## Request body
|
|
29
44
|
|
|
30
45
|
```json
|
|
31
46
|
{
|
|
@@ -42,13 +57,13 @@ in a single task (not parallelised per file).
|
|
|
42
57
|
|
|
43
58
|
| Field | Type | Required | Notes |
|
|
44
59
|
|---|---|---|---|
|
|
45
|
-
| `problem` | string | yes | What is broken |
|
|
46
|
-
| `context` | string | no | Background
|
|
47
|
-
| `hypothesis` | string | no |
|
|
48
|
-
| `filePaths` | string[] | no | All files investigated together |
|
|
49
|
-
| `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |
|
|
60
|
+
| `problem` | string | yes | What is broken (one sentence; concrete symptom) |
|
|
61
|
+
| `context` | string | no | Background — what changed recently, what works, what doesn't |
|
|
62
|
+
| `hypothesis` | string | no | Your initial theory; worker tests it first, then explores |
|
|
63
|
+
| `filePaths` | string[] | no | All files investigated together (cross-file reasoning) |
|
|
64
|
+
| `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` (e.g. error logs, traces) |
|
|
50
65
|
|
|
51
|
-
|
|
66
|
+
## Full example
|
|
52
67
|
|
|
53
68
|
```bash
|
|
54
69
|
BATCH=$(curl -f --show-error -s -X POST \
|
|
@@ -59,10 +74,27 @@ BATCH=$(curl -f --show-error -s -X POST \
|
|
|
59
74
|
BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
|
|
60
75
|
```
|
|
61
76
|
|
|
62
|
-
Then poll until complete:
|
|
63
|
-
|
|
64
77
|
@include _shared/polling.md
|
|
65
78
|
|
|
66
79
|
@include _shared/response-shape.md
|
|
67
80
|
|
|
81
|
+
## Common pitfalls
|
|
82
|
+
|
|
83
|
+
❌ **Vague `problem`**
|
|
84
|
+
> "The login is broken"
|
|
85
|
+
|
|
86
|
+
Worker has no symptom to chase. **Fix:** specific reproducer — `"POST /login with body {user:'a@b.c', pass:'café'} returns 500 with 'invalid character' in stderr"`.
|
|
87
|
+
|
|
88
|
+
❌ **No `hypothesis`**
|
|
89
|
+
The worker explores blindly and often investigates the wrong area first. **Fix:** even a weak hypothesis ("might be encoding-related") narrows the search space.
|
|
90
|
+
|
|
91
|
+
❌ **Splitting one bug across multiple `mma-debug` calls**
|
|
92
|
+
Debug intentionally bundles `filePaths` for cross-file reasoning. Splitting defeats this. **Fix:** one call with all suspect files; if you really have N independent failures, use `mma-delegate` with N tasks.
|
|
93
|
+
|
|
94
|
+
❌ **Treating `mma-debug` as the fix step**
|
|
95
|
+
Debug investigates and proposes; it doesn't necessarily write the fix. If the worker identifies a fix, dispatch `mma-delegate` to implement it (or write it inline if you understand it).
|
|
96
|
+
|
|
97
|
+
❌ **Skipping when an error message looks self-explanatory**
|
|
98
|
+
Often the obvious cause isn't the real one. A 30-second debug pass costs less than a wrong fix that breaks something else.
|
|
99
|
+
|
|
68
100
|
@include _shared/error-handling.md
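
Because `context`, `hypothesis`, and `filePaths` are optional, a request builder can leave out whatever is empty rather than send blank strings. This is a minimal sketch, assuming `jq` 1.6 or later; `build_debug_body` is a hypothetical helper that covers only `problem`, `hypothesis`, and `filePaths`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build a /debug body, omitting optional fields that are empty.
# Usage: build_debug_body <problem> <hypothesis-or-empty> [file...]
build_debug_body() {
  local problem=$1 hypothesis=$2
  shift 2
  # Object addition merges the optional parts only when they carry a value.
  jq -cn --arg problem "$problem" --arg hypothesis "$hypothesis" \
    '{problem: $problem}
     + (if $hypothesis == "" then {} else {hypothesis: $hypothesis} end)
     + (if ($ARGS.positional | length) == 0 then {} else {filePaths: $ARGS.positional} end)' \
    --args "$@"
}

# All files go into one request, matching the single-task behaviour of /debug.
build_debug_body \
  "POST /login returns 500 with 'invalid character' in stderr" \
  "might be encoding-related" \
  src/login.ts src/body-parser.ts
```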
|
|
@@ -1,32 +1,47 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: mma-delegate
|
|
3
3
|
description: >-
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
|
|
4
|
+
Use when you have one or more ad-hoc implementation or research tasks WITHOUT
|
|
5
|
+
a plan file on disk and you want them to run on cheap workers in parallel
|
|
6
|
+
instead of consuming main-context tokens
|
|
7
7
|
when_to_use: >-
|
|
8
|
-
You have
|
|
9
|
-
|
|
10
|
-
|
|
11
|
-
|
|
12
|
-
|
|
13
|
-
|
|
14
|
-
version: 3.
|
|
8
|
+
You have ad-hoc implementation or research tasks (no plan file on disk) AND
|
|
9
|
+
mmagent is running. Prefer this over inline Agent dispatches or
|
|
10
|
+
superpowers:dispatching-parallel-agents — workers are cheaper, parallel-safe,
|
|
11
|
+
and keep main context free. If a plan file exists → use mma-execute-plan. If
|
|
12
|
+
the task is audit / review / verify / debug / investigate → use the matching
|
|
13
|
+
specialized skill.
|
|
14
|
+
version: 3.4.0
|
|
15
15
|
---
|
|
16
16
|
|
|
17
|
-
|
|
17
|
+
# mma-delegate
|
|
18
18
|
|
|
19
|
-
|
|
20
|
-
independent instruction with optional file scope, acceptance criteria, and
|
|
21
|
-
context block references.
|
|
19
|
+
## Overview
|
|
22
20
|
|
|
23
|
-
|
|
21
|
+
Dispatch one or more ad-hoc tasks to sub-agents concurrently. Each task is an independent instruction with optional file scope, acceptance criteria, and context blocks.
|
|
22
|
+
|
|
23
|
+
**Core principle:** Workers run on cheap providers; the main agent consumes only the structured per-task report. Parallelize freely as long as tasks don't write the same files.
|
|
24
|
+
|
|
25
|
+
## When to Use
|
|
26
|
+
|
|
27
|
+
**Use when:**
|
|
28
|
+
- 2+ unrelated implementation tasks (parallel speedup)
|
|
29
|
+
- A research task you'd otherwise spend tokens reading and grepping
|
|
30
|
+
- A focused refactor that fits in one prompt
|
|
31
|
+
- The task does NOT match audit / review / verify / debug / investigate (those have specialized skills)
|
|
32
|
+
|
|
33
|
+
**Don't use when:**
|
|
34
|
+
- A plan file exists on disk → `mma-execute-plan` (descriptors auto-match plan headings)
|
|
35
|
+
- Two tasks write the same file → dispatch sequentially, not in one batch (workers race)
|
|
36
|
+
- The work needs to read across many files for synthesis only → `mma-investigate` is cheaper (read-only)
|
|
37
|
+
|
|
38
|
+
## Endpoint
|
|
24
39
|
|
|
25
40
|
`POST /delegate?cwd=<abs-path>`
|
|
26
41
|
|
|
27
42
|
@include _shared/auth.md
|
|
28
43
|
|
|
29
|
-
|
|
44
|
+
## Request body
|
|
30
45
|
|
|
31
46
|
```json
|
|
32
47
|
{
|
|
@@ -46,18 +61,16 @@ context block references.
|
|
|
46
61
|
|---|---|---|---|
|
|
47
62
|
| `tasks` | array | yes | At least one task |
|
|
48
63
|
| `tasks[].prompt` | string | yes | The task instruction |
|
|
49
|
-
| `tasks[].agentType` | `"standard"` / `"complex"` | no | Worker tier. Default `"standard"
|
|
50
|
-
| `tasks[].filePaths` | string[] | no | Files the
|
|
64
|
+
| `tasks[].agentType` | `"standard"` / `"complex"` | no | Worker tier. Default `"standard"`. Pick `"complex"` when the task is ambiguous, security-sensitive, touches many files, or a prior standard run came back with `filesWritten: 0` / hit `incompleteReason: "turn_cap"`. |
|
|
65
|
+
| `tasks[].filePaths` | string[] | no | Files the worker focuses on |
|
|
51
66
|
| `tasks[].done` | string | no | Acceptance criteria |
|
|
52
67
|
| `tasks[].contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |
|
|
53
|
-
| `tasks[].verifyCommand` | string[] | no |
|
|
54
|
-
| `tasks[].reviewPolicy` | `"full"` / `"spec_only"` / `"diff_only"` / `"off"` | no |
|
|
55
|
-
|
|
56
|
-
Set `verifyCommand` when the worker can run a deterministic local check after editing, such as `npm test`, `npm run lint`, or a focused package test. Commands run in order after task completion; each string must be non-empty after trimming. Omit it when no reliable command exists.
|
|
68
|
+
| `tasks[].verifyCommand` | string[] | no | See verify-and-review snippet below |
|
|
69
|
+
| `tasks[].reviewPolicy` | `"full"` / `"spec_only"` / `"diff_only"` / `"off"` | no | See verify-and-review snippet below. Default `"full"` |
|
|
57
70
|
|
|
58
|
-
|
|
71
|
+
@include _shared/verify-and-review.md
|
|
59
72
|
|
|
60
|
-
|
|
73
|
+
## Full example
|
|
61
74
|
|
|
62
75
|
```bash
|
|
63
76
|
BATCH=$(curl -f --show-error -s -X POST \
|
|
@@ -68,10 +81,29 @@ BATCH=$(curl -f --show-error -s -X POST \
|
|
|
68
81
|
BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
|
|
69
82
|
```
|
|
70
83
|
|
|
71
|
-
Then poll until complete:
|
|
72
|
-
|
|
73
84
|
@include _shared/polling.md
|
|
74
85
|
|
|
75
86
|
@include _shared/response-shape.md
|
|
76
87
|
|
|
88
|
+
## Common pitfalls
|
|
89
|
+
|
|
90
|
+
❌ **Two tasks writing the same file in one batch**
|
|
91
|
+
> tasks: [{prompt:"add JWT to login.ts"}, {prompt:"add logging to login.ts"}]
|
|
92
|
+
|
|
93
|
+
Workers run concurrently and race on the file. **Fix:** dispatch sequentially, or merge into one prompt.
|
|
94
|
+
|
|
95
|
+
❌ **Vague `prompt`, no `done` criterion**
|
|
96
|
+
> "improve the auth module"
|
|
97
|
+
|
|
98
|
+
Worker has no completion signal — likely returns `done_with_concerns`. **Fix:** specific verb + acceptance: `"Add input validation to login.ts so all string fields reject empty/whitespace; tests pass"`.
|
|
99
|
+
|
|
100
|
+
❌ **Defaulting to `agentType: "complex"` for everything**
|
|
101
|
+
Standard tier is 5–10× cheaper and finishes most edits. Escalate only when standard returns `filesWritten: 0` or `incompleteReason: "turn_cap"`.
|
|
102
|
+
|
|
103
|
+
❌ **Inlining a 50KB doc into every prompt**
|
|
104
|
+
N tasks × 50KB = N transmissions. **Fix:** register the doc once via `mma-context-blocks`, pass the `contextBlockIds` to each task.
|
|
105
|
+
|
|
106
|
+
❌ **Reading the worker's diff inline before review**
|
|
107
|
+
The reviewer sees the full diff with the original prompt as context. Reading inline burns main-context tokens for no quality gain.
|
|
108
|
+
|
|
77
109
|
@include _shared/error-handling.md
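
Two of the pitfalls above lend themselves to mechanical checks: tasks that write the same file, and escalating to `"complex"` too early. This is a minimal sketch using only standard shell tools; `shared_paths` and `next_agent_type` are hypothetical helpers, the tab-separated input format is an assumption, and `filesWritten` is assumed to be a number.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check for a /delegate batch.
# stdin:  one "taskIndex<TAB>path" line per file a task is expected to write.
# stdout: every path claimed by more than one task (empty when the batch is safe).
shared_paths() {
  # sort -u drops a path listed twice by the same task; uniq -d then keeps
  # only paths that still appear more than once, i.e. across different tasks.
  sort -u | cut -f2 | sort | uniq -d
}

# Hypothetical escalation rule: stay on "standard" unless the previous run
# wrote nothing or hit the turn cap.
# Usage: next_agent_type <filesWritten> <incompleteReason-or-empty>
next_agent_type() {
  if [ "$1" -eq 0 ] || [ "$2" = "turn_cap" ]; then
    echo complex
  else
    echo standard
  fi
}

printf '0\tsrc/login.ts\n1\tsrc/login.ts\n1\tsrc/logger.ts\n' | shared_paths
next_agent_type 0 ""
```

If `shared_paths` prints anything, the conflicting tasks can be dispatched sequentially or merged into one prompt, as the first pitfall suggests.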
|