@zhixuan92/multi-model-agent 3.3.0 → 3.4.0

Files changed (35)
  1. package/README.md +62 -33
  2. package/dist/http/canonicalize-file-paths.d.ts +8 -0
  3. package/dist/http/canonicalize-file-paths.d.ts.map +1 -0
  4. package/dist/http/canonicalize-file-paths.js +43 -0
  5. package/dist/http/canonicalize-file-paths.js.map +1 -0
  6. package/dist/http/execution-context.d.ts.map +1 -1
  7. package/dist/http/execution-context.js +0 -14
  8. package/dist/http/execution-context.js.map +1 -1
  9. package/dist/http/handlers/tools/investigate.d.ts +4 -0
  10. package/dist/http/handlers/tools/investigate.d.ts.map +1 -0
  11. package/dist/http/handlers/tools/investigate.js +81 -0
  12. package/dist/http/handlers/tools/investigate.js.map +1 -0
  13. package/dist/http/server.d.ts.map +1 -1
  14. package/dist/http/server.js +5 -2
  15. package/dist/http/server.js.map +1 -1
  16. package/dist/install/discover.d.ts +1 -1
  17. package/dist/install/discover.d.ts.map +1 -1
  18. package/dist/install/discover.js +1 -0
  19. package/dist/install/discover.js.map +1 -1
  20. package/dist/openapi.d.ts.map +1 -1
  21. package/dist/openapi.js +6 -0
  22. package/dist/openapi.js.map +1 -1
  23. package/dist/skills/_shared/verify-and-review.md +12 -0
  24. package/dist/skills/mma-audit/SKILL.md +45 -18
  25. package/dist/skills/mma-clarifications/SKILL.md +73 -29
  26. package/dist/skills/mma-context-blocks/SKILL.md +56 -24
  27. package/dist/skills/mma-debug/SKILL.md +54 -22
  28. package/dist/skills/mma-delegate/SKILL.md +58 -26
  29. package/dist/skills/mma-execute-plan/SKILL.md +55 -29
  30. package/dist/skills/mma-investigate/SKILL.md +137 -0
  31. package/dist/skills/mma-retry/SKILL.md +65 -22
  32. package/dist/skills/mma-review/SKILL.md +49 -20
  33. package/dist/skills/mma-verify/SKILL.md +49 -18
  34. package/dist/skills/multi-model-agent/SKILL.md +84 -46
  35. package/package.json +2 -2
@@ -1,32 +1,45 @@
1
1
  ---
2
2
  name: mma-execute-plan
3
3
  description: >-
4
- Execute tasks from a plan or spec file on disk via the local mmagent HTTP
5
- service. Delegates to cheap sub-agents that don't consume your main-model
6
- context window. Task descriptors match plan headings; tasks run in parallel.
4
+ Use when a plan or spec file exists on disk (any markdown with numbered task
5
+ headings, such as docs/superpowers/plans/*.md, a TODO list, a spec doc) and you need
6
+ to implement one or more tasks from it on cheap workers in parallel
7
7
  when_to_use: >-
8
- A plan file exists on disk (any markdown with numbered task headings
9
- such as docs/superpowers/plans/*.md, a TODO list, a spec doc) AND you need to
10
- implement one or more tasks from it. Prefer this over inline Agent dispatches
11
- or superpowers:subagent-driven-development / superpowers:executing-plans when
12
- mmagent is running — delegated workers are cheaper and don't pollute main
13
- context. Task descriptors must match the plan headings verbatim.
14
- version: 3.3.0
8
+ A plan file exists on disk AND you need to implement one or more tasks from it
9
+ AND mmagent is running. Prefer this over inline Agent dispatches or
10
+ superpowers:subagent-driven-development / superpowers:executing-plans
11
+ because workers are cheaper and don't pollute main context. Task descriptors must
12
+ match plan headings verbatim.
13
+ version: 3.4.0
15
14
  ---
16
15
 
17
- ## mma-execute-plan
16
+ # mma-execute-plan
18
17
 
19
- Dispatch named tasks from a plan file to sub-agents. Task descriptors must
20
- match plan headings (e.g. `"1. Setup database schema"`). All tasks run in
21
- parallel and duplicate descriptors are rejected.
18
+ ## Overview
22
19
 
23
- ### Endpoint
20
+ Dispatch named tasks from a plan file to workers. Each `tasks` string must match a heading in the plan verbatim (e.g. `"1. Setup database schema"`). All tasks run in parallel; duplicate descriptors are rejected.
21
+
22
+ **Core principle:** The plan IS the prompt. Workers re-read the plan file in-process and find their named task — you don't need to inline the task body.
23
+
24
+ ## When to Use
25
+
26
+ **Use when:**
27
+ - A plan/spec markdown exists with numbered task headings
28
+ - You want to dispatch a subset (or all) of those tasks
29
+ - Tasks are mostly independent (parallel-safe)
30
+
31
+ **Don't use when:**
32
+ - No plan file → `mma-delegate` (pass the prompt directly)
33
+ - Tasks form a hard linear sequence (later tasks depend on earlier ones' outputs) → dispatch in order, one batch each
34
+ - The "plan" is in conversation only, not on disk → write it to disk first, or use `mma-delegate`
35
+
36
+ ## Endpoint
24
37
 
25
38
  `POST /execute-plan?cwd=<abs-path>`
26
39
 
27
40
  @include _shared/auth.md
28
41
 
29
- ### Request body
42
+ ## Request body
30
43
 
31
44
  ```json
32
45
  {
@@ -46,22 +59,19 @@ parallel and duplicate descriptors are rejected.
46
59
 
47
60
  | Field | Type | Required | Notes |
48
61
  |---|---|---|---|
49
- | `tasks` | string[] | yes | At least one; must be unique; match plan headings |
62
+ | `tasks` | string[] \| `{task, reviewPolicy}[]` | yes | At least one; must be unique; each string matches a plan heading |
50
63
  | `context` | string | no | Short additional context not in the plan |
51
- | `filePaths` | string[] | no | Plan file + relevant source files |
64
+ | `filePaths` | string[] | no | Plan file + relevant source files. Always include the plan file itself so workers can read the task body. |
52
65
  | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |
53
- | `agentType` | `"standard"` / `"complex"` | no | Worker tier. Default `"standard"` (cheap). Switch to `"complex"` for tasks too large for a standard-tier model to finish in the turn budget (reads many files, produces many edits, or the last run came back with `filesWritten: 0`). |
54
- | `verifyCommand` | string[] | no | Commands to run after each plan task completion to verify the work |
55
- | `tasks[].reviewPolicy` | `"full"` / `"spec_only"` / `"diff_only"` / `"off"` | no | Per-task review lifecycle policy when a task is passed as `{ "task": "...", "reviewPolicy": "..." }`. Default `"full"` |
56
-
57
- Set `verifyCommand` when the worker can run a deterministic local check after editing, such as `npm test`, `npm run lint`, or a focused package test. Commands run in order after task completion; each string must be non-empty after trimming. Omit it when no reliable command exists.
66
+ | `agentType` | `"standard"` / `"complex"` | no | Default `"standard"`. Use `"complex"` for tasks too large for the standard tier (reads many files, produces many edits, or the last run came back with `filesWritten: 0`). |
67
+ | `verifyCommand` | string[] | no | See verify-and-review snippet below |
68
+ | `tasks[].reviewPolicy` | `"full"` / `"spec_only"` / `"diff_only"` / `"off"` | no | See verify-and-review snippet below. Default `"full"`. |
58
69
 
59
- Set `reviewPolicy: 'diff_only'` when you want a cheaper single-pass review of the produced diff without spec-review rework loops. Use `reviewPolicy: 'full'` for default spec + quality review, `reviewPolicy: 'spec_only'` when quality review is not needed, and `reviewPolicy: 'off'` only for trusted low-risk tasks where verification is enough.
70
+ @include _shared/verify-and-review.md
60
71
 
61
- If the batch reaches `awaiting_clarification`, use `mma-clarifications`
62
- to confirm or correct the proposed interpretation.
72
+ If the batch reaches `awaiting_clarification`, use `mma-clarifications` to confirm or correct the proposed interpretation.
63
73
 
64
- ### Full example
74
+ ## Full example
65
75
 
66
76
  ```bash
67
77
  BATCH=$(curl -f --show-error -s -X POST \
@@ -72,10 +82,26 @@ BATCH=$(curl -f --show-error -s -X POST \
72
82
  BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
73
83
  ```
74
84
 
75
- Then poll until complete:
76
-
77
85
  @include _shared/polling.md
78
86
 
79
87
  @include _shared/response-shape.md
80
88
 
89
+ ## Common pitfalls
90
+
91
+ ❌ **Task descriptor doesn't match plan heading verbatim**
92
+ > tasks: ["Migrate db schema"] ← plan heading is "3. Migrate database schema"
93
+
94
+ Worker rejects with "no matching task" or matches the wrong one. **Fix:** copy the heading from the plan, including the leading number.
95
+
96
+ ❌ **Forgetting the plan file in `filePaths`**
97
+ > filePaths: ["/project/src/db/schema.sql"] ← no plan file
98
+
99
+ Worker can't read the task body. **Fix:** always include the plan path: `filePaths: ["/project/docs/plan.md", "/project/src/db/schema.sql"]`.
100
+
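The first two pitfalls (non-verbatim descriptors and a missing plan path) can be caught on the client before dispatch. The following is a minimal sketch, not part of the package: `plan_headings` and `build_execute_plan_body` are hypothetical helpers, and the heading pattern assumes that plan tasks are numbered markdown headings such as `## 3. Migrate database schema`. How the server actually matches headings is not shown in this diff.

```python
import re
from typing import Optional

# Assumption: plan tasks are numbered markdown headings, e.g. "## 3. Migrate database schema".
HEADING_RE = re.compile(r"^#{1,6}[ \t]+(\d+\.[ \t]+.+?)[ \t]*$", re.MULTILINE)


def plan_headings(plan_text: str) -> list[str]:
    """Return numbered headings with the leading '#' markers stripped."""
    return HEADING_RE.findall(plan_text)


def build_execute_plan_body(plan_path: str,
                            plan_text: str,
                            tasks: list[str],
                            extra_paths: Optional[list[str]] = None,
                            verify_command: Optional[list[str]] = None) -> dict:
    """Hypothetical client-side helper: build a request body, failing early on common mistakes."""
    if not tasks:
        raise ValueError("tasks must contain at least one descriptor")
    if len(set(tasks)) != len(tasks):
        raise ValueError("task descriptors must be unique")
    known = set(plan_headings(plan_text))
    missing = [t for t in tasks if t not in known]
    if missing:
        raise ValueError(f"descriptors not found verbatim in plan: {missing}")
    # The plan file always goes first in filePaths so workers can read the task body.
    body: dict = {"tasks": tasks, "filePaths": [plan_path, *(extra_paths or [])]}
    if verify_command:
        body["verifyCommand"] = verify_command
    return body
```

The returned dict can be serialized with `json.dumps` and sent as the `-d` payload of the `curl` call in the full example.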
101
+ ❌ **Dispatching dependent tasks in one batch**
102
+ Task 5 depends on Task 4's output → workers race; Task 5 might run before Task 4 finishes. **Fix:** dispatch Task 4, wait for it to reach a terminal state, then dispatch Task 5.
103
+
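The fix for dependent tasks amounts to one batch per task, dispatched in order. A rough sketch under stated assumptions: `run_in_order` is a hypothetical helper, and `dispatch`, `wait_terminal`, and `succeeded` are caller-supplied callables standing in for the POST, the polling loop, and a success check. Stopping at the first unsuccessful task is a design choice of this sketch, not documented behavior.

```python
from typing import Callable


def run_in_order(tasks: list[str],
                 dispatch: Callable[[list[str]], str],
                 wait_terminal: Callable[[str], dict],
                 succeeded: Callable[[dict], bool]) -> list[dict]:
    """Dispatch each task as its own batch and wait for it before starting the next."""
    results: list[dict] = []
    for task in tasks:
        batch_id = dispatch([task])        # one single-task batch per dependent step
        result = wait_terminal(batch_id)   # block until the batch reaches a terminal state
        results.append(result)
        if not succeeded(result):          # later tasks depend on this one, so stop here
            break
    return results
```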
104
+ ❌ **Skipping `verifyCommand` when one exists**
105
+ A passing local check is the cheapest signal you're going to get. **Fix:** wire `["npm test"]` or the focused package test.
106
+
81
107
  @include _shared/error-handling.md
@@ -0,0 +1,137 @@
1
+ ---
2
+ name: mma-investigate
3
+ description: >-
4
+ Use when you need to answer a question about the codebase ("how does X work",
5
+ "where is Y called", "what does this directory do") and reading + grepping the
6
+ codebase yourself would consume main-context tokens
7
+ when_to_use: >-
8
+ A question about THIS codebase has surfaced — from the user, from a
9
+ methodology skill, or from your own next-step planning — AND mmagent is
10
+ running. Delegate the read/grep/synthesis to a worker so the main context
11
+ stays on judgment. Codebase only — does not perform web research or
12
+ git-history queries.
13
+ version: 3.4.0
14
+ ---
15
+
16
+ # mma-investigate
17
+
18
+ ## Overview
19
+
20
+ Answer a codebase question via a read-only mmagent worker. The worker greps and reads on its cheap budget; you read its synthesis on yours.
21
+
22
+ **Core principle:** Investigation is labor (read, grep, synthesize). Delegate it. The main agent stays on judgment — deciding what the answer means and what to do with it.
23
+
24
+ ## When to Use
25
+
26
+ ```dot
27
+ digraph when_to_use {
28
+ "Question about codebase?" [shape=diamond];
29
+ "About web / git history?" [shape=diamond];
30
+ "Already have the file in context?" [shape=diamond];
31
+ "mma-investigate" [shape=box];
32
+ "Read inline (1–2 reads)" [shape=box];
33
+ "WebSearch / git log" [shape=box];
34
+
35
+ "Question about codebase?" -> "About web / git history?";
36
+ "About web / git history?" -> "WebSearch / git log" [label="yes"];
37
+ "About web / git history?" -> "Already have the file in context?" [label="no"];
38
+ "Already have the file in context?" -> "Read inline (1–2 reads)" [label="yes"];
39
+ "Already have the file in context?" -> "mma-investigate" [label="no"];
40
+ }
41
+ ```
42
+
43
+ **Use when:**
44
+ - "How does X work in this codebase?"
45
+ - "Where is Y called from?"
46
+ - "What does this directory do?"
47
+ - The answer requires reading 3+ files or grepping
48
+ - Cross-cutting investigations (auth flow across modules, data lineage)
49
+
50
+ **Don't use when:**
51
+ - The answer is in 1–2 files you already have in context → just `Read`
52
+ - It's about web docs / external APIs → `WebSearch` / `WebFetch`
53
+ - It's about git history → `git log` / `git blame`
54
+ - You need to MODIFY code based on the finding → `mma-delegate` (research + edit)
55
+
56
+ ## Endpoint
57
+
58
+ `POST /investigate?cwd=<abs-path>`
59
+
60
+ @include _shared/auth.md
61
+
62
+ ## Request body
63
+
64
+ ```json
65
+ {
66
+ "question": "How does the auth middleware handle token refresh?",
67
+ "filePaths": ["/project/src/auth/"],
68
+ "contextBlockIds": []
69
+ }
70
+ ```
71
+
72
+ | Field | Type | Required | Notes |
73
+ |---|---|---|---|
74
+ | `question` | string | yes | Natural-language investigation question |
75
+ | `filePaths` | string[] | no | Anchor paths the worker starts from. Worker may grep beyond. |
76
+ | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` — enables follow-up / delta investigation |
77
+ | `agentType` | `'standard' \| 'complex'` | no | Caller override of the route default (`'complex'`) |
78
+ | `tools` | `'none' \| 'readonly'` | no | Default `'readonly'`. `'no-shell'` and `'full'` are rejected — investigation is read-only |
79
+
80
+ **Anchor narrow questions with `filePaths`:**
81
+
82
+ ❌ `{ "question": "Where is parseConfig called?" }` — searches the whole repo
83
+ ✅ `{ "question": "Where is parseConfig called?", "filePaths": ["src/"] }` — bounded
84
+
85
+ **Why:** the worker greps and reads under its cost ceiling. Without anchors, broad questions exhaust the budget before they finish.
86
+
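The field table and the anchoring advice above can be combined into a small request builder. This is an illustrative sketch only: `build_investigate_body` is a hypothetical helper, and rejecting an empty question on the client is an assumption of the sketch. The allowed `tools` and `agentType` values are taken from the table.

```python
from typing import Optional

ALLOWED_TOOLS = {"none", "readonly"}           # 'no-shell' and 'full' are rejected by the route
ALLOWED_AGENT_TYPES = {"standard", "complex"}


def build_investigate_body(question: str,
                           file_paths: Optional[list[str]] = None,
                           context_block_ids: Optional[list[str]] = None,
                           agent_type: Optional[str] = None,
                           tools: Optional[str] = None) -> dict:
    """Hypothetical helper: assemble a POST /investigate body from the documented fields."""
    if not question.strip():
        raise ValueError("question is required")
    if agent_type is not None and agent_type not in ALLOWED_AGENT_TYPES:
        raise ValueError(f"unsupported agentType: {agent_type!r}")
    if tools is not None and tools not in ALLOWED_TOOLS:
        raise ValueError(f"investigation is read-only; tools={tools!r} is rejected")
    body: dict = {"question": question}
    if file_paths:
        body["filePaths"] = list(file_paths)   # anchors that bound the worker's search
    if context_block_ids:
        body["contextBlockIds"] = list(context_block_ids)
    if agent_type is not None:
        body["agentType"] = agent_type
    if tools is not None:
        body["tools"] = tools
    return body
```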
87
+ ## Full example
88
+
89
+ ```bash
90
+ BATCH=$(curl -f --show-error -s -X POST \
91
+ -H "Authorization: Bearer $TOKEN" \
92
+ -H "Content-Type: application/json" \
93
+ -d '{"question":"How does the auth middleware handle token refresh?"}' \
94
+ "http://localhost:$PORT/investigate?cwd=/project")
95
+ BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
96
+ ```
97
+
98
+ @include _shared/polling.md
99
+
100
+ @include _shared/response-shape.md
101
+
102
+ ## Per-task report shape
103
+
104
+ Each task carries an `investigation` field on its per-task report:
105
+
106
+ ```json
107
+ {
108
+ "investigation": {
109
+ "citations": [
110
+ { "file": "src/auth/refresh.ts", "lines": "45-72", "claim": "Refresh handler reads bearer." }
111
+ ],
112
+ "confidence": { "level": "high", "rationale": "All claims directly cited." },
113
+ "diagnostics": {
114
+ "malformedCitationLines": 0,
115
+ "missingRequiredSections": [],
116
+ "invalidRequiredSections": []
117
+ }
118
+ }
119
+ }
120
+ ```
121
+
122
+ `workerStatus` is one of `done`, `done_with_concerns`, `needs_context`, `blocked`. When `done_with_concerns`, the per-task report carries `incompleteReason` (`turn_cap`, `cost_cap`, `timeout`, or `missing_sections`). When `needs_context`, the worker flagged a `[needs_context]` bullet under `## Unresolved` — re-dispatch with extra context (anchor paths, a context block, or a clarification turn).
123
+
124
+ ## Common pitfalls
125
+
126
+ ❌ **Asking for a fix instead of an answer**
127
+ > question: "Refactor the auth middleware to use JWT"
128
+
129
+ The investigator can't write — `tools: 'readonly'`. **Fix:** use `mma-delegate` for research-then-edit, or split: investigate first, then dispatch the edit.
130
+
131
+ ❌ **Treating `done_with_concerns` as failure**
132
+ The worker still produced citations and a confidence level. Read them — partial coverage with `incompleteReason: 'turn_cap'` often answers the question well enough. Re-dispatch with a tighter scope only if the citations are unusable.
133
+
134
+ ❌ **Inline-reading instead of delegating**
135
+ About to `Read` 3+ files just to answer one question? That's the wrong tradeoff — the worker reads on its cheap budget; you read its synthesis on yours.
136
+
137
+ @include _shared/error-handling.md
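The status handling described in the mma-investigate hunk above can be read as a small decision function. The sketch below is hedged: `next_step` and its return labels are hypothetical, it assumes `workerStatus` sits on the per-task report next to `investigation`, and it approximates "unusable citations" as "no citations". The source does not say what to do on `blocked`, so that branch is a placeholder.

```python
def next_step(report: dict) -> str:
    """Hypothetical triage of one per-task investigation report."""
    status = report.get("workerStatus")
    citations = (report.get("investigation") or {}).get("citations") or []
    if status == "done":
        return "read_synthesis"
    if status == "done_with_concerns":
        # Partial coverage is often enough; only narrow the scope if nothing usable came back.
        return "read_synthesis" if citations else "redispatch_tighter_scope"
    if status == "needs_context":
        return "redispatch_with_context"
    if status == "blocked":
        return "inspect_manually"  # placeholder: not specified in the source
    raise ValueError(f"unknown workerStatus: {status!r}")
```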
@@ -1,30 +1,62 @@
1
1
  ---
2
2
  name: mma-retry
3
3
  description: >-
4
- Re-run specific failed or incomplete tasks from a previous mmagent batch by
5
- index. Preserves the original task specs and only re-executes the named
6
- indices.
4
+ Use when a previous mma-* batch returned partial results (some tasks failed or
5
+ came back incomplete) and you want to re-run JUST the failed indices without
6
+ re-dispatching the whole batch
7
7
  when_to_use: >-
8
- A previous mma-delegate / mma-execute-plan batch returned partial results and
9
- you want to re-try the failed indices only. Prefer this over redispatching the
10
- whole batch or inline-retrying; it's idempotent and keeps the original
11
- batch's diagnostics intact.
12
- version: 3.3.0
8
+ A previous mma-delegate / mma-execute-plan / mma-audit / mma-review /
9
+ mma-verify / mma-debug / mma-investigate batch returned partial results AND
10
+ you want to re-try the failed indices only. Prefer this over re-dispatching
11
+ the whole batch or inline-retrying — it's idempotent and preserves the
12
+ original batch's diagnostics.
13
+ version: 3.4.0
13
14
  ---
14
15
 
15
- ## mma-retry
16
+ # mma-retry
16
17
 
17
- Re-run selected tasks from a completed or failed batch. Specify the original
18
- `batchId` and the zero-based indices of the tasks to re-run. The retry runs
19
- those tasks fresh with the same configuration as the original batch.
18
+ ## Overview
20
19
 
21
- ### Endpoint
20
+ Re-run selected tasks from a completed or failed batch. Specify the original `batchId` and the zero-based indices of the tasks to re-run. The retry runs those tasks fresh with the same configuration as the original batch and produces a new `batchId`.
21
+
22
+ **Core principle:** A batch is the unit of dispatch, but a TASK is the unit of failure. Retry at the task level so successful tasks aren't re-charged.
23
+
24
+ ## When to Use
25
+
26
+ ```dot
27
+ digraph when_to_use {
28
+ "Batch returned terminal?" [shape=diamond];
29
+ "Some tasks failed/incomplete?" [shape=diamond];
30
+ "All tasks failed?" [shape=diamond];
31
+ "mma-retry (selected indices)" [shape=box];
32
+ "Re-dispatch the whole batch" [shape=box];
33
+ "Investigate first (mma-debug)" [shape=box];
34
+
35
+ "Batch returned terminal?" -> "Some tasks failed/incomplete?";
36
+ "Some tasks failed/incomplete?" -> "All tasks failed?" [label="yes"];
37
+ "Some tasks failed/incomplete?" -> "Done — read results" [label="no"];
38
+ "All tasks failed?" -> "Investigate first (mma-debug)" [label="yes"];
39
+ "All tasks failed?" -> "mma-retry (selected indices)" [label="no — partial"];
40
+ }
41
+ ```
42
+
43
+ **Use when:**
44
+ - A previous batch's terminal envelope shows mixed `done` / `done_with_concerns` / `failed`
45
+ - 1–N tasks (but not all) need a re-run with the same config
46
+ - You want to keep the original batch's diagnostics intact for comparison
47
+
48
+ **Don't use when:**
49
+ - All tasks failed → investigate the systemic cause first (`mma-debug`); retrying won't help
50
+ - The original batch is `expired` (TTL elapsed) → re-dispatch fresh
51
+ - You want to change the prompt → re-dispatch with the new prompt; retry preserves the original
52
+
53
+ ## Endpoint
22
54
 
23
55
  `POST /retry?cwd=<abs-path>`
24
56
 
25
57
  @include _shared/auth.md
26
58
 
27
- ### Request body
59
+ ## Request body
28
60
 
29
61
  ```json
30
62
  {
@@ -35,13 +67,12 @@ those tasks fresh with the same configuration as the original batch.
35
67
 
36
68
  | Field | Type | Required | Notes |
37
69
  |---|---|---|---|
38
- | `batchId` | string (UUID) | yes | Batch ID from a previous dispatch |
39
- | `taskIndices` | number[] | yes | Zero-based indices to re-run |
70
+ | `batchId` | string (UUID) | yes | Batch ID from a previous dispatch (not yet expired) |
71
+ | `taskIndices` | number[] | yes | Zero-based indices to re-run; must be non-negative integers |
40
72
 
41
- `taskIndices` must be non-negative integers. To re-run all tasks, pass all
42
- indices from `0` to `tasks.length - 1`.
73
+ To re-run all tasks: pass `[0, 1, ..., tasks.length - 1]`. (But consider: if all failed, debug instead of retrying.)
43
74
 
44
- ### Full example
75
+ ## Full example
45
76
 
46
77
  ```bash
47
78
  # Original batch had 4 tasks; re-run tasks at index 1 and 3
@@ -50,13 +81,25 @@ BATCH=$(curl -f --show-error -s -X POST \
50
81
  -H "Content-Type: application/json" \
51
82
  -d '{"batchId":"550e8400-e29b-41d4-a716-446655440000","taskIndices":[1,3]}' \
52
83
  "http://localhost:$PORT/retry?cwd=/project")
53
- BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
84
+ BATCH_ID=$(echo "$BATCH" | jq -r '.batchId') # NEW batchId — not the original
54
85
  ```
55
86
 
56
- The retry produces a new `batchId`. Poll the new ID until complete:
57
-
58
87
  @include _shared/polling.md
59
88
 
60
89
  @include _shared/response-shape.md
61
90
 
91
+ ## Common pitfalls
92
+
93
+ ❌ **Retrying after the batch expired**
94
+ TTL elapsed → original task specs are gone. **Fix:** re-dispatch fresh; the retry endpoint returns 404.
95
+
96
+ ❌ **Retrying without addressing the root cause**
97
+ A task that failed for a systematic reason will likely fail again on retry. **Fix:** investigate (`mma-debug` or read the original `result.error.message`), then retry, or escalate `agentType` to `complex` by re-dispatching.
98
+
99
+ ❌ **Confusing the new and original `batchId`**
100
+ Retry produces a NEW batchId; polling the original returns the old terminal state. **Fix:** save the retry's `batchId` and poll that one.
101
+
102
+ ❌ **Using retry to change task config**
103
+ Retry preserves the ORIGINAL config (prompt, agentType, filePaths, reviewPolicy). **Fix:** if you want different config, re-dispatch with `mma-delegate` / `mma-execute-plan`.
104
+
62
105
  @include _shared/error-handling.md
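The decision graph and request table in the mma-retry hunk above reduce to three outcomes: nothing to re-run, everything failed, or a partial failure that warrants a retry. A minimal sketch, assuming the caller has already extracted one status string per task from the terminal envelope. `plan_retry`, the action labels, and the default `rerun` set are hypothetical; whether `done_with_concerns` should be re-run is left to the caller.

```python
def plan_retry(batch_id: str,
               statuses: list[str],
               rerun: frozenset = frozenset({"failed"})) -> dict:
    """Hypothetical helper: decide between reading results, debugging, and POST /retry."""
    indices = [i for i, status in enumerate(statuses) if status in rerun]
    if not indices:
        return {"action": "read_results"}
    if len(indices) == len(statuses):
        # A systemic failure: retrying every task with the same config is unlikely to help.
        return {"action": "debug_first"}
    # Zero-based indices; the retry keeps the original config and yields a NEW batchId.
    return {"action": "retry", "body": {"batchId": batch_id, "taskIndices": indices}}
```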
@@ -1,29 +1,46 @@
1
1
  ---
2
2
  name: mma-review
3
3
  description: >-
4
- Review code for quality, security, performance, or correctness via the local
5
- mmagent HTTP service. Sub-agents run in parallel per file, independent
6
- context.
4
+ Use when source code needs a quality / security / correctness pass: pre-merge
5
+ review, post-implementation sanity check, or focused look at a small file set
6
+ — and the review can run in parallel per file
7
7
  when_to_use: >-
8
- The user asks for a code review, pre-merge check, or quality pass over one or
9
- more files OR a methodology skill (superpowers:requesting-code-review,
10
- /review, /security-review) points at a review task. Delegate the reviewer pass
11
- to mmagent workers so your main context stays free to decide what to merge.
12
- version: 3.3.0
8
+ User asks for a code review or pre-merge check, OR a methodology skill
9
+ (superpowers:requesting-code-review, /review, /security-review) points at one,
10
+ AND mmagent is running. Delegate so each file reviews on its own worker; the
11
+ main agent only decides what to merge. Reviews SOURCE CODE only; use mma-audit
12
+ for prose specs / configs.
13
+ version: 3.4.0
13
14
  ---
14
15
 
15
- ## mma-review
16
+ # mma-review
16
17
 
17
- Send code or files to sub-agents for structured review. Each file is reviewed
18
- independently in parallel; results are index-aligned with `filePaths`.
18
+ ## Overview
19
19
 
20
- ### Endpoint
20
+ Send code files to workers for structured review. Each file is reviewed independently in parallel; results are index-aligned with `filePaths`.
21
+
22
+ **Core principle:** Reviewer is a different model from the implementer — different training, different blind spots. Cross-model review catches what self-review misses.
23
+
24
+ ## When to Use
25
+
26
+ **Use when:**
27
+ - 1+ source code files just changed (post-implementation review)
28
+ - Pre-merge sanity check on a focused diff
29
+ - Security-sensitive code path (`focus: ["security"]`)
30
+ - A specialized review pass (e.g. `focus: ["performance"]` on hot-path code)
31
+
32
+ **Don't use when:**
33
+ - The thing being reviewed is prose / spec / config → `mma-audit` (better-suited prompt template)
34
+ - You want to know whether a complete branch is mergeable → run `/ultrareview` (multi-model branch review) instead
35
+ - The diff is one-line / one-character → reading inline is faster than dispatch
36
+
37
+ ## Endpoint
21
38
 
22
39
  `POST /review?cwd=<abs-path>`
23
40
 
24
41
  @include _shared/auth.md
25
42
 
26
- ### Request body
43
+ ## Request body
27
44
 
28
45
  ```json
29
46
  {
@@ -36,14 +53,14 @@ independently in parallel; results are index-aligned with `filePaths`.
36
53
 
37
54
  | Field | Type | Required | Notes |
38
55
  |---|---|---|---|
39
- | `code` | string | no | Inline code to review |
40
- | `focus` | string[] | no | Any of `security`, `performance`, `correctness`, `style` |
41
- | `filePaths` | string[] | no | Files to review (parallel) |
42
- | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |
56
+ | `code` | string | no | Inline code snippet to review |
57
+ | `focus` | string[] | no | Any of `security`, `performance`, `correctness`, `style`. Omit for general review. |
58
+ | `filePaths` | string[] | no | Files to review (one worker per file, parallel) |
59
+ | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` — useful for design docs the reviewer should validate against |
43
60
 
44
61
  Either `code` or `filePaths` (or both) must be provided.
45
62
 
46
- ### Full example
63
+ ## Full example
47
64
 
48
65
  ```bash
49
66
  BATCH=$(curl -f --show-error -s -X POST \
@@ -54,10 +71,22 @@ BATCH=$(curl -f --show-error -s -X POST \
54
71
  BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
55
72
  ```
56
73
 
57
- Then poll until complete:
58
-
59
74
  @include _shared/polling.md
60
75
 
61
76
  @include _shared/response-shape.md
62
77
 
78
+ ## Common pitfalls
79
+
80
+ ❌ **Reviewing a plan/spec markdown with `mma-review`**
81
+ The reviewer is tuned for code constructs (types, call sites, test coverage). On prose it produces vague nits. **Fix:** use `mma-audit` for docs/specs, `mma-review` for source.
82
+
83
+ ❌ **Omitting `focus` and getting watery findings**
84
+ A general review surfaces low-signal style nits alongside real bugs. **Fix:** specify `focus: ["correctness"]` or `["security"]` to bias the reviewer toward the dimension you care about.
85
+
86
+ ❌ **Inlining the spec the reviewer should validate against**
87
+ If the reviewer needs to check the diff against a design doc, register the doc once via `mma-context-blocks` and pass the `contextBlockIds`. Inlining N times wastes tokens.
88
+
89
+ ❌ **Skipping review because "I already read it"**
90
+ Self-review and cross-model review are not the same thing. The whole reason to delegate is the different blind spots. Read the findings; merge what you agree with.
91
+
63
92
  @include _shared/error-handling.md
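The request-body rules in the mma-review hunk above (either `code` or `filePaths` is required, and `focus` is drawn from four values) can be checked before dispatch. This is a sketch under assumptions: `build_review_body` is a hypothetical client-side helper, and how the server itself reports an invalid `focus` value is not shown in this diff.

```python
from typing import Optional

ALLOWED_FOCUS = {"security", "performance", "correctness", "style"}


def build_review_body(code: Optional[str] = None,
                      file_paths: Optional[list[str]] = None,
                      focus: Optional[list[str]] = None,
                      context_block_ids: Optional[list[str]] = None) -> dict:
    """Hypothetical helper: assemble a POST /review body from the documented fields."""
    if not code and not file_paths:
        raise ValueError("either code or filePaths (or both) must be provided")
    unknown = sorted(set(focus or []) - ALLOWED_FOCUS)
    if unknown:
        raise ValueError(f"unsupported focus values: {unknown}")
    body: dict = {}
    if code:
        body["code"] = code
    if file_paths:
        body["filePaths"] = list(file_paths)   # one worker per file, results index-aligned
    if focus:
        body["focus"] = list(focus)            # omit for a general review
    if context_block_ids:
        body["contextBlockIds"] = list(context_block_ids)
    return body
```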
@@ -1,28 +1,45 @@
1
1
  ---
2
2
  name: mma-verify
3
3
  description: >-
4
- Verify work against a checklist via the local mmagent HTTP service. Sub-agents
5
- check each item independently.
4
+ Use when work is "complete" and you need to confirm acceptance criteria are
5
+ actually met before claiming so to the user — each checklist item verified
6
+ independently against the work
6
7
  when_to_use: >-
7
8
  The user (or a methodology skill like
8
- superpowers:verification-before-completion) wants acceptance-criteria checked
9
- against implemented work. Delegate the evidence-gathering to mmagent workers;
10
- each checklist item is verified independently and in parallel.
11
- version: 3.3.0
9
+ superpowers:verification-before-completion) needs acceptance-criteria checked
10
+ against implemented work BEFORE claiming success. Delegate so each checklist
11
+ item gets independent evidence-gathering on a worker. Use this BEFORE saying
12
+ "done" — never after.
13
+ version: 3.4.0
12
14
  ---
13
15
 
14
- ## mma-verify
16
+ # mma-verify
15
17
 
16
- Submit work product and a checklist to sub-agents for independent verification.
17
- Each checklist item is verified in parallel; results are index-aligned.
18
+ ## Overview
18
19
 
19
- ### Endpoint
20
+ Submit work product and a checklist to workers for independent verification. Each checklist item is verified in parallel; results are index-aligned with the input.
21
+
22
+ **Core principle:** Self-verification ("I read the files; they look correct") has no external validation. Workers check independently and return evidence (or absence of it) per item.
23
+
24
+ ## When to Use
25
+
26
+ **Use when:**
27
+ - You're about to claim a task is "done" and need evidence per acceptance item
28
+ - A methodology skill (superpowers:verification-before-completion) routed here
29
+ - The user gave a checklist and asked you to confirm each item
30
+
31
+ **Don't use when:**
32
+ - The "checklist" is one item — read inline, faster than dispatch
33
+ - You don't have explicit acceptance criteria — write them first, then dispatch
34
+ - The work hasn't been done yet — verification is a post-condition, not a pre-condition
35
+
36
+ ## Endpoint
20
37
 
21
38
  `POST /verify?cwd=<abs-path>`
22
39
 
23
40
  @include _shared/auth.md
24
41
 
25
- ### Request body
42
+ ## Request body
26
43
 
27
44
  ```json
28
45
  {
@@ -39,12 +56,12 @@ Each checklist item is verified in parallel; results are index-aligned.
39
56
 
40
57
  | Field | Type | Required | Notes |
41
58
  |---|---|---|---|
42
- | `work` | string | no | Inline work product description |
43
- | `checklist` | string[] | yes | At least one item |
44
- | `filePaths` | string[] | no | Files to verify against (parallel) |
45
- | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |
59
+ | `work` | string | no | Inline work-product description (e.g. summary of what changed) |
60
+ | `checklist` | string[] | yes | At least one item — each item verified by its own worker |
61
+ | `filePaths` | string[] | no | Files to verify against (workers can read them) |
62
+ | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` (e.g. the spec the work was supposed to satisfy) |
46
63
 
47
- ### Full example
64
+ ## Full example
48
65
 
49
66
  ```bash
50
67
  BATCH=$(curl -f --show-error -s -X POST \
@@ -55,10 +72,24 @@ BATCH=$(curl -f --show-error -s -X POST \
55
72
  BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
56
73
  ```
57
74
 
58
- Then poll until complete:
59
-
60
75
  @include _shared/polling.md
61
76
 
62
77
  @include _shared/response-shape.md
63
78
 
79
+ ## Common pitfalls
80
+
81
+ ❌ **Vague checklist items**
82
+ > "Code is good"
83
+
84
+ The worker can't gather evidence for "good". **Fix:** specific, falsifiable criteria — `"Function parseConfig has at least 3 unit tests covering: missing field, malformed JSON, empty file"`.
85
+
86
+ ❌ **Verifying without `filePaths`**
87
+ Worker has nothing to read; verdict is speculative. **Fix:** always pass the file(s) the work landed in.
88
+
89
+ ❌ **Treating verify as the implementation step**
90
+ Verify CHECKS work; it doesn't DO work. If a checklist item fails, dispatch `mma-delegate` to fix it, then re-verify.
91
+
92
+ ❌ **Skipping verify because "tests pass"**
93
+ Tests verify the test cases that exist. Verify checks the acceptance criteria — which often include things tests don't (docs updated, no debug-print left, etc.).
94
+
64
95
  @include _shared/error-handling.md
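Because verify results are index-aligned with the checklist, unmet items can be paired back to their criteria and handed to `mma-delegate` for a fix before re-verifying. A rough sketch: `unmet_items` is a hypothetical helper, and the per-item result shape is not shown in this diff, so the pass/fail check is supplied by the caller through `passed`.

```python
from typing import Callable


def unmet_items(checklist: list[str],
                results: list[dict],
                passed: Callable[[dict], bool]) -> list[tuple[int, str]]:
    """Return (index, checklist item) pairs whose index-aligned result did not pass."""
    if len(checklist) != len(results):
        raise ValueError("results are expected to be index-aligned with the checklist")
    return [
        (index, item)
        for index, (item, result) in enumerate(zip(checklist, results))
        if not passed(result)
    ]
```

Each returned item can become the prompt of a follow-up fix task, after which the same checklist is verified again.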