@zhixuan92/multi-model-agent 4.0.6 → 4.2.0
- package/README.md +1 -1
- package/dist/skills/mma-audit/SKILL.md +25 -13
- package/dist/skills/mma-context-blocks/SKILL.md +1 -1
- package/dist/skills/mma-debug/SKILL.md +1 -1
- package/dist/skills/mma-delegate/SKILL.md +1 -1
- package/dist/skills/mma-execute-plan/SKILL.md +1 -1
- package/dist/skills/mma-explore/SKILL.md +1 -1
- package/dist/skills/mma-investigate/SKILL.md +1 -1
- package/dist/skills/mma-research/SKILL.md +1 -1
- package/dist/skills/mma-retry/SKILL.md +1 -1
- package/dist/skills/mma-review/SKILL.md +12 -3
- package/dist/skills/mma-verify/SKILL.md +1 -1
- package/dist/skills/multi-model-agent/SKILL.md +1 -1
- package/package.json +2 -2
package/README.md CHANGED
@@ -121,7 +121,7 @@ Skills are the surface your AI client sees. `mmagent sync-skills` writes them to
 | `mma-research` | `POST /research` | External multi-source research with citations — arxiv, semantic_scholar, github_search, rss, brave-with-`site:`-filters — for a focused question. |
 | `mma-debug` | `POST /debug` | A test fails, a build breaks, or behavior is unexpected — delegate the reproduce/trace, keep the hypothesis on the main agent. |
 | `mma-review` | `POST /review` | Source-code review (pre-merge, post-implementation, security-focused). One worker per file, in parallel. |
-| `mma-audit` | `POST /audit` | Audit a
+| `mma-audit` | `POST /audit` | Audit a spec / plan / design doc / recommendation doc for executability blockers (contradictions, ambiguity, recommendation-coherence gaps). Default is the comprehensive sweep; `security` and `performance` are narrow opt-in lenses. |
 | `mma-verify` | `POST /verify` | Check acceptance criteria against finished work *before* claiming done. One worker per checklist item. |

 ### Plumbing skills
@@ -1,29 +1,31 @@
 ---
 name: mma-audit
 description: >-
-Use when the user asks to audit a
-
-
+Use when the user asks to audit a spec, plan, design doc, recommendation doc,
+or config — the audit checks whether a literal-following worker could execute
+the artifact without ambiguity, contradiction, or missing context. Default is
+the comprehensive sweep; narrow lenses (security/performance) exist for cases
+that want only one dimension.
 when_to_use: >-
-User asks for a doc/spec/config audit OR a methodology skill
+User asks for a doc/spec/plan/config audit OR a methodology skill
 (superpowers:dispatching-parallel-agents, /security-review) points at one AND
 mmagent is running. Audit on PROSE/SPEC docs — use mma-review for source code.
-version: 4.0
+version: 4.2.0
 ---

 # mma-audit

 ## Overview

-Send a
+Send a spec, plan, design doc, or recommendation doc to a worker for structured auditing. The audit's purpose is to make the artifact **executable by a low-judgment worker** — meaning a sub-agent that follows instructions literally and cannot disambiguate. Findings target executability blockers: ambiguity, internal contradictions, unspecified branches, missing verification, overloaded terms, out-of-order steps.

 **Core principle:** One worker per file = no cross-file context pollution. The aggregator (you) decides what to do with the findings.

 ## When to Use

 **Use when:**
-- A spec
-- The
+- A spec, plan, design doc, recommendation doc, or post-mortem needs a critical read
+- The artifact will subsequently be executed by a worker (or any reader who follows it literally)
 - 2+ files would benefit from parallel audit

 **Don't use when:**
@@ -42,7 +44,7 @@ Send a document or set of files to workers for structured auditing. Each file is
 ```json
 {
 "document": "inline content to audit (optional if filePaths given)",
-"auditType": "
+"auditType": "default",
 "filePaths": ["/project/docs/spec.md"],
 "contextBlockIds": []
 }
@@ -51,7 +53,7 @@ Send a document or set of files to workers for structured auditing. Each file is
 | Field | Type | Required | Notes |
 |---|---|---|---|
 | `document` | string | no | Inline document content |
-| `auditType` |
+| `auditType` | `'default' \| 'security' \| 'performance'` | no (defaults to `'default'`) | See "Picking auditType" below — `default` is right for ~90% of audits |
 | `filePaths` | string[] | no | Files to audit (one worker per file, parallel) |
 | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |

@@ -59,6 +61,16 @@ Either `document` or `filePaths` (or both) must be provided.

 > Worker tier for `mma-audit` is hardcoded to `complex` and is not caller-configurable. Sending `agentType` is rejected with HTTP 400.

+### Picking auditType
+
+| Value | When to use |
+|---|---|
+| `default` (or omit the field) | **Right answer for ~90% of audits.** Spec, plan, design doc, recommendation doc, post-mortem, audit, brief, README — any prose artifact. Sweeps the full executability + correctness + clarity taxonomy with security and performance lenses applied. |
+| `security` | Narrow opt-in. Use ONLY when you specifically want security findings and not general audit findings (e.g., a threat model where stylistic noise is unwanted). |
+| `performance` | Narrow opt-in. Use ONLY when you specifically want performance findings (e.g., a scaling design where you want hot-path / latency / unbounded-loop findings only). |
+
+The legacy values `correctness`, `style`, and `general` no longer exist — they were a false dichotomy. Sending any of them returns `400 invalid_request` with a hint to use `default`.
+
 ## Full example

 ```bash
@@ -67,7 +79,7 @@ BATCH=$(curl -f --show-error -s -X POST \
 -H "X-MMA-Client: $MMA_CLIENT" \
 -H "Authorization: Bearer $TOKEN" \
 -H "Content-Type: application/json" \
--d '{"auditType":"
+-d '{"auditType":"default","filePaths":["/project/docs/api-spec.md"]}' \
 "http://localhost:$PORT/audit?cwd=/project")
 BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
 ```
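The request rules stated in the `mma-audit` hunks above (at least one of `document` or `filePaths`, no `agentType`, three accepted `auditType` values with `default` assumed when the field is omitted, legacy values rejected) can be sketched as a client-side pre-flight check. This is a minimal TypeScript sketch, not the package's own validator: the function name `checkAuditRequest` and the wording of the returned messages are hypothetical, and only the field names and values come from the diff.

```typescript
// Hypothetical client-side pre-flight check for POST /audit bodies.
// Field names and accepted values come from the SKILL.md diff; the function
// name and the message strings are assumptions made for illustration.
const AUDIT_TYPES: string[] = ["default", "security", "performance"];
const LEGACY_AUDIT_TYPES: string[] = ["correctness", "style", "general"];

interface AuditRequest {
  document?: string;
  auditType?: string;
  filePaths?: string[];
  contextBlockIds?: string[];
  [extra: string]: unknown; // lets the check see stray fields such as agentType
}

// Returns a list of problems; an empty list means the body passes these checks.
function checkAuditRequest(body: AuditRequest): string[] {
  const problems: string[] = [];
  const hasDocument = typeof body.document === "string" && body.document.length > 0;
  const hasFiles = Array.isArray(body.filePaths) && body.filePaths.length > 0;
  if (!hasDocument && !hasFiles) {
    problems.push("provide `document`, `filePaths`, or both");
  }
  if ("agentType" in body) {
    problems.push("`agentType` is not caller-configurable for mma-audit");
  }
  // An omitted auditType is treated as `default`.
  const auditType = body.auditType === undefined ? "default" : body.auditType;
  if (LEGACY_AUDIT_TYPES.indexOf(auditType) !== -1) {
    problems.push(`legacy auditType "${auditType}" was removed; use "default"`);
  } else if (AUDIT_TYPES.indexOf(auditType) === -1) {
    problems.push(`unknown auditType "${auditType}"`);
  }
  return problems;
}
```

Running a check like this before dispatch would surface the same mistakes the server is described as rejecting, without spending a round trip on them.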
@@ -130,8 +142,8 @@ The auditor lacks codebase context (no type info, no call-site lookup, no test a
 ❌ **Single huge `document` string instead of `filePaths`**
 Inline docs lose the file boundary, so the per-file parallel split degenerates to one worker. **Fix:** save to disk first, pass `filePaths`.

-❌ **
-
+❌ **Sending legacy auditType values (`correctness`, `style`, `general`)**
+These were removed — they were a false dichotomy that biased workers toward stylistic proofreading on prose artifacts. **Fix:** use `default` (or omit the field). Use `security` or `performance` only when you specifically want a narrow lens.

 ❌ **Re-auditing the same files round after round without delta context**
 Round 2 worker has no idea what round 1 found. **Fix:** register the round 1 findings as a context block (`mma-context-blocks`) and pass `contextBlockIds` to round 2.
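The delta-context fix in the last anti-pattern above can be sketched as follows. This is a minimal TypeScript sketch under stated assumptions: `buildFollowUpAudit` is a hypothetical helper, and the block ID is assumed to have been returned earlier when the round 1 findings were registered through `mma-context-blocks`. That registration call does not appear in this diff, so it is left out rather than guessed.

```typescript
// Hypothetical helper: build the round 2 audit body so the worker can see
// what round 1 found. Assumes `findingsBlockId` was obtained earlier by
// registering the round 1 findings via mma-context-blocks (not shown here).
interface AuditBody {
  auditType: "default" | "security" | "performance";
  filePaths: string[];
  contextBlockIds: string[];
}

function buildFollowUpAudit(previous: AuditBody, findingsBlockId: string): AuditBody {
  // Keep the same files and lens; add the findings block only if it is new.
  const alreadyThere = previous.contextBlockIds.indexOf(findingsBlockId) !== -1;
  const ids = alreadyThere
    ? previous.contextBlockIds.slice()
    : previous.contextBlockIds.concat(findingsBlockId);
  return { auditType: previous.auditType, filePaths: previous.filePaths.slice(), contextBlockIds: ids };
}

const round1: AuditBody = {
  auditType: "default",
  filePaths: ["/project/docs/api-spec.md"],
  contextBlockIds: [],
};
// "round-1-findings" is a placeholder, not a real block ID format.
const round2 = buildFollowUpAudit(round1, "round-1-findings");
console.log(JSON.stringify(round2));
```

The helper copies rather than mutates the round 1 body, so the original request stays available for comparison or retry.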
@@ -12,7 +12,7 @@ when_to_use: >-
 Register once here, then pass the ID via `contextBlockIds` on mma-delegate /
 mma-execute-plan / mma-audit / mma-review / mma-verify / mma-debug /
 mma-investigate. Cheaper and faster than inlining the same content N times.
-version: 4.0
+version: 4.2.0
 ---

 # mma-context-blocks
@@ -10,7 +10,7 @@ when_to_use: >-
 read files, reproduce, trace — OR a methodology skill
 (superpowers:systematic-debugging) points at the investigation step. Delegate
 the read/reproduce/trace; the main agent stays on the hypothesis and the fix.
-version: 4.0
+version: 4.2.0
 ---

 # mma-debug
@@ -11,7 +11,7 @@ when_to_use: >-
 and keep main context free. If a plan file exists → use mma-execute-plan. If
 the task is audit / review / verify / debug / investigate → use the matching
 specialized skill.
-version: 4.0
+version: 4.2.0
 ---

 # mma-delegate
@@ -10,7 +10,7 @@ when_to_use: >-
 superpowers:subagent-driven-development / superpowers:executing-plans —
 workers are cheaper and don't pollute main context. Task descriptors must
 match plan headings verbatim.
-version: 4.0
+version: 4.2.0
 ---

 # mma-execute-plan
@@ -12,7 +12,7 @@ when_to_use: >-
 out mma-investigate (internal) + mma-research (external) in parallel and
 synthesise the results yourself. DO NOT use for convergent single-answer
 questions — those are mma-investigate.
-version: 4.0
+version: 4.2.0
 ---

 # mma-explore
@@ -12,7 +12,7 @@ when_to_use: >-
 git-history queries. OR you are about to read 3+ files / run any grep in main
 context — that's the inline-labor-leakage anti-pattern (AP2); delegate to this
 skill instead.
-version: 4.0
+version: 4.2.0
 ---

 # mma-investigate
@@ -10,7 +10,7 @@ when_to_use: >-
 others do, what published methods exist) AND mmagent is running. Delegate the
 multi-source web/adapter research to a worker so the main context stays on
 judgment. NOT for codebase questions — those are mma-investigate.
-version: 4.0
+version: 4.2.0
 ---

 # mma-research
@@ -10,7 +10,7 @@ when_to_use: >-
 you want to re-try the failed indices only. Prefer this over re-dispatching
 the whole batch or inline-retrying — it's idempotent and preserves the
 original batch's diagnostics.
-version: 4.0
+version: 4.2.0
 ---

 # mma-retry
@@ -10,16 +10,18 @@ when_to_use: >-
 AND mmagent is running. Delegate so each file reviews on its own worker; the
 main agent only decides what to merge. Review on SOURCE CODE — use mma-audit
 for prose specs / configs.
-version: 4.0
+version: 4.2.0
 ---

 # mma-review

 ## Overview

-Send code files to
+mma-review is the **pre-merge gate**. Send code files (or a diff) to a worker for structured review against an executability bar: would a maintainer who reads only the verdict and the diff understand which changes are required, why each is required, and where each lives — well enough to apply the fix and re-merge without re-investigating?

-
+Each file is reviewed independently in parallel; results are index-aligned with `filePaths`.
+
+**Core principle:** Reviewer is a different model from the implementer — different training, different blind spots. Cross-model review catches what self-review misses. The reviewer runs against a 10-category failure-mode taxonomy (test gap, cross-file ripple, missing edge case, race, resource leak, backward-compat break, security/performance regression, implicit-contract assumption, pre-existing-bug-vs-new-regression separation) and weighs every change through the security, performance, and correctness lenses regardless of `focus`.

 ## When to Use

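Because the Overview above states that results are index-aligned with `filePaths`, a caller can pair each result with its file by position. This is a minimal TypeScript sketch: `pairByIndex` is a hypothetical helper, the result type is left generic because the response shape does not appear in this diff, and the file paths and result strings in the usage lines are placeholders.

```typescript
// Hypothetical helper: pair each review result with the file it belongs to,
// relying only on the index alignment stated in the SKILL.md text.
function pairByIndex<R>(filePaths: string[], results: R[]): Array<{ file: string; result: R }> {
  if (filePaths.length !== results.length) {
    // A length mismatch means the alignment assumption does not hold.
    throw new Error(`expected ${filePaths.length} results, got ${results.length}`);
  }
  return filePaths.map((file, i) => ({ file, result: results[i] }));
}

// Placeholder inputs for illustration only.
const paired = pairByIndex(["src/a.ts", "src/b.ts"], ["no findings", "changes required"]);
console.log(paired[1].file, "->", paired[1].result);
```

Failing loudly on a length mismatch is a design choice of this sketch, since pairing by position silently mislabels findings if the two lists ever drift apart.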
@@ -34,6 +36,13 @@ Send code files to workers for structured review. Each file is reviewed independ
 - You want to know whether a complete branch is mergeable → run `/ultrareview` (multi-model branch review) instead
 - The diff is one-line / one-character → reading inline is faster than dispatch

+## How to invoke for cross-file ripple detection
+
+The cross-file ripple pass (changed-symbol → broken caller) only fires when the worker can identify what changed. Two patterns:
+
+- **Diff-as-input (preferred for cross-file ripple)**: pass the diff via the `code` field, plus the named files via `filePaths`. The worker treats the diff as the change-set and greps for callers of changed public symbols.
+- **Files-only (static review)**: pass only `filePaths`. The worker reviews the files in their current state without a change-set, so cross-file ripple is degenerate. Test gap, missing edge case, race, leak, and security/performance findings still fire.
+
 ## Endpoint

 `POST /review?cwd=<abs-path>`
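The two invocation patterns described in the hunk above differ only in whether a diff travels in the `code` field. This is a minimal TypeScript sketch, assuming a hypothetical helper `buildReviewBody`: only the `code` and `filePaths` field names come from the diff, any other fields the endpoint accepts are not shown, and the sample path and diff text are placeholders.

```typescript
// Hypothetical helper contrasting the two patterns described in the diff.
interface ReviewBody {
  filePaths: string[];
  code?: string; // the diff, when cross-file ripple detection is wanted
}

function buildReviewBody(filePaths: string[], diff?: string): ReviewBody {
  // Diff-as-input: include the change-set so the worker can tell what changed.
  if (diff !== undefined && diff.trim().length > 0) {
    return { filePaths: filePaths, code: diff };
  }
  // Files-only: omit `code`; the files are reviewed in their current state.
  return { filePaths: filePaths };
}

// Placeholder inputs for illustration only.
const staticReview = buildReviewBody(["/project/src/api.ts"]);
const rippleReview = buildReviewBody(["/project/src/api.ts"], "--- a/src/api.ts\n+++ b/src/api.ts\n");
console.log(JSON.stringify(staticReview), JSON.stringify(rippleReview));
```

Treating an empty or whitespace-only diff as files-only is a choice made in this sketch, so that an empty `code` field is never sent by accident.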
@@ -10,7 +10,7 @@ when_to_use: >-
 against implemented work BEFORE claiming success. Delegate so each checklist
 item gets independent evidence-gathering on a worker. Use this BEFORE saying
 "done" — never after.
-version: 4.0
+version: 4.2.0
 ---

 # mma-verify
@@ -11,7 +11,7 @@ when_to_use: >-
 tasks — AND mmagent is running. Read this once, pick the matching mma-* skill,
 and delegate there. Applies equally whether the user invoked a superpowers
 methodology skill or asked directly.
-version: 4.0
+version: 4.2.0
 ---

 # multi-model-agent (router)
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "@zhixuan92/multi-model-agent",
-"version": "4.0
+"version": "4.2.0",
 "type": "module",
 "license": "MIT",
 "description": "Standalone HTTP server for multi-model-agent. Routes tool-invocation work to Claude, Codex, or OpenAI-compatible sub-agents with async-polling REST dispatch and installable skills for Claude Code, Gemini CLI, Codex CLI, and Cursor.",
@@ -53,7 +53,7 @@
 },
 "dependencies": {
 "@asteasolutions/zod-to-openapi": "^8.5.0",
-"@zhixuan92/multi-model-agent-core": "^4.0
+"@zhixuan92/multi-model-agent-core": "^4.2.0",
 "gray-matter": "^4.0.3",
 "minimist": "^1.2.8",
 "proper-lockfile": "^4.1.2",