@zhixuan92/multi-model-agent 4.0.6 → 4.1.0

package/README.md CHANGED
@@ -121,7 +121,7 @@ Skills are the surface your AI client sees. `mmagent sync-skills` writes them to
  | `mma-research` | `POST /research` | External multi-source research with citations — arxiv, semantic_scholar, github_search, rss, brave-with-`site:`-filters — for a focused question. |
  | `mma-debug` | `POST /debug` | A test fails, a build breaks, or behavior is unexpected — delegate the reproduce/trace, keep the hypothesis on the main agent. |
  | `mma-review` | `POST /review` | Source-code review (pre-merge, post-implementation, security-focused). One worker per file, in parallel. |
- | `mma-audit` | `POST /audit` | Audit a prose document (spec, config, PR description) for correctness, security, or style. |
+ | `mma-audit` | `POST /audit` | Audit a spec / plan / design doc / recommendation doc for executability blockers (contradictions, ambiguity, recommendation-coherence gaps). Default is the comprehensive sweep; `security` and `performance` are narrow opt-in lenses. |
  | `mma-verify` | `POST /verify` | Check acceptance criteria against finished work *before* claiming done. One worker per checklist item. |
  
  ### Plumbing skills
@@ -1,29 +1,31 @@
  ---
  name: mma-audit
  description: >-
- Use when the user asks to audit a document, spec, config, or PR description
- for security, correctness, performance, or style issues and 2+ files need
- independent audit passes
+ Use when the user asks to audit a spec, plan, design doc, recommendation doc,
+ or config; the audit checks whether a literal-following worker could execute
+ the artifact without ambiguity, contradiction, or missing context. Default is
+ the comprehensive sweep; narrow lenses (security/performance) exist for cases
+ that want only one dimension.
  when_to_use: >-
- User asks for a doc/spec/config audit OR a methodology skill
+ User asks for a doc/spec/plan/config audit OR a methodology skill
  (superpowers:dispatching-parallel-agents, /security-review) points at one AND
  mmagent is running. Audit on PROSE/SPEC docs — use mma-review for source code.
- version: 4.0.6
+ version: 4.1.0
  ---
  
  # mma-audit
  
  ## Overview
  
- Send a document or set of files to workers for structured auditing. Each file is audited independently in parallel; per-file results are indexed by path in the terminal envelope.
+ Send a spec, plan, design doc, or recommendation doc to a worker for structured auditing. The audit's purpose is to make the artifact **executable by a low-judgment worker**, meaning a sub-agent that follows instructions literally and cannot disambiguate. Findings target executability blockers: ambiguity, internal contradictions, unspecified branches, missing verification, overloaded terms, out-of-order steps.
  
  **Core principle:** One worker per file = no cross-file context pollution. The aggregator (you) decides what to do with the findings.
  
  ## When to Use
  
  **Use when:**
- - A spec / design doc / API contract / config file needs a critical read
- - The audit type is `security`, `performance`, `correctness`, or `style` (or a combination)
+ - A spec, plan, design doc, recommendation doc, or post-mortem needs a critical read
+ - The artifact will subsequently be executed by a worker (or any reader who follows it literally)
  - 2+ files would benefit from parallel audit
  
  **Don't use when:**
@@ -42,7 +44,7 @@ Send a document or set of files to workers for structured auditing. Each file is
  ```json
  {
  "document": "inline content to audit (optional if filePaths given)",
- "auditType": "correctness",
+ "auditType": "default",
  "filePaths": ["/project/docs/spec.md"],
  "contextBlockIds": []
  }
@@ -51,7 +53,7 @@ Send a document or set of files to workers for structured auditing. Each file is
  | Field | Type | Required | Notes |
  |---|---|---|---|
  | `document` | string | no | Inline document content |
- | `auditType` | string \| string[] | yes | `security`, `performance`, `correctness`, `style`, or `general`; or an array of the first four |
+ | `auditType` | `'default' \| 'security' \| 'performance'` | no (defaults to `'default'`) | See "Picking auditType" below — `default` is right for ~90% of audits |
  | `filePaths` | string[] | no | Files to audit (one worker per file, parallel) |
  | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |
  
@@ -59,6 +61,16 @@ Either `document` or `filePaths` (or both) must be provided.
  
  > Worker tier for `mma-audit` is hardcoded to `complex` and is not caller-configurable. Sending `agentType` is rejected with HTTP 400.
  
+ ### Picking auditType
+
+ | Value | When to use |
+ |---|---|
+ | `default` (or omit the field) | **Right answer for ~90% of audits.** Spec, plan, design doc, recommendation doc, post-mortem, audit, brief, README — any prose artifact. Sweeps the full executability + correctness + clarity taxonomy with security and performance lenses applied. |
+ | `security` | Narrow opt-in. Use ONLY when you specifically want security findings and not general audit findings (e.g., a threat model where stylistic noise is unwanted). |
+ | `performance` | Narrow opt-in. Use ONLY when you specifically want performance findings (e.g., a scaling design where you want hot-path / latency / unbounded-loop findings only). |
+
+ The legacy values `correctness`, `style`, and `general` no longer exist — they were a false dichotomy. Sending any of them returns `400 invalid_request` with a hint to use `default`.
+
  ## Full example
  
  ```bash
@@ -67,7 +79,7 @@ BATCH=$(curl -f --show-error -s -X POST \
  -H "X-MMA-Client: $MMA_CLIENT" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
- -d '{"auditType":"correctness","filePaths":["/project/docs/api-spec.md"]}' \
+ -d '{"auditType":"default","filePaths":["/project/docs/api-spec.md"]}' \
  "http://localhost:$PORT/audit?cwd=/project")
  BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
  ```
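
A client-side guard for the `auditType` change in the hunks above might look like the sketch below. It is an illustration only and not part of mmagent: the function name `normalize_audit_type` is hypothetical, and treating an empty argument the same as an omitted field is an assumption. The only behavior taken from the diff is that 4.1.0 accepts `default`, `security`, and `performance`, treats an omitted field as `default`, and rejects `correctness`, `style`, and `general`.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not shipped with mmagent.
# Prints the auditType to send, or fails before any request is made.
normalize_audit_type() {
  # Assumption: an empty or missing argument means "field omitted".
  local requested="${1:-default}"
  case "$requested" in
    default|security|performance)
      # The three values the 4.1.0 docs list as accepted.
      printf '%s\n' "$requested"
      ;;
    correctness|style|general)
      # Legacy values: the docs say the server answers 400 invalid_request.
      echo "auditType '$requested' was removed in 4.1.0; use 'default'" >&2
      return 1
      ;;
    *)
      echo "unknown auditType '$requested'" >&2
      return 1
      ;;
  esac
}
```

A caller could run `AUDIT_TYPE=$(normalize_audit_type "$1") || exit 1` before building the `-d` payload, which avoids a round trip that would end in a 400.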
@@ -130,8 +142,8 @@ The auditor lacks codebase context (no type info, no call-site lookup, no test a
  ❌ **Single huge `document` string instead of `filePaths`**
  Inline docs lose the file boundary, so the per-file parallel split degenerates to one worker. **Fix:** save to disk first, pass `filePaths`.
  
- ❌ **Asking for `auditType: "general"` when you mean something specific**
- `"general"` is a catch-all that produces watery findings. **Fix:** pick the dimension you actually care about (`"correctness"` for spec gaps, `"security"` for threat models, etc.).
+ ❌ **Sending legacy auditType values (`correctness`, `style`, `general`)**
+ These were removed — they were a false dichotomy that biased workers toward stylistic proofreading on prose artifacts. **Fix:** use `default` (or omit the field). Use `security` or `performance` only when you specifically want a narrow lens.
  
  ❌ **Re-auditing the same files round after round without delta context**
  Round 2 worker has no idea what round 1 found. **Fix:** register the round 1 findings as a context block (`mma-context-blocks`) and pass `contextBlockIds` to round 2.
@@ -12,7 +12,7 @@ when_to_use: >-
  Register once here, then pass the ID via `contextBlockIds` on mma-delegate /
  mma-execute-plan / mma-audit / mma-review / mma-verify / mma-debug /
  mma-investigate. Cheaper and faster than inlining the same content N times.
- version: 4.0.6
+ version: 4.1.0
  ---
  
  # mma-context-blocks
@@ -10,7 +10,7 @@ when_to_use: >-
  read files, reproduce, trace — OR a methodology skill
  (superpowers:systematic-debugging) points at the investigation step. Delegate
  the read/reproduce/trace; the main agent stays on the hypothesis and the fix.
- version: 4.0.6
+ version: 4.1.0
  ---
  
  # mma-debug
@@ -11,7 +11,7 @@ when_to_use: >-
  and keep main context free. If a plan file exists → use mma-execute-plan. If
  the task is audit / review / verify / debug / investigate → use the matching
  specialized skill.
- version: 4.0.6
+ version: 4.1.0
  ---
  
  # mma-delegate
@@ -10,7 +10,7 @@ when_to_use: >-
  superpowers:subagent-driven-development / superpowers:executing-plans —
  workers are cheaper and don't pollute main context. Task descriptors must
  match plan headings verbatim.
- version: 4.0.6
+ version: 4.1.0
  ---
  
  # mma-execute-plan
@@ -12,7 +12,7 @@ when_to_use: >-
  out mma-investigate (internal) + mma-research (external) in parallel and
  synthesise the results yourself. DO NOT use for convergent single-answer
  questions — those are mma-investigate.
- version: 4.0.6
+ version: 4.1.0
  ---
  
  # mma-explore
@@ -12,7 +12,7 @@ when_to_use: >-
  git-history queries. OR you are about to read 3+ files / run any grep in main
  context — that's the inline-labor-leakage anti-pattern (AP2); delegate to this
  skill instead.
- version: 4.0.6
+ version: 4.1.0
  ---
  
  # mma-investigate
@@ -10,7 +10,7 @@ when_to_use: >-
  others do, what published methods exist) AND mmagent is running. Delegate the
  multi-source web/adapter research to a worker so the main context stays on
  judgment. NOT for codebase questions — those are mma-investigate.
- version: 4.0.6
+ version: 4.1.0
  ---
  
  # mma-research
@@ -10,7 +10,7 @@ when_to_use: >-
  you want to re-try the failed indices only. Prefer this over re-dispatching
  the whole batch or inline-retrying — it's idempotent and preserves the
  original batch's diagnostics.
- version: 4.0.6
+ version: 4.1.0
  ---
  
  # mma-retry
@@ -10,16 +10,18 @@ when_to_use: >-
  AND mmagent is running. Delegate so each file reviews on its own worker; the
  main agent only decides what to merge. Review on SOURCE CODE — use mma-audit
  for prose specs / configs.
- version: 4.0.6
+ version: 4.1.0
  ---
  
  # mma-review
  
  ## Overview
  
- Send code files to workers for structured review. Each file is reviewed independently in parallel; results are index-aligned with `filePaths`.
+ mma-review is the **pre-merge gate**. Send code files (or a diff) to a worker for structured review against an executability bar: would a maintainer who reads only the verdict and the diff understand which changes are required, why each is required, and where each lives — well enough to apply the fix and re-merge without re-investigating?
  
- **Core principle:** Reviewer is a different model from the implementer — different training, different blind spots. Cross-model review catches what self-review misses.
+ Each file is reviewed independently in parallel; results are index-aligned with `filePaths`.
+
+ **Core principle:** Reviewer is a different model from the implementer — different training, different blind spots. Cross-model review catches what self-review misses. The reviewer runs against a 10-category failure-mode taxonomy (test gap, cross-file ripple, missing edge case, race, resource leak, backward-compat break, security/performance regression, implicit-contract assumption, pre-existing-bug-vs-new-regression separation) and weighs every change through the security, performance, and correctness lenses regardless of `focus`.
  
  ## When to Use
  
@@ -34,6 +36,13 @@ Send code files to workers for structured review. Each file is reviewed independ
  - You want to know whether a complete branch is mergeable → run `/ultrareview` (multi-model branch review) instead
  - The diff is one-line / one-character → reading inline is faster than dispatch
  
+ ## How to invoke for cross-file ripple detection
+
+ The cross-file ripple pass (changed-symbol → broken caller) only fires when the worker can identify what changed. Two patterns:
+
+ - **Diff-as-input (preferred for cross-file ripple)**: pass the diff via the `code` field, plus the named files via `filePaths`. The worker treats the diff as the change-set and greps for callers of changed public symbols.
+ - **Files-only (static review)**: pass only `filePaths`. The worker reviews the files in their current state without a change-set, so cross-file ripple is degenerate. Test gap, missing edge case, race, leak, and security/performance findings still fire.
+
  ## Endpoint
  
  `POST /review?cwd=<abs-path>`
@@ -10,7 +10,7 @@ when_to_use: >-
  against implemented work BEFORE claiming success. Delegate so each checklist
  item gets independent evidence-gathering on a worker. Use this BEFORE saying
  "done" — never after.
- version: 4.0.6
+ version: 4.1.0
  ---
  
  # mma-verify
@@ -11,7 +11,7 @@ when_to_use: >-
  tasks — AND mmagent is running. Read this once, pick the matching mma-* skill,
  and delegate there. Applies equally whether the user invoked a superpowers
  methodology skill or asked directly.
- version: 4.0.6
+ version: 4.1.0
  ---
  
  # multi-model-agent (router)
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@zhixuan92/multi-model-agent",
- "version": "4.0.6",
+ "version": "4.1.0",
  "type": "module",
  "license": "MIT",
  "description": "Standalone HTTP server for multi-model-agent. Routes tool-invocation work to Claude, Codex, or OpenAI-compatible sub-agents with async-polling REST dispatch and installable skills for Claude Code, Gemini CLI, Codex CLI, and Cursor.",
@@ -53,7 +53,7 @@
  },
  "dependencies": {
  "@asteasolutions/zod-to-openapi": "^8.5.0",
- "@zhixuan92/multi-model-agent-core": "^4.0.6",
+ "@zhixuan92/multi-model-agent-core": "^4.1.0",
  "gray-matter": "^4.0.3",
  "minimist": "^1.2.8",
  "proper-lockfile": "^4.1.2",