@zhixuan92/multi-model-agent 4.0.6 → 4.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (14)

package/README.md +1 -1
package/dist/skills/mma-audit/SKILL.md +25 -13
package/dist/skills/mma-context-blocks/SKILL.md +1 -1
package/dist/skills/mma-debug/SKILL.md +1 -1
package/dist/skills/mma-delegate/SKILL.md +1 -1
package/dist/skills/mma-execute-plan/SKILL.md +1 -1
package/dist/skills/mma-explore/SKILL.md +1 -1
package/dist/skills/mma-investigate/SKILL.md +1 -1
package/dist/skills/mma-research/SKILL.md +1 -1
package/dist/skills/mma-retry/SKILL.md +1 -1
package/dist/skills/mma-review/SKILL.md +12 -3
package/dist/skills/mma-verify/SKILL.md +1 -1
package/dist/skills/multi-model-agent/SKILL.md +1 -1
package/package.json +2 -2

package/README.md CHANGED

@@ -121,7 +121,7 @@ Skills are the surface your AI client sees. `mmagent sync-skills` writes them to
 | `mma-research` | `POST /research` | External multi-source research with citations — arxiv, semantic_scholar, github_search, rss, brave-with-`site:`-filters — for a focused question. |
 | `mma-debug` | `POST /debug` | A test fails, a build breaks, or behavior is unexpected — delegate the reproduce/trace, keep the hypothesis on the main agent. |
 | `mma-review` | `POST /review` | Source-code review (pre-merge, post-implementation, security-focused). One worker per file, in parallel. |
-| `mma-audit` | `POST /audit` | Audit a prose document — spec, config, PR description — for correctness, security, or style. |
+| `mma-audit` | `POST /audit` | Audit a spec / plan / design doc / recommendation doc for executability blockers (contradictions, ambiguity, recommendation-coherence gaps). Default is the comprehensive sweep; `security` and `performance` are narrow opt-in lenses. |
 | `mma-verify` | `POST /verify` | Check acceptance criteria against finished work *before* claiming done. One worker per checklist item. |
 ### Plumbing skills

package/dist/skills/mma-audit/SKILL.md CHANGED

@@ -1,29 +1,31 @@
 ---
 name: mma-audit
 description: >-
-  Use when the user asks to audit a document, spec, config, or PR description
-  for security, correctness, performance, or style issues — and 2+ files need
-  independent audit passes
+  Use when the user asks to audit a spec, plan, design doc, recommendation doc,
+  or config — the audit checks whether a literal-following worker could execute
+  the artifact without ambiguity, contradiction, or missing context. Default is
+  the comprehensive sweep; narrow lenses (security/performance) exist for cases
+  that want only one dimension.
 when_to_use: >-
-  User asks for a doc/spec/config audit OR a methodology skill
+  User asks for a doc/spec/plan/config audit OR a methodology skill
   (superpowers:dispatching-parallel-agents, /security-review) points at one AND
   mmagent is running. Audit on PROSE/SPEC docs — use mma-review for source code.
-version: 4.0.6
+version: 4.1.0
 ---
 # mma-audit
 ## Overview
-Send a document or set of files to workers for structured auditing. Each file is audited independently in parallel; per-file results are indexed by path in the terminal envelope.
+Send a spec, plan, design doc, or recommendation doc to a worker for structured auditing. The audit's purpose is to make the artifact **executable by a low-judgment worker** — meaning a sub-agent that follows instructions literally and cannot disambiguate. Findings target executability blockers: ambiguity, internal contradictions, unspecified branches, missing verification, overloaded terms, out-of-order steps.
 **Core principle:** One worker per file = no cross-file context pollution. The aggregator (you) decides what to do with the findings.
 ## When to Use
 **Use when:**
-- A spec / design doc / API contract / config file needs a critical read
-- The audit type is `security`, `performance`, `correctness`, or `style` (or a combination)
+- A spec, plan, design doc, recommendation doc, or post-mortem needs a critical read
+- The artifact will subsequently be executed by a worker (or any reader who follows it literally)
 - 2+ files would benefit from parallel audit
 **Don't use when:**
@@ -42,7 +44,7 @@ Send a document or set of files to workers for structured auditing. Each file is
 ```json
 {
   "document": "inline content to audit (optional if filePaths given)",
-  "auditType": "correctness",
+  "auditType": "default",
   "filePaths": ["/project/docs/spec.md"],
   "contextBlockIds": []
 }
@@ -51,7 +53,7 @@ Send a document or set of files to workers for structured auditing. Each file is
 | Field | Type | Required | Notes |
 |---|---|---|---|
 | `document` | string | no | Inline document content |
-| `auditType` | string \| string[] | yes | `security`, `performance`, `correctness`, `style`, or `general`; or an array of the first four |
+| `auditType` | `'default' \| 'security' \| 'performance'` | no (defaults to `'default'`) | See "Picking auditType" below — `default` is right for ~90% of audits |
 | `filePaths` | string[] | no | Files to audit (one worker per file, parallel) |
 | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |
@@ -59,6 +61,16 @@ Either `document` or `filePaths` (or both) must be provided.
 > Worker tier for `mma-audit` is hardcoded to `complex` and is not caller-configurable. Sending `agentType` is rejected with HTTP 400.
+### Picking auditType
+| Value | When to use |
+|---|---|
+| `default` (or omit the field) | **Right answer for ~90% of audits.** Spec, plan, design doc, recommendation doc, post-mortem, audit, brief, README — any prose artifact. Sweeps the full executability + correctness + clarity taxonomy with security and performance lenses applied. |
+| `security` | Narrow opt-in. Use ONLY when you specifically want security findings and not general audit findings (e.g., a threat model where stylistic noise is unwanted). |
+| `performance` | Narrow opt-in. Use ONLY when you specifically want performance findings (e.g., a scaling design where you want hot-path / latency / unbounded-loop findings only). |
+The legacy values `correctness`, `style`, and `general` no longer exist — they were a false dichotomy. Sending any of them returns `400 invalid_request` with a hint to use `default`.
 ## Full example
 ```bash
@@ -67,7 +79,7 @@ BATCH=$(curl -f --show-error -s -X POST \
   -H "X-MMA-Client: $MMA_CLIENT" \
   -H "Authorization: Bearer $TOKEN" \
   -H "Content-Type: application/json" \
-  -d '{"auditType":"correctness","filePaths":["/project/docs/api-spec.md"]}' \
+  -d '{"auditType":"default","filePaths":["/project/docs/api-spec.md"]}' \
   "http://localhost:$PORT/audit?cwd=/project")
 BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
 ```
@@ -130,8 +142,8 @@ The auditor lacks codebase context (no type info, no call-site lookup, no test a
 ❌ **Single huge `document` string instead of `filePaths`**
 Inline docs lose the file boundary, so the per-file parallel split degenerates to one worker. **Fix:** save to disk first, pass `filePaths`.
-❌ **Asking for `auditType: "general"` when you mean something specific**
-`"general"` is a catch-all that produces watery findings. **Fix:** pick the dimension you actually care about (`"correctness"` for spec gaps, `"security"` for threat models, etc.).
+❌ **Sending legacy auditType values (`correctness`, `style`, `general`)**
+These were removed — they were a false dichotomy that biased workers toward stylistic proofreading on prose artifacts. **Fix:** use `default` (or omit the field). Use `security` or `performance` only when you specifically want a narrow lens.
 ❌ **Re-auditing the same files round after round without delta context**
 Round 2 worker has no idea what round 1 found. **Fix:** register the round 1 findings as a context block (`mma-context-blocks`) and pass `contextBlockIds` to round 2.
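
The `auditType` change above is a breaking one for callers that still send 4.0.6 values. A minimal client-side sketch of the migration, assuming a caller that wants to normalise old values before posting to `/audit`: the helper names `migrate_audit_type` and `build_audit_body` are hypothetical and not part of the package, and the handling of the old array form is an assumption, since the 4.1.0 text documents only a single string.

```python
import json

# auditType values documented for 4.1.0 in the field table above.
CURRENT_AUDIT_TYPES = {"default", "security", "performance"}
# Values the 4.1.0 text says are rejected with 400 invalid_request.
LEGACY_AUDIT_TYPES = {"correctness", "style", "general"}


def migrate_audit_type(value=None):
    """Hypothetical helper: map a 4.0.6-style auditType to a 4.1.0 value."""
    if value is None:
        return "default"  # omitting the field is documented as equivalent
    if isinstance(value, (list, tuple)):
        # 4.0.6 accepted arrays; 4.1.0 documents a single string.
        # Assumption: a lone narrow lens is kept, anything else
        # collapses to the comprehensive sweep.
        kept = [v for v in value if v in ("security", "performance")]
        return kept[0] if len(value) == 1 and kept else "default"
    if value in CURRENT_AUDIT_TYPES:
        return value
    if value in LEGACY_AUDIT_TYPES:
        return "default"
    raise ValueError(f"unknown auditType: {value!r}")


def build_audit_body(file_paths=None, document=None, audit_type=None,
                     context_block_ids=None):
    """Build a JSON-serialisable body for POST /audit (field names from the table)."""
    if not file_paths and document is None:
        raise ValueError("either document or filePaths must be provided")
    body = {"auditType": migrate_audit_type(audit_type)}
    if document is not None:
        body["document"] = document
    if file_paths:
        body["filePaths"] = list(file_paths)
    if context_block_ids:
        body["contextBlockIds"] = list(context_block_ids)
    return body


if __name__ == "__main__":
    # A 4.0.6-era call site that used "correctness" now sends "default".
    print(json.dumps(build_audit_body(["/project/docs/api-spec.md"],
                                      audit_type="correctness")))
```

Mapping every removed value to `default` follows the hint described in the diff; whether a caller would rather fail loudly on legacy values is a design choice this sketch does not settle.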

package/dist/skills/mma-context-blocks/SKILL.md CHANGED

@@ -12,7 +12,7 @@ when_to_use: >-
   Register once here, then pass the ID via `contextBlockIds` on mma-delegate /
   mma-execute-plan / mma-audit / mma-review / mma-verify / mma-debug /
   mma-investigate. Cheaper and faster than inlining the same content N times.
-version: 4.0.6
+version: 4.1.0
 ---
 # mma-context-blocks

package/dist/skills/mma-debug/SKILL.md CHANGED

@@ -10,7 +10,7 @@ when_to_use: >-
   read files, reproduce, trace — OR a methodology skill
   (superpowers:systematic-debugging) points at the investigation step. Delegate
   the read/reproduce/trace; the main agent stays on the hypothesis and the fix.
-version: 4.0.6
+version: 4.1.0
 ---
 # mma-debug

package/dist/skills/mma-delegate/SKILL.md CHANGED

@@ -11,7 +11,7 @@ when_to_use: >-
   and keep main context free. If a plan file exists → use mma-execute-plan. If
   the task is audit / review / verify / debug / investigate → use the matching
   specialized skill.
-version: 4.0.6
+version: 4.1.0
 ---
 # mma-delegate

package/dist/skills/mma-execute-plan/SKILL.md CHANGED

@@ -10,7 +10,7 @@ when_to_use: >-
   superpowers:subagent-driven-development / superpowers:executing-plans —
   workers are cheaper and don't pollute main context. Task descriptors must
   match plan headings verbatim.
-version: 4.0.6
+version: 4.1.0
 ---
 # mma-execute-plan

package/dist/skills/mma-explore/SKILL.md CHANGED

@@ -12,7 +12,7 @@ when_to_use: >-
   out mma-investigate (internal) + mma-research (external) in parallel and
   synthesise the results yourself. DO NOT use for convergent single-answer
   questions — those are mma-investigate.
-version: 4.0.6
+version: 4.1.0
 ---
 # mma-explore

package/dist/skills/mma-investigate/SKILL.md CHANGED

@@ -12,7 +12,7 @@ when_to_use: >-
   git-history queries. OR you are about to read 3+ files / run any grep in main
   context — that's the inline-labor-leakage anti-pattern (AP2); delegate to this
   skill instead.
-version: 4.0.6
+version: 4.1.0
 ---
 # mma-investigate

package/dist/skills/mma-research/SKILL.md CHANGED

@@ -10,7 +10,7 @@ when_to_use: >-
   others do, what published methods exist) AND mmagent is running. Delegate the
   multi-source web/adapter research to a worker so the main context stays on
   judgment. NOT for codebase questions — those are mma-investigate.
-version: 4.0.6
+version: 4.1.0
 ---
 # mma-research

package/dist/skills/mma-retry/SKILL.md CHANGED

@@ -10,7 +10,7 @@ when_to_use: >-
   you want to re-try the failed indices only. Prefer this over re-dispatching
   the whole batch or inline-retrying — it's idempotent and preserves the
   original batch's diagnostics.
-version: 4.0.6
+version: 4.1.0
 ---
 # mma-retry

package/dist/skills/mma-review/SKILL.md CHANGED

@@ -10,16 +10,18 @@ when_to_use: >-
   AND mmagent is running. Delegate so each file reviews on its own worker; the
   main agent only decides what to merge. Review on SOURCE CODE — use mma-audit
   for prose specs / configs.
-version: 4.0.6
+version: 4.1.0
 ---
 # mma-review
 ## Overview
-Send code files to workers for structured review. Each file is reviewed independently in parallel; results are index-aligned with `filePaths`.
+mma-review is the **pre-merge gate**. Send code files (or a diff) to a worker for structured review against an executability bar: would a maintainer who reads only the verdict and the diff understand which changes are required, why each is required, and where each lives — well enough to apply the fix and re-merge without re-investigating?
-**Core principle:** Reviewer is a different model from the implementer — different training, different blind spots. Cross-model review catches what self-review misses.
+Each file is reviewed independently in parallel; results are index-aligned with `filePaths`.
+**Core principle:** Reviewer is a different model from the implementer — different training, different blind spots. Cross-model review catches what self-review misses. The reviewer runs against a 10-category failure-mode taxonomy (test gap, cross-file ripple, missing edge case, race, resource leak, backward-compat break, security/performance regression, implicit-contract assumption, pre-existing-bug-vs-new-regression separation) and weighs every change through the security, performance, and correctness lenses regardless of `focus`.
 ## When to Use
@@ -34,6 +36,13 @@ Send code files to workers for structured review. Each file is reviewed independ
 - You want to know whether a complete branch is mergeable → run `/ultrareview` (multi-model branch review) instead
 - The diff is one-line / one-character → reading inline is faster than dispatch
+## How to invoke for cross-file ripple detection
+The cross-file ripple pass (changed-symbol → broken caller) only fires when the worker can identify what changed. Two patterns:
+- **Diff-as-input (preferred for cross-file ripple)**: pass the diff via the `code` field, plus the named files via `filePaths`. The worker treats the diff as the change-set and greps for callers of changed public symbols.
+- **Files-only (static review)**: pass only `filePaths`. The worker reviews the files in their current state without a change-set, so cross-file ripple is degenerate. Test gap, missing edge case, race, leak, and security/performance findings still fire.
 ## Endpoint
 `POST /review?cwd=<abs-path>`
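
The two invocation patterns added in this hunk differ only in whether a change-set accompanies the file list. A minimal sketch of how a caller might build the request body for each, assuming only the `code` and `filePaths` fields named in the skill text; the helper names `build_review_body` and `review_mode` are hypothetical, and any other `/review` fields are left out because this diff does not show them.

```python
import json


def build_review_body(file_paths, diff=None):
    """Hypothetical helper: build a POST /review body.

    Only `filePaths` and `code` are taken from the skill text.
    """
    body = {"filePaths": list(file_paths)}
    # Diff-as-input: include the change-set so the worker can look for
    # callers of changed public symbols. An empty or whitespace-only diff
    # is treated as absent, which falls back to a files-only review.
    if diff and diff.strip():
        body["code"] = diff
    return body


def review_mode(body):
    """Name the pattern a body corresponds to, using the skill's own terms."""
    return "diff-as-input" if "code" in body else "files-only"


if __name__ == "__main__":
    # Illustrative diff text; in practice this would come from the VCS.
    sample_diff = "--- a/src/api.ts\n+++ b/src/api.ts\n@@ -1 +1 @@\n-old\n+new\n"
    with_diff = build_review_body(["/project/src/api.ts"], diff=sample_diff)
    static = build_review_body(["/project/src/api.ts"])
    print(review_mode(with_diff), json.dumps(with_diff))
    print(review_mode(static), json.dumps(static))
```

Per the text above, the second body would still yield test-gap, edge-case, race, leak, and security/performance findings; only the cross-file ripple pass depends on the diff being present.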

package/dist/skills/mma-verify/SKILL.md CHANGED

@@ -10,7 +10,7 @@ when_to_use: >-
   against implemented work BEFORE claiming success. Delegate so each checklist
   item gets independent evidence-gathering on a worker. Use this BEFORE saying
   "done" — never after.
-version: 4.0.6
+version: 4.1.0
 ---
 # mma-verify

package/dist/skills/multi-model-agent/SKILL.md CHANGED

@@ -11,7 +11,7 @@ when_to_use: >-
   tasks — AND mmagent is running. Read this once, pick the matching mma-* skill,
   and delegate there. Applies equally whether the user invoked a superpowers
   methodology skill or asked directly.
-version: 4.0.6
+version: 4.1.0
 ---
 # multi-model-agent (router)

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@zhixuan92/multi-model-agent",
-  "version": "4.0.6",
+  "version": "4.1.0",
   "type": "module",
   "license": "MIT",
   "description": "Standalone HTTP server for multi-model-agent. Routes tool-invocation work to Claude, Codex, or OpenAI-compatible sub-agents with async-polling REST dispatch and installable skills for Claude Code, Gemini CLI, Codex CLI, and Cursor.",
@@ -53,7 +53,7 @@
   },
   "dependencies": {
     "@asteasolutions/zod-to-openapi": "^8.5.0",
-    "@zhixuan92/multi-model-agent-core": "^4.0.6",
+    "@zhixuan92/multi-model-agent-core": "^4.1.0",
     "gray-matter": "^4.0.3",
     "minimist": "^1.2.8",
     "proper-lockfile": "^4.1.2",
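
The dependency bump from `^4.0.6` to `^4.1.0` raises the minimum acceptable `@zhixuan92/multi-model-agent-core` version. A minimal sketch of npm's caret-range rule, assuming plain `MAJOR.MINOR.PATCH` versions with no prerelease or build tags; the function names are illustrative and this is not npm's own implementation.

```python
def parse_version(text):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints (no prerelease tags)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def caret_upper_bound(base):
    """Exclusive upper bound: bump the leftmost non-zero component."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def satisfies_caret(version, caret_range):
    """True if `version` falls inside a caret range such as '^4.1.0'."""
    base = parse_version(caret_range.lstrip("^"))
    return base <= parse_version(version) < caret_upper_bound(base)


if __name__ == "__main__":
    for candidate in ("4.0.6", "4.1.0", "4.2.3", "5.0.0"):
        print(candidate, satisfies_caret(candidate, "^4.1.0"))
```

Under this rule a core package at 4.0.6 satisfied the old range but not the new one, so installing 4.1.0 of this package pulls in a core release of at least 4.1.0 and below 5.0.0.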