@zhixuan92/multi-model-agent 4.5.4 → 4.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (59)
  1. package/README.md +6 -3
  2. package/dist/http/async-dispatch.d.ts.map +1 -1
  3. package/dist/http/async-dispatch.js +21 -16
  4. package/dist/http/async-dispatch.js.map +1 -1
  5. package/dist/http/execution-context.d.ts.map +1 -1
  6. package/dist/http/execution-context.js +12 -9
  7. package/dist/http/execution-context.js.map +1 -1
  8. package/dist/http/handler-deps.d.ts +0 -6
  9. package/dist/http/handler-deps.d.ts.map +1 -1
  10. package/dist/http/handlers/control/batch.d.ts.map +1 -1
  11. package/dist/http/handlers/control/batch.js +50 -0
  12. package/dist/http/handlers/control/batch.js.map +1 -1
  13. package/dist/http/handlers/control/context-blocks.d.ts +0 -2
  14. package/dist/http/handlers/control/context-blocks.d.ts.map +1 -1
  15. package/dist/http/handlers/control/context-blocks.js +3 -1
  16. package/dist/http/handlers/control/context-blocks.js.map +1 -1
  17. package/dist/http/handlers/control/retry.js.map +1 -1
  18. package/dist/http/handlers/tools/audit.d.ts.map +1 -1
  19. package/dist/http/handlers/tools/audit.js +1 -11
  20. package/dist/http/handlers/tools/audit.js.map +1 -1
  21. package/dist/http/handlers/tools/debug.d.ts.map +1 -1
  22. package/dist/http/handlers/tools/debug.js +1 -11
  23. package/dist/http/handlers/tools/debug.js.map +1 -1
  24. package/dist/http/handlers/tools/delegate.d.ts.map +1 -1
  25. package/dist/http/handlers/tools/delegate.js +1 -11
  26. package/dist/http/handlers/tools/delegate.js.map +1 -1
  27. package/dist/http/handlers/tools/execute-plan.d.ts.map +1 -1
  28. package/dist/http/handlers/tools/execute-plan.js +1 -11
  29. package/dist/http/handlers/tools/execute-plan.js.map +1 -1
  30. package/dist/http/handlers/tools/investigate.d.ts.map +1 -1
  31. package/dist/http/handlers/tools/investigate.js +1 -11
  32. package/dist/http/handlers/tools/investigate.js.map +1 -1
  33. package/dist/http/handlers/tools/research.d.ts.map +1 -1
  34. package/dist/http/handlers/tools/research.js +1 -11
  35. package/dist/http/handlers/tools/research.js.map +1 -1
  36. package/dist/http/handlers/tools/retry.d.ts.map +1 -1
  37. package/dist/http/handlers/tools/retry.js +6 -16
  38. package/dist/http/handlers/tools/retry.js.map +1 -1
  39. package/dist/http/handlers/tools/review.d.ts.map +1 -1
  40. package/dist/http/handlers/tools/review.js +1 -11
  41. package/dist/http/handlers/tools/review.js.map +1 -1
  42. package/dist/http/request-observability.d.ts.map +1 -1
  43. package/dist/http/request-observability.js +6 -8
  44. package/dist/http/request-observability.js.map +1 -1
  45. package/dist/http/server.d.ts.map +1 -1
  46. package/dist/http/server.js +20 -42
  47. package/dist/http/server.js.map +1 -1
  48. package/dist/skills/mma-audit/SKILL.md +38 -25
  49. package/dist/skills/mma-context-blocks/SKILL.md +22 -1
  50. package/dist/skills/mma-debug/SKILL.md +38 -25
  51. package/dist/skills/mma-delegate/SKILL.md +103 -11
  52. package/dist/skills/mma-execute-plan/SKILL.md +101 -2
  53. package/dist/skills/mma-explore/SKILL.md +21 -5
  54. package/dist/skills/mma-investigate/SKILL.md +62 -38
  55. package/dist/skills/mma-research/SKILL.md +52 -3
  56. package/dist/skills/mma-retry/SKILL.md +102 -3
  57. package/dist/skills/mma-review/SKILL.md +38 -25
  58. package/dist/skills/multi-model-agent/SKILL.md +1 -1
  59. package/package.json +2 -2
package/dist/skills/mma-review/SKILL.md CHANGED
@@ -10,7 +10,7 @@ when_to_use: >-
  AND mmagent is running. Delegate so each file reviews on its own worker; the
  main agent only decides what to merge. Review on SOURCE CODE — use mma-audit
  for prose specs / configs.
- version: 4.5.4
+ version: 4.6.0
  ---

  # mma-review
@@ -90,41 +90,54 @@ BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')

  @include _shared/response-shape.md

- ## Reading the findings (3.10.5+)
+ ## Reading the findings

- The terminal envelope's `results[N].annotatedFindings` is a list of structured
- findings the reviewer extracted and scored from the implementer's narrative.
- Every finding has the same shape:
+ The main agent reads `completed` + `message` + `findings` — the findings are the answer. For
+ read-only routes, `filesChanged` is always `[]` and `commitSha` is always `null`.
+
+ ```json
+ {
+   "completed": true,
+   "message": "Review complete; 3 findings.",
+   "findings": [
+     { "id": "F1", "severity": "critical", "category": "test-gap",
+       "claim": "login.ts has no test for null username edge case.",
+       "evidence": "Worker read login.ts and grepped for test files — no null-case test found.",
+       "suggestion": "Add test case: `login(null) throws ValidationError`.",
+       "source": "reviewer" }
+   ],
+   "filesChanged": [],
+   "commitSha": null,
+   "summary": "...",
+   "telemetry": { ... }
+ }
+ ```
+
+ ### Finding shape
+
+ Every finding has this shape:

  | Field | Type | Notes |
  |---|---|---|
- | `id` | string | Reviewer-assigned, e.g. `F1`, `F2`. |
+ | `id` | string | Worker-assigned, e.g. `F1`, `F2`. Stable across chain. |
  | `severity` | `'critical' \| 'high' \| 'medium' \| 'low'` | 4-tier. |
+ | `category` | string | Topical bucket, e.g. `test-gap`, `cross-file-ripple`. |
  | `claim` | string | One-sentence summary. |
- | `evidence` | string ≥20 chars | Quoted from worker output when grounded. |
+ | `evidence` | string ≥20 chars | Verbatim from source when grounded. |
  | `suggestion?` | string | Optional fix recommendation. |
- | `annotatorConfidence` | `number \| null` | 0–100 from the reviewer; `null` when emitted via deterministic fallback. |
- | `evidenceGrounded` | boolean | True when `evidence` is a verbatim substring of worker output. |
-
- ### Verdict states (`qualityReviewVerdict`)
+ | `source` | `'implementer' \| 'reviewer'` | Who produced the finding. |

- - `'annotated'` — every finding is structured. May be reviewer-emitted (with
-   numeric `annotatorConfidence`) or deterministic-fallback (with
-   `annotatorConfidence: null`). The route ALWAYS reaches `'annotated'` unless
-   the reviewer call itself fails transport.
- - `'error'` — only when the reviewer call fails transport (network / 5xx).
+ `annotatorConfidence` and `evidenceGrounded` are retired; they were v4 fields with no producers.

  ### Recommended rendering by the main agent

- 1. Show ALL findings — never silently drop. Confidence and grounding are
-    soft signals, not gates.
- 2. Default sort: severity (critical → low) then `annotatorConfidence` desc
-    (nulls last).
- 3. `severity` is the reviewer's authoritative final value; use it directly.
- 4. Mark findings with `evidenceGrounded: false` or
-    `annotatorConfidence < 70` as "lower-trust" (collapsed section, lighter
-    color, or `(low confidence)` annotation). User decides what to do.
- 5. Severity-tier counts feed the dashboard via V3 `findingsBySeverity`.
+ 1. Show ALL findings — never silently drop. Severity and grounding are soft
+    signals, not gates.
+ 2. Default sort: severity (critical → low), then `id` ascending.
+ 3. `severity` is the authoritative value — use it directly.
+ 4. Mark findings with `evidence` shorter than 30 chars as "low-evidence"
+    (lighter color or `(low evidence)` annotation). User decides what to do.
+ 5. Severity-tier counts feed the dashboard.

  @include _shared/budget-defaults.md

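The 4.6.0 rendering rules in this hunk (show every finding, sort by severity and then `id`, flag evidence shorter than 30 characters) could be sketched on the consumer side roughly as follows. This is a minimal illustration, not code from the package: the `Finding` interface is transcribed from the table in the hunk, while `sortFindings`, `isLowEvidence`, `renderFindings`, and the numeric-suffix comparison of ids such as `F2` and `F10` are hypothetical names and assumptions introduced here.

```typescript
// Hypothetical consumer-side types mirroring the "Finding shape" table (4.6.0).
type Severity = "critical" | "high" | "medium" | "low";

interface Finding {
  id: string;
  severity: Severity;
  category: string;
  claim: string;
  evidence: string;
  suggestion?: string;
  source: "implementer" | "reviewer";
}

// Lower rank sorts first: critical -> low.
const SEVERITY_RANK: Record<Severity, number> = {
  critical: 0,
  high: 1,
  medium: 2,
  low: 3,
};

// Assumption: ids look like "F1", "F12". Comparing the numeric suffix keeps
// F2 ahead of F10; ids without a number sort last.
function idNumber(id: string): number {
  const n = Number.parseInt(id.replace(/^\D+/, ""), 10);
  return Number.isNaN(n) ? Number.MAX_SAFE_INTEGER : n;
}

// Rule 2: severity first, then id ascending. Returns a copy; nothing is dropped (rule 1).
function sortFindings(findings: readonly Finding[]): Finding[] {
  return [...findings].sort(
    (a, b) =>
      SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity] ||
      idNumber(a.id) - idNumber(b.id) ||
      a.id.localeCompare(b.id),
  );
}

// Rule 4: evidence shorter than 30 characters is annotated, never hidden.
function isLowEvidence(finding: Finding): boolean {
  return finding.evidence.length < 30;
}

// One display line per finding, in sorted order.
function renderFindings(findings: readonly Finding[]): string[] {
  return sortFindings(findings).map(
    (f) =>
      `[${f.severity}] ${f.id} ${f.claim}${isLowEvidence(f) ? " (low evidence)" : ""}`,
  );
}
```

A caller would pass the `findings` array from the terminal envelope straight to `renderFindings`. Because the low-evidence check only appends an annotation, the output always has one line per input finding, which keeps the "never silently drop" rule intact.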
package/dist/skills/multi-model-agent/SKILL.md CHANGED
@@ -11,7 +11,7 @@ when_to_use: >-
  tasks — AND mmagent is running. Read this once, pick the matching mma-* skill,
  and delegate there. Applies equally whether the user invoked a superpowers
  methodology skill or asked directly.
- version: 4.5.4
+ version: 4.6.0
  ---

  # multi-model-agent (router)
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@zhixuan92/multi-model-agent",
-   "version": "4.5.4",
+   "version": "4.6.0",
    "type": "module",
    "license": "MIT",
    "description": "Standalone HTTP server for multi-model-agent. Routes tool-invocation work to Claude, Codex, or OpenAI-compatible sub-agents with async-polling REST dispatch and installable skills for Claude Code, Gemini CLI, Codex CLI, and Cursor.",
@@ -53,7 +53,7 @@
    },
    "dependencies": {
      "@asteasolutions/zod-to-openapi": "^8.5.0",
-     "@zhixuan92/multi-model-agent-core": "^4.5.4",
+     "@zhixuan92/multi-model-agent-core": "^4.6.0",
      "gray-matter": "^4.0.3",
      "minimist": "^1.2.8",
      "proper-lockfile": "^4.1.2",