@zhixuan92/multi-model-agent 3.3.0 → 3.5.0
- package/README.md +76 -33
- package/dist/http/canonicalize-file-paths.d.ts +8 -0
- package/dist/http/canonicalize-file-paths.d.ts.map +1 -0
- package/dist/http/canonicalize-file-paths.js +43 -0
- package/dist/http/canonicalize-file-paths.js.map +1 -0
- package/dist/http/execution-context.d.ts.map +1 -1
- package/dist/http/execution-context.js +0 -14
- package/dist/http/execution-context.js.map +1 -1
- package/dist/http/handlers/tools/execute-plan.d.ts.map +1 -1
- package/dist/http/handlers/tools/execute-plan.js +21 -3
- package/dist/http/handlers/tools/execute-plan.js.map +1 -1
- package/dist/http/handlers/tools/investigate.d.ts +4 -0
- package/dist/http/handlers/tools/investigate.d.ts.map +1 -0
- package/dist/http/handlers/tools/investigate.js +81 -0
- package/dist/http/handlers/tools/investigate.js.map +1 -0
- package/dist/http/server.d.ts.map +1 -1
- package/dist/http/server.js +5 -2
- package/dist/http/server.js.map +1 -1
- package/dist/install/discover.d.ts +1 -1
- package/dist/install/discover.d.ts.map +1 -1
- package/dist/install/discover.js +1 -0
- package/dist/install/discover.js.map +1 -1
- package/dist/openapi.d.ts.map +1 -1
- package/dist/openapi.js +6 -0
- package/dist/openapi.js.map +1 -1
- package/dist/skills/_shared/verify-and-review.md +12 -0
- package/dist/skills/mma-audit/SKILL.md +45 -18
- package/dist/skills/mma-clarifications/SKILL.md +73 -29
- package/dist/skills/mma-context-blocks/SKILL.md +56 -24
- package/dist/skills/mma-debug/SKILL.md +54 -22
- package/dist/skills/mma-delegate/SKILL.md +58 -26
- package/dist/skills/mma-execute-plan/SKILL.md +55 -29
- package/dist/skills/mma-investigate/SKILL.md +137 -0
- package/dist/skills/mma-retry/SKILL.md +65 -22
- package/dist/skills/mma-review/SKILL.md +49 -20
- package/dist/skills/mma-verify/SKILL.md +49 -18
- package/dist/skills/multi-model-agent/SKILL.md +84 -46
- package/package.json +2 -2
package/dist/skills/multi-model-agent/SKILL.md
@@ -1,27 +1,74 @@
 ---
 name: multi-model-agent
 description: >-
-
-
-
-defaulting to inline Agent dispatches
+Use first whenever you're about to delegate any tool-using work — picks the
+right mma-* skill (audit, review, verify, debug, plan execution, codebase
+investigation, ad-hoc delegation, retry, context-block reuse, clarification
+resume) instead of defaulting to inline Agent dispatches
 when_to_use: >-
 The user asks for work you'd normally delegate — audit, code review, checklist
-verification, debugging, plan execution, or ad-hoc parallel
-mmagent is running. Read this once, pick the matching mma-* skill,
-delegate there. Applies equally whether the user invoked a superpowers
-methodology skill or
-version: 3.
+verification, debugging, plan execution, codebase Q&A, or ad-hoc parallel
+tasks — AND mmagent is running. Read this once, pick the matching mma-* skill,
+and delegate there. Applies equally whether the user invoked a superpowers
+methodology skill or asked directly.
+version: 3.5.0
 ---
 
-
-
-
-
+# multi-model-agent (router)
+
+## Overview
+
+Local HTTP service that fans out tool-using work to sub-agents on different LLM providers (Claude, OpenAI-compatible, Codex). Workers run on cheap models; the main agent stays on judgment.
+
+**Core principle:** Pick the most specific `mma-*` skill that fits the task. Specificity reduces input — specialized skills know their route, schema, and defaults so you write less.
+
+## Skill map
+
+```dot
+digraph picker {
+"Plan/spec file on disk?" [shape=diamond];
+"Audit a doc?" [shape=diamond];
+"Review code?" [shape=diamond];
+"Verify a checklist?" [shape=diamond];
+"Debug a failure?" [shape=diamond];
+"Codebase question?" [shape=diamond];
+"mma-execute-plan" [shape=box];
+"mma-audit" [shape=box];
+"mma-review" [shape=box];
+"mma-verify" [shape=box];
+"mma-debug" [shape=box];
+"mma-investigate" [shape=box];
+"mma-delegate" [shape=box];
+
+"Plan/spec file on disk?" -> "mma-execute-plan" [label="yes"];
+"Plan/spec file on disk?" -> "Audit a doc?" [label="no"];
+"Audit a doc?" -> "mma-audit" [label="yes"];
+"Audit a doc?" -> "Review code?" [label="no"];
+"Review code?" -> "mma-review" [label="yes"];
+"Review code?" -> "Verify a checklist?" [label="no"];
+"Verify a checklist?" -> "mma-verify" [label="yes"];
+"Verify a checklist?" -> "Debug a failure?" [label="no"];
+"Debug a failure?" -> "mma-debug" [label="yes"];
+"Debug a failure?" -> "Codebase question?" [label="no"];
+"Codebase question?" -> "mma-investigate" [label="yes"];
+"Codebase question?" -> "mma-delegate" [label="no — ad-hoc"];
+}
+```
 
-
+| Skill | Purpose |
+|---|---|
+| `mma-execute-plan` | Implement tasks from a plan or spec file (descriptors match plan headings) |
+| `mma-audit` | Audit a document/spec/config for security, correctness, style, or performance |
+| `mma-review` | Review code for quality, security, performance, correctness |
+| `mma-verify` | Verify work against a checklist (one item per worker, parallel) |
+| `mma-debug` | Debug a failure with a structured hypothesis |
+| `mma-investigate` | Codebase Q&A — structured answer with `file:line` citations + confidence |
+| `mma-delegate` | Ad-hoc implementation / research with no plan file |
+| `mma-retry` | Re-run specific failed/incomplete tasks from a previous batch by index |
+| `mma-context-blocks` | Register a reused doc once; reference by ID across N tasks |
+| `mma-clarifications` | Confirm or correct the service's proposed interpretation |
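
The picker graph added in this hunk is a first-match chain: each question is asked in order and the first "yes" wins, with `mma-delegate` as the fallback. A minimal Python sketch of that ordering follows. The `Task` dataclass and its flag names are hypothetical and exist only for illustration; the skill file describes the questions, not a data model.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical flags, one per diamond in the picker graph.
    has_plan_file: bool = False
    audits_document: bool = False
    reviews_code: bool = False
    verifies_checklist: bool = False
    debugs_failure: bool = False
    codebase_question: bool = False


def pick_skill(task: Task) -> str:
    """Return the first matching skill, in the order the graph asks its questions."""
    checks = [
        (task.has_plan_file, "mma-execute-plan"),
        (task.audits_document, "mma-audit"),
        (task.reviews_code, "mma-review"),
        (task.verifies_checklist, "mma-verify"),
        (task.debugs_failure, "mma-debug"),
        (task.codebase_question, "mma-investigate"),
    ]
    for matched, skill in checks:
        if matched:
            return skill
    # No specific skill fits: the graph's final "no" edge is ad-hoc delegation.
    return "mma-delegate"
```

Because the chain is ordered, a task that both has a plan file and involves debugging resolves to `mma-execute-plan`, which matches the graph's top-down reading.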
 
-
+## Preflight: auto-start the daemon if it is not running
 
 ```bash
 PORT=7337
@@ -37,52 +84,43 @@ fi
 
 Idempotent: already-running daemon → curl succeeds → no-op.
 
-
+❌ `mmagent serve` (no `&`) — blocks forever, never reaches the next step.
+✅ `mmagent serve >/dev/null 2>&1 & disown` — backgrounded, releases the shell.
+
+## Auth token
 
-Set the token in your environment:
 ```bash
 export MMAGENT_AUTH_TOKEN=$(mmagent print-token)
 ```
 
-
-Every request requires `Authorization: Bearer <token>`.
-
-### Skill map
-
-| Skill | Purpose |
-|---|---|
-| `mma-delegate` | Ad-hoc implementation/research (no plan file) |
-| `mma-audit` | Audit a document for security, correctness, style, or performance |
-| `mma-review` | Review code for quality, security, or correctness |
-| `mma-verify` | Verify work against a checklist |
-| `mma-debug` | Debug a failure with a structured hypothesis |
-| `mma-execute-plan` | Implement tasks from a plan or spec file |
-| `mma-retry` | Re-run specific failed tasks from a previous batch |
-| `mma-context-blocks` | Register large reused documents to reference by ID |
-| `mma-clarifications` | Confirm or correct the service's proposed interpretation |
+Every request requires `Authorization: Bearer $MMAGENT_AUTH_TOKEN`. The token rotates on every `mmagent serve` restart — re-export after a `pkill`/upgrade.
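
Since the new text says the token rotates on every restart, a client may want to read it from the environment at call time rather than cache it. The sketch below shows one way to build the header; the helper name `auth_headers` is hypothetical, and only the `MMAGENT_AUTH_TOKEN` variable and the `Bearer` scheme come from the diff.

```python
import os


def auth_headers(env=os.environ) -> dict:
    """Build the Authorization header from MMAGENT_AUTH_TOKEN, read at call time."""
    token = env.get("MMAGENT_AUTH_TOKEN", "").strip()
    if not token:
        # A missing token usually means the export step was skipped
        # or the daemon was restarted since the last export.
        raise RuntimeError(
            "MMAGENT_AUTH_TOKEN is not set; "
            "run: export MMAGENT_AUTH_TOKEN=$(mmagent print-token)"
        )
    return {"Authorization": f"Bearer {token}"}
```

Passing the environment as a parameter keeps the helper easy to exercise without touching the real process environment.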
 
-
+## Worker tier: `agentType`
 
 `mma-delegate` and `mma-execute-plan` accept `agentType: "standard" | "complex"`. Default is `"standard"` (cheaper, faster). Pick `"complex"` when:
 
-- The task touches many files or requires multi-step reasoning a
-- A prior standard run came back with `filesWritten: 0` or
+- The task touches many files or requires multi-step reasoning a standard-tier model cannot hold in context.
+- A prior standard run came back with `filesWritten: 0` or `incompleteReason: "turn_cap"` / `"cost_cap"` / `"timeout"`.
 - The task is security-sensitive or ambiguous enough that being wrong is costly.
 
-`mma-audit`, `mma-review`, `mma-debug` already default to complex; `mma-verify` already defaults to standard
+`mma-audit`, `mma-review`, `mma-debug`, `mma-investigate` already default to complex; `mma-verify` already defaults to standard. These are not caller-configurable.
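
The escalation rules above can be read as a small decision function. The sketch below is an illustration only: the field names `filesWritten` and `incompleteReason` are taken from the diff, but the assumption that they sit at the top level of a per-task result dict, and the function and parameter names, are hypothetical.

```python
from typing import Optional

# Incomplete reasons the skill text lists as grounds for retrying on the complex tier.
ESCALATE_REASONS = {"turn_cap", "cost_cap", "timeout"}


def choose_agent_type(
    prior_result: Optional[dict] = None,
    *,
    many_files: bool = False,
    high_stakes: bool = False,
) -> str:
    """Return "standard" unless one of the documented escalation conditions holds."""
    if many_files or high_stakes:
        return "complex"
    if prior_result is not None:
        # Assumed shape: a flat dict carrying the fields named in the skill text.
        if prior_result.get("filesWritten") == 0:
            return "complex"
        if prior_result.get("incompleteReason") in ESCALATE_REASONS:
            return "complex"
    return "standard"
```

This applies only to `mma-delegate` and `mma-execute-plan`; per the text above, the other skills fix their tier and do not accept the parameter.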
 
-
+## General flow
 
-1. Call the
+1. Call the matching `mma-*` skill → receive `{ batchId, statusUrl }`.
 2. Poll `GET /batch/:id`: `202 text/plain` while pending (body is the running headline), `200 application/json` on terminal.
-3. Read `results` / `error` / `proposedInterpretation` from the terminal envelope.
+3. Read `results` / `error` / `proposedInterpretation` from the 7-field terminal envelope.
 
-If
+If `proposedInterpretation` is a string (not the `not_applicable` sentinel) → use `mma-clarifications` to confirm/correct.
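
The poll-until-terminal loop in steps 2 and 3 can be sketched as below. This is a minimal illustration under stated assumptions, not the package's client: `fetch` is a hypothetical callable returning `(status_code, body_text)` for `GET /batch/:id`, the interval and poll limit are arbitrary, and the check treats the `not_applicable` sentinel as either that literal string or any non-string value, since the diff does not show its exact encoding.

```python
import json
import time
from typing import Callable, Tuple


def poll_batch(
    fetch: Callable[[], Tuple[int, str]],
    interval_s: float = 2.0,
    max_polls: int = 300,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the batch is terminal and return the parsed JSON envelope."""
    for _ in range(max_polls):
        status, body = fetch()
        if status == 202:
            # Pending: the body is the plain-text running headline.
            sleep(interval_s)
            continue
        if status == 200:
            # Terminal: the body is the JSON envelope.
            return json.loads(body)
        raise RuntimeError(f"unexpected status {status}: {body[:200]}")
    raise TimeoutError(f"batch still pending after {max_polls} polls")


def needs_clarification(envelope: dict) -> bool:
    """True when proposedInterpretation is a real string rather than the sentinel."""
    proposed = envelope.get("proposedInterpretation")
    return isinstance(proposed, str) and proposed != "not_applicable"
```

In real use, `fetch` would wrap an HTTP GET against the `statusUrl` returned in step 1, sending the `Authorization: Bearer` header. Injecting `fetch` and `sleep` keeps the loop testable without a running daemon.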
 
-
+## Common pitfalls
 
-
+❌ **Defaulting to inline Agent dispatch when mmagent is up.** mmagent workers cost ~10× less and don't pollute main context. **Why:** every inline tool call burns flagship-model tokens; that's exactly what mmagent exists to avoid.
 
-
-
-
+❌ **Picking `mma-delegate` when a more specific skill fits.** Audit / review / verify / debug / investigate workers know their route's defaults and emit structured reports. **Why:** specialized skills require less input and produce richer output.
+
+❌ **Starting an investigation that needs to write code.** `mma-investigate` is read-only. **Fix:** dispatch `mma-delegate` with research-then-edit framing, or split: investigate → digest → edit.
+
+## Diagnosing slow tasks
+
+`mmagent serve --verbose` (or `diagnostics.verbose: true` in config) records `tool_call`, `turn_complete`, and `heartbeat` events. Tail with `mmagent logs --follow --batch=$BATCH_ID`.
package/package.json
@@ -1,6 +1,6 @@
 {
 "name": "@zhixuan92/multi-model-agent",
-"version": "3.
+"version": "3.5.0",
 "type": "module",
 "license": "MIT",
 "description": "Standalone HTTP server for multi-model-agent. Routes tool-invocation work to Claude, Codex, or OpenAI-compatible sub-agents with async-polling REST dispatch and installable skills for Claude Code, Gemini CLI, Codex CLI, and Cursor.",
@@ -52,7 +52,7 @@
 },
 "dependencies": {
 "@asteasolutions/zod-to-openapi": "^8.5.0",
-"@zhixuan92/multi-model-agent-core": "^3.
+"@zhixuan92/multi-model-agent-core": "^3.5.0",
 "gray-matter": "^4.0.3",
 "minimist": "^1.2.8",
 "zod": "^4.0.0"