@zhixuan92/multi-model-agent 3.3.0 → 3.4.0

Files changed (35)
  1. package/README.md +62 -33
  2. package/dist/http/canonicalize-file-paths.d.ts +8 -0
  3. package/dist/http/canonicalize-file-paths.d.ts.map +1 -0
  4. package/dist/http/canonicalize-file-paths.js +43 -0
  5. package/dist/http/canonicalize-file-paths.js.map +1 -0
  6. package/dist/http/execution-context.d.ts.map +1 -1
  7. package/dist/http/execution-context.js +0 -14
  8. package/dist/http/execution-context.js.map +1 -1
  9. package/dist/http/handlers/tools/investigate.d.ts +4 -0
  10. package/dist/http/handlers/tools/investigate.d.ts.map +1 -0
  11. package/dist/http/handlers/tools/investigate.js +81 -0
  12. package/dist/http/handlers/tools/investigate.js.map +1 -0
  13. package/dist/http/server.d.ts.map +1 -1
  14. package/dist/http/server.js +5 -2
  15. package/dist/http/server.js.map +1 -1
  16. package/dist/install/discover.d.ts +1 -1
  17. package/dist/install/discover.d.ts.map +1 -1
  18. package/dist/install/discover.js +1 -0
  19. package/dist/install/discover.js.map +1 -1
  20. package/dist/openapi.d.ts.map +1 -1
  21. package/dist/openapi.js +6 -0
  22. package/dist/openapi.js.map +1 -1
  23. package/dist/skills/_shared/verify-and-review.md +12 -0
  24. package/dist/skills/mma-audit/SKILL.md +45 -18
  25. package/dist/skills/mma-clarifications/SKILL.md +73 -29
  26. package/dist/skills/mma-context-blocks/SKILL.md +56 -24
  27. package/dist/skills/mma-debug/SKILL.md +54 -22
  28. package/dist/skills/mma-delegate/SKILL.md +58 -26
  29. package/dist/skills/mma-execute-plan/SKILL.md +55 -29
  30. package/dist/skills/mma-investigate/SKILL.md +137 -0
  31. package/dist/skills/mma-retry/SKILL.md +65 -22
  32. package/dist/skills/mma-review/SKILL.md +49 -20
  33. package/dist/skills/mma-verify/SKILL.md +49 -18
  34. package/dist/skills/multi-model-agent/SKILL.md +84 -46
  35. package/package.json +2 -2
@@ -1,27 +1,74 @@
1
1
  ---
2
2
  name: multi-model-agent
3
3
  description: >-
4
- Router for the multi-model-agent local service. Use first when you're about to
5
- delegate any tool-using work; picks the right mma-* skill for the task
6
- (audit, review, verify, debug, plan execution, ad-hoc delegation) instead of
7
- defaulting to inline Agent dispatches.
4
+ Use first whenever you're about to delegate any tool-using work — picks the
5
+ right mma-* skill (audit, review, verify, debug, plan execution, codebase
6
+ investigation, ad-hoc delegation, retry, context-block reuse, clarification
7
+ resume) instead of defaulting to inline Agent dispatches
8
8
  when_to_use: >-
9
9
  The user asks for work you'd normally delegate — audit, code review, checklist
10
- verification, debugging, plan execution, or ad-hoc parallel tasks — AND
11
- mmagent is running. Read this once, pick the matching mma-* skill, and
12
- delegate there. Applies equally whether the user invoked a superpowers
13
- methodology skill or just asked directly.
14
- version: 3.3.0
10
+ verification, debugging, plan execution, codebase Q&A, or ad-hoc parallel
11
+ tasks — AND mmagent is running. Read this once, pick the matching mma-* skill,
12
+ and delegate there. Applies equally whether the user invoked a superpowers
13
+ methodology skill or asked directly.
14
+ version: 3.4.0
15
15
  ---
16
16
 
17
- ## multi-model-agent overview
18
-
19
- multi-model-agent is a local HTTP service that fans out tool-using work to
20
- sub-agents running on different LLM providers (Claude, OpenAI-compatible, Codex).
17
+ # multi-model-agent (router)
18
+
19
+ ## Overview
20
+
21
+ Local HTTP service that fans out tool-using work to sub-agents on different LLM providers (Claude, OpenAI-compatible, Codex). Workers run on cheap models; the main agent stays on judgment.
22
+
23
+ **Core principle:** Pick the most specific `mma-*` skill that fits the task. Specificity reduces input — specialized skills know their route, schema, and defaults so you write less.
24
+
25
+ ## Skill map
26
+
27
+ ```dot
28
+ digraph picker {
29
+ "Plan/spec file on disk?" [shape=diamond];
30
+ "Audit a doc?" [shape=diamond];
31
+ "Review code?" [shape=diamond];
32
+ "Verify a checklist?" [shape=diamond];
33
+ "Debug a failure?" [shape=diamond];
34
+ "Codebase question?" [shape=diamond];
35
+ "mma-execute-plan" [shape=box];
36
+ "mma-audit" [shape=box];
37
+ "mma-review" [shape=box];
38
+ "mma-verify" [shape=box];
39
+ "mma-debug" [shape=box];
40
+ "mma-investigate" [shape=box];
41
+ "mma-delegate" [shape=box];
42
+
43
+ "Plan/spec file on disk?" -> "mma-execute-plan" [label="yes"];
44
+ "Plan/spec file on disk?" -> "Audit a doc?" [label="no"];
45
+ "Audit a doc?" -> "mma-audit" [label="yes"];
46
+ "Audit a doc?" -> "Review code?" [label="no"];
47
+ "Review code?" -> "mma-review" [label="yes"];
48
+ "Review code?" -> "Verify a checklist?" [label="no"];
49
+ "Verify a checklist?" -> "mma-verify" [label="yes"];
50
+ "Verify a checklist?" -> "Debug a failure?" [label="no"];
51
+ "Debug a failure?" -> "mma-debug" [label="yes"];
52
+ "Debug a failure?" -> "Codebase question?" [label="no"];
53
+ "Codebase question?" -> "mma-investigate" [label="yes"];
54
+ "Codebase question?" -> "mma-delegate" [label="no — ad-hoc"];
55
+ }
56
+ ```
21
57
 
22
- ### Preflight: auto-start the daemon if it is not running
58
+ | Skill | Purpose |
59
+ |---|---|
60
+ | `mma-execute-plan` | Implement tasks from a plan or spec file (descriptors match plan headings) |
61
+ | `mma-audit` | Audit a document/spec/config for security, correctness, style, or performance |
62
+ | `mma-review` | Review code for quality, security, performance, correctness |
63
+ | `mma-verify` | Verify work against a checklist (one item per worker, parallel) |
64
+ | `mma-debug` | Debug a failure with a structured hypothesis |
65
+ | `mma-investigate` | Codebase Q&A — structured answer with `file:line` citations + confidence |
66
+ | `mma-delegate` | Ad-hoc implementation / research with no plan file |
67
+ | `mma-retry` | Re-run specific failed/incomplete tasks from a previous batch by index |
68
+ | `mma-context-blocks` | Register a reused doc once; reference by ID across N tasks |
69
+ | `mma-clarifications` | Confirm or correct the service's proposed interpretation |
23
70
 
24
- Before any mma-* call, check the server. If it is not up, start it in the background — do NOT run `mmagent serve` synchronously, it blocks forever.
71
+ ## Preflight: auto-start the daemon if it is not running
25
72
 
26
73
  ```bash
27
74
  PORT=7337
@@ -37,52 +84,43 @@ fi
37
84
 
38
85
  Idempotent: already-running daemon → curl succeeds → no-op.
39
86
 
40
- ### Auth token
87
+ ❌ `mmagent serve` (no `&`) — blocks forever, never reaches the next step.
88
+ ✅ `mmagent serve >/dev/null 2>&1 & disown` — backgrounded, releases the shell.
89
+
90
+ ## Auth token
41
91
 
42
- Set the token in your environment:
43
92
  ```bash
44
93
  export MMAGENT_AUTH_TOKEN=$(mmagent print-token)
45
94
  ```
46
95
 
47
- Or read it from the env var `MMAGENT_AUTH_TOKEN` if already set.
48
- Every request requires `Authorization: Bearer <token>`.
49
-
50
- ### Skill map
51
-
52
- | Skill | Purpose |
53
- |---|---|
54
- | `mma-delegate` | Ad-hoc implementation/research (no plan file) |
55
- | `mma-audit` | Audit a document for security, correctness, style, or performance |
56
- | `mma-review` | Review code for quality, security, or correctness |
57
- | `mma-verify` | Verify work against a checklist |
58
- | `mma-debug` | Debug a failure with a structured hypothesis |
59
- | `mma-execute-plan` | Implement tasks from a plan or spec file |
60
- | `mma-retry` | Re-run specific failed tasks from a previous batch |
61
- | `mma-context-blocks` | Register large reused documents to reference by ID |
62
- | `mma-clarifications` | Confirm or correct the service's proposed interpretation |
96
+ Every request requires `Authorization: Bearer $MMAGENT_AUTH_TOKEN`. The token rotates on every `mmagent serve` restart; re-export after a `pkill`/upgrade.
63
97
 
64
- ### Worker tier: `agentType`
98
+ ## Worker tier: `agentType`
65
99
 
66
100
  `mma-delegate` and `mma-execute-plan` accept `agentType: "standard" | "complex"`. Default is `"standard"` (cheaper, faster). Pick `"complex"` when:
67
101
 
68
- - The task touches many files or requires multi-step reasoning a smaller model cannot hold in context.
69
- - A prior standard run came back with `filesWritten: 0` or exhausted its turn budget (visible in the verbose stream or the final envelope's `batchTimings` / `results`).
102
+ - The task touches many files or requires multi-step reasoning a standard-tier model cannot hold in context.
103
+ - A prior standard run came back with `filesWritten: 0` or `incompleteReason: "turn_cap"` / `"cost_cap"` / `"timeout"`.
70
104
  - The task is security-sensitive or ambiguous enough that being wrong is costly.
71
105
 
72
- `mma-audit`, `mma-review`, `mma-debug` already default to complex; `mma-verify` already defaults to standard; these are not configurable from the caller and do not need an `agentType` field.
106
+ `mma-audit`, `mma-review`, `mma-debug`, `mma-investigate` already default to complex; `mma-verify` already defaults to standard. These are not caller-configurable.
73
107
 
74
- ### General flow
108
+ ## General flow
75
109
 
76
- 1. Call the appropriate `mma-*` skill → receive `{ batchId }`.
110
+ 1. Call the matching `mma-*` skill → receive `{ batchId, statusUrl }`.
77
111
  2. Poll `GET /batch/:id`: `202 text/plain` while pending (body is the running headline), `200 application/json` on terminal.
78
- 3. Read `results` / `error` / `proposedInterpretation` from the terminal envelope.
112
+ 3. Read `results` / `error` / `proposedInterpretation` from the 7-field terminal envelope.
79
113
 
80
- If the terminal envelope has `proposedInterpretation` as a string, use `mma-clarifications` to confirm or correct it.
114
+ If `proposedInterpretation` is a string (not the `not_applicable` sentinel), use `mma-clarifications` to confirm/correct.
81
115
 
82
- ### Diagnosing slow tasks
116
+ ## Common pitfalls
83
117
 
84
- Start the server with `mmagent serve --verbose` (or set `diagnostics.verbose: true` in config) to record `tool_call` and `llm_turn` events. Then tail them:
118
+ ❌ **Defaulting to inline Agent dispatch when mmagent is up.** mmagent workers cost ~10× less and don't pollute main context. **Why:** every inline tool call burns flagship-model tokens; that's exactly what mmagent exists to avoid.
85
119
 
86
- ```bash
87
- mmagent logs --follow --batch=$BATCH_ID
88
- ```
120
+ ❌ **Picking `mma-delegate` when a more specific skill fits.** Audit / review / verify / debug / investigate workers know their route's defaults and emit structured reports. **Why:** specialized skills require less input and produce richer output.
121
+
122
+ ❌ **Starting an investigation that needs to write code.** `mma-investigate` is read-only. **Fix:** dispatch `mma-delegate` with research-then-edit framing, or split: investigate → digest → edit.
123
+
124
+ ## Diagnosing slow tasks
125
+
126
+ `mmagent serve --verbose` (or `diagnostics.verbose: true` in config) records `tool_call`, `turn_complete`, and `heartbeat` events. Tail with `mmagent logs --follow --batch=$BATCH_ID`.
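
Note on the "General flow" section in the SKILL.md diff above: the poll-until-terminal contract (`202 text/plain` while pending, `200 application/json` on terminal, Bearer token on every request) could be sketched roughly as follows. This is a minimal illustration, not the package's client code. The names `HttpGet`, `classifyPoll`, `needsClarification`, and `pollBatch`, the default interval and attempt cap, and the treatment of the `not_applicable` sentinel as a literal string are all assumptions made for the example; only the route, status codes, header, and the three envelope field names come from the diff.

```typescript
// Sketch only: helper names and option defaults are hypothetical.
// Taken from the diff: GET /batch/:id, 202 while pending, 200 on terminal,
// the Bearer header, and the results / error / proposedInterpretation fields.

interface HttpResponse {
  status: number;
  text(): Promise<string>;
  json(): Promise<unknown>;
}

// Injected transport so the sketch does not depend on a specific HTTP client.
type HttpGet = (url: string, headers: Record<string, string>) => Promise<HttpResponse>;

interface TerminalEnvelope {
  results?: unknown;
  error?: unknown;
  proposedInterpretation?: unknown;
  [field: string]: unknown; // the other envelope fields are not listed in this diff
}

type PollState = "pending" | "terminal" | "unexpected";

// Map an HTTP status code to the polling state described in "General flow".
function classifyPoll(status: number): PollState {
  if (status === 202) return "pending";
  if (status === 200) return "terminal";
  return "unexpected";
}

// True when the envelope carries an interpretation to confirm or correct.
// Assumption: the sentinel is the literal string "not_applicable".
function needsClarification(envelope: TerminalEnvelope): boolean {
  const proposed = envelope.proposedInterpretation;
  return typeof proposed === "string" && proposed !== "not_applicable";
}

// Poll GET /batch/:id until the terminal envelope arrives or attempts run out.
async function pollBatch(
  get: HttpGet,
  baseUrl: string,
  batchId: string,
  token: string,
  options: { intervalMs?: number; maxAttempts?: number; onHeadline?: (headline: string) => void } = {},
): Promise<TerminalEnvelope> {
  const { intervalMs = 2000, maxAttempts = 300, onHeadline } = options;
  const url = `${baseUrl}/batch/${encodeURIComponent(batchId)}`;
  const headers = { Authorization: `Bearer ${token}` };
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await get(url, headers);
    const state = classifyPoll(res.status);
    if (state === "terminal") return (await res.json()) as TerminalEnvelope;
    if (state === "unexpected") throw new Error(`GET ${url} returned HTTP ${res.status}`);
    // Pending: the 202 body is the running headline.
    const headline = await res.text();
    if (onHeadline) onHeadline(headline);
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`batch ${batchId} still pending after ${maxAttempts} polls`);
}
```

On a runtime with a built-in `fetch`, the transport argument could be `(url, headers) => fetch(url, { headers })`. Injecting it keeps the loop testable without a running daemon, and the attempt cap stops a stuck batch from polling indefinitely.
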
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@zhixuan92/multi-model-agent",
3
- "version": "3.3.0",
3
+ "version": "3.4.0",
4
4
  "type": "module",
5
5
  "license": "MIT",
6
6
  "description": "Standalone HTTP server for multi-model-agent. Routes tool-invocation work to Claude, Codex, or OpenAI-compatible sub-agents with async-polling REST dispatch and installable skills for Claude Code, Gemini CLI, Codex CLI, and Cursor.",
@@ -52,7 +52,7 @@
52
52
  },
53
53
  "dependencies": {
54
54
  "@asteasolutions/zod-to-openapi": "^8.5.0",
55
- "@zhixuan92/multi-model-agent-core": "^3.3.0",
55
+ "@zhixuan92/multi-model-agent-core": "^3.4.0",
56
56
  "gray-matter": "^4.0.3",
57
57
  "minimist": "^1.2.8",
58
58
  "zod": "^4.0.0"