@zhixuan92/multi-model-agent 3.6.5 → 3.6.7
- package/README.md +2 -2
- package/dist/skills/mma-audit/SKILL.md +12 -6
- package/dist/skills/mma-clarifications/SKILL.md +11 -2
- package/dist/skills/mma-context-blocks/SKILL.md +15 -2
- package/dist/skills/mma-debug/SKILL.md +11 -3
- package/dist/skills/mma-delegate/SKILL.md +11 -2
- package/dist/skills/mma-execute-plan/SKILL.md +10 -1
- package/dist/skills/mma-investigate/SKILL.md +12 -2
- package/dist/skills/mma-retry/SKILL.md +10 -1
- package/dist/skills/mma-review/SKILL.md +9 -1
- package/dist/skills/mma-verify/SKILL.md +9 -1
- package/dist/skills/multi-model-agent/SKILL.md +61 -2
- package/package.json +2 -2
package/README.md
CHANGED
|
@@ -82,7 +82,7 @@ Two ways — pick one:
|
|
|
82
82
|
|
|
83
83
|
```bash
|
|
84
84
|
mmagent serve # 127.0.0.1:7337 by default
|
|
85
|
-
curl -s http://localhost:7337/health # → {"ok":true,"version":"3.6.
|
|
85
|
+
curl -s http://localhost:7337/health # → {"ok":true,"version":"3.6.7",...}
|
|
86
86
|
```
|
|
87
87
|
|
|
88
88
|
For an always-on background install (survives reboots): [launchd / systemd templates](./scripts/README.md).
|
|
@@ -237,7 +237,7 @@ Full design rationale: [DIRECTION.md](https://github.com/zhixuan312/multi-model-
|
|
|
237
237
|
|
|
238
238
|
## What's new
|
|
239
239
|
|
|
240
|
-
Latest: **3.6.
|
|
240
|
+
Latest: **3.6.7** — Telemetry is now permissive on model/client/tool/skill identifiers: schema validates *shape, not vocabulary*. Anthropic 4.x, OpenAI o-series, Bedrock vendor prefixes, OpenRouter `meta-llama/...`, Ollama `llama2:7b`, custom finetunes, MCP tool names from any server — all pass through unchanged instead of being rejected or collapsed to `'other'`. `ModelFamily` enum widened 5 → 12 (added `grok`, `mistral`, `meta`, `qwen`, `zhipu`, `kimi`, `minimax`); `allowlistModel` renamed to `normalizeModelForTelemetry`. Full history: [CHANGELOG](https://github.com/zhixuan312/multi-model-agent/blob/master/CHANGELOG.md).
|
|
241
241
|
|
|
242
242
|
## Full documentation
|
|
243
243
|
|
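The "shape, not vocabulary" rule from the 3.6.7 release note can be illustrated with a small sketch. This is not the package's `normalizeModelForTelemetry`: the function name `sketchNormalizeModel`, the regular expression, the length cap, and the substring hints are all assumptions made for illustration. Only the seven family names are taken from the release note; the five pre-existing families are not listed there, so they are left out.

```typescript
// Hypothetical sketch: check the *shape* of a model identifier and derive a
// coarse family, without rejecting names that are missing from a fixed list.
type SketchFamily =
  | "grok" | "mistral" | "meta" | "qwen" | "zhipu" | "kimi" | "minimax"
  | "other";

// Assumed shape rule: 1-128 chars of letters, digits, and . _ : / - (no spaces).
const MODEL_ID_SHAPE = /^[A-Za-z0-9._:\/-]{1,128}$/;

// Assumed substring hints; a real mapping would likely be more careful.
const FAMILY_HINTS: Array<[string, SketchFamily]> = [
  ["grok", "grok"],
  ["mistral", "mistral"],
  ["llama", "meta"],
  ["qwen", "qwen"],
  ["glm", "zhipu"],
  ["kimi", "kimi"],
  ["minimax", "minimax"],
];

function sketchNormalizeModel(
  id: string
): { model: string; family: SketchFamily } | null {
  if (!MODEL_ID_SHAPE.test(id)) return null; // malformed shape: reject
  const lower = id.toLowerCase();
  let family: SketchFamily = "other";
  for (const [hint, candidate] of FAMILY_HINTS) {
    if (lower.indexOf(hint) !== -1) {
      family = candidate;
      break;
    }
  }
  // The identifier passes through unchanged; only the family is derived.
  return { model: id, family };
}
```

Under these assumptions an unknown but well-formed name such as a custom finetune is kept verbatim and only its family falls back to `"other"`.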
package/dist/skills/mma-audit/SKILL.md
CHANGED
|
@@ -2,15 +2,13 @@
|
|
|
2
2
|
name: mma-audit
|
|
3
3
|
description: >-
|
|
4
4
|
Use when the user asks to audit a document, spec, config, or PR description
|
|
5
|
-
for security, correctness, performance, or style issues — and
|
|
6
|
-
|
|
5
|
+
for security, correctness, performance, or style issues — and 2+ files need
|
|
6
|
+
independent audit passes
|
|
7
7
|
when_to_use: >-
|
|
8
8
|
User asks for a doc/spec/config audit OR a methodology skill
|
|
9
9
|
(superpowers:dispatching-parallel-agents, /security-review) points at one AND
|
|
10
|
-
mmagent is running.
|
|
11
|
-
|
|
12
|
-
source code.
|
|
13
|
-
version: 3.6.5
|
|
10
|
+
mmagent is running. Audit on PROSE/SPEC docs — use mma-review for source code.
|
|
11
|
+
version: 3.6.7
|
|
14
12
|
---
|
|
15
13
|
|
|
16
14
|
# mma-audit
|
|
@@ -74,6 +72,14 @@ BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
|
|
|
74
72
|
|
|
75
73
|
@include _shared/response-shape.md
|
|
76
74
|
|
|
75
|
+
## Best practices
|
|
76
|
+
|
|
77
|
+
This skill is one step in the larger flow described in `multi-model-agent` → "Best practices". Recipes that involve `mma-audit`:
|
|
78
|
+
|
|
79
|
+
- **Recipe A — Audit-iterate-clean.** `mma-audit` → fix → `mma-audit` again. Sequential rounds. Register the doc via `mma-context-blocks` before round 1 and reuse the same ID across all rounds — avoids re-inlining the same content into every audit call.
|
|
80
|
+
|
|
81
|
+
Anti-pattern alert: **`parallel-rounds-same-target`** (AP1). Three parallel audits on the same document re-flag the same issues without seeing each other's fixes. Run rounds sequentially with a fix between each.
|
|
82
|
+
|
|
77
83
|
## Common pitfalls
|
|
78
84
|
|
|
79
85
|
❌ **Auditing source code with `mma-audit`**
|
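Recipe A from the `mma-audit` best practices (audit, fix, audit again, with one context block reused across rounds) can be sketched as a plain loop. The callbacks `auditOnce` and `applyFixes` are hypothetical stand-ins for an `mma-audit` call and the fix step; their signatures are assumptions, not the service's API, and a real caller would make them asynchronous.

```typescript
// Hypothetical sketch of Recipe A: audit -> fix -> audit, one round at a time.
interface AuditRoundDeps {
  auditOnce: (contextBlockId: string) => string[]; // returns open findings
  applyFixes: (findings: string[]) => void;
}

function auditUntilClean(
  contextBlockId: string,
  deps: AuditRoundDeps,
  maxRounds: number
): { rounds: number; remaining: string[] } {
  for (let round = 1; round <= maxRounds; round++) {
    // Same context block ID every round: the doc is registered once.
    const remaining = deps.auditOnce(contextBlockId);
    if (remaining.length === 0 || round === maxRounds) {
      return { rounds: round, remaining };
    }
    // Fix before the next round so round N+1 sees round N's changes.
    deps.applyFixes(remaining);
  }
  return { rounds: 0, remaining: [] }; // only reached when maxRounds < 1
}
```

The loop never issues two audits without a fix in between, which is the behaviour the `parallel-rounds-same-target` alert asks for.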
package/dist/skills/mma-clarifications/SKILL.md
CHANGED
|
@@ -8,8 +8,11 @@ when_to_use: >-
|
|
|
8
8
|
A previous mma-delegate / mma-audit / mma-review / mma-execute-plan /
|
|
9
9
|
mma-debug / mma-investigate terminal envelope has `proposedInterpretation` as
|
|
10
10
|
a string. Read the proposal, decide whether to accept or correct it, then call
|
|
11
|
-
this skill. The batch resumes immediately after the POST returns.
|
|
12
|
-
|
|
11
|
+
this skill. The batch resumes immediately after the POST returns. A string
|
|
12
|
+
`proposedInterpretation` is a hard gate — the batch is paused, not
|
|
13
|
+
informational. The batch will not complete until the caller responds. Treating
|
|
14
|
+
it as advisory is the clarification-as-info anti-pattern (AP5).
|
|
15
|
+
version: 3.6.7
|
|
13
16
|
---
|
|
14
17
|
|
|
15
18
|
# mma-clarifications
|
|
@@ -103,6 +106,12 @@ curl -f --show-error -s -X POST \
|
|
|
103
106
|
|
|
104
107
|
@include _shared/polling.md
|
|
105
108
|
|
|
109
|
+
## Best practices
|
|
110
|
+
|
|
111
|
+
This skill is described in `multi-model-agent` → "Best practices". Clarification resumption is a universal flow — it triggers whenever any batch enters `awaiting_clarification`, regardless of which recipe produced the batch. There is no per-recipe entry to call out.
|
|
112
|
+
|
|
113
|
+
Anti-pattern alert: **`clarification-as-info`** (AP5). A string `proposedInterpretation` is a hard gate, not an FYI. The batch will not complete until the caller responds. Either accept the proposal verbatim or correct it.
|
|
114
|
+
|
|
106
115
|
## Common pitfalls
|
|
107
116
|
|
|
108
117
|
❌ **Confirming a wrong proposal verbatim because "the service knows best"**
|
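The hard-gate rule for `proposedInterpretation` can be expressed as a small decision function. The envelope shape below is hypothetical: only the fact that `proposedInterpretation` is a string when the batch is paused comes from the skill text, while the `terminal` flag and the step names are assumptions for illustration.

```typescript
// Hypothetical envelope shape for a polled batch.
interface PolledEnvelope {
  terminal: boolean; // assumed flag: the batch has nothing more to run
  proposedInterpretation?: string | null;
}

type NextStep =
  | "respond_via_mma_clarifications"
  | "keep_polling"
  | "read_results";

function nextStepFor(envelope: PolledEnvelope): NextStep {
  // A string proposal is a hard gate: the batch stays paused until answered.
  if (typeof envelope.proposedInterpretation === "string") {
    return "respond_via_mma_clarifications";
  }
  return envelope.terminal ? "read_results" : "keep_polling";
}
```

Checking the proposal before anything else is the point of the sketch: a caller that only looks at completion state would fall into the `clarification-as-info` anti-pattern.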
package/dist/skills/mma-context-blocks/SKILL.md
CHANGED
|
@@ -3,14 +3,16 @@ name: mma-context-blocks
|
|
|
3
3
|
description: >-
|
|
4
4
|
Use when a document larger than ~2 KB will be referenced by 2+ subsequent
|
|
5
5
|
mma-* calls — register once, pass the returned ID to each call instead of
|
|
6
|
-
re-uploading the same content
|
|
6
|
+
re-uploading the same content. OR a spec / plan / error log was already
|
|
7
|
+
inlined into one task and is about to be inlined into a second — register on
|
|
8
|
+
the second reference, never the third.
|
|
7
9
|
when_to_use: >-
|
|
8
10
|
A document (spec, plan, codebase summary, prior round's findings, error log)
|
|
9
11
|
larger than ~2 KB will be referenced by two or more mma-* calls in a row.
|
|
10
12
|
Register once here, then pass the ID via `contextBlockIds` on mma-delegate /
|
|
11
13
|
mma-execute-plan / mma-audit / mma-review / mma-verify / mma-debug /
|
|
12
14
|
mma-investigate. Cheaper and faster than inlining the same content N times.
|
|
13
|
-
version: 3.6.
|
|
15
|
+
version: 3.6.7
|
|
14
16
|
---
|
|
15
17
|
|
|
16
18
|
# mma-context-blocks
|
|
@@ -91,6 +93,17 @@ curl -f --show-error -s -X POST \
|
|
|
91
93
|
"http://localhost:$PORT/delegate?cwd=/project"
|
|
92
94
|
```
|
|
93
95
|
|
|
96
|
+
## Best practices
|
|
97
|
+
|
|
98
|
+
This skill is the cross-cutting state mechanism described in `multi-model-agent` → "Best practices". Recipes that use context blocks:
|
|
99
|
+
|
|
100
|
+
- **Recipe A — Audit-iterate-clean.** Register the doc once before round 1; pass round-N's findings block ID into round N+1.
|
|
101
|
+
- **Recipe B — Debug-fix-verify.** Register the failing test output / reproduction log before the debug call; reuse on verify.
|
|
102
|
+
- **Recipe C — Investigate-plan-execute.** Register the plan file before `mma-execute-plan`.
|
|
103
|
+
- **Recipe D — Plan-execute-retry.** No new registration needed — `mma-retry` inherits the original batch's `contextBlockIds`.
|
|
104
|
+
|
|
105
|
+
Anti-pattern alert: **`re-inlined-shared-content`** (AP3). Pasting the same spec into 5 task prompts costs N× tokens. Register once; pass `contextBlockIds`.
|
|
106
|
+
|
|
94
107
|
## Common pitfalls
|
|
95
108
|
|
|
96
109
|
❌ **Inlining the same 50KB spec into every task prompt**
|
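The registration rule ("larger than ~2 KB" and "2+ calls") and the N× cost of re-inlining can be written down as two small helpers. The exact byte threshold and both function names are assumptions for illustration; the skill text only gives the approximate figures.

```typescript
// Assumed reading of "~2 KB": 2048 bytes.
const APPROX_MIN_BYTES = 2 * 1024;

// Register when the content is large enough and will be referenced twice or more.
function shouldRegisterContextBlock(
  sizeBytes: number,
  plannedReferences: number
): boolean {
  return sizeBytes > APPROX_MIN_BYTES && plannedReferences >= 2;
}

// Bytes the caller sends when the same content is pasted into every task.
function inlinedBytes(sizeBytes: number, taskCount: number): number {
  return sizeBytes * taskCount;
}
```

For example, a 50 KB spec inlined into 5 task prompts is sent five times, whereas a registered block is sent once and then referenced by ID.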
package/dist/skills/mma-debug/SKILL.md
CHANGED
|
@@ -10,7 +10,7 @@ when_to_use: >-
|
|
|
10
10
|
read files, reproduce, trace — OR a methodology skill
|
|
11
11
|
(superpowers:systematic-debugging) points at the investigation step. Delegate
|
|
12
12
|
the read/reproduce/trace; the main agent stays on the hypothesis and the fix.
|
|
13
|
-
version: 3.6.
|
|
13
|
+
version: 3.6.7
|
|
14
14
|
---
|
|
15
15
|
|
|
16
16
|
# mma-debug
|
|
@@ -78,6 +78,14 @@ BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
|
|
|
78
78
|
|
|
79
79
|
@include _shared/response-shape.md
|
|
80
80
|
|
|
81
|
+
## Best practices
|
|
82
|
+
|
|
83
|
+
This skill is one step in the larger flow described in `multi-model-agent` → "Best practices". Recipes that involve `mma-debug`:
|
|
84
|
+
|
|
85
|
+
- **Recipe B — Debug-fix-verify.** `mma-debug` → `mma-delegate` (apply fix) → `mma-verify`. Strict order. Register the failing test output / reproduction log as a context block before the debug call; reuse on verify.
|
|
86
|
+
|
|
87
|
+
Anti-pattern alert: **`inline-labor-leakage`** (AP2). If you're about to read 3+ files in main context to "understand the bug," that's the labor we delegate — call `mma-debug` with the hypothesis instead.
|
|
88
|
+
|
|
81
89
|
## Common pitfalls
|
|
82
90
|
|
|
83
91
|
❌ **Vague `problem`**
|
|
@@ -92,9 +100,9 @@ The worker explores blindly, often investigates the wrong area first. **Fix:** e
|
|
|
92
100
|
Debug intentionally bundles `filePaths` for cross-file reasoning. Splitting defeats this. **Fix:** one call with all suspect files; if you really have N independent failures, use `mma-delegate` with N tasks.
|
|
93
101
|
|
|
94
102
|
❌ **Treating `mma-debug` as the fix step**
|
|
95
|
-
Debug investigates and proposes; it doesn't necessarily write the fix.
|
|
103
|
+
Debug investigates and proposes; it doesn't necessarily write the fix. **Fix:** if the worker identifies a fix, dispatch `mma-delegate` to implement it (or write it inline if you understand it).
|
|
96
104
|
|
|
97
105
|
❌ **Skipping when an error message looks self-explanatory**
|
|
98
|
-
Often the obvious cause isn't the real one.
|
|
106
|
+
Often the obvious cause isn't the real one. **Fix:** a 30-second debug pass costs less than a wrong fix that breaks something else.
|
|
99
107
|
|
|
100
108
|
@include _shared/error-handling.md
|
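The strict ordering of Recipe B (`mma-debug`, then `mma-delegate`, then `mma-verify`) can be sketched as a tiny step tracker. The function name and the error behaviour are assumptions for illustration; the service itself is not described as enforcing this order.

```typescript
// The three Recipe B skills, in the order the recipe prescribes.
const RECIPE_B_STEPS = ["mma-debug", "mma-delegate", "mma-verify"] as const;
type RecipeBStep = typeof RECIPE_B_STEPS[number];

// Given the steps already completed, return the next one or "finished".
// Throws when the completed steps do not follow the recipe order.
function nextRecipeBStep(completed: RecipeBStep[]): RecipeBStep | "finished" {
  for (let i = 0; i < completed.length; i++) {
    if (completed[i] !== RECIPE_B_STEPS[i]) {
      throw new Error(
        "out of order: expected " + RECIPE_B_STEPS[i] + ", got " + completed[i]
      );
    }
  }
  return completed.length < RECIPE_B_STEPS.length
    ? RECIPE_B_STEPS[completed.length]
    : "finished";
}
```

A caller-side guard like this makes the "debug is not the fix step" pitfall visible: verification cannot be reached without a delegate step in between.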
package/dist/skills/mma-delegate/SKILL.md
CHANGED
|
@@ -11,14 +11,14 @@ when_to_use: >-
|
|
|
11
11
|
and keep main context free. If a plan file exists → use mma-execute-plan. If
|
|
12
12
|
the task is audit / review / verify / debug / investigate → use the matching
|
|
13
13
|
specialized skill.
|
|
14
|
-
version: 3.6.
|
|
14
|
+
version: 3.6.7
|
|
15
15
|
---
|
|
16
16
|
|
|
17
17
|
# mma-delegate
|
|
18
18
|
|
|
19
19
|
## Overview
|
|
20
20
|
|
|
21
|
-
Dispatch one or more ad-hoc tasks to
|
|
21
|
+
Dispatch one or more ad-hoc tasks to workers concurrently. Each task is an independent instruction with optional file scope, acceptance criteria, and context blocks.
|
|
22
22
|
|
|
23
23
|
**Core principle:** Workers run on cheap providers; the main agent consumes only the structured per-task report. Parallelize freely as long as tasks don't write the same files.
|
|
24
24
|
|
|
@@ -85,6 +85,15 @@ BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
|
|
|
85
85
|
|
|
86
86
|
@include _shared/response-shape.md
|
|
87
87
|
|
|
88
|
+
## Best practices
|
|
89
|
+
|
|
90
|
+
This skill is one step in the larger flow described in `multi-model-agent` → "Best practices". Recipes that involve `mma-delegate`:
|
|
91
|
+
|
|
92
|
+
- **Recipe A (the fix step).** Between audit rounds, `mma-delegate` applies the fix when the change is more than 1-2 lines. Register the spec/audit findings as a context block; pass via `contextBlockIds`.
|
|
93
|
+
- **Recipe B (the apply-fix step).** After `mma-debug` returns a hypothesis, `mma-delegate` applies the fix. Same context block carries forward to `mma-verify`.
|
|
94
|
+
|
|
95
|
+
Anti-pattern alert: **`inline-labor-leakage`** (AP2). If you're reading 3+ files or grepping in main context before dispatching, you're paying flagship-model tokens for labor. Pass the file paths to `mma-delegate` and let the worker read.
|
|
96
|
+
|
|
88
97
|
## Common pitfalls
|
|
89
98
|
|
|
90
99
|
❌ **Two tasks writing the same file in one batch**
|
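The core principle for `mma-delegate` (parallelize freely as long as tasks don't write the same files) suggests a pre-dispatch check. The sketch below is a caller-side helper, not part of the service; the task shape with `label` and `writes` is hypothetical.

```typescript
// Hypothetical task shape: `writes` lists the files a task is expected to modify.
interface SketchTask {
  label: string;
  writes: string[];
}

// Returns every file claimed by more than one task, with the claiming labels.
function findWriteConflicts(tasks: SketchTask[]): Record<string, string[]> {
  // Null-prototype object so file names never collide with built-in keys.
  const owners: Record<string, string[]> = Object.create(null);
  for (const task of tasks) {
    for (const file of task.writes) {
      if (!owners[file]) owners[file] = [];
      owners[file].push(task.label);
    }
  }
  const conflicts: Record<string, string[]> = {};
  for (const file of Object.keys(owners)) {
    if (owners[file].length > 1) conflicts[file] = owners[file];
  }
  return conflicts;
}
```

An empty result means the batch can run in parallel; any entry is a pair of tasks that should be merged or run in separate batches.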
package/dist/skills/mma-execute-plan/SKILL.md
CHANGED
|
@@ -10,7 +10,7 @@ when_to_use: >-
|
|
|
10
10
|
superpowers:subagent-driven-development / superpowers:executing-plans —
|
|
11
11
|
workers are cheaper and don't pollute main context. Task descriptors must
|
|
12
12
|
match plan headings verbatim.
|
|
13
|
-
version: 3.6.
|
|
13
|
+
version: 3.6.7
|
|
14
14
|
---
|
|
15
15
|
|
|
16
16
|
# mma-execute-plan
|
|
@@ -86,6 +86,15 @@ BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
|
|
|
86
86
|
|
|
87
87
|
@include _shared/response-shape.md
|
|
88
88
|
|
|
89
|
+
## Best practices
|
|
90
|
+
|
|
91
|
+
This skill is one step in the larger flow described in `multi-model-agent` → "Best practices". Recipes that involve `mma-execute-plan`:
|
|
92
|
+
|
|
93
|
+
- **Recipe C — Investigate-plan-execute.** `mma-investigate` → write the plan → `mma-execute-plan` → `mma-retry` on failed indices. Register the plan file as a context block before the execute-plan call so it isn't re-inlined into every worker's prompt; retry inherits the same configuration.
|
|
94
|
+
- **Recipe D — Plan-execute-retry (entry point).** `mma-execute-plan` is the producer of the `batchId` that `mma-retry` consumes. When this batch returns mixed `done` / `failed`, the next call is `mma-retry` with failed indices, NOT a re-dispatch.
|
|
95
|
+
|
|
96
|
+
Anti-pattern alert: **`full-batch-redispatch`** (AP4). When the batch returns mixed `done` / `failed`, do NOT re-run the whole task list — use `mma-retry` with the failed indices only. Re-running the whole list re-charges every successful task.
|
|
97
|
+
|
|
89
98
|
## Common pitfalls
|
|
90
99
|
|
|
91
100
|
❌ **Task descriptor doesn't match plan heading verbatim**
|
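The `full-batch-redispatch` alert comes down to selecting only the failed indices. The sketch below assumes zero-based indices and a flat list of per-task statuses; both are assumptions for illustration, and only the status names `done`, `done_with_concerns`, and `failed` come from the skill text.

```typescript
// Status values named in the recipes; the flat array shape is an assumption.
type SketchTaskStatus = "done" | "done_with_concerns" | "failed";

// Indices (assumed zero-based) that a retry should target.
function failedIndices(statuses: SketchTaskStatus[]): number[] {
  const out: number[] = [];
  statuses.forEach((status, index) => {
    if (status === "failed") out.push(index);
  });
  return out;
}

// Number of tasks that would be paid for again by re-running the whole list.
function rechargedByFullRedispatch(statuses: SketchTaskStatus[]): number {
  return statuses.length - failedIndices(statuses).length;
}
```

With 8 tasks of which 2 failed, a retry touches 2 tasks while a full re-dispatch would re-run the other 6 as well.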
package/dist/skills/mma-investigate/SKILL.md
CHANGED
|
@@ -9,8 +9,10 @@ when_to_use: >-
|
|
|
9
9
|
methodology skill, or from your own next-step planning — AND mmagent is
|
|
10
10
|
running. Delegate the read/grep/synthesis to a worker so the main context
|
|
11
11
|
stays on judgment. Codebase only — does not perform web research or
|
|
12
|
-
git-history queries.
|
|
13
|
-
|
|
12
|
+
git-history queries. OR you are about to read 3+ files / run any grep in main
|
|
13
|
+
context — that's the inline-labor-leakage anti-pattern (AP2); delegate to this
|
|
14
|
+
skill instead.
|
|
15
|
+
version: 3.6.7
|
|
14
16
|
---
|
|
15
17
|
|
|
16
18
|
# mma-investigate
|
|
@@ -121,6 +123,14 @@ Each task carries an `investigation` field on its per-task report:
|
|
|
121
123
|
|
|
122
124
|
`workerStatus` is one of `done`, `done_with_concerns`, `needs_context`, `blocked`. When `done_with_concerns`, the per-task report carries `incompleteReason` (`turn_cap`, `cost_cap`, `timeout`, or `missing_sections`). When `needs_context`, the worker flagged a `[needs_context]` bullet under `## Unresolved` — re-dispatch with extra context (anchor paths, a context block, or a clarification turn).
|
|
123
125
|
|
|
126
|
+
## Best practices
|
|
127
|
+
|
|
128
|
+
This skill is one step in the larger flow described in `multi-model-agent` → "Best practices". Recipes that involve `mma-investigate`:
|
|
129
|
+
|
|
130
|
+
- **Recipe C — Investigate-plan-execute.** `mma-investigate` → write the plan → `mma-execute-plan` → `mma-retry`. The investigation produces the synthesis you need to write the plan; the plan becomes a context block for execute-plan.
|
|
131
|
+
|
|
132
|
+
Anti-pattern alert: **`inline-labor-leakage`** (AP2). If you find yourself reading 3+ files or running any grep in main context, that's the trigger to delegate here instead. Main-context tokens cost ~10× more than worker tokens, and you only need the synthesis, not the raw reads.
|
|
133
|
+
|
|
124
134
|
## Common pitfalls
|
|
125
135
|
|
|
126
136
|
❌ **Asking for a fix instead of an answer**
|
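The `workerStatus` values described for the per-task report can be handled with an exhaustive switch. The status and reason names come from the skill text; the report interface, the function name, and the follow-up strings (in particular the handling of `blocked`, which the text does not specify) are assumptions for illustration.

```typescript
type SketchWorkerStatus = "done" | "done_with_concerns" | "needs_context" | "blocked";
type SketchIncompleteReason = "turn_cap" | "cost_cap" | "timeout" | "missing_sections";

// Hypothetical slice of the per-task report.
interface SketchInvestigationReport {
  workerStatus: SketchWorkerStatus;
  incompleteReason?: SketchIncompleteReason;
}

function followUpFor(report: SketchInvestigationReport): string {
  switch (report.workerStatus) {
    case "done":
      return "use the synthesis";
    case "done_with_concerns":
      // The report carries the reason the worker stopped early.
      return "use with care; incomplete because of " +
        (report.incompleteReason ?? "unknown");
    case "needs_context":
      // Anchor paths, a context block, or a clarification turn.
      return "re-dispatch with extra context";
    case "blocked":
      return "escalate to the main session"; // assumed handling
  }
}
```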
package/dist/skills/mma-retry/SKILL.md
CHANGED
|
@@ -10,7 +10,7 @@ when_to_use: >-
|
|
|
10
10
|
you want to re-try the failed indices only. Prefer this over re-dispatching
|
|
11
11
|
the whole batch or inline-retrying — it's idempotent and preserves the
|
|
12
12
|
original batch's diagnostics.
|
|
13
|
-
version: 3.6.
|
|
13
|
+
version: 3.6.7
|
|
14
14
|
---
|
|
15
15
|
|
|
16
16
|
# mma-retry
|
|
@@ -88,6 +88,15 @@ BATCH_ID=$(echo "$BATCH" | jq -r '.batchId') # NEW batchId — not the origina
|
|
|
88
88
|
|
|
89
89
|
@include _shared/response-shape.md
|
|
90
90
|
|
|
91
|
+
## Best practices
|
|
92
|
+
|
|
93
|
+
This skill is one step in the larger flow described in `multi-model-agent` → "Best practices". Recipes that involve `mma-retry`:
|
|
94
|
+
|
|
95
|
+
- **Recipe C — Investigate-plan-execute (last step).** After `mma-execute-plan` returns mixed results, retry the failed indices to close the loop.
|
|
96
|
+
- **Recipe D — Plan-execute-retry.** Pass the **original `batchId`** as input, specify the failed indices, keep the same configuration. `mma-retry` produces a NEW `batchId` in its response — poll that one for terminal state. Any `contextBlockIds` from the original carry forward.
|
|
97
|
+
|
|
98
|
+
Anti-pattern alert: **`full-batch-redispatch`** (AP4). Re-dispatching the entire batch re-charges every successful task. Always retry by index.
|
|
99
|
+
|
|
91
100
|
## Common pitfalls
|
|
92
101
|
|
|
93
102
|
❌ **Retrying after the batch expired**
|
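The batch-ID handling in Recipe D (pass the original `batchId` in, poll the new one that comes back) is easy to get backwards, so a sketch may help. The request and response shapes are hypothetical: `batchId` is the only field name taken from the skill text, and `taskIndices` is an invented name for the failed-index list.

```typescript
// Hypothetical request/response shapes for a retry call.
interface SketchRetryRequest {
  batchId: string; // the ORIGINAL batch being retried
  taskIndices: number[]; // assumed field name for the failed indices
}
interface SketchRetryResponse {
  batchId: string; // the NEW batch created by the retry
}

function buildRetryRequest(
  originalBatchId: string,
  failed: number[]
): SketchRetryRequest {
  if (failed.length === 0) throw new Error("nothing to retry");
  return { batchId: originalBatchId, taskIndices: failed.slice() };
}

// After the retry call returns, poll the new ID, never the original.
function batchIdToPoll(
  originalBatchId: string,
  response: SketchRetryResponse
): string {
  if (response.batchId === originalBatchId) {
    throw new Error("expected a new batchId from retry");
  }
  return response.batchId;
}
```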
package/dist/skills/mma-review/SKILL.md
CHANGED
|
@@ -10,7 +10,7 @@ when_to_use: >-
|
|
|
10
10
|
AND mmagent is running. Delegate so each file reviews on its own worker; the
|
|
11
11
|
main agent only decides what to merge. Review on SOURCE CODE — use mma-audit
|
|
12
12
|
for prose specs / configs.
|
|
13
|
-
version: 3.6.
|
|
13
|
+
version: 3.6.7
|
|
14
14
|
---
|
|
15
15
|
|
|
16
16
|
# mma-review
|
|
@@ -75,6 +75,14 @@ BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
|
|
|
75
75
|
|
|
76
76
|
@include _shared/response-shape.md
|
|
77
77
|
|
|
78
|
+
## Best practices
|
|
79
|
+
|
|
80
|
+
This skill is one step in the larger flow described in `multi-model-agent` → "Best practices". Recipes that involve `mma-review`:
|
|
81
|
+
|
|
82
|
+
- **Recipe A (analog) — Review-iterate-clean.** `mma-review` → fix → `mma-review` again. Same shape as the audit recipe, applied to source code. Sequential rounds; register the file(s) via `mma-context-blocks` before round 1 and reuse the same ID across rounds.
|
|
83
|
+
|
|
84
|
+
Anti-pattern alert: **`parallel-rounds-same-target`** (AP1). Three parallel reviews of the same source file re-flag the same issues. Run rounds sequentially with a fix between each.
|
|
85
|
+
|
|
78
86
|
## Common pitfalls
|
|
79
87
|
|
|
80
88
|
❌ **Reviewing a plan/spec markdown with `mma-review`**
|
package/dist/skills/mma-verify/SKILL.md
CHANGED
|
@@ -10,7 +10,7 @@ when_to_use: >-
|
|
|
10
10
|
against implemented work BEFORE claiming success. Delegate so each checklist
|
|
11
11
|
item gets independent evidence-gathering on a worker. Use this BEFORE saying
|
|
12
12
|
"done" — never after.
|
|
13
|
-
version: 3.6.
|
|
13
|
+
version: 3.6.7
|
|
14
14
|
---
|
|
15
15
|
|
|
16
16
|
# mma-verify
|
|
@@ -76,6 +76,14 @@ BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
|
|
|
76
76
|
|
|
77
77
|
@include _shared/response-shape.md
|
|
78
78
|
|
|
79
|
+
## Best practices
|
|
80
|
+
|
|
81
|
+
This skill is one step in the larger flow described in `multi-model-agent` → "Best practices". Recipes that involve `mma-verify`:
|
|
82
|
+
|
|
83
|
+
- **Recipe B — Debug-fix-verify.** `mma-debug` → `mma-delegate` (fix) → `mma-verify`. Verify checks acceptance criteria against the implemented work. Reuse the context block registered for the debug call.
|
|
84
|
+
|
|
85
|
+
Anti-pattern alert: **`parallel-rounds-same-target`** (AP1, verify analog). Two parallel `mma-verify` calls on the unchanged checklist re-flag the same gaps. Run verify → fix → re-verify sequentially instead.
|
|
86
|
+
|
|
79
87
|
## Common pitfalls
|
|
80
88
|
|
|
81
89
|
❌ **Vague checklist items**
|
package/dist/skills/multi-model-agent/SKILL.md
CHANGED
|
@@ -11,14 +11,14 @@ when_to_use: >-
|
|
|
11
11
|
tasks — AND mmagent is running. Read this once, pick the matching mma-* skill,
|
|
12
12
|
and delegate there. Applies equally whether the user invoked a superpowers
|
|
13
13
|
methodology skill or asked directly.
|
|
14
|
-
version: 3.6.
|
|
14
|
+
version: 3.6.7
|
|
15
15
|
---
|
|
16
16
|
|
|
17
17
|
# multi-model-agent (router)
|
|
18
18
|
|
|
19
19
|
## Overview
|
|
20
20
|
|
|
21
|
-
Local HTTP service that fans out tool-using work to
|
|
21
|
+
Local HTTP service that fans out tool-using work to workers on different LLM providers (Claude, OpenAI-compatible, Codex). Workers run on cheap models; the main agent stays on judgment.
|
|
22
22
|
|
|
23
23
|
**Core principle:** Pick the most specific `mma-*` skill that fits the task. Specificity reduces input — specialized skills know their route, schema, and defaults so you write less.
|
|
24
24
|
|
|
@@ -68,6 +68,65 @@ digraph picker {
|
|
|
68
68
|
| `mma-context-blocks` | Register a reused doc once; reference by ID across N tasks |
|
|
69
69
|
| `mma-clarifications` | Confirm or correct the service's proposed interpretation |
|
|
70
70
|
|
|
71
|
+
## Best practices
|
|
72
|
+
|
|
73
|
+
### The unifying principle
|
|
74
|
+
|
|
75
|
+
The main session is for judgment, orchestration, and dialogue with the engineer. Everything else — read, grep, audit, review, debug, implement, verify — gets delegated. If you're about to do labor in main context, you've already taken the wrong turn.
|
|
76
|
+
|
|
77
|
+
### Judgment vs labor — what NEVER delegates
|
|
78
|
+
|
|
79
|
+
Labor handles work whose answer is findable from the inputs. Main session keeps work whose answer is **judgment** — there is no "right answer" a worker could discover:
|
|
80
|
+
|
|
81
|
+
- **Brainstorming** — exploring the problem space with the engineer before a spec exists.
|
|
82
|
+
- **Spec writing** — deciding what to build, what success looks like, what's out of scope.
|
|
83
|
+
- **Plan writing** — turning a spec into ordered, testable steps with the right decomposition.
|
|
84
|
+
- **Architecture and design decisions** — choosing the shape of the solution.
|
|
85
|
+
- **Final approval / merge decisions** — what ships.
|
|
86
|
+
- **Dialogue with the engineer** — clarifying intent, negotiating tradeoffs, answering "should we?".
|
|
87
|
+
|
|
88
|
+
The test: *if a worker can produce the answer from the given inputs, delegate; if the answer requires deciding what the inputs should be, it's main-session work.* Recipes A–D all keep these judgment steps in main context (e.g., Recipe C explicitly: `mma-investigate` → **write the plan (main)** → `mma-execute-plan`).
|
|
89
|
+
|
|
90
|
+
### C1 — Delegate by default, inline by exception
|
|
91
|
+
|
|
92
|
+
If a task needs 3+ file reads or any grep, it goes to a worker. Inline `Read` is reserved for files already in context, single-file lookups, or 1-2 file reads with a known target.
|
|
93
|
+
|
|
94
|
+
### C2 — Parallel for independence, sequential for iteration
|
|
95
|
+
|
|
96
|
+
Independent fan-out (5 unrelated audits, 5 unrelated bugs) → parallel batch. Coupled rounds where round N's fix produces round N+1's input (audit → fix → re-audit, debug → fix → verify) → sequential.
|
|
97
|
+
|
|
98
|
+
### C3 — Shared content lives in a context block, not in caller tokens
|
|
99
|
+
|
|
100
|
+
Any artifact (spec, plan, prior-round findings, long error log) that crosses 2+ calls gets registered once via `mma-context-blocks` and referenced by ID.
|
|
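Rules C1 and C2 above are simple enough to state as predicates. The sketch below is a caller-side illustration; the `PlannedWork` shape and both function names are assumptions, while the thresholds (3+ file reads, any grep) come from C1.

```typescript
// Hypothetical description of the work the main session is about to do.
interface PlannedWork {
  fileReads: number;
  usesGrep: boolean;
}

// C1: 3+ file reads or any grep goes to a worker.
function shouldDelegate(work: PlannedWork): boolean {
  return work.fileReads >= 3 || work.usesGrep;
}

type DispatchMode = "parallel" | "sequential";

// C2: coupled rounds run sequentially; independent fan-out runs in parallel.
function dispatchModeFor(nextRoundDependsOnPreviousFix: boolean): DispatchMode {
  return nextRoundDependsOnPreviousFix ? "sequential" : "parallel";
}
```

Five unrelated audits would map to `"parallel"`, while audit, fix, re-audit on one document would map to `"sequential"`.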
101
|
+
|
|
102
|
+
### Recipe A — Audit-iterate-clean
|
|
103
|
+
|
|
104
|
+
`mma-audit` → read findings → fix (inline if 1-2 lines, else `mma-delegate`) → `mma-audit` again. Sequential rounds, NOT parallel re-audits. The fix produces new edges; round 2 catches what round 1 couldn't see. Register the doc as a context block before round 1; reuse the same ID across all rounds. The same shape applies to `mma-review` for source code (review → fix → re-review).
|
|
105
|
+
|
|
106
|
+
### Recipe B — Debug-fix-verify
|
|
107
|
+
|
|
108
|
+
`mma-debug` (read/reproduce/trace) → `mma-delegate` (apply the fix the hypothesis implies) → `mma-verify` (acceptance criteria checked independently). Three skills, strict order. Register the failing test output / reproduction log as a context block before the debug call; reuse it on verify.
|
|
109
|
+
|
|
110
|
+
### Recipe C — Investigate-plan-execute
|
|
111
|
+
|
|
112
|
+
`mma-investigate` (codebase Q&A) → write the plan (main-context judgment task) → `mma-execute-plan` (workers implement against named plan headings) → `mma-retry` on any failed indices. Register the plan file as a context block before execute-plan; the retry call inherits the same configuration including `contextBlockIds`.
|
|
113
|
+
|
|
114
|
+
### Recipe D — Plan-execute-retry
|
|
115
|
+
|
|
116
|
+
When `mma-execute-plan` returns mixed `done` / `done_with_concerns` / `failed`, the next step is `mma-retry` on the failed indices only — never a full-batch re-dispatch. Pass the **original `batchId`** as input, specify the failed task indices, keep the same configuration. (`mma-retry` produces a NEW `batchId` in its response — poll that one for terminal state, not the original.) Any `contextBlockIds` registered for the original batch carry forward into retry — no need to re-register.
|
|
117
|
+
|
|
118
|
+
### Anti-patterns
|
|
119
|
+
|
|
120
|
+
1. **`parallel-rounds-same-target`** — Caller fans out 3 parallel calls of the same skill on the same target — `mma-audit` on one document, `mma-review` on the same source file, or `mma-verify` of the same checklist. The reports overlap heavily; later rounds never see the fix from earlier rounds, so they re-flag the same issues. Corrective: sequential rounds with a fix between each (Recipe A for audit/review; for `mma-verify`, the analog is verify → fix → re-verify, never two verify calls on the unchanged checklist in parallel).
|
|
121
|
+
|
|
122
|
+
2. **`inline-labor-leakage`** — Caller does 3+ `Read` calls, or any `grep`, in main context "just to understand the situation." Main tokens get burned on labor; the answer the caller actually needs is one paragraph of synthesis. Corrective: `mma-investigate` for codebase Q&A; if the goal is implementation, jump straight to `mma-delegate` with file paths and let the worker read.
|
|
123
|
+
|
|
124
|
+
3. **`re-inlined-shared-content`** — Caller pastes the same spec / plan / error log into 5 task prompts in one batch (or across rounds). Token cost scales linearly with N. Corrective: `mma-context-blocks` register once, pass `contextBlockIds` to every task. C3 fires the moment the same content is referenced a second time.
|
|
125
|
+
|
|
126
|
+
4. **`full-batch-redispatch`** — Caller re-runs `mma-execute-plan` with the entire task list when only 2 of 8 tasks failed. The 6 successful tasks get re-charged. Corrective: `mma-retry` with the failed indices. (The same anti-pattern applies to multi-task `mma-delegate` batches; `mma-retry` is the corrective there too.)
|
|
127
|
+
|
|
128
|
+
5. **`clarification-as-info`** — Caller polls a batch, sees `proposedInterpretation` as a string, treats it as advisory, and waits for the batch to complete. The batch is paused — it will not complete until the caller responds via `mma-clarifications`. Corrective: a string `proposedInterpretation` is a hard gate, not an FYI. Either accept the proposal verbatim or correct it.
|
|
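Anti-pattern 1 (`parallel-rounds-same-target`) can be caught before dispatch by looking for repeated skill and target pairs in one parallel batch. This is a hypothetical caller-side check, not something the service is described as doing; the `PlannedCall` shape and the function name are assumptions.

```typescript
// Hypothetical description of one call in a parallel batch.
interface PlannedCall {
  skill: string; // e.g. "mma-audit"
  target: string; // the document, file, or checklist being examined
}

// Returns "skill on target" for every pair that appears more than once.
function parallelDuplicates(batch: PlannedCall[]): string[] {
  const counts: Record<string, number> = Object.create(null);
  for (const call of batch) {
    const key = call.skill + " on " + call.target;
    counts[key] = (counts[key] || 0) + 1;
  }
  return Object.keys(counts).filter((key) => counts[key] > 1);
}
```

Any entry in the result is a candidate for sequential rounds with a fix between each, as the corrective for anti-pattern 1 describes.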
129
|
+
|
|
71
130
|
## Preflight: auto-start the daemon if it is not running
|
|
72
131
|
|
|
73
132
|
```bash
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@zhixuan92/multi-model-agent",
|
|
3
|
-
"version": "3.6.
|
|
3
|
+
"version": "3.6.7",
|
|
4
4
|
"type": "module",
|
|
5
5
|
"license": "MIT",
|
|
6
6
|
"description": "Standalone HTTP server for multi-model-agent. Routes tool-invocation work to Claude, Codex, or OpenAI-compatible sub-agents with async-polling REST dispatch and installable skills for Claude Code, Gemini CLI, Codex CLI, and Cursor.",
|
|
@@ -52,7 +52,7 @@
|
|
|
52
52
|
},
|
|
53
53
|
"dependencies": {
|
|
54
54
|
"@asteasolutions/zod-to-openapi": "^8.5.0",
|
|
55
|
-
"@zhixuan92/multi-model-agent-core": "^3.6.
|
|
55
|
+
"@zhixuan92/multi-model-agent-core": "^3.6.7",
|
|
56
56
|
"gray-matter": "^4.0.3",
|
|
57
57
|
"minimist": "^1.2.8",
|
|
58
58
|
"proper-lockfile": "^4.1.2",
|