@zhixuan92/multi-model-agent 4.4.0 → 4.5.0
- package/README.md +6 -8
- package/dist/skills/mma-audit/SKILL.md +1 -1
- package/dist/skills/mma-context-blocks/SKILL.md +1 -1
- package/dist/skills/mma-debug/SKILL.md +1 -1
- package/dist/skills/mma-delegate/SKILL.md +1 -1
- package/dist/skills/mma-execute-plan/SKILL.md +1 -1
- package/dist/skills/mma-explore/SKILL.md +1 -1
- package/dist/skills/mma-investigate/SKILL.md +1 -1
- package/dist/skills/mma-research/SKILL.md +1 -1
- package/dist/skills/mma-retry/SKILL.md +1 -1
- package/dist/skills/mma-review/SKILL.md +1 -1
- package/dist/skills/multi-model-agent/SKILL.md +1 -1
- package/package.json +2 -2
package/README.md
CHANGED
@@ -88,7 +88,7 @@ Two ways — pick one:
 
 ```bash
 mmagent serve # 127.0.0.1:7337 by default
-curl -s http://localhost:7337/health # → {"ok":true,"version":"4.
+curl -s http://localhost:7337/health # → {"ok":true,"version":"4.5.0",...}
 ```
 
 For an always-on background install (survives reboots): [launchd / systemd templates](./scripts/README.md).
@@ -287,14 +287,12 @@ Full design rationale: [DIRECTION.md](https://github.com/zhixuan312/multi-model-
 | TLS `handshake_failure` to a known-good telemetry endpoint | Local DNS cache is stale. `sudo dscacheutil -flushcache && sudo killall -HUP mDNSResponder` (macOS); restart the daemon so its Node process re-resolves |
 | Local telemetry queue stops draining | Daemon's flusher is in exponential backoff after a transport failure (capped at 1 hr). Restart the daemon to force an immediate boot-flush |
 
-## What's new in 4.
+## What's new in 4.5.0
 
-- **
-- **
-- **
--
-- **LLM `verify` tool removed.** Verification is `verifyCommand` only (deterministic shell command run after the worker).
-- **Telemetry clamp ceilings raised** for 2026-era usage: per-stage input/cached `5M → 100M`, output `500K → 2M`, per-stage cost `$100 → $500`, per-task cost `$800 → $5000`.
+- **Commit from real git diff, not worker self-report.** The commit gate now reads `filesChanged` from `git diff --name-only <preTaskHeadSha>` + filtered untracked files (snapshot-diffed against a `preTaskUntrackedFiles` set captured at task entry), not from the worker's JSON. Three source values surface in telemetry: `git_diff` (authoritative), `self_report` (non-git cwd, count-only), `git_error` (degraded, no commit). Eliminates the "files written but `filesChanged: []` so commit skipped" failure mode.
+- **Progress-watchdog with three signals.** Arms an interval poller around `delegateWithEscalation()` and `rework` `session.send()`. (1) Wall-clock thrash → `controller.abort()` when `wallClockMs > thrashWallClockMs` AND `git diff` empty (default 20 min). (2) Turn-count thrash → `turnsUsed > thrashTurns` AND diff empty (default 25 turns). (3) Scope-violation → any file in the real diff outside the brief's declared scope. Skip gates: read-only tools, non-git cwd, `defaults.progressWatchdogEnabled: false`. A thrashing worker can no longer burn the full budget producing nothing.
+- **Telemetry envelope fixes.** `subtype` reaches the wire (4.4.0 landed it on the HTTP envelope only); `annotating` stage emits a deterministic `stageStats` entry so per-stage dashboards stop showing a gap; per-stage `mainEquivalentCostUSD` attached for the Lite-page per-model savings slice.
+- **Seven new observability events** for diff resolution + watchdog signals: `real_diff_resolved`, `real_diff_self_report_fallback`, `real_diff_git_error`, `progress_watchdog_armed` / `_warn` / `_fired_thrash` / `_scope_violation` / `_disarmed`.
 
 Full history: [CHANGELOG](https://github.com/zhixuan312/multi-model-agent/blob/master/CHANGELOG.md).
 
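For illustration, the `filesChanged` resolution described in the 4.5.0 notes above could look roughly like the following. This is a minimal sketch under stated assumptions, not the package's implementation: `resolveFilesChanged`, `DiffInputs`, `ResolvedDiff`, and the `mayCommit` flag are hypothetical names. Only the three source labels, `preTaskUntrackedFiles`, and the `git diff --name-only <preTaskHeadSha>` command come from the README text. The unspecified untracked-file filter is left out, and the caller is assumed to have already run git and captured its output.

```typescript
// The three labels are taken from the README; everything else is hypothetical.
type DiffSource = "git_diff" | "self_report" | "git_error";

interface ResolvedDiff {
  source: DiffSource;
  filesChanged: string[]; // populated only when source is "git_diff"
  fileCount: number;
  mayCommit: boolean; // assumption: commit only on a non-empty real diff
}

interface DiffInputs {
  isGitRepo: boolean;
  // stdout of `git diff --name-only <preTaskHeadSha>`, or null if git failed
  gitDiffOutput: string | null;
  untrackedNow: string[]; // untracked files observed after the worker ran
  preTaskUntrackedFiles: Set<string>; // snapshot captured at task entry
  selfReportedFiles: string[]; // what the worker's JSON claimed
}

function resolveFilesChanged(input: DiffInputs): ResolvedDiff {
  if (!input.isGitRepo) {
    // Non-git cwd: fall back to the worker's claim, count only.
    return {
      source: "self_report",
      filesChanged: [],
      fileCount: input.selfReportedFiles.length,
      mayCommit: false,
    };
  }
  if (input.gitDiffOutput === null) {
    // Degraded path: git failed, so no commit is attempted.
    return { source: "git_error", filesChanged: [], fileCount: 0, mayCommit: false };
  }
  const tracked = input.gitDiffOutput
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  // Snapshot diff: keep only untracked files that did not exist at task entry.
  const newUntracked = input.untrackedNow.filter(
    (file) => !input.preTaskUntrackedFiles.has(file),
  );
  const files = Array.from(new Set(tracked.concat(newUntracked))).sort();
  return {
    source: "git_diff",
    filesChanged: files,
    fileCount: files.length,
    mayCommit: files.length > 0,
  };
}
```

Keeping the function free of I/O is a choice made for this sketch so that the three outcomes can be exercised without a repository. A worker that reports `filesChanged: []` still yields a commit candidate as long as the real diff is non-empty.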
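The three watchdog signals listed in the 4.5.0 notes can be read as a single decision made on each poll. The sketch below is one possible reading, not the package's code: `evaluateWatchdog`, `WatchdogSample`, `WatchdogConfig`, the signal labels, the prefix-based scope check, and the order in which the signals are tested are all assumptions. The thresholds `thrashWallClockMs` and `thrashTurns` and their defaults (20 min, 25 turns) are taken from the README text.

```typescript
// Hypothetical signal labels; the README names the conditions, not these strings.
type WatchdogSignal = "thrash_wall_clock" | "thrash_turns" | "scope_violation" | null;

interface WatchdogConfig {
  thrashWallClockMs: number;
  thrashTurns: number;
  declaredScope: string[]; // assumption: scope expressed as path prefixes
}

interface WatchdogSample {
  wallClockMs: number;
  turnsUsed: number;
  realDiffFiles: string[]; // files in the real git diff at poll time
}

// Defaults as stated in the README: 20 minutes and 25 turns.
const DEFAULT_THRESHOLDS = { thrashWallClockMs: 20 * 60 * 1000, thrashTurns: 25 };

function evaluateWatchdog(sample: WatchdogSample, config: WatchdogConfig): WatchdogSignal {
  // Scope violation: any changed file outside every declared prefix.
  const outOfScope = sample.realDiffFiles.some(
    (file) => !config.declaredScope.some((prefix) => file.startsWith(prefix)),
  );
  if (outOfScope) return "scope_violation";

  // Both thrash signals require that the worker has produced no diff at all.
  const diffEmpty = sample.realDiffFiles.length === 0;
  if (diffEmpty && sample.wallClockMs > config.thrashWallClockMs) return "thrash_wall_clock";
  if (diffEmpty && sample.turnsUsed > config.thrashTurns) return "thrash_turns";
  return null;
}
```

In a poller, a non-null result would presumably be the point at which `controller.abort()` is called. A worker that is slow but has already changed an in-scope file returns `null` here, which matches the README's "AND diff empty" condition.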
package/dist/skills/mma-audit/SKILL.md
CHANGED
@@ -12,7 +12,7 @@ when_to_use: >-
 (superpowers:dispatching-parallel-agents, /security-review) points at one AND
 mmagent is running. Audit on PROSE/SPEC docs — use mma-review for source code.
 Audit a CODE-EXECUTION PLAN against the codebase — use subtype=plan.
-version: 4.
+version: 4.5.0
 ---
 
 # mma-audit
package/dist/skills/mma-context-blocks/SKILL.md
CHANGED
@@ -12,7 +12,7 @@ when_to_use: >-
 Register once here, then pass the ID via `contextBlockIds` on mma-delegate /
 mma-execute-plan / mma-audit / mma-review / mma-debug / mma-investigate.
 Cheaper and faster than inlining the same content N times.
-version: 4.
+version: 4.5.0
 ---
 
 # mma-context-blocks
package/dist/skills/mma-debug/SKILL.md
CHANGED
@@ -10,7 +10,7 @@ when_to_use: >-
 read files, reproduce, trace — OR a methodology skill
 (superpowers:systematic-debugging) points at the investigation step. Delegate
 the read/reproduce/trace; the main agent stays on the hypothesis and the fix.
-version: 4.
+version: 4.5.0
 ---
 
 # mma-debug
package/dist/skills/mma-delegate/SKILL.md
CHANGED
@@ -11,7 +11,7 @@ when_to_use: >-
 and keep main context free. If a plan file exists → use mma-execute-plan. If
 the task is audit / review / verify / debug / investigate → use the matching
 specialized skill.
-version: 4.
+version: 4.5.0
 ---
 
 # mma-delegate
package/dist/skills/mma-execute-plan/SKILL.md
CHANGED
@@ -10,7 +10,7 @@ when_to_use: >-
 superpowers:subagent-driven-development / superpowers:executing-plans —
 workers are cheaper and don't pollute main context. Task descriptors must
 match plan headings verbatim.
-version: 4.
+version: 4.5.0
 ---
 
 # mma-execute-plan
package/dist/skills/mma-explore/SKILL.md
CHANGED
@@ -12,7 +12,7 @@ when_to_use: >-
 out mma-investigate (internal) + mma-research (external) in parallel and
 synthesise the results yourself. DO NOT use for convergent single-answer
 questions — those are mma-investigate.
-version: 4.
+version: 4.5.0
 ---
 
 # mma-explore
package/dist/skills/mma-investigate/SKILL.md
CHANGED
@@ -12,7 +12,7 @@ when_to_use: >-
 git-history queries. OR you are about to read 3+ files / run any grep in main
 context — that's the inline-labor-leakage anti-pattern (AP2); delegate to this
 skill instead.
-version: 4.
+version: 4.5.0
 ---
 
 # mma-investigate
package/dist/skills/mma-research/SKILL.md
CHANGED
@@ -10,7 +10,7 @@ when_to_use: >-
 others do, what published methods exist) AND mmagent is running. Delegate the
 multi-source web/adapter research to a worker so the main context stays on
 judgment. NOT for codebase questions — those are mma-investigate.
-version: 4.
+version: 4.5.0
 ---
 
 # mma-research
package/dist/skills/mma-review/SKILL.md
CHANGED
@@ -10,7 +10,7 @@ when_to_use: >-
 AND mmagent is running. Delegate so each file reviews on its own worker; the
 main agent only decides what to merge. Review on SOURCE CODE — use mma-audit
 for prose specs / configs.
-version: 4.
+version: 4.5.0
 ---
 
 # mma-review
package/dist/skills/multi-model-agent/SKILL.md
CHANGED
@@ -11,7 +11,7 @@ when_to_use: >-
 tasks — AND mmagent is running. Read this once, pick the matching mma-* skill,
 and delegate there. Applies equally whether the user invoked a superpowers
 methodology skill or asked directly.
-version: 4.
+version: 4.5.0
 ---
 
 # multi-model-agent (router)
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "@zhixuan92/multi-model-agent",
-"version": "4.
+"version": "4.5.0",
 "type": "module",
 "license": "MIT",
 "description": "Standalone HTTP server for multi-model-agent. Routes tool-invocation work to Claude, Codex, or OpenAI-compatible sub-agents with async-polling REST dispatch and installable skills for Claude Code, Gemini CLI, Codex CLI, and Cursor.",
@@ -53,7 +53,7 @@
 },
 "dependencies": {
 "@asteasolutions/zod-to-openapi": "^8.5.0",
-"@zhixuan92/multi-model-agent-core": "^4.
+"@zhixuan92/multi-model-agent-core": "^4.5.0",
 "gray-matter": "^4.0.3",
 "minimist": "^1.2.8",
 "proper-lockfile": "^4.1.2",