@zhixuan92/multi-model-agent 3.8.1 → 3.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -82,7 +82,7 @@ Two ways — pick one:
82
82
 
83
83
  ```bash
84
84
  mmagent serve # 127.0.0.1:7337 by default
85
- curl -s http://localhost:7337/health # → {"ok":true,"version":"3.8.1",...}
85
+ curl -s http://localhost:7337/health # → {"ok":true,"version":"3.9.0",...}
86
86
  ```
87
87
 
88
88
  For an always-on background install (survives reboots): [launchd / systemd templates](./scripts/README.md).
@@ -187,8 +187,8 @@ Every `defaults` knob has a built-in. Override only when you need to.
187
187
 
188
188
  | Field | Default | What it does |
189
189
  |---|---|---|
190
- | `defaults.timeoutMs` | `1800000` (30 min) | Hard task-level wall-clock cap |
191
- | `defaults.stallTimeoutMs` | `600000` (10 min) | Aborts in-flight runs idle for this long |
190
+ | `defaults.timeoutMs` | `3600000` (60 min) | Hard task-level wall-clock cap (bumped from 30 min in 3.9.0) |
191
+ | `defaults.stallTimeoutMs` | `1200000` (20 min) | Aborts in-flight runs idle for this long (bumped from 10 min in 3.9.0) |
192
192
  | `defaults.maxCostUSD` | `10` | Hard per-task cost ceiling; returns `cost_exceeded` when hit |
193
193
  | `defaults.tools` | `"full"` | Tool surface: `none` / `readonly` / `no-shell` / `full` |
194
194
  | `defaults.sandboxPolicy` | `"cwd-only"` | Path-traversal + symlink confinement to the request's `cwd` |
@@ -285,7 +285,7 @@ Full design rationale: [DIRECTION.md](https://github.com/zhixuan312/multi-model-
285
285
 
286
286
  ## What's new
287
287
 
288
- Latest: **3.8.1** — read-only review becomes annotation, not gating. The 5 read-only routes (audit, review, verify, investigate, debug) now run a single reviewer pass that annotates each worker finding with `reviewerConfidence` (0-100) and an optional `reviewerSeverity` correction no rework loop, restoring 3.7.0-comparable wall-clock. `Finding` schema simplified (drop `file`/`line`/`sourceQuote`; required `evidence`; rename `suggestedFix` `suggestion`). Full history: [CHANGELOG](https://github.com/zhixuan312/multi-model-agent/blob/master/CHANGELOG.md).
288
+ Latest: **3.9.0** — watchdog hardening + per-stage idle telemetry. The reviewer entry points (`runSpecReview` / `runQualityReview` / `runDiffReview`) now thread `taskDeadlineMs` + `abortSignal` + `onProgress`, closing the leak that allowed reviewer hangs to run past the documented cap. Total wall-clock cap bumped 30 60 min, stall watchdog 10 → 20 min, both via named constants. New `StageIdleTracker` records `maxIdleMs`/`totalIdleMs`/`activityEvents` per stage and surfaces them on `task_completed` + the heartbeat (`stage_idle_ms`). Full history: [CHANGELOG](https://github.com/zhixuan312/multi-model-agent/blob/master/CHANGELOG.md).
289
289
 
290
290
  ## Full documentation
291
291
 
@@ -8,7 +8,7 @@ when_to_use: >-
8
8
  User asks for a doc/spec/config audit OR a methodology skill
9
9
  (superpowers:dispatching-parallel-agents, /security-review) points at one AND
10
10
  mmagent is running. Audit on PROSE/SPEC docs — use mma-review for source code.
11
- version: 3.8.1
11
+ version: 3.9.0
12
12
  ---
13
13
 
14
14
  # mma-audit
@@ -12,7 +12,7 @@ when_to_use: >-
12
12
  `proposedInterpretation` is a hard gate — the batch is paused, not
13
13
  informational. The batch will not complete until the caller responds. Treating
14
14
  it as advisory is the clarification-as-info anti-pattern (AP5).
15
- version: 3.8.1
15
+ version: 3.9.0
16
16
  ---
17
17
 
18
18
  # mma-clarifications
@@ -12,7 +12,7 @@ when_to_use: >-
12
12
  Register once here, then pass the ID via `contextBlockIds` on mma-delegate /
13
13
  mma-execute-plan / mma-audit / mma-review / mma-verify / mma-debug /
14
14
  mma-investigate. Cheaper and faster than inlining the same content N times.
15
- version: 3.8.1
15
+ version: 3.9.0
16
16
  ---
17
17
 
18
18
  # mma-context-blocks
@@ -10,7 +10,7 @@ when_to_use: >-
10
10
  read files, reproduce, trace — OR a methodology skill
11
11
  (superpowers:systematic-debugging) points at the investigation step. Delegate
12
12
  the read/reproduce/trace; the main agent stays on the hypothesis and the fix.
13
- version: 3.8.1
13
+ version: 3.9.0
14
14
  ---
15
15
 
16
16
  # mma-debug
@@ -11,7 +11,7 @@ when_to_use: >-
11
11
  and keep main context free. If a plan file exists → use mma-execute-plan. If
12
12
  the task is audit / review / verify / debug / investigate → use the matching
13
13
  specialized skill.
14
- version: 3.8.1
14
+ version: 3.9.0
15
15
  ---
16
16
 
17
17
  # mma-delegate
@@ -10,7 +10,7 @@ when_to_use: >-
10
10
  superpowers:subagent-driven-development / superpowers:executing-plans —
11
11
  workers are cheaper and don't pollute main context. Task descriptors must
12
12
  match plan headings verbatim.
13
- version: 3.8.1
13
+ version: 3.9.0
14
14
  ---
15
15
 
16
16
  # mma-execute-plan
@@ -12,7 +12,7 @@ when_to_use: >-
12
12
  git-history queries. OR you are about to read 3+ files / run any grep in main
13
13
  context — that's the inline-labor-leakage anti-pattern (AP2); delegate to this
14
14
  skill instead.
15
- version: 3.8.1
15
+ version: 3.9.0
16
16
  ---
17
17
 
18
18
  # mma-investigate
@@ -10,7 +10,7 @@ when_to_use: >-
10
10
  you want to re-try the failed indices only. Prefer this over re-dispatching
11
11
  the whole batch or inline-retrying — it's idempotent and preserves the
12
12
  original batch's diagnostics.
13
- version: 3.8.1
13
+ version: 3.9.0
14
14
  ---
15
15
 
16
16
  # mma-retry
@@ -10,7 +10,7 @@ when_to_use: >-
10
10
  AND mmagent is running. Delegate so each file reviews on its own worker; the
11
11
  main agent only decides what to merge. Review on SOURCE CODE — use mma-audit
12
12
  for prose specs / configs.
13
- version: 3.8.1
13
+ version: 3.9.0
14
14
  ---
15
15
 
16
16
  # mma-review
@@ -10,7 +10,7 @@ when_to_use: >-
10
10
  against implemented work BEFORE claiming success. Delegate so each checklist
11
11
  item gets independent evidence-gathering on a worker. Use this BEFORE saying
12
12
  "done" — never after.
13
- version: 3.8.1
13
+ version: 3.9.0
14
14
  ---
15
15
 
16
16
  # mma-verify
@@ -11,7 +11,7 @@ when_to_use: >-
11
11
  tasks — AND mmagent is running. Read this once, pick the matching mma-* skill,
12
12
  and delegate there. Applies equally whether the user invoked a superpowers
13
13
  methodology skill or asked directly.
14
- version: 3.8.1
14
+ version: 3.9.0
15
15
  ---
16
16
 
17
17
  # multi-model-agent (router)
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@zhixuan92/multi-model-agent",
3
- "version": "3.8.1",
3
+ "version": "3.9.0",
4
4
  "type": "module",
5
5
  "license": "MIT",
6
6
  "description": "Standalone HTTP server for multi-model-agent. Routes tool-invocation work to Claude, Codex, or OpenAI-compatible sub-agents with async-polling REST dispatch and installable skills for Claude Code, Gemini CLI, Codex CLI, and Cursor.",
@@ -52,7 +52,7 @@
52
52
  },
53
53
  "dependencies": {
54
54
  "@asteasolutions/zod-to-openapi": "^8.5.0",
55
- "@zhixuan92/multi-model-agent-core": "^3.8.1",
55
+ "@zhixuan92/multi-model-agent-core": "^3.9.0",
56
56
  "gray-matter": "^4.0.3",
57
57
  "minimist": "^1.2.8",
58
58
  "proper-lockfile": "^4.1.2",