@zhixuan92/multi-model-agent 5.0.1 → 5.0.3
- package/README.md +8 -9
- package/dist/cli/index.d.ts +62 -0
- package/dist/cli/index.d.ts.map +1 -0
- package/dist/cli/index.js +345 -0
- package/dist/cli/index.js.map +1 -0
- package/dist/cli/info.d.ts +22 -0
- package/dist/cli/info.d.ts.map +1 -0
- package/dist/cli/info.js +100 -0
- package/dist/cli/info.js.map +1 -0
- package/dist/cli/logs.d.ts +15 -0
- package/dist/cli/logs.d.ts.map +1 -0
- package/dist/cli/logs.js +102 -0
- package/dist/cli/logs.js.map +1 -0
- package/dist/cli/print-token.d.ts +18 -0
- package/dist/cli/print-token.d.ts.map +1 -0
- package/dist/cli/print-token.js +60 -0
- package/dist/cli/print-token.js.map +1 -0
- package/dist/cli/serve.d.ts +28 -0
- package/dist/cli/serve.d.ts.map +1 -0
- package/dist/cli/serve.js +405 -0
- package/dist/cli/serve.js.map +1 -0
- package/dist/cli/status.d.ts +49 -0
- package/dist/cli/status.d.ts.map +1 -0
- package/dist/cli/status.js +155 -0
- package/dist/cli/status.js.map +1 -0
- package/dist/cli/sync-skills.d.ts +58 -0
- package/dist/cli/sync-skills.d.ts.map +1 -0
- package/dist/cli/sync-skills.js +266 -0
- package/dist/cli/sync-skills.js.map +1 -0
- package/dist/cli/telemetry.d.ts +10 -0
- package/dist/cli/telemetry.d.ts.map +1 -0
- package/dist/cli/telemetry.js +161 -0
- package/dist/cli/telemetry.js.map +1 -0
- package/dist/cli/toggle.d.ts +26 -0
- package/dist/cli/toggle.d.ts.map +1 -0
- package/dist/cli/toggle.js +185 -0
- package/dist/cli/toggle.js.map +1 -0
- package/dist/http/async-dispatch.d.ts +44 -0
- package/dist/http/async-dispatch.d.ts.map +1 -0
- package/dist/http/async-dispatch.js +175 -0
- package/dist/http/async-dispatch.js.map +1 -0
- package/dist/http/auth.d.ts +20 -0
- package/dist/http/auth.d.ts.map +1 -0
- package/dist/http/auth.js +56 -0
- package/dist/http/auth.js.map +1 -0
- package/dist/http/canonicalize-file-paths.d.ts +8 -0
- package/dist/http/canonicalize-file-paths.d.ts.map +1 -0
- package/dist/http/canonicalize-file-paths.js +43 -0
- package/dist/http/canonicalize-file-paths.js.map +1 -0
- package/dist/http/cwd-validator.d.ts +11 -0
- package/dist/http/cwd-validator.d.ts.map +1 -0
- package/dist/http/cwd-validator.js +130 -0
- package/dist/http/cwd-validator.js.map +1 -0
- package/dist/http/errors.d.ts +4 -0
- package/dist/http/errors.d.ts.map +1 -0
- package/dist/http/errors.js +9 -0
- package/dist/http/errors.js.map +1 -0
- package/dist/http/execution-context.d.ts +18 -0
- package/dist/http/execution-context.d.ts.map +1 -0
- package/dist/http/execution-context.js +61 -0
- package/dist/http/execution-context.js.map +1 -0
- package/dist/http/handler-deps.d.ts +19 -0
- package/dist/http/handler-deps.d.ts.map +1 -0
- package/dist/http/handler-deps.js +2 -0
- package/dist/http/handler-deps.js.map +1 -0
- package/dist/http/handlers/control/batch-slice.d.ts +4 -0
- package/dist/http/handlers/control/batch-slice.d.ts.map +1 -0
- package/dist/http/handlers/control/batch-slice.js +40 -0
- package/dist/http/handlers/control/batch-slice.js.map +1 -0
- package/dist/http/handlers/control/batch.d.ts +23 -0
- package/dist/http/handlers/control/batch.d.ts.map +1 -0
- package/dist/http/handlers/control/batch.js +332 -0
- package/dist/http/handlers/control/batch.js.map +1 -0
- package/dist/http/handlers/control/context-blocks.d.ts +22 -0
- package/dist/http/handlers/control/context-blocks.d.ts.map +1 -0
- package/dist/http/handlers/control/context-blocks.js +111 -0
- package/dist/http/handlers/control/context-blocks.js.map +1 -0
- package/dist/http/handlers/introspection/health.d.ts +20 -0
- package/dist/http/handlers/introspection/health.d.ts.map +1 -0
- package/dist/http/handlers/introspection/health.js +18 -0
- package/dist/http/handlers/introspection/health.js.map +1 -0
- package/dist/http/handlers/introspection/status.d.ts +26 -0
- package/dist/http/handlers/introspection/status.d.ts.map +1 -0
- package/dist/http/handlers/introspection/status.js +136 -0
- package/dist/http/handlers/introspection/status.js.map +1 -0
- package/dist/http/handlers/tools/audit.d.ts +4 -0
- package/dist/http/handlers/tools/audit.d.ts.map +1 -0
- package/dist/http/handlers/tools/audit.js +43 -0
- package/dist/http/handlers/tools/audit.js.map +1 -0
- package/dist/http/handlers/tools/debug.d.ts +4 -0
- package/dist/http/handlers/tools/debug.d.ts.map +1 -0
- package/dist/http/handlers/tools/debug.js +43 -0
- package/dist/http/handlers/tools/debug.js.map +1 -0
- package/dist/http/handlers/tools/delegate.d.ts +4 -0
- package/dist/http/handlers/tools/delegate.d.ts.map +1 -0
- package/dist/http/handlers/tools/delegate.js +43 -0
- package/dist/http/handlers/tools/delegate.js.map +1 -0
- package/dist/http/handlers/tools/execute-plan.d.ts +4 -0
- package/dist/http/handlers/tools/execute-plan.d.ts.map +1 -0
- package/dist/http/handlers/tools/execute-plan.js +45 -0
- package/dist/http/handlers/tools/execute-plan.js.map +1 -0
- package/dist/http/handlers/tools/investigate.d.ts +4 -0
- package/dist/http/handlers/tools/investigate.d.ts.map +1 -0
- package/dist/http/handlers/tools/investigate.js +64 -0
- package/dist/http/handlers/tools/investigate.js.map +1 -0
- package/dist/http/handlers/tools/journal-recall.d.ts +4 -0
- package/dist/http/handlers/tools/journal-recall.d.ts.map +1 -0
- package/dist/http/handlers/tools/journal-recall.js +40 -0
- package/dist/http/handlers/tools/journal-recall.js.map +1 -0
- package/dist/http/handlers/tools/journal-record.d.ts +8 -0
- package/dist/http/handlers/tools/journal-record.d.ts.map +1 -0
- package/dist/http/handlers/tools/journal-record.js +40 -0
- package/dist/http/handlers/tools/journal-record.js.map +1 -0
- package/dist/http/handlers/tools/research.d.ts +4 -0
- package/dist/http/handlers/tools/research.d.ts.map +1 -0
- package/dist/http/handlers/tools/research.js +64 -0
- package/dist/http/handlers/tools/research.js.map +1 -0
- package/dist/http/handlers/tools/retry.d.ts +4 -0
- package/dist/http/handlers/tools/retry.d.ts.map +1 -0
- package/dist/http/handlers/tools/retry.js +73 -0
- package/dist/http/handlers/tools/retry.js.map +1 -0
- package/dist/http/handlers/tools/review.d.ts +4 -0
- package/dist/http/handlers/tools/review.d.ts.map +1 -0
- package/dist/http/handlers/tools/review.js +43 -0
- package/dist/http/handlers/tools/review.js.map +1 -0
- package/dist/http/journal-lock.d.ts +4 -0
- package/dist/http/journal-lock.d.ts.map +1 -0
- package/dist/http/journal-lock.js +34 -0
- package/dist/http/journal-lock.js.map +1 -0
- package/dist/http/middleware/body-reader.d.ts +16 -0
- package/dist/http/middleware/body-reader.d.ts.map +1 -0
- package/dist/http/middleware/body-reader.js +44 -0
- package/dist/http/middleware/body-reader.js.map +1 -0
- package/dist/http/middleware/caller-identity.d.ts +16 -0
- package/dist/http/middleware/caller-identity.d.ts.map +1 -0
- package/dist/http/middleware/caller-identity.js +16 -0
- package/dist/http/middleware/caller-identity.js.map +1 -0
- package/dist/http/middleware/decompress.d.ts +14 -0
- package/dist/http/middleware/decompress.d.ts.map +1 -0
- package/dist/http/middleware/decompress.js +51 -0
- package/dist/http/middleware/decompress.js.map +1 -0
- package/dist/http/project-registry.d.ts +54 -0
- package/dist/http/project-registry.d.ts.map +1 -0
- package/dist/http/project-registry.js +130 -0
- package/dist/http/project-registry.js.map +1 -0
- package/dist/http/request-observability.d.ts +8 -0
- package/dist/http/request-observability.d.ts.map +1 -0
- package/dist/http/request-observability.js +20 -0
- package/dist/http/request-observability.js.map +1 -0
- package/dist/http/request-pipeline.d.ts +16 -0
- package/dist/http/request-pipeline.d.ts.map +1 -0
- package/dist/http/request-pipeline.js +144 -0
- package/dist/http/request-pipeline.js.map +1 -0
- package/dist/http/server.d.ts +17 -0
- package/dist/http/server.d.ts.map +1 -0
- package/dist/http/server.js +300 -0
- package/dist/http/server.js.map +1 -0
- package/dist/http/types.d.ts +20 -0
- package/dist/http/types.d.ts.map +1 -0
- package/dist/http/types.js +2 -0
- package/dist/http/types.js.map +1 -0
- package/dist/skill-install/disabled-state.d.ts +35 -0
- package/dist/skill-install/disabled-state.d.ts.map +1 -0
- package/dist/skill-install/disabled-state.js +96 -0
- package/dist/skill-install/disabled-state.js.map +1 -0
- package/dist/skill-install/discover.d.ts +29 -0
- package/dist/skill-install/discover.d.ts.map +1 -0
- package/dist/skill-install/discover.js +104 -0
- package/dist/skill-install/discover.js.map +1 -0
- package/dist/skill-install/include-utils.d.ts +27 -0
- package/dist/skill-install/include-utils.d.ts.map +1 -0
- package/dist/skill-install/include-utils.js +90 -0
- package/dist/skill-install/include-utils.js.map +1 -0
- package/dist/skill-install/manifest.d.ts +82 -0
- package/dist/skill-install/manifest.d.ts.map +1 -0
- package/dist/skill-install/manifest.js +215 -0
- package/dist/skill-install/manifest.js.map +1 -0
- package/dist/skill-install/skill-installer-common.d.ts +26 -0
- package/dist/skill-install/skill-installer-common.d.ts.map +1 -0
- package/dist/skill-install/skill-installer-common.js +139 -0
- package/dist/skill-install/skill-installer-common.js.map +1 -0
- package/dist/skill-install/skill-installers/claude-code.d.ts +43 -0
- package/dist/skill-install/skill-installers/claude-code.d.ts.map +1 -0
- package/dist/skill-install/skill-installers/claude-code.js +65 -0
- package/dist/skill-install/skill-installers/claude-code.js.map +1 -0
- package/dist/skill-install/skill-installers/codex-cli.d.ts +27 -0
- package/dist/skill-install/skill-installers/codex-cli.d.ts.map +1 -0
- package/dist/skill-install/skill-installers/codex-cli.js +84 -0
- package/dist/skill-install/skill-installers/codex-cli.js.map +1 -0
- package/dist/skill-install/skill-installers/cursor.d.ts +72 -0
- package/dist/skill-install/skill-installers/cursor.d.ts.map +1 -0
- package/dist/skill-install/skill-installers/cursor.js +81 -0
- package/dist/skill-install/skill-installers/cursor.js.map +1 -0
- package/dist/skill-install/skill-installers/gemini-cli.d.ts +50 -0
- package/dist/skill-install/skill-installers/gemini-cli.d.ts.map +1 -0
- package/dist/skill-install/skill-installers/gemini-cli.js +72 -0
- package/dist/skill-install/skill-installers/gemini-cli.js.map +1 -0
- package/dist/skill-install/skill-manifest-sync.d.ts +11 -0
- package/dist/skill-install/skill-manifest-sync.d.ts.map +1 -0
- package/dist/skill-install/skill-manifest-sync.js +65 -0
- package/dist/skill-install/skill-manifest-sync.js.map +1 -0
- package/dist/skills/_shared/auth.md +41 -0
- package/dist/skills/_shared/error-handling.md +31 -0
- package/dist/skills/_shared/polling.md +88 -0
- package/dist/skills/_shared/response-shape.md +55 -0
- package/dist/skills/_shared/review-policy.md +15 -0
- package/dist/skills/mma-audit/SKILL.md +270 -0
- package/dist/skills/mma-context-blocks/SKILL.md +148 -0
- package/dist/skills/mma-debug/SKILL.md +208 -0
- package/dist/skills/mma-delegate/SKILL.md +216 -0
- package/dist/skills/mma-execute-plan/SKILL.md +214 -0
- package/dist/skills/mma-explore/SKILL.md +190 -0
- package/dist/skills/mma-investigate/SKILL.md +258 -0
- package/dist/skills/mma-journal-recall/SKILL.md +242 -0
- package/dist/skills/mma-journal-record/SKILL.md +202 -0
- package/dist/skills/mma-research/SKILL.md +223 -0
- package/dist/skills/mma-retry/SKILL.md +221 -0
- package/dist/skills/mma-review/SKILL.md +209 -0
- package/dist/skills/multi-model-agent/SKILL.md +206 -0
- package/dist/telemetry/consent.d.ts +4 -0
- package/dist/telemetry/consent.d.ts.map +1 -0
- package/dist/telemetry/consent.js +40 -0
- package/dist/telemetry/consent.js.map +1 -0
- package/dist/telemetry/flusher.d.ts +19 -0
- package/dist/telemetry/flusher.d.ts.map +1 -0
- package/dist/telemetry/flusher.js +277 -0
- package/dist/telemetry/flusher.js.map +1 -0
- package/dist/telemetry/generation.d.ts +9 -0
- package/dist/telemetry/generation.d.ts.map +1 -0
- package/dist/telemetry/generation.js +33 -0
- package/dist/telemetry/generation.js.map +1 -0
- package/dist/telemetry/identity.d.ts +9 -0
- package/dist/telemetry/identity.d.ts.map +1 -0
- package/dist/telemetry/identity.js +35 -0
- package/dist/telemetry/identity.js.map +1 -0
- package/dist/telemetry/install-id.d.ts +13 -0
- package/dist/telemetry/install-id.d.ts.map +1 -0
- package/dist/telemetry/install-id.js +49 -0
- package/dist/telemetry/install-id.js.map +1 -0
- package/dist/telemetry/install-meta.d.ts +10 -0
- package/dist/telemetry/install-meta.d.ts.map +1 -0
- package/dist/telemetry/install-meta.js +15 -0
- package/dist/telemetry/install-meta.js.map +1 -0
- package/dist/telemetry/queue.d.ts +35 -0
- package/dist/telemetry/queue.d.ts.map +1 -0
- package/dist/telemetry/queue.js +287 -0
- package/dist/telemetry/queue.js.map +1 -0
- package/dist/telemetry/recorder.d.ts +39 -0
- package/dist/telemetry/recorder.d.ts.map +1 -0
- package/dist/telemetry/recorder.js +173 -0
- package/dist/telemetry/recorder.js.map +1 -0
- package/package.json +43 -24
- package/scripts/postinstall.js +36 -0
- package/bin/mmagent.mjs +0 -47
- package/postinstall.mjs +0 -8
@@ -0,0 +1,41 @@
## Authentication & identity headers

Every request to the multi-model-agent server requires:

| Header | Required for | Purpose |
|---|---|---|
| `Authorization: Bearer <token>` | All routes (except `/health`) | Auth — token from `mmagent print-token` |
| `X-MMA-Client: <client>` | All tool routes | Identifies your client. One of `claude-code`, `cursor`, `codex-cli`, `gemini-cli`. **Server returns `400 client_required` if missing.** |
| `X-MMA-Main-Model: <model-id>` | All tool routes | Calling agent's model id (e.g. `claude-opus-4-7`, `gpt-5.4`). Used as `mainModel` in wire telemetry so cost-delta-vs-main and family attribution can be computed. **Server returns `400 main_model_required` if missing.** Auto-detection is intentionally not attempted — the calling client is the only reliable source. |

### Obtain the token

**From environment variable** (preferred):
```
MMAGENT_AUTH_TOKEN=<token>
```

**From CLI**:
```bash
mmagent print-token
```

### Shell helper

```bash
TOKEN="${MMAGENT_AUTH_TOKEN:-$(mmagent print-token)}"
MMA_CLIENT="${MMAGENT_CLIENT:-claude-code}"
MMA_MAIN_MODEL="${MMAGENT_MAIN_MODEL:-claude-opus-4-7}"

curl \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-MMA-Client: $MMA_CLIENT" \
  -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
  ...
```

### Errors

- `401 unauthorized` — re-run `mmagent print-token`; the token may have changed after a server restart.
- `400 client_required` — `X-MMA-Client` header is missing on a tool route. Set it to one of: `claude-code`, `cursor`, `codex-cli`, `gemini-cli`.
- `400 main_model_required` — `X-MMA-Main-Model` header is missing on a tool route. Set it to the calling agent's model id (e.g. `claude-opus-4-7`, `gpt-5.4`).
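
As a minimal sketch of the header rules above, the wrapper below sends all three headers on every call and refuses to run when one of them is missing or, for the client, not one of the four listed values. The names `mma_check_env` and `mma_curl` are hypothetical helpers for illustration and are not shipped with the package; they assume `TOKEN`, `MMA_CLIENT`, and `MMA_MAIN_MODEL` are set as in the shell helper.

```bash
# Hypothetical helpers (not part of mmagent). They catch the header problems
# locally that the server would otherwise report as 401 or 400.
mma_check_env() {
  if [ -z "${TOKEN:-}" ]; then
    echo "mma: TOKEN is empty; run: mmagent print-token" >&2
    return 1
  fi
  case "${MMA_CLIENT:-}" in
    claude-code|cursor|codex-cli|gemini-cli) ;;
    *)
      echo "mma: MMA_CLIENT must be claude-code, cursor, codex-cli or gemini-cli" >&2
      return 1
      ;;
  esac
  if [ -z "${MMA_MAIN_MODEL:-}" ]; then
    echo "mma: MMA_MAIN_MODEL is empty" >&2
    return 1
  fi
}

# Runs curl with the three required headers followed by the caller's arguments.
mma_curl() {
  mma_check_env || return 1
  curl -H "Authorization: Bearer $TOKEN" \
       -H "X-MMA-Client: $MMA_CLIENT" \
       -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
       "$@"
}
```

A call would then look like `mma_curl -s "http://localhost:$PORT/health"`, with any further `curl` options appended as usual.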
@@ -0,0 +1,31 @@
## Error handling

### HTTP status decision table

| Status | Code | Action |
|---|---|---|
| `400` | `invalid_request` | Fix the request body or query params |
| `401` | `unauthorized` | Re-fetch token; check `MMAGENT_AUTH_TOKEN` |
| `403` | `forbidden` | `cwd` query param missing or out of scope |
| `404` | `not_found` | Wrong `batchId` or resource does not exist |
| `409` | `invalid_batch_state` / `pinned` | Batch in wrong state; check current state first |
| `413` | `payload_too_large` | Reduce content size (context block or body) |
| `429` | `rate_limited` | Wait `Retry-After` seconds, then retry |
| `503` | `project_cap_exceeded` | Too many concurrent projects; wait and retry |
| other `5xx` | server error | Retry once after 2 s; escalate if it persists |
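
The decision table above can be sketched as a single `case` statement. The function name and the action labels it prints are illustrative assumptions, not values the server returns; only the status-to-action mapping comes from the table.

```bash
# Maps an HTTP status to an action label, following the decision table.
# The labels are made up for this sketch.
mma_action_for_status() {
  case "$1" in
    400|413) echo "fix-request" ;;        # invalid body/params or payload too large
    401)     echo "refresh-token" ;;      # re-run: mmagent print-token
    403)     echo "check-cwd" ;;          # cwd missing or out of scope
    404)     echo "check-batch-id" ;;     # wrong batchId or missing resource
    409)     echo "check-batch-state" ;;  # inspect current state first
    429)     echo "wait-retry-after" ;;   # honor the Retry-After header
    503)     echo "wait-and-retry" ;;     # project_cap_exceeded
    5??)     echo "retry-once" ;;         # retry once after 2 s, then escalate
    *)       echo "no-table-entry" ;;
  esac
}
```

The `503` branch sits before the `5??` pattern because `case` takes the first match, which mirrors the table's more specific row.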

### Network failures

Retry up to 3 times with exponential backoff (1 s → 2 s → 4 s).
If the server is unreachable, check that `mmagent serve` is running:
```bash
curl -s http://localhost:$PORT/health   # expects { "status": "ok" } (v4.0 — see spec C13)
```
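
The retry schedule described above could be sketched as follows. `retry_with_backoff` is a hypothetical helper name; it interprets "up to 3 times" as one initial attempt plus three retries, which is an assumption about the intended count.

```bash
# Runs the given command; on failure retries up to 3 more times, sleeping
# 1 s, 2 s, then 4 s between attempts. Hypothetical helper, not part of mmagent.
retry_with_backoff() {
  delay=1
  retries_left=3
  until "$@"; do
    if [ "$retries_left" -eq 0 ]; then
      echo "giving up after 3 retries: $*" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    retries_left=$((retries_left - 1))
  done
}
```

For example, `retry_with_backoff curl -fsS "http://localhost:$PORT/health"` would probe the health route with that schedule.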

### Auth errors (401)

```bash
export MMAGENT_AUTH_TOKEN=$(mmagent print-token)
```

The token changes on every server restart. Re-export before retrying.
@@ -0,0 +1,88 @@
## Polling for batch completion

After a tool call returns a `batchId`, poll `GET /batch/:id` until the batch
reaches a terminal state.

### HTTP response shapes (3.1.0)

| Status | Content-Type | Meaning |
|---|---|---|
| `202` | `text/plain` | Still working — body is the running headline (e.g. `1/1 running, 47s elapsed`) |
| `200` | `application/json` | Terminal — body is the uniform 6-field envelope (see `response-shape.md`) |
| `404` / `401` / other | — | Error — stop polling |

### Terminal envelope states

Every terminal envelope has the same six fields; inspect `error` to tell
which terminal state you're in:

| Shape | Meaning |
|---|---|
| `error` is a real object | Batch failed — read `error.code` + `error.message` |
| `error` is `{kind: "not_applicable", ...}` | Batch succeeded — read `results` |
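
A small sketch of the check in the table above, assuming `jq` is installed. `mma_envelope_outcome` is a hypothetical name; it reads a terminal envelope on stdin and looks only at the `error` field, since other fields may carry their own `not_applicable` sentinels.

```bash
# Prints "succeeded" when error is the not_applicable sentinel, otherwise
# "failed: <code>: <message>". Hypothetical helper; requires jq.
mma_envelope_outcome() {
  jq -r '
    if (.error | type) == "object" and .error.kind == "not_applicable"
    then "succeeded"
    else "failed: \(.error.code): \(.error.message)"
    end'
}
```

With the poll loop's body file, this could be used as `mma_envelope_outcome < "$BODY_FILE"` once a `200` has been received.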

### Poll loop (POSIX sh)

```bash
DELAY=1
START=$(date +%s)
TIMEOUT_S=${MMAGENT_POLL_TIMEOUT_S:-1800}
BODY_FILE=$(mktemp -t mmagent-poll.XXXXXX)
trap 'rm -f "$BODY_FILE"' EXIT

while true; do
  NOW=$(date +%s)
  if [ $((NOW - START)) -ge "$TIMEOUT_S" ]; then
    echo "mmagent: poll timed out after ${TIMEOUT_S}s" >&2
    exit 124
  fi

  # The status code comes from -w, so -f is not used: it would discard the
  # error body that the catch-all branch below prints.
  STATUS=$(curl --show-error -o "$BODY_FILE" -w "%{http_code}" -s \
    -H "Authorization: Bearer $TOKEN" \
    "http://127.0.0.1:$PORT/batch/$BATCH_ID" || true)

  case "$STATUS" in
    202)
      cat "$BODY_FILE"; echo
      sleep "$DELAY"
      # Double the delay, never exceeding 30 s.
      DELAY=$(( DELAY * 2 < 30 ? DELAY * 2 : 30 ))
      ;;
    200)
      cat "$BODY_FILE"
      exit 0
      ;;
    ""|000)
      # curl reports 000 when it received no HTTP response at all.
      echo "mmagent: unreachable (curl failed)" >&2; exit 1 ;;
    *)
      echo "mmagent: HTTP $STATUS" >&2; cat "$BODY_FILE" >&2; exit 1 ;;
  esac
done
```

Start at 1 s, double each iteration, cap at 30 s. The 1800-second client-side
timeout is a safety cap; most batches complete in under 60 s. Discover `$PORT`
at runtime with `mmagent info --json | jq -r .port` (default: 7337).
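
To make the delay schedule concrete, here is a small sketch that computes it without touching the network. `next_delay` is a hypothetical helper; it doubles the current delay and clamps the result to 30 s, as the paragraph above describes.

```bash
# Next poll delay: double the current one, never exceeding 30 s.
next_delay() {
  doubled=$(( $1 * 2 ))
  if [ "$doubled" -gt 30 ]; then
    echo 30
  else
    echo "$doubled"
  fi
}

# Collect the first seven delays of the schedule.
d=1
schedule=""
for i in 1 2 3 4 5 6 7; do
  schedule="$schedule $d"
  d=$(next_delay "$d")
done
echo "delays:$schedule"   # prints: delays: 1 2 4 8 16 30 30
```

After five doublings the loop settles at one request every 30 s, so a batch that runs for the full 1800 s costs roughly 60 polls.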

### Caller-side tool-timeout note

The poll helper's internal `TIMEOUT_S` default is 1800s (30 minutes). If your
agent's shell tool (e.g. Claude Code's Bash) caps command wall-clock at
10 minutes by default, the helper will be killed at 10m regardless of
`TIMEOUT_S`, so long-running delegations appear to "fail" before they reach a
terminal state.

When invoking this poll loop, pick one:

- **Preferred: pass a 30-minute tool timeout explicitly.** For example, Claude
  Code's Bash tool takes a `timeout` in milliseconds, so 30 minutes is
  `timeout: 1800000`. Its default ceiling is 600000 ms (10 min), so pass the
  maximum the tool allows or raise the tool's allowed ceiling via harness
  settings.
- **Alternative: cap the helper to match the tool's limit** by exporting
  `MMAGENT_POLL_TIMEOUT_S=600` before running the loop. The helper will then
  exit 124 cleanly at 10 minutes and the caller can decide whether to
  re-poll or surface the timeout.

Never let the helper run longer than the caller's tool cap: the process
gets killed mid-poll, the caller sees a generic failure, and diagnostics
from the `TIMEOUT_S` exit path are lost.
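
One way to apply the rule above is to derive `MMAGENT_POLL_TIMEOUT_S` from the caller's tool cap. The sketch below is illustrative only: `pick_poll_timeout` is a hypothetical helper, and the 30-second safety margin is an assumption added here so the helper's own exit path runs before the tool kills the process.

```bash
# Prints a poll timeout in seconds that stays below the caller's tool cap.
# $1 = tool cap in seconds, $2 = desired timeout (defaults to 1800).
# The 30 s margin is an assumption of this sketch, not a documented value.
pick_poll_timeout() {
  tool_cap_s=$1
  wanted_s=${2:-1800}
  margin_s=30
  limit_s=$(( tool_cap_s - margin_s ))
  if [ "$limit_s" -lt 1 ]; then
    limit_s=1
  fi
  if [ "$wanted_s" -lt "$limit_s" ]; then
    echo "$wanted_s"
  else
    echo "$limit_s"
  fi
}
```

With a 10-minute tool cap, `export MMAGENT_POLL_TIMEOUT_S=$(pick_poll_timeout 600)` yields 570, leaving the loop time to print its timeout message and exit 124.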

Windows/PowerShell equivalent is planned for a later release.
@@ -0,0 +1,55 @@
## Response shapes

### POST /<tool>?cwd=<abs> — dispatch response (202)

```json
{ "batchId": "<uuid>", "statusUrl": "/batch/<uuid>" }
```

Use `batchId` to poll. `statusUrl` is a convenience pointer.

### GET /batch/:id — polling response

The HTTP status is the state discriminator:

| Status | Meaning |
|---|---|
| `202 text/plain` | Still pending — body is the running headline string (e.g. `"1/2 running, 47s elapsed"`) |
| `200 application/json` | Terminal — body is the uniform 6-field envelope below |
| `404` / `401` / `5xx` | Error — see Error response below; stop polling |

The terminal JSON envelope always has these 6 fields. Each may be a real value or a `not_applicable` sentinel:

```json
{
  "headline": "<string>",
  "results": [ /* per-task result objects */ ],
  "batchTimings": { /* timings */ },
  "costSummary": { /* cost roll-up */ },
  "structuredReport": { /* parsed sections */ },
  "error": { "kind": "not_applicable", "reason": "batch succeeded" }
}
```

Read the envelope by the shape of `error`:

| Shape | Meaning |
|---|---|
| `error` is a real object (with `code` / `message`) | Batch failed — read `error.code` + `error.message` |
| `error` is `{kind: "not_applicable", ...}` | Batch succeeded — read `results` |

### GET /batch/:id?taskIndex=N — single task slice

Same 6-field envelope. `results` contains exactly the task at index `N`. Returns `404 unknown_task_index` if `N` is out of range.
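
The two polling routes differ only in the optional `taskIndex` query parameter, which a sketch like the following makes explicit. `batch_url` is a hypothetical helper; the host `127.0.0.1` and the fallback port 7337 are taken from the polling notes and may differ in a given setup.

```bash
# Builds the polling URL for a whole batch, or for one task slice when a
# task index is given as the second argument. Hypothetical helper.
batch_url() {
  batch_id=$1
  task_index=${2:-}
  url="http://127.0.0.1:${PORT:-7337}/batch/$batch_id"
  if [ -n "$task_index" ]; then
    url="$url?taskIndex=$task_index"
  fi
  echo "$url"
}
```

For instance, `batch_url "$BATCH_ID" 0` gives the URL for the first task of a batch, assuming task indices start at 0.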

### Error response (4xx / 5xx)

```json
{
  "error": "<code>",
  "message": "<human-readable>",
  "details": { /* optional structured context, e.g. fieldErrors for 400 */ }
}
```

`details` is optional and present only when the server has structured additional context.
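
A brief sketch of reading this error body, assuming `jq` is installed. `mma_error_summary` is a hypothetical helper that prints the code and message and appends `details` as compact JSON only when the server included it.

```bash
# Reads an error body on stdin and prints "<code>: <message>", followed by
# the details object when present. Hypothetical helper; requires jq.
mma_error_summary() {
  jq -r '"\(.error): \(.message)"
         + (if has("details") then " " + (.details | tojson) else "" end)'
}
```

Note that in an error response `error` is a plain string code, whereas in the terminal envelope it is an object, so the two shapes need separate handling.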
@@ -0,0 +1,15 @@
### `reviewPolicy` — review lifecycle per task

**Applies to write routes only** (`delegate`, `execute-plan`, `retry`).
Read-only routes (audit, review, debug, investigate, research) do not expose
this field — they are hardcoded to `"none"` because the review stage is
write-routes-only. They still run the always-on **annotate** judge (a
standard-tier LLM pass that summarizes the worker's report); their findings
come from the worker itself, not from a second-pass code review.

| Value | Behavior | Use when |
|---|---|---|
| `"full"` | Spec review + quality review (default) | Default for new code or risky edits |
| `"quality_only"` | Quality review only | Write task where spec-conformance is already certain but you still want a quality pass |
| `"diff_only"` | Single-pass review of the produced diff | Cheap mechanical refactors (file moves, renames, import-path updates) |
| `"none"` | Skip the review stage | Trivially mechanical edits or throwaway scripts where a second-pass reviewer adds nothing |
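
The route and value rules above can be checked on the client before a request is built. The sketch below uses a hypothetical function name, `review_policy_ok`; the route names and the four values come from the text, and how the server reacts to the field on a read-only route is not stated here, so the function simply treats that case as not allowed.

```bash
# Returns 0 when reviewPolicy may be sent to the given route with the given
# value, and 1 otherwise. Hypothetical client-side pre-check.
review_policy_ok() {
  route=$1
  value=$2
  case "$route" in
    delegate|execute-plan|retry) ;;   # write routes expose the field
    *) return 1 ;;                    # read-only routes do not
  esac
  case "$value" in
    full|quality_only|diff_only|none) return 0 ;;
    *) return 1 ;;
  esac
}
```

A caller might use it as `review_policy_ok delegate diff_only && echo "ok to send"`.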
|
|
@@ -0,0 +1,270 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: mma-audit
|
|
3
|
+
description: >-
|
|
4
|
+
Use when the user asks to audit a spec / plan / design doc / skill file. The
|
|
5
|
+
`subtype` field picks the criteria set. `default` (prose-coherence) is the
|
|
6
|
+
general doc auditor. `plan` verifies a code-execution plan against the actual
|
|
7
|
+
codebase — run this before any `mma-execute-plan` dispatch. `spec` audits
|
|
8
|
+
requirement prose for testability and decision-trace. `skill` audits a
|
|
9
|
+
SKILL.md against reader-effectiveness criteria.
|
|
10
|
+
when_to_use: >-
|
|
11
|
+
User asks for a doc / spec / plan / skill audit OR a methodology skill
|
|
12
|
+
(superpowers:dispatching-parallel-agents, /security-review) points at one AND
|
|
13
|
+
mmagent is running. Audit on PROSE/SPEC docs — use mma-review for source code.
|
|
14
|
+
Audit a CODE-EXECUTION PLAN against the codebase — use subtype=plan.
|
|
15
|
+
version: 5.0.3
|
|
16
|
+
---
|
|
17
|
+
|
|
18
|
+
# mma-audit
|
|
19
|
+
|
|
20
|
+
## Overview
|
|
21
|
+
|
|
22
|
+
`mma-audit` sends a prose artifact to workers for structured auditing. The `subtype` field picks WHICH criteria set the workers apply — every subtype runs through the same sequential-criteria read-only lifecycle, but each one carries its own criteria list, semantics, and prompt scaffolding.
|
|
23
|
+
|
|
24
|
+
**Four subtypes — picked by the kind of artifact, not by the lens you want:**
|
|
25
|
+
|
|
26
|
+
| You're auditing… | Use… | What it checks |
|
|
27
|
+
|---|---|---|
|
|
28
|
+
| A general prose artifact (design doc, recommendation, post-mortem, README) | `subtype: 'default'` | Comprehensive prose-coherence — would a literal-following worker produce the right outcome from this prose alone? Catches ambiguity, contradictions, missing branches, drift, scope-creep. **Does NOT verify against any codebase.** |
|
|
29
|
+
| A **code-execution PLAN** (`docs/superpowers/plans/*.md` or similar) before running it via `mma-execute-plan` | `subtype: 'plan'` | Plan-vs-codebase coherence — for every method / type / file path / signature / import / verify command the plan names, the codebase actually contains it as described. Catches the bug class the prose-coherence audit cannot see (e.g. plan says `registerBlock` but actual interface is `register`). |
|
|
30
|
+
| A **requirement spec** (what we want, why; success criteria) | `subtype: 'spec'` | Requirement-prose executability across 9 criteria — testability, scope explicitness AND decomposability, acceptance-criteria coverage, non-functional capture, requirement conflicts, decision-trace, assumption exposure, placeholder scan, and design-decomposition presence (architecture / components / data flow / error handling / testing). |
|
|
31
|
+
| A **SKILL.md** for an `mma-*` skill or comparable agent-facing playbook | `subtype: 'skill'` | Skill-file reader-effectiveness — when-to-use specificity, endpoint contract integrity, example correctness, anti-pattern coverage, link integrity. |
|
|
32
|
+
|
|
33
|
+
If you want to bias workers toward a narrow lens (security only, performance only, accessibility only), put that in the free-text `background` portion of the prompt — `subtype` is criteria machinery, not a lens selector.
|
|
34
|
+
|
|
35
|
+
## When to Use
|
|
36
|
+
|
|
37
|
+
- `subtype: 'default'` — a general prose artifact needs a critical read for internal executability (the artifact will be acted on by a worker reading the prose alone).
|
|
38
|
+
- `subtype: 'plan'` — you have a written code-execution plan on disk and you're about to dispatch tasks from it via `mma-execute-plan`. This is the ONLY subtype that grounds findings against real source files.
|
|
39
|
+
- `subtype: 'spec'` — you have a requirement / brainstorming-output spec and want to verify every requirement is testable, traceable, and unambiguous BEFORE writing the plan. Typical predecessor to `writing-plans`.
|
|
40
|
+
- `subtype: 'skill'` — you're authoring or revising an `mma-*` skill or comparable SKILL.md and want to know whether agents will actually read it the right way.
|
|
41
|
+
|
|
42
|
+
**Don't use mma-audit when:** the thing being audited is source code (→ `mma-review`); a 30-second `Read` would answer it; or you want to verify a plan that hasn't been written yet (write the plan first).
|
|
43
|
+
|
|
44
|
+
## Endpoint
|
|
45
|
+
|
|
46
|
+
`POST /audit?cwd=<abs-path>`
|
|
47
|
+
|
|
48
|
+
@include _shared/auth.md
|
|
49
|
+
|
|
50
|
+
## Request body
|
|
51
|
+
|
|
52
|
+
```json
|
|
53
|
+
{
|
|
54
|
+
"document": "inline content to audit (optional if filePaths given)",
|
|
55
|
+
"subtype": "default",
|
|
56
|
+
"filePaths": ["/project/docs/spec.md"],
|
|
57
|
+
"contextBlockIds": []
|
|
58
|
+
}
|
|
59
|
+
```
|
|
60
|
+
|
|
61
|
+
| Field | Type | Required | Notes |
|
|
62
|
+
|---|---|---|---|
|
|
63
|
+
| `document` | string | no | Inline document content |
|
|
64
|
+
| `subtype` | `'default' \| 'plan' \| 'spec' \| 'skill'` | no (defaults to `'default'`) | See "Picking subtype" below. |
|
|
65
|
+
| `filePaths` | string[] | no | Files to audit (one worker per file, parallel) |
|
|
66
|
+
| `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` |
|
|
67
|
+
|
|
68
|
+
Either `document` or `filePaths` (or both) must be provided.
|
|
69
|
+
|
|
70
|
+
> Worker tier for `mma-audit` is hardcoded to `complex` and is not caller-configurable. Sending `agentType` is rejected with HTTP 400.
|
|
71
|
+
|
|
72
|
+
### Picking subtype
|
|
73
|
+
|
|
74
|
+
| Value | When to use |
|
|
75
|
+
|---|---|
|
|
76
|
+
| `default` (or omit the field) | **General prose — design doc, recommendation, post-mortem, README, brief.** Comprehensive prose-coherence audit. Does NOT verify against any codebase. |
|
|
77
|
+
| `plan` | **Code-execution plans being audited against a real codebase.** Single-file input (the plan markdown). Workers grep / read source files under `cwd` to verify every named symbol / path / signature / import / verify command. Use this BEFORE every `mma-execute-plan` dispatch. |
|
|
78
|
+
| `spec` | **Requirement spec / brainstorming-output / what-we-want prose.** 9 criteria target testability, scope explicitness + decomposability, acceptance-criteria coverage, non-functional capture, requirement conflicts, decision-trace, assumption exposure, placeholder scan, and design-decomposition presence. |
|
|
79
|
+
| `skill` | **`SKILL.md` or comparable agent-facing playbook.** Criteria target when-to-use specificity, endpoint contract integrity, example correctness, anti-pattern coverage, link integrity. |
|
|
80
|
+
|
|
81
|
+
You can run BOTH on a plan: first `spec` or `default` (prose quality), then `plan` (does the plan match the codebase?). They cover orthogonal failure modes.
|
|
82
|
+
|
|
83
|
+
The legacy `auditType` field and its `correctness` / `style` / `general` / `security` / `performance` values no longer exist. Sending `auditType` returns `400 invalid_request`. Sending unknown `subtype` values returns `400 invalid_request` with the allowed enum.
|
|
84
|
+
|
|
85
|
+
### Plan-audit specifics
|
|
86
|
+
|
|
87
|
+
When `subtype: 'plan'`:
|
|
88
|
+
|
|
89
|
+
- `filePaths` MUST contain exactly **one entry** — the plan markdown. Sending zero or 2+ entries → `400 invalid_request` with the message: *"Plan audit takes exactly one filePath (the plan markdown). The worker discovers and verifies source files itself via its tool surface — do not pre-list source files."*
|
|
90
|
+
- `document` (inline content) is not used in plan mode — the plan must be on disk so workers can reference it by `?cwd=`-relative path.
|
|
91
|
+
- The worker runs the sequential-criteria loop with the plan-audit criteria set across 12 perspectives in three groups: **EXTERNAL CODEBASE COHERENCE** (1 PATH EXISTENCE, 2 SYMBOL EXISTENCE, 3 SIGNATURE MATCH, 4 IMPORT GRAPH, 5 TEST HARNESS AVAILABILITY, 6 STEP SEQUENCE WITHIN TASK, 7 CROSS-TASK DEPENDENCIES, 8 VERIFICATION COMMAND VALIDITY), **INTRA-PLAN STRUCTURE** (9 TASK GRANULARITY, 11 PLACEHOLDER LANGUAGE, 12 PLAN SKELETON), and **SPEC ALIGNMENT** (10 SPEC COVERAGE).
|
|
92
|
+
- To enable perspective 10 (SPEC COVERAGE), register the upstream spec as a context block via `mma-context-blocks` and pass its `blockId` in `contextBlockIds`. Without a spec in context, perspective 10 emits "No findings for this criterion." and the other 11 still run.
|
|
93
|
+
- Read the findings list. Fix the plan and re-audit if any `critical` or `high` plan-audit findings remain.
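The request rules above can be checked on the client before a call is dispatched. The following is a minimal sketch, not the server's validation code: `validateAuditRequest` and its message strings are hypothetical, and the server's `400 invalid_request` response remains the source of truth.

```typescript
// Hypothetical pre-flight check mirroring the documented request rules.
// The subtype names come from the table above; everything else is illustrative.
const ALLOWED_SUBTYPES: readonly string[] = ["default", "plan", "spec", "skill"];

function validateAuditRequest(body: Record<string, unknown>): string[] {
  const problems: string[] = [];

  // The legacy field is rejected outright by the server.
  if ("auditType" in body) {
    problems.push("auditType was removed; use subtype");
  }

  // Omitting subtype is documented as equivalent to 'default'.
  const subtype = body.subtype ?? "default";
  if (typeof subtype !== "string" || ALLOWED_SUBTYPES.indexOf(subtype) === -1) {
    problems.push("unknown subtype; allowed: " + ALLOWED_SUBTYPES.join(", "));
  }

  if (subtype === "plan") {
    // Plan mode takes exactly one path: the plan markdown itself.
    const paths = Array.isArray(body.filePaths) ? body.filePaths : [];
    if (paths.length !== 1) {
      problems.push("plan audit takes exactly one filePath (the plan markdown)");
    }
    // Inline content is not used in plan mode; the plan must be on disk.
    if (body.document !== undefined) {
      problems.push("document is not used in plan mode");
    }
  }
  return problems;
}
```

An empty array means the body passes these local checks; it does not guarantee that the server accepts the request.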
|
|
94
|
+
|
|
95
|
+
## Full example
|
|
96
|
+
|
|
97
|
+
### Default audit (general prose)
|
|
98
|
+
|
|
99
|
+
```bash
BATCH=$(curl -f --show-error -s -X POST \
  -H "X-MMA-Client: $MMA_CLIENT" \
  -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"subtype":"default","filePaths":["/project/docs/api-spec.md"]}' \
  "http://localhost:$PORT/audit?cwd=/project")
BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
```
|
|
109
|
+
|
|
110
|
+
### Spec audit (requirement prose)
|
|
111
|
+
|
|
112
|
+
```bash
BATCH=$(curl -f --show-error -s -X POST \
  -H "X-MMA-Client: $MMA_CLIENT" \
  -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"subtype":"spec","filePaths":["/project/docs/superpowers/specs/2026-05-12-feature-design.md"]}' \
  "http://localhost:$PORT/audit?cwd=/project")
```
|
|
121
|
+
|
|
122
|
+
### Skill audit (SKILL.md)
|
|
123
|
+
|
|
124
|
+
```bash
BATCH=$(curl -f --show-error -s -X POST \
  -H "X-MMA-Client: $MMA_CLIENT" \
  -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"subtype":"skill","filePaths":["/project/packages/server/src/skills/mma-audit/SKILL.md"]}' \
  "http://localhost:$PORT/audit?cwd=/project")
```
|
|
133
|
+
|
|
134
|
+
### Plan audit (verify a code-execution plan against the codebase)
|
|
135
|
+
|
|
136
|
+
```bash
BATCH=$(curl -f --show-error -s -X POST \
  -H "X-MMA-Client: $MMA_CLIENT" \
  -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"subtype":"plan","filePaths":["/project/docs/superpowers/plans/2026-05-10-feature.md"]}' \
  "http://localhost:$PORT/audit?cwd=/project")
```
|
|
145
|
+
|
|
146
|
+
@include _shared/polling.md
|
|
147
|
+
|
|
148
|
+
@include _shared/response-shape.md
|
|
149
|
+
|
|
150
|
+
## Reading the findings
|
|
151
|
+
|
|
152
|
+
The main agent reads `completed` + `message` + `findings`; the findings are the answer. For read-only routes, `filesChanged` is always `[]` and `commitSha` is always `null`.
|
|
154
|
+
|
|
155
|
+
```json
{
  "completed": true,
  "message": "Plan audit complete; 1 finding.",
  "findings": [
    { "id": "F1", "severity": "high", "category": "path-existence",
      "claim": "Step 3 names `src/utils/foo.ts` which does not exist.",
      "evidence": "Worker grepped for the file under cwd; no match found.",
      "suggestion": "Use `src/utils/bar.ts` instead.",
      "source": "implementer" }
  ],
  "filesChanged": [],
  "commitSha": null,
  "summary": "...",
  "telemetry": { ... }
}
```
|
|
172
|
+
|
|
173
|
+
### Finding shape
|
|
174
|
+
|
|
175
|
+
Every finding has this shape:
|
|
176
|
+
|
|
177
|
+
| Field | Type | Notes |
|---|---|---|
| `id` | string | Worker-assigned, e.g. `F1`, `F2`. Stable across chain. |
| `severity` | `'critical' \| 'high' \| 'medium' \| 'low'` | 4-tier. |
| `category` | string | Topical bucket, e.g. `path-existence`, `prose-coherence`. |
| `claim` | string | One-sentence summary. |
| `evidence` | string ≥20 chars | Verbatim from source when grounded. |
| `suggestion?` | string | Optional fix recommendation. |
| `source` | `'implementer' \| 'reviewer'` | Who produced the finding. |
|
|
186
|
+
|
|
187
|
+
`annotatorConfidence` and `evidenceGrounded` are retired — they were v4 fields with no producers.
|
|
188
|
+
|
|
189
|
+
### Recommended rendering by the main agent
|
|
190
|
+
|
|
191
|
+
1. Show ALL findings; never silently drop any. Severity and grounding are soft signals, not gates.
2. Default sort: severity (critical → low), then `id` ascending.
3. `severity` is the authoritative value; use it directly.
4. Mark findings with `evidence` shorter than 30 chars as "low-evidence" (lighter color or `(low evidence)` annotation). The user decides what to do with them.
5. Severity-tier counts feed the dashboard.
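The rendering rules can be sketched as a single pure function. This is an illustration under assumptions, not the main agent's actual renderer: `renderFindings`, the line format, and the choice to sort `F2` before `F10` (numeric-aware `id` ordering) are all hypothetical.

```typescript
type Severity = "critical" | "high" | "medium" | "low";

// Mirrors the "Finding shape" table; the type name is illustrative.
interface Finding {
  id: string;
  severity: Severity;
  category: string;
  claim: string;
  evidence: string;
  suggestion?: string;
  source: "implementer" | "reviewer";
}

const SEVERITY_RANK: Record<Severity, number> = { critical: 0, high: 1, medium: 2, low: 3 };
const LOW_EVIDENCE_THRESHOLD = 30; // rule 4: shorter evidence gets annotated, not dropped

function renderFindings(findings: Finding[]): { lines: string[]; counts: Record<Severity, number> } {
  const counts: Record<Severity, number> = { critical: 0, high: 0, medium: 0, low: 0 };

  // Rule 2: severity first (critical to low), then id ascending.
  // Assumption: ids compare numerically, so F2 sorts before F10.
  const sorted = findings.slice().sort((a, b) =>
    SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity] ||
    a.id.localeCompare(b.id, undefined, { numeric: true })
  );

  // Rule 1: every finding produces a line. Rule 5: tally severity tiers.
  const lines = sorted.map((f) => {
    counts[f.severity] += 1;
    const tag = f.evidence.length < LOW_EVIDENCE_THRESHOLD ? " (low evidence)" : "";
    return "[" + f.severity + "] " + f.id + " " + f.claim + tag;
  });
  return { lines, counts };
}
```

Returning the counts alongside the lines keeps the dashboard numbers consistent with what was actually rendered.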
|
|
198
|
+
|
|
199
|
+
## Best practices
|
|
200
|
+
|
|
201
|
+
This skill is one step in the larger flow described in `multi-model-agent` → "Best practices". Recipes that involve `mma-audit`:
|
|
202
|
+
|
|
203
|
+
- **Recipe A — Audit-iterate-clean.** `mma-audit` → fix → `mma-audit` again. Sequential rounds. Register the doc via `mma-context-blocks` before round 1 and reuse the same ID across all rounds — avoids re-inlining the same content into every audit call.
|
|
204
|
+
|
|
205
|
+
- **Recipe E — Plan-validate-execute.** Before any `mma-execute-plan` batch, run `mma-audit` with `subtype: 'plan'` on the plan file. Read the findings. If any `critical` / `high` finding survives, fix the plan and re-audit. This catches the bug class where the plan's named methods/files don't actually exist in the codebase — symbols a prose-coherence audit cannot see.
|
|
206
|
+
|
|
207
|
+
- **Recipe F — Spec-then-plan-then-execute (the canonical flow).** When working from a brainstorming spec: `mma-audit` (`subtype: 'spec'`) → fix → `writing-plans` → register the spec as a context block via `mma-context-blocks` → `mma-audit` (`subtype: 'plan'`, `contextBlockIds: [specBlockId]`) → fix → `mma-execute-plan`. Spec audit covers requirement-prose executability; plan audit covers BOTH plan-vs-codebase coherence AND plan-vs-spec coverage (perspective 10 fires only when the spec is in context, which is why the context-block step is load-bearing in this recipe).
|
|
208
|
+
|
|
209
|
+
Anti-pattern alert: **`parallel-rounds-same-target`** (AP1). Three parallel audits on the same document re-flag the same issues without seeing each other's fixes. Run rounds sequentially with a fix between each.
|
|
210
|
+
|
|
211
|
+
## Common pitfalls
|
|
212
|
+
|
|
213
|
+
❌ **Auditing source code with `mma-audit`**
|
|
214
|
+
The auditor lacks codebase context (no type info, no call-site lookup, no test awareness). Findings are speculative. **Fix:** use `mma-review` — it pulls in surrounding source context and validates against the actual types.
|
|
215
|
+
|
|
216
|
+
❌ **Single huge `document` string instead of `filePaths`**
|
|
217
|
+
Inline docs lose the file boundary, so the per-file parallel split degenerates to one worker. **Fix:** save to disk first, pass `filePaths`.
|
|
218
|
+
|
|
219
|
+
❌ **Sending the legacy `auditType` field**
|
|
220
|
+
The field was renamed to `subtype` and the value set was narrowed. **Fix:** use `subtype` with one of `default` / `plan` / `spec` / `skill`. For "security only" / "performance only" lenses, put the bias in the free-text prompt — there is no narrow-lens subtype.
|
|
221
|
+
|
|
222
|
+
❌ **Re-auditing the same files round after round without delta context**
|
|
223
|
+
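The round 2 worker has no idea what round 1 found. **Fix:** hand round 1's findings to round 2 as a context block. Either register them via `mma-context-blocks`, or reuse the `contextBlockId` that round 1's result already returned (see "Terminal context block"), and list that ID in round 2's `contextBlockIds`.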
|
|
224
|
+
|
|
225
|
+
## Terminal context block
|
|
226
|
+
|
|
227
|
+
Every completed **read-route** task (audit / review / debug / investigate / research) auto-registers a reusable terminal context block containing its report (headline + findings). The block id is returned on each per-task result as **`contextBlockId`**. Write routes (delegate / execute-plan / retry) return `contextBlockId: null` — their record is the commit, not a block. This block is immutable, lives for the session duration, and counts against the project's `maxEntries` quota (default 500).
|
|
228
|
+
|
|
229
|
+
Use it for delta follow-ups — feed prior results' block ids into a later call's `contextBlockIds`, filtering out nulls:
|
|
230
|
+
|
|
231
|
+
contextBlockIds: priorResults.map(r => r.contextBlockId).filter((id) => id !== null)
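In TypeScript, the one-liner above may still be typed as `(string | null)[]` on compiler versions that do not infer the narrowing from `id !== null`. A minimal sketch with an explicit type predicate follows; `TaskResult` and `collectContextBlockIds` are hypothetical names, and only the `contextBlockId` field comes from this document.

```typescript
// Only the field this example needs; real task results carry more.
interface TaskResult {
  contextBlockId: string | null;
}

// Collects the reusable block ids from earlier results, dropping the nulls
// that write routes (delegate / execute-plan / retry) return.
function collectContextBlockIds(priorResults: TaskResult[]): string[] {
  return priorResults
    .map((r) => r.contextBlockId)
    // The explicit predicate tells the compiler the result is string[].
    .filter((id): id is string => id !== null);
}
```

The returned array can be passed directly as `contextBlockIds` on the follow-up call.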
|
|
232
|
+
|
|
233
|
+
**Use cases:**
|
|
234
|
+
- Pass round-N audit findings to round N+1 via `contextBlockIds`
|
|
235
|
+
- Feed audit results into a downstream `mma-delegate` fix step
|
|
236
|
+
- Accumulate findings across iterative audit rounds
|
|
237
|
+
|
|
238
|
+
The block is registered server-side at task completion; no caller action is needed to create it. Delete it explicitly via `DELETE /context-blocks/:id` when no longer needed, or let it expire on session teardown.
|
|
239
|
+
|
|
240
|
+
## Outcome semantics
|
|
241
|
+
|
|
242
|
+
Every task result carries outcome fields that describe the audit's conclusion status:
|
|
243
|
+
|
|
244
|
+
| Field | Type | Meaning |
|---|---|---|
| `findingsOutcome` | `'found' \| 'clean' \| 'not_applicable'` | Answers the question: did the audit uncover issues? |
| `findingsOutcomeReason` | `string \| null` | When `findingsOutcome` is set, this explains why (e.g. "3 critical findings: broken paths, missing symbols, mismatched signatures" or "Document is clean across all audit criteria"). |
| `outcomeInferred` | `boolean` | `true` if the system inferred the outcome from findings count; `false` if the auditor explicitly stated it. |
| `outcomeMalformed` | `boolean` | `true` if the outcome line was malformed and had to be repaired; `false` otherwise. |
|
|
250
|
+
|
|
251
|
+
### Enum values
|
|
252
|
+
|
|
253
|
+
- **`found`** — the audit surfaced one or more issues (findings) in the artifact across one or more criteria. This indicates the artifact needs rework before downstream use.
|
|
254
|
+
- **`clean`** — the audit completed and found zero issues. The artifact is clear across all audit criteria and ready for downstream use.
|
|
255
|
+
- **`not_applicable`** — the audit could not proceed (e.g., wrong input type, missing preconditions, or system error). This is rare; most audits resolve to `found` or `clean`.
|
|
256
|
+
|
|
257
|
+
### Empty findings ≠ failure
|
|
258
|
+
|
|
259
|
+
A crucial semantic: **empty findings does NOT mean `completed: false` or a failed task.** Finding nothing wrong is a successful audit outcome — it means the document passed the bar. An audit with zero findings is `completed: true` with `findingsOutcome: 'clean'`.
|
|
260
|
+
|
|
261
|
+
### Per-route legal outcomes
|
|
262
|
+
|
|
263
|
+
The legal outcomes for this route are: `['found', 'clean']`
|
|
264
|
+
|
|
265
|
+
- **`found`** — one or more issues were detected across the audit criteria.
|
|
266
|
+
- **`clean`** — zero issues were detected; the artifact is ready for downstream use.
|
|
267
|
+
|
|
268
|
+
The outcome `not_applicable` is not legal for `mma-audit` (except on actual precondition failures) because an audit always produces a verdict: either issues found or clean.
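One way a caller might turn these outcome fields into a next action is sketched below. The `NextStep` labels and the `nextStep` function are hypothetical; the rule that a surviving `critical` or `high` finding triggers a fix and re-audit is taken from the plan-audit guidance earlier in this skill.

```typescript
type FindingsOutcome = "found" | "clean" | "not_applicable";

// Only the fields this decision needs; real results carry more.
interface AuditResult {
  completed: boolean;
  findingsOutcome: FindingsOutcome;
  findings: { severity: "critical" | "high" | "medium" | "low" }[];
}

// Hypothetical action labels, chosen for illustration only.
type NextStep =
  | "proceed"
  | "fix-and-reaudit"
  | "review-findings"
  | "check-preconditions"
  | "handle-failure";

function nextStep(result: AuditResult): NextStep {
  // A task that did not complete is a failure regardless of findings.
  if (!result.completed) return "handle-failure";
  // Rare for this route: the audit could not proceed at all.
  if (result.findingsOutcome === "not_applicable") return "check-preconditions";
  // Empty findings are a success, not a failure.
  if (result.findingsOutcome === "clean") return "proceed";
  // 'found': critical or high findings block downstream use.
  const blocking = result.findings.some(
    (f) => f.severity === "critical" || f.severity === "high"
  );
  return blocking ? "fix-and-reaudit" : "review-findings";
}
```

Keeping `completed` and `findingsOutcome` as separate checks reflects the point above: a clean audit is still a completed, successful task.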
|
|
269
|
+
|
|
270
|
+
@include _shared/error-handling.md
|
|
@@ -0,0 +1,148 @@
|
|
|
1
|
+
---
name: mma-context-blocks
description: >-
  Use when a document larger than ~2 KB will be referenced by 2+ subsequent
  mma-* calls: register once, pass the returned ID to each call instead of
  re-uploading the same content. OR a spec / plan / error log was already
  inlined into one task and is about to be inlined into a second: register on
  the second reference, never the third.
when_to_use: >-
  A document (spec, plan, codebase summary, prior round's findings, error log)
  larger than ~2 KB will be referenced by two or more mma-* calls in a row.
  Register once here, then pass the ID via `contextBlockIds` on mma-delegate /
  mma-execute-plan / mma-audit / mma-review / mma-debug / mma-investigate.
  Cheaper and faster than inlining the same content N times.
version: 5.0.3
---
|
|
17
|
+
|
|
18
|
+
# mma-context-blocks
|
|
19
|
+
|
|
20
|
+
## Overview
|
|
21
|
+
|
|
22
|
+
Store large documents once; reference them by ID in subsequent `mma-*` calls via `contextBlockIds`. The service prepends the block content to each task prompt that references the ID — content is transmitted ONCE to the daemon, then reused server-side.
|
|
23
|
+
|
|
24
|
+
**Core principle:** Without context blocks, the same document is sent N times for N tasks. Blocks transmit once. The savings compound on shared specs, prior-round findings, and codebase summaries.
|
|
25
|
+
|
|
26
|
+
## When to Use
|
|
27
|
+
|
|
28
|
+
**Use when:**
|
|
29
|
+
- A doc >2 KB will be referenced by ≥2 mma-* calls
|
|
30
|
+
- You're running iterative audit/review rounds (round 2 references round 1's findings)
|
|
31
|
+
- A spec or design doc is the shared input across N parallel tasks
|
|
32
|
+
- A long error log is the context for debug + delegate calls
|
|
33
|
+
|
|
34
|
+
**Don't use when:**
|
|
35
|
+
- The doc is <2 KB and used once → just inline it (registration overhead exceeds savings)
|
|
36
|
+
- The doc changes between calls → context blocks are immutable; register a new one
|
|
37
|
+
- Single task that doesn't reference any large shared content → no benefit
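The size-and-reuse rule of thumb above can be written as a small predicate. This is a sketch only: `shouldRegister` is a hypothetical helper, and treating "~2 KB" as exactly 2048 bytes is an assumption, since the guidance gives an approximate figure.

```typescript
// Assumption: "~2 KB" is read as 2048 bytes; the document gives no exact cutoff.
const SIZE_THRESHOLD_BYTES = 2 * 1024;

// Returns true when registering a context block is likely worth the overhead:
// the document is larger than the threshold AND at least two calls will use it.
function shouldRegister(sizeBytes: number, plannedReferences: number): boolean {
  return sizeBytes > SIZE_THRESHOLD_BYTES && plannedReferences >= 2;
}
```

A document that changes between calls still needs a fresh registration each time, because blocks are immutable; this predicate does not model that case.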
|
|
38
|
+
|
|
39
|
+
## Endpoints
|
|
40
|
+
|
|
41
|
+
### Register a context block
|
|
42
|
+
|
|
43
|
+
`POST /context-blocks?cwd=<abs-path>`
|
|
44
|
+
|
|
45
|
+
@include _shared/auth.md
|
|
46
|
+
|
|
47
|
+
#### Request body
|
|
48
|
+
|
|
49
|
+
```json
{
  "content": "# Project spec\n...",
  "ttlMs": 3600000
}
```
|
|
55
|
+
|
|
56
|
+
| Field | Type | Required | Notes |
|---|---|---|---|
| `content` | string | yes | Document content (min 1 char, max 50 MiB) |
| `ttlMs` | number | no | Time-to-live in ms; omit for idle-expiry (default 24 h idle). A block that is not referenced by any active batch for 24 h is eligible for eviction. |
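A client can apply the documented `content` limits before sending the request. The sketch below is illustrative: `buildRegisterBody` and `utf8ByteLength` are hypothetical helpers, and measuring the 50 MiB limit in UTF-8 bytes is an assumption, since the table does not say how the server measures size.

```typescript
const MAX_CONTENT_BYTES = 50 * 1024 * 1024; // 50 MiB, from the table above

// Counts UTF-8 bytes without relying on platform-specific APIs.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) {
      bytes += 1;
    } else if (code < 0x800) {
      bytes += 2;
    } else if (code >= 0xd800 && code <= 0xdbff && i + 1 < text.length) {
      const next = text.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        bytes += 4; // a surrogate pair encodes one 4-byte code point
        i++;
      } else {
        bytes += 3; // lone surrogate, encoded as a 3-byte replacement
      }
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// Builds the request body, leaving ttlMs out entirely when it is not given
// so that the documented idle-expiry default applies.
function buildRegisterBody(content: string, ttlMs?: number): { content: string; ttlMs?: number } {
  if (content.length < 1) {
    throw new Error("content must be at least 1 character");
  }
  if (utf8ByteLength(content) > MAX_CONTENT_BYTES) {
    throw new Error("content exceeds 50 MiB");
  }
  return ttlMs === undefined ? { content } : { content, ttlMs };
}
```

The result can be serialized with `JSON.stringify` and sent as the body of `POST /context-blocks`.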
|
|
60
|
+
|
|
61
|
+
#### Response (201)
|
|
62
|
+
|
|
63
|
+
```json
{ "id": "cb_abc123" }
```
|
|
66
|
+
|
|
67
|
+
Use this `id` as a `contextBlockIds` entry in any `mma-*` skill that supports it.
|
|
68
|
+
|
|
69
|
+
### Delete a context block
|
|
70
|
+
|
|
71
|
+
`DELETE /context-blocks/:id?cwd=<abs-path>`
|
|
72
|
+
|
|
73
|
+
Returns `200 { ok: true }` on success. Returns `409 pinned` if the block is held by one or more active batches — wait for those batches to complete before deleting.
|
|
74
|
+
|
|
75
|
+
## Full example
|
|
76
|
+
|
|
77
|
+
```bash
# Register spec document once
ID=$(curl -f --show-error -s -X POST \
  -H "X-MMA-Client: $MMA_CLIENT" \
  -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d "{\"content\":$(jq -Rs . < /project/docs/spec.md)}" \
  "http://localhost:$PORT/context-blocks?cwd=/project" | jq -r '.id')

# Reference from N delegate tasks
curl -f --show-error -s -X POST \
  -H "X-MMA-Client: $MMA_CLIENT" \
  -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d "{\"tasks\":[
    {\"prompt\":\"Implement section 3 per spec\",\"contextBlockIds\":[\"$ID\"]},
    {\"prompt\":\"Implement section 4 per spec\",\"contextBlockIds\":[\"$ID\"]}
  ]}" \
  "http://localhost:$PORT/delegate?cwd=/project"
```
|
|
99
|
+
|
|
100
|
+
## v5 wire shape (register-context-block route)
|
|
101
|
+
|
|
102
|
+
Every task result is a `ComposePayload`. For the `register-context-block` route, the envelope has one additional field beyond the standard seven:
|
|
103
|
+
|
|
104
|
+
```json
{
  "completed": true,
  "message": "Context block cb_abc123 registered (12345 bytes)",
  "findings": [],
  "summary": "",
  "filesChanged": [],
  "commitSha": null,
  "blockId": "cb_abc123",
  "telemetry": { ... }
}
```
|
|
116
|
+
|
|
117
|
+
`blockId` is **non-null only for the `register-context-block` route**. For every other route (`delegate`, `execute-plan`, `investigate`, etc.), `blockId` is `null`. This is the only signal that distinguishes a register-context-block result from any other route — no route-keyed discriminated union, just one extra nullable field on the shared shape.
|
|
118
|
+
|
|
119
|
+
The terminal context block (per-task, auto-registered) uses a different ID format and is separate from the `blockId` in the wire envelope.
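Because `blockId` is the only distinguishing signal, a caller can detect a register-context-block result with a nullable-field check. The sketch below is an assumption-laden illustration: `ComposePayloadSketch` lists only the fields visible in the example envelope, with guessed types, and is not the package's actual `ComposePayload` definition.

```typescript
// Field list inferred from the example envelope above; element types are guesses.
interface ComposePayloadSketch {
  completed: boolean;
  message: string;
  findings: unknown[];
  summary: string;
  filesChanged: string[];
  commitSha: string | null;
  blockId: string | null;
  telemetry: Record<string, unknown>;
}

// Type guard: true only when the result carries a registered block id.
function isRegisterResult(
  result: ComposePayloadSketch
): result is ComposePayloadSketch & { blockId: string } {
  return result.completed && result.blockId !== null;
}
```

Inside an `if (isRegisterResult(r))` branch, `r.blockId` is typed as `string` and can be pushed straight into a `contextBlockIds` array.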
|
|
120
|
+
|
|
121
|
+
## Best practices
|
|
122
|
+
|
|
123
|
+
This skill is the cross-cutting state mechanism described in `multi-model-agent` → "Best practices". Recipes that use context blocks:
|
|
124
|
+
|
|
125
|
+
- **Recipe A — Audit-iterate-clean.** Register the doc once before round 1; pass round-N's findings block ID into round N+1.
|
|
126
|
+
- **Recipe B — Debug-fix-verify.** Register the failing test output / reproduction log before the debug call; reuse on verify.
|
|
127
|
+
- **Recipe C — Investigate-plan-execute.** Register the plan file before `mma-execute-plan`.
|
|
128
|
+
- **Recipe D — Plan-execute-retry.** No new registration needed — `mma-retry` inherits the original batch's `contextBlockIds`.
|
|
129
|
+
|
|
130
|
+
Anti-pattern alert: **`re-inlined-shared-content`** (AP3). Pasting the same spec into 5 task prompts costs N× tokens. Register once; pass `contextBlockIds`.
|
|
131
|
+
|
|
132
|
+
## Common pitfalls
|
|
133
|
+
|
|
134
|
+
❌ **Inlining the same 50KB spec into every task prompt**
|
|
135
|
+
> tasks: [{prompt: "Implement section 3:\n[50KB spec]"}, {prompt: "Implement section 4:\n[50KB spec]"}]
|
|
136
|
+
|
|
137
|
+
N×50KB transmissions; main context burns through tokens. **Fix:** register the spec once, pass `contextBlockIds: ["cb_xxx"]` to each task.
|
|
138
|
+
|
|
139
|
+
❌ **Forgetting to delete unused blocks**
|
|
140
|
+
Blocks count against the project's context-block quota (`maxEntries` 500). **Fix:** explicitly `DELETE` after the dependent batches finish — or let idle expiry (24 h) evict them.
|
|
141
|
+
|
|
142
|
+
❌ **Trying to update a block's content**
|
|
143
|
+
Blocks are immutable. **Fix:** register a new block with the new content; switch the `contextBlockIds` to the new ID.
|
|
144
|
+
|
|
145
|
+
❌ **Deleting a block while a batch still references it**
|
|
146
|
+
Returns `409 pinned`. **Fix:** poll the dependent batches to terminal first, then delete.
|
|
147
|
+
|
|
148
|
+
@include _shared/error-handling.md
|