@zhixuan92/multi-model-agent 4.7.19 → 4.8.0
- package/README.md +10 -7
- package/dist/http/handlers/control/batch.d.ts +2 -0
- package/dist/http/handlers/control/batch.d.ts.map +1 -1
- package/dist/http/handlers/control/batch.js +17 -1
- package/dist/http/handlers/control/batch.js.map +1 -1
- package/dist/http/handlers/tools/journal-recall.d.ts +4 -0
- package/dist/http/handlers/tools/journal-recall.d.ts.map +1 -0
- package/dist/http/handlers/tools/journal-recall.js +40 -0
- package/dist/http/handlers/tools/journal-recall.js.map +1 -0
- package/dist/http/handlers/tools/journal-record.d.ts +4 -0
- package/dist/http/handlers/tools/journal-record.d.ts.map +1 -0
- package/dist/http/handlers/tools/journal-record.js +35 -0
- package/dist/http/handlers/tools/journal-record.js.map +1 -0
- package/dist/http/handlers/tools/research.d.ts.map +1 -1
- package/dist/http/handlers/tools/research.js +0 -1
- package/dist/http/handlers/tools/research.js.map +1 -1
- package/dist/http/server.d.ts.map +1 -1
- package/dist/http/server.js +6 -2
- package/dist/http/server.js.map +1 -1
- package/dist/skill-install/discover.d.ts +1 -1
- package/dist/skill-install/discover.d.ts.map +1 -1
- package/dist/skill-install/discover.js +2 -0
- package/dist/skill-install/discover.js.map +1 -1
- package/dist/skills/mma-audit/SKILL.md +6 -2
- package/dist/skills/mma-context-blocks/SKILL.md +1 -1
- package/dist/skills/mma-debug/SKILL.md +6 -2
- package/dist/skills/mma-delegate/SKILL.md +3 -9
- package/dist/skills/mma-execute-plan/SKILL.md +3 -9
- package/dist/skills/mma-explore/SKILL.md +54 -27
- package/dist/skills/mma-investigate/SKILL.md +6 -2
- package/dist/skills/mma-journal-recall/SKILL.md +242 -0
- package/dist/skills/mma-journal-record/SKILL.md +189 -0
- package/dist/skills/mma-research/SKILL.md +14 -5
- package/dist/skills/mma-retry/SKILL.md +4 -4
- package/dist/skills/mma-review/SKILL.md +6 -2
- package/dist/skills/multi-model-agent/SKILL.md +7 -3
- package/package.json +2 -2
|
@@ -3,31 +3,36 @@ name: mma-explore
|
|
|
3
3
|
description: >-
|
|
4
4
|
Use when about to brainstorm or plan and need a divergent landscape scan —
|
|
5
5
|
orchestrates parallel internal-codebase investigation + external multi-source
|
|
6
|
-
research
|
|
7
|
-
single-answer questions (use
|
|
6
|
+
research + prior-learnings recall from the project journal, then synthesises
|
|
7
|
+
3–5 distinct directions. Not for "where is X" single-answer questions (use
|
|
8
|
+
mma-investigate).
|
|
8
9
|
when_to_use: >-
|
|
9
10
|
You are about to brainstorm or plan and need a broad landscape scan before
|
|
10
11
|
narrowing. The question is exploratory ("what are our options", "what
|
|
11
12
|
approaches exist", "survey how others handle"). The skill instructs you to fan
|
|
12
|
-
out mma-investigate (internal)
|
|
13
|
-
|
|
14
|
-
questions — those
|
|
15
|
-
|
|
13
|
+
out mma-investigate (internal), mma-research (external), and
|
|
14
|
+
mma-journal-recall (prior learnings/decisions) in parallel and synthesise the
|
|
15
|
+
results yourself. DO NOT use for convergent single-answer questions — those
|
|
16
|
+
are mma-investigate.
|
|
17
|
+
version: 4.8.0
|
|
16
18
|
---
|
|
17
19
|
|
|
18
20
|
# mma-explore
|
|
19
21
|
|
|
20
22
|
## Overview
|
|
21
23
|
|
|
22
|
-
Codebase + external sources, synthesised into 3–5 distinct
|
|
23
|
-
delegated calls
|
|
24
|
-
|
|
25
|
-
the
|
|
24
|
+
Codebase + external sources + prior learnings, synthesised into 3–5 distinct
|
|
25
|
+
directions. Three delegated calls run in parallel — `mma-investigate` (internal
|
|
26
|
+
codebase), `mma-research` (external sources), and `mma-journal-recall` (what
|
|
27
|
+
this project already learned/decided, from the `.mmagent/journal/` graph) —
|
|
28
|
+
and **you** synthesise their results into the final output.
|
|
26
29
|
|
|
27
30
|
**Core principle:** Exploration is divergent (survey, enumerate, compare).
|
|
28
|
-
Synthesis turns raw threads into ranked, citable directions. The
|
|
29
|
-
|
|
30
|
-
|
|
31
|
+
Synthesis turns raw threads into ranked, citable directions. The three legs
|
|
32
|
+
are delegated; the synthesis is your judgment work and stays in main context.
|
|
33
|
+
The journal leg is what keeps you from re-proposing a direction the project
|
|
34
|
+
already tried and dropped — it grounds the scan in your own history, not just
|
|
35
|
+
the code and the outside world.
|
|
31
36
|
|
|
32
37
|
## When to Use
|
|
33
38
|
|
|
@@ -56,41 +61,50 @@ digraph when_to_use {
|
|
|
56
61
|
|
|
57
62
|
## How to run
|
|
58
63
|
|
|
59
|
-
Dispatch
|
|
64
|
+
Dispatch ALL THREE in ONE message (parallel tool use):
|
|
60
65
|
|
|
61
66
|
1. `mma-investigate` — internal codebase research
|
|
62
67
|
- You MAY skip this only if the question is unambiguously greenfield (no
|
|
63
68
|
codebase touch-points exist). When in doubt, run it.
|
|
64
69
|
2. `mma-research` — external multi-source research
|
|
70
|
+
3. `mma-journal-recall` — prior learnings/decisions from the project journal
|
|
71
|
+
- Always run it. If the project has no journal yet (or nothing relevant),
|
|
72
|
+
it returns zero findings — a valid result you handle with the
|
|
73
|
+
`(no prior learning)` sentinel. Never skip it to "save a call": a
|
|
74
|
+
superseded prior decision is exactly the signal you most want before
|
|
75
|
+
brainstorming.
|
|
65
76
|
|
|
66
|
-
Wait for
|
|
67
|
-
|
|
77
|
+
Wait for all legs to return. Do NOT proceed to synthesis until you have every
|
|
78
|
+
result (or have decided to skip investigate as greenfield).
|
|
68
79
|
|
|
69
80
|
## Endpoint
|
|
70
81
|
|
|
71
82
|
This is a main-agent skill — there is no dedicated `/explore` HTTP endpoint.
|
|
72
|
-
Behind the scenes, you dispatch the
|
|
73
|
-
(`POST /investigate`)
|
|
83
|
+
Behind the scenes, you dispatch the three delegated tools `mma-investigate`
|
|
84
|
+
(`POST /investigate`), `mma-research` (`POST /research`), and
|
|
85
|
+
`mma-journal-recall` (`POST /journal-recall`) yourself.
|
|
74
86
|
|
|
75
87
|
## Request body
|
|
76
88
|
|
|
77
|
-
(Not applicable — this skill orchestrates
|
|
78
|
-
[`mma-investigate`](../mma-investigate/SKILL.md)
|
|
79
|
-
[`mma-research`](../mma-research/SKILL.md)
|
|
89
|
+
(Not applicable — this skill orchestrates three other skills.) See
|
|
90
|
+
[`mma-investigate`](../mma-investigate/SKILL.md),
|
|
91
|
+
[`mma-research`](../mma-research/SKILL.md), and
|
|
92
|
+
[`mma-journal-recall`](../mma-journal-recall/SKILL.md) for their request bodies.
|
|
80
93
|
|
|
81
94
|
## Full example
|
|
82
95
|
|
|
83
|
-
The main agent (you) issues a single message with
|
|
96
|
+
The main agent (you) issues a single message with three parallel tool calls:
|
|
84
97
|
|
|
85
98
|
```
|
|
86
99
|
[parallel tool use]
|
|
87
|
-
mma-investigate
|
|
88
|
-
mma-research
|
|
100
|
+
mma-investigate { question: "How does our streaming JSON parser handle backpressure?", filePaths: ["src/parsers/"] }
|
|
101
|
+
mma-research { researchQuestion: "State-of-the-art streaming JSON parsers with backpressure?", background: "We use a single-pass push parser." }
|
|
102
|
+
mma-journal-recall { query: "what have we learned about streaming-parser backpressure or buffering tradeoffs?" }
|
|
89
103
|
```
|
|
90
104
|
|
|
91
105
|
## Reading the leg results
|
|
92
106
|
|
|
93
|
-
|
|
107
|
+
All three legs (`mma-investigate`, `mma-research`, `mma-journal-recall`) return the v5 wire envelope (see `mma-investigate/SKILL.md` → "v5 wire shape"). Each sub-task result is a `ComposePayload` with the standard seven fields. The authoritative citation source is **`results[0].findings`** — an array of `{ id, severity, category, claim, evidence, suggestion, source }`.
|
|
94
108
|
|
|
95
109
|
Explore top-level orchestration aggregates sub-task results into a valid `ImplementPayload` (read-route shape) before the final `annotate` stage runs. Each sub-task follows the same v5 wire shape; the top-level result is a composition of those sub-tasks.
|
|
96
110
|
|
|
@@ -99,6 +113,7 @@ Explore top-level orchestration aggregates sub-task results into a valid `Implem
|
|
|
99
113
|
| Did the leg succeed? | `results[0].completed === true` — findings may be zero on a read route; finding nothing wrong is a valid completion |
|
|
100
114
|
| Internal citation source | `results[0].findings[i].claim` plus a `file:LINE` token from `results[0].findings[i].evidence` (workers style them as `` `path:LINE` `` markdown-linked refs) |
|
|
101
115
|
| External citation source | `results[0].findings[i].claim` plus a source name / URL from `results[0].findings[i].evidence` |
|
|
116
|
+
| Prior-learning source | `results[0].findings[i].claim` plus a journal node id from `results[0].findings[i].evidence` (recall cites `` `.mmagent/journal/nodes/NNNN-…` `` or `node NNNN`). Watch the node's status: a **superseded** learning is a "we tried this and moved on" signal — surface it, don't bury it |
|
|
102
117
|
| Divergence axis | `results[0].findings[i].category` groups findings by criterion — pick across categories so threads don't collapse onto one axis |
|
|
103
118
|
|
|
104
119
|
Apply a sentinel only when `findings` is empty AND `results[0].message` contains no finding-level content — i.e., the worker genuinely returned nothing. Do NOT apply a sentinel just because `results[0].message` reads tersely or `results[0].telemetry.workerSelfAssessment === 'failed'` — a worker can say `'failed'` with usable partial findings.
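The sentinel rule and the divergence-axis row can be sketched in TypeScript. This is a minimal sketch under stated assumptions, not the package's implementation: `LegResult`, `shouldApplySentinel`, and `onePerCategory` are hypothetical names, and whether a message carries finding-level content is passed in as a caller-supplied predicate because the text leaves that judgment to the main agent.

```typescript
// Shape of one finding, taken from the field list quoted in the text.
interface Finding {
  id: string;
  severity: string;
  category: string;
  claim: string;
  evidence: string;
  suggestion: string | null;
  source: string;
}

// Hypothetical minimal view of results[0]; only the fields used here.
interface LegResult {
  completed: boolean;
  message: string;
  findings: Finding[];
}

// A sentinel applies only when findings is empty AND the message carries no
// finding-level content. workerSelfAssessment is deliberately not consulted.
function shouldApplySentinel(
  leg: LegResult,
  messageHasFindingContent: (message: string) => boolean,
): boolean {
  return leg.findings.length === 0 && !messageHasFindingContent(leg.message);
}

// Keep the first finding of each category so threads spread across axes.
function onePerCategory(findings: Finding[]): Finding[] {
  const seen: { [category: string]: boolean } = {};
  const picked: Finding[] = [];
  for (const f of findings) {
    if (!Object.prototype.hasOwnProperty.call(seen, f.category)) {
      seen[f.category] = true;
      picked.push(f);
    }
  }
  return picked;
}
```

Keeping the first finding per category is only one possible selection policy; ranking by `severity` within a category would be an equally reasonable choice.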
|
|
@@ -116,11 +131,21 @@ Produce **3–5 threads**. Each thread MUST have:
|
|
|
116
131
|
- One **external citation** (from research) — `<source> — claim`.
|
|
117
132
|
- Pick from `results[0].findings`: take `claim` as the citation claim and pull a source name / URL out of `evidence`.
|
|
118
133
|
- Use the sentinel `(no external source found)` only when `results[0].findings` is empty for the research leg.
|
|
134
|
+
- One **prior-learning citation** (from journal-recall) WHEN a relevant node exists — `(journal) node NNNN — claim`.
|
|
135
|
+
- Pick from the recall leg's `results[0].findings`: take `claim` as the citation and pull the node id out of `evidence`.
|
|
136
|
+
- If the cited node is **superseded**, say so inline (e.g. `(journal) node 0012 [superseded by 0013] — …`) so the thread carries the "we already moved past this" signal.
|
|
137
|
+
- Use the sentinel `(no prior learning)` when the recall leg returned no relevant node — most threads on a young project will use this, and that's fine.
|
|
119
138
|
- A **one-line divergence reason** — what makes this thread different from
|
|
120
139
|
the others. No two threads may share the same divergence axis.
|
|
121
140
|
|
|
141
|
+
If the recall leg surfaced a learning that **invalidates** a direction (a
|
|
142
|
+
superseded or dropped decision that maps onto a thread you'd otherwise
|
|
143
|
+
propose), do not silently omit it — keep the thread but mark it
|
|
144
|
+
`⚠ already explored — see (journal) node NNNN` and weight it down in the
|
|
145
|
+
recommendation. Prior learnings prune the search; they don't just decorate it.
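The prior-learning citation format described above could be produced along these lines. This is a hedged sketch: `extractNodeId` and `formatPriorLearning` are hypothetical helpers, the regular expression assumes node ids appear in `evidence` as `nodes/NNNN-...` or `node NNNN`, and the superseding node id is passed in by the caller because the text does not say where node status appears on the wire.

```typescript
// Hypothetical view of a recall finding; only the fields used here.
interface RecallFinding {
  claim: string;
  evidence: string;
}

const DASH = "\u2014"; // the em dash separator used in the skill's citation format
const NO_PRIOR_LEARNING = "(no prior learning)";

// Assumption: ids look like "nodes/0012-..." or "node 0012" inside evidence.
function extractNodeId(evidence: string): string | null {
  const match = /(?:nodes\/|\bnode\s+)(\d+)/.exec(evidence);
  return match ? (match[1] ?? null) : null;
}

// Returns the sentinel when there is no finding or no recognisable node id
// (the second fallback is an assumption of this sketch).
function formatPriorLearning(
  finding: RecallFinding | undefined,
  supersededBy?: string,
): string {
  if (!finding) return NO_PRIOR_LEARNING;
  const nodeId = extractNodeId(finding.evidence);
  if (nodeId === null) return NO_PRIOR_LEARNING;
  const status = supersededBy ? ` [superseded by ${supersededBy}]` : "";
  return `(journal) node ${nodeId}${status} ${DASH} ${finding.claim}`;
}
```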
|
|
146
|
+
|
|
122
147
|
End with `## Recommended next step` — one paragraph naming which thread to
|
|
123
|
-
pursue first and why.
|
|
148
|
+
pursue first and why. If a prior learning rules a thread in or out, cite it here.
|
|
124
149
|
|
|
125
150
|
## Best practices
|
|
126
151
|
|
|
@@ -154,7 +179,9 @@ directions in the data.
|
|
|
154
179
|
|---|---|
|
|
155
180
|
| `mma-research` failed | Use `(no external source found)` sentinel on every external line. If `mma-investigate` also failed, do NOT synthesise — surface both errors to the user. |
|
|
156
181
|
| `mma-investigate` failed | Treat as greenfield — use `(no internal anchor — fully greenfield)` sentinel. |
|
|
157
|
-
|
|
|
182
|
+
| `mma-journal-recall` failed OR returned 0 findings | Use the `(no prior learning)` sentinel on every prior-learning line and continue — the journal leg is additive, never blocking. A young project with an empty journal hits this every time; it is not an error. |
|
|
183
|
+
| All three failed | Report all errors to the user. Do NOT fabricate threads. |
|
|
184
|
+
| Both investigate and research failed | Report both errors to the user. Do NOT fabricate threads. |
|
|
158
185
|
| Investigate returned `needsCallerClarification: true` | Pause — surface the clarification need to the user. Do NOT synthesise over an unfinished investigation. |
|
|
159
186
|
| Research returned 0 usable sources | Sentinel on external lines. Add a one-line note in synthesis preamble: *"External research returned no usable sources — threads anchor on internal findings only."* |
|
|
160
187
|
| Investigate headline reads "0 citations" / "confidence unparseable" but `results[0].findings.length > 0` | Known stage-sync noise — IGNORE the headline. The leg succeeded; read `results[0].findings` directly. |
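The failure-handling rows above can be read as a small decision function. The sketch below is illustrative only: `LegStatus`, `ExploreDecision`, and `decideExploreAction` are hypothetical names, and it covers the leg-failure, zero-findings, and clarification rows, not the headline-noise row.

```typescript
type Leg = "investigate" | "research" | "recall";

// Hypothetical summary of one leg's outcome; field names are illustrative.
interface LegStatus {
  failed: boolean;
  findingsCount: number;
  needsCallerClarification?: boolean;
}

interface ExploreDecision {
  action: "synthesise" | "report_errors" | "ask_user";
  sentinels: string[];
  failedLegs: Leg[];
}

const NO_INTERNAL = "(no internal anchor \u2014 fully greenfield)"; // \u2014 is the em dash in the skill's sentinel
const NO_EXTERNAL = "(no external source found)";
const NO_PRIOR = "(no prior learning)";

function decideExploreAction(
  investigate: LegStatus,
  research: LegStatus,
  recall: LegStatus,
): ExploreDecision {
  const failedLegs: Leg[] = [];
  if (investigate.failed) failedLegs.push("investigate");
  if (research.failed) failedLegs.push("research");
  if (recall.failed) failedLegs.push("recall");

  // Investigate and research both down: report errors, never fabricate threads.
  if (investigate.failed && research.failed) {
    return { action: "report_errors", sentinels: [], failedLegs: failedLegs };
  }
  // An unfinished investigation must be resolved before synthesis.
  if (!investigate.failed && investigate.needsCallerClarification === true) {
    return { action: "ask_user", sentinels: [], failedLegs: failedLegs };
  }

  const sentinels: string[] = [];
  if (investigate.failed) sentinels.push(NO_INTERNAL);
  if (research.failed || research.findingsCount === 0) sentinels.push(NO_EXTERNAL);
  // The journal leg is additive: failure or zero findings never blocks.
  if (recall.failed || recall.findingsCount === 0) sentinels.push(NO_PRIOR);
  return { action: "synthesise", sentinels: sentinels, failedLegs: failedLegs };
}
```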
|
|
@@ -12,7 +12,7 @@ when_to_use: >-
|
|
|
12
12
|
git-history queries. OR you are about to read 3+ files / run any grep in main
|
|
13
13
|
context — that's the inline-labor-leakage anti-pattern (AP2); delegate to this
|
|
14
14
|
skill instead.
|
|
15
|
-
version: 4.
|
|
15
|
+
version: 4.8.0
|
|
16
16
|
---
|
|
17
17
|
|
|
18
18
|
# mma-investigate
|
|
@@ -212,7 +212,11 @@ About to `Read` 3+ files just to answer one question? That's the wrong tradeoff
|
|
|
212
212
|
|
|
213
213
|
## Terminal context block
|
|
214
214
|
|
|
215
|
-
Every completed task
|
|
215
|
+
Every completed **read-route** task (audit / review / debug / investigate / research) auto-registers a reusable terminal context block containing its report (headline + findings). The block id is returned on each per-task result as **`contextBlockId`**. Write routes (delegate / execute-plan / retry) return `contextBlockId: null` — their record is the commit, not a block. This block is immutable, lives for the session duration, and counts against the project's `maxEntries` quota (default 500).
|
|
216
|
+
|
|
217
|
+
Use it for delta follow-ups — feed prior results' block ids into a later call's `contextBlockIds`, filtering out nulls:
|
|
218
|
+
|
|
219
|
+
contextBlockIds: priorResults.map(r => r.contextBlockId).filter((id) => id !== null)
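In TypeScript, a plain `id !== null` callback may leave the filtered array typed as `(string | null)[]`, depending on the compiler version. A type-guard predicate makes the narrowing explicit. The following is a minimal sketch; `TaskResult`, `isBlockId`, and `collectContextBlockIds` are hypothetical names, and only the `contextBlockId` field comes from the text.

```typescript
// Hypothetical per-task result; only the field discussed above.
interface TaskResult {
  contextBlockId: string | null;
}

// Type guard: tells the compiler the surviving ids are strings.
function isBlockId(id: string | null): id is string {
  return id !== null;
}

// Read routes contribute their block id; write routes (null) are dropped.
function collectContextBlockIds(priorResults: TaskResult[]): string[] {
  return priorResults.map((r) => r.contextBlockId).filter(isBlockId);
}
```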
|
|
216
220
|
|
|
217
221
|
**Use cases:**
|
|
218
222
|
- Pass investigation results to a downstream planning step
|
|
@@ -0,0 +1,242 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: mma-journal-recall
|
|
3
|
+
description: >-
|
|
4
|
+
Use when you're about to design or attempt something and want to know what
|
|
5
|
+
THIS project already learned — ask a vague conceptual question (no tags or
|
|
6
|
+
keywords needed); a read-only worker searches the learnings graph and returns
|
|
7
|
+
the relevant prior lessons + how they relate. Fire before re-treading ground
|
|
8
|
+
that may already have been explored. NOT for recording a new learning
|
|
9
|
+
(mma-journal-record), codebase questions (mma-investigate), or external
|
|
10
|
+
research (mma-research).
|
|
11
|
+
when_to_use: >-
|
|
12
|
+
A question about THIS project's learnings, before attempting or designing
|
|
13
|
+
something — ask a vague conceptual question; skip if recording a new learning,
|
|
14
|
+
asking the codebase, or researching external docs.
|
|
15
|
+
version: 4.8.0
|
|
16
|
+
---
|
|
17
|
+
|
|
18
|
+
# mma-journal-recall
|
|
19
|
+
|
|
20
|
+
## Overview
|
|
21
|
+
|
|
22
|
+
Recall relevant project learnings from the journal via a read-only mmagent worker. The worker reads the learnings graph at `.mmagent/journal/` and synthesizes answers to vague conceptual queries.
|
|
23
|
+
|
|
24
|
+
**Core principle:** Recall is retrieval (read, traverse graph, synthesize). Delegate it. The main agent stays focused on using the results: deciding what to do with the prior lessons.
|
|
25
|
+
|
|
26
|
+
## When to Use
|
|
27
|
+
|
|
28
|
+
**Use when:**
|
|
29
|
+
- You are about to attempt something and want to ask "what have we learned about this?".
|
|
30
|
+
- The query is a conceptual question ("dispatch cancellation reliability?", "rate-limiting patterns?"), not exact tags or keywords.
|
|
31
|
+
- You want prior learnings + their relationships, not isolated chunks.
|
|
32
|
+
- The project has an active journal (started with `mma-journal-record`).
|
|
33
|
+
|
|
34
|
+
**Don't use when:**
|
|
35
|
+
- You're recording a new learning → `mma-journal-record` (write route).
|
|
36
|
+
- You're asking about the codebase structure → `mma-investigate` (read codebase).
|
|
37
|
+
- You're researching external docs/web → `mma-research` / `WebSearch`.
|
|
38
|
+
- The journal is empty or not yet initialized.
|
|
39
|
+
|
|
40
|
+
## Endpoint
|
|
41
|
+
|
|
42
|
+
`POST /journal-recall?cwd=<abs-path>`
|
|
43
|
+
|
|
44
|
+
@include _shared/auth.md
|
|
45
|
+
|
|
46
|
+
## Request body
|
|
47
|
+
|
|
48
|
+
```json
|
|
49
|
+
{
|
|
50
|
+
"query": "what have we learned about dispatch cancellation reliability?",
|
|
51
|
+
"contextBlockIds": []
|
|
52
|
+
}
|
|
53
|
+
```
|
|
54
|
+
|
|
55
|
+
| Field | Type | Required | Notes |
|
|
56
|
+
|---|---|---|---|
|
|
57
|
+
| `query` | string | yes | A vague conceptual question about prior learnings. No tags or keywords needed. |
|
|
58
|
+
| `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` — enables follow-up / delta recall |
|
|
59
|
+
| `tools` | `'none' \| 'readonly'` | no | Default `'readonly'`. `'full'` and `'no-shell'` are rejected — recall is read-only |
|
|
60
|
+
|
|
61
|
+
> Worker tier for `mma-journal-recall` is hardcoded to `complex` and is not caller-configurable. Sending `agentType` is rejected with HTTP 400.
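A caller might pre-check a request body against the field table before sending it. The sketch below is a client-side convenience under stated assumptions, not the server's validation logic: `checkRecallBody` is a hypothetical helper, and treating a blank `query` as invalid is an assumption (the table only says the field is required).

```typescript
// Returns a list of problems; an empty list means the body looks acceptable.
function checkRecallBody(body: { [key: string]: unknown }): string[] {
  const problems: string[] = [];

  const query = body["query"];
  if (typeof query !== "string" || query.trim() === "") {
    problems.push("query must be a non-empty string");
  }

  const ids = body["contextBlockIds"];
  if (ids !== undefined) {
    if (!Array.isArray(ids) || ids.some((id) => typeof id !== "string")) {
      problems.push("contextBlockIds must be an array of strings");
    }
  }

  // Recall is read-only: only 'none' and 'readonly' are accepted.
  const tools = body["tools"];
  if (tools !== undefined && tools !== "none" && tools !== "readonly") {
    problems.push("tools must be 'none' or 'readonly'");
  }

  // The worker tier is fixed; the server rejects agentType with HTTP 400.
  if ("agentType" in body) {
    problems.push("agentType is not caller-configurable");
  }
  return problems;
}
```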
|
|
62
|
+
|
|
63
|
+
**Why `query` is vague, not keyword-filtered:**
|
|
64
|
+
|
|
65
|
+
❌ `{ "query": "dispatch" }` — too narrow; it might miss "cancellation reliability" nodes that don't mention the word "dispatch" in their title.
|
|
66
|
+
✅ `{ "query": "what have we learned about dispatch cancellation reliability?" }` — the worker understands the concept and finds related nodes.
|
|
67
|
+
|
|
68
|
+
**Why:** the worker traverses the journal's typed graph (supersedes, refines, contradicts, depends-on) and synthesizes across related nodes. Semantic matching is the LLM's job, just like `mma-investigate`.
|
|
69
|
+
|
|
70
|
+
## Full example
|
|
71
|
+
|
|
72
|
+
```bash
|
|
73
|
+
BATCH=$(curl -f --show-error -s -X POST \
|
|
74
|
+
-H "X-MMA-Client: $MMA_CLIENT" \
|
|
75
|
+
-H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
|
|
76
|
+
-H "Authorization: Bearer $TOKEN" \
|
|
77
|
+
-H "Content-Type: application/json" \
|
|
78
|
+
-d '{"query":"what have we learned about dispatch cancellation reliability?"}' \
|
|
79
|
+
"http://localhost:$PORT/journal-recall?cwd=/project")
|
|
80
|
+
BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
|
|
81
|
+
```
|
|
82
|
+
|
|
83
|
+
@include _shared/polling.md
|
|
84
|
+
|
|
85
|
+
@include _shared/response-shape.md
|
|
86
|
+
|
|
87
|
+
## Per-task report shape
|
|
88
|
+
|
|
89
|
+
Each task carries an `investigation` field on its per-task report (same shape as `mma-investigate`):
|
|
90
|
+
|
|
91
|
+
```json
|
|
92
|
+
{
|
|
93
|
+
"investigation": {
|
|
94
|
+
"citations": [
|
|
95
|
+
{ "file": "nodes/0012-dispatch-cancellation-lifecycle.md", "lines": "1-50", "claim": "Cancellation handlers must check context before writing." }
|
|
96
|
+
],
|
|
97
|
+
"confidence": { "level": "high", "rationale": "Direct citations from journal nodes." },
|
|
98
|
+
"diagnostics": {
|
|
99
|
+
"malformedCitationLines": 0,
|
|
100
|
+
"missingRequiredSections": [],
|
|
101
|
+
"invalidRequiredSections": []
|
|
102
|
+
}
|
|
103
|
+
}
|
|
104
|
+
}
|
|
105
|
+
```
|
|
106
|
+
|
|
107
|
+
The authoritative success signals are `completed`, `message`, and `findings`. See "v5 wire shape" below for the full envelope.
|
|
108
|
+
|
|
109
|
+
## v5 wire shape (read route)
|
|
110
|
+
|
|
111
|
+
Every task result is a `ComposePayload` — seven main-agent fields plus a telemetry block.
|
|
112
|
+
The main-agent fields are authoritative; the telemetry block is diagnostics.
|
|
113
|
+
|
|
114
|
+
```json
|
|
115
|
+
{
|
|
116
|
+
"completed": true,
|
|
117
|
+
"message": "Recall complete; 4 relevant learnings found.",
|
|
118
|
+
"findings": [
|
|
119
|
+
{
|
|
120
|
+
"id": "F1",
|
|
121
|
+
"severity": "critical",
|
|
122
|
+
"category": "correctness",
|
|
123
|
+
"claim": "Cancellation handlers must check context before writing to avoid corruption.",
|
|
124
|
+
"evidence": "nodes/0012-dispatch-cancellation-lifecycle.md:20-35 — verbatim substring from journal node.",
|
|
125
|
+
"suggestion": null,
|
|
126
|
+
"source": "implementer"
|
|
127
|
+
}
|
|
128
|
+
],
|
|
129
|
+
"summary": "The project learned that dispatch cancellation must synchronize context reads (node 0012) and never write without checking. Related node 0008 (refines) adds that timeout-based cancellation has race conditions under high load.",
|
|
130
|
+
"filesChanged": [],
|
|
131
|
+
"commitSha": null,
|
|
132
|
+
"blockId": null,
|
|
133
|
+
"telemetry": {
|
|
134
|
+
"totalDurationMs": 1234,
|
|
135
|
+
"totalCostUSD": 0.08,
|
|
136
|
+
"workerSelfAssessment": "done",
|
|
137
|
+
"reviewVerdict": null,
|
|
138
|
+
"commitOutcome": "not_applicable",
|
|
139
|
+
"stopReason": "normal",
|
|
140
|
+
"haltedStage": null,
|
|
141
|
+
"stages": [...]
|
|
142
|
+
}
|
|
143
|
+
}
|
|
144
|
+
```
|
|
145
|
+
|
|
146
|
+
### Key fields
|
|
147
|
+
|
|
148
|
+
| Field | When populated | Notes |
|
|
149
|
+
|---|---|---|
|
|
150
|
+
| `completed` | always | `true` when at least one criterion succeeded; `false` on annotator transport failure OR unmet annotate preconditions |
|
|
151
|
+
| `message` | always | human-readable summary; names blocking gates or finding IDs on failure |
|
|
152
|
+
| `findings` | always | `source: 'implementer'` for recall; findings are the deliverable on read routes |
|
|
153
|
+
| `workerSelfAssessment` | always | `'done'` or `'failed'` — never `done_with_concerns` |
|
|
154
|
+
| `blockId` | always | `null` on write routes, a string on read routes. Recall is a read route, so `blockId` is a string: the id of a reusable context block for delta follow-up |
|
|
155
|
+
|
|
156
|
+
### No second review
|
|
157
|
+
|
|
158
|
+
The LLM-judge stage (`annotate`) runs once, after the worker's output. Its preconditions for read-route `completed: true`:
|
|
159
|
+
|
|
160
|
+
```
|
|
161
|
+
gates.implement.outcome === 'advance'
|
|
162
|
+
&& gates.implement.payload.workerSelfAssessment === 'done'
|
|
163
|
+
&& (criteriaSucceeded.length > 0 || criteriaErrors.length === 0)
|
|
164
|
+
```
|
|
165
|
+
|
|
166
|
+
Findings are the deliverable — a recall that surfaces 5 relevant lessons is `completed: true`. Finding nothing relevant is also a valid completion (returns `findings: []`).
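The precondition expression can be wrapped in a typed function to make its edge cases easier to see. This is a sketch only: the `Gates` shape is inferred from the names in the expression, and `readRouteCompleted` is a hypothetical name, not a function the package is known to export.

```typescript
// Hypothetical shapes inferred from the names used in the expression.
interface Gates {
  implement: {
    outcome: string;
    payload: { workerSelfAssessment: "done" | "failed" };
  };
}

function readRouteCompleted(
  gates: Gates,
  criteriaSucceeded: string[],
  criteriaErrors: string[],
): boolean {
  return (
    gates.implement.outcome === "advance" &&
    gates.implement.payload.workerSelfAssessment === "done" &&
    // Either something succeeded, or nothing errored (an empty, clean run).
    (criteriaSucceeded.length > 0 || criteriaErrors.length === 0)
  );
}
```

Note the last clause: a run with no successes and no errors still counts as complete, which matches "finding nothing relevant is also a valid completion".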
|
|
167
|
+
|
|
168
|
+
### `completed: false` — what it means
|
|
169
|
+
|
|
170
|
+
`completed: false` occurs only on annotator transport failure, or when the journal is inaccessible or corrupted. The `message` names the blocking gate. Separately, if a completed recall's findings were too narrow, re-dispatch with a broader `query`.
|
|
171
|
+
|
|
172
|
+
## Best practices
|
|
173
|
+
|
|
174
|
+
This skill is one step in a larger flow described in `multi-model-agent` → "Best practices". Recipes that involve `mma-journal-recall`:
|
|
175
|
+
|
|
176
|
+
- **Recipe A — Recall before attempting.** Call `mma-journal-recall` with your question before running `mma-delegate` / `mma-execute-plan` to avoid re-treading prior dead ends.
|
|
177
|
+
- **Recipe B — Recall → plan → execute.** `mma-journal-recall` → write a plan based on the learnings → `mma-execute-plan`.
|
|
178
|
+
- **Recipe C — Delta follow-up recall.** Feed a prior recall's `contextBlockId` into a follow-up call to dig deeper: `contextBlockIds: [priorResult.contextBlockId]`.
|
|
179
|
+
|
|
180
|
+
Anti-pattern alert: **Misusing recall as codebase search.** Recall is for the *project's learnings graph*, not the codebase. If you want to search code → `mma-investigate`. If you want to ask the journal → `mma-journal-recall`.
|
|
181
|
+
|
|
182
|
+
## Common pitfalls
|
|
183
|
+
|
|
184
|
+
❌ **Using exact tags instead of a conceptual question**
|
|
185
|
+
> query: "dispatch cancellation"
|
|
186
|
+
|
|
187
|
+
The worker expects a sentence with context, not keywords. **Fix:** phrase it as a question:
|
|
188
|
+
> query: "what have we learned about dispatch cancellation and how it interacts with timeouts?"
|
|
189
|
+
|
|
190
|
+
❌ **Asking about the codebase instead of the journal**
|
|
191
|
+
> query: "where is DispatchCanceller called?"
|
|
192
|
+
|
|
193
|
+
That's a codebase question. Use `mma-investigate` instead. Journal recall is for *learnings* stored in `.mmagent/journal/`, not code.
|
|
194
|
+
|
|
195
|
+
❌ **Assuming the journal exists**
|
|
196
|
+
> query: "what do we know about X?"
|
|
197
|
+
|
|
198
|
+
If the project hasn't used `mma-journal-record`, the journal is empty. The worker will return `not_applicable`. **Fix:** check whether the journal is active in the project first, or start recording learnings with `mma-journal-record`.
|
|
199
|
+
|
|
200
|
+
## Terminal context block
|
|
201
|
+
|
|
202
|
+
Every completed **read-route** task (audit / review / debug / investigate / recall / research) auto-registers a reusable terminal context block containing its report (headline + findings). The block id is returned on each per-task result as **`contextBlockId`**. Write routes (delegate / execute-plan / retry / journal-record) return `contextBlockId: null` — their record is the commit, not a block. This block is immutable, lives for the session duration, and counts against the project's `maxEntries` quota (default 500).
|
|
203
|
+
|
|
204
|
+
Use it for delta follow-ups — feed prior results' block ids into a later call's `contextBlockIds`, filtering out nulls:
|
|
205
|
+
|
|
206
|
+
contextBlockIds: priorResults.map(r => r.contextBlockId).filter((id) => id !== null)
|
|
207
|
+
|
|
208
|
+
**Use cases:**
|
|
209
|
+
- Recall round 2: pass round 1's block into round 2's `contextBlockIds` to dig deeper on a specific thread.
|
|
210
|
+
- Recall → plan → execute chain: feed recall findings as a context block into `mma-execute-plan` as shared prior context.
|
|
211
|
+
- Multi-agent follow-up: capture a recall's block and hand it to another tool chain.
|
|
212
|
+
|
|
213
|
+
The block is registered server-side at task completion; no caller action is needed to create it. Delete it explicitly via `DELETE /context-blocks/:id` when no longer needed, or let it expire on session teardown.
|
|
214
|
+
|
|
215
|
+
## Outcome semantics
|
|
216
|
+
|
|
217
|
+
Every task result carries outcome fields that describe the recall's conclusion status:
|
|
218
|
+
|
|
219
|
+
| Field | Type | Meaning |
|
|
220
|
+
|---|---|---|
|
|
221
|
+
| `findingsOutcome` | `'found' \| 'not_applicable'` | Answers the question: did the recall produce substantive learnings? |
|
|
222
|
+
| `findingsOutcomeReason` | `string \| null` | When `findingsOutcome` is set, this explains why (e.g. "No relevant journal nodes found for the query" or "Journal is empty"). |
|
|
223
|
+
| `outcomeInferred` | `boolean` | `true` if the system inferred the outcome from findings count; `false` if the worker explicitly stated it. |
|
|
224
|
+
| `outcomeMalformed` | `boolean` | `true` if the outcome line was malformed and had to be repaired; `false` otherwise. |
|
|
225
|
+
|
|
226
|
+
### Enum values
|
|
227
|
+
|
|
228
|
+
- **`found`** — the recall produced one or more relevant prior learnings (findings) across one or more journal nodes.
|
|
229
|
+
- **`not_applicable`** — the recall could not proceed (the journal is empty, inaccessible, or nothing in it answers the query).
|
|
230
|
+
|
|
231
|
+
### Empty journal ≠ failure
|
|
232
|
+
|
|
233
|
+
A recall that searches the journal and finds nothing relevant is a valid `completed: true` outcome; it simply answers "no prior learnings match that question" — which is useful information before attempting something new.
|
|
234
|
+
|
|
235
|
+
### Per-route legal outcomes
|
|
236
|
+
|
|
237
|
+
The legal outcomes for this route are: `['found', 'not_applicable']`
|
|
238
|
+
|
|
239
|
+
- **`found`** — one or more prior learnings surfaced from the journal.
|
|
240
|
+
- **`not_applicable`** — the journal is empty, inaccessible, or no learnings match the query.
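The outcome fields lend themselves to a discriminated type on the caller side. The following is a minimal sketch assuming the four fields arrive exactly as tabulated; `OutcomeFields` and `describeOutcome` are hypothetical names, and the wording of the returned strings is illustrative.

```typescript
type FindingsOutcome = "found" | "not_applicable";

// Field names and types follow the outcome table; the interface name is illustrative.
interface OutcomeFields {
  findingsOutcome: FindingsOutcome;
  findingsOutcomeReason: string | null;
  outcomeInferred: boolean;
  outcomeMalformed: boolean;
}

// Turns the outcome fields into a one-line note for the synthesis preamble.
function describeOutcome(o: OutcomeFields): string {
  const caveats: string[] = [];
  if (o.outcomeInferred) caveats.push("inferred from findings count");
  if (o.outcomeMalformed) caveats.push("outcome line was repaired");
  const suffix = caveats.length > 0 ? ` (${caveats.join("; ")})` : "";
  switch (o.findingsOutcome) {
    case "found":
      return `prior learnings found${suffix}`;
    case "not_applicable":
      return `${o.findingsOutcomeReason ?? "no prior learnings"}${suffix}`;
  }
}
```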
|
|
241
|
+
|
|
242
|
+
@include _shared/error-handling.md
|
|
@@ -0,0 +1,189 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: mma-journal-record
|
|
3
|
+
description: >-
|
|
4
|
+
Use when you've abandoned an approach, hit a constraint, or concluded
|
|
5
|
+
something worth remembering — record it to the persistent journal as a
|
|
6
|
+
fire-and-forget decision audit trail for future sessions.
|
|
7
|
+
when_to_use: >-
|
|
8
|
+
You've completed analysis and want to log the outcome — abandoned an approach,
|
|
9
|
+
hit a blocking constraint, or reached a conclusion worth remembering. NOT for
|
|
10
|
+
recall/investigate/delegate; those are separate skills. The journal stores conclusions
|
|
11
|
+
for cross-session reference.
|
|
12
|
+
version: 4.8.0
|
|
13
|
+
---
|
|
14
|
+
|
|
15
|
+
# mma-journal-record
|
|
16
|
+
|
|
17
|
+
## Overview
|
|
18
|
+
|
|
19
|
+
Record a learning, constraint, or decision outcome to the persistent journal via a fire-and-forget mmagent worker. The worker stores the entry and returns immediately; you continue on your main context.
|
|
20
|
+
|
|
21
|
+
**Core principle:** Journal is an audit trail of what you've decided, discovered, or abandoned. Record it once per session; don't re-investigate.
|
|
22
|
+
|
|
23
|
+
## When to Use
|
|
24
|
+
|
|
25
|
+
**Use when:**
|
|
26
|
+
- You've abandoned an approach and want to log why
|
|
27
|
+
- You've hit a blocking constraint worth remembering
|
|
28
|
+
- You've reached a conclusion (e.g., "Pattern X doesn't work in this codebase")
|
|
29
|
+
- You've decided not to pursue a direction and want to avoid repeating that decision next session
|
|
30
|
+
|
|
31
|
+
**Don't use when:**
|
|
32
|
+
- You're asking a question → `mma-investigate`
|
|
33
|
+
- You're dispatching work → `mma-delegate`
|
|
34
|
+
- You want to retrieve past entries → `mma-journal-recall` (read route), or read the `.mmagent/journal/` files directly
|
|
35
|
+
- You're mid-task and want to pause → that's what `blockedBy` is for; journal is for conclusions, not temporary blockers
|
|
36
|
+
|
|
37
|
+
## Endpoint
|
|
38
|
+
|
|
39
|
+
`POST /journal-record?cwd=<abs-path>`
|
|
40
|
+
|
|
41
|
+
@include _shared/auth.md
|
|
42
|
+
|
|
43
|
+
## Request body
|
|
44
|
+
|
|
45
|
+
```json
|
|
46
|
+
{
|
|
47
|
+
"learning": "Tried worker self-report for grouped-dispatch cancellation; dropped it — git diff is the source of truth. Lesson: use getRealFilesChanged.",
|
|
48
|
+
"tagHints": ["dispatch", "cancellation"]
|
|
49
|
+
}
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
| Field | Type | Required | Notes |
|
|
53
|
+
|---|---|---|---|
|
|
54
|
+
| `learning` | string | yes | Natural-language entry: what you decided, why, or what you learned. Keep it concrete. |
|
|
55
|
+
| `tagHints` | string[] | no | Optional tags for later cross-reference (e.g. `["perf", "refactor"]`). Tags are advisory; the journal system may group or index them. |
|
|
56
|
+
|
|
57
|
+
**What gets stored & where:**
|
|
58
|
+
|
|
59
|
+
Entries are integrated into a graph-structured journal store at `.mmagent/journal/`:
|
|
60
|
+
- `nodes/` — individual learning entries (keyed by unique node ID)
|
|
61
|
+
- `index.md` — searchable index of all entries, tags, and cross-references
|
|
62
|
+
- `log.md` — append-only event log of create/refine/supersede/merge operations
|
|
63
|
+
|
|
64
|
+
The worker creates, refines, or supersedes nodes in the graph (never appends blindly). You can query the index or log directly to track learning history. Writes are confined to the project's `.mmagent/` directory (no traversal).
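Because the store is plain files, the layout above can be inspected with ordinary shell tools. Below is a small sketch that assumes only the three paths listed. The helper name `journal_overview` is hypothetical, and the file contents are not interpreted.

```bash
# journal_overview DIR: count node files and show the most recent log lines.
journal_overview() {
  local dir="${1:-.mmagent/journal}"
  if [ ! -d "$dir" ]; then
    echo "no journal at $dir" >&2
    return 1
  fi
  # Count regular files under nodes/ (0 if the directory is missing).
  local count
  count=$(find "$dir/nodes" -type f 2>/dev/null | wc -l | tr -d ' ')
  echo "nodes: $count"
  # log.md is append-only, so its tail holds the latest operations.
  if [ -f "$dir/log.md" ]; then
    tail -n 5 "$dir/log.md"
  fi
}
```

Run from the project root, `journal_overview` with no argument reads `.mmagent/journal`.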
|
|
65
|
+
|
|
66
|
+
## Full example
|
|
67
|
+
|
|
68
|
+
```bash
|
|
69
|
+
BATCH=$(curl -f --show-error -s -X POST \
|
|
70
|
+
-H "X-MMA-Client: $MMA_CLIENT" \
|
|
71
|
+
-H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
|
|
72
|
+
-H "Authorization: Bearer $TOKEN" \
|
|
73
|
+
-H "Content-Type: application/json" \
|
|
74
|
+
-d '{
|
|
75
|
+
"learning": "Tried worker self-report for grouped-dispatch cancellation; dropped it — git diff is the source of truth. Lesson: use getRealFilesChanged.",
|
|
76
|
+
"tagHints": ["dispatch", "cancellation"]
|
|
77
|
+
}' \
|
|
78
|
+
"http://localhost:$PORT/journal-record?cwd=/project")
|
|
79
|
+
BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
|
|
80
|
+
```
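`jq -r '.batchId'` prints the literal string `null` when the field is absent, and `curl -f` leaves `$BATCH` empty on an HTTP error, so it may be worth checking the ID before polling. Below is a hedged sketch of such a check. `require_batch_id` is a hypothetical helper, not part of the package.

```bash
# require_batch_id JSON: print the batch ID, or fail if it is missing.
require_batch_id() {
  local id
  # `// empty` turns a missing or null batchId into no output at all.
  id=$(printf '%s' "$1" | jq -r '.batchId // empty' 2>/dev/null)
  if [ -z "$id" ]; then
    echo "no batchId in response" >&2
    return 1
  fi
  printf '%s\n' "$id"
}
```

Usage after the request above: `BATCH_ID=$(require_batch_id "$BATCH") || exit 1`.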
|
|
81
|
+
|
|
82
|
+
@include _shared/polling.md
|
|
83
|
+
|
|
84
|
+
@include _shared/response-shape.md
|
|
85
|
+
|
|
86
|
+
## Per-task report shape
|
|
87
|
+
|
|
88
|
+
Each task carries a structured report containing the graph operation metadata:
|
|
89
|
+
|
|
90
|
+
```json
|
|
91
|
+
{
|
|
92
|
+
"summary": "created 0012; superseded 0009",
|
|
93
|
+
"filesChanged": [".mmagent/journal/nodes/0012.md", ".mmagent/journal/index.md", ".mmagent/journal/log.md"],
|
|
94
|
+
"op": "create"
|
|
95
|
+
}
|
|
96
|
+
```
|
|
97
|
+
|
|
98
|
+
The authoritative success signal is `completed: true` together with a populated `filesChanged`. See "v5 wire shape" below for the full envelope.
|
|
99
|
+
|
|
100
|
+
## v5 wire shape (reviewed write route)
|
|
101
|
+
|
|
102
|
+
Every task result is a `ComposePayload` — seven main-agent fields plus a telemetry block.
|
|
103
|
+
The main-agent fields are authoritative; the telemetry block is diagnostics.
|
|
104
|
+
|
|
105
|
+
```json
|
|
106
|
+
{
|
|
107
|
+
"completed": true,
|
|
108
|
+
"message": "Journal entry created (node 0012); superseded prior learning (node 0009)",
|
|
109
|
+
"findings": [],
|
|
110
|
+
"summary": "created 0012; superseded 0009",
|
|
111
|
+
"filesChanged": [".mmagent/journal/nodes/0012.md", ".mmagent/journal/index.md", ".mmagent/journal/log.md"],
|
|
112
|
+
"commitSha": null,
|
|
113
|
+
"blockId": null,
|
|
114
|
+
"telemetry": {
|
|
115
|
+
"totalDurationMs": 5400,
|
|
116
|
+
"totalCostUSD": 0.04,
|
|
117
|
+
"workerSelfAssessment": "done",
|
|
118
|
+
"reviewVerdict": "approved",
|
|
119
|
+
"commitOutcome": "not_applicable",
|
|
120
|
+
"stopReason": "normal",
|
|
121
|
+
"haltedStage": null,
|
|
122
|
+
"stages": [
|
|
123
|
+
{ "name": "prepare", "outcome": "advance", "durationMs": 2, "costUSD": 0 },
|
|
124
|
+
{ "name": "register-block", "outcome": "skip", "comment": "register-block does not apply to route=journal", "durationMs": 0, "costUSD": 0 },
|
|
125
|
+
{ "name": "implement", "outcome": "advance", "durationMs": 3200, "costUSD": 0.02 },
|
|
126
|
+
{ "name": "review", "outcome": "advance", "durationMs": 1800, "costUSD": 0.01 },
|
|
127
|
+
{ "name": "rework", "outcome": "skip", "comment": "rework skipped because review approved", "durationMs": 0, "costUSD": 0 },
|
|
128
|
+
{ "name": "commit", "outcome": "skip", "comment": "commit does not apply to non-git routes", "durationMs": 0, "costUSD": 0 },
|
|
129
|
+
{ "name": "annotate", "outcome": "advance", "durationMs": 340, "costUSD": 0.01 },
|
|
130
|
+
{ "name": "compose", "outcome": "advance", "durationMs": 56, "costUSD": 0 },
|
|
131
|
+
{ "name": "terminal", "outcome": "advance", "durationMs": 2, "costUSD": 0 }
|
|
132
|
+
]
|
|
133
|
+
}
|
|
134
|
+
}
|
|
135
|
+
```
|
|
136
|
+
|
|
137
|
+
### Key fields
|
|
138
|
+
|
|
139
|
+
| Field | When populated | Notes |
|
|
140
|
+
|---|---|---|
|
|
141
|
+
| `completed` | always | `true` when entry is created/refined/superseded and approved; `false` on review rejection, path traversal, or write failure |
|
|
142
|
+
| `message` | always | human-readable summary (e.g., "Journal entry created (node 0012); superseded prior learning (node 0009)"); read it on failure for diagnostics |
|
|
143
|
+
| `findings` | always | issues surfaced by the reviewer (e.g., unclear learning, duplicate with 0009). Empty if approved as-is. |
|
|
144
|
+
| `filesChanged` | always | graph journal paths modified: `nodes/`, `index.md`, `log.md` (relative to `cwd`) |
|
|
145
|
+
| `workerSelfAssessment` | always (in `telemetry`) | `'done'` or `'failed'`: the worker's own assessment of completeness |
|
|
146
|
+
| `blockId` | always `null` | journal is a task route, not register-context-block |
|
|
147
|
+
| `commitSha` | always `null` | journal entries are graph mutations, not git commits |
|
|
148
|
+
| `reviewVerdict` | via telemetry | `'approved'` \| `'rejected_with_rework'` \| `'rejected'` — reviewer's verdict on the learned entry |
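For callers, the table reduces to one check: `completed` is `true` and `filesChanged` lists the journal paths. Below is a sketch of that check with `jq`, assuming the task result has already been fetched into a shell variable. `journal_ok` is a hypothetical helper name, treating an empty `filesChanged` as failure is an interpretation, and the sample payload is trimmed from the example above.

```bash
# journal_ok JSON: succeed only if the task completed and changed files.
journal_ok() {
  # -e makes jq's exit status follow the truthiness of the last output.
  printf '%s' "$1" | jq -e \
    '.completed == true and ((.filesChanged // []) | length > 0)' >/dev/null
}

# Sample result, trimmed to the fields the check reads.
RESULT='{"completed": true, "filesChanged": [".mmagent/journal/nodes/0012.md"]}'

if journal_ok "$RESULT"; then
  echo "journal entry recorded"
else
  echo "journal entry not recorded" >&2
fi
```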
|
|
149
|
+
|
|
150
|
+
### Reviewed write lifecycle
|
|
151
|
+
|
|
152
|
+
Unlike read routes (audit/investigate/debug), journal runs a full review cycle: **implement** → **review** → [optional **rework**] → **commit** (skipped for non-git routes) → **annotate**. If the reviewer finds issues (e.g., the learning is ambiguous, the node supersedes multiple prior entries), a rework round applies targeted edits before finalization.
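To see which lifecycle stages actually ran for a given task, the `telemetry.stages` array can be filtered on `outcome`. Below is a sketch using a shortened copy of the example payload above. It assumes the stage objects keep the `name` and `outcome` keys shown there.

```bash
# Shortened telemetry block from the example payload.
RESULT='{"telemetry": {"stages": [
  {"name": "implement", "outcome": "advance", "durationMs": 3200},
  {"name": "review",    "outcome": "advance", "durationMs": 1800},
  {"name": "rework",    "outcome": "skip",    "durationMs": 0},
  {"name": "commit",    "outcome": "skip",    "durationMs": 0},
  {"name": "annotate",  "outcome": "advance", "durationMs": 340}
]}}'

# Names of stages that ran, in order, joined on one line.
RAN=$(printf '%s' "$RESULT" | jq -r \
  '[.telemetry.stages[] | select(.outcome != "skip") | .name] | join(" -> ")')

# Names of stages that were skipped.
SKIPPED=$(printf '%s' "$RESULT" | jq -r \
  '[.telemetry.stages[] | select(.outcome == "skip") | .name] | join(", ")')

echo "ran: $RAN"          # ran: implement -> review -> annotate
echo "skipped: $SKIPPED"  # skipped: rework, commit
```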
|
|
153
|
+
|
|
154
|
+
### `completed: false` — what it means
|
|
155
|
+
|
|
156
|
+
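The reviewer rejected the entry, path traversal was detected, write permission was denied, or directory creation failed. The `message` names the blocking issue; on a review rejection, `findings` lists what the reviewer flagged.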
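On failure, the useful details are in `message` and, for review rejections, `findings`. Below is a sketch of surfacing both. The sample payload is invented for illustration, and because this section does not specify the shape of a finding, each one is printed as raw JSON.

```bash
# Invented failure payload; real messages and findings will differ.
RESULT='{"completed": false,
  "message": "write permission denied for .mmagent/journal",
  "findings": []}'

if [ "$(printf '%s' "$RESULT" | jq -r '.completed')" != "true" ]; then
  # The message names the blocking issue.
  echo "journal-record failed: $(printf '%s' "$RESULT" | jq -r '.message')" >&2
  # Print each finding as compact JSON, one per line (none in this sample).
  printf '%s' "$RESULT" | jq -c '.findings[]?' >&2
  STATUS=failed
else
  STATUS=ok
fi
```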
|
|
157
|
+
|
|
158
|
+
## Best practices
|
|
159
|
+
|
|
160
|
+
**One entry per decision, not per turn.**
|
|
161
|
+
Log once when you decide not to pursue a direction; don't log "just checked X" on every iteration.
|
|
162
|
+
|
|
163
|
+
**Keep entries concrete.**
|
|
164
|
+
❌ "Didn't work"
|
|
165
|
+
✅ "Tried multicast-style dispatch with worker dedup; git diff is the source of truth, workers can't track cancellations atomically. Use getRealFilesChanged instead."
|
|
166
|
+
|
|
167
|
+
**Use tags to build searchable structure.**
|
|
168
|
+
```bash
|
|
169
|
+
# Later, grep your journal for all perf decisions:
|
|
170
|
+
grep -rin "perf" .mmagent/journal/
|
|
171
|
+
```
|
|
172
|
+
|
|
173
|
+
## Common pitfalls
|
|
174
|
+
|
|
175
|
+
❌ **Using journal as a scratchpad**
|
|
176
|
+
> "Thinking about X. Maybe Y? Need to check Z."
|
|
177
|
+
|
|
178
|
+
Journal is for **conclusions**, not work-in-progress. Keep notes in a separate working file if you need to brainstorm.
|
|
179
|
+
|
|
180
|
+
❌ **Logging without context**
|
|
181
|
+
> "Doesn't work."
|
|
182
|
+
|
|
183
|
+
Future-you (or a teammate) won't remember what "doesn't work" means. Always include the decision frame: what did you try, why did you try it, what was the outcome, and what will you do instead?
|
|
184
|
+
|
|
185
|
+
## Context blocks
|
|
186
|
+
|
|
187
|
+
Write-route tasks (delegate / execute-plan / journal / retry) do **not** register terminal context blocks. Their artifact is the filesystem mutation (git commit for delegate; graph mutations for journal). Read-route tasks (audit / review / debug / investigate / research) auto-register blocks containing their findings.
|
|
188
|
+
|
|
189
|
+
@include _shared/error-handling.md
|