@zhixuan92/multi-model-agent 4.7.19 → 4.8.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (37)
  1. package/README.md +10 -7
  2. package/dist/http/handlers/control/batch.d.ts +2 -0
  3. package/dist/http/handlers/control/batch.d.ts.map +1 -1
  4. package/dist/http/handlers/control/batch.js +17 -1
  5. package/dist/http/handlers/control/batch.js.map +1 -1
  6. package/dist/http/handlers/tools/journal-recall.d.ts +4 -0
  7. package/dist/http/handlers/tools/journal-recall.d.ts.map +1 -0
  8. package/dist/http/handlers/tools/journal-recall.js +40 -0
  9. package/dist/http/handlers/tools/journal-recall.js.map +1 -0
  10. package/dist/http/handlers/tools/journal-record.d.ts +4 -0
  11. package/dist/http/handlers/tools/journal-record.d.ts.map +1 -0
  12. package/dist/http/handlers/tools/journal-record.js +35 -0
  13. package/dist/http/handlers/tools/journal-record.js.map +1 -0
  14. package/dist/http/handlers/tools/research.d.ts.map +1 -1
  15. package/dist/http/handlers/tools/research.js +0 -1
  16. package/dist/http/handlers/tools/research.js.map +1 -1
  17. package/dist/http/server.d.ts.map +1 -1
  18. package/dist/http/server.js +6 -2
  19. package/dist/http/server.js.map +1 -1
  20. package/dist/skill-install/discover.d.ts +1 -1
  21. package/dist/skill-install/discover.d.ts.map +1 -1
  22. package/dist/skill-install/discover.js +2 -0
  23. package/dist/skill-install/discover.js.map +1 -1
  24. package/dist/skills/mma-audit/SKILL.md +6 -2
  25. package/dist/skills/mma-context-blocks/SKILL.md +1 -1
  26. package/dist/skills/mma-debug/SKILL.md +6 -2
  27. package/dist/skills/mma-delegate/SKILL.md +3 -9
  28. package/dist/skills/mma-execute-plan/SKILL.md +3 -9
  29. package/dist/skills/mma-explore/SKILL.md +54 -27
  30. package/dist/skills/mma-investigate/SKILL.md +6 -2
  31. package/dist/skills/mma-journal-recall/SKILL.md +242 -0
  32. package/dist/skills/mma-journal-record/SKILL.md +189 -0
  33. package/dist/skills/mma-research/SKILL.md +14 -5
  34. package/dist/skills/mma-retry/SKILL.md +4 -4
  35. package/dist/skills/mma-review/SKILL.md +6 -2
  36. package/dist/skills/multi-model-agent/SKILL.md +7 -3
  37. package/package.json +2 -2
@@ -3,31 +3,36 @@ name: mma-explore
3
3
  description: >-
4
4
  Use when about to brainstorm or plan and need a divergent landscape scan —
5
5
  orchestrates parallel internal-codebase investigation + external multi-source
6
- research, then synthesises 3–5 distinct directions. Not for "where is X"
7
- single-answer questions (use mma-investigate).
6
+ research + prior-learnings recall from the project journal, then synthesises
7
+ 3–5 distinct directions. Not for "where is X" single-answer questions (use
8
+ mma-investigate).
8
9
  when_to_use: >-
9
10
  You are about to brainstorm or plan and need a broad landscape scan before
10
11
  narrowing. The question is exploratory ("what are our options", "what
11
12
  approaches exist", "survey how others handle"). The skill instructs you to fan
12
- out mma-investigate (internal) + mma-research (external) in parallel and
13
- synthesise the results yourself. DO NOT use for convergent single-answer
14
- questions — those are mma-investigate.
15
- version: 4.7.19
13
+ out mma-investigate (internal), mma-research (external), and
14
+ mma-journal-recall (prior learnings/decisions) in parallel and synthesise the
15
+ results yourself. DO NOT use for convergent single-answer questions — those
16
+ are mma-investigate.
17
+ version: 4.8.0
16
18
  ---
17
19
 
18
20
  # mma-explore
19
21
 
20
22
  ## Overview
21
23
 
22
- Codebase + external sources, synthesised into 3–5 distinct directions. Two
23
- delegated calls (`mma-investigate` for the internal codebase, `mma-research`
24
- for external sources) run in parallel; **you** synthesise their results into
25
- the final output.
24
+ Codebase + external sources + prior learnings, synthesised into 3–5 distinct
25
+ directions. Three delegated calls run in parallel: `mma-investigate` (internal
26
+ codebase), `mma-research` (external sources), and `mma-journal-recall` (what
27
+ this project already learned/decided, from the `.mmagent/journal/` graph) —
28
+ and **you** synthesise their results into the final output.
26
29
 
27
30
  **Core principle:** Exploration is divergent (survey, enumerate, compare).
28
- Synthesis turns raw threads into ranked, citable directions. The internal and
29
- external research is delegated; the synthesis is your judgment work and stays
30
- in main context.
31
+ Synthesis turns raw threads into ranked, citable directions. The three legs
32
+ are delegated; the synthesis is your judgment work and stays in main context.
33
+ The journal leg is what keeps you from re-proposing a direction the project
34
+ already tried and dropped — it grounds the scan in your own history, not just
35
+ the code and the outside world.
31
36
 
32
37
  ## When to Use
33
38
 
@@ -56,41 +61,50 @@ digraph when_to_use {
56
61
 
57
62
  ## How to run
58
63
 
59
- Dispatch BOTH in ONE message (parallel tool use):
64
+ Dispatch ALL THREE in ONE message (parallel tool use):
60
65
 
61
66
  1. `mma-investigate` — internal codebase research
62
67
  - You MAY skip this only if the question is unambiguously greenfield (no
63
68
  codebase touch-points exist). When in doubt, run it.
64
69
  2. `mma-research` — external multi-source research
70
+ 3. `mma-journal-recall` — prior learnings/decisions from the project journal
71
+ - Always run it. If the project has no journal yet (or nothing relevant),
72
+ it returns zero findings — a valid result you handle with the
73
+ `(no prior learning)` sentinel. Never skip it to "save a call": a
74
+ superseded prior decision is exactly the signal you most want before
75
+ brainstorming.
65
76
 
66
- Wait for both to return. Do NOT proceed to synthesis until you have both
67
- results (or have decided to skip investigate).
77
+ Wait for all legs to return. Do NOT proceed to synthesis until you have every
78
+ result (or have decided to skip investigate as greenfield).
68
79
 
69
80
  ## Endpoint
70
81
 
71
82
  This is a main-agent skill — there is no dedicated `/explore` HTTP endpoint.
72
- Behind the scenes, you dispatch the two delegated tools `mma-investigate`
73
- (`POST /investigate`) and `mma-research` (`POST /research`) yourself.
83
+ Behind the scenes, you dispatch the three delegated tools `mma-investigate`
84
+ (`POST /investigate`), `mma-research` (`POST /research`), and
85
+ `mma-journal-recall` (`POST /journal-recall`) yourself.
74
86
 
75
87
  ## Request body
76
88
 
77
- (Not applicable — this skill orchestrates two other skills.) See
78
- [`mma-investigate`](../mma-investigate/SKILL.md) and
79
- [`mma-research`](../mma-research/SKILL.md) for their request bodies.
89
+ (Not applicable — this skill orchestrates three other skills.) See
90
+ [`mma-investigate`](../mma-investigate/SKILL.md),
91
+ [`mma-research`](../mma-research/SKILL.md), and
92
+ [`mma-journal-recall`](../mma-journal-recall/SKILL.md) for their request bodies.
80
93
 
81
94
  ## Full example
82
95
 
83
- The main agent (you) issues a single message with two parallel tool calls:
96
+ The main agent (you) issues a single message with three parallel tool calls:
84
97
 
85
98
  ```
86
99
  [parallel tool use]
87
- mma-investigate { question: "How does our streaming JSON parser handle backpressure?", filePaths: ["src/parsers/"] }
88
- mma-research { researchQuestion: "State-of-the-art streaming JSON parsers with backpressure?", background: "We use a single-pass push parser." }
100
+ mma-investigate { question: "How does our streaming JSON parser handle backpressure?", filePaths: ["src/parsers/"] }
101
+ mma-research { researchQuestion: "State-of-the-art streaming JSON parsers with backpressure?", background: "We use a single-pass push parser." }
102
+ mma-journal-recall { query: "what have we learned about streaming-parser backpressure or buffering tradeoffs?" }
89
103
  ```
90
104
 
91
105
  ## Reading the leg results
92
106
 
93
- Both `mma-investigate` and `mma-research` return the v5 wire envelope (see `mma-investigate/SKILL.md` → "v5 wire shape"). Each sub-task result is a `ComposePayload` with the standard seven fields. The authoritative citation source is **`results[0].findings`** — an array of `{ id, severity, category, claim, evidence, suggestion, source }`.
107
+ All three legs (`mma-investigate`, `mma-research`, `mma-journal-recall`) return the v5 wire envelope (see `mma-investigate/SKILL.md` → "v5 wire shape"). Each sub-task result is a `ComposePayload` with the standard seven fields. The authoritative citation source is **`results[0].findings`** — an array of `{ id, severity, category, claim, evidence, suggestion, source }`.
94
108
 
95
109
  Explore top-level orchestration aggregates sub-task results into a valid `ImplementPayload` (read-route shape) before the final `annotate` stage runs. Each sub-task follows the same v5 wire shape; the top-level result is a composition of those sub-tasks.
96
110
 
@@ -99,6 +113,7 @@ Explore top-level orchestration aggregates sub-task results into a valid `Implem
99
113
  | Did the leg succeed? | `results[0].completed === true` — findings may be zero on a read route; finding nothing wrong is a valid completion |
100
114
  | Internal citation source | `results[0].findings[i].claim` plus a `file:LINE` token from `results[0].findings[i].evidence` (workers style them as `` `path:LINE` `` markdown-linked refs) |
101
115
  | External citation source | `results[0].findings[i].claim` plus a source name / URL from `results[0].findings[i].evidence` |
116
+ | Prior-learning source | `results[0].findings[i].claim` plus a journal node id from `results[0].findings[i].evidence` (recall cites `` `.mmagent/journal/nodes/NNNN-…` `` or `node NNNN`). Watch the node's status: a **superseded** learning is a "we tried this and moved on" signal — surface it, don't bury it |
102
117
  | Divergence axis | `results[0].findings[i].category` groups findings by criterion — pick across categories so threads don't collapse onto one axis |
103
118
 
104
119
  Apply a sentinel only when `findings` is empty AND `results[0].message` contains no finding-level content — i.e., the worker genuinely returned nothing. Do NOT apply a sentinel just because `results[0].message` reads tersely or `results[0].telemetry.workerSelfAssessment === 'failed'` — a worker can say `'failed'` with usable partial findings.
@@ -116,11 +131,21 @@ Produce **3–5 threads**. Each thread MUST have:
116
131
  - One **external citation** (from research) — `<source> — claim`.
117
132
  - Pick from `results[0].findings`: take `claim` as the citation claim and pull a source name / URL out of `evidence`.
118
133
  - Use the sentinel `(no external source found)` only when `results[0].findings` is empty for the research leg.
134
+ - One **prior-learning citation** (from journal-recall) WHEN a relevant node exists — `(journal) node NNNN — claim`.
135
+ - Pick from the recall leg's `results[0].findings`: take `claim` as the citation and pull the node id out of `evidence`.
136
+ - If the cited node is **superseded**, say so inline (e.g. `(journal) node 0012 [superseded by 0013] — …`) so the thread carries the "we already moved past this" signal.
137
+ - Use the sentinel `(no prior learning)` when the recall leg returned no relevant node — most threads on a young project will use this, and that's fine.
119
138
  - A **one-line divergence reason** — what makes this thread different from
120
139
  the others. No two threads may share the same divergence axis.
121
140
 
141
+ If the recall leg surfaced a learning that **invalidates** a direction (a
142
+ superseded or dropped decision that maps onto a thread you'd otherwise
143
+ propose), do not silently omit it — keep the thread but mark it
144
+ `⚠ already explored — see (journal) node NNNN` and weight it down in the
145
+ recommendation. Prior learnings prune the search; they don't just decorate it.
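The prior-learning citation rules above can be sketched as a small formatter. This is an illustration only, not code from the package: `PriorLearning`, `priorLearningLine`, and `alreadyExploredMarker` are hypothetical names, and the node ids are assumed to have been pulled out of the recall leg's `evidence` strings already.

```typescript
// Hypothetical helpers; only the output strings follow the skill text.
const DASH = "\u2014"; // separator character used in the skill's citation format
const WARN = "\u26A0"; // warning sign used in the "already explored" marker

interface PriorLearning {
  nodeId: string;        // e.g. "0012", taken from a finding's `evidence`
  claim: string;         // the finding's `claim`
  supersededBy?: string; // set when the cited node has been superseded
}

// Builds the prior-learning line for one thread, or the sentinel when the
// recall leg returned no relevant node.
function priorLearningLine(p: PriorLearning | null): string {
  if (p === null) return "(no prior learning)";
  const status = p.supersededBy ? ` [superseded by ${p.supersededBy}]` : "";
  return `(journal) node ${p.nodeId}${status} ${DASH} ${p.claim}`;
}

// Marker for a thread that a prior learning invalidates: the thread is kept
// but flagged, so the reader sees the direction was already tried.
function alreadyExploredMarker(nodeId: string): string {
  return `${WARN} already explored ${DASH} see (journal) node ${nodeId}`;
}
```

Keeping the superseded status inside the citation line means the signal survives even if the thread is later copied out of the synthesis on its own.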
146
+
122
147
  End with `## Recommended next step` — one paragraph naming which thread to
123
- pursue first and why.
148
+ pursue first and why. If a prior learning rules a thread in or out, cite it here.
124
149
 
125
150
  ## Best practices
126
151
 
@@ -154,7 +179,9 @@ directions in the data.
154
179
  |---|---|
155
180
  | `mma-research` failed | Use `(no external source found)` sentinel on every external line. If `mma-investigate` also failed, do NOT synthesise — surface both errors to the user. |
156
181
  | `mma-investigate` failed | Treat as greenfield — use `(no internal anchor — fully greenfield)` sentinel. |
157
- | Both failed | Report both errors to the user. Do NOT fabricate threads. |
182
+ | `mma-journal-recall` failed OR returned 0 findings | Use the `(no prior learning)` sentinel on every prior-learning line and continue — the journal leg is additive, never blocking. A young project with an empty journal hits this every time; it is not an error. |
183
+ | All three failed | Report all errors to the user. Do NOT fabricate threads. |
184
+ | Both investigate and research failed | Report both errors to the user. Do NOT fabricate threads. |
158
185
  | Investigate returned `needsCallerClarification: true` | Pause — surface the clarification need to the user. Do NOT synthesise over an unfinished investigation. |
159
186
  | Research returned 0 usable sources | Sentinel on external lines. Add a one-line note in synthesis preamble: *"External research returned no usable sources — threads anchor on internal findings only."* |
160
187
  | Investigate headline reads "0 citations" / "confidence unparseable" but `results[0].findings.length > 0` | Known stage-sync noise — IGNORE the headline. The leg succeeded; read `results[0].findings` directly. |
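The failure-handling rows above amount to a small decision procedure, sketched here under stated assumptions. `LegResult`, `planSynthesis`, and the convention of passing `null` for a leg that failed outright are hypothetical; the sentinel strings are the ones quoted in the skill. The sketch simplifies one rule: it treats empty `findings` as sufficient for a sentinel, whereas the skill also asks you to confirm that `message` carries no finding-level content.

```typescript
// Hypothetical shapes: only fields named in the skill text are modelled.
interface Finding { claim: string; evidence: string; category: string; }
interface LegResult { completed: boolean; message: string; findings: Finding[]; }

type Leg = "investigate" | "research" | "journal-recall";
const LEGS: Leg[] = ["investigate", "research", "journal-recall"];

// Sentinel strings as quoted in the skill (\u2014 is the dash in the greenfield one).
const SENTINELS: Record<Leg, string> = {
  investigate: "(no internal anchor \u2014 fully greenfield)",
  research: "(no external source found)",
  "journal-recall": "(no prior learning)",
};

interface SynthesisPlan {
  synthesise: boolean;
  sentinels: Partial<Record<Leg, string>>;
}

// A leg that failed outright is passed as null.
function planSynthesis(legs: Record<Leg, LegResult | null>): SynthesisPlan {
  // Investigate and research both failed: surface the errors, never fabricate threads.
  if (legs.investigate === null && legs.research === null) {
    return { synthesise: false, sentinels: {} };
  }
  const sentinels: Partial<Record<Leg, string>> = {};
  for (const leg of LEGS) {
    const result = legs[leg];
    // Simplification: a failed leg or one with empty findings gets its sentinel.
    if (result === null || result.findings.length === 0) {
      sentinels[leg] = SENTINELS[leg];
    }
  }
  return { synthesise: true, sentinels };
}
```

The journal leg never appears in the blocking check, which mirrors the table's statement that it is additive and never blocking.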
@@ -12,7 +12,7 @@ when_to_use: >-
12
12
  git-history queries. OR you are about to read 3+ files / run any grep in main
13
13
  context — that's the inline-labor-leakage anti-pattern (AP2); delegate to this
14
14
  skill instead.
15
- version: 4.7.19
15
+ version: 4.8.0
16
16
  ---
17
17
 
18
18
  # mma-investigate
@@ -212,7 +212,11 @@ About to `Read` 3+ files just to answer one question? That's the wrong tradeoff
212
212
 
213
213
  ## Terminal context block
214
214
 
215
- Every completed task automatically registers a terminal markdown context block containing the full task report (headline, investigation synthesis, citations, and annotated findings). The `blockId` is returned in each task result under the shared `blockId` field (not a separate `terminalBlockId` field). This block is immutable, lives for the session duration, and counts against the project's `maxEntries` quota (default 500).
215
+ Every completed **read-route** task (audit / review / debug / investigate / research) auto-registers a reusable terminal context block containing its report (headline + findings). The block id is returned on each per-task result as **`contextBlockId`**. Write routes (delegate / execute-plan / retry) return `contextBlockId: null`; their record is the commit, not a block. This block is immutable, lives for the session duration, and counts against the project's `maxEntries` quota (default 500).
216
+
217
+ Use it for delta follow-ups — feed prior results' block ids into a later call's `contextBlockIds`, filtering out nulls:
218
+
219
+ contextBlockIds: priorResults.map(r => r.contextBlockId).filter((id) => id !== null)
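In TypeScript, a plain `filter((id) => id !== null)` may leave the element type as `string | null` on compiler versions that do not infer the predicate. A minimal sketch with an explicit type guard follows; `TaskResult` and `collectContextBlockIds` are hypothetical names that model only the `contextBlockId` field described above.

```typescript
// Hypothetical minimal result shape: only `contextBlockId` is modelled.
interface TaskResult {
  contextBlockId: string | null; // null on write routes, per the text above
}

// Collects block ids for a follow-up call, dropping nulls. The `id is string`
// type predicate lets the compiler narrow the result to string[].
function collectContextBlockIds(priorResults: TaskResult[]): string[] {
  return priorResults
    .map((r) => r.contextBlockId)
    .filter((id): id is string => id !== null);
}
```

The returned array can be passed as `contextBlockIds` in the next request body without further casting.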
216
220
 
217
221
  **Use cases:**
218
222
  - Pass investigation results to a downstream planning step
@@ -0,0 +1,242 @@
1
+ ---
2
+ name: mma-journal-recall
3
+ description: >-
4
+ Use when you're about to design or attempt something and want to know what
5
+ THIS project already learned — ask a vague conceptual question (no tags or
6
+ keywords needed); a read-only worker searches the learnings graph and returns
7
+ the relevant prior lessons + how they relate. Fire before re-treading ground
8
+ that may already have been explored. NOT for recording a new learning
9
+ (mma-journal-record), codebase questions (mma-investigate), or external
10
+ research (mma-research).
11
+ when_to_use: >-
12
+ A question about THIS project's learnings, before attempting or designing
13
+ something — ask a vague conceptual question; skip if recording a new learning,
14
+ asking the codebase, or researching external docs.
15
+ version: 4.8.0
16
+ ---
17
+
18
+ # mma-journal-recall
19
+
20
+ ## Overview
21
+
22
+ Recall relevant project learnings from the journal via a read-only mmagent worker. The worker reads the learnings graph at `.mmagent/journal/` and synthesizes answers to vague conceptual queries.
23
+
24
+ **Core principle:** Recall is retrieval (read, traverse graph, synthesize). Delegate it. The main agent stays focused on using the results: deciding what to do with the prior lessons.
25
+
26
+ ## When to Use
27
+
28
+ **Use when:**
29
+ - You are about to attempt something and want to ask "what have we learned about this?".
30
+ - The query is a conceptual question ("dispatch cancellation reliability?", "rate-limiting patterns?"), not exact tags or keywords.
31
+ - You want prior learnings + their relationships, not isolated chunks.
32
+ - The project has an active journal (started with `mma-journal-record`).
33
+
34
+ **Don't use when:**
35
+ - You're recording a new learning → `mma-journal-record` (write route).
36
+ - You're asking about the codebase structure → `mma-investigate` (read codebase).
37
+ - You're researching external docs/web → `mma-research` / `WebSearch`.
38
+ - The journal is empty or not yet initialized.
39
+
40
+ ## Endpoint
41
+
42
+ `POST /journal-recall?cwd=<abs-path>`
43
+
44
+ @include _shared/auth.md
45
+
46
+ ## Request body
47
+
48
+ ```json
49
+ {
50
+ "query": "what have we learned about dispatch cancellation reliability?",
51
+ "contextBlockIds": []
52
+ }
53
+ ```
54
+
55
+ | Field | Type | Required | Notes |
56
+ |---|---|---|---|
57
+ | `query` | string | yes | A vague conceptual question about prior learnings. No tags or keywords needed. |
58
+ | `contextBlockIds` | string[] | no | IDs from `mma-context-blocks` — enables follow-up / delta recall |
59
+ | `tools` | `'none' \| 'readonly'` | no | Default `'readonly'`. `'full'` and `'no-shell'` are rejected — recall is read-only |
60
+
61
+ > Worker tier for `mma-journal-recall` is hardcoded to `complex` and is not caller-configurable. Sending `agentType` is rejected with HTTP 400.
62
+
63
+ **Why `query` is vague, not keyword-filtered:**
64
+
65
+ ❌ `{ "query": "dispatch" }` — too narrow, might miss "cancellation reliability" nodes that don't mention the word "dispatch" in title.
66
+ ✅ `{ "query": "what have we learned about dispatch cancellation reliability?" }` — the worker understands the concept and finds related nodes.
67
+
68
+ **Why:** the worker traverses the journal's typed graph (supersedes, refines, contradicts, depends-on) and synthesizes across related nodes. Semantic matching is the LLM's job, just like `mma-investigate`.
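The request-body constraints above can be checked on the client before dispatch. This is a hedged sketch, not the server's validation logic: `buildRecallRequest` and `RecallRequest` are hypothetical, and the server remains the authority on what it accepts.

```typescript
// Hypothetical client-side pre-check mirroring the field table above.
type RecallTools = "none" | "readonly";

interface RecallRequest {
  query: string;
  contextBlockIds: string[];
  tools: RecallTools;
}

function buildRecallRequest(input: Record<string, unknown>): RecallRequest {
  // The skill states that sending `agentType` is rejected with HTTP 400, so fail early.
  if ("agentType" in input) {
    throw new Error("agentType is not caller-configurable for journal-recall");
  }
  const query = input.query;
  if (typeof query !== "string" || query.trim() === "") {
    throw new Error("query is required and must be a non-empty string");
  }
  // Default is 'readonly'; anything else (such as 'full' or 'no-shell') is refused.
  const rawTools = input.tools ?? "readonly";
  let tools: RecallTools;
  if (rawTools === "none" || rawTools === "readonly") {
    tools = rawTools;
  } else {
    throw new Error("tools must be 'none' or 'readonly' because recall is read-only");
  }
  // contextBlockIds is optional; keep only string entries.
  const contextBlockIds = Array.isArray(input.contextBlockIds)
    ? input.contextBlockIds.filter((id): id is string => typeof id === "string")
    : [];
  return { query, contextBlockIds, tools };
}
```

The check deliberately does not judge whether `query` is phrased as a conceptual question; that remains a matter of caller discipline.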
69
+
70
+ ## Full example
71
+
72
+ ```bash
73
+ BATCH=$(curl -f --show-error -s -X POST \
74
+ -H "X-MMA-Client: $MMA_CLIENT" \
75
+ -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
76
+ -H "Authorization: Bearer $TOKEN" \
77
+ -H "Content-Type: application/json" \
78
+ -d '{"query":"what have we learned about dispatch cancellation reliability?"}' \
79
+ "http://localhost:$PORT/journal-recall?cwd=/project")
80
+ BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
81
+ ```
82
+
83
+ @include _shared/polling.md
84
+
85
+ @include _shared/response-shape.md
86
+
87
+ ## Per-task report shape
88
+
89
+ Each task carries an `investigation` field on its per-task report (same shape as `mma-investigate`):
90
+
91
+ ```json
92
+ {
93
+ "investigation": {
94
+ "citations": [
95
+ { "file": "nodes/0012-dispatch-cancellation-lifecycle.md", "lines": "1-50", "claim": "Cancellation handlers must check context before writing." }
96
+ ],
97
+ "confidence": { "level": "high", "rationale": "Direct citations from journal nodes." },
98
+ "diagnostics": {
99
+ "malformedCitationLines": 0,
100
+ "missingRequiredSections": [],
101
+ "invalidRequiredSections": []
102
+ }
103
+ }
104
+ }
105
+ ```
106
+
107
+ The authoritative success signals are `completed`, `message`, and `findings`. See "v5 wire shape" below for the full envelope.
108
+
109
+ ## v5 wire shape (read route)
110
+
111
+ Every task result is a `ComposePayload` — seven main-agent fields plus a telemetry block.
112
+ The main-agent fields are authoritative; the telemetry block is diagnostics.
113
+
114
+ ```json
115
+ {
116
+ "completed": true,
117
+ "message": "Recall complete; 4 relevant learnings found.",
118
+ "findings": [
119
+ {
120
+ "id": "F1",
121
+ "severity": "critical",
122
+ "category": "correctness",
123
+ "claim": "Cancellation handlers must check context before writing to avoid corruption.",
124
+ "evidence": "nodes/0012-dispatch-cancellation-lifecycle.md:20-35 — verbatim substring from journal node.",
125
+ "suggestion": null,
126
+ "source": "implementer"
127
+ }
128
+ ],
129
+ "summary": "The project learned that dispatch cancellation must synchronize context reads (node 0012) and never write without checking. Related node 0008 (refines) adds that timeout-based cancellation has race conditions under high load.",
130
+ "filesChanged": [],
131
+ "commitSha": null,
132
+ "blockId": null,
133
+ "telemetry": {
134
+ "totalDurationMs": 1234,
135
+ "totalCostUSD": 0.08,
136
+ "workerSelfAssessment": "done",
137
+ "reviewVerdict": null,
138
+ "commitOutcome": "not_applicable",
139
+ "stopReason": "normal",
140
+ "haltedStage": null,
141
+ "stages": [...]
142
+ }
143
+ }
144
+ ```
145
+
146
+ ### Key fields
147
+
148
+ | Field | When populated | Notes |
149
+ |---|---|---|
150
+ | `completed` | always | `true` when at least one criterion succeeded; `false` on annotator transport failure OR unmet annotate preconditions |
151
+ | `message` | always | human-readable summary; names blocking gates or finding IDs on failure |
152
+ | `findings` | always | `source: 'implementer'` for recall; findings are the deliverable on read routes |
153
+ | `workerSelfAssessment` | always | `'done'` or `'failed'` — never `done_with_concerns` |
154
+ | `blockId` | always `null` (for write routes); string (for read routes) | recall is a read route, so `blockId` is a string — a reusable context block for delta follow-up |
155
+
156
+ ### No second review
157
+
158
+ The LLM-judge stage (`annotate`) runs once, after the worker's output. Its preconditions for read-route `completed: true`:
159
+
160
+ ```
161
+ gates.implement.outcome === 'advance'
162
+ && gates.implement.payload.workerSelfAssessment === 'done'
163
+ && (criteriaSucceeded.length > 0 || criteriaErrors.length === 0)
164
+ ```
165
+
166
+ Findings are the deliverable — a recall that surfaces 5 relevant lessons is `completed: true`. Finding nothing relevant is also a valid completion (returns `findings: []`).
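The precondition above can be written as a typed predicate, which makes the "zero findings is still a completion" case easy to see. The `ImplementGate` shape and the function name are assumptions made for illustration; only the boolean expression is taken from the text.

```typescript
// Hypothetical shape for the implement gate; field names follow the expression above.
interface ImplementGate {
  outcome: string; // 'advance' when the stage passed
  payload: { workerSelfAssessment: "done" | "failed" };
}

// Mirrors the read-route precondition for `completed: true`.
function readRouteCompleted(
  implement: ImplementGate,
  criteriaSucceeded: string[],
  criteriaErrors: string[],
): boolean {
  return (
    implement.outcome === "advance" &&
    implement.payload.workerSelfAssessment === "done" &&
    (criteriaSucceeded.length > 0 || criteriaErrors.length === 0)
  );
}
```

With no criteria succeeded and no criteria errors, the last clause is still true, so an empty but clean recall counts as completed.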
167
+
168
+ ### `completed: false` — what it means
169
+
170
+ Only on annotator transport failure, or if the journal is inaccessible/corrupted. The `message` names the blocking gate. Re-dispatch with a broader `query` if the worker's findings were too narrow.
171
+
172
+ ## Best practices
173
+
174
+ This skill is one step in a larger flow described in `multi-model-agent` → "Best practices". Recipes that involve `mma-journal-recall`:
175
+
176
+ - **Recipe A — Recall before attempting.** Call `mma-journal-recall` with your question before running `mma-delegate` / `mma-execute-plan` to avoid re-treading prior dead ends.
177
+ - **Recipe B — Recall → plan → execute.** `mma-journal-recall` → write a plan based on the learnings → `mma-execute-plan`.
178
+ - **Recipe C — Delta follow-up recall.** Feed a prior recall's `contextBlockId` into a follow-up call to dig deeper: `contextBlockIds: [priorResult.contextBlockId]`.
179
+
180
+ Anti-pattern alert: **Misusing recall as codebase search.** Recall is for the *project's learnings graph*, not the codebase. If you want to search code → `mma-investigate`. If you want to ask the journal → `mma-journal-recall`.
181
+
182
+ ## Common pitfalls
183
+
184
+ ❌ **Using exact tags instead of a conceptual question**
185
+ > query: "dispatch cancellation"
186
+
187
+ The worker expects a sentence with context, not keywords. **Fix:** phrase it as a question:
188
+ > query: "what have we learned about dispatch cancellation and how it interacts with timeouts?"
189
+
190
+ ❌ **Asking about the codebase instead of the journal**
191
+ > query: "where is DispatchCanceller called?"
192
+
193
+ That's a codebase question. Use `mma-investigate` instead. Journal recall is for *learnings* stored in `.mmagent/journal/`, not code.
194
+
195
+ ❌ **Assuming the journal exists**
196
+ > query: "what do we know about X?"
197
+
198
+ If the project hasn't used `mma-journal-record`, the journal is empty. The worker will return `not_applicable`. **Fix:** check whether the journal is active in the project first, or start recording learnings with `mma-journal-record`.
199
+
200
+ ## Terminal context block
201
+
202
+ Every completed **read-route** task (audit / review / debug / investigate / recall / research) auto-registers a reusable terminal context block containing its report (headline + findings). The block id is returned on each per-task result as **`contextBlockId`**. Write routes (delegate / execute-plan / retry / journal-record) return `contextBlockId: null` — their record is the commit, not a block. This block is immutable, lives for the session duration, and counts against the project's `maxEntries` quota (default 500).
203
+
204
+ Use it for delta follow-ups — feed prior results' block ids into a later call's `contextBlockIds`, filtering out nulls:
205
+
206
+ contextBlockIds: priorResults.map(r => r.contextBlockId).filter((id) => id !== null)
207
+
208
+ **Use cases:**
209
+ - Recall round 2: pass round 1's block into round 2's `contextBlockIds` to dig deeper on a specific thread.
210
+ - Recall → plan → execute chain: feed recall findings as a context block into `mma-execute-plan` as shared prior context.
211
+ - Multi-agent follow-up: capture a recall's block and hand it to another tool chain.
212
+
213
+ The block is registered server-side at task completion; no caller action is needed to create it. Delete it explicitly via `DELETE /context-blocks/:id` when no longer needed, or let it expire on session teardown.
214
+
215
+ ## Outcome semantics
216
+
217
+ Every task result carries outcome fields that describe the recall's conclusion status:
218
+
219
+ | Field | Type | Meaning |
220
+ |---|---|---|
221
+ | `findingsOutcome` | `'found' \| 'not_applicable'` | Answers the question: did the recall produce substantive learnings? |
222
+ | `findingsOutcomeReason` | `string \| null` | When `findingsOutcome` is set, this explains why (e.g. "No relevant journal nodes found for the query" or "Journal is empty"). |
223
+ | `outcomeInferred` | `boolean` | `true` if the system inferred the outcome from findings count; `false` if the worker explicitly stated it. |
224
+ | `outcomeMalformed` | `boolean` | `true` if the outcome line was malformed and had to be repaired; `false` otherwise. |
225
+
226
+ ### Enum values
227
+
228
+ - **`found`** — the recall produced one or more relevant prior learnings (findings) across one or more journal nodes.
229
+ - **`not_applicable`** — the recall could not proceed (the journal is empty, inaccessible, or nothing in it answers the query).
230
+
231
+ ### Empty journal ≠ failure
232
+
233
+ A recall that searches the journal and finds nothing relevant is a valid `completed: true` outcome; it simply answers "no prior learnings match that question" — which is useful information before attempting something new.
234
+
235
+ ### Per-route legal outcomes
236
+
237
+ The legal outcomes for this route are: `['found', 'not_applicable']`
238
+
239
+ - **`found`** — one or more prior learnings surfaced from the journal.
240
+ - **`not_applicable`** — the journal is empty, inaccessible, or no learnings match the query.
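A caller might branch on these outcome fields roughly as follows. This is a sketch under assumptions: `RecallOutcome` models only the fields listed in the table, and the `NextStep` labels are invented for illustration.

```typescript
// The two legal outcomes for this route, per the list above.
type FindingsOutcome = "found" | "not_applicable";

// Hypothetical subset of a task result: only the fields used here.
interface RecallOutcome {
  completed: boolean;
  findingsOutcome: FindingsOutcome;
  findingsOutcomeReason: string | null;
}

// Invented labels for what the main agent does next.
type NextStep = "use-findings" | "proceed-without-history" | "report-failure";

function nextStepFor(o: RecallOutcome): NextStep {
  // completed: false is the only real failure; an empty journal is not one.
  if (!o.completed) return "report-failure";
  return o.findingsOutcome === "found" ? "use-findings" : "proceed-without-history";
}
```

Treating `not_applicable` as a normal branch rather than an error matches the "empty journal is not a failure" rule above.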
241
+
242
+ @include _shared/error-handling.md
@@ -0,0 +1,189 @@
1
+ ---
2
+ name: mma-journal-record
3
+ description: >-
4
+ Use when you've abandoned an approach, hit a constraint, or concluded
5
+ something worth remembering — record it to the persistent journal as a
6
+ fire-and-forget decision audit trail for future sessions.
7
+ when_to_use: >-
8
+ You've completed analysis and want to log the outcome — abandoned an approach,
9
+ hit a blocking constraint, or reached a conclusion worth remembering. NOT for
10
+ recall/investigate/delegate; those are read routes. Journal stores conclusions
11
+ for cross-session reference.
12
+ version: 4.8.0
13
+ ---
14
+
15
+ # mma-journal-record
16
+
17
+ ## Overview
18
+
19
+ Record a learning, constraint, or decision outcome to the persistent journal via a fire-and-forget mmagent worker. The worker stores the entry and returns immediately; you continue on your main context.
20
+
21
+ **Core principle:** Journal is an audit trail of what you've decided, discovered, or abandoned. Record it once per session; don't re-investigate.
22
+
23
+ ## When to Use
24
+
25
+ **Use when:**
26
+ - You've abandoned an approach and want to log why
27
+ - You've hit a blocking constraint worth remembering
28
+ - You've reached a conclusion (e.g., "Pattern X doesn't work in this codebase")
29
+ - You've decided not to pursue a direction and want to avoid repeating that decision next session
30
+
31
+ **Don't use when:**
32
+ - You're asking a question → `mma-investigate`
33
+ - You're dispatching work → `mma-delegate`
34
+ - You want to retrieve past entries → `mma-journal-recall` (read route), or read the `.mmagent/journal/` files directly
35
+ - You're mid-task and want to pause → that's what `blockedBy` is for; journal is for conclusions, not temporary blockers
36
+
37
+ ## Endpoint
38
+
39
+ `POST /journal-record?cwd=<abs-path>`
40
+
41
+ @include _shared/auth.md
42
+
43
+ ## Request body
44
+
45
+ ```json
46
+ {
47
+ "learning": "Tried worker self-report for grouped-dispatch cancellation; dropped it — git diff is the source of truth. Lesson: use getRealFilesChanged.",
48
+ "tagHints": ["dispatch", "cancellation"]
49
+ }
50
+ ```
51
+
52
+ | Field | Type | Required | Notes |
53
+ |---|---|---|---|
54
+ | `learning` | string | yes | Natural-language entry: what you decided, why, or what you learned. Keep it concrete. |
55
+ | `tagHints` | string[] | no | Optional tags for later cross-reference (e.g. `["perf", "refactor"]`). Tags are advisory; the journal system may group or index them. |
56
+
57
+ **What gets stored & where:**
58
+
59
+ Entries are integrated into a graph-structured journal store at `.mmagent/journal/`:
60
+ - `nodes/` — individual learning entries (keyed by unique node ID)
61
+ - `index.md` — searchable index of all entries, tags, and cross-references
62
+ - `log.md` — append-only event log of create/refine/supersede/merge operations
63
+
64
+ The worker creates, refines, or supersedes nodes in the graph (never appends blindly). You can query the index or log directly to track learning history. Writes are confined to the project's `.mmagent/` directory (no traversal).
65
+
66
+ ## Full example
67
+
68
+ ```bash
69
+ BATCH=$(curl -f --show-error -s -X POST \
70
+ -H "X-MMA-Client: $MMA_CLIENT" \
71
+ -H "X-MMA-Main-Model: $MMA_MAIN_MODEL" \
72
+ -H "Authorization: Bearer $TOKEN" \
73
+ -H "Content-Type: application/json" \
74
+ -d '{
75
+ "learning": "Tried worker self-report for grouped-dispatch cancellation; dropped it — git diff is the source of truth. Lesson: use getRealFilesChanged.",
76
+ "tagHints": ["dispatch", "cancellation"]
77
+ }' \
78
+ "http://localhost:$PORT/journal-record?cwd=/project")
79
+ BATCH_ID=$(echo "$BATCH" | jq -r '.batchId')
80
+ ```
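Hand-written JSON inside single quotes breaks as soon as the learning text contains a quote or a newline. One way around that, sketched here under the assumption that `jq` 1.6 or newer is available (the example above already relies on `jq`), is to let `jq` build the body. The helper name `journal_body` is hypothetical and not part of the package.

```shell
# Hypothetical helper: print a /journal-record request body as compact JSON.
# Usage: journal_body "<learning text>" [tag ...]
journal_body() {
  local learning="$1"
  shift
  # --arg passes the text as a JSON string; --args exposes the remaining
  # arguments as the array $ARGS.positional (requires jq 1.6 or newer).
  jq -cn --arg learning "$learning" \
    '{learning: $learning, tagHints: $ARGS.positional}' --args "$@"
}
```

The result can be passed to the same `curl` call with `-d "$(journal_body "..." dispatch cancellation)"` in place of the inline JSON literal.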
81
+
82
+ @include _shared/polling.md
83
+
84
+ @include _shared/response-shape.md
85
+
86
+ ## Per-task report shape
87
+
88
+ Each task carries a structured report containing the graph operation metadata:
89
+
90
+ ```json
91
+ {
92
+ "summary": "created 0012; superseded 0009",
93
+ "filesChanged": [".mmagent/journal/nodes/0012.md", ".mmagent/journal/index.md", ".mmagent/journal/log.md"],
94
+ "op": "create"
95
+ }
96
+ ```
97
+
98
+ The authoritative success signal is `completed: true` together with a non-empty `filesChanged`. See "v5 wire shape" below for the full envelope.
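The success rule above can be checked mechanically. The sketch below reads one task result on stdin and exits 0 only when `completed` is `true` and `filesChanged` has at least one entry; treating an empty `filesChanged` as a failure is an interpretation of that rule, not something the package states. The function name `journal_succeeded` is hypothetical, and `jq` is assumed to be installed.

```shell
# Hypothetical helper: succeed only if the task result on stdin reports
# completed == true and lists at least one changed file.
journal_succeeded() {
  # jq -e maps a false or null result to a non-zero exit status.
  jq -e '.completed == true and (.filesChanged | length > 0)' >/dev/null
}
```

A caller might use it as `echo "$RESULT" | journal_succeeded || echo "journal write failed" >&2`, where `$RESULT` holds a single task result extracted from the polled batch response.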
99
+
100
+ ## v5 wire shape (reviewed write route)
101
+
102
+ Every task result is a `ComposePayload` — seven main-agent fields plus a telemetry block.
103
+ The main-agent fields are authoritative; the telemetry block is diagnostics.
104
+
105
+ ```json
106
+ {
107
+ "completed": true,
108
+ "message": "Journal entry created (node 0012); superseded prior learning (node 0009)",
109
+ "findings": [],
110
+ "summary": "created 0012; superseded 0009",
111
+ "filesChanged": [".mmagent/journal/nodes/0012.md", ".mmagent/journal/index.md", ".mmagent/journal/log.md"],
112
+ "commitSha": null,
113
+ "blockId": null,
114
+ "telemetry": {
115
+ "totalDurationMs": 5400,
116
+ "totalCostUSD": 0.04,
117
+ "workerSelfAssessment": "done",
118
+ "reviewVerdict": "approved",
119
+ "commitOutcome": "not_applicable",
120
+ "stopReason": "normal",
121
+ "haltedStage": null,
122
+ "stages": [
123
+ { "name": "prepare", "outcome": "advance", "durationMs": 2, "costUSD": 0 },
124
+ { "name": "register-block", "outcome": "skip", "comment": "register-block does not apply to route=journal", "durationMs": 0, "costUSD": 0 },
125
+ { "name": "implement", "outcome": "advance", "durationMs": 3200, "costUSD": 0.02 },
126
+ { "name": "review", "outcome": "advance", "durationMs": 1800, "costUSD": 0.01 },
127
+ { "name": "rework", "outcome": "skip", "comment": "rework skipped because review approved", "durationMs": 0, "costUSD": 0 },
128
+ { "name": "commit", "outcome": "skip", "comment": "commit does not apply to non-git routes", "durationMs": 0, "costUSD": 0 },
129
+ { "name": "annotate", "outcome": "advance", "durationMs": 340, "costUSD": 0.01 },
130
+ { "name": "compose", "outcome": "advance", "durationMs": 56, "costUSD": 0 },
131
+ { "name": "terminal", "outcome": "advance", "durationMs": 2, "costUSD": 0 }
132
+ ]
133
+ }
134
+ }
135
+ ```
136
+
137
+ ### Key fields
138
+
139
+ | Field | When populated | Notes |
140
+ |---|---|---|
141
+ | `completed` | always | `true` when the entry is created, refined, or superseded and the review approves it; `false` on review rejection, path traversal, or write failure |
142
+ | `message` | always | human-readable summary (e.g., "Journal entry created (node 0012); superseded prior learning (node 0009)"); read it on failure for diagnostics |
143
+ | `findings` | always | issues surfaced by the reviewer (e.g., unclear learning, duplicate with 0009). Empty if approved as-is. |
144
+ | `filesChanged` | always | graph journal paths modified: `nodes/`, `index.md`, `log.md` (relative to `cwd`) |
145
+ | `workerSelfAssessment` | via telemetry | `'done'` or `'failed'`; the worker's own assessment of completeness |
146
+ | `blockId` | always `null` | journal is a task route, not register-context-block |
147
+ | `commitSha` | always `null` | journal entries are graph mutations, not git commits |
148
+ | `reviewVerdict` | via telemetry | `'approved'` \| `'rejected_with_rework'` \| `'rejected'` — reviewer's verdict on the learned entry |
149
+
150
+ ### Reviewed write lifecycle
151
+
152
+ Unlike read routes (audit/investigate/debug), journal runs a full review cycle: **implement** → **review** → [optional **rework**] → **commit** (skipped for non-git routes) → **annotate**. If the reviewer finds issues (e.g., the learning is ambiguous, the node supersedes multiple prior entries), a rework round applies targeted edits before finalization.
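To see which lifecycle stages actually ran for a given task, the `telemetry.stages` array can be filtered on `outcome`. This is a minimal sketch that assumes the stage objects have the `name`, `outcome`, and `durationMs` fields shown in the wire-shape example and that `jq` is installed; the function name `journal_stages_run` is hypothetical.

```shell
# Hypothetical helper: read one task result on stdin and print the stages
# that actually ran (outcome other than "skip") with their durations.
journal_stages_run() {
  jq -r '.telemetry.stages[]
         | select(.outcome != "skip")
         | "\(.name)\t\(.durationMs)ms"'
}
```

A `rework` line in the output indicates that the reviewer asked for changes before the entry was finalized.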
153
+
154
+ ### `completed: false` — what it means
155
+
156
+ The reviewer rejected the entry, path traversal was detected, write permission was denied, or directory creation failed. The `message` names the blocking issue.
157
+
158
+ ## Best practices
159
+
160
+ **One entry per decision, not per turn.**
161
+ Log once when you decide not to pursue a direction; don't log "just checked X" on every iteration.
162
+
163
+ **Keep entries concrete.**
164
+ ❌ "Didn't work"
165
+ ✅ "Tried multicast-style dispatch with worker dedup; git diff is the source of truth, workers can't track cancellations atomically. Use getRealFilesChanged instead."
166
+
167
+ **Use tags to build searchable structure.**
168
+ ```bash
169
+ # Later, grep your journal for all perf decisions:
170
+ grep -rni "perf" .mmagent/journal/
171
+ ```
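How tags end up inside node files is not specified in this document, so any lookup is a plain-text search at best. Under that assumption, the sketch below lists the node files that mention a tag as a whole word, ignoring case. The function name `journal_find_tag` is hypothetical.

```shell
# Hypothetical helper: list node files that mention a tag (case-insensitive).
# Usage: journal_find_tag <tag> [project-dir]
journal_find_tag() {
  local tag="$1"
  local nodes="${2:-.}/.mmagent/journal/nodes"
  # -r recurse, -l print file names only, -i ignore case, -w whole word,
  # -F treat the tag as a literal string rather than a regex.
  grep -rliwF -- "$tag" "$nodes" | sort
}
```

Matching whole words keeps a search for `perf` from also returning entries that only mention `performance`.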
172
+
173
+ ## Common pitfalls
174
+
175
+ ❌ **Using journal as a scratchpad**
176
+ > "Thinking about X. Maybe Y? Need to check Z."
177
+
178
+ Journal is for **conclusions**, not work-in-progress. Keep notes in a separate working file if you need to brainstorm.
179
+
180
+ ❌ **Logging without context**
181
+ > "Doesn't work."
182
+
183
+ Future-you (or a teammate) won't remember what "doesn't work" means. Always include the decision frame: what did you try, why did you try it, what was the outcome, and what will you do instead?
184
+
185
+ ## Context blocks
186
+
187
+ Write-route tasks (delegate / execute-plan / journal / retry) do **not** register terminal context blocks. Their artifact is the filesystem mutation (git commit for delegate; graph mutations for journal). Read-route tasks (audit / review / debug / investigate / research) auto-register blocks containing their findings.
188
+
189
+ @include _shared/error-handling.md