agora-mnemo 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,281 @@
1
+ Metadata-Version: 2.4
2
+ Name: agora-mnemo
3
+ Version: 0.1.0
4
+ Summary: mnemo - a zero-dependency memory layer for AI agents: value-ranked recall, per-type decay, consolidation, and semantic+lexical auto-mode. Extracted from an autonomous research system running over ~9,000 notes.
5
+ Author: Agora (autonomous research organization)
6
+ License: MIT
7
+ Project-URL: Homepage, https://github.com/DanceNitra/agora
8
+ Project-URL: Source, https://github.com/DanceNitra/agora
9
+ Keywords: llm,agent,memory,rag,recall,consolidation,mcp,embeddings,second-brain
10
+ Classifier: Programming Language :: Python :: 3
11
+ Classifier: Intended Audience :: Science/Research
12
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
13
+ Requires-Python: >=3.8
14
+ Description-Content-Type: text/markdown
15
+ Provides-Extra: mcp
16
+ Requires-Dist: mcp[cli]>=1.0; extra == "mcp"
17
+
18
+ <div align="center">
19
+
20
+ # Mnemosyne · `mnemo`
21
+
22
+ **A memory layer for AI agents — the one that already runs an autonomous research OS over ~6,000 notes.**
23
+
24
+ *Memory is the mother of the Muses. An agent with no memory has no ideas.*
25
+
26
+ </div>
27
+
28
+ ---
29
+
30
+ `mnemo` is the recall + consolidation core of [Agora](https://github.com/DanceNitra/agora) — an
31
+ autonomous research system — distilled into **a single file with no required dependencies**. It does
32
+ the four things agent memory actually needs, the way that held up running in production for weeks.
33
+
34
+ Most "agent memory" libraries are demos. This one is extracted from a system that has used it daily
35
+ to curate a 6,000-note knowledge base, and whose consolidation behaviour we have **measured**, not
36
+ assumed (see *Provenance* below).
37
+
38
+ ## Install
39
+
40
+ ```bash
41
+ # single file, zero dependencies
42
+ curl -O https://raw.githubusercontent.com/DanceNitra/agora/main/mnemo/mnemo.py
43
+ ```
44
+
45
+ ## Install
46
+
47
+ ```bash
48
+ pip install agora-mnemo # the zero-dep core (import name stays `mnemo`)
49
+ pip install "agora-mnemo[mcp]" # + the MCP server, so any Claude/Cursor agent uses it as memory
50
+ ```
51
+
52
+ ## Use
53
+
54
+ ```python
55
+ from mnemo import Mnemo
56
+
57
+ m = Mnemo("memory.json") # persists to JSON; or Mnemo("memory.json", embed=my_model)
58
+
59
+ m.remember("Pre-trend tests catch only ~31% of fatal DiD bias.", tags=["causal"], value=3, mtype="semantic")
60
+ m.recall("difference in differences", k=5) # relevance × value, decayed by the memory's per-type half-life
61
+ m.consolidate(keep=200) # the "dream" pass: hubs, dedup, STATE-TOGGLE, keep-budget
62
+ m.consolidate_clusters(threshold=15) # cluster-TRIGGERED: consolidate only a topic that's grown dense
63
+ m.contradictions() # flag incompatible memories for REVIEW (never deletes)
64
+ m.value_by_cohort() # value reported per tag/time-block, not per memory
65
+ ```
66
+
67
+ Bring any text→vector function as `embed=` for semantic recall; with none, `mnemo` falls back to a
68
+ forgiving lexical match so it **runs anywhere, today**.
69
+
70
+ ## Use it as an MCP server (any Claude / Cursor / agent client)
71
+
72
+ `mnemo` ships an [MCP](https://modelcontextprotocol.io) stdio server so any MCP-compatible agent can
73
+ use it as long-term memory — `remember` (with a per-type decay prior), value-ranked `recall`,
74
+ `consolidate`, `consolidate_clusters`, `contradictions`, `value_by_cohort`. `mnemo.py` stays
75
+ zero-dependency; only the server needs the SDK:
76
+
77
+ ```bash
78
+ pip install "mcp[cli]"
79
+ curl -O https://raw.githubusercontent.com/DanceNitra/agora/main/mnemo/mnemo.py
80
+ curl -O https://raw.githubusercontent.com/DanceNitra/agora/main/mnemo/mnemo_mcp.py
81
+ MNEMO_PATH=./agent_memory.json python mnemo_mcp.py # speaks MCP over stdio
82
+ ```
83
+
84
+ Register it with a client — e.g. Claude Code (`.mcp.json`) or Claude Desktop
85
+ (`claude_desktop_config.json`):
86
+
87
+ ```json
88
+ {
89
+ "mcpServers": {
90
+ "mnemo": {
91
+ "command": "python",
92
+ "args": ["/abs/path/to/mnemo/mnemo_mcp.py"],
93
+ "env": { "MNEMO_PATH": "/abs/path/to/agent_memory.json" }
94
+ }
95
+ }
96
+ }
97
+ ```
98
+
99
+ For **semantic** recall, point it at any OpenAI-compatible embeddings endpoint via
100
+ `MNEMO_EMBED_URL` / `MNEMO_EMBED_MODEL` / `MNEMO_EMBED_KEY`; with none set it uses the lexical
101
+ fallback. The agent then calls `recall(query)` before reasoning and `remember(fact)` as it learns —
102
+ its memory is value-ranked and append-only, not a recency buffer.
103
+
104
+ ## The four operations
105
+
106
+ | op | what it does |
107
+ |---|---|
108
+ | `remember(text, tags, value, mtype)` | **append-only** raw capture, absolute UTC time, never edited; `mtype` ∈ {episodic, semantic, procedural} sets the **decay prior** (events fade fast, durable facts slow, rules barely) |
109
+ | `recall(query, k)` | **value-ranked** retrieval: relevance × value, **decayed by the memory's per-type half-life** (access resets the clock), so important durable memories beat both merely-similar and stale ones. Reinforcement is **relevance-weighted** (a bullseye hit reinforces value more than one that squeaked into top-k, so a weak-but-frequent false positive can't go immortal); a repeatedly-recalled episodic memory **graduates** to semantic; and a memory whose source was later contradicted is **provenance-demoted** + flagged `stale_derived` |
110
+ | `consolidate(keep)` | the **dream pass**: flag universal-matcher *hubs*, link near-duplicates, apply the **state-toggle guard** (a polarity clash supersedes, doesn't merge), supersede the low-value surplus — only *adds* a derived layer |
111
+ | `consolidate_clusters(threshold)` | **cluster-triggered** consolidation: consolidate a semantic cluster only once it's grown past `threshold` — sparse topics keep their raw episodes, dense ones don't grow unbounded |
112
+ | `contradictions()` | flag mutually-incompatible **related** memories (similarity-gated) for human review |
113
+
114
+ ## Five rules it won't break (each one cost us to learn)
115
+
116
+ 1. **Raw capture is immutable.** Consolidation adds links and markers; it never overwrites the
117
+ source. This is what stops the slow accuracy drift of LLM-rewritten memory.
118
+ 2. **Absolute timestamps at write time.** Relative/derived times rot the moment they're consolidated.
119
+ 3. **Value-ranked, type-aware decay.** Retention is `value × a per-type half-life`, not recency or
120
+ access-frequency alone. A *uniform* access-reset clock keeps merely-*popular* memories while a
121
+ load-bearing-but-cold fact — queried once a month, prevents a destructive action — starves; we
122
+ measured exactly that failure. The fix is that the half-life is set by **kind**, not by read
123
+ count: episodic events fade in days, semantic facts in months, procedural rules barely at all. A
124
+ cold-but-critical fact survives by being **typed** semantic/procedural (long half-life × its high
125
+ value), not by frequent reads; access only resets the clock *within* a type's window.
126
+ 4. **Value is reported at the cohort level** (tag / time-block), never per-memory.
127
+ 5. **Contradictions are flagged, never auto-resolved.** Silent rewrites destroy trust in the whole
128
+ memory.
129
+
130
+ ## Provenance — why these rules, with receipts
131
+
132
+ `mnemo`'s design isn't taste; it's what Agora's lab *measured*:
133
+
134
+ - **Semantic recall beats keyword recall, and the gap widens with scale** — as the store grows to
135
+ the ~6,000-note full corpus, lexical `recall@5` decays from **0.94** (small store) to **0.25**,
136
+ while semantic **holds at ~0.65** — ≈**2.6×** at full scale (Agora Lab `b4c260`); on paraphrase
137
+ queries semantic `recall@5` is **0.86 vs 0.20** lexical (`3501f1`). The embedder is the real lever
138
+ at scale; the lexical overlap match is the zero-dependency *floor* that still runs anywhere on a
139
+ small store. (Honest footnote: pruning
140
+ universal-matcher *hub* notes lifts **lexical** recall ~20% only when a store is link-spammed, and
141
+ does **not** move semantic recall — it's a lexical/hybrid optimisation, not a headline.)
142
+ - **Value-ranked consolidation** — under a keep-budget, ranking *what to keep* by value beats
143
+ FIFO/random, and the advantage **scales super-linearly as the budget shrinks** (≈1.8× at half
144
+ budget → ≈4× at one-eighth), surviving heavy estimation noise.
145
+ - **Retention must blend value with recency, not decay on access alone** — we simulated a
146
+ half-life-with-access-reset policy (a *popularity* signal) against a value-aware blend under a
147
+ shrinking budget, with value made deliberately anti-correlated with access-frequency for a
148
+ load-bearing-but-cold subset. At a 30% keep-budget the access-decay policy retained only **2.8%**
149
+ of the high-value/low-frequency memories and **20%** of total value, vs **100%** and **64%** for
150
+ the blend — about **3× more value kept** (the gap persists, ≈2.2× retained value, even at a 7%
151
+ budget). Pure access-frequency decay starves the rarely-queried-but-critical memories; forgetting
152
+ must consume an explicit value channel *separate from* access recency. (Agora Lab `19d802`.)
153
+ - **Cohort-level value** — per-memory outcome attribution is **statistically underpowered at n-of-1**
154
+ (the best proxy reached only ~0.36 power at realistic sample sizes); the cohort is where the
155
+ signal lives. Hence rule 4.
156
+ - **Contradiction detection** runs in production over the 6,000-note vault; the lesson that it must
157
+ *flag, not auto-edit* (rule 5) is why silent rewrites are forbidden.
158
+
159
+ (Methods + numbers live in the Agora track record: <https://dancenitra.github.io/agora/>.)
160
+
161
+ ## The `second_brain` thinking layer
162
+
163
+ `mnemo_mcp` gives an agent **memory**. `second_brain_mcp` gives it a **second brain to think over** —
164
+ point it at any folder of Markdown notes (an Obsidian vault, a Zettelkasten, a `docs/` tree) and an
165
+ MCP client (Claude Desktop, Claude Code, Cursor, your own agent) gets the substrate to *reason
166
+ against* those notes: pull what's relevant, find where the network is blind, surface non-obvious
167
+ bridges, isolate the claims worth checking, and generate ideas by named methods.
168
+
169
+ **The split that keeps it honest.** The server returns **retrieval + structure**; the calling LLM does
170
+ the **reasoning**. The tool is the memory and the map; the agent is the mind. There is no LLM call
171
+ inside this server — it scores, links, and slices your notes, then hands the material back. So the
172
+ claims below are about what an *agent* did with the tools, not about the tool "thinking" on its own.
173
+ No autonomous oracle.
174
+
175
+ **Runs today, zero config.** It indexes your notes into an in-process `mnemo` store at startup; with
176
+ no embedder it uses the lexical-overlap fallback. An embedder (`MNEMO_EMBED_URL/MODEL/KEY`) is optional
177
+ and matters **at scale**: on a ~6,000-note vault, lexical recall@5 decays from 0.94 (small store) to
178
+ **0.25** at full corpus while semantic **holds ~0.65** — ≈2.6× (Agora Lab `b4c260`); on paraphrase
179
+ queries semantic recall@5 is **0.86 vs 0.20** lexical (`3501f1`).
180
+
181
+ ```
182
+ NOTES_DIR=/path/to/your/vault python second_brain_mcp.py # run after a flat download of both files
183
+ ```
184
+
185
+ ### See it run (no setup)
186
+
187
+ ![second_brain demo — your notes, thinking](../examples/demo.gif)
188
+
189
+ `python examples/demo.py` runs every tool against a tiny bundled sample vault — no MCP client, no
190
+ key, no embedder. (Regenerate the GIF with `python examples/_make_gif.py` (Pillow) or
191
+ [`examples/demo.tape`](../examples/demo.tape) + [`vhs`](https://github.com/charmbracelet/vhs).)
192
+ The same session in text:
193
+
194
+ ```text
195
+ ▸ relevant_notes("how does feedback speed up learning", k=3)
196
+ → Deliberate Practice (Learning) relevance 0.60
197
+ → Expected Value (Decisions) relevance 0.20
198
+
199
+ ▸ find_gaps() → isolated: ["Sourdough Starter"] (the one note with no [[links]])
200
+
201
+ ▸ bridge_candidates("Deliberate Practice")
202
+ → Habit Loops (Habits, DISTANT domain) — both turn on "feedback latency", and nothing links them
203
+
204
+ ▸ extract_claims("Deliberate Practice")
205
+ → "Feedback latency is the hidden variable: the longer the gap between an action
206
+ and its feedback, the slower the learning." (line 3 — go ground or challenge it)
207
+
208
+ ▸ idea_methods() → 10 recipes (Hidden-Connection Bridge, Missing-Reciprocity, …)
209
+ ```
210
+
211
+ That `bridge_candidates` hit is the point: a connection across two folders that *you never linked* —
212
+ the agent now writes the mapping (or rejects it). The tool found the material; the agent does the thinking.
213
+
214
+ Register it with an MCP client (point `args` at the file's absolute path so `mnemo.py`, which sits
215
+ beside it, is found):
216
+
217
+ ```json
218
+ {
219
+ "mcpServers": {
220
+ "second_brain": {
221
+ "command": "python",
222
+ "args": ["/abs/path/to/second_brain_mcp.py"],
223
+ "env": {
224
+ "NOTES_DIR": "/abs/path/to/your/vault",
225
+ "SECOND_BRAIN_INDEX": "/abs/path/to/second_brain_index.json"
226
+ }
227
+ }
228
+ }
229
+ }
230
+ ```
231
+
232
+ | tool | returns |
233
+ |---|---|
234
+ | `index_status` | notes indexed, folder spread, resolved `NOTES_DIR` (call first; `0` ⇒ fix `NOTES_DIR`) |
235
+ | `relevant_notes` | the `k` most relevant notes by relevance × accrued value (value accrues with use; a cold index is effectively relevance-ranked), with excerpts |
236
+ | `coverage_gap` | the **negative space** of a question: top notes + a measured completeness score + the explicit sub-terms with **no** supporting note — a WYSIATI guard so the agent sees what's *missing* and doesn't answer a tidy-but-incomplete context with false confidence |
237
+ | `find_gaps` | isolated/under-linked notes + thin folders — where the network is blind (noisy on a tiny vault; earns its keep at scale) |
238
+ | `bridge_candidates` | distant notes (different folder, no link) that are semantically close = candidate connections; the agent writes or rejects the mapping |
239
+ | `extract_claims` | claim-like sentences from a note so the agent can ground or challenge them |
240
+ | `idea_methods` | a toolkit of named idea-generation recipes, so generation is principled, not a vibe |
241
+
242
+ Dogfood result, stated honestly: pointed at the maintainer's own ~6,000-note vault, an agent using
243
+ these tools caught a number in his *own* forecasting note inflated ~7× ("60-78%" vs the real ~6-11%),
244
+ surfaced two silently-contradicting notes, and proposed ideas via `idea_methods` — two of which were
245
+ then severe-tested **in Agora's separate research lab** (not inside this server) and held. The LLM did
246
+ the reasoning; the corrections still warrant a source-check before public citation.
247
+
248
+ ### Trust & safety
249
+ - **Read-only over your notes.** The server reads `NOTES_DIR` recursively; it does no `eval`, no shell,
250
+ no subprocess, and writes only its own index file. Symlinks/junctions that point *outside*
251
+ `NOTES_DIR` are deliberately **not** followed (so a planted link in a shared/cloned vault can't leak
252
+ files from elsewhere on disk).
253
+ - **The embedder is a trust boundary.** If you set `MNEMO_EMBED_URL`, the **full text of every note**
254
+ is POSTed there. It's validated at startup — `https` anywhere, plain `http` only to loopback (local
255
+ Ollama, etc.), and cloud-metadata/link-local targets are refused. Point it only at an endpoint you trust.
256
+ - **Notes over ~2 MB are skipped** (configurable via `SECOND_BRAIN_MAX_BYTES`) so a single huge file
257
+ can't exhaust memory.
258
+
259
+ ## Status
260
+
261
+ `v0.1` — the core, honest and runnable, **now with two MCP servers**: `mnemo_mcp` (memory) and
262
+ `second_brain_mcp` (the thinking layer over your notes). Roadmap: pluggable vector stores, a hosted
263
+ tier. Open-core; the core stays free.
264
+
265
+ MIT-licensed · part of [Agora](https://github.com/DanceNitra/agora).
266
+
267
+ ## Self-maintaining (maintain.py)
268
+ The #1 second-brain frustration is **maintenance**, not capture. `maintain.py` runs the chore people
269
+ stop doing — over a folder of Markdown notes it finds **dead `[[wikilinks]]`, orphan notes, stale
270
+ notes, near-duplicate clusters**, and a **vault health score** (`self_legibility` = % of notes in the
271
+ link graph's giant component — knowledge debt is a *percolation* collapse, so it warns *before* the
272
+ cliff). Crucially it turns findings into **actions**: for each orphan it **suggests which existing
273
+ note to link it to** (re-connecting it to the graph), and flags **archive candidates** (old +
274
+ isolated). It resolves links by filename *or* frontmatter alias, and dates notes by frontmatter
275
+ (not git-reset mtime) — both learned from dogfooding it on a real ~7,700-note vault (it rescued ~300
276
+ falsely-flagged orphans). Advisory + safe: it returns a plan and an action list; it never edits,
277
+ moves, or deletes a note. And it can **apply** the fix when you ask: `apply_suggestions` appends a
278
+ marked `## Related (auto-suggested)` block of `[[links]]` to each orphan — additive only, idempotent
279
+ (re-running replaces its own block), **dry-run by default**. `python maintain.py` runs a verified
280
+ round-trip on a synthetic vault (diagnose → suggest → apply); `maintenance_report` and `apply_links`
281
+ in `second_brain_mcp.py` expose it to any MCP agent.
@@ -0,0 +1,264 @@
1
+ <div align="center">
2
+
3
+ # Mnemosyne · `mnemo`
4
+
5
+ **A memory layer for AI agents — the one that already runs an autonomous research OS over ~6,000 notes.**
6
+
7
+ *Memory is the mother of the Muses. An agent with no memory has no ideas.*
8
+
9
+ </div>
10
+
11
+ ---
12
+
13
+ `mnemo` is the recall + consolidation core of [Agora](https://github.com/DanceNitra/agora) — an
14
+ autonomous research system — distilled into **a single file with no required dependencies**. It does
15
+ the four things agent memory actually needs, the way that held up running in production for weeks.
16
+
17
+ Most "agent memory" libraries are demos. This one is extracted from a system that has used it daily
18
+ to curate a 6,000-note knowledge base, and whose consolidation behaviour we have **measured**, not
19
+ assumed (see *Provenance* below).
20
+
21
+ ## Install
22
+
23
+ ```bash
24
+ # single file, zero dependencies
25
+ curl -O https://raw.githubusercontent.com/DanceNitra/agora/main/mnemo/mnemo.py
26
+ ```
27
+
28
+ ## Install
29
+
30
+ ```bash
31
+ pip install agora-mnemo # the zero-dep core (import name stays `mnemo`)
32
+ pip install "agora-mnemo[mcp]" # + the MCP server, so any Claude/Cursor agent uses it as memory
33
+ ```
34
+
35
+ ## Use
36
+
37
+ ```python
38
+ from mnemo import Mnemo
39
+
40
+ m = Mnemo("memory.json") # persists to JSON; or Mnemo("memory.json", embed=my_model)
41
+
42
+ m.remember("Pre-trend tests catch only ~31% of fatal DiD bias.", tags=["causal"], value=3, mtype="semantic")
43
+ m.recall("difference in differences", k=5) # relevance × value, decayed by the memory's per-type half-life
44
+ m.consolidate(keep=200) # the "dream" pass: hubs, dedup, STATE-TOGGLE, keep-budget
45
+ m.consolidate_clusters(threshold=15) # cluster-TRIGGERED: consolidate only a topic that's grown dense
46
+ m.contradictions() # flag incompatible memories for REVIEW (never deletes)
47
+ m.value_by_cohort() # value reported per tag/time-block, not per memory
48
+ ```
49
+
50
+ Bring any text→vector function as `embed=` for semantic recall; with none, `mnemo` falls back to a
51
+ forgiving lexical match so it **runs anywhere, today**.
52
+
53
+ ## Use it as an MCP server (any Claude / Cursor / agent client)
54
+
55
+ `mnemo` ships an [MCP](https://modelcontextprotocol.io) stdio server so any MCP-compatible agent can
56
+ use it as long-term memory — `remember` (with a per-type decay prior), value-ranked `recall`,
57
+ `consolidate`, `consolidate_clusters`, `contradictions`, `value_by_cohort`. `mnemo.py` stays
58
+ zero-dependency; only the server needs the SDK:
59
+
60
+ ```bash
61
+ pip install "mcp[cli]"
62
+ curl -O https://raw.githubusercontent.com/DanceNitra/agora/main/mnemo/mnemo.py
63
+ curl -O https://raw.githubusercontent.com/DanceNitra/agora/main/mnemo/mnemo_mcp.py
64
+ MNEMO_PATH=./agent_memory.json python mnemo_mcp.py # speaks MCP over stdio
65
+ ```
66
+
67
+ Register it with a client — e.g. Claude Code (`.mcp.json`) or Claude Desktop
68
+ (`claude_desktop_config.json`):
69
+
70
+ ```json
71
+ {
72
+ "mcpServers": {
73
+ "mnemo": {
74
+ "command": "python",
75
+ "args": ["/abs/path/to/mnemo/mnemo_mcp.py"],
76
+ "env": { "MNEMO_PATH": "/abs/path/to/agent_memory.json" }
77
+ }
78
+ }
79
+ }
80
+ ```
81
+
82
+ For **semantic** recall, point it at any OpenAI-compatible embeddings endpoint via
83
+ `MNEMO_EMBED_URL` / `MNEMO_EMBED_MODEL` / `MNEMO_EMBED_KEY`; with none set it uses the lexical
84
+ fallback. The agent then calls `recall(query)` before reasoning and `remember(fact)` as it learns —
85
+ its memory is value-ranked and append-only, not a recency buffer.
86
+
87
+ ## The four operations
88
+
89
+ | op | what it does |
90
+ |---|---|
91
+ | `remember(text, tags, value, mtype)` | **append-only** raw capture, absolute UTC time, never edited; `mtype` ∈ {episodic, semantic, procedural} sets the **decay prior** (events fade fast, durable facts slow, rules barely) |
92
+ | `recall(query, k)` | **value-ranked** retrieval: relevance × value, **decayed by the memory's per-type half-life** (access resets the clock), so important durable memories beat both merely-similar and stale ones. Reinforcement is **relevance-weighted** (a bullseye hit reinforces value more than one that squeaked into top-k, so a weak-but-frequent false positive can't go immortal); a repeatedly-recalled episodic memory **graduates** to semantic; and a memory whose source was later contradicted is **provenance-demoted** + flagged `stale_derived` |
93
+ | `consolidate(keep)` | the **dream pass**: flag universal-matcher *hubs*, link near-duplicates, apply the **state-toggle guard** (a polarity clash supersedes, doesn't merge), supersede the low-value surplus — only *adds* a derived layer |
94
+ | `consolidate_clusters(threshold)` | **cluster-triggered** consolidation: consolidate a semantic cluster only once it's grown past `threshold` — sparse topics keep their raw episodes, dense ones don't grow unbounded |
95
+ | `contradictions()` | flag mutually-incompatible **related** memories (similarity-gated) for human review |
96
+
97
+ ## Five rules it won't break (each one cost us to learn)
98
+
99
+ 1. **Raw capture is immutable.** Consolidation adds links and markers; it never overwrites the
100
+ source. This is what stops the slow accuracy drift of LLM-rewritten memory.
101
+ 2. **Absolute timestamps at write time.** Relative/derived times rot the moment they're consolidated.
102
+ 3. **Value-ranked, type-aware decay.** Retention is `value × a per-type half-life`, not recency or
103
+ access-frequency alone. A *uniform* access-reset clock keeps merely-*popular* memories while a
104
+ load-bearing-but-cold fact — queried once a month, prevents a destructive action — starves; we
105
+ measured exactly that failure. The fix is that the half-life is set by **kind**, not by read
106
+ count: episodic events fade in days, semantic facts in months, procedural rules barely at all. A
107
+ cold-but-critical fact survives by being **typed** semantic/procedural (long half-life × its high
108
+ value), not by frequent reads; access only resets the clock *within* a type's window.
109
+ 4. **Value is reported at the cohort level** (tag / time-block), never per-memory.
110
+ 5. **Contradictions are flagged, never auto-resolved.** Silent rewrites destroy trust in the whole
111
+ memory.
112
+
113
+ ## Provenance — why these rules, with receipts
114
+
115
+ `mnemo`'s design isn't taste; it's what Agora's lab *measured*:
116
+
117
+ - **Semantic recall beats keyword recall, and the gap widens with scale** — as the store grows to
118
+ the ~6,000-note full corpus, lexical `recall@5` decays from **0.94** (small store) to **0.25**,
119
+ while semantic **holds at ~0.65** — ≈**2.6×** at full scale (Agora Lab `b4c260`); on paraphrase
120
+ queries semantic `recall@5` is **0.86 vs 0.20** lexical (`3501f1`). The embedder is the real lever
121
+ at scale; the lexical overlap match is the zero-dependency *floor* that still runs anywhere on a
122
+ small store. (Honest footnote: pruning
123
+ universal-matcher *hub* notes lifts **lexical** recall ~20% only when a store is link-spammed, and
124
+ does **not** move semantic recall — it's a lexical/hybrid optimisation, not a headline.)
125
+ - **Value-ranked consolidation** — under a keep-budget, ranking *what to keep* by value beats
126
+ FIFO/random, and the advantage **scales super-linearly as the budget shrinks** (≈1.8× at half
127
+ budget → ≈4× at one-eighth), surviving heavy estimation noise.
128
+ - **Retention must blend value with recency, not decay on access alone** — we simulated a
129
+ half-life-with-access-reset policy (a *popularity* signal) against a value-aware blend under a
130
+ shrinking budget, with value made deliberately anti-correlated with access-frequency for a
131
+ load-bearing-but-cold subset. At a 30% keep-budget the access-decay policy retained only **2.8%**
132
+ of the high-value/low-frequency memories and **20%** of total value, vs **100%** and **64%** for
133
+ the blend — about **3× more value kept** (the gap persists, ≈2.2× retained value, even at a 7%
134
+ budget). Pure access-frequency decay starves the rarely-queried-but-critical memories; forgetting
135
+ must consume an explicit value channel *separate from* access recency. (Agora Lab `19d802`.)
136
+ - **Cohort-level value** — per-memory outcome attribution is **statistically underpowered at n-of-1**
137
+ (the best proxy reached only ~0.36 power at realistic sample sizes); the cohort is where the
138
+ signal lives. Hence rule 4.
139
+ - **Contradiction detection** runs in production over the 6,000-note vault; the lesson that it must
140
+ *flag, not auto-edit* (rule 5) is why silent rewrites are forbidden.
141
+
142
+ (Methods + numbers live in the Agora track record: <https://dancenitra.github.io/agora/>.)
143
+
144
+ ## The `second_brain` thinking layer
145
+
146
+ `mnemo_mcp` gives an agent **memory**. `second_brain_mcp` gives it a **second brain to think over** —
147
+ point it at any folder of Markdown notes (an Obsidian vault, a Zettelkasten, a `docs/` tree) and an
148
+ MCP client (Claude Desktop, Claude Code, Cursor, your own agent) gets the substrate to *reason
149
+ against* those notes: pull what's relevant, find where the network is blind, surface non-obvious
150
+ bridges, isolate the claims worth checking, and generate ideas by named methods.
151
+
152
+ **The split that keeps it honest.** The server returns **retrieval + structure**; the calling LLM does
153
+ the **reasoning**. The tool is the memory and the map; the agent is the mind. There is no LLM call
154
+ inside this server — it scores, links, and slices your notes, then hands the material back. So the
155
+ claims below are about what an *agent* did with the tools, not about the tool "thinking" on its own.
156
+ No autonomous oracle.
157
+
158
+ **Runs today, zero config.** It indexes your notes into an in-process `mnemo` store at startup; with
159
+ no embedder it uses the lexical-overlap fallback. An embedder (`MNEMO_EMBED_URL/MODEL/KEY`) is optional
160
+ and matters **at scale**: on a ~6,000-note vault, lexical recall@5 decays from 0.94 (small store) to
161
+ **0.25** at full corpus while semantic **holds ~0.65** — ≈2.6× (Agora Lab `b4c260`); on paraphrase
162
+ queries semantic recall@5 is **0.86 vs 0.20** lexical (`3501f1`).
163
+
164
+ ```
165
+ NOTES_DIR=/path/to/your/vault python second_brain_mcp.py # run after a flat download of both files
166
+ ```
167
+
168
+ ### See it run (no setup)
169
+
170
+ ![second_brain demo — your notes, thinking](../examples/demo.gif)
171
+
172
+ `python examples/demo.py` runs every tool against a tiny bundled sample vault — no MCP client, no
173
+ key, no embedder. (Regenerate the GIF with `python examples/_make_gif.py` (Pillow) or
174
+ [`examples/demo.tape`](../examples/demo.tape) + [`vhs`](https://github.com/charmbracelet/vhs).)
175
+ The same session in text:
176
+
177
+ ```text
178
+ ▸ relevant_notes("how does feedback speed up learning", k=3)
179
+ → Deliberate Practice (Learning) relevance 0.60
180
+ → Expected Value (Decisions) relevance 0.20
181
+
182
+ ▸ find_gaps() → isolated: ["Sourdough Starter"] (the one note with no [[links]])
183
+
184
+ ▸ bridge_candidates("Deliberate Practice")
185
+ → Habit Loops (Habits, DISTANT domain) — both turn on "feedback latency", and nothing links them
186
+
187
+ ▸ extract_claims("Deliberate Practice")
188
+ → "Feedback latency is the hidden variable: the longer the gap between an action
189
+ and its feedback, the slower the learning." (line 3 — go ground or challenge it)
190
+
191
+ ▸ idea_methods() → 10 recipes (Hidden-Connection Bridge, Missing-Reciprocity, …)
192
+ ```
193
+
194
+ That `bridge_candidates` hit is the point: a connection across two folders that *you never linked* —
195
+ the agent now writes the mapping (or rejects it). The tool found the material; the agent does the thinking.
196
+
197
+ Register it with an MCP client (point `args` at the file's absolute path so `mnemo.py`, which sits
198
+ beside it, is found):
199
+
200
+ ```json
201
+ {
202
+ "mcpServers": {
203
+ "second_brain": {
204
+ "command": "python",
205
+ "args": ["/abs/path/to/second_brain_mcp.py"],
206
+ "env": {
207
+ "NOTES_DIR": "/abs/path/to/your/vault",
208
+ "SECOND_BRAIN_INDEX": "/abs/path/to/second_brain_index.json"
209
+ }
210
+ }
211
+ }
212
+ }
213
+ ```
214
+
215
+ | tool | returns |
216
+ |---|---|
217
+ | `index_status` | notes indexed, folder spread, resolved `NOTES_DIR` (call first; `0` ⇒ fix `NOTES_DIR`) |
218
+ | `relevant_notes` | the `k` most relevant notes by relevance × accrued value (value accrues with use; a cold index is effectively relevance-ranked), with excerpts |
219
+ | `coverage_gap` | the **negative space** of a question: top notes + a measured completeness score + the explicit sub-terms with **no** supporting note — a WYSIATI guard so the agent sees what's *missing* and doesn't answer a tidy-but-incomplete context with false confidence |
220
+ | `find_gaps` | isolated/under-linked notes + thin folders — where the network is blind (noisy on a tiny vault; earns its keep at scale) |
221
+ | `bridge_candidates` | distant notes (different folder, no link) that are semantically close = candidate connections; the agent writes or rejects the mapping |
222
+ | `extract_claims` | claim-like sentences from a note so the agent can ground or challenge them |
223
+ | `idea_methods` | a toolkit of named idea-generation recipes, so generation is principled, not a vibe |
224
+
225
+ Dogfood result, stated honestly: pointed at the maintainer's own ~6,000-note vault, an agent using
226
+ these tools caught a number in his *own* forecasting note inflated ~7× ("60-78%" vs the real ~6-11%),
227
+ surfaced two silently-contradicting notes, and proposed ideas via `idea_methods` — two of which were
228
+ then severe-tested **in Agora's separate research lab** (not inside this server) and held. The LLM did
229
+ the reasoning; the corrections still warrant a source-check before public citation.
230
+
231
+ ### Trust & safety
232
+ - **Read-only over your notes.** The server reads `NOTES_DIR` recursively; it does no `eval`, no shell,
233
+ no subprocess, and writes only its own index file. Symlinks/junctions that point *outside*
234
+ `NOTES_DIR` are deliberately **not** followed (so a planted link in a shared/cloned vault can't leak
235
+ files from elsewhere on disk).
236
+ - **The embedder is a trust boundary.** If you set `MNEMO_EMBED_URL`, the **full text of every note**
237
+ is POSTed there. It's validated at startup — `https` anywhere, plain `http` only to loopback (local
238
+ Ollama, etc.), and cloud-metadata/link-local targets are refused. Point it only at an endpoint you trust.
239
+ - **Notes over ~2 MB are skipped** (configurable via `SECOND_BRAIN_MAX_BYTES`) so a single huge file
240
+ can't exhaust memory.
241
+
242
+ ## Status
243
+
244
+ `v0.1` — the core, honest and runnable, **now with two MCP servers**: `mnemo_mcp` (memory) and
245
+ `second_brain_mcp` (the thinking layer over your notes). Roadmap: pluggable vector stores, a hosted
246
+ tier. Open-core; the core stays free.
247
+
248
+ MIT-licensed · part of [Agora](https://github.com/DanceNitra/agora).
249
+
250
+ ## Self-maintaining (maintain.py)
251
+ The #1 second-brain frustration is **maintenance**, not capture. `maintain.py` runs the chore people
252
+ stop doing — over a folder of Markdown notes it finds **dead `[[wikilinks]]`, orphan notes, stale
253
+ notes, near-duplicate clusters**, and a **vault health score** (`self_legibility` = % of notes in the
254
+ link graph's giant component — knowledge debt is a *percolation* collapse, so it warns *before* the
255
+ cliff). Crucially it turns findings into **actions**: for each orphan it **suggests which existing
256
+ note to link it to** (re-connecting it to the graph), and flags **archive candidates** (old +
257
+ isolated). It resolves links by filename *or* frontmatter alias, and dates notes by frontmatter
258
+ (not git-reset mtime) — both learned from dogfooding it on a real ~7,700-note vault (it rescued ~300
259
+ falsely-flagged orphans). Advisory + safe: it returns a plan and an action list; it never edits,
260
+ moves, or deletes a note. And it can **apply** the fix when you ask: `apply_suggestions` appends a
261
+ marked `## Related (auto-suggested)` block of `[[links]]` to each orphan — additive only, idempotent
262
+ (re-running replaces its own block), **dry-run by default**. `python maintain.py` runs a verified
263
+ round-trip on a synthetic vault (diagnose → suggest → apply); `maintenance_report` and `apply_links`
264
+ in `second_brain_mcp.py` expose it to any MCP agent.