agora-mnemo 0.1.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- agora_mnemo-0.1.0/PKG-INFO +281 -0
- agora_mnemo-0.1.0/README.md +264 -0
- agora_mnemo-0.1.0/agora_mnemo.egg-info/PKG-INFO +281 -0
- agora_mnemo-0.1.0/agora_mnemo.egg-info/SOURCES.txt +11 -0
- agora_mnemo-0.1.0/agora_mnemo.egg-info/dependency_links.txt +1 -0
- agora_mnemo-0.1.0/agora_mnemo.egg-info/entry_points.txt +2 -0
- agora_mnemo-0.1.0/agora_mnemo.egg-info/requires.txt +3 -0
- agora_mnemo-0.1.0/agora_mnemo.egg-info/top_level.txt +1 -0
- agora_mnemo-0.1.0/mnemo/__init__.py +4 -0
- agora_mnemo-0.1.0/mnemo/mnemo.py +548 -0
- agora_mnemo-0.1.0/mnemo/mnemo_mcp.py +138 -0
- agora_mnemo-0.1.0/pyproject.toml +27 -0
- agora_mnemo-0.1.0/setup.cfg +4 -0
|
@@ -0,0 +1,281 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: agora-mnemo
|
|
3
|
+
Version: 0.1.0
|
|
4
|
+
Summary: mnemo - a zero-dependency memory layer for AI agents: value-ranked recall, per-type decay, consolidation, and semantic+lexical auto-mode. Extracted from an autonomous research system running over ~9,000 notes.
|
|
5
|
+
Author: Agora (autonomous research organization)
|
|
6
|
+
License: MIT
|
|
7
|
+
Project-URL: Homepage, https://github.com/DanceNitra/agora
|
|
8
|
+
Project-URL: Source, https://github.com/DanceNitra/agora
|
|
9
|
+
Keywords: llm,agent,memory,rag,recall,consolidation,mcp,embeddings,second-brain
|
|
10
|
+
Classifier: Programming Language :: Python :: 3
|
|
11
|
+
Classifier: Intended Audience :: Science/Research
|
|
12
|
+
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
|
|
13
|
+
Requires-Python: >=3.8
|
|
14
|
+
Description-Content-Type: text/markdown
|
|
15
|
+
Provides-Extra: mcp
|
|
16
|
+
Requires-Dist: mcp[cli]>=1.0; extra == "mcp"
|
|
17
|
+
|
|
18
|
+
<div align="center">
|
|
19
|
+
|
|
20
|
+
# Mnemosyne · `mnemo`
|
|
21
|
+
|
|
22
|
+
**A memory layer for AI agents — the one that already runs an autonomous research OS over ~6,000 notes.**
|
|
23
|
+
|
|
24
|
+
*Memory is the mother of the Muses. An agent with no memory has no ideas.*
|
|
25
|
+
|
|
26
|
+
</div>
|
|
27
|
+
|
|
28
|
+
---
|
|
29
|
+
|
|
30
|
+
`mnemo` is the recall + consolidation core of [Agora](https://github.com/DanceNitra/agora) — an
|
|
31
|
+
autonomous research system — distilled into **a single file with no required dependencies**. It does
|
|
32
|
+
the four things agent memory actually needs, the way that held up running in production for weeks.
|
|
33
|
+
|
|
34
|
+
Most "agent memory" libraries are demos. This one is extracted from a system that has used it daily
|
|
35
|
+
to curate a 6,000-note knowledge base, and whose consolidation behaviour we have **measured**, not
|
|
36
|
+
assumed (see *Provenance* below).
|
|
37
|
+
|
|
38
|
+
## Install
|
|
39
|
+
|
|
40
|
+
```bash
|
|
41
|
+
# single file, zero dependencies
|
|
42
|
+
curl -O https://raw.githubusercontent.com/DanceNitra/agora/main/mnemo/mnemo.py
|
|
43
|
+
```
|
|
44
|
+
|
|
45
|
+
## Install
|
|
46
|
+
|
|
47
|
+
```bash
|
|
48
|
+
pip install agora-mnemo # the zero-dep core (import name stays `mnemo`)
|
|
49
|
+
pip install "agora-mnemo[mcp]" # + the MCP server, so any Claude/Cursor agent uses it as memory
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
## Use
|
|
53
|
+
|
|
54
|
+
```python
|
|
55
|
+
from mnemo import Mnemo
|
|
56
|
+
|
|
57
|
+
m = Mnemo("memory.json") # persists to JSON; or Mnemo("memory.json", embed=my_model)
|
|
58
|
+
|
|
59
|
+
m.remember("Pre-trend tests catch only ~31% of fatal DiD bias.", tags=["causal"], value=3, mtype="semantic")
|
|
60
|
+
m.recall("difference in differences", k=5) # relevance × value, decayed by the memory's per-type half-life
|
|
61
|
+
m.consolidate(keep=200) # the "dream" pass: hubs, dedup, STATE-TOGGLE, keep-budget
|
|
62
|
+
m.consolidate_clusters(threshold=15) # cluster-TRIGGERED: consolidate only a topic that's grown dense
|
|
63
|
+
m.contradictions() # flag incompatible memories for REVIEW (never deletes)
|
|
64
|
+
m.value_by_cohort() # value reported per tag/time-block, not per memory
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
Bring any text→vector function as `embed=` for semantic recall; with none, `mnemo` falls back to a
|
|
68
|
+
forgiving lexical match so it **runs anywhere, today**.
|
|
69
|
+
|
|
70
|
+
## Use it as an MCP server (any Claude / Cursor / agent client)
|
|
71
|
+
|
|
72
|
+
`mnemo` ships an [MCP](https://modelcontextprotocol.io) stdio server so any MCP-compatible agent can
|
|
73
|
+
use it as long-term memory — `remember` (with a per-type decay prior), value-ranked `recall`,
|
|
74
|
+
`consolidate`, `consolidate_clusters`, `contradictions`, `value_by_cohort`. `mnemo.py` stays
|
|
75
|
+
zero-dependency; only the server needs the SDK:
|
|
76
|
+
|
|
77
|
+
```bash
|
|
78
|
+
pip install "mcp[cli]"
|
|
79
|
+
curl -O https://raw.githubusercontent.com/DanceNitra/agora/main/mnemo/mnemo.py
|
|
80
|
+
curl -O https://raw.githubusercontent.com/DanceNitra/agora/main/mnemo/mnemo_mcp.py
|
|
81
|
+
MNEMO_PATH=./agent_memory.json python mnemo_mcp.py # speaks MCP over stdio
|
|
82
|
+
```
|
|
83
|
+
|
|
84
|
+
Register it with a client — e.g. Claude Code (`.mcp.json`) or Claude Desktop
|
|
85
|
+
(`claude_desktop_config.json`):
|
|
86
|
+
|
|
87
|
+
```json
|
|
88
|
+
{
|
|
89
|
+
"mcpServers": {
|
|
90
|
+
"mnemo": {
|
|
91
|
+
"command": "python",
|
|
92
|
+
"args": ["/abs/path/to/mnemo/mnemo_mcp.py"],
|
|
93
|
+
"env": { "MNEMO_PATH": "/abs/path/to/agent_memory.json" }
|
|
94
|
+
}
|
|
95
|
+
}
|
|
96
|
+
}
|
|
97
|
+
```
|
|
98
|
+
|
|
99
|
+
For **semantic** recall, point it at any OpenAI-compatible embeddings endpoint via
|
|
100
|
+
`MNEMO_EMBED_URL` / `MNEMO_EMBED_MODEL` / `MNEMO_EMBED_KEY`; with none set it uses the lexical
|
|
101
|
+
fallback. The agent then calls `recall(query)` before reasoning and `remember(fact)` as it learns —
|
|
102
|
+
its memory is value-ranked and append-only, not a recency buffer.
|
|
103
|
+
|
|
104
|
+
## The four operations
|
|
105
|
+
|
|
106
|
+
| op | what it does |
|
|
107
|
+
|---|---|
|
|
108
|
+
| `remember(text, tags, value, mtype)` | **append-only** raw capture, absolute UTC time, never edited; `mtype` ∈ {episodic, semantic, procedural} sets the **decay prior** (events fade fast, durable facts slow, rules barely) |
|
|
109
|
+
| `recall(query, k)` | **value-ranked** retrieval: relevance × value, **decayed by the memory's per-type half-life** (access resets the clock), so important durable memories beat both merely-similar and stale ones. Reinforcement is **relevance-weighted** (a bullseye hit reinforces value more than one that squeaked into top-k, so a weak-but-frequent false positive can't go immortal); a repeatedly-recalled episodic memory **graduates** to semantic; and a memory whose source was later contradicted is **provenance-demoted** + flagged `stale_derived` |
|
|
110
|
+
| `consolidate(keep)` | the **dream pass**: flag universal-matcher *hubs*, link near-duplicates, apply the **state-toggle guard** (a polarity clash supersedes, doesn't merge), supersede the low-value surplus — only *adds* a derived layer |
|
|
111
|
+
| `consolidate_clusters(threshold)` | **cluster-triggered** consolidation: consolidate a semantic cluster only once it's grown past `threshold` — sparse topics keep their raw episodes, dense ones don't grow unbounded |
|
|
112
|
+
| `contradictions()` | flag mutually-incompatible **related** memories (similarity-gated) for human review |
|
|
113
|
+
|
|
114
|
+
## Five rules it won't break (each one cost us to learn)
|
|
115
|
+
|
|
116
|
+
1. **Raw capture is immutable.** Consolidation adds links and markers; it never overwrites the
|
|
117
|
+
source. This is what stops the slow accuracy drift of LLM-rewritten memory.
|
|
118
|
+
2. **Absolute timestamps at write time.** Relative/derived times rot the moment they're consolidated.
|
|
119
|
+
3. **Value-ranked, type-aware decay.** Retention is `value × a per-type half-life`, not recency or
|
|
120
|
+
access-frequency alone. A *uniform* access-reset clock keeps merely-*popular* memories while a
|
|
121
|
+
load-bearing-but-cold fact — queried once a month, prevents a destructive action — starves; we
|
|
122
|
+
measured exactly that failure. The fix is that the half-life is set by **kind**, not by read
|
|
123
|
+
count: episodic events fade in days, semantic facts in months, procedural rules barely at all. A
|
|
124
|
+
cold-but-critical fact survives by being **typed** semantic/procedural (long half-life × its high
|
|
125
|
+
value), not by frequent reads; access only resets the clock *within* a type's window.
|
|
126
|
+
4. **Value is reported at the cohort level** (tag / time-block), never per-memory.
|
|
127
|
+
5. **Contradictions are flagged, never auto-resolved.** Silent rewrites destroy trust in the whole
|
|
128
|
+
memory.
|
|
129
|
+
|
|
130
|
+
## Provenance — why these rules, with receipts
|
|
131
|
+
|
|
132
|
+
`mnemo`'s design isn't taste; it's what Agora's lab *measured*:
|
|
133
|
+
|
|
134
|
+
- **Semantic recall beats keyword recall, and the gap widens with scale** — as the store grows to
|
|
135
|
+
the ~6,000-note full corpus, lexical `recall@5` decays from **0.94** (small store) to **0.25**,
|
|
136
|
+
while semantic **holds at ~0.65** — ≈**2.6×** at full scale (Agora Lab `b4c260`); on paraphrase
|
|
137
|
+
queries semantic `recall@5` is **0.86 vs 0.20** lexical (`3501f1`). The embedder is the real lever
|
|
138
|
+
at scale; the lexical overlap match is the zero-dependency *floor* that still runs anywhere on a
|
|
139
|
+
small store. (Honest footnote: pruning
|
|
140
|
+
universal-matcher *hub* notes lifts **lexical** recall ~20% only when a store is link-spammed, and
|
|
141
|
+
does **not** move semantic recall — it's a lexical/hybrid optimisation, not a headline.)
|
|
142
|
+
- **Value-ranked consolidation** — under a keep-budget, ranking *what to keep* by value beats
|
|
143
|
+
FIFO/random, and the advantage **scales super-linearly as the budget shrinks** (≈1.8× at half
|
|
144
|
+
budget → ≈4× at one-eighth), surviving heavy estimation noise.
|
|
145
|
+
- **Retention must blend value with recency, not decay on access alone** — we simulated a
|
|
146
|
+
half-life-with-access-reset policy (a *popularity* signal) against a value-aware blend under a
|
|
147
|
+
shrinking budget, with value made deliberately anti-correlated with access-frequency for a
|
|
148
|
+
load-bearing-but-cold subset. At a 30% keep-budget the access-decay policy retained only **2.8%**
|
|
149
|
+
of the high-value/low-frequency memories and **20%** of total value, vs **100%** and **64%** for
|
|
150
|
+
the blend — about **3× more value kept** (the gap persists, ≈2.2× retained value, even at a 7%
|
|
151
|
+
budget). Pure access-frequency decay starves the rarely-queried-but-critical memories; forgetting
|
|
152
|
+
must consume an explicit value channel *separate from* access recency. (Agora Lab `19d802`.)
|
|
153
|
+
- **Cohort-level value** — per-memory outcome attribution is **statistically underpowered at n-of-1**
|
|
154
|
+
(the best proxy reached only ~0.36 power at realistic sample sizes); the cohort is where the
|
|
155
|
+
signal lives. Hence rule 4.
|
|
156
|
+
- **Contradiction detection** runs in production over the 6,000-note vault; the lesson that it must
|
|
157
|
+
*flag, not auto-edit* (rule 5) is why silent rewrites are forbidden.
|
|
158
|
+
|
|
159
|
+
(Methods + numbers live in the Agora track record: <https://dancenitra.github.io/agora/>.)
|
|
160
|
+
|
|
161
|
+
## The `second_brain` thinking layer
|
|
162
|
+
|
|
163
|
+
`mnemo_mcp` gives an agent **memory**. `second_brain_mcp` gives it a **second brain to think over** —
|
|
164
|
+
point it at any folder of Markdown notes (an Obsidian vault, a Zettelkasten, a `docs/` tree) and an
|
|
165
|
+
MCP client (Claude Desktop, Claude Code, Cursor, your own agent) gets the substrate to *reason
|
|
166
|
+
against* those notes: pull what's relevant, find where the network is blind, surface non-obvious
|
|
167
|
+
bridges, isolate the claims worth checking, and generate ideas by named methods.
|
|
168
|
+
|
|
169
|
+
**The split that keeps it honest.** The server returns **retrieval + structure**; the calling LLM does
|
|
170
|
+
the **reasoning**. The tool is the memory and the map; the agent is the mind. There is no LLM call
|
|
171
|
+
inside this server — it scores, links, and slices your notes, then hands the material back. So the
|
|
172
|
+
claims below are about what an *agent* did with the tools, not about the tool "thinking" on its own.
|
|
173
|
+
No autonomous oracle.
|
|
174
|
+
|
|
175
|
+
**Runs today, zero config.** It indexes your notes into an in-process `mnemo` store at startup; with
|
|
176
|
+
no embedder it uses the lexical-overlap fallback. An embedder (`MNEMO_EMBED_URL/MODEL/KEY`) is optional
|
|
177
|
+
and matters **at scale**: on a ~6,000-note vault, lexical recall@5 decays from 0.94 (small store) to
|
|
178
|
+
**0.25** at full corpus while semantic **holds ~0.65** — ≈2.6× (Agora Lab `b4c260`); on paraphrase
|
|
179
|
+
queries semantic recall@5 is **0.86 vs 0.20** lexical (`3501f1`).
|
|
180
|
+
|
|
181
|
+
```
|
|
182
|
+
NOTES_DIR=/path/to/your/vault python second_brain_mcp.py # run after a flat download of both files
|
|
183
|
+
```
|
|
184
|
+
|
|
185
|
+
### See it run (no setup)
|
|
186
|
+
|
|
187
|
+

|
|
188
|
+
|
|
189
|
+
`python examples/demo.py` runs every tool against a tiny bundled sample vault — no MCP client, no
|
|
190
|
+
key, no embedder. (Regenerate the GIF with `python examples/_make_gif.py` (Pillow) or
|
|
191
|
+
[`examples/demo.tape`](../examples/demo.tape) + [`vhs`](https://github.com/charmbracelet/vhs).)
|
|
192
|
+
The same session in text:
|
|
193
|
+
|
|
194
|
+
```text
|
|
195
|
+
▸ relevant_notes("how does feedback speed up learning", k=3)
|
|
196
|
+
→ Deliberate Practice (Learning) relevance 0.60
|
|
197
|
+
→ Expected Value (Decisions) relevance 0.20
|
|
198
|
+
|
|
199
|
+
▸ find_gaps() → isolated: ["Sourdough Starter"] (the one note with no [[links]])
|
|
200
|
+
|
|
201
|
+
▸ bridge_candidates("Deliberate Practice")
|
|
202
|
+
→ Habit Loops (Habits, DISTANT domain) — both turn on "feedback latency", and nothing links them
|
|
203
|
+
|
|
204
|
+
▸ extract_claims("Deliberate Practice")
|
|
205
|
+
→ "Feedback latency is the hidden variable: the longer the gap between an action
|
|
206
|
+
and its feedback, the slower the learning." (line 3 — go ground or challenge it)
|
|
207
|
+
|
|
208
|
+
▸ idea_methods() → 10 recipes (Hidden-Connection Bridge, Missing-Reciprocity, …)
|
|
209
|
+
```
|
|
210
|
+
|
|
211
|
+
That `bridge_candidates` hit is the point: a connection across two folders that *you never linked* —
|
|
212
|
+
the agent now writes the mapping (or rejects it). The tool found the material; the agent does the thinking.
|
|
213
|
+
|
|
214
|
+
Register it with an MCP client (point `args` at the file's absolute path so `mnemo.py`, which sits
|
|
215
|
+
beside it, is found):
|
|
216
|
+
|
|
217
|
+
```json
|
|
218
|
+
{
|
|
219
|
+
"mcpServers": {
|
|
220
|
+
"second_brain": {
|
|
221
|
+
"command": "python",
|
|
222
|
+
"args": ["/abs/path/to/second_brain_mcp.py"],
|
|
223
|
+
"env": {
|
|
224
|
+
"NOTES_DIR": "/abs/path/to/your/vault",
|
|
225
|
+
"SECOND_BRAIN_INDEX": "/abs/path/to/second_brain_index.json"
|
|
226
|
+
}
|
|
227
|
+
}
|
|
228
|
+
}
|
|
229
|
+
}
|
|
230
|
+
```
|
|
231
|
+
|
|
232
|
+
| tool | returns |
|
|
233
|
+
|---|---|
|
|
234
|
+
| `index_status` | notes indexed, folder spread, resolved `NOTES_DIR` (call first; `0` ⇒ fix `NOTES_DIR`) |
|
|
235
|
+
| `relevant_notes` | the `k` most relevant notes by relevance × accrued value (value accrues with use; a cold index is effectively relevance-ranked), with excerpts |
|
|
236
|
+
| `coverage_gap` | the **negative space** of a question: top notes + a measured completeness score + the explicit sub-terms with **no** supporting note — a WYSIATI guard so the agent sees what's *missing* and doesn't answer a tidy-but-incomplete context with false confidence |
|
|
237
|
+
| `find_gaps` | isolated/under-linked notes + thin folders — where the network is blind (noisy on a tiny vault; earns its keep at scale) |
|
|
238
|
+
| `bridge_candidates` | distant notes (different folder, no link) that are semantically close = candidate connections; the agent writes or rejects the mapping |
|
|
239
|
+
| `extract_claims` | claim-like sentences from a note so the agent can ground or challenge them |
|
|
240
|
+
| `idea_methods` | a toolkit of named idea-generation recipes, so generation is principled, not a vibe |
|
|
241
|
+
|
|
242
|
+
Dogfood result, stated honestly: pointed at the maintainer's own ~6,000-note vault, an agent using
|
|
243
|
+
these tools caught a number in his *own* forecasting note inflated ~7× ("60-78%" vs the real ~6-11%),
|
|
244
|
+
surfaced two silently-contradicting notes, and proposed ideas via `idea_methods` — two of which were
|
|
245
|
+
then severe-tested **in Agora's separate research lab** (not inside this server) and held. The LLM did
|
|
246
|
+
the reasoning; the corrections still warrant a source-check before public citation.
|
|
247
|
+
|
|
248
|
+
### Trust & safety
|
|
249
|
+
- **Read-only over your notes.** The server reads `NOTES_DIR` recursively; it does no `eval`, no shell,
|
|
250
|
+
no subprocess, and writes only its own index file. Symlinks/junctions that point *outside*
|
|
251
|
+
`NOTES_DIR` are deliberately **not** followed (so a planted link in a shared/cloned vault can't leak
|
|
252
|
+
files from elsewhere on disk).
|
|
253
|
+
- **The embedder is a trust boundary.** If you set `MNEMO_EMBED_URL`, the **full text of every note**
|
|
254
|
+
is POSTed there. It's validated at startup — `https` anywhere, plain `http` only to loopback (local
|
|
255
|
+
Ollama, etc.), and cloud-metadata/link-local targets are refused. Point it only at an endpoint you trust.
|
|
256
|
+
- **Notes over ~2 MB are skipped** (configurable via `SECOND_BRAIN_MAX_BYTES`) so a single huge file
|
|
257
|
+
can't exhaust memory.
|
|
258
|
+
|
|
259
|
+
## Status
|
|
260
|
+
|
|
261
|
+
`v0.1` — the core, honest and runnable, **now with two MCP servers**: `mnemo_mcp` (memory) and
|
|
262
|
+
`second_brain_mcp` (the thinking layer over your notes). Roadmap: pluggable vector stores, a hosted
|
|
263
|
+
tier. Open-core; the core stays free.
|
|
264
|
+
|
|
265
|
+
MIT-licensed · part of [Agora](https://github.com/DanceNitra/agora).
|
|
266
|
+
|
|
267
|
+
## Self-maintaining (maintain.py)
|
|
268
|
+
The #1 second-brain frustration is **maintenance**, not capture. `maintain.py` runs the chore people
|
|
269
|
+
stop doing — over a folder of Markdown notes it finds **dead `[[wikilinks]]`, orphan notes, stale
|
|
270
|
+
notes, near-duplicate clusters**, and a **vault health score** (`self_legibility` = % of notes in the
|
|
271
|
+
link graph's giant component — knowledge debt is a *percolation* collapse, so it warns *before* the
|
|
272
|
+
cliff). Crucially it turns findings into **actions**: for each orphan it **suggests which existing
|
|
273
|
+
note to link it to** (re-connecting it to the graph), and flags **archive candidates** (old +
|
|
274
|
+
isolated). It resolves links by filename *or* frontmatter alias, and dates notes by frontmatter
|
|
275
|
+
(not git-reset mtime) — both learned from dogfooding it on a real ~7,700-note vault (it rescued ~300
|
|
276
|
+
falsely-flagged orphans). Advisory + safe: it returns a plan and an action list; it never edits,
|
|
277
|
+
moves, or deletes a note. And it can **apply** the fix when you ask: `apply_suggestions` appends a
|
|
278
|
+
marked `## Related (auto-suggested)` block of `[[links]]` to each orphan — additive only, idempotent
|
|
279
|
+
(re-running replaces its own block), **dry-run by default**. `python maintain.py` runs a verified
|
|
280
|
+
round-trip on a synthetic vault (diagnose → suggest → apply); `maintenance_report` and `apply_links`
|
|
281
|
+
in `second_brain_mcp.py` expose it to any MCP agent.
|
|
@@ -0,0 +1,264 @@
|
|
|
1
|
+
<div align="center">
|
|
2
|
+
|
|
3
|
+
# Mnemosyne · `mnemo`
|
|
4
|
+
|
|
5
|
+
**A memory layer for AI agents — the one that already runs an autonomous research OS over ~6,000 notes.**
|
|
6
|
+
|
|
7
|
+
*Memory is the mother of the Muses. An agent with no memory has no ideas.*
|
|
8
|
+
|
|
9
|
+
</div>
|
|
10
|
+
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
`mnemo` is the recall + consolidation core of [Agora](https://github.com/DanceNitra/agora) — an
|
|
14
|
+
autonomous research system — distilled into **a single file with no required dependencies**. It does
|
|
15
|
+
the four things agent memory actually needs, the way that held up running in production for weeks.
|
|
16
|
+
|
|
17
|
+
Most "agent memory" libraries are demos. This one is extracted from a system that has used it daily
|
|
18
|
+
to curate a 6,000-note knowledge base, and whose consolidation behaviour we have **measured**, not
|
|
19
|
+
assumed (see *Provenance* below).
|
|
20
|
+
|
|
21
|
+
## Install
|
|
22
|
+
|
|
23
|
+
```bash
|
|
24
|
+
# single file, zero dependencies
|
|
25
|
+
curl -O https://raw.githubusercontent.com/DanceNitra/agora/main/mnemo/mnemo.py
|
|
26
|
+
```
|
|
27
|
+
|
|
28
|
+
## Install
|
|
29
|
+
|
|
30
|
+
```bash
|
|
31
|
+
pip install agora-mnemo # the zero-dep core (import name stays `mnemo`)
|
|
32
|
+
pip install "agora-mnemo[mcp]" # + the MCP server, so any Claude/Cursor agent uses it as memory
|
|
33
|
+
```
|
|
34
|
+
|
|
35
|
+
## Use
|
|
36
|
+
|
|
37
|
+
```python
|
|
38
|
+
from mnemo import Mnemo
|
|
39
|
+
|
|
40
|
+
m = Mnemo("memory.json") # persists to JSON; or Mnemo("memory.json", embed=my_model)
|
|
41
|
+
|
|
42
|
+
m.remember("Pre-trend tests catch only ~31% of fatal DiD bias.", tags=["causal"], value=3, mtype="semantic")
|
|
43
|
+
m.recall("difference in differences", k=5) # relevance × value, decayed by the memory's per-type half-life
|
|
44
|
+
m.consolidate(keep=200) # the "dream" pass: hubs, dedup, STATE-TOGGLE, keep-budget
|
|
45
|
+
m.consolidate_clusters(threshold=15) # cluster-TRIGGERED: consolidate only a topic that's grown dense
|
|
46
|
+
m.contradictions() # flag incompatible memories for REVIEW (never deletes)
|
|
47
|
+
m.value_by_cohort() # value reported per tag/time-block, not per memory
|
|
48
|
+
```
|
|
49
|
+
|
|
50
|
+
Bring any text→vector function as `embed=` for semantic recall; with none, `mnemo` falls back to a
|
|
51
|
+
forgiving lexical match so it **runs anywhere, today**.
|
|
52
|
+
|
|
53
|
+
## Use it as an MCP server (any Claude / Cursor / agent client)
|
|
54
|
+
|
|
55
|
+
`mnemo` ships an [MCP](https://modelcontextprotocol.io) stdio server so any MCP-compatible agent can
|
|
56
|
+
use it as long-term memory — `remember` (with a per-type decay prior), value-ranked `recall`,
|
|
57
|
+
`consolidate`, `consolidate_clusters`, `contradictions`, `value_by_cohort`. `mnemo.py` stays
|
|
58
|
+
zero-dependency; only the server needs the SDK:
|
|
59
|
+
|
|
60
|
+
```bash
|
|
61
|
+
pip install "mcp[cli]"
|
|
62
|
+
curl -O https://raw.githubusercontent.com/DanceNitra/agora/main/mnemo/mnemo.py
|
|
63
|
+
curl -O https://raw.githubusercontent.com/DanceNitra/agora/main/mnemo/mnemo_mcp.py
|
|
64
|
+
MNEMO_PATH=./agent_memory.json python mnemo_mcp.py # speaks MCP over stdio
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
Register it with a client — e.g. Claude Code (`.mcp.json`) or Claude Desktop
|
|
68
|
+
(`claude_desktop_config.json`):
|
|
69
|
+
|
|
70
|
+
```json
|
|
71
|
+
{
|
|
72
|
+
"mcpServers": {
|
|
73
|
+
"mnemo": {
|
|
74
|
+
"command": "python",
|
|
75
|
+
"args": ["/abs/path/to/mnemo/mnemo_mcp.py"],
|
|
76
|
+
"env": { "MNEMO_PATH": "/abs/path/to/agent_memory.json" }
|
|
77
|
+
}
|
|
78
|
+
}
|
|
79
|
+
}
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
For **semantic** recall, point it at any OpenAI-compatible embeddings endpoint via
|
|
83
|
+
`MNEMO_EMBED_URL` / `MNEMO_EMBED_MODEL` / `MNEMO_EMBED_KEY`; with none set it uses the lexical
|
|
84
|
+
fallback. The agent then calls `recall(query)` before reasoning and `remember(fact)` as it learns —
|
|
85
|
+
its memory is value-ranked and append-only, not a recency buffer.
|
|
86
|
+
|
|
87
|
+
## The four operations
|
|
88
|
+
|
|
89
|
+
| op | what it does |
|
|
90
|
+
|---|---|
|
|
91
|
+
| `remember(text, tags, value, mtype)` | **append-only** raw capture, absolute UTC time, never edited; `mtype` ∈ {episodic, semantic, procedural} sets the **decay prior** (events fade fast, durable facts slow, rules barely) |
|
|
92
|
+
| `recall(query, k)` | **value-ranked** retrieval: relevance × value, **decayed by the memory's per-type half-life** (access resets the clock), so important durable memories beat both merely-similar and stale ones. Reinforcement is **relevance-weighted** (a bullseye hit reinforces value more than one that squeaked into top-k, so a weak-but-frequent false positive can't go immortal); a repeatedly-recalled episodic memory **graduates** to semantic; and a memory whose source was later contradicted is **provenance-demoted** + flagged `stale_derived` |
|
|
93
|
+
| `consolidate(keep)` | the **dream pass**: flag universal-matcher *hubs*, link near-duplicates, apply the **state-toggle guard** (a polarity clash supersedes, doesn't merge), supersede the low-value surplus — only *adds* a derived layer |
|
|
94
|
+
| `consolidate_clusters(threshold)` | **cluster-triggered** consolidation: consolidate a semantic cluster only once it's grown past `threshold` — sparse topics keep their raw episodes, dense ones don't grow unbounded |
|
|
95
|
+
| `contradictions()` | flag mutually-incompatible **related** memories (similarity-gated) for human review |
|
|
96
|
+
|
|
97
|
+
## Five rules it won't break (each one cost us to learn)
|
|
98
|
+
|
|
99
|
+
1. **Raw capture is immutable.** Consolidation adds links and markers; it never overwrites the
|
|
100
|
+
source. This is what stops the slow accuracy drift of LLM-rewritten memory.
|
|
101
|
+
2. **Absolute timestamps at write time.** Relative/derived times rot the moment they're consolidated.
|
|
102
|
+
3. **Value-ranked, type-aware decay.** Retention is `value × a per-type half-life`, not recency or
|
|
103
|
+
access-frequency alone. A *uniform* access-reset clock keeps merely-*popular* memories while a
|
|
104
|
+
load-bearing-but-cold fact — queried once a month, prevents a destructive action — starves; we
|
|
105
|
+
measured exactly that failure. The fix is that the half-life is set by **kind**, not by read
|
|
106
|
+
count: episodic events fade in days, semantic facts in months, procedural rules barely at all. A
|
|
107
|
+
cold-but-critical fact survives by being **typed** semantic/procedural (long half-life × its high
|
|
108
|
+
value), not by frequent reads; access only resets the clock *within* a type's window.
|
|
109
|
+
4. **Value is reported at the cohort level** (tag / time-block), never per-memory.
|
|
110
|
+
5. **Contradictions are flagged, never auto-resolved.** Silent rewrites destroy trust in the whole
|
|
111
|
+
memory.
|
|
112
|
+
|
|
113
|
+
## Provenance — why these rules, with receipts
|
|
114
|
+
|
|
115
|
+
`mnemo`'s design isn't taste; it's what Agora's lab *measured*:
|
|
116
|
+
|
|
117
|
+
- **Semantic recall beats keyword recall, and the gap widens with scale** — as the store grows to
|
|
118
|
+
the ~6,000-note full corpus, lexical `recall@5` decays from **0.94** (small store) to **0.25**,
|
|
119
|
+
while semantic **holds at ~0.65** — ≈**2.6×** at full scale (Agora Lab `b4c260`); on paraphrase
|
|
120
|
+
queries semantic `recall@5` is **0.86 vs 0.20** lexical (`3501f1`). The embedder is the real lever
|
|
121
|
+
at scale; the lexical overlap match is the zero-dependency *floor* that still runs anywhere on a
|
|
122
|
+
small store. (Honest footnote: pruning
|
|
123
|
+
universal-matcher *hub* notes lifts **lexical** recall ~20% only when a store is link-spammed, and
|
|
124
|
+
does **not** move semantic recall — it's a lexical/hybrid optimisation, not a headline.)
|
|
125
|
+
- **Value-ranked consolidation** — under a keep-budget, ranking *what to keep* by value beats
|
|
126
|
+
FIFO/random, and the advantage **scales super-linearly as the budget shrinks** (≈1.8× at half
|
|
127
|
+
budget → ≈4× at one-eighth), surviving heavy estimation noise.
|
|
128
|
+
- **Retention must blend value with recency, not decay on access alone** — we simulated a
|
|
129
|
+
half-life-with-access-reset policy (a *popularity* signal) against a value-aware blend under a
|
|
130
|
+
shrinking budget, with value made deliberately anti-correlated with access-frequency for a
|
|
131
|
+
load-bearing-but-cold subset. At a 30% keep-budget the access-decay policy retained only **2.8%**
|
|
132
|
+
of the high-value/low-frequency memories and **20%** of total value, vs **100%** and **64%** for
|
|
133
|
+
the blend — about **3× more value kept** (the gap persists, ≈2.2× retained value, even at a 7%
|
|
134
|
+
budget). Pure access-frequency decay starves the rarely-queried-but-critical memories; forgetting
|
|
135
|
+
must consume an explicit value channel *separate from* access recency. (Agora Lab `19d802`.)
|
|
136
|
+
- **Cohort-level value** — per-memory outcome attribution is **statistically underpowered at n-of-1**
|
|
137
|
+
(the best proxy reached only ~0.36 power at realistic sample sizes); the cohort is where the
|
|
138
|
+
signal lives. Hence rule 4.
|
|
139
|
+
- **Contradiction detection** runs in production over the 6,000-note vault; the lesson that it must
|
|
140
|
+
*flag, not auto-edit* (rule 5) is why silent rewrites are forbidden.
|
|
141
|
+
|
|
142
|
+
(Methods + numbers live in the Agora track record: <https://dancenitra.github.io/agora/>.)
|
|
143
|
+
|
|
144
|
+
## The `second_brain` thinking layer
|
|
145
|
+
|
|
146
|
+
`mnemo_mcp` gives an agent **memory**. `second_brain_mcp` gives it a **second brain to think over** —
|
|
147
|
+
point it at any folder of Markdown notes (an Obsidian vault, a Zettelkasten, a `docs/` tree) and an
|
|
148
|
+
MCP client (Claude Desktop, Claude Code, Cursor, your own agent) gets the substrate to *reason
|
|
149
|
+
against* those notes: pull what's relevant, find where the network is blind, surface non-obvious
|
|
150
|
+
bridges, isolate the claims worth checking, and generate ideas by named methods.
|
|
151
|
+
|
|
152
|
+
**The split that keeps it honest.** The server returns **retrieval + structure**; the calling LLM does
|
|
153
|
+
the **reasoning**. The tool is the memory and the map; the agent is the mind. There is no LLM call
|
|
154
|
+
inside this server — it scores, links, and slices your notes, then hands the material back. So the
|
|
155
|
+
claims below are about what an *agent* did with the tools, not about the tool "thinking" on its own.
|
|
156
|
+
No autonomous oracle.
|
|
157
|
+
|
|
158
|
+
**Runs today, zero config.** It indexes your notes into an in-process `mnemo` store at startup; with
|
|
159
|
+
no embedder it uses the lexical-overlap fallback. An embedder (`MNEMO_EMBED_URL/MODEL/KEY`) is optional
|
|
160
|
+
and matters **at scale**: on a ~6,000-note vault, lexical recall@5 decays from 0.94 (small store) to
|
|
161
|
+
**0.25** at full corpus while semantic **holds ~0.65** — ≈2.6× (Agora Lab `b4c260`); on paraphrase
|
|
162
|
+
queries semantic recall@5 is **0.86 vs 0.20** lexical (`3501f1`).
|
|
163
|
+
|
|
164
|
+
```
|
|
165
|
+
NOTES_DIR=/path/to/your/vault python second_brain_mcp.py # run after a flat download of both files
|
|
166
|
+
```
|
|
167
|
+
|
|
168
|
+
### See it run (no setup)
|
|
169
|
+
|
|
170
|
+

|
|
171
|
+
|
|
172
|
+
`python examples/demo.py` runs every tool against a tiny bundled sample vault — no MCP client, no
|
|
173
|
+
key, no embedder. (Regenerate the GIF with `python examples/_make_gif.py` (Pillow) or
|
|
174
|
+
[`examples/demo.tape`](../examples/demo.tape) + [`vhs`](https://github.com/charmbracelet/vhs).)
|
|
175
|
+
The same session in text:
|
|
176
|
+
|
|
177
|
+
```text
|
|
178
|
+
▸ relevant_notes("how does feedback speed up learning", k=3)
|
|
179
|
+
→ Deliberate Practice (Learning) relevance 0.60
|
|
180
|
+
→ Expected Value (Decisions) relevance 0.20
|
|
181
|
+
|
|
182
|
+
▸ find_gaps() → isolated: ["Sourdough Starter"] (the one note with no [[links]])
|
|
183
|
+
|
|
184
|
+
▸ bridge_candidates("Deliberate Practice")
|
|
185
|
+
→ Habit Loops (Habits, DISTANT domain) — both turn on "feedback latency", and nothing links them
|
|
186
|
+
|
|
187
|
+
▸ extract_claims("Deliberate Practice")
|
|
188
|
+
→ "Feedback latency is the hidden variable: the longer the gap between an action
|
|
189
|
+
and its feedback, the slower the learning." (line 3 — go ground or challenge it)
|
|
190
|
+
|
|
191
|
+
▸ idea_methods() → 10 recipes (Hidden-Connection Bridge, Missing-Reciprocity, …)
|
|
192
|
+
```
|
|
193
|
+
|
|
194
|
+
That `bridge_candidates` hit is the point: a connection across two folders that *you never linked* —
|
|
195
|
+
the agent now writes the mapping (or rejects it). The tool found the material; the agent does the thinking.
|
|
196
|
+
|
|
197
|
+
Register it with an MCP client (point `args` at the file's absolute path so `mnemo.py`, which sits
|
|
198
|
+
beside it, is found):
|
|
199
|
+
|
|
200
|
+
```json
|
|
201
|
+
{
|
|
202
|
+
"mcpServers": {
|
|
203
|
+
"second_brain": {
|
|
204
|
+
"command": "python",
|
|
205
|
+
"args": ["/abs/path/to/second_brain_mcp.py"],
|
|
206
|
+
"env": {
|
|
207
|
+
"NOTES_DIR": "/abs/path/to/your/vault",
|
|
208
|
+
"SECOND_BRAIN_INDEX": "/abs/path/to/second_brain_index.json"
|
|
209
|
+
}
|
|
210
|
+
}
|
|
211
|
+
}
|
|
212
|
+
}
|
|
213
|
+
```
|
|
214
|
+
|
|
215
|
+
| tool | returns |
|
|
216
|
+
|---|---|
|
|
217
|
+
| `index_status` | notes indexed, folder spread, resolved `NOTES_DIR` (call first; `0` ⇒ fix `NOTES_DIR`) |
|
|
218
|
+
| `relevant_notes` | the `k` most relevant notes by relevance × accrued value (value accrues with use; a cold index is effectively relevance-ranked), with excerpts |
|
|
219
|
+
| `coverage_gap` | the **negative space** of a question: top notes + a measured completeness score + the explicit sub-terms with **no** supporting note — a WYSIATI guard so the agent sees what's *missing* and doesn't answer a tidy-but-incomplete context with false confidence |
|
|
220
|
+
| `find_gaps` | isolated/under-linked notes + thin folders — where the network is blind (noisy on a tiny vault; earns its keep at scale) |
|
|
221
|
+
| `bridge_candidates` | distant notes (different folder, no link) that are semantically close = candidate connections; the agent writes or rejects the mapping |
|
|
222
|
+
| `extract_claims` | claim-like sentences from a note so the agent can ground or challenge them |
|
|
223
|
+
| `idea_methods` | a toolkit of named idea-generation recipes, so generation is principled, not a vibe |
|
|
224
|
+
|
|
225
|
+
Dogfood result, stated honestly: pointed at the maintainer's own ~6,000-note vault, an agent using
|
|
226
|
+
these tools caught a number in his *own* forecasting note inflated ~7× ("60-78%" vs the real ~6-11%),
|
|
227
|
+
surfaced two silently-contradicting notes, and proposed ideas via `idea_methods` — two of which were
|
|
228
|
+
then severe-tested **in Agora's separate research lab** (not inside this server) and held. The LLM did
|
|
229
|
+
the reasoning; the corrections still warrant a source-check before public citation.
|
|
230
|
+
|
|
231
|
+
### Trust & safety
|
|
232
|
+
- **Read-only over your notes.** The server reads `NOTES_DIR` recursively; it does no `eval`, no shell,
|
|
233
|
+
no subprocess, and writes only its own index file. Symlinks/junctions that point *outside*
|
|
234
|
+
`NOTES_DIR` are deliberately **not** followed (so a planted link in a shared/cloned vault can't leak
|
|
235
|
+
files from elsewhere on disk).
|
|
236
|
+
- **The embedder is a trust boundary.** If you set `MNEMO_EMBED_URL`, the **full text of every note**
|
|
237
|
+
is POSTed there. It's validated at startup — `https` anywhere, plain `http` only to loopback (local
|
|
238
|
+
Ollama, etc.), and cloud-metadata/link-local targets are refused. Point it only at an endpoint you trust.
|
|
239
|
+
- **Notes over ~2 MB are skipped** (configurable via `SECOND_BRAIN_MAX_BYTES`) so a single huge file
|
|
240
|
+
can't exhaust memory.
|
|
241
|
+
|
|
242
|
+
## Status
|
|
243
|
+
|
|
244
|
+
`v0.1` — the core, honest and runnable, **now with two MCP servers**: `mnemo_mcp` (memory) and
|
|
245
|
+
`second_brain_mcp` (the thinking layer over your notes). Roadmap: pluggable vector stores, a hosted
|
|
246
|
+
tier. Open-core; the core stays free.
|
|
247
|
+
|
|
248
|
+
MIT-licensed · part of [Agora](https://github.com/DanceNitra/agora).
|
|
249
|
+
|
|
250
|
+
## Self-maintaining (maintain.py)
|
|
251
|
+
The #1 second-brain frustration is **maintenance**, not capture. `maintain.py` runs the chore people
|
|
252
|
+
stop doing — over a folder of Markdown notes it finds **dead `[[wikilinks]]`, orphan notes, stale
|
|
253
|
+
notes, near-duplicate clusters**, and a **vault health score** (`self_legibility` = % of notes in the
|
|
254
|
+
link graph's giant component — knowledge debt is a *percolation* collapse, so it warns *before* the
|
|
255
|
+
cliff). Crucially it turns findings into **actions**: for each orphan it **suggests which existing
|
|
256
|
+
note to link it to** (re-connecting it to the graph), and flags **archive candidates** (old +
|
|
257
|
+
isolated). It resolves links by filename *or* frontmatter alias, and dates notes by frontmatter
|
|
258
|
+
(not git-reset mtime) — both learned from dogfooding it on a real ~7,700-note vault (it rescued ~300
|
|
259
|
+
falsely-flagged orphans). Advisory + safe: it returns a plan and an action list; it never edits,
|
|
260
|
+
moves, or deletes a note. And it can **apply** the fix when you ask: `apply_suggestions` appends a
|
|
261
|
+
marked `## Related (auto-suggested)` block of `[[links]]` to each orphan — additive only, idempotent
|
|
262
|
+
(re-running replaces its own block), **dry-run by default**. `python maintain.py` runs a verified
|
|
263
|
+
round-trip on a synthetic vault (diagnose → suggest → apply); `maintenance_report` and `apply_links`
|
|
264
|
+
in `second_brain_mcp.py` expose it to any MCP agent.
|