hippo-memory 0.13.2 → 0.14.0

package/README.md CHANGED
# 🦛 Hippo

**The secret to good memory isn't remembering more. It's knowing what to forget.**

[![npm](https://img.shields.io/npm/v/hippo-memory)](https://npmjs.com/package/hippo-memory)
[![license](https://img.shields.io/badge/license-MIT-blue)](./LICENSE)

```
Works with: Claude Code, Codex, Cursor, OpenClaw, OpenCode, any CLI agent
Imports from: ChatGPT, Claude (CLAUDE.md), Cursor (.cursorrules), any markdown
Storage: SQLite backbone + markdown/YAML mirrors. Git-trackable and human-readable.
Dependencies: Zero runtime deps. Requires Node.js 22.5+. Optional embeddings via @xenova/transformers.
```

---

## The Problem

AI agents forget everything between sessions. Existing solutions just save everything and search later. That's a filing cabinet, not a brain.

Your memories are also trapped. ChatGPT knows things Claude doesn't. Cursor rules don't travel to Codex. Switch tools and you start from zero.

---

## Who Is This For

- **Multi-tool developers.** You use Claude Code on Monday, Cursor on Tuesday, Codex on Wednesday. Context doesn't carry over. Hippo is the shared memory layer across all of them.
- **Teams where agents repeat mistakes.** The agent hit the same deployment bug last week. And the week before. Hippo's error memories and decay mechanics mean hard lessons stick and noise fades.
- **Anyone whose CLAUDE.md is a mess.** Your instruction file grew to 400 lines of mixed rules, preferences, and stale workarounds. Hippo gives that structure: tags, confidence levels, automatic decay of outdated info.
- **People who want portable AI memory.** No vendor lock-in. Markdown files in your repo. Import from ChatGPT, Claude, Cursor. Export by copying a folder.

---

## Quick Start

```bash
npm install -g hippo-memory

hippo init
hippo remember "FRED cache silently dropped the tips_10y series" --tag error
hippo recall "data pipeline issues" --budget 2000
```

That's it. You have a memory system.

### What's new in v0.13.2

- **7 more bug fixes** from a second deep review: Windows schtasks injection, MCP error handling, cross-store budget consistency, embedding mutex, and more. See CHANGELOG.

### What's new in v0.13.0

- **Security: command injection fixed.** The OpenClaw plugin now uses `execFileSync` (no shell). All user input is passed as array args, eliminating shell injection vectors.
- **17 bug fixes** across search, embeddings, physics, MCP server, store, and CLI. See CHANGELOG for details.

### What's new in v0.12.0

- **Configurable global store.** Set `$HIPPO_HOME` or use XDG (`$XDG_DATA_HOME/hippo`) to put the global store wherever you want. Falls back to `~/.hippo/` if neither is set.

### What's new in v0.11.2

- **Cross-platform path fix.** The OpenClaw plugin now correctly resolves `.hippo` paths on Unix when given Windows-style backslash paths. Uses `path/posix` instead of the platform-dependent `path.basename`.

### What's new in v0.11.1

- **OpenClaw error capture filtering.** The `autoLearn` hook now applies three filters before storing tool errors: a noise-pattern filter for known transient errors, per-session rate limiting (max 5), and per-session deduplication. Prevents memory pollution from infrastructure noise.

### What's new in v0.11.0

- **Reward-proportional decay.** Outcome feedback now modulates decay rate continuously instead of in fixed half-life deltas. Memories with consistently positive outcomes decay up to 1.5x slower; consistently negative ones decay up to 2x faster. Mixed outcomes converge toward neutral. Inspired by R-STDP in spiking neural networks. `hippo inspect` now shows cumulative outcome counts and the computed reward factor.
- **Public benchmarks.** Two benchmarks in `benchmarks/`: a [Sequential Learning Benchmark](benchmarks/sequential-learning/) (50 tasks, 10 traps, measures agent improvement over time) and a [LongMemEval integration](benchmarks/longmemeval/) (industry-standard 500-question retrieval benchmark, R@5=74.0% with BM25 only). The sequential learning benchmark is unique: no other public benchmark tests whether memory systems produce learning curves.

### What's new in v0.10.0

- **Active invalidation.** `hippo learn --git` detects migration and breaking-change commits and actively weakens memories referencing the old pattern. Manual invalidation via `hippo invalidate "REST API" --reason "migrated to GraphQL"`.
- **Architectural decisions.** `hippo decide` stores one-off decisions with a 90-day half-life and verified confidence. Supports `--context` for reasoning and `--supersedes` to chain decisions when the architecture evolves.
- **Path-based memory triggers.** Memories are auto-tagged with `path:<segment>` from your working directory. Recall boosts memories from the same location (up to 1.3x). Working in `src/api/`? API-related memories surface first.
- **OpenCode integration.** `hippo hook install opencode` patches AGENTS.md. Auto-detected during `hippo init`. Integration guide with MCP config and a skill for progressive discovery.
- **`hippo export`** outputs all memories as JSON or markdown.
- **Decision recall boost.** A 1.2x scoring multiplier for decision-tagged memories so they surface despite low retrieval frequency.

### What's new in v0.9.1

- **Auto-sleep on session exit.** `hippo hook install claude-code` now installs a Stop hook in `~/.claude/settings.json` so `hippo sleep` runs automatically when Claude Code exits. `hippo init` does this too when Claude Code is detected. No cron needed, no manual sleep.

### What's new in v0.9.0

- **Working memory layer** (`hippo wm push/read/clear/flush`). Bounded buffer (max 20 per scope) with importance-based eviction. Current-state notes live separately from long-term memory.
- **Session handoffs** (`hippo handoff create/latest/show`). Persist session summaries, next actions, and artifacts so successor sessions can resume without transcript archaeology.
- **Session lifecycle** with explicit start/end events, fallback session IDs, and `hippo session resume` for continuity.
- **Explainable recall** (`hippo recall --why`). See which terms matched, whether BM25 or embedding contributed, and the source bucket (layer, confidence, local/global).
- **`hippo current show`** for a compact current-state display (active task + recent session events), ready for agent injection.
- **SQLite lock hardening**: `busy_timeout=5000`, `synchronous=NORMAL`, `wal_autocheckpoint=100`. Concurrent plugin calls no longer hit `SQLITE_BUSY`.
- **Consolidation batching**: all writes/deletes happen in a single transaction instead of N open/close cycles.
- **`--limit` flag** on `hippo recall` and `hippo context` to cap result count independently of token budget.
- **Plugin injection dedup guard** prevents double context injection on reconnect.

### What's new in v0.8.0

- **Hybrid search** blends BM25 keywords with cosine embedding similarity. Install `@xenova/transformers`, run `hippo embed`, and recall quality jumps. Falls back to BM25 otherwise.
- **Schema acceleration** auto-computes how well new memories fit existing patterns. Familiar memories consolidate faster; novel ones decay faster if unused.
- **Multi-agent shared memory** with `hippo share`, `hippo peers`, and transfer scoring. Universal lessons travel between projects; project-specific config stays local.
- **Conflict resolution** via `hippo resolve <id> --keep <mem_id>`. Closes the detect-inspect-resolve loop.
- **Agent eval benchmark** validates the learning hypothesis: hippo agents drop from a 78% trap rate to 14% over a 50-task sequence.

### Zero-config agent integration

`hippo init` auto-detects your agent framework and wires itself in:

```bash
cd my-project
hippo init

# Initialized Hippo at /my-project
# Directories: buffer/ episodic/ semantic/ conflicts/
# Auto-installed claude-code hook in CLAUDE.md
```

If you have a `CLAUDE.md`, it patches it. `AGENTS.md` for Codex/OpenClaw/OpenCode. `.cursorrules` for Cursor. No manual `hook install` needed. Your agent starts using Hippo on its next session.

It also sets up a daily cron job (6:15am) that runs `hippo learn --git` and `hippo sleep` automatically. Memories get captured from your commits and consolidated every day without you thinking about it.

To skip: `hippo init --no-hooks --no-schedule`
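On Unix, the scheduled job amounts to a crontab entry like the following (the exact entry `hippo init` installs may differ; shown here only to make the schedule concrete):

```shell
# Daily at 6:15am: harvest lessons from recent commits, then consolidate
15 6 * * * hippo learn --git && hippo sleep
```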

---

## Cross-Tool Import

Your memories shouldn't be locked inside one tool. Hippo pulls them in from anywhere.

```bash
# ChatGPT memory export
hippo import --chatgpt memories.json

# Claude's CLAUDE.md (skips existing hippo hook blocks)
hippo import --claude CLAUDE.md

# Cursor rules
hippo import --cursor .cursorrules

# Any markdown file (headings become tags)
hippo import --markdown MEMORY.md

# Any text file
hippo import --file notes.txt
```

All import commands support `--dry-run` (preview without writing), `--global` (write to `~/.hippo/`), and `--tag` (add extra tags). Duplicates are detected and skipped automatically.

### Conversation Capture

Extract memories from raw conversation text. No LLM needed: pattern-based heuristics find decisions, rules, errors, and preferences.

```bash
# Pipe a conversation in
cat session.log | hippo capture --stdin

# Or point at a file
hippo capture --file conversation.md

# Preview first
hippo capture --file conversation.md --dry-run
```

### Active task snapshots

Long-running work needs short-term continuity, not just long-term memory. Hippo can persist the current in-flight task so a later `continue` has something concrete to recover.

```bash
hippo snapshot save \
  --task "Ship SQLite backbone" \
  --summary "Tests/build/smoke are green, next slice is active-session recovery" \
  --next-step "Implement active snapshot retrieval in context output"

hippo snapshot show
hippo context --auto --budget 1500
hippo snapshot clear
```

`hippo context --auto` includes the active task snapshot before long-term memories, so agents get both the immediate thread and the deeper lessons.

### Session event trails

Manual snapshots are useful, but real work also needs a breadcrumb trail. Hippo can also store short session events and link them to the active snapshot, so context output shows the latest steps, not just the last summary.

```bash
hippo session log \
  --id sess_20260326 \
  --task "Ship continuity" \
  --type progress \
  --content "Schema migration is done, next step is CLI wiring"

hippo snapshot save \
  --task "Ship continuity" \
  --summary "Structured session events are flowing" \
  --next-step "Surface them in framework hooks" \
  --session sess_20260326

hippo session show --id sess_20260326
hippo context --auto --budget 1500
```

Hippo mirrors the latest trail to `.hippo/buffer/recent-session.md` so you can inspect the short-term thread without opening SQLite.

### Session handoffs

When you're done for the day (or switching to another agent), create a handoff so the next session knows exactly where to pick up:

```bash
hippo handoff create \
  --summary "Finished schema migration, tests green" \
  --next "Wire handoff injection into context output" \
  --session sess_20260403 \
  --artifact src/db.ts

hippo handoff latest   # show the most recent handoff
hippo handoff show 3   # show a specific handoff by ID
hippo session resume   # re-inject latest handoff as context
```

### Working memory

Working memory is a bounded scratchpad for current-state notes. It's separate from long-term memory and gets cleared between sessions.

```bash
hippo wm push --scope repo \
  --content "Investigating flaky test in store.test.ts, line 42" \
  --importance 0.9

hippo wm read --scope repo    # show current working notes
hippo wm clear --scope repo   # wipe the scratchpad
hippo wm flush --scope repo   # flush on session end
```

The buffer holds a maximum of 20 entries per scope. When full, the lowest-importance entry is evicted.
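The eviction rule above can be sketched in a few lines of Python (illustrative only; the entry shape and function name are hypothetical, not Hippo's source):

```python
MAX_ENTRIES = 20  # per-scope cap from the docs


def wm_push(buffer: list[dict], content: str, importance: float) -> list[dict]:
    """Bounded scratchpad: when full, evict the lowest-importance
    entry before appending the new note."""
    if len(buffer) >= MAX_ENTRIES:
        buffer.remove(min(buffer, key=lambda e: e["importance"]))
    buffer.append({"content": content, "importance": importance})
    return buffer
```

High-importance notes therefore survive a busy session while routine chatter rotates out.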

### Explainable recall

See why a memory was returned:

```bash
hippo recall "data pipeline" --why --limit 5

# --- mem_a1b2c3 [episodic] [observed] [local] score=0.847
# BM25: matched [data, pipeline]; cosine: 0.82
# ...memory content...
```

---

## How It Works

Input enters the buffer. Important things get encoded into episodic memory. During "sleep," repeated episodes compress into semantic patterns. Weak memories decay and disappear.

```
New information
      |
      v
+-----------+
|  Buffer   |  Working memory. Current session only. No decay.
| (session) |
+-----+-----+
      | encoded (tags, strength, half-life assigned)
      v
+-----------+
| Episodic  |  Timestamped memories. Decay by default.
|  Store    |  Retrieval strengthens. Errors stick longer.
+-----+-----+
      | consolidation (hippo sleep)
      v
+-----------+
| Semantic  |  Compressed patterns. Stable. Schema-aware.
|  Store    |  Extracted from repeated episodes.
+-----------+

hippo sleep: decay + replay + merge
```

---

## Key Features

### Decay by default

Every memory has a half-life. 7 days by default. Persistence is earned.

```bash
hippo remember "always check cache contents after refresh"
# stored with half_life: 7d, strength: 1.0

# 14 days later with no retrieval:
hippo inspect mem_a1b2c3
# strength: 0.25 (decayed by 2 half-lives)
# at risk of removal on next sleep
```
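The curve behind those numbers is ordinary exponential half-life decay. A minimal sketch for intuition (illustrative, not Hippo's source):

```python
def strength(elapsed_days: float, half_life_days: float = 7.0) -> float:
    """Exponential decay: strength halves every half_life_days."""
    return 0.5 ** (elapsed_days / half_life_days)


# 14 days at the default 7-day half-life = 2 half-lives
print(round(strength(14.0), 2))  # 0.25, matching the inspect output above
```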

---

### Retrieval strengthens

Use it or lose it. Each recall boosts the half-life by 2 days.

```bash
hippo recall "cache issues"
# finds mem_a1b2c3, retrieval_count: 1 -> 2
# half_life extended: 7d -> 9d
# strength recalculated from retrieval timestamp

hippo recall "cache issues"   # again next week
# retrieval_count: 2 -> 3
# half_life: 9d -> 11d
# this memory is learning to survive
```
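The recall bonus can be sketched the same way: extend the half-life by 2 days and restart the decay clock from the retrieval timestamp. Field names here are hypothetical, mirroring the transcript above:

```python
from dataclasses import dataclass


@dataclass
class Memory:
    half_life_days: float = 7.0
    retrieval_count: int = 0
    days_since_last_use: float = 0.0


def on_recall(mem: Memory) -> Memory:
    """Each successful recall extends the half-life by 2 days and
    resets the decay clock to the retrieval timestamp."""
    mem.retrieval_count += 1
    mem.half_life_days += 2.0
    mem.days_since_last_use = 0.0  # strength now decays from this moment
    return mem
```

Two recalls take the default 7-day half-life to 11 days, matching the 7d -> 9d -> 11d progression shown above.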

---

### Active invalidation

When you migrate from one tool to another, old memories about the replaced tool should die immediately. Hippo detects migration and breaking-change commits during `hippo learn --git` and actively weakens matching memories.

```bash
hippo learn --git
# feat: migrate from webpack to vite
# Invalidated 3 memories referencing "webpack"
# Learned: migrate from webpack to vite
```

You can also invalidate manually:

```bash
hippo invalidate "REST API" --reason "migrated to GraphQL"
# Invalidated 5 memories referencing "REST API".
```

---

### Architectural decisions

One-off decisions don't repeat, so they can't earn their keep through retrieval alone. `hippo decide` stores them with a 90-day half-life and verified confidence so they survive long enough to matter.

```bash
hippo decide "Use PostgreSQL for all new services" --context "JSONB support"
# Decision recorded: mem_a1b2c3

# Later, when the decision changes:
hippo decide "Use CockroachDB for global services" \
  --context "Need multi-region" \
  --supersedes mem_a1b2c3
# Superseded mem_a1b2c3 (half-life halved, marked stale)
# Decision recorded: mem_d4e5f6
```

---

### Error memories stick

Tag a memory as an error and it gets 2x the half-life automatically.

```bash
hippo remember "deployment failed: forgot to run migrations" --error
# half_life: 14d instead of 7d
# emotional_valence: negative
# strength formula applies 1.5x multiplier

# production incidents don't fade quietly
```

---

### Confidence tiers

Every memory carries a confidence level: `verified`, `observed`, `inferred`, or `stale`. This tells agents how much to trust what they're reading.

```bash
hippo remember "API rate limit is 100/min" --verified
hippo remember "deploy usually takes ~3 min" --observed
hippo remember "the flaky test might be a race condition" --inferred
```

When context is generated, confidence is shown inline:

```
[verified] API rate limit is 100/min per the docs
[observed] Deploy usually takes ~3 min
[inferred] The flaky test might be a race condition
```

Agents can see at a glance what's established fact vs. a pattern worth questioning.

Memories unretrieved for 30+ days are automatically marked `stale` during the next `hippo sleep`. If one gets recalled again, Hippo wakes it back up to `observed` so it can earn trust again instead of staying permanently stale.
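The stale lifecycle is a small state transition. A sketch under the rules just described (the 30-day threshold is from the docs; the function shape is an assumption for illustration):

```python
def next_confidence(confidence: str, days_since_retrieval: float,
                    recalled_now: bool = False) -> str:
    """Stale lifecycle: unretrieved memories go stale after 30 days;
    a fresh recall wakes a stale memory back up to 'observed'."""
    if recalled_now and confidence == "stale":
        return "observed"  # earn trust again rather than stay stale forever
    if days_since_retrieval >= 30 and confidence != "stale":
        return "stale"
    return confidence
```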

### Conflict tracking

Hippo detects obvious contradictions between overlapping memories and keeps them visible instead of silently letting both masquerade as truth.

```bash
hippo sleep       # refreshes open conflicts
hippo conflicts   # inspect them
```

Open conflicts are stored in SQLite, mirrored under `.hippo/conflicts/`, and linked back into each memory's `conflicts_with` field.

---

### Observation framing

Memories aren't presented as bare assertions. By default, Hippo frames them as observations with dates, so agents treat them as context rather than commands.

```bash
hippo context --framing observe   # default
# Output: "Previously observed (2026-03-10): deploy takes ~3 min"

hippo context --framing suggest
# Output: "Consider: deploy takes ~3 min"

hippo context --framing assert
# Output: "Deploy takes ~3 min"
```

Three modes: `observe` (default), `suggest`, `assert`. Choose based on how directive you want the memory to be.
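The three framings are pure string templates. A sketch reproducing the outputs above (illustrative; `frame` is a hypothetical name, not Hippo's API):

```python
def frame(text: str, date: str, mode: str = "observe") -> str:
    """Render a memory in one of the three framing modes."""
    if mode == "observe":
        return f"Previously observed ({date}): {text}"
    if mode == "suggest":
        return f"Consider: {text}"
    if mode == "assert":
        return text[0].upper() + text[1:]  # bare assertion, sentence-cased
    raise ValueError(f"unknown framing mode: {mode}")
```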

---

### Sleep consolidation

Run `hippo sleep` and episodes compress into patterns.

```bash
hippo sleep

# Running consolidation...
#
# Results:
# Active memories: 23
# Removed (decayed): 4
# Merged episodic: 6
# New semantic: 2
```

Three or more related episodes get merged into a single semantic memory. The originals decay. The pattern survives.
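The merge rule can be sketched as a grouping pass (illustrative only; real consolidation presumably groups by similarity rather than exact tag equality):

```python
from collections import defaultdict


def consolidate(episodes):
    """Group episodes by tag; groups of 3+ compress into one semantic
    pattern, the rest stay episodic. Returns (semantic, remaining)."""
    groups = defaultdict(list)
    for ep in episodes:
        groups[ep["tag"]].append(ep)
    semantic, remaining = [], []
    for tag, eps in groups.items():
        if len(eps) >= 3:
            semantic.append(f"pattern:{tag} (from {len(eps)} episodes)")
        else:
            remaining.extend(eps)
    return semantic, remaining
```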

---

### Outcome feedback

Did the recalled memories actually help? Tell Hippo. It tightens the feedback loop.

```bash
hippo recall "why is the gold model broken"
# ... you read the memories and fix the bug ...

hippo outcome --good
# Applied positive outcome to 3 memories
# reward factor increases, decay slows

hippo outcome --bad
# Applied negative outcome to 3 memories
# reward factor decreases, decay accelerates
```

Outcomes are cumulative. A memory with 5 positive outcomes and 0 negative has a reward factor of ~1.42, making its effective half-life 42% longer. A memory with 0 positive and 3 negative has a factor of ~0.63, decaying about 1.6x faster. Mixed outcomes converge toward neutral (1.0).
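The README doesn't publish the exact curve. One plausible function with the stated properties (a half-life multiplier bounded between 0.5 and 1.5, neutral for mixed outcomes) is a tanh of the net outcome count; this is an assumption for illustration, not Hippo's actual formula, and the `scale` constant is invented:

```python
import math


def reward_factor(good: int, bad: int, scale: float = 3.0) -> float:
    """Map cumulative outcomes to a half-life multiplier in (0.5, 1.5).
    Net-positive histories slow decay; net-negative ones speed it up.
    NOTE: illustrative shape only; scale=3.0 is an assumed constant."""
    net = good - bad
    return 1.0 + 0.5 * math.tanh(net / scale)


print(round(reward_factor(5, 0), 2))  # ~1.47: half-life markedly longer
print(round(reward_factor(0, 3), 2))  # ~0.62: decays markedly faster
print(reward_factor(4, 4))            # 1.0: mixed outcomes are neutral
```

Any saturating odd function gives the same qualitative behavior: bounded extremes, smooth convergence to neutral.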

---

### Token budgets

Recall only what fits. No context stuffing.

```bash
# fits within Claude's 2K token window for task context
hippo recall "deployment checklist" --budget 2000

# need more for a big task
hippo recall "full project history" --budget 8000

# machine-readable for programmatic use
hippo recall "api errors" --budget 1000 --json
```

Results are ranked by `relevance * strength * recency`. The highest-signal memories fill the budget first.
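Budget-constrained recall amounts to a greedy fill over that ranking. A sketch (illustrative; the dict fields are hypothetical stand-ins for Hippo's internal scores):

```python
def pack_budget(memories: list[dict], budget_tokens: int) -> list[dict]:
    """Greedy fill: sort by the composite score, then take memories
    in order until the token budget is exhausted."""
    ranked = sorted(
        memories,
        key=lambda m: m["relevance"] * m["strength"] * m["recency"],
        reverse=True,
    )
    picked, used = [], 0
    for m in ranked:
        if used + m["tokens"] <= budget_tokens:
            picked.append(m)
            used += m["tokens"]
    return picked
```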

---

### Auto-learn from git

Hippo can scan your commit history and extract lessons from fix/revert/bug commits automatically.

```bash
# Learn from the last 7 days of commits
hippo learn --git

# Learn from the last 30 days
hippo learn --git --days 30

# Scan multiple repos in one pass
hippo learn --git --repos "~/project-a,~/project-b,~/project-c"
```

The `--repos` flag accepts comma-separated paths. Hippo scans each repo's git log, extracts fix/revert/bug lessons, deduplicates against existing memories, and stores new ones. Pair with `hippo sleep` afterwards to consolidate.

Ideal for a weekly cron:

```bash
hippo learn --git --repos "~/repo1,~/repo2" --days 7
hippo sleep
```

---

### Watch mode

Wrap any command with `hippo watch` to auto-learn from failures:

```bash
hippo watch "npm run build"
# if it fails, Hippo captures the error automatically
# next time an agent asks about build issues, the memory is there
```

---
## CLI Reference

| Command | What it does |
|---------|-------------|
| `hippo init` | Create `.hippo/` + auto-install agent hooks |
| `hippo init --global` | Create global store at `~/.hippo/` |
| `hippo init --no-hooks` | Create `.hippo/` without auto-installing hooks |
| `hippo remember "<text>"` | Store a memory |
| `hippo remember "<text>" --tag <t>` | Store with tag (repeatable) |
| `hippo remember "<text>" --error` | Store as error (2x half-life) |
| `hippo remember "<text>" --pin` | Store with no decay |
| `hippo remember "<text>" --verified` | Set confidence: verified (default) |
| `hippo remember "<text>" --observed` | Set confidence: observed |
| `hippo remember "<text>" --inferred` | Set confidence: inferred |
| `hippo remember "<text>" --global` | Store in global `~/.hippo/` store |
| `hippo recall "<query>"` | Retrieve relevant memories (local + global) |
| `hippo recall "<query>" --budget <n>` | Recall within token limit (default: 4000) |
| `hippo recall "<query>" --limit <n>` | Cap result count |
| `hippo recall "<query>" --why` | Show match reasons and source buckets |
| `hippo recall "<query>" --json` | Output as JSON |
| `hippo context --auto` | Smart context injection (auto-detects task from git) |
| `hippo context "<query>" --budget <n>` | Context injection with explicit query (default: 1500) |
| `hippo context --limit <n>` | Cap memory count in context |
| `hippo context --budget 0` | Skip entirely (zero token cost) |
| `hippo context --framing <mode>` | Framing: observe (default), suggest, assert |
| `hippo context --format <fmt>` | Output format: markdown (default) or json |
| `hippo import --chatgpt <path>` | Import from ChatGPT memory export (JSON or txt) |
| `hippo import --claude <path>` | Import from CLAUDE.md or Claude memory.json |
| `hippo import --cursor <path>` | Import from .cursorrules or .cursor/rules |
| `hippo import --markdown <path>` | Import from structured markdown (headings -> tags) |
| `hippo import --file <path>` | Import from any text file |
| `hippo import --dry-run` | Preview import without writing |
| `hippo import --global` | Write imported memories to `~/.hippo/` |
| `hippo capture --stdin` | Extract memories from piped conversation text |
| `hippo capture --file <path>` | Extract memories from a file |
| `hippo capture --dry-run` | Preview extraction without writing |
| `hippo sleep` | Run consolidation (decay + merge + compress) |
| `hippo sleep --dry-run` | Preview consolidation without writing |
| `hippo status` | Memory health: counts, strengths, last sleep |
| `hippo outcome --good` | Strengthen last recalled memories |
| `hippo outcome --bad` | Weaken last recalled memories |
| `hippo outcome --id <id> --good` | Target a specific memory |
| `hippo inspect <id>` | Full detail on one memory |
| `hippo forget <id>` | Force remove a memory |
| `hippo embed` | Embed all memories for semantic search |
| `hippo embed --status` | Show embedding coverage |
| `hippo watch "<command>"` | Run command, auto-learn from failures |
| `hippo learn --git` | Scan recent git commits for lessons |
| `hippo learn --git --days <n>` | Scan N days back (default: 7) |
| `hippo learn --git --repos <paths>` | Scan multiple repos (comma-separated) |
| `hippo conflicts` | List detected open memory conflicts |
| `hippo conflicts --json` | Output conflicts as JSON |
| `hippo resolve <id>` | Show both conflicting memories for comparison |
| `hippo resolve <id> --keep <mem_id>` | Resolve: keep winner, weaken loser |
| `hippo resolve <id> --keep <mem_id> --forget` | Resolve: keep winner, delete loser |
| `hippo promote <id>` | Copy a local memory to the global store |
| `hippo share <id>` | Share with attribution + transfer scoring |
| `hippo share <id> --force` | Share even if transfer score is low |
| `hippo share --auto` | Auto-share all high-scoring memories |
| `hippo share --auto --dry-run` | Preview what would be shared |
| `hippo peers` | List projects contributing to global store |
| `hippo sync` | Pull global memories into local project |
| `hippo invalidate "<pattern>"` | Actively weaken memories matching an old pattern |
| `hippo invalidate "<pattern>" --reason "<why>"` | Include what replaced it |
| `hippo decide "<decision>"` | Record architectural decision (90-day half-life) |
| `hippo decide "<decision>" --context "<why>"` | Include reasoning |
| `hippo decide "<decision>" --supersedes <id>` | Supersede a previous decision |
| `hippo hook list` | Show available framework hooks |
| `hippo hook install <target>` | Install hook (claude-code also adds Stop hook for auto-sleep) |
| `hippo hook uninstall <target>` | Remove hook |
| `hippo handoff create --summary "..."` | Create a session handoff |
| `hippo handoff latest` | Show the most recent handoff |
| `hippo handoff show <id>` | Show a specific handoff by ID |
| `hippo session latest` | Show latest task snapshot + events |
| `hippo session resume` | Re-inject latest handoff as context |
| `hippo current show` | Compact current state (task + session events) |
| `hippo wm push --scope <s> --content "..."` | Push to working memory |
| `hippo wm read --scope <s>` | Read working memory entries |
| `hippo wm clear --scope <s>` | Clear working memory |
| `hippo wm flush --scope <s>` | Flush working memory (session end) |
| `hippo dashboard` | Open web dashboard at localhost:3333 |
| `hippo dashboard --port <n>` | Use custom port |
| `hippo mcp` | Start MCP server (stdio transport) |

---

## Framework Integrations

### Auto-install (recommended)

`hippo init` detects your agent framework and patches the right config file automatically:

| Framework | Detected by | Patches |
|-----------|-------------|---------|
| Claude Code | `CLAUDE.md` or `.claude/settings.json` | `CLAUDE.md` + Stop hook in `settings.json` |
| Codex | `AGENTS.md` or `.codex` | `AGENTS.md` |
| Cursor | `.cursorrules` or `.cursor/rules` | `.cursorrules` |
| OpenClaw | `.openclaw` or `AGENTS.md` | `AGENTS.md` |
| OpenCode | `.opencode/` or `opencode.json` | `AGENTS.md` |

No extra commands needed. Just `hippo init` and your agent knows about Hippo.

### Manual install

If you prefer explicit control:

```bash
hippo hook install claude-code   # patches CLAUDE.md + adds Stop hook to settings.json
hippo hook install codex         # patches AGENTS.md
hippo hook install cursor        # patches .cursorrules
hippo hook install openclaw      # patches AGENTS.md
hippo hook install opencode      # patches AGENTS.md
```

This adds a `<!-- hippo:start -->` ... `<!-- hippo:end -->` block that tells the agent to:

1. Run `hippo context --auto --budget 1500` at session start
2. Run `hippo remember "<lesson>" --error` on errors
3. Run `hippo outcome --good` on completion

For Claude Code, it also adds a Stop hook to `~/.claude/settings.json` so `hippo sleep` runs automatically when the session exits.

To remove: `hippo hook uninstall claude-code`

### What the hook adds (Claude Code example)

```markdown
## Project Memory (Hippo)

Before starting work, load relevant context:
hippo context --auto --budget 1500

When you hit an error or discover a gotcha:
hippo remember "<what went wrong and why>" --error

After completing work successfully:
hippo outcome --good
```

### MCP Server

For any MCP-compatible client (Cursor, Windsurf, Cline, Claude Desktop):

```bash
hippo mcp   # starts MCP server over stdio
```

Add to your MCP config (e.g. `.cursor/mcp.json` or `claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "hippo-memory": {
      "command": "hippo",
      "args": ["mcp"]
    }
  }
}
```

Exposes tools: `hippo_recall`, `hippo_remember`, `hippo_outcome`, `hippo_context`, `hippo_status`, `hippo_learn`, `hippo_wm_push`.

### OpenClaw Plugin

Native plugin with auto-context injection, workspace-aware memory lookup, and tool hooks for auto-learn / auto-sleep.

```bash
openclaw plugins install hippo-memory
openclaw plugins enable hippo-memory
```

Plugin docs: [extensions/openclaw-plugin/](extensions/openclaw-plugin/). Integration guide: [integrations/openclaw.md](integrations/openclaw.md).

### Claude Code Plugin

Plugin with SessionStart/Stop hooks and error auto-capture. See [extensions/claude-code-plugin/](extensions/claude-code-plugin/).

Full integration details: [integrations/](integrations/)
---

## The Neuroscience

Hippo is modeled on seven properties of the human hippocampus. Not metaphorically. Literally.

**Why two stores?** The brain uses a fast hippocampal buffer + a slow neocortical store (Complementary Learning Systems theory, McClelland et al. 1995). If the neocortex learned fast, new information would overwrite old knowledge. The buffer absorbs new episodes; the neocortex extracts patterns over time.

**Why does decay help?** New neurons born in the dentate gyrus actively disrupt old memory traces (Frankland et al. 2013). This is adaptive: it reduces interference from outdated information. Forgetting isn't failure. It's maintenance.

**Why do errors stick?** The amygdala modulates hippocampal consolidation based on emotional significance. Fear and error signals boost encoding. Your first production incident is burned into memory. Your 200th uneventful deploy isn't.

**Why does retrieval strengthen?** Recalled memories undergo "reconsolidation" (Nader et al. 2000). The act of retrieval destabilizes the trace, then re-encodes it stronger. This is the testing effect. Hippo implements it mechanically via the half-life extension on recall.

**Why does sleep consolidate?** During sleep, the hippocampus replays compressed versions of recent episodes and "teaches" the neocortex by repeatedly activating the same patterns. Hippo's `sleep` command runs this as a deliberate consolidation pass.

**Why does reward modulate decay?** In spiking neural networks, reward-modulated STDP strengthens synapses that contribute to positive outcomes and weakens those that don't. Hippo's reward-proportional decay (v0.11.0) implements this: memories with consistent positive outcomes decay slower, negatives decay faster, with no fixed deltas. Inspired by [MH-FLOCKE](https://github.com/MarcHesse/mhflocke)'s R-STDP architecture for quadruped locomotion, where the same mechanism produces stable learning with 11.6x lower variance than PPO.

**Prior art in agent memory simulation.** The idea that human-like memory produces human-like behavior as an emergent property was explored in IEEE research from 2010-2011 ([5952114](https://ieeexplore.ieee.org/document/5952114), [5548405](https://ieeexplore.ieee.org/document/5548405), [5953964](https://ieeexplore.ieee.org/document/5953964)). Walking between rooms and forgetting why you went there doesn't need direct simulation; it emerges naturally from a memory system with capacity limits and decay. Hippo's design follows the same principle: implement the mechanisms, and the behavior follows.

**Related work:** [HippoRAG](https://arxiv.org/abs/2405.14831) (Gutierrez et al., 2024) applies hippocampal indexing to RAG via knowledge graphs. [MemPalace](https://github.com/milla-jovovich/mempalace) (Sigman & Jovovich, 2026) organizes memory spatially (wings/halls/rooms) with AAAK compression, achieving 100% on [LongMemEval](https://arxiv.org/abs/2410.10813). [MH-FLOCKE](https://github.com/MarcHesse/mhflocke) (Hesse, 2026) uses spiking neurons with R-STDP for embodied cognition. Each system tackles a different facet: HippoRAG optimizes retrieval quality, MemPalace optimizes retrieval organization, MH-FLOCKE optimizes embodied learning, and Hippo optimizes memory lifecycle.

The 7 mechanisms in full: [PLAN.md#core-principles](PLAN.md#core-principles)

For how these mechanisms connect to LLM training, continual learning, and open research problems: **[RESEARCH.md](RESEARCH.md)**

---

## Comparison

| Feature | Hippo | MemPalace | Mem0 | Basic Memory |
|---------|-------|-----------|------|-------------|
| Decay by default | Yes | No | No | No |
| Retrieval strengthening | Yes | No | No | No |
| Reward-proportional decay | Yes | No | No | No |
| Hybrid search (BM25 + embeddings) | Yes | Embeddings + spatial | Embeddings only | No |
| Schema acceleration | Yes | No | No | No |
| Conflict detection + resolution | Yes | No | No | No |
| Multi-agent shared memory | Yes | No | No | No |
| Transfer scoring | Yes | No | No | No |
| Outcome tracking | Yes | No | No | No |
| Confidence tiers | Yes | No | No | No |
| Spatial organization | No | Yes (wings/halls/rooms) | No | No |
| Lossless compression | No | Yes (AAAK, 30x) | No | No |
| Cross-tool import | Yes | No | No | No |
| Auto-hook install | Yes | No | No | No |
| MCP server | Yes | Yes | No | No |
| Zero dependencies | Yes | No (ChromaDB) | No | No |
| LongMemEval R@5 (retrieval) | 74.0% (BM25 only) | 96.6% (raw) / 100% (reranked) | ~49-85% | N/A |
| Git-friendly | Yes | No | No | Yes |
| Framework agnostic | Yes | Yes | Partial | Yes |

Different tools answer different questions. Mem0 and Basic Memory implement "save everything, search later." MemPalace implements "store everything, organize spatially for retrieval." Hippo implements "forget by default, earn persistence through use." These are complementary approaches: MemPalace's retrieval precision + Hippo's lifecycle management would be stronger than either alone.

---

## Benchmarks

Two benchmarks testing two different things. Full details in [`benchmarks/`](benchmarks/).

### LongMemEval (retrieval accuracy)

[LongMemEval](https://arxiv.org/abs/2410.10813) (ICLR 2025) is the industry-standard benchmark: 500 questions across 5 memory abilities, embedded in 115k+ token chat histories.

**Hippo v0.11.0 results (BM25 only, zero dependencies):**

| Metric | Score |
|--------|-------|
| Recall@1 | 50.4% |
| Recall@3 | 66.6% |
| Recall@5 | 74.0% |
| Recall@10 | 82.6% |
| Answer in content@5 | 46.6% |

| Question Type | Count | R@5 |
|---------------|-------|-----|
| single-session-assistant | 56 | 94.6% |
| knowledge-update | 78 | 88.5% |
| temporal-reasoning | 133 | 73.7% |
| multi-session | 133 | 72.2% |
| single-session-user | 70 | 65.7% |
| single-session-preference | 30 | 26.7% |

For context: MemPalace scores 96.6% (raw) using ChromaDB embeddings + spatial indexing. Hippo achieves 74.0% using BM25 keyword matching alone with zero runtime dependencies. Adding embeddings via `hippo embed` (optional `@xenova/transformers` peer dep) enables hybrid search and should close the gap.

Hippo's strongest categories (knowledge-update 88.5%, single-session-assistant 94.6%) are the ones where keyword overlap between question and stored content is highest. The weakest (preference 26.7%) involves indirect references that need semantic understanding.

```bash
cd benchmarks/longmemeval
python ingest_direct.py --data data/longmemeval_oracle.json --store-dir ./store
python retrieve_fast.py --data data/longmemeval_oracle.json --store-dir ./store --output results/retrieval.jsonl
python evaluate_retrieval.py --retrieval results/retrieval.jsonl --data data/longmemeval_oracle.json
```

### Sequential Learning Benchmark (agent improvement over time)

No other public benchmark tests whether memory systems produce learning curves. LongMemEval tests retrieval on a fixed corpus. This benchmark tests whether an agent with memory *performs better on task 40 than task 5*.

50 tasks, 10 trap categories, each appearing 2-3 times across the sequence.

**Hippo v0.11.0 results (trap-hit rate per phase; lower is better):**

| Condition | Overall | Early | Mid | Late | Learns? |
|-----------|---------|-------|-----|------|---------|
| No memory | 100% | 100% | 100% | 100% | No |
| Static memory | 20% | 33% | 11% | 14% | No |
| Hippo | 40% | 78% | 22% | 14% | Yes |

The hippo agent's trap-hit rate drops from 78% to 14% as it accumulates error memories with 2x half-life. Static pre-loaded memory helps from the start but doesn't improve. Any memory system can run this benchmark by implementing the [adapter interface](benchmarks/sequential-learning/adapters/interface.mjs).

```bash
cd benchmarks/sequential-learning
node run.mjs --adapter all
```

---

## Contributing

Issues and PRs welcome. Before contributing, run `hippo status` in the repo root to see the project's own memory.

The interesting problems:
- **Improve LongMemEval score.** Current R@5 is 74.0% with BM25 only. Adding embeddings (`hippo embed`) and hybrid search should close the gap toward MemPalace's 96.6%.
- Better consolidation heuristics (LLM-powered merge vs. current text overlap)
- Web UI / dashboard for visualizing decay curves and memory health
- Optimal decay parameter tuning from real usage data
- Cross-agent transfer learning evaluation
- **MemPalace-style spatial organization.** Could spatial structure (wings/halls/rooms) improve hippo's semantic layer?
- **AAAK-style compression for semantic memories.** Lossless token compression for context injection.

## License

MIT

# 🦛 Hippo

**The secret to good memory isn't remembering more. It's knowing what to forget.**

[![npm](https://img.shields.io/npm/v/hippo-memory)](https://npmjs.com/package/hippo-memory)
[![license](https://img.shields.io/badge/license-MIT-blue)](./LICENSE)

```
Works with: Claude Code, Codex, Cursor, OpenClaw, OpenCode, any CLI agent
Imports from: ChatGPT, Claude (CLAUDE.md), Cursor (.cursorrules), any markdown
Storage: SQLite backbone + markdown/YAML mirrors. Git-trackable and human-readable.
Dependencies: Zero runtime deps. Requires Node.js 22.5+. Optional embeddings via @xenova/transformers.
```

---

## The Problem

AI agents forget everything between sessions. Existing solutions just save everything and search later. That's a filing cabinet, not a brain.

Your memories are also trapped. ChatGPT knows things Claude doesn't. Cursor rules don't travel to Codex. Switch tools and you start from zero.

---

## Who Is This For

- **Multi-tool developers.** You use Claude Code on Monday, Cursor on Tuesday, Codex on Wednesday. Context doesn't carry over. Hippo is the shared memory layer across all of them.
- **Teams where agents repeat mistakes.** The agent hit the same deployment bug last week. And the week before. Hippo's error memories and decay mechanics mean hard lessons stick and noise fades.
- **Anyone whose CLAUDE.md is a mess.** Your instruction file grew to 400 lines of mixed rules, preferences, and stale workarounds. Hippo gives that structure: tags, confidence levels, automatic decay of outdated info.
- **People who want portable AI memory.** No vendor lock-in. Markdown files in your repo. Import from ChatGPT, Claude, Cursor. Export by copying a folder.

---

## Quick Start

```bash
npm install -g hippo-memory

hippo init
hippo remember "FRED cache silently dropped the tips_10y series" --tag error
hippo recall "data pipeline issues" --budget 2000
```

That's it. You have a memory system.

### What's new in v0.14.0

- **OpenClaw backup cleanup.** Plugin updates no longer leave `hippo-memory.bak-*` directories that cause duplicate plugin ID errors. Cleanup runs automatically at boot.

### What's new in v0.13.3

- **Final polish.** 8 remaining review findings fixed: ROLLBACK safety, MCP protocol compliance, dead code removal, atomic write cleanup, env var trimming, ESM import consistency.

### What's new in v0.13.2

- **7 more bug fixes** from a second deep review: Windows schtasks injection, MCP error handling, cross-store budget consistency, embedding mutex, and more. See CHANGELOG.

### What's new in v0.13.0

- **Security: command injection fixed.** OpenClaw plugin now uses `execFileSync` (no shell). All user input is passed as array args, eliminating shell injection vectors.
- **17 bug fixes** across search, embeddings, physics, MCP server, store, and CLI. See CHANGELOG for details.

### What's new in v0.12.0

- **Configurable global store.** Set `$HIPPO_HOME` or use XDG (`$XDG_DATA_HOME/hippo`) to put the global store wherever you want. Falls back to `~/.hippo/` if neither is set.

### What's new in v0.11.2

- **Cross-platform path fix.** OpenClaw plugin now correctly resolves `.hippo` paths on Unix when given Windows-style backslash paths. Uses `path/posix` instead of platform-dependent `path.basename`.

### What's new in v0.11.1

- **OpenClaw error capture filtering.** The `autoLearn` hook now applies three filters before storing tool errors: a noise pattern filter for known transient errors, per-session rate limiting (max 5), and per-session deduplication. Prevents memory pollution from infrastructure noise.

### What's new in v0.11.0

- **Reward-proportional decay.** Outcome feedback now modulates decay rate continuously instead of fixed half-life deltas. Memories with consistent positive outcomes decay up to 1.5x slower; consistent negatives decay up to 2x faster. Mixed outcomes converge toward neutral. Inspired by R-STDP in spiking neural networks. `hippo inspect` now shows cumulative outcome counts and the computed reward factor.
- **Public benchmarks.** Two benchmarks in `benchmarks/`: a [Sequential Learning Benchmark](benchmarks/sequential-learning/) (50 tasks, 10 traps, measures agent improvement over time) and a [LongMemEval integration](benchmarks/longmemeval/) (industry-standard 500-question retrieval benchmark, R@5=74.0% with BM25 only). The sequential learning benchmark is unique: no other public benchmark tests whether memory systems produce learning curves.

### What's new in v0.10.0

- **Active invalidation.** `hippo learn --git` detects migration and breaking-change commits and actively weakens memories referencing the old pattern. Manual invalidation via `hippo invalidate "REST API" --reason "migrated to GraphQL"`.
- **Architectural decisions.** `hippo decide` stores one-off decisions with 90-day half-life and verified confidence. Supports `--context` for reasoning and `--supersedes` to chain decisions when the architecture evolves.
- **Path-based memory triggers.** Memories auto-tagged with `path:<segment>` from your working directory. Recall boosts memories from the same location (up to 1.3x). Working in `src/api/`? API-related memories surface first.
- **OpenCode integration.** `hippo hook install opencode` patches AGENTS.md. Auto-detected during `hippo init`. Integration guide with MCP config and skill for progressive discovery.
- **`hippo export`** outputs all memories as JSON or markdown.
- **Decision recall boost.** 1.2x scoring multiplier for decision-tagged memories so they surface despite low retrieval frequency.

### What's new in v0.9.1

- **Auto-sleep on session exit.** `hippo hook install claude-code` now installs a Stop hook in `~/.claude/settings.json` so `hippo sleep` runs automatically when Claude Code exits. `hippo init` does this too when Claude Code is detected. No cron needed, no manual sleep.

### What's new in v0.9.0

- **Working memory layer** (`hippo wm push/read/clear/flush`). Bounded buffer (max 20 per scope) with importance-based eviction. Current-state notes live separately from long-term memory.
- **Session handoffs** (`hippo handoff create/latest/show`). Persist session summaries, next actions, and artifacts so successor sessions can resume without transcript archaeology.
- **Session lifecycle** with explicit start/end events, fallback session IDs, and `hippo session resume` for continuity.
- **Explainable recall** (`hippo recall --why`). See which terms matched, whether BM25 or embedding contributed, and the source bucket (layer, confidence, local/global).
- **`hippo current show`** for compact current-state display (active task + recent session events), ready for agent injection.
- **SQLite lock hardening**: `busy_timeout=5000`, `synchronous=NORMAL`, `wal_autocheckpoint=100`. Concurrent plugin calls no longer hit `SQLITE_BUSY`.
- **Consolidation batching**: all writes/deletes happen in a single transaction instead of N open/close cycles.
- **`--limit` flag** on `hippo recall` and `hippo context` to cap result count independently of token budget.
- **Plugin injection dedup guard** prevents double context injection on reconnect.

### What's new in v0.8.0

- **Hybrid search** blends BM25 keywords with cosine embedding similarity. Install `@xenova/transformers`, run `hippo embed`, recall quality jumps. Falls back to BM25 otherwise.
- **Schema acceleration** auto-computes how well new memories fit existing patterns. Familiar memories consolidate faster; novel ones decay faster if unused.
- **Multi-agent shared memory** with `hippo share`, `hippo peers`, and transfer scoring. Universal lessons travel between projects; project-specific config stays local.
- **Conflict resolution** via `hippo resolve <id> --keep <mem_id>`. Closes the detect-inspect-resolve loop.
- **Agent eval benchmark** validates the learning hypothesis: hippo agents drop from 78% trap rate to 14% over a 50-task sequence.

### Zero-config agent integration

`hippo init` auto-detects your agent framework and wires itself in:

```bash
cd my-project
hippo init

# Initialized Hippo at /my-project
# Directories: buffer/ episodic/ semantic/ conflicts/
# Auto-installed claude-code hook in CLAUDE.md
```

If you have a `CLAUDE.md`, it patches it. `AGENTS.md` for Codex/OpenClaw/OpenCode. `.cursorrules` for Cursor. No manual `hook install` needed. Your agent starts using Hippo on its next session.

It also sets up a daily cron job (6:15am) that runs `hippo learn --git` and `hippo sleep` automatically. Memories get captured from your commits and consolidated every day without you thinking about it.

To skip: `hippo init --no-hooks --no-schedule`

---

## Cross-Tool Import

Your memories shouldn't be locked inside one tool. Hippo pulls them in from anywhere.

```bash
# ChatGPT memory export
hippo import --chatgpt memories.json

# Claude's CLAUDE.md (skips existing hippo hook blocks)
hippo import --claude CLAUDE.md

# Cursor rules
hippo import --cursor .cursorrules

# Any markdown file (headings become tags)
hippo import --markdown MEMORY.md

# Any text file
hippo import --file notes.txt
```

All import commands support `--dry-run` (preview without writing), `--global` (write to `~/.hippo/`), and `--tag` (add extra tags). Duplicates are detected and skipped automatically.

### Conversation Capture

Extract memories from raw conversation text. No LLM needed: pattern-based heuristics find decisions, rules, errors, and preferences.

```bash
# Pipe a conversation in
cat session.log | hippo capture --stdin

# Or point at a file
hippo capture --file conversation.md

# Preview first
hippo capture --file conversation.md --dry-run
```

### Active task snapshots

Long-running work needs short-term continuity, not just long-term memory. Hippo can persist the current in-flight task so a later `continue` has something concrete to recover.

```bash
hippo snapshot save \
  --task "Ship SQLite backbone" \
  --summary "Tests/build/smoke are green, next slice is active-session recovery" \
  --next-step "Implement active snapshot retrieval in context output"

hippo snapshot show
hippo context --auto --budget 1500
hippo snapshot clear
```

`hippo context --auto` includes the active task snapshot before long-term memories, so agents get both the immediate thread and the deeper lessons.

### Session event trails

Manual snapshots are useful, but real work also needs a breadcrumb trail. Hippo can store short session events and link them to the active snapshot, so context output shows the latest steps, not just the last summary.

```bash
hippo session log \
  --id sess_20260326 \
  --task "Ship continuity" \
  --type progress \
  --content "Schema migration is done, next step is CLI wiring"

hippo snapshot save \
  --task "Ship continuity" \
  --summary "Structured session events are flowing" \
  --next-step "Surface them in framework hooks" \
  --session sess_20260326

hippo session show --id sess_20260326
hippo context --auto --budget 1500
```

Hippo mirrors the latest trail to `.hippo/buffer/recent-session.md` so you can inspect the short-term thread without opening SQLite.

### Session handoffs

When you're done for the day (or switching to another agent), create a handoff so the next session knows exactly where to pick up:

```bash
hippo handoff create \
  --summary "Finished schema migration, tests green" \
  --next "Wire handoff injection into context output" \
  --session sess_20260403 \
  --artifact src/db.ts

hippo handoff latest   # show the most recent handoff
hippo handoff show 3   # show a specific handoff by ID
hippo session resume   # re-inject latest handoff as context
```

### Working memory

Working memory is a bounded scratchpad for current-state notes. It's separate from long-term memory and gets cleared between sessions.

```bash
hippo wm push --scope repo \
  --content "Investigating flaky test in store.test.ts, line 42" \
  --importance 0.9

hippo wm read --scope repo    # show current working notes
hippo wm clear --scope repo   # wipe the scratchpad
hippo wm flush --scope repo   # flush on session end
```

The buffer holds a maximum of 20 entries per scope. When full, the lowest-importance entry is evicted.
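
That eviction rule is small enough to sketch: a fixed-capacity buffer that drops its lowest-importance entry when full. The capacity of 20 matches the paragraph above; everything else (field names, the push API) is illustrative, not hippo's actual implementation:

```javascript
// Bounded working-memory scratchpad: at most `capacity` entries per scope.
// When the buffer is full, the lowest-importance entry is evicted first.
function wmPush(buffer, entry, capacity = 20) {
  if (buffer.length >= capacity) {
    const weakest = buffer.reduce((a, b) => (b.importance < a.importance ? b : a));
    buffer.splice(buffer.indexOf(weakest), 1); // evict to make room
  }
  buffer.push(entry);
  return buffer;
}

const wm = [];
for (let i = 1; i <= 20; i++) {
  wmPush(wm, { content: `note ${i}`, importance: i / 20 });
}
wmPush(wm, { content: "urgent", importance: 0.9 });
console.log(wm.length);                              // 20
console.log(wm.some((e) => e.content === "note 1")); // false (evicted)
```

High-importance notes survive a busy session; low-importance chatter is the first to go.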

### Explainable recall

See why a memory was returned:

```bash
hippo recall "data pipeline" --why --limit 5

# --- mem_a1b2c3 [episodic] [observed] [local] score=0.847
#     BM25: matched [data, pipeline]; cosine: 0.82
#     ...memory content...
```

---

## How It Works

Input enters the buffer. Important things get encoded into episodic memory. During "sleep," repeated episodes compress into semantic patterns. Weak memories decay and disappear.

```
New information
      |
      v
+-----------+
|  Buffer   |   Working memory. Current session only. No decay.
| (session) |
+-----+-----+
      | encoded (tags, strength, half-life assigned)
      v
+-----------+
| Episodic  |   Timestamped memories. Decay by default.
|  Store    |   Retrieval strengthens. Errors stick longer.
+-----+-----+
      | consolidation (hippo sleep)
      v
+-----------+
| Semantic  |   Compressed patterns. Stable. Schema-aware.
|  Store    |   Extracted from repeated episodes.
+-----------+

hippo sleep: decay + replay + merge
```

---

## Key Features

### Decay by default

Every memory has a half-life. 7 days by default. Persistence is earned.

```bash
hippo remember "always check cache contents after refresh"
# stored with half_life: 7d, strength: 1.0

# 14 days later with no retrieval:
hippo inspect mem_a1b2c3
# strength: 0.25 (decayed by 2 half-lives)
# at risk of removal on next sleep
```
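
The 0.25 figure above is two half-lives of plain exponential decay. A minimal sketch of the curve (the formula is reconstructed from this README's numbers, not taken from hippo's source):

```javascript
// Exponential decay: strength halves once per half-life.
// halfLifeDays = 7 is the documented default; the closed form is an
// assumption that reproduces the 14-day -> 0.25 example above.
function strength(daysSinceLastUse, halfLifeDays = 7) {
  return Math.pow(0.5, daysSinceLastUse / halfLifeDays);
}

console.log(strength(0));  // 1    (fresh memory)
console.log(strength(7));  // 0.5  (one half-life)
console.log(strength(14)); // 0.25 (two half-lives, matches `hippo inspect`)
```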

---

### Retrieval strengthens

Use it or lose it. Each recall boosts the half-life by 2 days.

```bash
hippo recall "cache issues"
# finds mem_a1b2c3, retrieval_count: 1 -> 2
# half_life extended: 7d -> 9d
# strength recalculated from retrieval timestamp

hippo recall "cache issues"   # again next week
# retrieval_count: 2 -> 3
# half_life: 9d -> 11d
# this memory is learning to survive
```
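
Mechanically, a recall is three updates: a counter bump, a fixed +2d half-life extension, and a reset of the decay clock. A sketch with illustrative field names (the real record layout may differ):

```javascript
// Each recall extends the half-life by 2 days (per the example above) and
// restarts decay from the retrieval timestamp. Field names are assumptions.
function recall(memory, nowDays) {
  memory.retrievalCount += 1;
  memory.halfLifeDays += 2;        // 7d -> 9d -> 11d ...
  memory.lastUsedAtDays = nowDays; // strength decays from this point on
  return memory;
}

const m = { retrievalCount: 1, halfLifeDays: 7, lastUsedAtDays: 0 };
recall(m, 1);
recall(m, 8);
console.log(m.halfLifeDays, m.retrievalCount); // 11 3
```

Because the clock resets on every use, a frequently recalled memory both decays from a more recent point and decays more slowly.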

---

### Active invalidation

When you migrate from one tool to another, old memories about the replaced tool should die immediately. Hippo detects migration and breaking-change commits during `hippo learn --git` and actively weakens matching memories.

```bash
hippo learn --git
# feat: migrate from webpack to vite
# Invalidated 3 memories referencing "webpack"
# Learned: migrate from webpack to vite
```

You can also invalidate manually:

```bash
hippo invalidate "REST API" --reason "migrated to GraphQL"
# Invalidated 5 memories referencing "REST API".
```

---

### Architectural decisions

One-off decisions don't repeat, so they can't earn their keep through retrieval alone. `hippo decide` stores them with a 90-day half-life and verified confidence so they survive long enough to matter.

```bash
hippo decide "Use PostgreSQL for all new services" --context "JSONB support"
# Decision recorded: mem_a1b2c3

# Later, when the decision changes:
hippo decide "Use CockroachDB for global services" \
  --context "Need multi-region" \
  --supersedes mem_a1b2c3
# Superseded mem_a1b2c3 (half-life halved, marked stale)
# Decision recorded: mem_d4e5f6
```

---

### Error memories stick

Tag a memory as an error and it gets 2x the half-life automatically.

```bash
hippo remember "deployment failed: forgot to run migrations" --error
# half_life: 14d instead of 7d
# emotional_valence: negative
# strength formula applies 1.5x multiplier

# production incidents don't fade quietly
```

---

### Confidence tiers

Every memory carries a confidence level: `verified`, `observed`, `inferred`, or `stale`. This tells agents how much to trust what they're reading.

```bash
hippo remember "API rate limit is 100/min" --verified
hippo remember "deploy usually takes ~3 min" --observed
hippo remember "the flaky test might be a race condition" --inferred
```

When context is generated, confidence is shown inline:

```
[verified] API rate limit is 100/min per the docs
[observed] Deploy usually takes ~3 min
[inferred] The flaky test might be a race condition
```

Agents can see at a glance what's established fact vs. a pattern worth questioning.

Memories unretrieved for 30+ days are automatically marked `stale` during the next `hippo sleep`. If one gets recalled again, Hippo wakes it back up to `observed` so it can earn trust again instead of staying permanently stale.

### Conflict tracking

Hippo detects obvious contradictions between overlapping memories and keeps them visible instead of silently letting both masquerade as truth.

```bash
hippo sleep       # refreshes open conflicts
hippo conflicts   # inspect them
```

Open conflicts are stored in SQLite, mirrored under `.hippo/conflicts/`, and linked back into each memory's `conflicts_with` field.

---

### Observation framing

Memories aren't presented as bare assertions. By default, Hippo frames them as observations with dates, so agents treat them as context rather than commands.

```bash
hippo context --framing observe   # default
# Output: "Previously observed (2026-03-10): deploy takes ~3 min"

hippo context --framing suggest
# Output: "Consider: deploy takes ~3 min"

hippo context --framing assert
# Output: "Deploy takes ~3 min"
```

Three modes: `observe` (default), `suggest`, `assert`. Choose based on how directive you want the memory to be.

---

### Sleep consolidation

Run `hippo sleep` and episodes compress into patterns.

```bash
hippo sleep

# Running consolidation...
#
# Results:
#   Active memories: 23
#   Removed (decayed): 4
#   Merged episodic: 6
#   New semantic: 2
```

Three or more related episodes get merged into a single semantic memory. The originals decay. The pattern survives.
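
The merge rule ("three or more related episodes") can be sketched as naive keyword-overlap grouping. Hippo's actual heuristic is its own text-overlap logic (see the Contributing section), so treat the tokenization and thresholds here purely as assumptions:

```javascript
// Group episodes that share keywords; any group of 3+ becomes one
// semantic memory. Thresholds and tokenization are illustrative only.
function consolidate(episodes, minShared = 2, minGroup = 3) {
  const tokens = (s) => new Set(s.toLowerCase().split(/\W+/).filter(Boolean));
  const groups = [];
  for (const ep of episodes) {
    const t = tokens(ep);
    const g = groups.find(
      (grp) => [...tokens(grp[0])].filter((w) => t.has(w)).length >= minShared
    );
    if (g) g.push(ep);
    else groups.push([ep]);
  }
  return groups
    .filter((g) => g.length >= minGroup)
    .map((g) => `pattern from ${g.length} episodes: ${g[0]}`);
}

const semantic = consolidate([
  "deploy failed missing migrations",
  "deploy failed again migrations not run",
  "ran deploy without migrations, failed",
  "unrelated note about caching",
]);
console.log(semantic.length); // 1 (the three deploy episodes merge; the caching note doesn't)
```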

---

### Outcome feedback

Did the recalled memories actually help? Tell Hippo. It tightens the feedback loop.

```bash
hippo recall "why is the gold model broken"
# ... you read the memories and fix the bug ...

hippo outcome --good
# Applied positive outcome to 3 memories
# reward factor increases, decay slows

hippo outcome --bad
# Applied negative outcome to 3 memories
# reward factor decreases, decay accelerates
```

Outcomes are cumulative. A memory with 5 positive outcomes and 0 negative has a reward factor of ~1.42, making its effective half-life 42% longer. A memory with 0 positive and 3 negative has a factor of ~0.63, decaying nearly twice as fast. Mixed outcomes converge toward neutral (1.0).
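
One formula that reproduces the quoted numbers is a smoothed outcome ratio with a 1.5x ceiling and 0.5x floor. This is a reconstruction from the examples above, not hippo's actual source:

```javascript
// Reward factor in (0.5, 1.5): > 1 slows decay, < 1 accelerates it.
// The +1 smoothing term pulls sparse evidence toward neutral (1.0).
// Reconstructed to match the ~1.42 and ~0.63 examples in the README.
function rewardFactor(positive, negative) {
  return 1 + 0.5 * (positive - negative) / (positive + negative + 1);
}

console.log(rewardFactor(5, 0).toFixed(2)); // "1.42" (consistent wins)
console.log(rewardFactor(0, 3).toFixed(2)); // "0.63" (consistent losses)
console.log(rewardFactor(4, 4));            // 1 (mixed -> neutral)
```

In the limit of all-positive outcomes the factor approaches 1.5 (decay up to 1.5x slower); all-negative approaches 0.5 (decay up to 2x faster), matching the v0.11.0 release note.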

---

### Token budgets

Recall only what fits. No context stuffing.

```bash
# fits within Claude's 2K token window for task context
hippo recall "deployment checklist" --budget 2000

# need more for a big task
hippo recall "full project history" --budget 8000

# machine-readable for programmatic use
hippo recall "api errors" --budget 1000 --json
```

Results are ranked by `relevance * strength * recency`. The highest-signal memories fill the budget first.
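
Budget-constrained recall amounts to a greedy fill: sort by the composite score, then pack memories until the token budget is spent. A sketch in which the scoring inputs and the tokens-per-memory estimate are assumptions, not hippo's real accounting:

```javascript
// Rank by relevance * strength * recency, then greedily pack the budget.
// The chars/4 token estimate is a crude stand-in for real tokenization.
function recallWithinBudget(memories, budgetTokens) {
  const score = (m) => m.relevance * m.strength * m.recency;
  const tokens = (m) => Math.ceil(m.content.length / 4);
  const picked = [];
  let used = 0;
  for (const m of [...memories].sort((a, b) => score(b) - score(a))) {
    if (used + tokens(m) > budgetTokens) continue; // skip what doesn't fit
    picked.push(m);
    used += tokens(m);
  }
  return picked;
}

const out = recallWithinBudget(
  [
    { content: "a".repeat(400), relevance: 0.9, strength: 0.9, recency: 0.9 },
    { content: "b".repeat(400), relevance: 0.9, strength: 0.2, recency: 0.9 },
    { content: "c".repeat(400), relevance: 0.8, strength: 0.8, recency: 0.8 },
  ],
  200
);
console.log(out.length); // 2 (the weak-strength memory is priced out)
```

Note how a decayed memory (low `strength`) loses its budget slot even at high relevance: persistence has to be earned.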

---

### Auto-learn from git

Hippo can scan your commit history and extract lessons from fix/revert/bug commits automatically.

```bash
# Learn from the last 7 days of commits
hippo learn --git

# Learn from the last 30 days
hippo learn --git --days 30

# Scan multiple repos in one pass
hippo learn --git --repos "~/project-a,~/project-b,~/project-c"
```

The `--repos` flag accepts comma-separated paths. Hippo scans each repo's git log, extracts fix/revert/bug lessons, deduplicates against existing memories, and stores new ones. Pair with `hippo sleep` afterwards to consolidate.

Ideal for a weekly cron:

```bash
hippo learn --git --repos "~/repo1,~/repo2" --days 7
hippo sleep
```

---

### Watch mode

Wrap any command with `hippo watch` to auto-learn from failures:

```bash
hippo watch "npm run build"
# if it fails, Hippo captures the error automatically
# next time an agent asks about build issues, the memory is there
```

---

## CLI Reference

| Command | What it does |
|---------|-------------|
| `hippo init` | Create `.hippo/` + auto-install agent hooks |
| `hippo init --global` | Create global store at `~/.hippo/` |
| `hippo init --no-hooks` | Create `.hippo/` without auto-installing hooks |
| `hippo remember "<text>"` | Store a memory |
| `hippo remember "<text>" --tag <t>` | Store with tag (repeatable) |
| `hippo remember "<text>" --error` | Store as error (2x half-life) |
| `hippo remember "<text>" --pin` | Store with no decay |
| `hippo remember "<text>" --verified` | Set confidence: verified (default) |
| `hippo remember "<text>" --observed` | Set confidence: observed |
| `hippo remember "<text>" --inferred` | Set confidence: inferred |
| `hippo remember "<text>" --global` | Store in global `~/.hippo/` store |
| `hippo recall "<query>"` | Retrieve relevant memories (local + global) |
| `hippo recall "<query>" --budget <n>` | Recall within token limit (default: 4000) |
| `hippo recall "<query>" --limit <n>` | Cap result count |
| `hippo recall "<query>" --why` | Show match reasons and source buckets |
| `hippo recall "<query>" --json` | Output as JSON |
| `hippo context --auto` | Smart context injection (auto-detects task from git) |
| `hippo context "<query>" --budget <n>` | Context injection with explicit query (default: 1500) |
| `hippo context --limit <n>` | Cap memory count in context |
| `hippo context --budget 0` | Skip entirely (zero token cost) |
| `hippo context --framing <mode>` | Framing: observe (default), suggest, assert |
| `hippo context --format <fmt>` | Output format: markdown (default) or json |
| `hippo import --chatgpt <path>` | Import from ChatGPT memory export (JSON or txt) |
| `hippo import --claude <path>` | Import from CLAUDE.md or Claude memory.json |
| `hippo import --cursor <path>` | Import from .cursorrules or .cursor/rules |
| `hippo import --markdown <path>` | Import from structured markdown (headings -> tags) |
| `hippo import --file <path>` | Import from any text file |
| `hippo import --dry-run` | Preview import without writing |
| `hippo import --global` | Write imported memories to `~/.hippo/` |
| `hippo capture --stdin` | Extract memories from piped conversation text |
| `hippo capture --file <path>` | Extract memories from a file |
| `hippo capture --dry-run` | Preview extraction without writing |
| `hippo sleep` | Run consolidation (decay + merge + compress) |
| `hippo sleep --dry-run` | Preview consolidation without writing |
567
+ | `hippo status` | Memory health: counts, strengths, last sleep |
568
+ | `hippo outcome --good` | Strengthen last recalled memories |
569
+ | `hippo outcome --bad` | Weaken last recalled memories |
570
+ | `hippo outcome --id <id> --good` | Target a specific memory |
571
+ | `hippo inspect <id>` | Full detail on one memory |
572
+ | `hippo forget <id>` | Force remove a memory |
573
+ | `hippo embed` | Embed all memories for semantic search |
574
+ | `hippo embed --status` | Show embedding coverage |
575
+ | `hippo watch "<command>"` | Run command, auto-learn from failures |
576
+ | `hippo learn --git` | Scan recent git commits for lessons |
577
+ | `hippo learn --git --days <n>` | Scan N days back (default: 7) |
578
+ | `hippo learn --git --repos <paths>` | Scan multiple repos (comma-separated) |
579
+ | `hippo conflicts` | List detected open memory conflicts |
580
+ | `hippo conflicts --json` | Output conflicts as JSON |
581
+ | `hippo resolve <id>` | Show both conflicting memories for comparison |
582
+ | `hippo resolve <id> --keep <mem_id>` | Resolve: keep winner, weaken loser |
583
+ | `hippo resolve <id> --keep <mem_id> --forget` | Resolve: keep winner, delete loser |
584
+ | `hippo promote <id>` | Copy a local memory to the global store |
585
+ | `hippo share <id>` | Share with attribution + transfer scoring |
586
+ | `hippo share <id> --force` | Share even if transfer score is low |
587
+ | `hippo share --auto` | Auto-share all high-scoring memories |
588
+ | `hippo share --auto --dry-run` | Preview what would be shared |
589
+ | `hippo peers` | List projects contributing to global store |
590
+ | `hippo sync` | Pull global memories into local project |
591
+ | `hippo invalidate "<pattern>"` | Actively weaken memories matching an old pattern |
592
+ | `hippo invalidate "<pattern>" --reason "<why>"` | Include what replaced it |
593
+ | `hippo decide "<decision>"` | Record architectural decision (90-day half-life) |
594
+ | `hippo decide "<decision>" --context "<why>"` | Include reasoning |
595
+ | `hippo decide "<decision>" --supersedes <id>` | Supersede a previous decision |
596
+ | `hippo hook list` | Show available framework hooks |
597
+ | `hippo hook install <target>` | Install hook (claude-code also adds Stop hook for auto-sleep) |
598
+ | `hippo hook uninstall <target>` | Remove hook |
599
+ | `hippo handoff create --summary "..."` | Create a session handoff |
600
+ | `hippo handoff latest` | Show the most recent handoff |
601
+ | `hippo handoff show <id>` | Show a specific handoff by ID |
602
+ | `hippo session latest` | Show latest task snapshot + events |
603
+ | `hippo session resume` | Re-inject latest handoff as context |
604
+ | `hippo current show` | Compact current state (task + session events) |
605
+ | `hippo wm push --scope <s> --content "..."` | Push to working memory |
606
+ | `hippo wm read --scope <s>` | Read working memory entries |
607
+ | `hippo wm clear --scope <s>` | Clear working memory |
608
+ | `hippo wm flush --scope <s>` | Flush working memory (session end) |
609
+ | `hippo dashboard` | Open web dashboard at localhost:3333 |
610
+ | `hippo dashboard --port <n>` | Use custom port |
611
+ | `hippo mcp` | Start MCP server (stdio transport) |
612
+
---

## Framework Integrations

### Auto-install (recommended)

`hippo init` detects your agent framework and patches the right config file automatically:

| Framework | Detected by | Patches |
|-----------|------------|---------|
| Claude Code | `CLAUDE.md` or `.claude/settings.json` | `CLAUDE.md` + Stop hook in `settings.json` |
| Codex | `AGENTS.md` or `.codex` | `AGENTS.md` |
| Cursor | `.cursorrules` or `.cursor/rules` | `.cursorrules` |
| OpenClaw | `.openclaw` or `AGENTS.md` | `AGENTS.md` |
| OpenCode | `.opencode/` or `opencode.json` | `AGENTS.md` |

No extra commands needed. Just `hippo init` and your agent knows about Hippo.

### Manual install

If you prefer explicit control:

```bash
hippo hook install claude-code   # patches CLAUDE.md + adds Stop hook to settings.json
hippo hook install codex         # patches AGENTS.md
hippo hook install cursor        # patches .cursorrules
hippo hook install openclaw     # patches AGENTS.md
hippo hook install opencode     # patches AGENTS.md
```

This adds a `<!-- hippo:start -->` ... `<!-- hippo:end -->` block that tells the agent to:
1. Run `hippo context --auto --budget 1500` at session start
2. Run `hippo remember "<lesson>" --error` on errors
3. Run `hippo outcome --good` on completion

For Claude Code, it also adds a Stop hook to `~/.claude/settings.json` so `hippo sleep` runs automatically when the session exits.

To remove: `hippo hook uninstall claude-code`

### What the hook adds (Claude Code example)

```markdown
## Project Memory (Hippo)

Before starting work, load relevant context:
hippo context --auto --budget 1500

When you hit an error or discover a gotcha:
hippo remember "<what went wrong and why>" --error

After completing work successfully:
hippo outcome --good
```

### MCP Server

For any MCP-compatible client (Cursor, Windsurf, Cline, Claude Desktop):

```bash
hippo mcp   # starts MCP server over stdio
```

Add to your MCP config (e.g. `.cursor/mcp.json` or `claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "hippo-memory": {
      "command": "hippo",
      "args": ["mcp"]
    }
  }
}
```

Exposes tools: `hippo_recall`, `hippo_remember`, `hippo_outcome`, `hippo_context`, `hippo_status`, `hippo_learn`, `hippo_wm_push`.

### OpenClaw Plugin

Native plugin with auto-context injection, workspace-aware memory lookup, and tool hooks for auto-learn / auto-sleep.

```bash
openclaw plugins install hippo-memory
openclaw plugins enable hippo-memory
```

Plugin docs: [extensions/openclaw-plugin/](extensions/openclaw-plugin/). Integration guide: [integrations/openclaw.md](integrations/openclaw.md).

### Claude Code Plugin

Plugin with SessionStart/Stop hooks and error auto-capture. See [extensions/claude-code-plugin/](extensions/claude-code-plugin/).

Full integration details: [integrations/](integrations/)

---

## The Neuroscience

Hippo is modeled on seven properties of the human hippocampus. Not metaphorically. Literally.

**Why two stores?** The brain uses a fast hippocampal buffer + a slow neocortical store (Complementary Learning Systems theory, McClelland et al. 1995). If the neocortex learned fast, new information would overwrite old knowledge. The buffer absorbs new episodes; the neocortex extracts patterns over time.

**Why does decay help?** New neurons born in the dentate gyrus actively disrupt old memory traces (Frankland et al. 2013). This is adaptive: it reduces interference from outdated information. Forgetting isn't failure. It's maintenance.

**Why do errors stick?** The amygdala modulates hippocampal consolidation based on emotional significance. Fear and error signals boost encoding. Your first production incident is burned into memory. Your 200th uneventful deploy isn't.

**Why does retrieval strengthen?** Recalled memories undergo "reconsolidation" (Nader et al. 2000). The act of retrieval destabilizes the trace, then re-encodes it stronger. This is the testing effect. Hippo implements it mechanically via the half-life extension on recall.

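Mechanically, decay and retrieval strengthening reduce to two rules: strength falls exponentially by half-life, and each recall re-encodes the trace with a longer half-life. A toy model — the constants (and the 1.5x extension factor) are illustrative, not Hippo's tuned values:

```javascript
// Toy model of half-life decay plus reconsolidation; constants are illustrative.
function strength(memory, nowHours) {
  const age = nowHours - memory.lastRecalledAt;
  return Math.pow(0.5, age / memory.halfLifeHours); // exponential decay
}

function recall(memory, nowHours) {
  // Reconsolidation: retrieval resets the clock and extends the half-life.
  memory.lastRecalledAt = nowHours;
  memory.halfLifeHours *= 1.5; // hypothetical extension factor
  return memory;
}
```

A memory recalled often decays ever more slowly; one never recalled fades on schedule. Error memories start with a doubled half-life, which is why hard lessons outlive routine notes.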
**Why does sleep consolidate?** During sleep, the hippocampus replays compressed versions of recent episodes and "teaches" the neocortex by repeatedly activating the same patterns. Hippo's `sleep` command runs this as a deliberate consolidation pass.

**Why does reward modulate decay?** In spiking neural networks, reward-modulated STDP strengthens synapses that contribute to positive outcomes and weakens those that don't. Hippo's reward-proportional decay (v0.11.0) implements this: memories with consistent positive outcomes decay slower, negatives decay faster, with no fixed deltas. Inspired by [MH-FLOCKE](https://github.com/MarcHesse/mhflocke)'s R-STDP architecture for quadruped locomotion, where the same mechanism produces stable learning with 11.6x lower variance than PPO.

The 7 mechanisms in full: [PLAN.md#core-principles](PLAN.md#core-principles)

For how these mechanisms connect to LLM training, continual learning, and open research problems: **[RESEARCH.md](RESEARCH.md)**

**Prior art in agent memory simulation.** The idea that human-like memory produces human-like behavior as an emergent property was explored in IEEE research from 2010-2011 ([5952114](https://ieeexplore.ieee.org/document/5952114), [5548405](https://ieeexplore.ieee.org/document/5548405), [5953964](https://ieeexplore.ieee.org/document/5953964)). Walking between rooms and forgetting why you went there doesn't need direct simulation; it emerges naturally from a memory system with capacity limits and decay. Hippo's design follows the same principle: implement the mechanisms, and the behavior follows.

**Related work:** [HippoRAG](https://arxiv.org/abs/2405.14831) (Gutierrez et al., 2024) applies hippocampal indexing to RAG via knowledge graphs. [MemPalace](https://github.com/milla-jovovich/mempalace) (Sigman & Jovovich, 2026) organizes memory spatially (wings/halls/rooms) with AAAK compression, achieving 100% on [LongMemEval](https://arxiv.org/abs/2410.10813). [MH-FLOCKE](https://github.com/MarcHesse/mhflocke) (Hesse, 2026) uses spiking neurons with R-STDP for embodied cognition. Each system tackles a different facet: HippoRAG optimizes retrieval quality, MemPalace optimizes retrieval organization, MH-FLOCKE optimizes embodied learning, and Hippo optimizes memory lifecycle.

---

## Comparison

| Feature | Hippo | MemPalace | Mem0 | Basic Memory |
|---------|-------|-----------|------|-------------|
| Decay by default | Yes | No | No | No |
| Retrieval strengthening | Yes | No | No | No |
| Reward-proportional decay | Yes | No | No | No |
| Hybrid search (BM25 + embeddings) | Yes | Embeddings + spatial | Embeddings only | No |
| Schema acceleration | Yes | No | No | No |
| Conflict detection + resolution | Yes | No | No | No |
| Multi-agent shared memory | Yes | No | No | No |
| Transfer scoring | Yes | No | No | No |
| Outcome tracking | Yes | No | No | No |
| Confidence tiers | Yes | No | No | No |
| Spatial organization | No | Yes (wings/halls/rooms) | No | No |
| Lossless compression | No | Yes (AAAK, 30x) | No | No |
| Cross-tool import | Yes | No | No | No |
| Auto-hook install | Yes | No | No | No |
| MCP server | Yes | Yes | No | No |
| Zero dependencies | Yes | No (ChromaDB) | No | No |
| LongMemEval R@5 (retrieval) | 74.0% (BM25 only) | 96.6% (raw) / 100% (reranked) | ~49-85% | N/A |
| Git-friendly | Yes | No | No | Yes |
| Framework agnostic | Yes | Yes | Partial | Yes |

Different tools answer different questions. Mem0 and Basic Memory implement "save everything, search later." MemPalace implements "store everything, organize spatially for retrieval." Hippo implements "forget by default, earn persistence through use." These are complementary approaches: MemPalace's retrieval precision + Hippo's lifecycle management would be stronger than either alone.

---

## Benchmarks

Two benchmarks testing two different things. Full details in [`benchmarks/`](benchmarks/).

### LongMemEval (retrieval accuracy)

[LongMemEval](https://arxiv.org/abs/2410.10813) (ICLR 2025) is the industry-standard benchmark: 500 questions across 5 memory abilities, embedded in 115k+ token chat histories.

**Hippo v0.11.0 results (BM25 only, zero dependencies):**

| Metric | Score |
|--------|-------|
| Recall@1 | 50.4% |
| Recall@3 | 66.6% |
| Recall@5 | 74.0% |
| Recall@10 | 82.6% |
| Answer in content@5 | 46.6% |

| Question Type | Count | R@5 |
|---------------|-------|-----|
| single-session-assistant | 56 | 94.6% |
| knowledge-update | 78 | 88.5% |
| temporal-reasoning | 133 | 73.7% |
| multi-session | 133 | 72.2% |
| single-session-user | 70 | 65.7% |
| single-session-preference | 30 | 26.7% |

For context: MemPalace scores 96.6% (raw) using ChromaDB embeddings + spatial indexing. Hippo achieves 74.0% using BM25 keyword matching alone with zero runtime dependencies. Adding embeddings via `hippo embed` (optional `@xenova/transformers` peer dep) enables hybrid search and should close the gap.

Hippo's strongest categories (knowledge-update 88.5%, single-session-assistant 94.6%) are the ones where keyword overlap between question and stored content is highest. The weakest (preference 26.7%) involves indirect references that need semantic understanding.

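For reference, Recall@k here is the standard retrieval metric: a question counts as a hit if any of its gold evidence appears in the top k retrieved items. A generic sketch of the computation (not the benchmark's own evaluator; field names are assumed):

```javascript
// Generic Recall@k: fraction of questions whose gold evidence appears in the top k.
function recallAtK(questions, k) {
  const hits = questions.filter((q) =>
    q.retrievedIds.slice(0, k).some((id) => q.goldIds.includes(id))
  ).length;
  return hits / questions.length;
}
```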
```bash
cd benchmarks/longmemeval
python ingest_direct.py --data data/longmemeval_oracle.json --store-dir ./store
python retrieve_fast.py --data data/longmemeval_oracle.json --store-dir ./store --output results/retrieval.jsonl
python evaluate_retrieval.py --retrieval results/retrieval.jsonl --data data/longmemeval_oracle.json
```

### Sequential Learning Benchmark (agent improvement over time)

No other public benchmark tests whether memory systems produce learning curves. LongMemEval tests retrieval on a fixed corpus. This benchmark tests whether an agent with memory *performs better on task 40 than task 5*.

50 tasks, 10 trap categories, each appearing 2-3 times across the sequence.

**Hippo v0.11.0 results** (trap-hit rate by phase; lower is better):

| Condition | Overall | Early | Mid | Late | Learns? |
|-----------|---------|-------|-----|------|---------|
| No memory | 100% | 100% | 100% | 100% | No |
| Static memory | 20% | 33% | 11% | 14% | No |
| Hippo | 40% | 78% | 22% | 14% | Yes |

The hippo agent's trap-hit rate drops from 78% to 14% as it accumulates error memories with 2x half-life. Static pre-loaded memory helps from the start but doesn't improve. Any memory system can run this benchmark by implementing the [adapter interface](benchmarks/sequential-learning/adapters/interface.mjs).

```bash
cd benchmarks/sequential-learning
node run.mjs --adapter all
```

---

## Contributing

Issues and PRs welcome. Before contributing, run `hippo status` in the repo root to see the project's own memory.

The interesting problems:
- **Improve LongMemEval score.** Current R@5 is 74.0% with BM25 only. Adding embeddings (`hippo embed`) and hybrid search should close the gap toward MemPalace's 96.6%.
- Better consolidation heuristics (LLM-powered merge vs current text overlap)
- Web UI / dashboard for visualizing decay curves and memory health
- Optimal decay parameter tuning from real usage data
- Cross-agent transfer learning evaluation
- **MemPalace-style spatial organization.** Could spatial structure (wings/halls/rooms) improve hippo's semantic layer?
- **AAAK-style compression for semantic memories.** Lossless token compression for context injection.

## License

MIT