hippo-memory 0.9.0 → 0.11.0
- package/README.md +156 -19
- package/dist/cli.d.ts +1 -0
- package/dist/cli.d.ts.map +1 -1
- package/dist/cli.js +280 -4
- package/dist/cli.js.map +1 -1
- package/dist/db.d.ts.map +1 -1
- package/dist/db.js +12 -1
- package/dist/db.js.map +1 -1
- package/dist/invalidation.d.ts +23 -0
- package/dist/invalidation.d.ts.map +1 -0
- package/dist/invalidation.js +94 -0
- package/dist/invalidation.js.map +1 -0
- package/dist/memory.d.ts +27 -1
- package/dist/memory.d.ts.map +1 -1
- package/dist/memory.js +45 -7
- package/dist/memory.js.map +1 -1
- package/dist/path-context.d.ts +12 -0
- package/dist/path-context.d.ts.map +1 -0
- package/dist/path-context.js +32 -0
- package/dist/path-context.js.map +1 -0
- package/dist/search.d.ts.map +1 -1
- package/dist/search.js +20 -1
- package/dist/search.js.map +1 -1
- package/dist/store.d.ts.map +1 -1
- package/dist/store.js +14 -5
- package/dist/store.js.map +1 -1
- package/extensions/openclaw-plugin/openclaw.plugin.json +1 -1
- package/extensions/openclaw-plugin/package.json +1 -1
- package/openclaw.plugin.json +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -6,7 +6,7 @@
|
|
|
6
6
|
[](./LICENSE)
|
|
7
7
|
|
|
8
8
|
```
|
|
9
|
-
Works with: Claude Code, Codex, Cursor, OpenClaw, any CLI agent
|
|
9
|
+
Works with: Claude Code, Codex, Cursor, OpenClaw, OpenCode, any CLI agent
|
|
10
10
|
Imports from: ChatGPT, Claude (CLAUDE.md), Cursor (.cursorrules), any markdown
|
|
11
11
|
Storage: SQLite backbone + markdown/YAML mirrors. Git-trackable and human-readable.
|
|
12
12
|
Dependencies: Zero runtime deps. Requires Node.js 22.5+. Optional embeddings via @xenova/transformers.
|
|
@@ -43,6 +43,24 @@ hippo recall "data pipeline issues" --budget 2000
|
|
|
43
43
|
|
|
44
44
|
That's it. You have a memory system.
|
|
45
45
|
|
|
46
|
+
### What's new in v0.11.0
|
|
47
|
+
|
|
48
|
+
- **Reward-proportional decay.** Outcome feedback now modulates decay rate continuously instead of fixed half-life deltas. Memories with consistent positive outcomes decay up to 1.5x slower; consistent negatives decay up to 2x faster. Mixed outcomes converge toward neutral. Inspired by R-STDP in spiking neural networks. `hippo inspect` now shows cumulative outcome counts and the computed reward factor.
|
|
49
|
+
- **Public benchmarks.** Two benchmarks in `benchmarks/`: a [Sequential Learning Benchmark](benchmarks/sequential-learning/) (50 tasks, 10 traps, measures agent improvement over time) and a [LongMemEval integration](benchmarks/longmemeval/) (industry-standard 500-question retrieval benchmark, R@5=74.0% with BM25 only). The sequential learning benchmark is unique: no other public benchmark tests whether memory systems produce learning curves.
|
|
50
|
+
|
|
51
|
+
### What's new in v0.10.0
|
|
52
|
+
|
|
53
|
+
- **Active invalidation.** `hippo learn --git` detects migration and breaking-change commits and actively weakens memories referencing the old pattern. Manual invalidation via `hippo invalidate "REST API" --reason "migrated to GraphQL"`.
|
|
54
|
+
- **Architectural decisions.** `hippo decide` stores one-off decisions with 90-day half-life and verified confidence. Supports `--context` for reasoning and `--supersedes` to chain decisions when the architecture evolves.
|
|
55
|
+
- **Path-based memory triggers.** Memories auto-tagged with `path:<segment>` from your working directory. Recall boosts memories from the same location (up to 1.3x). Working in `src/api/`? API-related memories surface first.
|
|
56
|
+
- **OpenCode integration.** `hippo hook install opencode` patches AGENTS.md. Auto-detected during `hippo init`. Integration guide with MCP config and skill for progressive discovery.
|
|
57
|
+
- **`hippo export`** outputs all memories as JSON or markdown.
|
|
58
|
+
- **Decision recall boost.** 1.2x scoring multiplier for decision-tagged memories so they surface despite low retrieval frequency.
|
|
59
|
+
|
|
60
|
+
### What's new in v0.9.1
|
|
61
|
+
|
|
62
|
+
- **Auto-sleep on session exit.** `hippo hook install claude-code` now installs a Stop hook in `~/.claude/settings.json` so `hippo sleep` runs automatically when Claude Code exits. `hippo init` does this too when Claude Code is detected. No cron needed, no manual sleep.
|
|
63
|
+
|
|
46
64
|
### What's new in v0.9.0
|
|
47
65
|
|
|
48
66
|
- **Working memory layer** (`hippo wm push/read/clear/flush`). Bounded buffer (max 20 per scope) with importance-based eviction. Current-state notes live separately from long-term memory.
|
|
@@ -76,7 +94,7 @@ hippo init
|
|
|
76
94
|
# Auto-installed claude-code hook in CLAUDE.md
|
|
77
95
|
```
|
|
78
96
|
|
|
79
|
-
If you have a `CLAUDE.md`, it patches it. `AGENTS.md` for Codex/OpenClaw. `.cursorrules` for Cursor. No manual `hook install` needed. Your agent starts using Hippo on its next session.
|
|
97
|
+
If you have a `CLAUDE.md`, it patches it. `AGENTS.md` for Codex/OpenClaw/OpenCode. `.cursorrules` for Cursor. No manual `hook install` needed. Your agent starts using Hippo on its next session.
|
|
80
98
|
|
|
81
99
|
It also sets up a daily cron job (6:15am) that runs `hippo learn --git` and `hippo sleep` automatically. Memories get captured from your commits and consolidated every day without you thinking about it.
|
|
82
100
|
|
|
@@ -274,6 +292,44 @@ hippo recall "cache issues" # again next week
|
|
|
274
292
|
|
|
275
293
|
---
|
|
276
294
|
|
|
295
|
+
### Active invalidation
|
|
296
|
+
|
|
297
|
+
When you migrate from one tool to another, old memories about the replaced tool should die immediately. Hippo detects migration and breaking-change commits during `hippo learn --git` and actively weakens matching memories.
|
|
298
|
+
|
|
299
|
+
```bash
|
|
300
|
+
hippo learn --git
|
|
301
|
+
# feat: migrate from webpack to vite
|
|
302
|
+
# Invalidated 3 memories referencing "webpack"
|
|
303
|
+
# Learned: migrate from webpack to vite
|
|
304
|
+
```
|
|
305
|
+
|
|
306
|
+
You can also invalidate manually:
|
|
307
|
+
|
|
308
|
+
```bash
|
|
309
|
+
hippo invalidate "REST API" --reason "migrated to GraphQL"
|
|
310
|
+
# Invalidated 5 memories referencing "REST API".
|
|
311
|
+
```
|
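As a rough illustration of the detection step described above, the sketch below pulls the old and new tool names out of a commit subject and weakens memories that mention the old one. The regex, the `detectMigration` and `weaken` names, the `Memory` shape, and the 0.5 weakening factor are all assumptions made for this example; the package's real heuristics ship in `dist/invalidation.js` and may differ.

```typescript
// Hypothetical sketch: not hippo-memory's actual invalidation code.
interface Migration {
  from: string;
  to: string;
}

interface Memory {
  id: string;
  content: string;
  strength: number;
}

// Match subjects such as "feat: migrate from webpack to vite".
function detectMigration(commitSubject: string): Migration | null {
  const pattern = /\bmigrat(?:e|ed|es|ing)\s+from\s+(\S.*?)\s+to\s+(\S.*)$/i;
  const match = pattern.exec(commitSubject.trim());
  if (!match) return null;
  const [, from = "", to = ""] = match;
  return { from, to };
}

// Multiply the strength of every memory that mentions the old pattern.
// The 0.5 default is an assumed value, chosen only for illustration.
function weaken(memories: Memory[], oldPattern: string, factor = 0.5): Memory[] {
  const needle = oldPattern.toLowerCase();
  return memories.map((memory) =>
    memory.content.toLowerCase().includes(needle)
      ? { ...memory, strength: memory.strength * factor }
      : memory,
  );
}

console.log(detectMigration("feat: migrate from webpack to vite"));
// { from: 'webpack', to: 'vite' }
```

Case-insensitive substring matching keeps the sketch short; a real implementation would likely need word boundaries to avoid weakening unrelated memories.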
|
312
|
+
|
|
313
|
+
---
|
|
314
|
+
|
|
315
|
+
### Architectural decisions
|
|
316
|
+
|
|
317
|
+
One-off decisions don't repeat, so they can't earn their keep through retrieval alone. `hippo decide` stores them with a 90-day half-life and verified confidence so they survive long enough to matter.
|
|
318
|
+
|
|
319
|
+
```bash
|
|
320
|
+
hippo decide "Use PostgreSQL for all new services" --context "JSONB support"
|
|
321
|
+
# Decision recorded: mem_a1b2c3
|
|
322
|
+
|
|
323
|
+
# Later, when the decision changes:
|
|
324
|
+
hippo decide "Use CockroachDB for global services" \
|
|
325
|
+
--context "Need multi-region" \
|
|
326
|
+
--supersedes mem_a1b2c3
|
|
327
|
+
# Superseded mem_a1b2c3 (half-life halved, marked stale)
|
|
328
|
+
# Decision recorded: mem_d4e5f6
|
|
329
|
+
```
|
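To make the half-life language above concrete, here is a minimal sketch of exponential decay with a 90-day half-life and of the "half-life halved, marked stale" effect of `--supersedes`. The `Decision` shape and the function names are hypothetical, and treating decay as a pure exponential is an assumption; Hippo's stored fields and strength formula may differ.

```typescript
// Hypothetical sketch: field and function names are not from hippo-memory.
interface Decision {
  id: string;
  text: string;
  halfLifeDays: number;
  stale: boolean;
}

// Fraction of the original strength left after ageDays,
// assuming plain exponential decay.
function strengthAt(ageDays: number, halfLifeDays: number): number {
  return Math.pow(0.5, ageDays / halfLifeDays);
}

// Superseding halves the old decision's half-life and marks it stale.
function supersede(old: Decision): Decision {
  return { ...old, halfLifeDays: old.halfLifeDays / 2, stale: true };
}

const postgres: Decision = {
  id: "mem_a1b2c3",
  text: "Use PostgreSQL for all new services",
  halfLifeDays: 90,
  stale: false,
};
const superseded = supersede(postgres);

console.log(strengthAt(90, postgres.halfLifeDays));   // 0.5
console.log(strengthAt(90, superseded.halfLifeDays)); // 0.25
```

Under these assumptions a superseded decision fades twice as quickly instead of being deleted outright.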
|
330
|
+
|
|
331
|
+
---
|
|
332
|
+
|
|
277
333
|
### Error memories stick
|
|
278
334
|
|
|
279
335
|
Tag a memory as an error and it gets 2x the half-life automatically.
|
|
@@ -373,14 +429,15 @@ hippo recall "why is the gold model broken"
|
|
|
373
429
|
|
|
374
430
|
hippo outcome --good
|
|
375
431
|
# Applied positive outcome to 3 memories
|
|
376
|
-
#
|
|
432
|
+
# reward factor increases, decay slows
|
|
377
433
|
|
|
378
434
|
hippo outcome --bad
|
|
379
435
|
# Applied negative outcome to 3 memories
|
|
380
|
-
#
|
|
381
|
-
# irrelevant memories decay faster
|
|
436
|
+
# reward factor decreases, decay accelerates
|
|
382
437
|
```
|
|
383
438
|
|
|
439
|
+
Outcomes are cumulative. A memory with 5 positive outcomes and 0 negative has a reward factor of ~1.42, making its effective half-life 42% longer. A memory with 0 positive and 3 negative has a factor of ~0.63, shrinking its effective half-life to about 63% of the base, so it decays roughly 1.6x as fast. Mixed outcomes converge toward neutral (1.0).
|
|
440
|
+
|
|
384
441
|
---
|
|
385
442
|
|
|
386
443
|
### Token budgets
|
|
@@ -502,8 +559,13 @@ hippo watch "npm run build"
|
|
|
502
559
|
| `hippo share --auto --dry-run` | Preview what would be shared |
|
|
503
560
|
| `hippo peers` | List projects contributing to global store |
|
|
504
561
|
| `hippo sync` | Pull global memories into local project |
|
|
562
|
+
| `hippo invalidate "<pattern>"` | Actively weaken memories matching an old pattern |
|
|
563
|
+
| `hippo invalidate "<pattern>" --reason "<why>"` | Include what replaced it |
|
|
564
|
+
| `hippo decide "<decision>"` | Record architectural decision (90-day half-life) |
|
|
565
|
+
| `hippo decide "<decision>" --context "<why>"` | Include reasoning |
|
|
566
|
+
| `hippo decide "<decision>" --supersedes <id>` | Supersede a previous decision |
|
|
505
567
|
| `hippo hook list` | Show available framework hooks |
|
|
506
|
-
| `hippo hook install <target>` | Install hook (claude-code
|
|
568
|
+
| `hippo hook install <target>` | Install hook (claude-code also adds Stop hook for auto-sleep) |
|
|
507
569
|
| `hippo hook uninstall <target>` | Remove hook |
|
|
508
570
|
| `hippo handoff create --summary "..."` | Create a session handoff |
|
|
509
571
|
| `hippo handoff latest` | Show the most recent handoff |
|
|
@@ -529,10 +591,11 @@ hippo watch "npm run build"
|
|
|
529
591
|
|
|
530
592
|
| Framework | Detected by | Patches |
|
|
531
593
|
|-----------|------------|---------|
|
|
532
|
-
| Claude Code | `CLAUDE.md` or `.claude/settings.json` | `CLAUDE.md` |
|
|
594
|
+
| Claude Code | `CLAUDE.md` or `.claude/settings.json` | `CLAUDE.md` + Stop hook in `settings.json` |
|
|
533
595
|
| Codex | `AGENTS.md` or `.codex` | `AGENTS.md` |
|
|
534
596
|
| Cursor | `.cursorrules` or `.cursor/rules` | `.cursorrules` |
|
|
535
597
|
| OpenClaw | `.openclaw` or `AGENTS.md` | `AGENTS.md` |
|
|
598
|
+
| OpenCode | `.opencode/` or `opencode.json` | `AGENTS.md` |
|
|
536
599
|
|
|
537
600
|
No extra commands needed. Just `hippo init` and your agent knows about Hippo.
|
|
538
601
|
|
|
@@ -541,10 +604,11 @@ No extra commands needed. Just `hippo init` and your agent knows about Hippo.
|
|
|
541
604
|
If you prefer explicit control:
|
|
542
605
|
|
|
543
606
|
```bash
|
|
544
|
-
hippo hook install claude-code # patches CLAUDE.md
|
|
607
|
+
hippo hook install claude-code # patches CLAUDE.md + adds Stop hook to settings.json
|
|
545
608
|
hippo hook install codex # patches AGENTS.md
|
|
546
609
|
hippo hook install cursor # patches .cursorrules
|
|
547
610
|
hippo hook install openclaw # patches AGENTS.md
|
|
611
|
+
hippo hook install opencode # patches AGENTS.md
|
|
548
612
|
```
|
|
549
613
|
|
|
550
614
|
This adds a `<!-- hippo:start -->` ... `<!-- hippo:end -->` block that tells the agent to:
|
|
@@ -552,6 +616,8 @@ This adds a `<!-- hippo:start -->` ... `<!-- hippo:end -->` block that tells the
|
|
|
552
616
|
2. Run `hippo remember "<lesson>" --error` on errors
|
|
553
617
|
3. Run `hippo outcome --good` on completion
|
|
554
618
|
|
|
619
|
+
For Claude Code, it also adds a Stop hook to `~/.claude/settings.json` so `hippo sleep` runs automatically when the session exits.
|
|
620
|
+
|
|
555
621
|
To remove: `hippo hook uninstall claude-code`
|
|
556
622
|
|
|
557
623
|
### What the hook adds (Claude Code example)
|
|
@@ -630,32 +696,100 @@ The 7 mechanisms in full: [PLAN.md#core-principles](PLAN.md#core-principles)
|
|
|
630
696
|
|
|
631
697
|
For how these mechanisms connect to LLM training, continual learning, and open research problems: **[RESEARCH.md](RESEARCH.md)**
|
|
632
698
|
|
|
699
|
+
**Why does reward modulate decay?** In spiking neural networks, reward-modulated STDP strengthens synapses that contribute to positive outcomes and weakens those that don't. Hippo's reward-proportional decay (v0.11.0) implements this: memories with consistent positive outcomes decay slower, negatives decay faster, with no fixed deltas. Inspired by [MH-FLOCKE](https://github.com/MarcHesse/mhflocke)'s R-STDP architecture for quadruped locomotion, where the same mechanism produces stable learning with 11.6x lower variance than PPO.
|
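One way to picture reward-proportional decay is as a multiplier on the half-life. The formula below is a reconstruction, not taken from the package: it was chosen because it reproduces the figures quoted elsewhere in this README (5 positive and 0 negative gives about 1.42, 0 positive and 3 negative gives 0.625, which the README rounds to ~0.63) and stays within the stated 0.5x to 1.5x range. The function names are hypothetical.

```typescript
// Hypothetical reconstruction: this formula is an assumption that happens to
// match the README's quoted numbers; it is not hippo-memory's source code.
function rewardFactor(positive: number, negative: number): number {
  // Net outcome signal in (-1, 1). The +1 in the denominator makes
  // zero outcomes neutral and keeps a single outcome from saturating.
  const net = (positive - negative) / (positive + negative + 1);
  // Map to (0.5, 1.5), with 1.0 as neutral.
  return 1 + 0.5 * net;
}

// The factor scales the half-life: above 1 decays slower, below 1 faster.
function effectiveHalfLifeDays(
  baseDays: number,
  positive: number,
  negative: number,
): number {
  return baseDays * rewardFactor(positive, negative);
}

console.log(rewardFactor(5, 0).toFixed(2));     // 1.42
console.log(rewardFactor(3, 3));                // 1
console.log(effectiveHalfLifeDays(90, 0, 3));   // 56.25
```

Because the factor multiplies the half-life, a factor of 0.5 corresponds to decaying twice as fast and 1.5 to decaying 1.5x slower, which matches the bounds given in the v0.11.0 notes.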
|
700
|
+
|
|
701
|
+
**Prior art in agent memory simulation.** The idea that human-like memory produces human-like behavior as an emergent property was explored in IEEE research from 2010-2011 ([5952114](https://ieeexplore.ieee.org/document/5952114), [5548405](https://ieeexplore.ieee.org/document/5548405), [5953964](https://ieeexplore.ieee.org/document/5953964)). Walking between rooms and forgetting why you went there doesn't need direct simulation; it emerges naturally from a memory system with capacity limits and decay. Hippo's design follows the same principle: implement the mechanisms, and the behavior follows.
|
|
702
|
+
|
|
703
|
+
**Related work:** [HippoRAG](https://arxiv.org/abs/2405.14831) (Gutierrez et al., 2024) applies hippocampal indexing to RAG via knowledge graphs. [MemPalace](https://github.com/milla-jovovich/mempalace) (Sigman & Jovovich, 2026) organizes memory spatially (wings/halls/rooms) with AAAK compression, achieving 100% on [LongMemEval](https://arxiv.org/abs/2410.10813). [MH-FLOCKE](https://github.com/MarcHesse/mhflocke) (Hesse, 2026) uses spiking neurons with R-STDP for embodied cognition. Each system tackles a different facet: HippoRAG optimizes retrieval quality, MemPalace optimizes retrieval organization, MH-FLOCKE optimizes embodied learning, and Hippo optimizes memory lifecycle.
|
|
704
|
+
|
|
633
705
|
---
|
|
634
706
|
|
|
635
707
|
## Comparison
|
|
636
708
|
|
|
637
|
-
| Feature | Hippo | Mem0 | Basic Memory |
|
|
638
|
-
|
|
709
|
+
| Feature | Hippo | MemPalace | Mem0 | Basic Memory |
|
|
710
|
+
|---------|-------|-----------|------|-------------|
|
|
639
711
|
| Decay by default | Yes | No | No | No |
|
|
640
712
|
| Retrieval strengthening | Yes | No | No | No |
|
|
641
|
-
|
|
|
713
|
+
| Reward-proportional decay | Yes | No | No | No |
|
|
714
|
+
| Hybrid search (BM25 + embeddings) | Yes | Embeddings + spatial | Embeddings only | No |
|
|
642
715
|
| Schema acceleration | Yes | No | No | No |
|
|
643
716
|
| Conflict detection + resolution | Yes | No | No | No |
|
|
644
717
|
| Multi-agent shared memory | Yes | No | No | No |
|
|
645
718
|
| Transfer scoring | Yes | No | No | No |
|
|
646
719
|
| Outcome tracking | Yes | No | No | No |
|
|
647
720
|
| Confidence tiers | Yes | No | No | No |
|
|
721
|
+
| Spatial organization | No | Yes (wings/halls/rooms) | No | No |
|
|
722
|
+
| Lossless compression | No | Yes (AAAK, 30x) | No | No |
|
|
648
723
|
| Cross-tool import | Yes | No | No | No |
|
|
649
|
-
| Conversation capture | Yes | No | No | No |
|
|
650
724
|
| Auto-hook install | Yes | No | No | No |
|
|
651
|
-
| MCP server | Yes |
|
|
652
|
-
|
|
|
653
|
-
|
|
|
654
|
-
|
|
|
655
|
-
|
|
|
656
|
-
|
|
725
|
+
| MCP server | Yes | Yes | No | No |
|
|
726
|
+
| Zero dependencies | Yes | No (ChromaDB) | No | No |
|
|
727
|
+
| LongMemEval R@5 (retrieval) | 74.0% (BM25 only) | 96.6% (raw) / 100% (reranked) | ~49-85% | N/A |
|
|
728
|
+
| Git-friendly | Yes | No | No | Yes |
|
|
729
|
+
| Framework agnostic | Yes | Yes | Partial | Yes |
|
|
730
|
+
|
|
731
|
+
Different tools answer different questions. Mem0 and Basic Memory implement "save everything, search later." MemPalace implements "store everything, organize spatially for retrieval." Hippo implements "forget by default, earn persistence through use." These are complementary approaches: MemPalace's retrieval precision + Hippo's lifecycle management would be stronger than either alone.
|
|
732
|
+
|
|
733
|
+
---
|
|
734
|
+
|
|
735
|
+
## Benchmarks
|
|
736
|
+
|
|
737
|
+
Two benchmarks, each testing a different thing: retrieval accuracy and agent improvement over time. Full details in [`benchmarks/`](benchmarks/).
|
|
738
|
+
|
|
739
|
+
### LongMemEval (retrieval accuracy)
|
|
740
|
+
|
|
741
|
+
[LongMemEval](https://arxiv.org/abs/2410.10813) (ICLR 2025) is the industry-standard benchmark: 500 questions across 5 memory abilities, embedded in 115k+ token chat histories.
|
|
742
|
+
|
|
743
|
+
**Hippo v0.11.0 results (BM25 only, zero dependencies):**
|
|
744
|
+
|
|
745
|
+
| Metric | Score |
|
|
746
|
+
|--------|-------|
|
|
747
|
+
| Recall@1 | 50.4% |
|
|
748
|
+
| Recall@3 | 66.6% |
|
|
749
|
+
| Recall@5 | 74.0% |
|
|
750
|
+
| Recall@10 | 82.6% |
|
|
751
|
+
| Answer in content@5 | 46.6% |
|
|
752
|
+
|
|
753
|
+
| Question Type | Count | R@5 |
|
|
754
|
+
|---------------|-------|-----|
|
|
755
|
+
| single-session-assistant | 56 | 94.6% |
|
|
756
|
+
| knowledge-update | 78 | 88.5% |
|
|
757
|
+
| temporal-reasoning | 133 | 73.7% |
|
|
758
|
+
| multi-session | 133 | 72.2% |
|
|
759
|
+
| single-session-user | 70 | 65.7% |
|
|
760
|
+
| single-session-preference | 30 | 26.7% |
|
|
657
761
|
|
|
658
|
-
|
|
762
|
+
For context: MemPalace scores 96.6% (raw) using ChromaDB embeddings + spatial indexing. Hippo achieves 74.0% using BM25 keyword matching alone with zero runtime dependencies. Adding embeddings via `hippo embed` (optional `@xenova/transformers` peer dep) enables hybrid search and is expected to narrow the gap.
|
|
763
|
+
|
|
764
|
+
Hippo's strongest categories (knowledge-update 88.5%, single-session-assistant 94.6%) are the ones where keyword overlap between question and stored content is highest. The weakest (preference 26.7%) involves indirect references that need semantic understanding.
|
|
765
|
+
|
|
766
|
+
```bash
|
|
767
|
+
cd benchmarks/longmemeval
|
|
768
|
+
python ingest_direct.py --data data/longmemeval_oracle.json --store-dir ./store
|
|
769
|
+
python retrieve_fast.py --data data/longmemeval_oracle.json --store-dir ./store --output results/retrieval.jsonl
|
|
770
|
+
python evaluate_retrieval.py --retrieval results/retrieval.jsonl --data data/longmemeval_oracle.json
|
|
771
|
+
```
|
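For readers unfamiliar with the metric, the sketch below shows one common definition of Recall@k (a question counts as a hit if any gold item appears in the top k results) and checks that the per-type R@5 figures above average out to the overall 74.0%. The `RetrievalResult` shape and `recallAtK` name are assumptions for illustration; the benchmark's own `evaluate_retrieval.py` may define the metric differently.

```typescript
// Hypothetical sketch of a hit-rate style Recall@k; not the benchmark's code.
interface RetrievalResult {
  goldIds: string[];   // IDs that contain the answer
  rankedIds: string[]; // retrieved IDs, best first
}

function recallAtK(results: RetrievalResult[], k: number): number {
  if (results.length === 0) return 0;
  const hits = results.filter((r) =>
    r.rankedIds.slice(0, k).some((id) => r.goldIds.includes(id)),
  ).length;
  return hits / results.length;
}

// Per-type rows copied from the table above: [question count, R@5].
const perType: Array<[number, number]> = [
  [56, 0.946],
  [78, 0.885],
  [133, 0.737],
  [133, 0.722],
  [70, 0.657],
  [30, 0.267],
];
const totalQuestions = perType.reduce((sum, [count]) => sum + count, 0);
const overallR5 =
  perType.reduce((sum, [count, r5]) => sum + count * r5, 0) / totalQuestions;

console.log(totalQuestions, (overallR5 * 100).toFixed(1)); // 500 74.0
```

The count-weighted average of the six question types lands on the reported overall R@5, so the two tables are consistent with each other.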
|
772
|
+
|
|
773
|
+
### Sequential Learning Benchmark (agent improvement over time)
|
|
774
|
+
|
|
775
|
+
No other public benchmark tests whether memory systems produce learning curves. LongMemEval tests retrieval on a fixed corpus. This benchmark tests whether an agent with memory *performs better on task 40 than task 5*.
|
|
776
|
+
|
|
777
|
+
50 tasks, 10 trap categories, each appearing 2-3 times across the sequence.
|
|
778
|
+
|
|
779
|
+
**Hippo v0.11.0 results (trap-hit rate, lower is better):**
|
|
780
|
+
|
|
781
|
+
| Condition | Overall | Early | Mid | Late | Learns? |
|
|
782
|
+
|-----------|---------|-------|-----|------|---------|
|
|
783
|
+
| No memory | 100% | 100% | 100% | 100% | No |
|
|
784
|
+
| Static memory | 20% | 33% | 11% | 14% | No |
|
|
785
|
+
| Hippo | 40% | 78% | 22% | 14% | Yes |
|
|
786
|
+
|
|
787
|
+
Hippo's trap-hit rate drops from 78% (early) to 14% (late) as it accumulates error memories with 2x half-life. Static pre-loaded memory helps from the start but doesn't improve over the sequence. Any memory system can run this benchmark by implementing the [adapter interface](benchmarks/sequential-learning/adapters/interface.mjs).
|
|
788
|
+
|
|
789
|
+
```bash
|
|
790
|
+
cd benchmarks/sequential-learning
|
|
791
|
+
node run.mjs --adapter all
|
|
792
|
+
```
|
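The Early/Mid/Late columns can be read as trap-hit rates over successive parts of the task sequence. The sketch below shows how such rates could be computed. Splitting the sequence into equal thirds, the `TrapEncounter` shape, and the toy input are all assumptions for illustration; the benchmark's actual phase boundaries and data format are defined in `benchmarks/sequential-learning/` and may differ.

```typescript
// Hypothetical sketch: phase boundaries and data shapes are assumed.
type Phase = "early" | "mid" | "late";

interface TrapEncounter {
  taskIndex: number; // 0-based position in the task sequence
  hit: boolean;      // true if the agent fell into the trap
}

function phaseOf(taskIndex: number, totalTasks: number): Phase {
  const fraction = taskIndex / totalTasks;
  if (fraction < 1 / 3) return "early";
  if (fraction < 2 / 3) return "mid";
  return "late";
}

function trapHitRates(
  encounters: TrapEncounter[],
  totalTasks: number,
): Record<Phase, number> {
  const counts: Record<Phase, { hits: number; total: number }> = {
    early: { hits: 0, total: 0 },
    mid: { hits: 0, total: 0 },
    late: { hits: 0, total: 0 },
  };
  for (const encounter of encounters) {
    const bucket = counts[phaseOf(encounter.taskIndex, totalTasks)];
    bucket.total += 1;
    if (encounter.hit) bucket.hits += 1;
  }
  const rate = (phase: Phase): number =>
    counts[phase].total === 0 ? 0 : counts[phase].hits / counts[phase].total;
  return { early: rate("early"), mid: rate("mid"), late: rate("late") };
}

// Made-up encounters, only to show the shape of the calculation.
const toyRun: TrapEncounter[] = [
  { taskIndex: 0, hit: true },
  { taskIndex: 5, hit: true },
  { taskIndex: 20, hit: false },
  { taskIndex: 25, hit: true },
  { taskIndex: 40, hit: false },
  { taskIndex: 45, hit: false },
];
console.log(trapHitRates(toyRun, 50)); // { early: 1, mid: 0.5, late: 0 }
```

A falling rate from early to late is what the "Learns?" column summarizes.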
|
659
793
|
|
|
660
794
|
---
|
|
661
795
|
|
|
@@ -664,10 +798,13 @@ Mem0, Basic Memory, and Claude-Mem all implement "save everything, search later.
|
|
|
664
798
|
Issues and PRs welcome. Before contributing, run `hippo status` in the repo root to see the project's own memory.
|
|
665
799
|
|
|
666
800
|
The interesting problems:
|
|
801
|
+
- **Improve LongMemEval score.** Current R@5 is 74.0% with BM25 only. Adding embeddings (`hippo embed`) and hybrid search is expected to narrow the gap toward MemPalace's 96.6%.
|
|
667
802
|
- Better consolidation heuristics (LLM-powered merge vs current text overlap)
|
|
668
803
|
- Web UI / dashboard for visualizing decay curves and memory health
|
|
669
804
|
- Optimal decay parameter tuning from real usage data
|
|
670
805
|
- Cross-agent transfer learning evaluation
|
|
806
|
+
- **MemPalace-style spatial organization.** Could spatial structure (wings/halls/rooms) improve hippo's semantic layer?
|
|
807
|
+
- **AAAK-style compression for semantic memories.** Lossless token compression for context injection.
|
|
671
808
|
|
|
672
809
|
## License
|
|
673
810
|
|
package/dist/cli.d.ts
CHANGED
package/dist/cli.d.ts.map
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"cli.d.ts","sourceRoot":"","sources":["../src/cli.ts"],"names":[],"mappings":";AACA
|
|
1
|
+
{"version":3,"file":"cli.d.ts","sourceRoot":"","sources":["../src/cli.ts"],"names":[],"mappings":";AACA;;;;;;;;;;;;;;;;;;;;;;;;GAwBG"}
|