npm - hippo-memory - Versions diffs - 0.9.0 → 0.11.0 - Mend

hippo-memory 0.9.0 → 0.11.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (30)

package/README.md +156 -19
package/dist/cli.d.ts +1 -0
package/dist/cli.d.ts.map +1 -1
package/dist/cli.js +280 -4
package/dist/cli.js.map +1 -1
package/dist/db.d.ts.map +1 -1
package/dist/db.js +12 -1
package/dist/db.js.map +1 -1
package/dist/invalidation.d.ts +23 -0
package/dist/invalidation.d.ts.map +1 -0
package/dist/invalidation.js +94 -0
package/dist/invalidation.js.map +1 -0
package/dist/memory.d.ts +27 -1
package/dist/memory.d.ts.map +1 -1
package/dist/memory.js +45 -7
package/dist/memory.js.map +1 -1
package/dist/path-context.d.ts +12 -0
package/dist/path-context.d.ts.map +1 -0
package/dist/path-context.js +32 -0
package/dist/path-context.js.map +1 -0
package/dist/search.d.ts.map +1 -1
package/dist/search.js +20 -1
package/dist/search.js.map +1 -1
package/dist/store.d.ts.map +1 -1
package/dist/store.js +14 -5
package/dist/store.js.map +1 -1
package/extensions/openclaw-plugin/openclaw.plugin.json +1 -1
package/extensions/openclaw-plugin/package.json +1 -1
package/openclaw.plugin.json +1 -1
package/package.json +1 -1

package/README.md CHANGED

@@ -6,7 +6,7 @@
 [![license](https://img.shields.io/badge/license-MIT-blue)](./LICENSE)
 ```
-Works with:  Claude Code, Codex, Cursor, OpenClaw, any CLI agent
+Works with:  Claude Code, Codex, Cursor, OpenClaw, OpenCode, any CLI agent
 Imports from: ChatGPT, Claude (CLAUDE.md), Cursor (.cursorrules), any markdown
 Storage:     SQLite backbone + markdown/YAML mirrors. Git-trackable and human-readable.
 Dependencies: Zero runtime deps. Requires Node.js 22.5+. Optional embeddings via @xenova/transformers.
@@ -43,6 +43,24 @@ hippo recall "data pipeline issues" --budget 2000
 That's it. You have a memory system.
+### What's new in v0.11.0
+- **Reward-proportional decay.** Outcome feedback now modulates decay rate continuously instead of fixed half-life deltas. Memories with consistent positive outcomes decay up to 1.5x slower; consistent negatives decay up to 2x faster. Mixed outcomes converge toward neutral. Inspired by R-STDP in spiking neural networks. `hippo inspect` now shows cumulative outcome counts and the computed reward factor.
+- **Public benchmarks.** Two benchmarks in `benchmarks/`: a [Sequential Learning Benchmark](benchmarks/sequential-learning/) (50 tasks, 10 traps, measures agent improvement over time) and a [LongMemEval integration](benchmarks/longmemeval/) (industry-standard 500-question retrieval benchmark, R@5=74.0% with BM25 only). The sequential learning benchmark is unique: no other public benchmark tests whether memory systems produce learning curves.
+### What's new in v0.10.0
+- **Active invalidation.** `hippo learn --git` detects migration and breaking-change commits and actively weakens memories referencing the old pattern. Manual invalidation via `hippo invalidate "REST API" --reason "migrated to GraphQL"`.
+- **Architectural decisions.** `hippo decide` stores one-off decisions with 90-day half-life and verified confidence. Supports `--context` for reasoning and `--supersedes` to chain decisions when the architecture evolves.
+- **Path-based memory triggers.** Memories auto-tagged with `path:<segment>` from your working directory. Recall boosts memories from the same location (up to 1.3x). Working in `src/api/`? API-related memories surface first.
+- **OpenCode integration.** `hippo hook install opencode` patches AGENTS.md. Auto-detected during `hippo init`. Integration guide with MCP config and skill for progressive discovery.
+- **`hippo export`** outputs all memories as JSON or markdown.
+- **Decision recall boost.** 1.2x scoring multiplier for decision-tagged memories so they surface despite low retrieval frequency.
+### What's new in v0.9.1
+- **Auto-sleep on session exit.** `hippo hook install claude-code` now installs a Stop hook in `~/.claude/settings.json` so `hippo sleep` runs automatically when Claude Code exits. `hippo init` does this too when Claude Code is detected. No cron needed, no manual sleep.
 ### What's new in v0.9.0
 - **Working memory layer** (`hippo wm push/read/clear/flush`). Bounded buffer (max 20 per scope) with importance-based eviction. Current-state notes live separately from long-term memory.
@@ -76,7 +94,7 @@ hippo init
 #    Auto-installed claude-code hook in CLAUDE.md
 ```
-If you have a `CLAUDE.md`, it patches it. `AGENTS.md` for Codex/OpenClaw. `.cursorrules` for Cursor. No manual `hook install` needed. Your agent starts using Hippo on its next session.
+If you have a `CLAUDE.md`, it patches it. `AGENTS.md` for Codex/OpenClaw/OpenCode. `.cursorrules` for Cursor. No manual `hook install` needed. Your agent starts using Hippo on its next session.
 It also sets up a daily cron job (6:15am) that runs `hippo learn --git` and `hippo sleep` automatically. Memories get captured from your commits and consolidated every day without you thinking about it.
@@ -274,6 +292,44 @@ hippo recall "cache issues"   # again next week
 ---
+### Active invalidation
+When you migrate from one tool to another, old memories about the replaced tool should die immediately. Hippo detects migration and breaking-change commits during `hippo learn --git` and actively weakens matching memories.
+```bash
+hippo learn --git
+# feat: migrate from webpack to vite
+#    Invalidated 3 memories referencing "webpack"
+#    Learned: migrate from webpack to vite
+```
+You can also invalidate manually:
+```bash
+hippo invalidate "REST API" --reason "migrated to GraphQL"
+# Invalidated 5 memories referencing "REST API".
+```
+---
+### Architectural decisions
+One-off decisions don't repeat, so they can't earn their keep through retrieval alone. `hippo decide` stores them with a 90-day half-life and verified confidence so they survive long enough to matter.
+```bash
+hippo decide "Use PostgreSQL for all new services" --context "JSONB support"
+# Decision recorded: mem_a1b2c3
+# Later, when the decision changes:
+hippo decide "Use CockroachDB for global services" \
+  --context "Need multi-region" \
+  --supersedes mem_a1b2c3
+# Superseded mem_a1b2c3 (half-life halved, marked stale)
+# Decision recorded: mem_d4e5f6
+```
+---
 ### Error memories stick
 Tag a memory as an error and it gets 2x the half-life automatically.
@@ -373,14 +429,15 @@ hippo recall "why is the gold model broken"
 hippo outcome --good
 # Applied positive outcome to 3 memories
-# half_life +5d on each
+# reward factor increases, decay slows
 hippo outcome --bad
 # Applied negative outcome to 3 memories
-# half_life -3d on each
-# irrelevant memories decay faster
+# reward factor decreases, decay accelerates
 ```
+Outcomes are cumulative. A memory with 5 positive outcomes and 0 negative has a reward factor of ~1.42, making its effective half-life 42% longer. A memory with 0 positive and 3 negative has a factor of ~0.63, decaying nearly twice as fast. Mixed outcomes converge toward neutral (1.0).
 ---
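The two figures quoted above (about 1.42 for 5 positive and 0 negative outcomes, about 0.63 for 0 positive and 3 negative) are consistent with a simple smoothed ratio. The sketch below is a hypothetical reconstruction, not code taken from hippo: the function names `rewardFactor` and `effectiveHalfLifeDays`, the `+ 1` smoothing term, and the `0.5` scale are all assumptions chosen only because they reproduce the README's numbers and its stated bounds (up to 1.5x slower, up to 2x faster).

```typescript
// Hypothetical reconstruction: NOT hippo's actual implementation.
// Assumed formula: factor = 1 + 0.5 * (pos - neg) / (pos + neg + 1)
// The "+ 1" keeps the factor at 1.0 with no outcomes and makes it
// approach, but never reach, the bounds 0.5 and 1.5.
function rewardFactor(positive: number, negative: number): number {
  if (positive < 0 || negative < 0) {
    throw new RangeError("outcome counts must be non-negative");
  }
  const net = (positive - negative) / (positive + negative + 1);
  return 1 + 0.5 * net;
}

// Assumed usage: the factor scales the base half-life directly, so a
// factor of 1.42 means a half-life that is 42% longer.
function effectiveHalfLifeDays(
  baseHalfLifeDays: number,
  positive: number,
  negative: number,
): number {
  return baseHalfLifeDays * rewardFactor(positive, negative);
}

console.log(rewardFactor(5, 0).toFixed(2)); // 1.42
console.log(rewardFactor(0, 3).toFixed(3)); // 0.625
console.log(rewardFactor(2, 2)); // 1
```

Under this assumed formula, mixed outcomes cancel in the numerator, which is one way to get the "converge toward neutral" behaviour the README describes.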
 ### Token budgets
@@ -502,8 +559,13 @@ hippo watch "npm run build"
 | `hippo share --auto --dry-run` | Preview what would be shared |
 | `hippo peers` | List projects contributing to global store |
 | `hippo sync` | Pull global memories into local project |
+| `hippo invalidate "<pattern>"` | Actively weaken memories matching an old pattern |
+| `hippo invalidate "<pattern>" --reason "<why>"` | Include what replaced it |
+| `hippo decide "<decision>"` | Record architectural decision (90-day half-life) |
+| `hippo decide "<decision>" --context "<why>"` | Include reasoning |
+| `hippo decide "<decision>" --supersedes <id>` | Supersede a previous decision |
 | `hippo hook list` | Show available framework hooks |
-| `hippo hook install <target>` | Install hook (claude-code, codex, cursor, openclaw) |
+| `hippo hook install <target>` | Install hook (claude-code also adds Stop hook for auto-sleep) |
 | `hippo hook uninstall <target>` | Remove hook |
 | `hippo handoff create --summary "..."` | Create a session handoff |
 | `hippo handoff latest` | Show the most recent handoff |
@@ -529,10 +591,11 @@ hippo watch "npm run build"
 | Framework | Detected by | Patches |
 |-----------|------------|---------|
-| Claude Code | `CLAUDE.md` or `.claude/settings.json` | `CLAUDE.md` |
+| Claude Code | `CLAUDE.md` or `.claude/settings.json` | `CLAUDE.md` + Stop hook in `settings.json` |
 | Codex | `AGENTS.md` or `.codex` | `AGENTS.md` |
 | Cursor | `.cursorrules` or `.cursor/rules` | `.cursorrules` |
 | OpenClaw | `.openclaw` or `AGENTS.md` | `AGENTS.md` |
+| OpenCode | `.opencode/` or `opencode.json` | `AGENTS.md` |
 No extra commands needed. Just `hippo init` and your agent knows about Hippo.
@@ -541,10 +604,11 @@ No extra commands needed. Just `hippo init` and your agent knows about Hippo.
 If you prefer explicit control:
 ```bash
-hippo hook install claude-code   # patches CLAUDE.md
+hippo hook install claude-code   # patches CLAUDE.md + adds Stop hook to settings.json
 hippo hook install codex         # patches AGENTS.md
 hippo hook install cursor        # patches .cursorrules
 hippo hook install openclaw      # patches AGENTS.md
+hippo hook install opencode      # patches AGENTS.md
 ```
 This adds a `<!-- hippo:start -->` ... `<!-- hippo:end -->` block that tells the agent to:
@@ -552,6 +616,8 @@ This adds a `<!-- hippo:start -->` ... `<!-- hippo:end -->` block that tells the
 2. Run `hippo remember "<lesson>" --error` on errors
 3. Run `hippo outcome --good` on completion
+For Claude Code, it also adds a Stop hook to `~/.claude/settings.json` so `hippo sleep` runs automatically when the session exits.
 To remove: `hippo hook uninstall claude-code`
 ### What the hook adds (Claude Code example)
@@ -630,32 +696,100 @@ The 7 mechanisms in full: [PLAN.md#core-principles](PLAN.md#core-principles)
 For how these mechanisms connect to LLM training, continual learning, and open research problems: **[RESEARCH.md](RESEARCH.md)**
+**Why does reward modulate decay?** In spiking neural networks, reward-modulated STDP strengthens synapses that contribute to positive outcomes and weakens those that don't. Hippo's reward-proportional decay (v0.11.0) implements this: memories with consistent positive outcomes decay slower, negatives decay faster, with no fixed deltas. Inspired by [MH-FLOCKE](https://github.com/MarcHesse/mhflocke)'s R-STDP architecture for quadruped locomotion, where the same mechanism produces stable learning with 11.6x lower variance than PPO.
+**Prior art in agent memory simulation.** The idea that human-like memory produces human-like behavior as an emergent property was explored in IEEE research from 2010-2011 ([5952114](https://ieeexplore.ieee.org/document/5952114), [5548405](https://ieeexplore.ieee.org/document/5548405), [5953964](https://ieeexplore.ieee.org/document/5953964)). Walking between rooms and forgetting why you went there doesn't need direct simulation; it emerges naturally from a memory system with capacity limits and decay. Hippo's design follows the same principle: implement the mechanisms, and the behavior follows.
+**Related work:** [HippoRAG](https://arxiv.org/abs/2405.14831) (Gutierrez et al., 2024) applies hippocampal indexing to RAG via knowledge graphs. [MemPalace](https://github.com/milla-jovovich/mempalace) (Sigman & Jovovich, 2026) organizes memory spatially (wings/halls/rooms) with AAAK compression, achieving 100% on [LongMemEval](https://arxiv.org/abs/2410.10813). [MH-FLOCKE](https://github.com/MarcHesse/mhflocke) (Hesse, 2026) uses spiking neurons with R-STDP for embodied cognition. Each system tackles a different facet: HippoRAG optimizes retrieval quality, MemPalace optimizes retrieval organization, MH-FLOCKE optimizes embodied learning, and Hippo optimizes memory lifecycle.
 ---
 ## Comparison
-| Feature | Hippo | Mem0 | Basic Memory | Claude-Mem |
-|---------|-------|------|-------------|-----------|
+| Feature | Hippo | MemPalace | Mem0 | Basic Memory |
+|---------|-------|-----------|------|-------------|
 | Decay by default | Yes | No | No | No |
 | Retrieval strengthening | Yes | No | No | No |
-| Hybrid search (BM25 + embeddings) | Yes | Embeddings only | No | No |
+| Reward-proportional decay | Yes | No | No | No |
+| Hybrid search (BM25 + embeddings) | Yes | Embeddings + spatial | Embeddings only | No |
 | Schema acceleration | Yes | No | No | No |
 | Conflict detection + resolution | Yes | No | No | No |
 | Multi-agent shared memory | Yes | No | No | No |
 | Transfer scoring | Yes | No | No | No |
 | Outcome tracking | Yes | No | No | No |
 | Confidence tiers | Yes | No | No | No |
+| Spatial organization | No | Yes (wings/halls/rooms) | No | No |
+| Lossless compression | No | Yes (AAAK, 30x) | No | No |
 | Cross-tool import | Yes | No | No | No |
-| Conversation capture | Yes | No | No | No |
 | Auto-hook install | Yes | No | No | No |
-| MCP server | Yes | No | No | No |
-| Native plugins | OpenClaw + Claude Code | No | No | No |
-| Multi-repo git learn | Yes | No | No | No |
-| Zero dependencies | Yes | No | No | No |
-| Git-friendly | Yes | No | Yes | No |
-| Framework agnostic | Yes | Partial | Yes | No |
+| MCP server | Yes | Yes | No | No |
+| Zero dependencies | Yes | No (ChromaDB) | No | No |
+| LongMemEval R@5 (retrieval) | 74.0% (BM25 only) | 96.6% (raw) / 100% (reranked) | ~49-85% | N/A |
+| Git-friendly | Yes | No | No | Yes |
+| Framework agnostic | Yes | Yes | Partial | Yes |
+Different tools answer different questions. Mem0 and Basic Memory implement "save everything, search later." MemPalace implements "store everything, organize spatially for retrieval." Hippo implements "forget by default, earn persistence through use." These are complementary approaches: MemPalace's retrieval precision + Hippo's lifecycle management would be stronger than either alone.
+---
+## Benchmarks
+Two benchmarks testing two different things. Full details in [`benchmarks/`](benchmarks/).
+### LongMemEval (retrieval accuracy)
+[LongMemEval](https://arxiv.org/abs/2410.10813) (ICLR 2025) is the industry-standard benchmark: 500 questions across 5 memory abilities, embedded in 115k+ token chat histories.
+**Hippo v0.11.0 results (BM25 only, zero dependencies):**
+| Metric | Score |
+|--------|-------|
+| Recall@1 | 50.4% |
+| Recall@3 | 66.6% |
+| Recall@5 | 74.0% |
+| Recall@10 | 82.6% |
+| Answer in content@5 | 46.6% |
+| Question Type | Count | R@5 |
+|---------------|-------|-----|
+| single-session-assistant | 56 | 94.6% |
+| knowledge-update | 78 | 88.5% |
+| temporal-reasoning | 133 | 73.7% |
+| multi-session | 133 | 72.2% |
+| single-session-user | 70 | 65.7% |
+| single-session-preference | 30 | 26.7% |
-Mem0, Basic Memory, and Claude-Mem all implement "save everything, search later." Hippo implements all 7 hippocampal mechanisms: two-speed storage, decay, retrieval strengthening, schema acceleration, conflict detection, multi-agent transfer, and explicit working memory. It's the only tool that models what memories are worth keeping.
+For context: MemPalace scores 96.6% (raw) using ChromaDB embeddings + spatial indexing. Hippo achieves 74.0% using BM25 keyword matching alone with zero runtime dependencies. Adding embeddings via `hippo embed` (optional `@xenova/transformers` peer dep) enables hybrid search and should close the gap.
+Hippo's strongest categories (knowledge-update 88.5%, single-session-assistant 94.6%) are the ones where keyword overlap between question and stored content is highest. The weakest (preference 26.7%) involves indirect references that need semantic understanding.
+```bash
+cd benchmarks/longmemeval
+python ingest_direct.py --data data/longmemeval_oracle.json --store-dir ./store
+python retrieve_fast.py --data data/longmemeval_oracle.json --store-dir ./store --output results/retrieval.jsonl
+python evaluate_retrieval.py --retrieval results/retrieval.jsonl --data data/longmemeval_oracle.json
+```
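For readers unfamiliar with the Recall@k rows in the tables above, the following sketch shows one common way the metric is computed. It assumes the "any gold item in the top k" definition; the benchmark's own `evaluate_retrieval.py` may differ, and the `RetrievalResult` shape and `recallAtK` name are hypothetical, introduced only for illustration.

```typescript
// Illustrative only: the record shape and function name are assumptions,
// not the format used by the scripts in benchmarks/longmemeval.
interface RetrievalResult {
  questionId: string;
  rankedIds: string[]; // retrieved memory ids, best match first
  goldIds: string[]; // ids that contain the evidence for the answer
}

// Fraction of questions for which at least one gold id appears
// among the first k retrieved ids.
function recallAtK(results: RetrievalResult[], k: number): number {
  if (results.length === 0) return 0;
  let hits = 0;
  for (const r of results) {
    const topK = new Set(r.rankedIds.slice(0, k));
    if (r.goldIds.some((id) => topK.has(id))) hits += 1;
  }
  return hits / results.length;
}

const sample: RetrievalResult[] = [
  { questionId: "q1", rankedIds: ["a", "b", "c"], goldIds: ["a"] },
  { questionId: "q2", rankedIds: ["x", "y", "z"], goldIds: ["z"] },
];

console.log(recallAtK(sample, 1)); // 0.5
console.log(recallAtK(sample, 3)); // 1
```

With this definition Recall@k can only rise as k grows, which matches the monotonic increase from Recall@1 to Recall@10 in the results table.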
+### Sequential Learning Benchmark (agent improvement over time)
+No other public benchmark tests whether memory systems produce learning curves. LongMemEval tests retrieval on a fixed corpus. This benchmark tests whether an agent with memory *performs better on task 40 than task 5*.
+50 tasks, 10 trap categories, each appearing 2-3 times across the sequence.
+**Hippo v0.11.0 results:**
+| Condition | Overall | Early | Mid | Late | Learns? |
+|-----------|---------|-------|-----|------|---------|
+| No memory | 100% | 100% | 100% | 100% | No |
+| Static memory | 20% | 33% | 11% | 14% | No |
+| Hippo | 40% | 78% | 22% | 14% | Yes |
+The hippo agent's trap-hit rate drops from 78% to 14% as it accumulates error memories with 2x half-life. Static pre-loaded memory helps from the start but doesn't improve. Any memory system can run this benchmark by implementing the [adapter interface](benchmarks/sequential-learning/adapters/interface.mjs).
+```bash
+cd benchmarks/sequential-learning
+node run.mjs --adapter all
+```
 ---
@@ -664,10 +798,13 @@ Mem0, Basic Memory, and Claude-Mem all implement "save everything, search later.
 Issues and PRs welcome. Before contributing, run `hippo status` in the repo root to see the project's own memory.
 The interesting problems:
+- **Improve LongMemEval score.** Current R@5 is 74.0% with BM25 only. Adding embeddings (`hippo embed`) and hybrid search should close the gap toward MemPalace's 96.6%.
 - Better consolidation heuristics (LLM-powered merge vs current text overlap)
 - Web UI / dashboard for visualizing decay curves and memory health
 - Optimal decay parameter tuning from real usage data
 - Cross-agent transfer learning evaluation
+- **MemPalace-style spatial organization.** Could spatial structure (wings/halls/rooms) improve hippo's semantic layer?
+- **AAAK-style compression for semantic memories.** Lossless token compression for context injection.
 ## License

package/dist/cli.d.ts CHANGED

@@ -21,6 +21,7 @@
  *   hippo learn --git [--days <n>] [--repos <paths>]
  *   hippo promote <id>
  *   hippo sync
+ *   hippo decide "<decision>" [--context "<why>"] [--supersedes <id>]
  *   hippo wm <push|read|clear|flush>
  */
 export {};

package/dist/cli.d.ts.map CHANGED

@@ -1 +1 @@
-{"version":3,"file":"cli.d.ts","sourceRoot":"","sources":["../src/cli.ts"],"names":[],"mappings":";AACA;;;;;;;;;;;;;;;;;;;;;;;GAuBG"}
+{"version":3,"file":"cli.d.ts","sourceRoot":"","sources":["../src/cli.ts"],"names":[],"mappings":";AACA;;;;;;;;;;;;;;;;;;;;;;;;GAwBG"}