cozo-memory 1.2.6 → 1.2.10

package/README.md CHANGED
@@ -3,8 +3,14 @@
  [![npm](https://img.shields.io/npm/v/cozo-memory)](https://www.npmjs.com/package/cozo-memory)
  [![Node](https://img.shields.io/node/v/cozo-memory)](https://nodejs.org)
  [![License](https://img.shields.io/badge/license-Apache%202.0-blue)](LICENSE)
+ [![MCP Badge](https://lobehub.com/badge/mcp/tobs-code-cozo-memory)](https://lobehub.com/mcp/tobs-code-cozo-memory)
 
- **Local-first memory for Claude & AI agents with hybrid search, Graph-RAG, and time-travel – all in a single binary, no cloud, no Docker.**
+ > **Why Cozo Memory?**
+ > LLMs have short-term memory limits. Standard RAG retrieves documents but can't connect facts across time. Cozo Memory gives your AI agent **persistent, structured memory** – it remembers past conversations, infers relationships, detects contradictions, and explores its knowledge graph – fully on your machine, with **optional local LLM integration via Ollama** for intelligent actions (cleanup, reflection, summarization, agentic routing).
+ >
+ > Most memory stacks combine separate databases: SQLite for facts, Chroma for vector search, NetworkX for graphs. **CozoDB replaces all of that with one embedded engine**: relational, graph, vector, and full-text search in a single query language, one file, zero sync lag.
+
+ **Local-first memory for Claude & AI agents with hybrid search, Graph-RAG, and time-travel – runs entirely on your machine. Optional [Ollama](https://ollama.ai) integration enables LLM-powered actions (cleanup, reflect, summarize, agentic retrieval).**
 
  ## Table of Contents
 
@@ -51,7 +57,7 @@ Now add the server to your MCP client (e.g. Claude Desktop) – see [Integration
 
  ⏳ **Temporal Conflict Resolution** - Automatic detection and resolution of contradictory observations with semantic analysis and audit preservation
 
- 🏠 **100% Local** - Embeddings via ONNX/Transformers; no external services, no cloud, complete data ownership
+ 🏠 **100% Local** - Embeddings via ONNX/Transformers; data stays on your machine. Some advanced features (cleanup, reflect, summarize, agentic search) require an optional [Ollama](https://ollama.ai) service for local LLM inference — but the core search, CRUD, and graph operations work **without any LLM**.
 
  🧠 **Multi-Hop Reasoning** - Logic-aware graph traversal with vector pivots for deep relational reasoning
 
@@ -61,17 +67,36 @@ Now add the server to your MCP client (e.g. Claude Desktop) – see [Integration
 
  ## Positioning & Comparison
 
+ ### Why CozoDB instead of SQLite + Chroma + NetworkX?
+
+ A common first question is: *"Why not just combine existing tools?"*
+
+ | If you need... | Typical separate stack | CozoDB Memory |
+ | :--- | :--- | :--- |
+ | Structured data & relations | **SQLite** / PostgreSQL | ✅ Built-in relational engine |
+ | Semantic / vector search | **Chroma** / Qdrant / Pinecone | ✅ HNSW + FTS + RRF in one engine |
+ | Graph traversal & reasoning | **NetworkX** / Neo4j | ✅ Native graph queries + PageRank |
+ | Time-travel / versioning | Custom audit tables | ✅ Built-in `Validity` time-travel |
+ | Unified query language | Multiple APIs + glue code | ✅ Single Datalog query across all dimensions |
+
+ **The core insight:** Most memory stacks bolt vector search onto a graph DB, or graph search onto a vector DB. CozoDB is different: it is a **single engine** that natively combines relational, graph, vector, and full-text search. That means:
+
+ - **One query language** (Datalog) reaches every dimension.
+ - **No sync lag** between separate indexes.
+ - **No ETL bridge** between "vector results" and "graph expansion."
+ - **Smaller operational surface**: one database file, one process, one dependency chain.
+
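The `HNSW + FTS + RRF` entry in the table above refers to Reciprocal Rank Fusion, a standard way to merge ranked result lists from different retrievers. The following is a minimal TypeScript sketch of the general technique, not this project's implementation: the function name, the result IDs, and the constant `k = 60` (a value commonly used for RRF) are all assumptions made for illustration.

```typescript
// Reciprocal Rank Fusion (RRF): every ranked list contributes
// 1 / (k + rank) to an item's fused score, with 1-based ranks.
// k = 60 is a commonly used default; the value used by cozo-memory is not
// stated in the README, so treat it as an assumption.
type Fused = { id: string; score: number };

function reciprocalRankFusion(rankings: string[][], k: number = 60): Fused[] {
  const scores: Record<string, number> = {};
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // convert 0-based index to 1-based rank
      scores[id] = (scores[id] || 0) + 1 / (k + rank);
    });
  }
  // Highest fused score first.
  return Object.keys(scores)
    .map((id) => ({ id, score: scores[id] }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical result IDs from a vector (HNSW) search and a full-text search.
const vectorHits = ["alice", "bob", "carol"];
const keywordHits = ["bob", "dave", "alice"];

const fused = reciprocalRankFusion([vectorHits, keywordHits]);
console.log(fused.map((f) => f.id)); // → [ 'bob', 'alice', 'dave', 'carol' ]
```

Items that appear near the top of both lists (here `bob` and `alice`) outrank items that appear in only one, which is why rank fusion is a reasonable fit for combining semantic and keyword signals without having to normalize their raw scores.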
+ ### Comparison with other memory solutions
+
  Most "Memory" MCP servers fall into two categories:
  1. **Simple Knowledge Graphs**: CRUD operations on triples, often only text search
  2. **Pure Vector Stores**: Semantic search (RAG), but little understanding of complex relationships
 
- This server fills the gap in between ("Sweet Spot"): A **local, database-backed memory engine** combining vector, graph, and keyword signals.
-
- ### Comparison with other solutions
+ This server fills the gap in between ("Sweet Spot"): A **local, database-backed memory engine** combining vector, graph, and keyword signals — powered by CozoDB's unified engine rather than a patchwork of separate databases.
 
  | Feature | **CozoDB Memory (This Project)** | **Official Reference (`@modelcontextprotocol/server-memory`)** | **mcp-memory-service (Community)** | **Database Adapters (Qdrant/Neo4j)** |
  | :--- | :--- | :--- | :--- | :--- |
- | **Backend** | **CozoDB** (Graph + Vector + Relational) | JSON file (`memory.jsonl`) | SQLite / Cloudflare | Specialized DB (only Vector or Graph) |
+ | **Backend** | **CozoDB** (Graph + Vector + Relational + FTS in one engine) | JSON file (`memory.jsonl`) | SQLite / Cloudflare | Specialized DB (only Vector or Graph) |
  | **Search Logic** | **Agentic (Auto-Route)**: Hybrid + Graph + Summaries | Keyword only / Exact Graph Match | Vector + Keyword | Mostly only one dimension |
  | **Inference** | **Yes**: Built-in engine for implicit knowledge | No | No ("Dreaming" is consolidation) | No (Retrieval only) |
  | **Community** | **Yes**: Hierarchical Community Summaries | No | No | Only clustering (no summary) |
@@ -89,9 +114,34 @@ The core advantage is **Intelligence and Traceability**: By combining an **Agent
  - **RAM: 1.7 GB minimum** (for default bge-m3 model)
  - Model download: ~600 MB
  - Runtime memory: ~1.1 GB
- - For lower-spec machines, see [Embedding Model Options](#embedding-model-options) below
+ - **Too heavy?** Use `EMBEDDING_MODEL=Xenova/all-MiniLM-L6-v2`; only **~400 MB RAM** is needed (see [Embedding Model Options](#embedding-model-options))
  - CozoDB native dependency is installed via `cozo-node`
 
+ ### Optional: Ollama for LLM-powered actions
+
+ Some advanced actions use a local LLM via [Ollama](https://ollama.ai) for intelligent
+ processing. **The core server works without Ollama** (CRUD, search, graph operations),
+ but the following actions require it:
+
+ | Action | Purpose |
+ |--------|---------|
+ | `cleanup` | LLM-backed observation consolidation |
+ | `reflect` | Generate insights, detect contradictions |
+ | `summarize_communities` | LLM-generated community summaries |
+ | `compact` | Session / entity compaction with LLM summarization |
+ | `agentic_search` | Query intent classification for auto-routing |
+
+ **Setup (if you need these features):**
+ ```bash
+ # 1. Install Ollama from https://ollama.ai
+ # 2. Pull a model (e.g. small + fast for dev):
+ ollama pull demyagent-4b-i1:Q6_K
+ # 3. Ollama runs automatically on http://localhost:11434
+ ```
+
+ If Ollama is not running, the affected actions gracefully fall back to non-LLM behavior
+ (where possible) or return a clear error message.
+
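The rule described above (core actions never need Ollama, while the five listed actions use it when available) can be sketched as a small decision function. This is an illustrative TypeScript sketch only: `planAction` and the `Plan` labels are hypothetical names, and the README does not say which of the LLM-backed actions fall back and which return an error, so the sketch deliberately leaves that case undifferentiated.

```typescript
// Action names copied from the table above; everything else is hypothetical.
const LLM_ACTIONS = [
  "cleanup",
  "reflect",
  "summarize_communities",
  "compact",
  "agentic_search",
];

// "core": runs without any LLM.
// "llm": runs with the local Ollama model.
// "fallback_or_error": Ollama is needed but unavailable; the README says the
// action then falls back to non-LLM behavior where possible, or reports an error.
type Plan = "core" | "llm" | "fallback_or_error";

function planAction(action: string, ollamaAvailable: boolean): Plan {
  if (LLM_ACTIONS.indexOf(action) === -1) {
    return "core"; // CRUD, search, and graph operations are unaffected
  }
  return ollamaAvailable ? "llm" : "fallback_or_error";
}

console.log(planAction("search", false));  // → core
console.log(planAction("reflect", true));  // → llm
console.log(planAction("reflect", false)); // → fallback_or_error
```

In a real client, `ollamaAvailable` would presumably come from a health check against the address mentioned in the setup steps (`http://localhost:11434`); that check is omitted here to keep the sketch self-contained.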
  ### Via npm (Easiest)
 
  ```bash
@@ -337,10 +387,10 @@ The interface is reduced to **5 consolidated tools**:
 
  | Tool | Purpose | Key Actions |
  |------|---------|-------------|
- | `mutate_memory` | Write operations | create_entity, update_entity, delete_entity, add_observation, create_relation, transactions, sessions, tasks |
- | `query_memory` | Read operations | search, advancedSearch, context, graph_rag, graph_walking, agentic_search, adaptive_retrieval |
+ | `mutate_memory` | Write operations | create_entity, update_entity, delete_entity, add_observation, create_relation, transactions, sessions, tasks, update_observation, batch_delete, manage_tags, batch |
+ | `query_memory` | Read operations | search, advancedSearch, context, graph_rag, graph_walking, agentic_search, adaptive_retrieval, list_entities, get_entity_detail, get_session_context, list_sessions |
  | `analyze_graph` | Graph analysis | explore, communities, pagerank, betweenness, hits, shortest_path, semantic_walk |
- | `manage_system` | Maintenance | health, metrics, export, import, cleanup, defrag, reflect, snapshots |
+ | `manage_system` | Maintenance | health, metrics, stats, export, import, cleanup, defrag, reflect, snapshots |
  | `edit_user_profile` | User preferences | Edit global user profile with preferences and work style |
 
  > **See [docs/API.md](docs/API.md) for complete API reference with all parameters and examples**
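Because each tool bundles several actions, a client call names the tool and passes the action in its arguments. The TypeScript sketch below builds a Model Context Protocol `tools/call` request for `query_memory`. The JSON-RPC envelope follows the MCP specification; the argument names `action`, `query`, and `limit` are assumptions made for illustration, and `docs/API.md` remains the authority on the real parameters.

```typescript
// Shape of an MCP "tools/call" JSON-RPC request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in the envelope.
function buildToolCall(
  id: number,
  tool: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
}

// "search" is one of the Key Actions listed for query_memory.
// The argument names (action, query, limit) are assumed, not confirmed.
const request = buildToolCall(1, "query_memory", {
  action: "search",
  query: "project deadlines",
  limit: 5,
});

console.log(JSON.stringify(request, null, 2));
```

Most users will not build these requests by hand, since an MCP client such as Claude Desktop generates them; the sketch only shows how the consolidated tools and their actions map onto a single call.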
@@ -354,10 +404,12 @@ The interface is reduced to **5 consolidated tools**:
  - This is normal and only happens once
  - Subsequent starts are fast (< 2 seconds)
 
- **Cleanup/Reflect Requires Ollama**
- - If using `cleanup` or `reflect` actions, an Ollama service must be running locally
+ **LLM-powered actions require Ollama**
+ - The following actions use a local LLM for intelligent processing: `cleanup`, `reflect`, `summarize_communities`, `compact`, `agentic_search`
  - Install Ollama from https://ollama.ai
  - Pull the desired model: `ollama pull demyagent-4b-i1:Q6_K` (or your preferred model)
+ - Without Ollama, these actions fall back to non-LLM behavior or return a clear error
+ - **Core features (CRUD, search, graph, infer) work without any LLM**
 
  **Windows-Specific**
  - Embeddings are processed on CPU for maximum compatibility
@@ -403,30 +455,6 @@ npm run benchmark # Runs performance tests
  npm run eval # Runs evaluation suite
  ```
 
- ## Roadmap
-
- ### Near-Term (v1.x)
-
- - **GPU Acceleration** - CUDA support for embedding generation (10-50x faster)
- - **Streaming Ingestion** - Real-time data ingestion from logs, APIs, webhooks
- - **Advanced Chunking** - Semantic chunking for `ingest_file` (paragraph-aware splitting)
- - **Query Optimization** - Automatic query plan optimization for complex graph traversals
- - **Additional Export Formats** - Notion, Roam Research, Logseq compatibility
-
- ### Mid-Term (v2.x)
-
- - **Multi-Modal Embeddings** - Support for images, audio, code
- - **Distributed Memory** - Sharding and replication for large-scale deployments
- - **Advanced Inference** - Neural-symbolic reasoning, causal inference
- - **Real-Time Sync** - WebSocket-based real-time updates
- - **Web UI** - Browser-based management interface
-
- ### Long-Term (v3.x)
-
- - **Federated Learning** - Privacy-preserving collaborative learning
- - **Quantum-Inspired Algorithms** - Advanced graph algorithms
- - **Multi-Agent Coordination** - Shared memory across multiple agents
-
  ## Contributing
 
  Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.