cozo-memory 1.0.3 → 1.0.4
- package/README.md +246 -37
- package/dist/export-import-service.js +472 -0
- package/dist/index.js +242 -7
- package/dist/inference-engine.js +9 -2
- package/dist/test-bugfixes.js +374 -0
- package/dist/test-delete-comprehensive.js +174 -0
- package/dist/test-export-import.js +152 -0
- package/dist/test-fixes-simple.js +50 -0
- package/package.json +3 -1
- package/dist/verify_transaction_tool.js +0 -46
package/README.md
CHANGED
|
@@ -1,8 +1,34 @@
|
|
|
1
1
|
# CozoDB Memory MCP Server
|
|
2
2
|
|
|
3
|
-
|
|
4
|
-
|
|
5
|
-
|
|
3
|
+
[npm package](https://www.npmjs.com/package/cozo-memory)
|
|
4
|
+
[Node.js](https://nodejs.org)
|
|
5
|
+
[License](LICENSE)
|
|
6
|
+
|
|
7
|
+
**Local-first memory for Claude & AI agents with hybrid search, Graph-RAG, and time-travel – all in a single binary, no cloud, no Docker.**
|
|
8
|
+
|
|
9
|
+
## Table of Contents
|
|
10
|
+
|
|
11
|
+
- [Quick Start](#quick-start)
|
|
12
|
+
- [Key Features](#key-features)
|
|
13
|
+
- [Positioning & Comparison](#positioning--comparison)
|
|
14
|
+
- [Performance & Benchmarks](#performance--benchmarks)
|
|
15
|
+
- [Architecture](#architecture)
|
|
16
|
+
- [Installation](#installation)
|
|
17
|
+
- [Start / Integration](#start--integration)
|
|
18
|
+
- [Configuration & Backends](#configuration--backends)
|
|
19
|
+
- [Data Model](#data-model)
|
|
20
|
+
- [MCP Tools](#mcp-tools)
|
|
21
|
+
- [mutate_memory (Write)](#mutate_memory-write)
|
|
22
|
+
- [query_memory (Read)](#query_memory-read)
|
|
23
|
+
- [analyze_graph (Analysis)](#analyze_graph-analysis)
|
|
24
|
+
- [manage_system (Maintenance)](#manage_system-maintenance)
|
|
25
|
+
- [Production Monitoring](#production-monitoring)
|
|
26
|
+
- [Technical Highlights](#technical-highlights)
|
|
27
|
+
- [Optional: HTTP API Bridge](#optional-http-api-bridge)
|
|
28
|
+
- [Development](#development)
|
|
29
|
+
- [User Preference Profiling](#user-preference-profiling-mem0-style)
|
|
30
|
+
- [Troubleshooting](#troubleshooting)
|
|
31
|
+
- [License](#license)
|
|
6
32
|
|
|
7
33
|
## Quick Start
|
|
8
34
|
|
|
@@ -27,13 +53,35 @@ npm run start
|
|
|
27
53
|
|
|
28
54
|
Now you can add the server to your MCP client (e.g. Claude Desktop).
|
|
29
55
|
|
|
30
|
-
##
|
|
56
|
+
## Key Features
|
|
57
|
+
|
|
58
|
+
🔍 **Hybrid Search (since v0.7)** - Combines semantic search (HNSW), full-text search (FTS), and graph signals via Reciprocal Rank Fusion (RRF)
|
|
59
|
+
|
|
60
|
+
🕸️ **Graph-RAG & Graph-Walking (since v1.7)** - Advanced retrieval combining vector seeds with recursive graph traversals using optimized Datalog algorithms
|
|
61
|
+
|
|
62
|
+
🎯 **Multi-Vector Support (since v1.7)** - Dual embeddings per entity: content-embedding for context, name-embedding for identification
|
|
63
|
+
|
|
64
|
+
⚡ **Semantic Caching (since v0.8.5)** - Two-level cache (L1 memory + L2 persistent) with semantic query matching
|
|
65
|
+
|
|
66
|
+
⏱️ **Time-Travel Queries** - Version all changes via CozoDB Validity; query any point in history
|
|
67
|
+
|
|
68
|
+
🔗 **Atomic Transactions (since v1.2)** - Multi-statement transactions ensuring data consistency
|
|
69
|
+
|
|
70
|
+
📊 **Graph Algorithms (since v1.3/v1.6)** - PageRank, Betweenness Centrality, HITS, Community Detection, Shortest Path
|
|
31
71
|
|
|
32
|
-
|
|
33
|
-
- An MCP server (stdio) for Claude/other MCP clients.
|
|
34
|
-
- An optional HTTP API bridge server for UI/tools.
|
|
72
|
+
🧹 **Janitor Service** - LLM-backed automatic cleanup with hierarchical summarization
|
|
35
73
|
|
|
36
|
-
|
|
74
|
+
👤 **User Preference Profiling** - Persistent user preferences with automatic 50% search boost
|
|
75
|
+
|
|
76
|
+
🔍 **Near-Duplicate Detection** - Automatic LSH-based deduplication to avoid redundancy
|
|
77
|
+
|
|
78
|
+
🧠 **Inference Engine** - Implicit knowledge discovery with multiple strategies
|
|
79
|
+
|
|
80
|
+
🏠 **100% Local** - Embeddings via ONNX/Transformers; no external services required
|
|
81
|
+
|
|
82
|
+
📦 **Export/Import (since v1.8)** - Export to JSON, Markdown, or Obsidian-ready ZIP; import from Mem0, MemGPT, Markdown, or native format
|
|
83
|
+
|
|
84
|
+
### Detailed Features
|
|
37
85
|
- **Hybrid Search (v0.7 Optimized)**: Combination of semantic search (HNSW), **Full-Text Search (FTS)**, and graph signals, merged via Reciprocal Rank Fusion (RRF).
|
|
38
86
|
- **Full-Text Search (FTS)**: Native CozoDB v0.7 FTS indices with stemming, stopword filtering, and robust query sanitization (stripping of `+ - * / \ ( ) ? .`) for maximum stability.
|
|
39
87
|
- **Near-Duplicate Detection (LSH)**: Automatically detects very similar observations via MinHash-LSH (CozoDB v0.7) to avoid redundancy.
|
|
@@ -119,32 +167,57 @@ This tool (`src/benchmark.ts`) performs the following tests:
|
|
|
119
167
|
3. **Search Performance**: Latency measurement for Hybrid Search vs. Raw Vector Search.
|
|
120
168
|
4. **RRF Overhead**: Determination of additional computation time for fusion logic.
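To make the fusion step behind the RRF overhead measurement concrete, here is a minimal TypeScript sketch of Reciprocal Rank Fusion. It is an illustration only, not the package's implementation: the function name `rrfFuse`, the input shape (lists of ids ordered best-first), and the constant `k = 60` (a commonly used default) are assumptions.

```typescript
// Hypothetical sketch of Reciprocal Rank Fusion (RRF); not cozo-memory's actual code.
// Each input ranking is an array of ids ordered best-first.
function rrfFuse(rankings: string[][], k: number = 60): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // Ranks are 1-based, so the top hit contributes 1 / (k + 1).
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) || 0) + contribution);
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}

const fused = rrfFuse([
  ["alice", "project-x", "bob"], // e.g. semantic (HNSW) ranking
  ["project-x", "alice"],        // e.g. full-text (FTS) ranking
  ["project-x", "bob"],          // e.g. graph-signal ranking
]);
console.log(fused.map(([id]) => id));
```

An id that appears near the top of several rankings accumulates the largest score, which is why fusion adds only a small, list-length-proportional cost on top of the individual searches.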
|
|
121
169
|
|
|
122
|
-
## Architecture
|
|
123
|
-
|
|
124
|
-
```
|
|
125
|
-
|
|
126
|
-
|
|
127
|
-
|
|
128
|
-
|
|
129
|
-
|
|
130
|
-
|
|
131
|
-
|
|
132
|
-
|
|
133
|
-
|
|
134
|
-
|
|
135
|
-
|
|
136
|
-
|
|
137
|
-
|
|
138
|
-
|
|
139
|
-
|
|
140
|
-
|
|
141
|
-
|
|
142
|
-
|
|
143
|
-
|
|
144
|
-
|
|
145
|
-
|
|
146
|
-
|
|
147
|
-
|
|
170
|
+
## Architecture
|
|
171
|
+
|
|
172
|
+
```mermaid
|
|
173
|
+
graph TB
|
|
174
|
+
Client[MCP Client<br/>Claude Desktop, etc.]
|
|
175
|
+
Server[MCP Server<br/>FastMCP + Zod Schemas]
|
|
176
|
+
Services[Memory Services]
|
|
177
|
+
Embeddings[Embeddings<br/>ONNX Runtime]
|
|
178
|
+
Search[Hybrid Search<br/>RRF Fusion]
|
|
179
|
+
Cache[Semantic Cache<br/>L1 + L2]
|
|
180
|
+
Inference[Inference Engine<br/>Multi-Strategy]
|
|
181
|
+
DB[(CozoDB SQLite<br/>Relations + Validity<br/>HNSW Indices<br/>Datalog/Graph)]
|
|
182
|
+
|
|
183
|
+
Client -->|stdio| Server
|
|
184
|
+
Server --> Services
|
|
185
|
+
Services --> Embeddings
|
|
186
|
+
Services --> Search
|
|
187
|
+
Services --> Cache
|
|
188
|
+
Services --> Inference
|
|
189
|
+
Services --> DB
|
|
190
|
+
|
|
191
|
+
style Client fill:#e1f5ff
|
|
192
|
+
style Server fill:#fff4e1
|
|
193
|
+
style Services fill:#f0e1ff
|
|
194
|
+
style DB fill:#e1ffe1
|
|
195
|
+
```
|
|
196
|
+
|
|
197
|
+
### Graph-Walking Visualization
|
|
198
|
+
|
|
199
|
+
```mermaid
|
|
200
|
+
graph LR
|
|
201
|
+
Start([Query: What is Alice working on?])
|
|
202
|
+
V1[Vector Search<br/>Find: Alice]
|
|
203
|
+
E1[Alice<br/>Person]
|
|
204
|
+
E2[Project X<br/>Project]
|
|
205
|
+
E3[Feature Flags<br/>Technology]
|
|
206
|
+
E4[Bob<br/>Person]
|
|
207
|
+
|
|
208
|
+
Start --> V1
|
|
209
|
+
V1 -.semantic similarity.-> E1
|
|
210
|
+
E1 -->|works_on| E2
|
|
211
|
+
E2 -->|uses_tech| E3
|
|
212
|
+
E1 -->|colleague_of| E4
|
|
213
|
+
E4 -.semantic: also relevant.-> E2
|
|
214
|
+
|
|
215
|
+
style Start fill:#e1f5ff
|
|
216
|
+
style V1 fill:#fff4e1
|
|
217
|
+
style E1 fill:#ffe1e1
|
|
218
|
+
style E2 fill:#e1ffe1
|
|
219
|
+
style E3 fill:#f0e1ff
|
|
220
|
+
style E4 fill:#ffe1e1
|
|
148
221
|
```
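The walk in the diagram can be sketched in TypeScript as a depth-limited breadth-first traversal from a seed entity. This is an in-memory illustration under stated assumptions: the README describes the real traversal as Datalog running inside CozoDB, and the `Relation` type, the `walk` function, and the `maxDepth` parameter are hypothetical names.

```typescript
// Hypothetical in-memory illustration of graph walking; the real server is
// described as running the traversal as Datalog inside CozoDB.
type Relation = { from: string; to: string; type: string };

const relations: Relation[] = [
  { from: "Alice", to: "Project X", type: "works_on" },
  { from: "Project X", to: "Feature Flags", type: "uses_tech" },
  { from: "Alice", to: "Bob", type: "colleague_of" },
];

// Breadth-first walk from seed entities (the seeds would come from vector search).
// Returns each reached entity with the number of hops from the nearest seed.
function walk(seeds: string[], edges: Relation[], maxDepth: number): Map<string, number> {
  const depthOf = new Map<string, number>();
  seeds.forEach((seed) => depthOf.set(seed, 0));
  let frontier = seeds.slice();
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const edge of edges) {
        if (edge.from === node && !depthOf.has(edge.to)) {
          depthOf.set(edge.to, depth);
          next.push(edge.to);
        }
      }
    }
    frontier = next;
  }
  return depthOf;
}

const reached = walk(["Alice"], relations, 2);
console.log(Array.from(reached.entries()));
```

The dashed "semantic" edges in the diagram are not modeled here; only explicit relations are followed.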
|
|
149
222
|
|
|
150
223
|
## Installation
|
|
@@ -277,6 +350,13 @@ CozoDB Relations (simplified) – all write operations create new `Validity` ent
|
|
|
277
350
|
|
|
278
351
|
The interface is reduced to **4 consolidated tools**. The concrete operation is always chosen via `action`.
|
|
279
352
|
|
|
353
|
+
| Tool | Purpose | Key Actions |
|
|
354
|
+
|------|---------|-------------|
|
|
355
|
+
| `mutate_memory` | Write operations | create_entity, update_entity, delete_entity, add_observation, create_relation, run_transaction, add_inference_rule, ingest_file |
|
|
356
|
+
| `query_memory` | Read operations | search, advancedSearch, context, entity_details, history, graph_rag, graph_walking |
|
|
357
|
+
| `analyze_graph` | Graph analysis | explore, communities, pagerank, betweenness, hits, shortest_path, bridge_discovery, semantic_walk, infer_relations |
|
|
358
|
+
| `manage_system` | Maintenance | health, metrics, export_memory, import_memory, snapshot_create, snapshot_list, snapshot_diff, cleanup, reflect, clear_memory |
|
|
359
|
+
|
|
280
360
|
### mutate_memory (Write)
|
|
281
361
|
|
|
282
362
|
Actions:
|
|
@@ -288,6 +368,7 @@ Actions:
|
|
|
288
368
|
- `run_transaction`: `{ operations: Array<{ action, params }> }` **(New v1.2)**: Executes multiple operations atomically.
|
|
289
369
|
- `add_inference_rule`: `{ name, datalog }`
|
|
290
370
|
- `ingest_file`: `{ format, content, entity_id?, entity_name?, entity_type?, chunking?, metadata?, observation_metadata?, deduplicate?, max_observations? }`
|
|
371
|
+
  - `chunking` options: `"none"`, `"paragraphs"` (future: `"semantic"`)
|
|
291
372
|
|
|
292
373
|
Important Details:
|
|
293
374
|
- `run_transaction` supports `create_entity`, `add_observation`, and `create_relation`. Parameters are automatically suffixed to avoid collisions.
|
|
@@ -447,7 +528,10 @@ Examples:
|
|
|
447
528
|
### manage_system (Maintenance)
|
|
448
529
|
|
|
449
530
|
Actions:
|
|
450
|
-
- `health`: `{}` returns DB counts + embedding cache stats.
|
|
531
|
+
- `health`: `{}` returns DB counts + embedding cache stats + performance metrics.
|
|
532
|
+
- `metrics`: `{}` returns detailed operation counts, error statistics, and performance data.
|
|
533
|
+
- `export_memory`: `{ format, includeMetadata?, includeRelationships?, includeObservations?, entityTypes?, since? }` exports memory to various formats.
|
|
534
|
+
- `import_memory`: `{ data, sourceFormat, mergeStrategy?, defaultEntityType? }` imports memory from external sources.
|
|
451
535
|
- `snapshot_create`: `{ metadata? }`
|
|
452
536
|
- `snapshot_list`: `{}`
|
|
453
537
|
- `snapshot_diff`: `{ snapshot_id_a, snapshot_id_b }`
|
|
@@ -460,6 +544,22 @@ Janitor Cleanup Details:
|
|
|
460
544
|
- With `confirm: true`, the Janitor becomes active:
|
|
461
545
|
  - **Hierarchical Summarization**: Detects isolated or old observations, has a local LLM (Ollama) summarize them, and creates a new `ExecutiveSummary` node. The old fragments are then deleted to reduce noise while preserving knowledge.
|
|
462
546
|
|
|
547
|
+
**Before Janitor:**
|
|
548
|
+
```
|
|
549
|
+
Entity: Project X
|
|
550
|
+
├─ Observation 1: "Started in Q1" (90 days old, isolated)
|
|
551
|
+
├─ Observation 2: "Uses React" (85 days old, isolated)
|
|
552
|
+
├─ Observation 3: "Team of 5" (80 days old, isolated)
|
|
553
|
+
└─ Observation 4: "Deployed to staging" (75 days old, isolated)
|
|
554
|
+
```
|
|
555
|
+
|
|
556
|
+
**After Janitor:**
|
|
557
|
+
```
|
|
558
|
+
Entity: Project X
|
|
559
|
+
└─ ExecutiveSummary: "Project X is a React-based application started in Q1
|
|
560
|
+
with a team of 5 developers, currently deployed to staging environment."
|
|
561
|
+
```
|
|
562
|
+
|
|
463
563
|
Reflection Service Details:
|
|
464
564
|
- `reflect` analyzes the observations of an entity (or of the 5 most active entities) to find contradictions, patterns, or temporal developments.
|
|
465
565
|
- Results are persisted as new observations with metadata field `{ "kind": "reflection" }` and are retrievable via `context`.
|
|
@@ -467,6 +567,97 @@ Reflection Service Details:
|
|
|
467
567
|
|
|
468
568
|
Defaults: `older_than_days=30`, `max_observations=20`, `min_entity_degree=2`, `model="demyagent-4b-i1:Q6_K"`.
|
|
469
569
|
|
|
570
|
+
Export/Import Details:
|
|
571
|
+
- `export_memory` supports three formats:
|
|
572
|
+
  - **JSON** (`format: "json"`): Native Cozo format, fully re-importable with all metadata and timestamps.
|
|
573
|
+
  - **Markdown** (`format: "markdown"`): Human-readable document with entities, observations, and relationships.
|
|
574
|
+
  - **Obsidian** (`format: "obsidian"`): ZIP archive with wiki-links (`[[Entity]]`) and YAML frontmatter, ready to drop into an Obsidian vault.
|
|
575
|
+
- `import_memory` supports four source formats:
|
|
576
|
+
  - **Cozo** (`sourceFormat: "cozo"`): Import from native JSON export.
|
|
577
|
+
  - **Mem0** (`sourceFormat: "mem0"`): Import from Mem0 format (each `user_id` becomes an entity).
|
|
578
|
+
  - **MemGPT** (`sourceFormat: "memgpt"`): Import from MemGPT archival/recall memory.
|
|
579
|
+
  - **Markdown** (`sourceFormat: "markdown"`): Parse markdown sections as entities with observations.
|
|
580
|
+
- Merge strategies: `skip` (default, skip duplicates), `overwrite` (replace existing), `merge` (combine metadata).
|
|
581
|
+
- Optional `export_memory` parameters: `entityTypes` (array), `since` (Unix timestamp in ms), `includeMetadata`, `includeRelationships`, `includeObservations`.
|
|
582
|
+
|
|
583
|
+
Example Export:
|
|
584
|
+
```json
|
|
585
|
+
{
|
|
586
|
+
"action": "export_memory",
|
|
587
|
+
"format": "obsidian",
|
|
588
|
+
"includeMetadata": true,
|
|
589
|
+
"entityTypes": ["Person", "Project"]
|
|
590
|
+
}
|
|
591
|
+
```
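As a rough illustration of how the `entityTypes` and `since` filters narrow an export, consider the TypeScript sketch below. The `ExportEntity` shape, the `createdAt` field, and the inclusive comparison against `since` are assumptions for this example; the actual export service may model entities differently.

```typescript
// Hypothetical shapes; the real export service's internal types may differ.
interface ExportEntity {
  name: string;
  type: string;
  createdAt: number; // Unix timestamp in ms (assumed field name)
}

interface ExportFilter {
  entityTypes?: string[];
  since?: number; // Unix timestamp in ms
}

// Keep an entity only if it passes every filter that was supplied.
function filterEntities(entities: ExportEntity[], filter: ExportFilter): ExportEntity[] {
  return entities.filter((entity) => {
    const typeOk = !filter.entityTypes || filter.entityTypes.indexOf(entity.type) !== -1;
    // Assumption: `since` is inclusive.
    const sinceOk = filter.since === undefined || entity.createdAt >= filter.since;
    return typeOk && sinceOk;
  });
}

const sampleEntities: ExportEntity[] = [
  { name: "Alice", type: "Person", createdAt: 1000 },
  { name: "Project X", type: "Project", createdAt: 2000 },
  { name: "Feature Flags", type: "Technology", createdAt: 3000 },
];

const exported = filterEntities(sampleEntities, { entityTypes: ["Person", "Project"], since: 1500 });
console.log(exported.map((entity) => entity.name));
```

Omitting a filter leaves that dimension unrestricted, so an empty filter object returns every entity.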
|
|
592
|
+
|
|
593
|
+
Example Import:
|
|
594
|
+
```json
|
|
595
|
+
{
|
|
596
|
+
"action": "import_memory",
|
|
597
|
+
"sourceFormat": "mem0",
|
|
598
|
+
"data": "{\"user_id\": \"alice\", \"memories\": [...]}",
|
|
599
|
+
"mergeStrategy": "skip"
|
|
600
|
+
}
|
|
601
|
+
```
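The three merge strategies can be sketched in TypeScript as follows. This is a simplified model, not the import service itself: it assumes duplicates are detected by entity name and that, under `merge`, incoming metadata keys win over existing ones. `StoredEntity` and `applyImport` are hypothetical names.

```typescript
// Hypothetical model of the merge strategies; duplicate detection by name is an assumption.
type MergeStrategy = "skip" | "overwrite" | "merge";

interface StoredEntity {
  name: string;
  type: string;
  metadata: Record<string, unknown>;
}

function applyImport(
  store: Map<string, StoredEntity>,
  incoming: StoredEntity,
  strategy: MergeStrategy = "skip",
): void {
  const current = store.get(incoming.name);
  if (!current) {
    // No duplicate: every strategy simply inserts.
    store.set(incoming.name, incoming);
    return;
  }
  switch (strategy) {
    case "skip":
      return; // keep the existing entity untouched
    case "overwrite":
      store.set(incoming.name, incoming); // replace the existing entity
      return;
    case "merge":
      // Combine metadata; incoming keys win on conflict (assumption).
      store.set(incoming.name, {
        ...current,
        metadata: { ...current.metadata, ...incoming.metadata },
      });
      return;
  }
}

const store = new Map<string, StoredEntity>();
store.set("alice", { name: "alice", type: "Person", metadata: { team: "core" } });
const update: StoredEntity = { name: "alice", type: "User", metadata: { role: "dev" } };

applyImport(store, update); // default "skip": nothing changes
const afterSkip = store.get("alice");
applyImport(store, update, "merge"); // metadata combined, type kept
const afterMerge = store.get("alice");
applyImport(store, update, "overwrite"); // entity replaced
const afterOverwrite = store.get("alice");
```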
|
|
602
|
+
|
|
603
|
+
Production Monitoring Details:
|
|
604
|
+
- `health` provides comprehensive system status including entity/observation/relationship counts, embedding cache statistics, and performance metrics (last operation time, average operation time, total operations).
|
|
605
|
+
- `metrics` returns detailed operational metrics:
|
|
606
|
+
  - **Operation Counts**: Tracks create_entity, update_entity, delete_entity, add_observation, create_relation, search, and graph_operations.
|
|
607
|
+
  - **Error Statistics**: Total errors and breakdown by operation type.
|
|
608
|
+
  - **Performance Metrics**: Last operation duration, average operation duration, and total operations executed.
|
|
609
|
+
- Delete operations now include detailed logging with verification steps and return statistics about deleted data (observations, outgoing/incoming relations).
|
|
610
|
+
|
|
611
|
+
## Production Monitoring
|
|
612
|
+
|
|
613
|
+
The system includes comprehensive monitoring capabilities for production deployments:
|
|
614
|
+
|
|
615
|
+
### Metrics Tracking
|
|
616
|
+
|
|
617
|
+
All operations are automatically tracked with detailed metrics:
|
|
618
|
+
- Operation counts by type (create, update, delete, search, etc.)
|
|
619
|
+
- Error tracking with breakdown by operation
|
|
620
|
+
- Performance metrics (latency, throughput)
|
|
621
|
+
|
|
622
|
+
### Health Endpoint
|
|
623
|
+
|
|
624
|
+
The `health` action provides real-time system status:
|
|
625
|
+
```json
|
|
626
|
+
{ "action": "health" }
|
|
627
|
+
```
|
|
628
|
+
|
|
629
|
+
Returns:
|
|
630
|
+
- Database counts (entities, observations, relationships)
|
|
631
|
+
- Embedding cache statistics (hit rate, size)
|
|
632
|
+
- Performance metrics (last operation time, average time, total operations)
|
|
633
|
+
|
|
634
|
+
### Metrics Endpoint
|
|
635
|
+
|
|
636
|
+
The `metrics` action provides detailed operational metrics:
|
|
637
|
+
```json
|
|
638
|
+
{ "action": "metrics" }
|
|
639
|
+
```
|
|
640
|
+
|
|
641
|
+
Returns:
|
|
642
|
+
- **operations**: Count of each operation type
|
|
643
|
+
- **errors**: Total errors and breakdown by operation
|
|
644
|
+
- **performance**: Last operation duration, average duration, total operations
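One way to picture how these three groups could be collected is the small TypeScript tracker below. It is a sketch, not the server's code: the class name `MetricsTracker`, the `record` method, and the exact field names inside `snapshot()` are assumptions chosen to mirror the list above.

```typescript
// Hypothetical tracker mirroring the operations / errors / performance groups.
class MetricsTracker {
  private operations: Record<string, number> = {};
  private errorsByOperation: Record<string, number> = {};
  private totalErrors = 0;
  private totalOperations = 0;
  private totalDurationMs = 0;
  private lastDurationMs = 0;

  // Record one finished operation, its duration, and whether it failed.
  record(operation: string, durationMs: number, failed: boolean = false): void {
    this.operations[operation] = (this.operations[operation] || 0) + 1;
    this.totalOperations += 1;
    this.totalDurationMs += durationMs;
    this.lastDurationMs = durationMs;
    if (failed) {
      this.totalErrors += 1;
      this.errorsByOperation[operation] = (this.errorsByOperation[operation] || 0) + 1;
    }
  }

  // Field names below are illustrative, not the server's actual response keys.
  snapshot() {
    return {
      operations: { ...this.operations },
      errors: { total: this.totalErrors, by_operation: { ...this.errorsByOperation } },
      performance: {
        last_operation_ms: this.lastDurationMs,
        avg_operation_ms:
          this.totalOperations === 0 ? 0 : this.totalDurationMs / this.totalOperations,
        total_operations: this.totalOperations,
      },
    };
  }
}

const tracker = new MetricsTracker();
tracker.record("search", 10);
tracker.record("search", 30, true);
tracker.record("create_entity", 20);
const metricsSnapshot = tracker.snapshot();
console.log(JSON.stringify(metricsSnapshot, null, 2));
```

Keeping a running total instead of a list of durations keeps memory use constant no matter how many operations are recorded.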
|
|
645
|
+
|
|
646
|
+
### Enhanced Delete Operations
|
|
647
|
+
|
|
648
|
+
Delete operations include comprehensive logging and verification:
|
|
649
|
+
- Detailed step-by-step logging with `[Delete]` prefix
|
|
650
|
+
- Counts related data before deletion
|
|
651
|
+
- Verification after deletion
|
|
652
|
+
- Returns statistics: `{ deleted: { observations: N, outgoing_relations: N, incoming_relations: N } }`
|
|
653
|
+
|
|
654
|
+
Example:
|
|
655
|
+
```json
|
|
656
|
+
{ "action": "delete_entity", "entity_id": "ENTITY_ID" }
|
|
657
|
+
```
|
|
658
|
+
|
|
659
|
+
Returns deletion statistics showing exactly what was removed.
|
|
660
|
+
|
|
470
661
|
## Technical Highlights
|
|
471
662
|
|
|
472
663
|
### Local ONNX Embeddings (Transformers)
|
|
@@ -572,8 +763,26 @@ npx ts-node test-user-pref.ts
|
|
|
572
763
|
|
|
573
764
|
## Troubleshooting
|
|
574
765
|
|
|
575
|
-
|
|
576
|
-
|
|
766
|
+
### Common Issues
|
|
767
|
+
|
|
768
|
+
**First Start Takes Long**
|
|
769
|
+
- On first start, downloading the embedding model takes 30-90 seconds (Transformers loads ~500 MB of artifacts)
|
|
770
|
+
- This is normal and only happens once
|
|
771
|
+
- Subsequent starts are fast (< 2 seconds)
|
|
772
|
+
|
|
773
|
+
**Cleanup/Reflect Requires Ollama**
|
|
774
|
+
- If using `cleanup` or `reflect` actions, an Ollama service must be running locally
|
|
775
|
+
- Install Ollama from https://ollama.ai
|
|
776
|
+
- Pull the desired model: `ollama pull demyagent-4b-i1:Q6_K` (or your preferred model)
|
|
777
|
+
|
|
778
|
+
**Windows-Specific**
|
|
779
|
+
- Embeddings are processed on CPU for maximum compatibility
|
|
780
|
+
- The RocksDB backend, if you use it, requires the Visual C++ Redistributable
|
|
781
|
+
|
|
782
|
+
**Performance Issues**
|
|
783
|
+
- First query after restart is slower (cold cache)
|
|
784
|
+
- Use `health` action to check cache hit rates
|
|
785
|
+
- Consider the RocksDB backend for datasets larger than 100k entities
|
|
577
786
|
|
|
578
787
|
## License
|
|
579
788
|
|