cozo-memory 1.1.3 → 1.1.4

package/README.md CHANGED
@@ -59,11 +59,19 @@ Now you can add the server to your MCP client (e.g. Claude Desktop).
59
59
 
60
60
  🔍 **Hybrid Search (since v0.7)** - Combines semantic search (HNSW), full-text search (FTS), and graph signals via Reciprocal Rank Fusion (RRF)
61
61
 
62
- 🕸️ **Graph-RAG & Graph-Walking (v1.7/v2.0)** - Hierarchical retrieval with community detection and summarization; recursive traversals using optimized Datalog algorithms
62
+ 🔀 **Dynamic Fusion Framework (v2.3)** - Advanced 4-path retrieval system combining Dense Vector, Sparse Vector, FTS, and Graph traversal with configurable weights and fusion strategies (RRF, Weighted Sum, Max, Adaptive)
63
+
64
+ 🧠 **Logical Edges from Knowledge Graph (v1.0)** - Metadata-driven implicit relationship discovery with 5 patterns: same category, same type, hierarchical, contextual, and transitive logical edges
65
+
66
+ 🔀 **Multi-Hop Reasoning with Vector Pivots (v2.5)** - Logic-aware Retrieve-Reason-Prune pipeline using vector search as springboard for graph traversal with helpfulness scoring and pivot depth security
67
+
68
+ ⏳ **Temporal Graph Neural Networks (v2.4)** - Time-aware node embeddings capturing historical context, temporal smoothness, and recency-weighted aggregation using Time2Vec encoding and multi-signal fusion
63
69
 
64
70
  🧠 **Agentic Retrieval Layer (v2.0)** - Auto-routing engine that analyzes query intent via local LLM to select the optimal search strategy (Vector, Graph, or Community)
65
71
 
66
- 🧠 **Multi-Level Memory (v2.0)** - Context-aware memory system with built-in session and task management
72
+ **GraphRAG-R1 Adaptive Retrieval (v2.6)** - Intelligent retrieval system with Progressive Retrieval Attenuation (PRA) and Cost-Aware F1 (CAF) scoring that automatically selects optimal strategies based on query complexity and learns from historical performance
73
+
74
+ 🧠 **Multi-Level Memory (v2.0)** - Context-aware memory system with built-in session and task management
67
75
 
68
76
  🎯 **Tiny Learned Reranker (v2.0)** - Integrated Cross-Encoder model (`ms-marco-MiniLM-L-6-v2`) for ultra-precise re-ranking of top search results
69
77
 
@@ -358,6 +366,153 @@ EMBEDDING_MODEL=Xenova/all-MiniLM-L6-v2 npm run download-model
358
366
 
359
367
  **Note:** Changing models requires re-embedding existing data. The model is downloaded once on first use.
360
368
 
369
+ ## Framework Adapters
370
+
371
+ Official adapters for seamless integration with popular AI frameworks:
372
+
373
+ ### 🦜 LangChain Adapter
374
+
375
+ ```bash
376
+ npm install @cozo-memory/langchain @cozo-memory/adapters-core
377
+ ```
378
+
379
+ ```typescript
380
+ import { CozoMemoryChatHistory, CozoMemoryRetriever } from '@cozo-memory/langchain';
381
+ import { BufferMemory } from 'langchain/memory';
382
+
383
+ // Chat history with session management
384
+ const chatHistory = new CozoMemoryChatHistory({
385
+ sessionName: 'user-123'
386
+ });
387
+
388
+ const memory = new BufferMemory({ chatHistory });
389
+
390
+ // Retriever with hybrid search or Graph-RAG
391
+ const retriever = new CozoMemoryRetriever({
392
+ useGraphRAG: true,
393
+ graphRAGDepth: 2
394
+ });
395
+ ```
396
+
397
+ ### 🦙 LlamaIndex Adapter
398
+
399
+ ```bash
400
+ npm install @cozo-memory/llamaindex @cozo-memory/adapters-core
401
+ ```
402
+
403
+ ```typescript
404
+ import { CozoVectorStore } from '@cozo-memory/llamaindex';
405
+ import { VectorStoreIndex } from 'llamaindex';
406
+
407
+ // Vector store with Graph-RAG support
408
+ const vectorStore = new CozoVectorStore({
409
+ useGraphRAG: true
410
+ });
411
+
412
+ const index = await VectorStoreIndex.fromDocuments(
413
+ documents,
414
+ { vectorStore }
415
+ );
416
+ ```
417
+
418
+ **Features:**
419
+ - ✅ Persistent chat history (LangChain)
420
+ - ✅ Hybrid search retrieval (both)
421
+ - ✅ Graph-RAG mode (both)
422
+ - ✅ Session management (LangChain)
423
+ - ✅ Vector store operations (LlamaIndex)
424
+
425
+ **Documentation:** See [adapters/README.md](./adapters/README.md) for complete examples and API reference.
426
+
427
+ ## Temporal Graph Neural Networks (v2.4)
428
+
429
+ CozoDB Memory now includes **Temporal Graph Neural Network (TGNN) embeddings** that capture time-aware node representations combining historical context, temporal smoothness, and graph structure.
430
+
431
+ ### What are Temporal Embeddings?
432
+
433
+ Traditional embeddings are static snapshots. Temporal embeddings evolve over time, capturing:
434
+
435
+ 1. **Historical Context** - Weighted aggregation of past observations with exponential decay
436
+ 2. **Temporal Smoothness** - Recency-weighted signals ensure gradual changes, not sudden jumps
437
+ 3. **Time Encoding** - Time2Vec-inspired sinusoidal encoding captures periodicity and time differences
438
+ 4. **Neighborhood Aggregation** - Related entities influence the embedding through weighted graph signals
439
+
440
+ ### Architecture
441
+
442
+ ```
443
+ Entity Embedding = Fuse(
444
+ content_embedding (0.4), # Semantic meaning
445
+ temporal_encoding (0.2), # Time information (Time2Vec)
446
+ historical_context (0.2), # Past observations (exponential decay)
447
+ neighborhood_agg (0.2) # Related entities (graph signals)
448
+ )
449
+ ```
450
+
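The README does not spell out how `Fuse` combines the four signals. A minimal sketch, assuming a weighted sum of equal-length vectors followed by L2 normalization; the names `Signal` and `fuseSignals` and the normalization step are illustrative assumptions, not the package's API:

```typescript
// Hypothetical sketch: the weights come from the diagram above,
// the weighted-sum-then-normalize rule is an assumption.
type Signal = { vector: number[]; weight: number };

function fuseSignals(signals: Signal[]): number[] {
  if (signals.length === 0) return [];
  const dim = signals[0].vector.length;
  const fused: number[] = [];
  for (let i = 0; i < dim; i++) fused.push(0);

  // Accumulate each signal scaled by its weight.
  for (const s of signals) {
    if (s.vector.length !== dim) throw new Error("dimension mismatch");
    for (let i = 0; i < dim; i++) fused[i] += s.weight * s.vector[i];
  }

  // L2-normalize so fused embeddings stay comparable under cosine similarity.
  let sumOfSquares = 0;
  for (const x of fused) sumOfSquares += x * x;
  const norm = Math.sqrt(sumOfSquares);
  return norm === 0 ? fused : fused.map((x) => x / norm);
}

// Toy 2-dimensional inputs, one per signal in the diagram.
const fusedEmbedding = fuseSignals([
  { vector: [1, 0], weight: 0.4 }, // content_embedding
  { vector: [0, 1], weight: 0.2 }, // temporal_encoding
  { vector: [0, 1], weight: 0.2 }, // historical_context
  { vector: [1, 0], weight: 0.2 }, // neighborhood_agg
]);
```

Because the weights sum to 1.0, the content embedding dominates only when the other three signals disagree with each other.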
451
+ ### Key Features
452
+
453
+ - **Time2Vec Encoding** - Sinusoidal functions capture temporal patterns without discretization
454
+ - **Exponential Decay Weighting** - Recent observations matter more (30-day half-life)
455
+ - **Multi-Signal Fusion** - Combines content, temporal, historical, and graph signals
456
+ - **Confidence Scoring** - Reflects data freshness and completeness (0-1 scale)
457
+ - **Memory Caching** - Efficient temporal state for multi-hop traversals
458
+ - **Time-Travel Support** - Generate embeddings at any historical timepoint via CozoDB Validity
459
+
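The decay and time-encoding ideas above can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: `decayWeight`, `encodeTime`, and `aggregateHistory` are hypothetical names, and the frequencies and phases are fixed here, whereas Time2Vec learns them.

```typescript
const HALF_LIFE_DAYS = 30; // half-life stated in the feature list

// Weight halves every half-life: 1.0 at age 0, 0.5 at 30 days, 0.25 at 60 days.
function decayWeight(ageDays: number, halfLifeDays: number = HALF_LIFE_DAYS): number {
  return Math.pow(0.5, Math.max(0, ageDays) / halfLifeDays);
}

// Time2Vec-style encoding: the first component is linear in t,
// the remaining components are sinusoidal.
function encodeTime(t: number, frequencies: number[], phases: number[]): number[] {
  if (frequencies.length === 0 || frequencies.length !== phases.length) {
    throw new Error("frequencies and phases must be non-empty and equal in length");
  }
  const encoded = [frequencies[0] * t + phases[0]];
  for (let i = 1; i < frequencies.length; i++) {
    encoded.push(Math.sin(frequencies[i] * t + phases[i]));
  }
  return encoded;
}

// Decay-weighted average of past observation vectors (historical context).
function aggregateHistory(observations: { vector: number[]; ageDays: number }[]): number[] {
  if (observations.length === 0) return [];
  const dim = observations[0].vector.length;
  const sum: number[] = [];
  for (let i = 0; i < dim; i++) sum.push(0);
  let totalWeight = 0;
  for (const obs of observations) {
    const w = decayWeight(obs.ageDays);
    totalWeight += w;
    for (let i = 0; i < dim; i++) sum[i] += w * obs.vector[i];
  }
  return sum.map((x) => x / totalWeight);
}
```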
460
+ ### Usage Example
461
+
462
+ ```typescript
463
+ import { TemporalEmbeddingService } from 'cozo-memory';
464
+
465
+ const temporalService = new TemporalEmbeddingService(
466
+ embeddingService,
467
+ dbQuery
468
+ );
469
+
470
+ // Generate embedding at current time
471
+ const embedding = await temporalService.generateTemporalEmbedding(
472
+ entityId,
473
+ new Date()
474
+ );
475
+
476
+ // Or at a historical timepoint
477
+ const pastEmbedding = await temporalService.generateTemporalEmbedding(
478
+ entityId,
479
+ new Date('2026-02-01')
480
+ );
481
+
482
+ // Compare temporal trajectories
483
+ const similarity = cosineSimilarity(
484
+ embedding.embedding,
485
+ pastEmbedding.embedding
486
+ );
487
+ ```
488
+
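The `cosineSimilarity` helper called in the example is not imported or defined there, and the README does not say where it comes from. A minimal stand-in, assuming both embeddings are plain `number[]` arrays of equal length:

```typescript
// Hypothetical helper; the package may ship its own equivalent.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

A value near 1 means the entity's representation has barely moved between the two timepoints; lower values indicate drift.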
489
+ ### Confidence Scoring
490
+
491
+ Confidence reflects data quality and freshness:
492
+
493
+ ```
494
+ Base: 0.5
495
+ + Recent entity (< 7 days): +0.3
496
+ + Many observations (> 5): +0.15
497
+ + Well-connected (> 10 relations): +0.15
498
+ = Max: 1.0
499
+ ```
500
+
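The scoring rule above can be written out directly. Note that the three bonuses add up to 1.1 on top of the base, so the cap at 1.0 only matters when all three apply. The `EntityStats` shape and the function name are assumptions made for illustration, including the reading of "recent" as entity age in days:

```typescript
// Hypothetical input shape; field names are not taken from the package.
interface EntityStats {
  ageDays: number;          // days since the entity was created or last updated (assumed)
  observationCount: number;
  relationCount: number;
}

function temporalConfidence(stats: EntityStats): number {
  let score = 0.5;                                  // base
  if (stats.ageDays < 7) score += 0.3;              // recent entity
  if (stats.observationCount > 5) score += 0.15;    // many observations
  if (stats.relationCount > 10) score += 0.15;      // well-connected
  return Math.min(1.0, score);                      // cap at the stated maximum
}
```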
501
+ ### Research Foundation
502
+
503
+ Based on recent research (2021-2026):
504
+
505
+ - **ACM Temporal Graph Learning Primer** (2025) - Comprehensive TGNN taxonomy
506
+ - **TempGNN** (2023) - Temporal embeddings for dynamic session-based recommendations
507
+ - **Time-Aware Graph Embedding** (2021) - Temporal smoothness and task-oriented approaches
508
+ - **Allan-Poe** (2025) - All-in-One Graph-Based Hybrid Search with dynamic fusion
509
+
510
+ ### Testing
511
+
512
+ ```bash
513
+ npx ts-node src/test-temporal-embeddings.ts
514
+ ```
515
+
361
516
  ## Start / Integration
362
517
 
363
518
  ### MCP Server (stdio)
@@ -532,9 +687,9 @@ The interface is reduced to **4 consolidated tools**. The concrete operation is
532
687
  | Tool | Purpose | Key Actions |
533
688
  |------|---------|-------------|
534
689
  | `mutate_memory` | Write operations | create_entity, update_entity, delete_entity, add_observation, create_relation, start_session, stop_session, start_task, stop_task, run_transaction, add_inference_rule, ingest_file, invalidate_observation, invalidate_relation |
535
- | `query_memory` | Read operations | search, advancedSearch, context, entity_details, history, graph_rag, graph_walking, agentic_search (Multi-Level Context support) |
690
+ | `query_memory` | Read operations | search, advancedSearch, context, entity_details, history, graph_rag, graph_walking, agentic_search, dynamic_fusion, adaptive_retrieval (Multi-Level Context support) |
536
691
  | `analyze_graph` | Graph analysis | explore, communities, pagerank, betweenness, hits, shortest_path, bridge_discovery, semantic_walk, infer_relations |
537
- | `manage_system` | Maintenance | health, metrics, export_memory, import_memory, snapshot_create, snapshot_list, snapshot_diff, cleanup, reflect, summarize_communities, clear_memory, compact |
692
+ | `manage_system` | Maintenance | health, metrics, export_memory, import_memory, snapshot_create, snapshot_list, snapshot_diff, cleanup, defrag, reflect, summarize_communities, clear_memory, compact |
538
693
 
539
694
  ### mutate_memory (Write)
540
695
 
@@ -645,6 +800,8 @@ Actions:
645
800
  - `graph_rag`: `{ query, max_depth?, limit?, filters?, rerank? }` Graph-based reasoning. Finds vector seeds (with inline filtering) first and then expands transitive relationships. Uses recursive Datalog for efficient BFS expansion.
646
801
  - `graph_walking`: `{ query, start_entity_id?, max_depth?, limit? }` (v1.7) Recursive semantic graph search. Starts at vector seeds or a specific entity and follows relationships to other semantically relevant entities. Ideal for deeper path exploration.
647
802
  - `agentic_search`: `{ query, limit?, rerank? }` **(New v2.0)**: **Auto-Routing Search**. Uses a local LLM (Ollama) to analyze query intent and automatically routes it to the most appropriate strategy (`vector_search`, `graph_walk`, or `community_summary`).
803
+ - `adaptive_retrieval`: `{ query, limit? }` **(New v2.6)**: **GraphRAG-R1 Adaptive Retrieval**. Intelligent system inspired by GraphRAG-R1 (Yu et al., WWW 2026) that automatically classifies query complexity (Simple/Moderate/Complex/Exploratory) and selects the optimal retrieval strategy from 5 options (Vector-Only, Graph-Walk, Hybrid-Fusion, Community-Expansion, Semantic-Walk). Features Progressive Retrieval Attenuation (PRA) to prevent over-retrieval and Cost-Aware F1 (CAF) scoring to balance answer quality with computational cost. Learns from usage and adapts strategy selection based on historical performance stored in CozoDB.
804
+ - `dynamic_fusion`: `{ query, config?, limit? }` **(New v2.3)**: **Dynamic Fusion Framework**. Combines 4 retrieval paths (Dense Vector, Sparse Vector, FTS, Graph) with configurable weights and fusion strategies. Inspired by Allan-Poe (arXiv:2511.00855).
648
805
  - `get_relation_evolution`: `{ from_id, to_id?, since?, until? }` (in `analyze_graph`) Shows temporal development of relationships including time range filter and diff summary.
649
806
 
650
807
  Important Details:
@@ -684,6 +841,81 @@ Examples:
684
841
  { "action": "context", "query": "What is Alice working on right now?", "context_window": 20 }
685
842
  ```
686
843
 
844
+ #### Dynamic Fusion Framework (v2.3)
845
+
846
+ The Dynamic Fusion Framework combines 4 retrieval paths with configurable weights and fusion strategies:
847
+
848
+ **Retrieval Paths:**
849
+ 1. **Dense Vector Search (HNSW)**: Semantic similarity via embeddings
850
+ 2. **Sparse Vector Search**: Keyword-based matching with TF-IDF scoring
851
+ 3. **Full-Text Search (FTS)**: BM25 scoring on entity names
852
+ 4. **Graph Traversal**: Multi-hop relationship expansion from vector seeds
853
+
854
+ **Fusion Strategies:**
855
+ - `rrf` (Reciprocal Rank Fusion): Combines rankings with position-based scoring
856
+ - `weighted_sum`: Direct weighted combination of scores
857
+ - `max`: Takes maximum score across all paths
858
+ - `adaptive`: Query-dependent weighting (future enhancement)
859
+
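To make the difference between the first three strategies concrete, here is a small sketch. The types and function names are hypothetical, and applying the path weight inside RRF is an assumption; plain RRF sums `1 / (k + rank)` without weights.

```typescript
// Each path returns results in ranked order (best first) with a score.
type PathResult = { id: string; score: number };
type PathOutput = { weight: number; results: PathResult[] };

function toSortedList(totals: Record<string, number>): PathResult[] {
  return Object.keys(totals)
    .map((id) => ({ id: id, score: totals[id] }))
    .sort((a, b) => b.score - a.score);
}

// rrf: position-based, ignores raw scores.
function fuseRrf(paths: PathOutput[], rrfK: number = 60): PathResult[] {
  const totals: Record<string, number> = {};
  for (const path of paths) {
    path.results.forEach((r, index) => {
      const rank = index + 1;
      totals[r.id] = (totals[r.id] || 0) + path.weight / (rrfK + rank);
    });
  }
  return toSortedList(totals);
}

// weighted_sum: direct weighted combination of raw scores.
function fuseWeightedSum(paths: PathOutput[]): PathResult[] {
  const totals: Record<string, number> = {};
  for (const path of paths) {
    for (const r of path.results) {
      totals[r.id] = (totals[r.id] || 0) + path.weight * r.score;
    }
  }
  return toSortedList(totals);
}

// max: best raw score an item achieved on any path.
function fuseMax(paths: PathOutput[]): PathResult[] {
  const totals: Record<string, number> = {};
  for (const path of paths) {
    for (const r of path.results) {
      totals[r.id] = Math.max(totals[r.id] || 0, r.score);
    }
  }
  return toSortedList(totals);
}
```

RRF is the most robust choice when path scores are on different scales, since only rank positions enter the result; `weighted_sum` and `max` assume the scores are comparable across paths.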
860
+ **Configuration Example:**
861
+
862
+ ```json
863
+ {
864
+ "action": "dynamic_fusion",
865
+ "query": "database with graph capabilities",
866
+ "limit": 10,
867
+ "config": {
868
+ "vector": {
869
+ "enabled": true,
870
+ "weight": 0.4,
871
+ "topK": 20,
872
+ "efSearch": 100
873
+ },
874
+ "sparse": {
875
+ "enabled": true,
876
+ "weight": 0.3,
877
+ "topK": 20,
878
+ "minScore": 0.1
879
+ },
880
+ "fts": {
881
+ "enabled": true,
882
+ "weight": 0.2,
883
+ "topK": 20,
884
+ "fuzzy": true
885
+ },
886
+ "graph": {
887
+ "enabled": true,
888
+ "weight": 0.1,
889
+ "maxDepth": 2,
890
+ "maxResults": 20,
891
+ "relationTypes": ["related_to", "uses"]
892
+ },
893
+ "fusion": {
894
+ "strategy": "rrf",
895
+ "rrfK": 60,
896
+ "minScore": 0.0,
897
+ "deduplication": true
898
+ }
899
+ }
900
+ }
901
+ ```
902
+
903
+ **Response includes:**
904
+ - `results`: Fused and ranked results with path contribution details
905
+ - `stats`: Performance metrics including:
906
+ - `totalResults`: Number of results after fusion
907
+ - `pathContributions`: Count of results from each path
908
+ - `fusionTime`: Total execution time
909
+ - `pathTimes`: Individual execution times per path
910
+
911
+ **Use Cases:**
912
+ - **Broad Exploration**: Enable all paths with balanced weights
913
+ - **Precision Search**: High vector weight, low graph weight
914
+ - **Relationship Discovery**: High graph weight with specific relation types
915
+ - **Keyword Matching**: High sparse/FTS weights for exact term matching
916
+
917
+ ```json
918
+
687
919
  #### Conflict Detection (Status)
688
920
 
689
921
  If there are contradictory statements about the status of an entity, a conflict is marked. The system considers **temporal consistency**:
@@ -738,6 +970,12 @@ Actions:
738
970
  - `snapshot_list`: `{}`
739
971
  - `snapshot_diff`: `{ snapshot_id_a, snapshot_id_b }`
740
972
  - `cleanup`: `{ confirm, older_than_days?, max_observations?, min_entity_degree?, model? }`
973
+ - `defrag`: `{ confirm, similarity_threshold?, min_island_size? }` **(New v2.3)**: Memory defragmentation. Reorganizes memory structure by:
974
+ - **Duplicate Detection**: Finds and merges near-duplicate observations using cosine similarity (threshold 0.8-1.0, default 0.95)
975
+ - **Island Connection**: Connects small knowledge islands (≤3 nodes) to main graph via semantic bridges
976
+ - **Orphan Removal**: Deletes orphaned entities without observations or relations
977
+ - With `confirm: false`: Dry-run mode showing candidates without making changes
978
+ - With `confirm: true`: Executes defragmentation and returns statistics
741
979
  - `compact`: `{ session_id?, entity_id?, model? }` **(New v2.2)**: Manual context compaction. Supports three modes:
742
980
  - **Session Compaction**: `{ session_id, model? }` - Summarizes session observations into 2-3 bullet points and stores in user profile
743
981
  - **Entity Compaction**: `{ entity_id, model? }` - Compacts entity observations when threshold exceeded, creates Executive Summary
@@ -882,6 +1120,94 @@ Example:
882
1120
 
883
1121
  Returns deletion statistics showing exactly what was removed.
884
1122
 
1123
+ ## Multi-Hop Reasoning with Vector Pivots (v2.5)
1124
+
1125
+ **Research-backed implementation** based on HopRAG (ACL 2025), Retrieval Pivot Attacks (arXiv:2602.08668), and Neo4j GraphRAG patterns.
1126
+
1127
+ ### Retrieve-Reason-Prune Pipeline
1128
+
1129
+ 1. **RETRIEVE**: Find semantic pivot points via HNSW vector search
1130
+ 2. **REASON**: Logic-aware graph traversal with relationship context
1131
+ 3. **PRUNE**: Helpfulness scoring combining textual similarity + logical importance
1132
+ 4. **AGGREGATE**: Deduplicate and rank entities by occurrence and confidence
1133
+
1134
+ ### Key Features
1135
+
1136
+ - **Logic-Aware Traversal**: Considers relationship types, strengths, and PageRank scores
1137
+ - **Helpfulness Scoring**: Combines semantic similarity (60%) + logical importance (40%)
1138
+ - **Pivot Depth Security**: Enforces max depth limit to prevent uncontrolled graph expansion
1139
+ - **Confidence Decay**: Exponential decay (0.9^depth), so paths found at greater depth carry less weight
1140
+ - **Adaptive Pruning**: Filters paths below confidence threshold
1141
+
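The scoring and pruning rules listed above can be sketched as follows. The `ReasoningPath` shape, the function names, and the choice to multiply helpfulness by the depth decay are assumptions for illustration; the README gives the 60/40 split and the `0.9^depth` factor but not how they are combined.

```typescript
// Hypothetical path record; field names are not taken from the package.
type ReasoningPath = {
  entityIds: string[];
  depth: number;               // number of hops from the pivot
  semanticSimilarity: number;  // 0..1
  logicalImportance: number;   // 0..1, e.g. derived from relation strength or PageRank
};

// Helpfulness: 60% semantic similarity + 40% logical importance.
function helpfulness(path: ReasoningPath): number {
  return 0.6 * path.semanticSimilarity + 0.4 * path.logicalImportance;
}

// Confidence shrinks by a factor of 0.9 per hop (assumed to scale helpfulness).
function pathConfidence(path: ReasoningPath): number {
  return helpfulness(path) * Math.pow(0.9, path.depth);
}

// Enforce the depth limit, drop low-confidence paths, and rank the rest.
function prunePaths(paths: ReasoningPath[], maxDepth: number, minConfidence: number): ReasoningPath[] {
  return paths
    .filter((p) => p.depth <= maxDepth && pathConfidence(p) >= minConfidence)
    .sort((a, b) => pathConfidence(b) - pathConfidence(a));
}
```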
1142
+ ### Usage Example
1143
+
1144
+ ```typescript
1145
+ const multiHop = new MultiHopVectorPivot(db, embeddingService);
1146
+ const result = await multiHop.multiHopVectorPivot(
1147
+ "how does deep learning relate to NLP",
1148
+ 3, // maxHops (assumed to be positional; TypeScript has no named arguments)
1149
+ 10 // limit
1150
+ );
1151
+
1152
+ // Returns:
1153
+ // - pivots: Initial vector search results
1154
+ // - paths: High-quality reasoning paths
1155
+ // - aggregated_results: Ranked entities with scores
1156
+ // - total_hops: Maximum traversal depth
1157
+ // - execution_time_ms: Performance metrics
1158
+ ```
1159
+
1160
+ ### Research Foundation
1161
+
1162
+ - **HopRAG (ACL 2025)**: Logic-aware RAG with pseudo-queries as edges, achieving 76.78% higher answer accuracy
1163
+ - **Retrieval Pivot Attacks**: Security patterns for hybrid RAG systems with boundary enforcement
1164
+ - **Neo4j GraphRAG**: Multi-hop reasoning patterns for knowledge graphs
1165
+
1166
+ ## Logical Edges from Knowledge Graph (v1.0)
1167
+
1168
+ **Research-backed implementation** based on SAGE (ICLR 2026), Metadata Knowledge Graphs (Atlan 2026), and Knowledge Graph Completion research.
1169
+
1170
+ ### Five Logical Edge Patterns
1171
+
1172
+ 1. **Same Category Edges** - Entities with identical category metadata (confidence: 0.8)
1173
+ 2. **Same Type Edges** - Entities of the same type (confidence: 0.7)
1174
+ 3. **Hierarchical Edges** - Parent-child relationships from metadata (confidence: 0.9)
1175
+ 4. **Contextual Edges** - Entities sharing domain, time period, location, or organization (confidence: 0.7-0.75)
1176
+ 5. **Transitive Logical Edges** - Derived from explicit relationships + metadata patterns (confidence: 0.55-0.6)
1177
+
1178
+ ### Usage Example
1179
+
1180
+ ```typescript
1181
+ const logicalEdges = new LogicalEdgesService(db);
1182
+
1183
+ // Discover all logical edges for an entity
1184
+ const edges = await logicalEdges.discoverLogicalEdges(entityId);
1185
+
1186
+ // Optionally materialize as explicit relationships
1187
+ const created = await logicalEdges.materializeLogicalEdges(entityId);
1188
+
1189
+ // Returns:
1190
+ // - from_id, to_id: Entity IDs
1191
+ // - relation_type: "same_category", "same_type", "hierarchical", "contextual", "transitive_logical"
1192
+ // - confidence: 0.55-0.9 based on pattern
1193
+ // - reason: Human-readable explanation
1194
+ // - pattern: Pattern type for analysis
1195
+ ```
1196
+
1197
+ ### Key Features
1198
+
1199
+ - **Metadata-Driven**: Discovers relationships from entity metadata without explicit encoding
1200
+ - **Multi-Pattern**: Combines 5 different logical inference patterns
1201
+ - **Deduplication**: Automatically removes duplicate edges, keeping highest confidence
1202
+ - **Materialization**: Optionally creates explicit relationships for performance optimization
1203
+ - **Explainability**: Each edge includes reason and pattern for interpretability
1204
+
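The deduplication rule ("keeping highest confidence") can be illustrated with the edge fields listed in the usage example. This is a sketch only: `dedupeEdges` is a hypothetical name, and treating two edges as duplicates when they share the same `from_id` and `to_id` is an assumption.

```typescript
type LogicalEdge = {
  from_id: string;
  to_id: string;
  relation_type: string;
  confidence: number;
  reason: string;
  pattern: string;
};

// Keep one edge per (from_id, to_id) pair: the one with the highest confidence.
function dedupeEdges(edges: LogicalEdge[]): LogicalEdge[] {
  const best: Record<string, LogicalEdge> = {};
  for (const edge of edges) {
    const key = edge.from_id + "->" + edge.to_id;
    if (!best[key] || edge.confidence > best[key].confidence) {
      best[key] = edge;
    }
  }
  return Object.keys(best).map((key) => best[key]);
}
```

Under this rule, a pair matched by both the same-type pattern (0.7) and the hierarchical pattern (0.9) ends up with a single hierarchical edge.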
1205
+ ### Research Foundation
1206
+
1207
+ - **SAGE (ICLR 2026)**: Implicit graph exploration with on-demand edge discovery
1208
+ - **Metadata Knowledge Graphs (Atlan 2026)**: Metadata-driven relationship inference
1209
+ - **Knowledge Graph Completion (Frontiers 2025)**: Predicting implicit relationships using embeddings
1210
+
885
1211
  ## Technical Highlights
886
1212
 
887
1213
  ### Dual Timestamp Format (v1.9)