@henrychong-ai/mcp-neo4j-knowledge-graph 1.5.2 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (33)

package/README.md CHANGED

@@ -8,6 +8,40 @@ Scalable, high-performance knowledge graph memory system with semantic retrieval
 ---
+## 🚀 Quick Start with Claude Code
+**Already have Claude Code running?** Paste this into your session:
+```
+Read and execute the setup guide at henrychong.com/mcp-neo4j-kg/setup
+```
+Claude Code will guide you through:
+- ✓ Prerequisites check (Node.js, Docker)
+- ✓ Neo4j database setup
+- ✓ Configuration and environment variables
+- ✓ Integration with Claude Desktop/Code
+- ✓ Testing and verification
+**Expected time:** 10-15 minutes
+**No prior installation needed** - the automation handles everything!
+---
+### Manual Setup
+If you prefer manual installation, see the detailed sections below for:
+- [Installation](#installation) - npm/npx setup
+- [Neo4j Setup](#storage-backend) - Docker or local database
+- [Configuration](#configuration) - Environment variables
+- [Claude Desktop](#integration-with-claude-desktop) - MCP client setup
+- [Claude Code](#integration-with-claude-code) - CLI client setup
+- [Testing](#testing-your-setup) - Verification steps
+**Expected manual setup time:** 10-15 minutes
+---
 ## Installation
 ### Global Installation with npx (Recommended)
@@ -300,6 +334,83 @@ Rich metadata support for both entities and relations with custom fields:
 - **Query Support**: Search and filter based on metadata properties
 - **Extensible Schema**: Add custom fields as needed without modifying the core data model
+### Batch Operations
+Optimized bulk operations providing a 10-50x performance improvement over individual operations:
+- **High-Performance Bulk Processing**: Batch operations use Neo4j's UNWIND clause for dramatic performance gains
+- **Automatic Chunking**: Large batches are automatically split into optimal chunk sizes (default: 100 items)
+- **Parallel Processing**: Independent operations (like embedding generation) can run concurrently
+- **Progress Tracking**: Optional callbacks provide real-time progress updates for long-running operations
+- **Partial Failure Handling**: Continue processing on failures with detailed error reports per item
+- **Performance Metrics**: Each batch operation returns total time and per-item average timing
+- **Transaction Safety**: Automatic rollback on failures ensures data consistency
+**Available Batch Tools**:
+- `create_entities_batch`: Create multiple entities in a single operation
+- `create_relations_batch`: Create multiple relations in a single operation
+- `add_observations_batch`: Add observations to multiple entities in a single operation
+- `update_entities_batch`: Update multiple entities in a single operation
+**Performance Comparison**:
+```typescript
+// Individual operations: ~50 seconds for 100 entities
+for (const entity of entities) {
+  await createEntities([entity]);
+}
+// Batch operation: ~1.5 seconds for 100 entities (33x faster)
+await createEntitiesBatch(entities, {
+  maxBatchSize: 100,
+  enableParallel: true
+});
+```
+**Configuration Options**:
+- `maxBatchSize`: Control chunk size (default: 100)
+- `enableParallel`: Reserved for future parallel chunk processing (embeddings always generated if service available)
+- `onProgress`: Callback for progress tracking
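The chunking, progress reporting, and partial-failure handling described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's implementation: `chunkItems`, `processInChunks`, `BatchResult`, and the `handler` callback are hypothetical names, and only `maxBatchSize` and `onProgress` come from the option list above.

```typescript
// Hypothetical sketch: only maxBatchSize and onProgress are taken from the README.
interface BatchOptions {
  maxBatchSize?: number; // chunk size (documented default: 100)
  onProgress?: (processed: number, total: number) => void;
}

interface BatchResult<T> {
  succeeded: T[];
  failed: { item: T; error: string }[];
  totalTimeMs: number;
  avgTimePerItemMs: number;
}

// Split a list into consecutive chunks of at most `size` items.
function chunkItems<T>(items: T[], size: number): T[][] {
  const safeSize = Math.max(1, Math.floor(size));
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += safeSize) {
    chunks.push(items.slice(i, i + safeSize));
  }
  return chunks;
}

// Run `handler` once per chunk; a failing chunk is recorded and processing continues.
async function processInChunks<T>(
  items: T[],
  handler: (chunk: T[]) => Promise<void>, // stands in for one bulk query per chunk
  options: BatchOptions = {}
): Promise<BatchResult<T>> {
  const started = Date.now();
  const result: BatchResult<T> = {
    succeeded: [],
    failed: [],
    totalTimeMs: 0,
    avgTimePerItemMs: 0,
  };
  let processed = 0;

  for (const chunk of chunkItems(items, options.maxBatchSize ?? 100)) {
    try {
      await handler(chunk);
      result.succeeded.push(...chunk);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      for (const item of chunk) {
        result.failed.push({ item, error: message });
      }
    }
    processed += chunk.length;
    if (options.onProgress) {
      options.onProgress(processed, items.length);
    }
  }

  result.totalTimeMs = Date.now() - started;
  result.avgTimePerItemMs = items.length > 0 ? result.totalTimeMs / items.length : 0;
  return result;
}
```

In this sketch a failure marks every item in the affected chunk as failed, which mirrors a per-chunk transaction rollback; per-item error reporting inside a chunk would need additional logic.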
+**Cost Management:**
+- Incremental approach minimizes API calls
+- Only processes entities without embeddings
+- Typical cost: ~$0.02 per 1M tokens
+- Production cost: ~$0.0025 per daily run (for typical workloads)
+This automation ensures semantic search remains highly effective as your knowledge graph grows, without requiring manual embedding regeneration.
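The two cost figures above are related by simple arithmetic: at $0.02 per 1M tokens, a $0.0025 run corresponds to roughly 125,000 tokens. A small sketch of that calculation, assuming a flat per-token price (the function names are illustrative and not part of the package):

```typescript
// Assumption: flat price of $0.02 per 1M tokens, as quoted above.
const PRICE_PER_MILLION_TOKENS_USD = 0.02;

// Estimated embedding cost in USD for a given token count.
function estimateEmbeddingCostUsd(tokens: number): number {
  return (tokens / 1e6) * PRICE_PER_MILLION_TOKENS_USD;
}

// Number of tokens that fit in a given budget at the same flat rate.
function tokensForBudget(budgetUsd: number): number {
  return Math.round((budgetUsd / PRICE_PER_MILLION_TOKENS_USD) * 1e6);
}
```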
+### Query Result Caching (v1.5.0+)
+Semantic search queries are automatically cached for improved performance:
+**Cache Configuration:**
+- **LRU (Least Recently Used) Strategy**: Automatically evicts the least recently used entries when full
+- **Capacity**: 500 unique queries cached simultaneously
+- **TTL (Time-To-Live)**: 5 minutes per cache entry
+- **Size Limit**: 10,000 entities maximum across all cached results
+- **Size Calculation**: Entity count + relation count
+**Cache Behavior:**
+- **Cache Hits**: Sub-millisecond response for repeated queries
+- **Automatic Invalidation**: Cache cleared on mutations (create_entities, add_observations, delete_entities, etc.)
+- **Intelligent Keying**: Considers query text, limit, similarity threshold, entity types, and hybrid config
+- **Metrics Integration**: Cache hits/misses tracked via Prometheus (when enabled)
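One way to derive a cache key from the parameters listed under "Intelligent Keying" is sketched below. The `SearchParams` shape and its field names are assumptions made for illustration; the package's actual parameter names and key format are not shown in this diff.

```typescript
// Hypothetical parameter shape; field names are illustrative.
interface SearchParams {
  query: string;
  limit?: number;
  minSimilarity?: number;
  entityTypes?: string[];
  hybridSearch?: boolean;
}

// Build a deterministic key so that equivalent searches share one cache entry.
function buildCacheKey(params: SearchParams): string {
  return JSON.stringify([
    params.query,
    params.limit ?? null,
    params.minSimilarity ?? null,
    // Sort a copy so the order of entity types does not produce distinct keys.
    [...(params.entityTypes ?? [])].sort(),
    params.hybridSearch ?? null,
  ]);
}
```

Sorting the entity types is a design choice in this sketch: it treats `["A", "B"]` and `["B", "A"]` as the same query.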
+**Performance Impact:**
+- **First Query**: Normal latency (~100-500ms depending on graph size)
+- **Cached Query**: <1ms response time
+- **Memory Usage**: Minimal - automatically bounded by size limits
+- **Cache Miss Rate**: Typically <10% for conversational workloads
+**Example Scenarios:**
+- User asks "What programming languages do you know?" → Cache miss (~300ms)
+- User asks "What programming languages do you know?" again → Cache hit (<1ms)
+- User creates new entity → Cache cleared for consistency
+- User asks "What programming languages do you know?" → Cache miss (~300ms, fresh results)
+This caching layer provides significant performance improvements for repeated or similar queries without any configuration needed.
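The LRU-plus-TTL behaviour described in this section can be illustrated with a small cache built on `Map`, which preserves insertion order. This is a sketch under stated assumptions, not the package's cache: the class name `LruTtlCache` and the injectable clock are hypothetical, the defaults mirror the documented capacity (500) and TTL (5 minutes), and the 10,000-entity size limit is omitted for brevity.

```typescript
// Hypothetical sketch of an LRU cache with per-entry TTL.
class LruTtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private capacity: number;
  private ttlMs: number;
  private now: () => number;

  constructor(capacity = 500, ttlMs = 5 * 60 * 1000, now: () => number = () => Date.now()) {
    this.capacity = capacity;
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, useful for tests
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    if (this.entries.size >= this.capacity) {
      // The first key in insertion order is the least recently used.
      const lruKey = this.entries.keys().next().value;
      if (lruKey !== undefined) this.entries.delete(lruKey);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Called on any mutation of the graph to keep results consistent.
  clear(): void {
    this.entries.clear();
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Clearing the whole cache on every mutation, as the section describes, trades some hit rate for simplicity: no per-entity dependency tracking is needed.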
 ## MCP API Tools
 The following tools are available to LLM client hosts through the Model Context Protocol:
@@ -684,6 +795,100 @@ The adaptive search capabilities provide practical benefits:
 For example, when a user asks "What do you know about machine learning?", the system can retrieve conceptually related entities even if they don't explicitly mention "machine learning" - perhaps entities about neural networks, data science, or specific algorithms. But if semantic search yields insufficient results, the system automatically adjusts its approach to ensure useful information is still returned.
+## Integration with Claude Code
+### Configuration
+Add this to your `~/.claude.json`:
+```json
+{
+  "mcpServers": {
+    "neo4j-kg": {
+      "command": "npx",
+      "args": ["-y", "@henrychong-ai/mcp-neo4j-knowledge-graph"],
+      "env": {
+        "MEMORY_STORAGE_TYPE": "neo4j",
+        "NEO4J_URI": "bolt://127.0.0.1:7687",
+        "NEO4J_USERNAME": "neo4j",
+        "NEO4J_PASSWORD": "your_password_here",
+        "NEO4J_DATABASE": "neo4j",
+        "NEO4J_VECTOR_INDEX": "entity_embeddings",
+        "NEO4J_VECTOR_DIMENSIONS": "1536",
+        "NEO4J_SIMILARITY_FUNCTION": "cosine",
+        "OPENAI_API_KEY": "your-openai-api-key",
+        "OPENAI_EMBEDDING_MODEL": "text-embedding-3-small"
+      }
+    }
+  }
+}
+```
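In the configuration above, `NEO4J_VECTOR_DIMENSIONS` has to agree with the output size of the embedding model; `text-embedding-3-small` produces 1536-dimensional vectors by default. A minimal sanity check is sketched below. The `checkEnv` helper and its choice of required keys are assumptions for illustration and are not part of the package.

```typescript
// Default output dimensions of common OpenAI embedding models.
const EMBEDDING_DIMENSIONS: Record<string, number> = {
  "text-embedding-3-small": 1536,
  "text-embedding-3-large": 3072,
  "text-embedding-ada-002": 1536,
};

// Hypothetical helper: returns a list of human-readable problems (empty if none).
function checkEnv(env: Record<string, string | undefined>): string[] {
  const problems: string[] = [];

  // Assumption: these connection settings are treated as required here.
  for (const key of ["NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD"]) {
    if (!env[key]) problems.push(`${key} is not set`);
  }

  const model = env["OPENAI_EMBEDDING_MODEL"];
  const configured = env["NEO4J_VECTOR_DIMENSIONS"];
  if (model !== undefined && configured !== undefined && model in EMBEDDING_DIMENSIONS) {
    const expected = EMBEDDING_DIMENSIONS[model];
    if (Number(configured) !== expected) {
      problems.push(
        `NEO4J_VECTOR_DIMENSIONS is ${configured}, but ${model} produces ${expected}-dimensional vectors`
      );
    }
  }
  return problems;
}
```

Passing the `env` object from the JSON above to `checkEnv` should return an empty list; changing the model to `text-embedding-3-large` without updating the dimensions would be reported.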
+### Verify MCP Tools Available
+In a Claude Code session, the MCP tools will be automatically available. You can verify by asking:
+```
+Show me the available MCP tools for the knowledge graph
+```
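+With the server registered as `neo4j-kg` (as in the configuration above), you should see tools like:
+- `mcp__neo4j-kg__create_entities`
+- `mcp__neo4j-kg__create_relations`
+- `mcp__neo4j-kg__add_observations`
+- `mcp__neo4j-kg__search_nodes`
+- `mcp__neo4j-kg__semantic_search`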
+- And more...
+## Testing Your Setup
+### Step 1: Create Your First Entity
+In Claude Desktop or Claude Code, say:
+```
+Use the knowledge graph to create an entity named "Python"
+of type "Programming Language" with the observation
+"General-purpose, high-level programming language known for readability"
+```
+### Step 2: Search for the Entity
+```
+Search the knowledge graph for "Python"
+```
+Claude should find your entity using the `search_nodes` tool (exposed as `mcp__neo4j-kg__search_nodes` when the server is registered as `neo4j-kg`).
+### Step 3: Add More Observations
+```
+Add these observations to the Python entity:
+- Created by Guido van Rossum in 1991
+- Popular for data science, web development, and automation
+- Dynamic typing with interpreted execution
+```
+### Step 4: Verify in Neo4j Browser
+Open `http://localhost:7474` and run:
+```cypher
+MATCH (e:Entity {name: "Python"})
+WHERE e.validTo IS NULL
+RETURN e
+```
+You should see your entity with all observations.
+### Step 5: Test Semantic Search (If OpenAI API Key Configured)
+```
+Perform a semantic search for "programming languages for beginners"
+```
+The Python entity should appear in results based on semantic similarity.
 ## Troubleshooting
 ### Schema Constraint Configuration