claude-self-reflect 2.5.16 → 2.5.18

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -13,6 +13,26 @@ You are a Docker orchestration specialist for the memento-stack project. You man
  - Services run on host network for local development
  - Production uses Railway deployment

+ ## CRITICAL GUARDRAILS (from v2.5.17 crisis)
+
+ ### Resource Limit Guidelines
+ ⚠️ **Memory limits must include baseline usage**
+ - Measure baseline: `docker stats --no-stream`
+ - Add 200MB+ headroom above baseline
+ - Default: 600MB minimum (not 400MB)
+
+ ⚠️ **CPU monitoring in containers**
+ - Containers see all host CPUs but have cgroup limits
+ - 1437% CPU = ~90% of actual allocation
+ - Use cgroup-aware monitoring: `/sys/fs/cgroup/cpu/cpu.cfs_quota_us`
+
+ ### Pre-Deployment Checklist
+ ✅ Test with production data volumes (600+ files)
+ ✅ Verify STATE_FILE paths match between config and container
+ ✅ Check volume mounts are writable
+ ✅ Confirm memory/CPU limits are realistic
+ ✅ Test graceful shutdown handling
+
  ## Key Responsibilities

  1. **Service Management**
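The cgroup guardrail added above is subtle in practice: host-oriented tools report CPU as a percentage of all host cores, so a figure like 1437% is only meaningful relative to the container's quota. A minimal sketch of cgroup-aware quota detection, assuming a Linux container with either cgroup v1 (the `cpu.cfs_quota_us` path named in the guardrail) or cgroup v2 (`cpu.max`):

```bash
#!/usr/bin/env bash
# Sketch: report the CPU allocation this container is actually entitled to,
# so host-style readings like "1437%" can be interpreted against the quota.

if [ -f /sys/fs/cgroup/cpu.max ]; then
    # cgroup v2: file contains "<quota> <period>", or "max <period>" if unlimited
    read -r quota period < /sys/fs/cgroup/cpu.max
elif [ -f /sys/fs/cgroup/cpu/cpu.cfs_quota_us ]; then
    # cgroup v1: the path named in the guardrail above; quota is -1 if unlimited
    quota=$(cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us)
    period=$(cat /sys/fs/cgroup/cpu/cpu.cfs_period_us)
fi

if [ "${quota:-max}" = "max" ] || [ "$quota" -lt 0 ]; then
    echo "No cgroup CPU quota; container may use all $(nproc) host CPUs"
else
    # e.g. quota=150000, period=100000 -> 1.5 CPUs of real allocation
    echo "CPU quota: $(awk "BEGIN {print $quota / $period}") CPUs"
fi
```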
@@ -16,6 +16,21 @@ You are an import pipeline debugging expert for the memento-stack project. You s
  - Project name must be correctly extracted from path for proper collection naming
  - Collections named using MD5 hash of project name

+ ## CRITICAL GUARDRAILS (from v2.5.17 crisis)
+
+ ### Pre-Release Testing Checklist
+ ✅ **Test with actual Claude JSONL files** - Real ~/.claude/projects/*.jsonl files
+ ✅ **Verify processing metrics** - files_processed, chunks_created must be > 0
+ ✅ **Memory limits = baseline + headroom** - Measure actual usage first (typically 400MB base)
+ ✅ **Run tests to completion** - Don't mark as done without execution proof
+ ✅ **Handle production backlogs** - Test with 600+ file queues
+
+ ### Common Failure Patterns
+ 🚨 **State updates without progress** - high_water_mark changes but processed_files = 0
+ 🚨 **Memory limit blocking** - "Memory limit exceeded" on every file = limit too low
+ 🚨 **CPU misreporting** - 1437% CPU might be 90% of container limit
+ 🚨 **Wrong file format** - Testing with .json when production uses .jsonl
+
  ## Key Responsibilities

  1. **JSONL Processing**
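The "state updates without progress" pattern added above can be caught mechanically. A minimal sketch, assuming the importer writes a JSON state file; the default path and the `high_water_mark`/`files_processed`/`chunks_created` field names are assumptions taken from the checklist wording, not a confirmed layout:

```bash
#!/usr/bin/env bash
# Sketch: fail fast when state advances but nothing was actually processed.
# STATE_FILE default and JSON field names are assumptions for illustration.
STATE_FILE="${STATE_FILE:-./config/imported-files.json}"

processed=$(jq -r '.files_processed // 0' "$STATE_FILE")
chunks=$(jq -r '.chunks_created // 0' "$STATE_FILE")
watermark=$(jq -r '.high_water_mark // empty' "$STATE_FILE")

if [ -n "$watermark" ] && [ "$processed" -eq 0 ]; then
    echo "FAIL: high_water_mark=$watermark advanced but files_processed=0" >&2
    exit 1
fi
echo "OK: files_processed=$processed chunks_created=$chunks"
```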
@@ -36,6 +36,25 @@ You are a Qdrant vector database specialist for the memento-stack project. Your
  - Analyze embedding quality
  - Compare different embedding models (Voyage vs OpenAI)

+ ## CRITICAL GUARDRAILS (from v2.5.17 crisis)
+
+ ### Testing Requirements
+ - **ALWAYS test with real JSONL files** - Claude uses .jsonl, not .json
+ - **Verify actual processing** - Check files_processed > 0, not just state updates
+ - **Test memory limits with baseline** - System uses 400MB baseline, set limits accordingly
+ - **Never mark tests complete without execution** - Run and verify output
+
+ ### Resource Management
+ - **Default memory to 600MB minimum** - 400MB is too conservative
+ - **Monitor baseline + headroom** - Measure actual usage before setting limits
+ - **Use cgroup-aware CPU monitoring** - Docker shows all CPUs but has limits
+
+ ### Quality Gates
+ - **Follow the workflow**: implementation → review → test → docs → release
+ - **Use pre-releases for major changes** - Better to test than break production
+ - **Document quantitative metrics** - "0 files processed" is clearer than "test failed"
+ - **Rollback immediately on failure** - Don't push through broken releases
+
  ## Essential Commands

  ### Collection Operations
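The "verify actual processing" requirement added above can be checked against Qdrant itself rather than the importer's state, using Qdrant's REST endpoint `GET /collections/{name}` (which reports `points_count`). In this sketch the project name is hypothetical and the `conv_<md5>` collection name is an assumed illustration of the MD5-of-project-name scheme noted earlier; verify the real prefix before relying on it:

```bash
#!/usr/bin/env bash
# Sketch: confirm chunks actually landed in Qdrant instead of trusting
# state updates alone. The conv_<md5> collection name is an assumption.
QDRANT_URL="${QDRANT_URL:-http://localhost:6333}"
project="my-project"                                   # hypothetical project
hash=$(printf '%s' "$project" | md5sum | cut -d' ' -f1)
collection="conv_${hash}"

count=$(curl -s "$QDRANT_URL/collections/$collection" |
        jq -r '.result.points_count // 0')
if [ "$count" -gt 0 ]; then
    echo "OK: $collection holds $count points"
else
    echo "FAIL: $collection is empty - import produced no chunks" >&2
    exit 1
fi
```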
@@ -14,11 +14,12 @@ RUN pip install --upgrade pip setuptools wheel
  # Install PyTorch CPU version for smaller size
  RUN pip install --no-cache-dir torch==2.3.0 --index-url https://download.pytorch.org/whl/cpu

- # Install other dependencies
+ # Install other dependencies with updated versions for security
+ # Note: fastembed 0.4.0 requires numpy<2, so using compatible version
  RUN pip install --no-cache-dir \
  qdrant-client==1.15.0 \
- fastembed==0.2.7 \
- numpy==1.26.0 \
+ fastembed==0.4.0 \
+ numpy==1.26.4 \
  psutil==7.0.0 \
  tenacity==8.2.3 \
  python-dotenv==1.0.0 \
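Since the new Dockerfile comment pins `fastembed==0.4.0` against its `numpy<2` constraint, a quick check of the pinned set outside Docker can surface resolver conflicts before a slow image build. A minimal sketch using a throwaway virtualenv:

```bash
# Sketch: vet the pinned dependency set in a throwaway virtualenv so an
# incompatible pin (e.g. numpy>=2 with fastembed 0.4.0) fails here, not
# inside a slow docker build.
python3 -m venv /tmp/pin-check
. /tmp/pin-check/bin/activate
pip install --quiet fastembed==0.4.0 numpy==1.26.4 qdrant-client==1.15.0
pip check                      # reports broken/conflicting requirements
python -c "import numpy, fastembed; print('numpy', numpy.__version__)"
deactivate
```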
package/README.md CHANGED
@@ -6,10 +6,7 @@ Claude forgets everything. This fixes that.

  - [What You Get](#what-you-get)
  - [Requirements](#requirements)
- - [Quick Install](#quick-install)
- - [Local Mode (Default)](#local-mode-default---your-data-stays-private)
- - [Cloud Mode](#cloud-mode-better-search-accuracy)
- - [Uninstall Instructions](#uninstall-instructions)
+ - [Quick Install/Uninstall](#quick-installuninstall)
  - [The Magic](#the-magic)
  - [Before & After](#before--after)
  - [Real Examples](#real-examples-that-made-us-build-this)
@@ -18,10 +15,9 @@ Claude forgets everything. This fixes that.
  - [Using It](#using-it)
  - [Key Features](#key-features)
  - [Performance](#performance)
- - [V2.5.16 Critical Updates](#v2516-critical-updates)
  - [Configuration](#configuration)
  - [Technical Stack](#the-technical-stack)
- - [Problems?](#problems)
+ - [Problems](#problems)
  - [What's New](#whats-new)
  - [Advanced Topics](#advanced-topics)
  - [Contributors](#contributors)
@@ -30,12 +26,12 @@ Claude forgets everything. This fixes that.

  Ask Claude about past conversations. Get actual answers. **100% local by default** - your conversations never leave your machine. Cloud-enhanced search available when you need it.

- **✅ Proven at Scale**: Successfully indexed 682 conversation files with 100% reliability. No data loss, no corruption, just seamless conversation memory that works.
+ **Proven at Scale**: Successfully indexed 682 conversation files with 100% reliability. No data loss, no corruption, just seamless conversation memory that works.

  **Before**: "I don't have access to previous conversations"
  **After**:
  ```
- reflection-specialist(Search FastEmbed vs cloud embedding decision)
+ reflection-specialist(Search FastEmbed vs cloud embedding decision)
  ⎿ Done (3 tool uses · 8.2k tokens · 12.4s)

  "Found it! Yesterday we decided on FastEmbed for local mode - better privacy,
@@ -52,9 +48,11 @@ Your conversations become searchable. Your decisions stay remembered. Your conte
  - **Node.js** 16+ (for the setup wizard)
  - **Claude Desktop** app

- ## Quick Install
+ ## Quick Install/Uninstall

- ### Local Mode (Default - Your Data Stays Private)
+ ### Install
+
+ #### Local Mode (Default - Your Data Stays Private)
  ```bash
  # Install and run automatic setup
  npm install -g claude-self-reflect
@@ -69,7 +67,7 @@ claude-self-reflect setup
  # 🔒 Keep all data local - no API keys needed
  ```

- ### Cloud Mode (Better Search Accuracy)
+ #### Cloud Mode (Better Search Accuracy)
  ```bash
  # Step 1: Get your free Voyage AI key
  # Sign up at https://www.voyageai.com/ - it takes 30 seconds
@@ -122,7 +120,7 @@ Here's how your conversations get imported and prioritized:
  ![Import Architecture](docs/diagrams/import-architecture.png)

  **The system intelligently prioritizes your conversations:**
- - **🔥 HOT** (< 5 minutes): Switches to 2-second intervals for near real-time import
+ - **HOT** (< 5 minutes): Switches to 2-second intervals for near real-time import
  - **🌡️ WARM** (< 24 hours): Normal priority, processed every 60 seconds
  - **❄️ COLD** (> 24 hours): Batch processed, max 5 per cycle to prevent blocking

@@ -138,7 +136,7 @@ The reflection specialist automatically activates. No special commands needed.

  ## Key Features

- ### 🎯 Project-Scoped Search
+ ### Project-Scoped Search
  Searches are **project-aware by default**. Claude automatically searches within your current project:

  ```
@@ -162,40 +160,6 @@ Recent conversations matter more. Old ones fade. Like your brain, but reliable.
  - **Scale**: 100% indexing success rate across all conversation types
  - **V2 Migration**: 100% complete - all conversations use token-aware chunking

- ## V2.5.16 Critical Updates
-
- ### 🚨 CPU Performance Fix - RESOLVED
- **Issue**: Streaming importer was consuming **1437% CPU** causing system overload
- **Solution**: Complete rewrite with production-grade throttling and monitoring
- **Result**: CPU usage reduced to **<1%** (99.93% improvement)
-
- ### ✅ Production-Ready Streaming Importer
- - **Non-blocking CPU monitoring** with cgroup awareness
- - **Queue overflow protection** - data deferred, never dropped
- - **Atomic state persistence** with fsync for crash recovery
- - **Memory management** with 15% GC buffer and automatic cleanup
- - **Proper async signal handling** for clean shutdowns
-
- ### 🎯 100% V2 Token-Aware Chunking
- - **Complete Migration**: All collections now use optimized chunking
- - **Configuration**: 400 tokens/1600 chars with 75 token/300 char overlap
- - **Search Quality**: Improved semantic boundaries and context preservation
- - **Memory Efficiency**: Streaming processing prevents OOM during imports
-
- ### 📊 Performance Metrics (v2.5.16)
- | Metric | Before | After | Improvement |
- |--------|--------|-------|-------------|
- | CPU Usage | 1437% | <1% | 99.93% ↓ |
- | Memory | 8GB peak | 302MB | 96.2% ↓ |
- | Search Latency | Variable | 3.16ms avg | Consistent |
- | Test Success | Unstable | 21/25 passing | Reliable |
-
- ### 🔧 CLI Status Command Fix
- Fixed broken `--status` command in MCP server - now returns:
- - Collection counts and health
- - Real-time CPU and memory usage
- - Search performance metrics
- - Import processing status

  ## The Technical Stack

@@ -206,7 +170,7 @@ Fixed broken `--status` command in MCP server - now returns:
  - **MCP Server**: Python + FastMCP
  - **Search**: Semantic similarity with time decay

- ## Problems?
+ ## Problems

  - [Troubleshooting Guide](docs/troubleshooting.md)
  - [GitHub Issues](https://github.com/ramakay/claude-self-reflect/issues)
@@ -214,7 +178,8 @@ Fixed broken `--status` command in MCP server - now returns:

  ## What's New

- - **v2.5.16** - **CRITICAL PERFORMANCE UPDATE** - Fixed 1437% CPU overload, 100% V2 migration complete, production streaming importer
+ - **v2.5.17** - Critical CPU fix and memory limit adjustment. [Full release notes](docs/releases/v2.5.17-release-notes.md)
+ - **v2.5.16** - (Pre-release only) Initial streaming importer with CPU throttling
  - **v2.5.15** - Critical bug fixes and collection creation improvements
  - **v2.5.14** - Async importer collection fix - All conversations now searchable
  - **v2.5.11** - Critical cloud mode fix - Environment variables now properly passed to MCP server
@@ -231,6 +196,22 @@ Fixed broken `--status` command in MCP server - now returns:
  - [Architecture details](docs/architecture-details.md)
  - [Contributing](CONTRIBUTING.md)

+ ### Uninstall
+
+ For complete uninstall instructions, see [docs/UNINSTALL.md](docs/UNINSTALL.md).
+
+ Quick uninstall:
+ ```bash
+ # Remove MCP server
+ claude mcp remove claude-self-reflect
+
+ # Stop Docker containers
+ docker-compose down
+
+ # Uninstall npm package
+ npm uninstall -g claude-self-reflect
+ ```
+
  ## Contributors

  Special thanks to our contributors:
@@ -240,6 +221,4 @@ Special thanks to our contributors:

  ---

- Stop reading. Start installing. Your future self will thank you.
-
- MIT License. Built with ❤️ for the Claude community.
+ Built with ❤️ by [ramakay](https://github.com/ramakay) for the Claude community.
@@ -110,7 +110,7 @@ services:
  - MAX_CONCURRENT_QDRANT=2 # Limit concurrent Qdrant operations
  - IMPORT_FREQUENCY=15 # Check every 15 seconds instead of 1
  - BATCH_SIZE=3 # Process only 3 files at a time
- - MEMORY_LIMIT_MB=400 # Tight memory limit
+ - MEMORY_LIMIT_MB=600 # Memory limit (increased from 400MB for stability)
  - MAX_QUEUE_SIZE=100 # Limit queue size
  - MAX_BACKLOG_HOURS=24 # Alert if backlog > 24 hours
  - QDRANT_TIMEOUT=10 # 10 second timeout for Qdrant ops
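The `MEMORY_LIMIT_MB` bump from 400 to 600 follows the baseline-plus-headroom guardrail added to the agent docs above. A minimal sketch for deriving the value from a running container, assuming an illustrative container name and that `docker stats` reports current usage in MiB:

```bash
# Sketch: derive MEMORY_LIMIT_MB from measured baseline + headroom, per the
# "baseline + 200MB" guardrail. Container name is illustrative; assumes
# docker stats reports the current usage in MiB.
CONTAINER="${CONTAINER:-streaming-importer}"
HEADROOM_MB=200

# MemUsage looks like "412.3MiB / 600MiB"; keep the first (current) figure
baseline_mb=$(docker stats --no-stream --format '{{.MemUsage}}' "$CONTAINER" |
              awk '{gsub(/MiB.*/,"",$1); print int($1)}')

suggested=$((baseline_mb + HEADROOM_MB))
[ "$suggested" -lt 600 ] && suggested=600   # enforce the 600MB floor
echo "Baseline ${baseline_mb}MB -> set MEMORY_LIMIT_MB=${suggested}"
```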
@@ -1,6 +1,6 @@
  [project]
  name = "claude-self-reflect-mcp"
- version = "2.5.16"
+ version = "2.5.18"
  description = "MCP server for Claude self-reflection with memory decay"
  # readme = "README.md"
  requires-python = ">=3.10"
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "claude-self-reflect",
- "version": "2.5.16",
+ "version": "2.5.18",
  "description": "Give Claude perfect memory of all your conversations - Installation wizard for Python MCP server",
  "keywords": [
  "claude",