claude-self-reflect 2.7.4 → 2.8.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude/agents/docker-orchestrator.md +1 -1
- package/.claude/agents/import-debugger.md +1 -1
- package/.claude/agents/mcp-integration.md +1 -1
- package/.claude/agents/qdrant-specialist.md +1 -1
- package/.claude/agents/search-optimizer.md +1 -1
- package/Dockerfile.safe-watcher +6 -3
- package/README.md +103 -139
- package/docker-compose.yaml +18 -10
- package/installer/setup-wizard-docker.js +24 -1
- package/mcp-server/src/project_resolver.py +2 -2
- package/mcp-server/src/server.py +156 -47
- package/mcp-server/src/status.py +30 -1
- package/package.json +1 -1
- package/scripts/import-conversations-unified.backup.py +0 -374
|
@@ -4,7 +4,7 @@ description: Docker Compose orchestration expert for container management, servi
|
|
|
4
4
|
tools: Read, Edit, Bash, Grep, LS
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
You are a Docker orchestration specialist for the
|
|
7
|
+
You are a Docker orchestration specialist for the claude-self-reflect project. You manage multi-container deployments, monitor service health, and troubleshoot container issues.
|
|
8
8
|
|
|
9
9
|
## Project Context
|
|
10
10
|
- Main stack: Qdrant vector database + MCP server + Python importer
|
|
@@ -4,7 +4,7 @@ description: Import pipeline debugging specialist for JSONL processing, Python s
|
|
|
4
4
|
tools: Read, Edit, Bash, Grep, Glob, LS
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
You are an import pipeline debugging expert for the
|
|
7
|
+
You are an import pipeline debugging expert for the claude-self-reflect project. You specialize in troubleshooting JSONL file processing, Python import scripts, and conversation chunking strategies.
|
|
8
8
|
|
|
9
9
|
## Project Context
|
|
10
10
|
- Processes Claude Desktop logs from ~/.claude/projects/
|
|
@@ -4,7 +4,7 @@ description: MCP (Model Context Protocol) server development expert for Claude D
|
|
|
4
4
|
tools: Read, Edit, Bash, Grep, Glob, WebFetch
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
You are an MCP server development specialist for the
|
|
7
|
+
You are an MCP server development specialist for the claude-self-reflect project. You handle Claude Desktop integration, implement MCP tools, and ensure seamless communication between Claude and the vector database.
|
|
8
8
|
|
|
9
9
|
## Project Context
|
|
10
10
|
- MCP server: claude-self-reflection
|
|
@@ -4,7 +4,7 @@ description: Qdrant vector database expert for collection management, troublesho
|
|
|
4
4
|
tools: Read, Bash, Grep, Glob, LS, WebFetch
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
You are a Qdrant vector database specialist for the
|
|
7
|
+
You are a Qdrant vector database specialist for the claude-self-reflect project. Your expertise covers collection management, vector search optimization, and embedding strategies.
|
|
8
8
|
|
|
9
9
|
## Project Context
|
|
10
10
|
- The system uses Qdrant for storing conversation embeddings from Claude Desktop logs
|
|
@@ -4,7 +4,7 @@ description: Search quality optimization expert for improving semantic search ac
|
|
|
4
4
|
tools: Read, Edit, Bash, Grep, Glob, WebFetch
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
You are a search optimization specialist for the
|
|
7
|
+
You are a search optimization specialist for the claude-self-reflect project. You improve semantic search quality, tune parameters, and analyze embedding model performance.
|
|
8
8
|
|
|
9
9
|
## Project Context
|
|
10
10
|
- Current baseline: 66.1% search accuracy with Voyage AI
|
package/Dockerfile.safe-watcher
CHANGED
|
@@ -30,8 +30,11 @@ RUN mkdir -p /root/.cache/fastembed && \
|
|
|
30
30
|
# Set working directory
|
|
31
31
|
WORKDIR /app
|
|
32
32
|
|
|
33
|
-
# Copy scripts
|
|
34
|
-
COPY scripts/ /scripts/
|
|
33
|
+
# Copy application scripts
|
|
34
|
+
COPY scripts/ /app/scripts/
|
|
35
|
+
|
|
36
|
+
# Make watcher-loop.sh executable
|
|
37
|
+
RUN chmod +x /app/scripts/watcher-loop.sh
|
|
35
38
|
|
|
36
39
|
# Create config directory
|
|
37
40
|
RUN mkdir -p /config
|
|
@@ -41,4 +44,4 @@ ENV PYTHONUNBUFFERED=1
|
|
|
41
44
|
ENV MALLOC_ARENA_MAX=2
|
|
42
45
|
|
|
43
46
|
# Run the watcher loop
|
|
44
|
-
CMD ["/scripts/watcher-loop.sh"]
|
|
47
|
+
CMD ["/app/scripts/watcher-loop.sh"]
|
package/README.md
CHANGED
|
@@ -24,67 +24,12 @@
|
|
|
24
24
|
|
|
25
25
|
Give Claude perfect memory of all your conversations. Search past discussions instantly. Never lose context again.
|
|
26
26
|
|
|
27
|
-
|
|
27
|
+
**🔒 100% Local by Default** • **⚡ Blazing Fast Search** • **🚀 Zero Configuration** • **🏭 Production Ready**
|
|
28
28
|
|
|
29
|
-
|
|
29
|
+
## 🚀 Quick Install
|
|
30
30
|
|
|
31
|
-
**Zero Configuration** - Works immediately after installation. Smart auto-detection handles everything. No manual setup, no environment variables, just install and use.
|
|
32
|
-
|
|
33
|
-
**Production Ready** - Battle-tested with 600+ conversations across 24 projects. Handles mixed embedding types automatically. Scales from personal use to team deployments.
|
|
34
|
-
|
|
35
|
-
## Table of Contents
|
|
36
|
-
|
|
37
|
-
- [What You Get](#what-you-get)
|
|
38
|
-
- [Requirements](#requirements)
|
|
39
|
-
- [Quick Install/Uninstall](#quick-installuninstall)
|
|
40
|
-
- [The Magic](#the-magic)
|
|
41
|
-
- [Before & After](#before--after)
|
|
42
|
-
- [Real Examples](#real-examples-that-made-us-build-this)
|
|
43
|
-
- [How It Works](#how-it-works)
|
|
44
|
-
- [Import Architecture](#import-architecture)
|
|
45
|
-
- [Using It](#using-it)
|
|
46
|
-
- [Key Features](#key-features)
|
|
47
|
-
- [Performance](#performance)
|
|
48
|
-
- [Configuration](#configuration)
|
|
49
|
-
- [Technical Stack](#the-technical-stack)
|
|
50
|
-
- [Problems](#problems)
|
|
51
|
-
- [What's New](#whats-new)
|
|
52
|
-
- [Advanced Topics](#advanced-topics)
|
|
53
|
-
- [Contributors](#contributors)
|
|
54
|
-
|
|
55
|
-
## What You Get
|
|
56
|
-
|
|
57
|
-
Ask Claude about past conversations. Get actual answers. **100% local by default** - your conversations never leave your machine. Cloud-enhanced search available when you need it.
|
|
58
|
-
|
|
59
|
-
**Proven at Scale**: Successfully indexed 682 conversation files with 100% reliability. No data loss, no corruption, just seamless conversation memory that works.
|
|
60
|
-
|
|
61
|
-
**Before**: "I don't have access to previous conversations"
|
|
62
|
-
**After**:
|
|
63
|
-
```
|
|
64
|
-
reflection-specialist(Search FastEmbed vs cloud embedding decision)
|
|
65
|
-
⎿ Done (3 tool uses · 8.2k tokens · 12.4s)
|
|
66
|
-
|
|
67
|
-
"Found it! Yesterday we decided on FastEmbed for local mode - better privacy,
|
|
68
|
-
no API calls, 384-dimensional embeddings. Works offline too."
|
|
69
|
-
```
|
|
70
|
-
|
|
71
|
-
The reflection specialist is a specialized sub-agent that Claude automatically spawns when you ask about past conversations. It searches your conversation history in its own isolated context, keeping your main chat clean and focused.
|
|
72
|
-
|
|
73
|
-
Your conversations become searchable. Your decisions stay remembered. Your context persists.
|
|
74
|
-
|
|
75
|
-
## Requirements
|
|
76
|
-
|
|
77
|
-
- **Docker Desktop** (macOS/Windows) or **Docker Engine** (Linux)
|
|
78
|
-
- **Node.js** 16+ (for the setup wizard)
|
|
79
|
-
- **Claude Desktop** app
|
|
80
|
-
|
|
81
|
-
## Quick Install/Uninstall
|
|
82
|
-
|
|
83
|
-
### Install
|
|
84
|
-
|
|
85
|
-
#### Local Mode (Default - Your Data Stays Private)
|
|
86
31
|
```bash
|
|
87
|
-
# Install and run automatic setup
|
|
32
|
+
# Install and run automatic setup (5 minutes, everything automatic)
|
|
88
33
|
npm install -g claude-self-reflect
|
|
89
34
|
claude-self-reflect setup
|
|
90
35
|
|
|
@@ -93,11 +38,12 @@ claude-self-reflect setup
|
|
|
93
38
|
# ✅ Configure everything automatically
|
|
94
39
|
# ✅ Install the MCP in Claude Code
|
|
95
40
|
# ✅ Start monitoring for new conversations
|
|
96
|
-
# ✅ Verify the reflection tools work
|
|
97
41
|
# 🔒 Keep all data local - no API keys needed
|
|
98
42
|
```
|
|
99
43
|
|
|
100
|
-
|
|
44
|
+
<details open>
|
|
45
|
+
<summary>📡 Cloud Mode (Better Search Accuracy)</summary>
|
|
46
|
+
|
|
101
47
|
```bash
|
|
102
48
|
# Step 1: Get your free Voyage AI key
|
|
103
49
|
# Sign up at https://www.voyageai.com/ - it takes 30 seconds
|
|
@@ -108,17 +54,17 @@ claude-self-reflect setup --voyage-key=YOUR_ACTUAL_KEY_HERE
|
|
|
108
54
|
```
|
|
109
55
|
*Note: Cloud mode provides more accurate semantic search but sends conversation data to Voyage AI for processing.*
|
|
110
56
|
|
|
111
|
-
|
|
57
|
+
</details>
|
|
112
58
|
|
|
113
|
-
## The Magic
|
|
59
|
+
## ✨ The Magic
|
|
114
60
|
|
|
115
61
|

|
|
116
62
|
|
|
117
|
-
## Before & After
|
|
63
|
+
## 📊 Before & After
|
|
118
64
|
|
|
119
65
|

|
|
120
66
|
|
|
121
|
-
## Real Examples
|
|
67
|
+
## 💬 Real Examples
|
|
122
68
|
|
|
123
69
|
```
|
|
124
70
|
You: "What was that PostgreSQL optimization we figured out?"
|
|
@@ -137,34 +83,7 @@ Claude: "3 conversations found:
|
|
|
137
83
|
- Nov 20: Added rate limiting per authenticated connection"
|
|
138
84
|
```
|
|
139
85
|
|
|
140
|
-
##
|
|
141
|
-
|
|
142
|
-
Your conversations → Vector embeddings → Semantic search → Claude remembers
|
|
143
|
-
|
|
144
|
-
Technical details exist. You don't need them to start.
|
|
145
|
-
|
|
146
|
-
## Import Architecture
|
|
147
|
-
|
|
148
|
-
Here's how your conversations get imported and prioritized:
|
|
149
|
-
|
|
150
|
-

|
|
151
|
-
|
|
152
|
-
**The system intelligently prioritizes your conversations:**
|
|
153
|
-
- **HOT** (< 5 minutes): Switches to 2-second intervals for near real-time import
|
|
154
|
-
- **🌡️ WARM** (< 24 hours): Normal priority, processed every 60 seconds
|
|
155
|
-
- **❄️ COLD** (> 24 hours): Batch processed, max 5 per cycle to prevent blocking
|
|
156
|
-
|
|
157
|
-
## Using It
|
|
158
|
-
|
|
159
|
-
Once installed, just talk naturally:
|
|
160
|
-
|
|
161
|
-
- "What did we discuss about database optimization?"
|
|
162
|
-
- "Find our debugging session from last week"
|
|
163
|
-
- "Remember this solution for next time"
|
|
164
|
-
|
|
165
|
-
The reflection specialist automatically activates. No special commands needed.
|
|
166
|
-
|
|
167
|
-
## Key Features
|
|
86
|
+
## 🎯 Key Features
|
|
168
87
|
|
|
169
88
|
### Project-Scoped Search
|
|
170
89
|
Searches are **project-aware by default**. Claude automatically searches within your current project:
|
|
@@ -182,16 +101,37 @@ Claude: [Searches across ALL your projects]
|
|
|
182
101
|
### ⏱️ Memory Decay
|
|
183
102
|
Recent conversations matter more. Old ones fade. Like your brain, but reliable.
|
|
184
103
|
|
|
185
|
-
###
|
|
186
|
-
- **Search**: <3ms average response time
|
|
187
|
-
- **
|
|
188
|
-
- **
|
|
189
|
-
- **
|
|
190
|
-
|
|
191
|
-
|
|
104
|
+
### ⚡ Performance at Scale
|
|
105
|
+
- **Search**: <3ms average response time
|
|
106
|
+
- **Scale**: 600+ conversations across 24 projects
|
|
107
|
+
- **Reliability**: 100% indexing success rate
|
|
108
|
+
- **Memory**: 96% reduction from v2.5.15
|
|
109
|
+
|
|
110
|
+
## 🏗️ Architecture
|
|
111
|
+
|
|
112
|
+

|
|
113
|
+
|
|
114
|
+
<details>
|
|
115
|
+
<summary>🔥 HOT/WARM/COLD Intelligent Prioritization</summary>
|
|
116
|
+
|
|
117
|
+
- **🔥 HOT** (< 5 minutes): 2-second intervals for near real-time import
|
|
118
|
+
- **🌡️ WARM** (< 24 hours): Normal priority with starvation prevention
|
|
119
|
+
- **❄️ COLD** (> 24 hours): Batch processed to prevent blocking
|
|
120
|
+
|
|
121
|
+
Files are categorized by age and processed with priority queuing to ensure newest content gets imported quickly while preventing older files from being starved.
|
|
122
|
+
|
|
123
|
+
</details>
|
|
124
|
+
|
|
125
|
+
## 🛠️ Requirements
|
|
126
|
+
|
|
127
|
+
- **Docker Desktop** (macOS/Windows) or **Docker Engine** (Linux)
|
|
128
|
+
- **Node.js** 16+ (for the setup wizard)
|
|
129
|
+
- **Claude Desktop** app
|
|
192
130
|
|
|
131
|
+
## 📖 Documentation
|
|
193
132
|
|
|
194
|
-
|
|
133
|
+
<details>
|
|
134
|
+
<summary>🔧 Technical Stack</summary>
|
|
195
135
|
|
|
196
136
|
- **Vector DB**: Qdrant (local, your data stays yours)
|
|
197
137
|
- **Embeddings**:
|
|
@@ -200,18 +140,62 @@ Recent conversations matter more. Old ones fade. Like your brain, but reliable.
|
|
|
200
140
|
- **MCP Server**: Python + FastMCP
|
|
201
141
|
- **Search**: Semantic similarity with time decay
|
|
202
142
|
|
|
203
|
-
|
|
143
|
+
</details>
|
|
144
|
+
|
|
145
|
+
<details>
|
|
146
|
+
<summary>📚 Advanced Topics</summary>
|
|
147
|
+
|
|
148
|
+
- [Performance tuning](docs/performance-guide.md)
|
|
149
|
+
- [Security & privacy](docs/security.md)
|
|
150
|
+
- [Windows setup](docs/windows-setup.md)
|
|
151
|
+
- [Architecture details](docs/architecture-details.md)
|
|
152
|
+
- [Contributing](CONTRIBUTING.md)
|
|
153
|
+
|
|
154
|
+
</details>
|
|
155
|
+
|
|
156
|
+
<details>
|
|
157
|
+
<summary>🐛 Troubleshooting</summary>
|
|
204
158
|
|
|
205
159
|
- [Troubleshooting Guide](docs/troubleshooting.md)
|
|
206
160
|
- [GitHub Issues](https://github.com/ramakay/claude-self-reflect/issues)
|
|
207
161
|
- [Discussions](https://github.com/ramakay/claude-self-reflect/discussions)
|
|
208
162
|
|
|
209
|
-
|
|
163
|
+
</details>
|
|
164
|
+
|
|
165
|
+
<details>
|
|
166
|
+
<summary>🗑️ Uninstall</summary>
|
|
167
|
+
|
|
168
|
+
For complete uninstall instructions, see [docs/UNINSTALL.md](docs/UNINSTALL.md).
|
|
169
|
+
|
|
170
|
+
Quick uninstall:
|
|
171
|
+
```bash
|
|
172
|
+
# Remove MCP server
|
|
173
|
+
claude mcp remove claude-self-reflect
|
|
174
|
+
|
|
175
|
+
# Stop Docker containers
|
|
176
|
+
docker-compose down
|
|
177
|
+
|
|
178
|
+
# Uninstall npm package
|
|
179
|
+
npm uninstall -g claude-self-reflect
|
|
180
|
+
```
|
|
181
|
+
|
|
182
|
+
</details>
|
|
183
|
+
|
|
184
|
+
## 📦 What's New
|
|
185
|
+
|
|
186
|
+
<details>
|
|
187
|
+
<summary>🎉 v2.8.0 - Latest Release</summary>
|
|
210
188
|
|
|
211
|
-
|
|
212
|
-
|
|
189
|
+
- **🔧 Fixed MCP Indexing**: Now correctly shows 97.1% progress (was showing 0%)
|
|
190
|
+
- **🔥 HOT/WARM/COLD**: Intelligent file prioritization for near real-time imports
|
|
191
|
+
- **📊 Enhanced Monitoring**: Real-time status with visual indicators
|
|
213
192
|
|
|
214
|
-
|
|
193
|
+
</details>
|
|
194
|
+
|
|
195
|
+
<details>
|
|
196
|
+
<summary>✨ v2.5.19 - Metadata Enrichment</summary>
|
|
197
|
+
|
|
198
|
+
### For Existing Users
|
|
215
199
|
```bash
|
|
216
200
|
# Update to latest version
|
|
217
201
|
npm update -g claude-self-reflect
|
|
@@ -224,50 +208,30 @@ claude-self-reflect setup
|
|
|
224
208
|
docker compose run --rm importer python /app/scripts/delta-metadata-update-safe.py
|
|
225
209
|
```
|
|
226
210
|
|
|
227
|
-
|
|
211
|
+
### What You Get
|
|
228
212
|
- `search_by_concept("docker")` - Find conversations by topic
|
|
229
213
|
- `search_by_file("server.py")` - Find conversations that touched specific files
|
|
230
214
|
- Better search accuracy with metadata-based filtering
|
|
231
215
|
|
|
232
|
-
|
|
216
|
+
</details>
|
|
217
|
+
|
|
218
|
+
<details>
|
|
219
|
+
<summary>📜 Release History</summary>
|
|
233
220
|
|
|
234
|
-
- **v2.5.19** - Metadata Enrichment! Search by concepts, files, and tools. [Full release notes](docs/releases/v2.5.19-RELEASE-NOTES.md)
|
|
235
221
|
- **v2.5.18** - Security dependency updates
|
|
236
|
-
- **v2.5.17** - Critical CPU fix and memory limit adjustment
|
|
237
|
-
- **v2.5.16** -
|
|
222
|
+
- **v2.5.17** - Critical CPU fix and memory limit adjustment
|
|
223
|
+
- **v2.5.16** - Initial streaming importer with CPU throttling
|
|
238
224
|
- **v2.5.15** - Critical bug fixes and collection creation improvements
|
|
239
|
-
- **v2.5.14** - Async importer collection fix
|
|
240
|
-
- **v2.5.11** - Critical cloud mode fix
|
|
241
|
-
- **v2.5.10** - Emergency hotfix for MCP server startup
|
|
242
|
-
- **v2.5.6** - Tool Output Extraction
|
|
225
|
+
- **v2.5.14** - Async importer collection fix
|
|
226
|
+
- **v2.5.11** - Critical cloud mode fix
|
|
227
|
+
- **v2.5.10** - Emergency hotfix for MCP server startup
|
|
228
|
+
- **v2.5.6** - Tool Output Extraction
|
|
243
229
|
|
|
244
230
|
[Full changelog](docs/release-history.md)
|
|
245
231
|
|
|
246
|
-
|
|
247
|
-
|
|
248
|
-
- [Performance tuning](docs/performance-guide.md)
|
|
249
|
-
- [Security & privacy](docs/security.md)
|
|
250
|
-
- [Windows setup](docs/windows-setup.md)
|
|
251
|
-
- [Architecture details](docs/architecture-details.md)
|
|
252
|
-
- [Contributing](CONTRIBUTING.md)
|
|
253
|
-
|
|
254
|
-
### Uninstall
|
|
255
|
-
|
|
256
|
-
For complete uninstall instructions, see [docs/UNINSTALL.md](docs/UNINSTALL.md).
|
|
257
|
-
|
|
258
|
-
Quick uninstall:
|
|
259
|
-
```bash
|
|
260
|
-
# Remove MCP server
|
|
261
|
-
claude mcp remove claude-self-reflect
|
|
262
|
-
|
|
263
|
-
# Stop Docker containers
|
|
264
|
-
docker-compose down
|
|
265
|
-
|
|
266
|
-
# Uninstall npm package
|
|
267
|
-
npm uninstall -g claude-self-reflect
|
|
268
|
-
```
|
|
232
|
+
</details>
|
|
269
233
|
|
|
270
|
-
## Contributors
|
|
234
|
+
## 👥 Contributors
|
|
271
235
|
|
|
272
236
|
Special thanks to our contributors:
|
|
273
237
|
- **[@TheGordon](https://github.com/TheGordon)** - Fixed timestamp parsing (#10)
|
|
@@ -276,4 +240,4 @@ Special thanks to our contributors:
|
|
|
276
240
|
|
|
277
241
|
---
|
|
278
242
|
|
|
279
|
-
Built with ❤️
|
|
243
|
+
Built with ❤️ by [ramakay](https://github.com/ramakay) for the Claude community.
|
package/docker-compose.yaml
CHANGED
|
@@ -177,21 +177,29 @@ services:
|
|
|
177
177
|
- ./scripts:/scripts:ro
|
|
178
178
|
environment:
|
|
179
179
|
- QDRANT_URL=http://qdrant:6333
|
|
180
|
-
- STATE_FILE=/config/watcher
|
|
180
|
+
- STATE_FILE=/config/csr-watcher.json
|
|
181
|
+
- LOGS_DIR=/logs # Fixed: Point to mounted volume
|
|
181
182
|
- VOYAGE_KEY=${VOYAGE_KEY:-}
|
|
182
183
|
- PREFER_LOCAL_EMBEDDINGS=${PREFER_LOCAL_EMBEDDINGS:-true}
|
|
183
|
-
-
|
|
184
|
-
-
|
|
185
|
-
-
|
|
186
|
-
-
|
|
187
|
-
-
|
|
184
|
+
- ENABLE_MEMORY_DECAY=${ENABLE_MEMORY_DECAY:-false}
|
|
185
|
+
- DECAY_WEIGHT=${DECAY_WEIGHT:-0.3}
|
|
186
|
+
- DECAY_SCALE_DAYS=${DECAY_SCALE_DAYS:-90}
|
|
187
|
+
- CHECK_INTERVAL_S=${CHECK_INTERVAL_S:-60}
|
|
188
|
+
- HOT_CHECK_INTERVAL_S=${HOT_CHECK_INTERVAL_S:-2}
|
|
189
|
+
- HOT_WINDOW_MINUTES=${HOT_WINDOW_MINUTES:-5}
|
|
190
|
+
- WARM_WINDOW_HOURS=${WARM_WINDOW_HOURS:-24}
|
|
191
|
+
- MAX_COLD_FILES=${MAX_COLD_FILES:-5}
|
|
192
|
+
- MAX_WARM_WAIT_MINUTES=${MAX_WARM_WAIT_MINUTES:-30}
|
|
193
|
+
- MAX_MESSAGES_PER_CHUNK=${MAX_MESSAGES_PER_CHUNK:-10}
|
|
188
194
|
- MAX_CHUNK_SIZE=${MAX_CHUNK_SIZE:-50} # Messages per chunk for streaming
|
|
195
|
+
- MEMORY_LIMIT_MB=${MEMORY_LIMIT_MB:-1000}
|
|
196
|
+
- MEMORY_WARNING_MB=${MEMORY_WARNING_MB:-500}
|
|
189
197
|
- PYTHONUNBUFFERED=1
|
|
190
198
|
- MALLOC_ARENA_MAX=2
|
|
191
|
-
restart:
|
|
192
|
-
profiles: ["safe-watch"] # Requires explicit profile to run
|
|
193
|
-
mem_limit:
|
|
194
|
-
memswap_limit:
|
|
199
|
+
restart: unless-stopped
|
|
200
|
+
profiles: ["safe-watch", "watch"] # Requires explicit profile to run
|
|
201
|
+
mem_limit: 1g # Increased to 1GB to match MEMORY_LIMIT_MB
|
|
202
|
+
memswap_limit: 1g
|
|
195
203
|
cpus: 1.0 # Single CPU core limit
|
|
196
204
|
|
|
197
205
|
# MCP server for Claude integration
|
|
@@ -454,6 +454,26 @@ async function enrichMetadata() {
|
|
|
454
454
|
}
|
|
455
455
|
}
|
|
456
456
|
|
|
457
|
+
async function startWatcher() {
|
|
458
|
+
console.log('\n🔄 Starting the streaming watcher...');
|
|
459
|
+
console.log(' • HOT files (<5 min): 2-second processing');
|
|
460
|
+
console.log(' • WARM files (<24 hrs): Normal priority');
|
|
461
|
+
console.log(' • COLD files (>24 hrs): Batch processing');
|
|
462
|
+
|
|
463
|
+
try {
|
|
464
|
+
safeExec('docker', ['compose', '--profile', 'watch', 'up', '-d', 'safe-watcher'], {
|
|
465
|
+
cwd: projectRoot,
|
|
466
|
+
stdio: 'inherit'
|
|
467
|
+
});
|
|
468
|
+
console.log('✅ Watcher started successfully!');
|
|
469
|
+
return true;
|
|
470
|
+
} catch (error) {
|
|
471
|
+
console.log('⚠️ Could not start watcher automatically');
|
|
472
|
+
console.log(' You can start it manually with: docker compose --profile watch up -d');
|
|
473
|
+
return false;
|
|
474
|
+
}
|
|
475
|
+
}
|
|
476
|
+
|
|
457
477
|
async function showFinalInstructions() {
|
|
458
478
|
console.log('\n✅ Setup complete!');
|
|
459
479
|
|
|
@@ -461,7 +481,7 @@ async function showFinalInstructions() {
|
|
|
461
481
|
console.log(' • 🌐 Qdrant Dashboard: http://localhost:6333/dashboard/');
|
|
462
482
|
console.log(' • 📊 Status: All services running');
|
|
463
483
|
console.log(' • 🔍 Search: Semantic search with memory decay enabled');
|
|
464
|
-
console.log(' • 🚀
|
|
484
|
+
console.log(' • 🚀 Watcher: HOT/WARM/COLD prioritization active');
|
|
465
485
|
|
|
466
486
|
console.log('\n📋 Quick Reference Commands:');
|
|
467
487
|
console.log(' • Check status: docker compose ps');
|
|
@@ -568,6 +588,9 @@ async function main() {
|
|
|
568
588
|
// Enrich metadata (new in v2.5.19)
|
|
569
589
|
await enrichMetadata();
|
|
570
590
|
|
|
591
|
+
// Start the watcher
|
|
592
|
+
await startWatcher();
|
|
593
|
+
|
|
571
594
|
// Show final instructions
|
|
572
595
|
await showFinalInstructions();
|
|
573
596
|
|
|
@@ -54,7 +54,7 @@ class ProjectResolver:
|
|
|
54
54
|
4. Fuzzy matching on collection names
|
|
55
55
|
|
|
56
56
|
Args:
|
|
57
|
-
user_project_name: User-provided project name (e.g., "
|
|
57
|
+
user_project_name: User-provided project name (e.g., "example-project", "Example-Project", full path)
|
|
58
58
|
|
|
59
59
|
Returns:
|
|
60
60
|
List of collection names that match the project
|
|
@@ -362,7 +362,7 @@ class ProjectResolver:
|
|
|
362
362
|
|
|
363
363
|
Examples:
|
|
364
364
|
- -Users-name-projects-my-app-src -> ['my', 'app', 'src']
|
|
365
|
-
- -Users-name-Code-
|
|
365
|
+
- -Users-name-Code-example-project -> ['example', 'project']
|
|
366
366
|
|
|
367
367
|
Args:
|
|
368
368
|
path: Path in any format
|
package/mcp-server/src/server.py
CHANGED
|
@@ -9,6 +9,7 @@ import json
|
|
|
9
9
|
import numpy as np
|
|
10
10
|
import hashlib
|
|
11
11
|
import time
|
|
12
|
+
import logging
|
|
12
13
|
|
|
13
14
|
from fastmcp import FastMCP, Context
|
|
14
15
|
from .utils import normalize_project_name
|
|
@@ -80,15 +81,23 @@ def initialize_embeddings():
|
|
|
80
81
|
print(f"[ERROR] Failed to initialize embeddings: {e}")
|
|
81
82
|
return False
|
|
82
83
|
|
|
83
|
-
# Debug environment loading
|
|
84
|
-
|
|
85
|
-
|
|
86
|
-
|
|
87
|
-
print(f"[
|
|
88
|
-
print(f"[
|
|
89
|
-
print(f"[
|
|
90
|
-
print(f"[
|
|
91
|
-
print(f"[DEBUG]
|
|
84
|
+
# Debug environment loading and startup
|
|
85
|
+
import sys
|
|
86
|
+
import datetime as dt
|
|
87
|
+
startup_time = dt.datetime.now().isoformat()
|
|
88
|
+
print(f"[STARTUP] MCP Server starting at {startup_time}", file=sys.stderr)
|
|
89
|
+
print(f"[STARTUP] Python: {sys.version}", file=sys.stderr)
|
|
90
|
+
print(f"[STARTUP] Working directory: {os.getcwd()}", file=sys.stderr)
|
|
91
|
+
print(f"[STARTUP] Script location: {__file__}", file=sys.stderr)
|
|
92
|
+
print(f"[DEBUG] Environment variables loaded:", file=sys.stderr)
|
|
93
|
+
print(f"[DEBUG] QDRANT_URL: {QDRANT_URL}", file=sys.stderr)
|
|
94
|
+
print(f"[DEBUG] ENABLE_MEMORY_DECAY: {ENABLE_MEMORY_DECAY}", file=sys.stderr)
|
|
95
|
+
print(f"[DEBUG] USE_NATIVE_DECAY: {USE_NATIVE_DECAY}", file=sys.stderr)
|
|
96
|
+
print(f"[DEBUG] DECAY_WEIGHT: {DECAY_WEIGHT}", file=sys.stderr)
|
|
97
|
+
print(f"[DEBUG] DECAY_SCALE_DAYS: {DECAY_SCALE_DAYS}", file=sys.stderr)
|
|
98
|
+
print(f"[DEBUG] PREFER_LOCAL_EMBEDDINGS: {PREFER_LOCAL_EMBEDDINGS}", file=sys.stderr)
|
|
99
|
+
print(f"[DEBUG] EMBEDDING_MODEL: {EMBEDDING_MODEL}", file=sys.stderr)
|
|
100
|
+
print(f"[DEBUG] env_path: {env_path}", file=sys.stderr)
|
|
92
101
|
|
|
93
102
|
|
|
94
103
|
class SearchResult(BaseModel):
|
|
@@ -124,18 +133,48 @@ indexing_status = {
|
|
|
124
133
|
"is_checking": False
|
|
125
134
|
}
|
|
126
135
|
|
|
127
|
-
|
|
136
|
+
# Cache for indexing status (5-second TTL)
|
|
137
|
+
_indexing_cache = {"result": None, "timestamp": 0}
|
|
138
|
+
|
|
139
|
+
# Setup logger
|
|
140
|
+
logger = logging.getLogger(__name__)
|
|
141
|
+
|
|
142
|
+
def normalize_path(path_str: str) -> str:
|
|
143
|
+
"""Normalize path for consistent comparison across platforms.
|
|
144
|
+
|
|
145
|
+
Args:
|
|
146
|
+
path_str: Path string to normalize
|
|
147
|
+
|
|
148
|
+
Returns:
|
|
149
|
+
Normalized path string with consistent separators
|
|
150
|
+
"""
|
|
151
|
+
if not path_str:
|
|
152
|
+
return path_str
|
|
153
|
+
p = Path(path_str).expanduser().resolve()
|
|
154
|
+
return str(p).replace('\\', '/') # Consistent separators for all platforms
|
|
155
|
+
|
|
156
|
+
async def update_indexing_status(cache_ttl: int = 5):
|
|
128
157
|
"""Update indexing status by checking JSONL files vs Qdrant collections.
|
|
129
|
-
This is a lightweight check that compares file counts, not full content.
|
|
130
|
-
|
|
158
|
+
This is a lightweight check that compares file counts, not full content.
|
|
159
|
+
|
|
160
|
+
Args:
|
|
161
|
+
cache_ttl: Cache time-to-live in seconds (default: 5)
|
|
162
|
+
"""
|
|
163
|
+
global indexing_status, _indexing_cache
|
|
164
|
+
|
|
165
|
+
# Check cache first (5-second TTL to prevent performance issues)
|
|
166
|
+
current_time = time.time()
|
|
167
|
+
if _indexing_cache["result"] and current_time - _indexing_cache["timestamp"] < cache_ttl:
|
|
168
|
+
# Use cached result
|
|
169
|
+
indexing_status = _indexing_cache["result"].copy()
|
|
170
|
+
return
|
|
131
171
|
|
|
132
172
|
# Don't run concurrent checks
|
|
133
173
|
if indexing_status["is_checking"]:
|
|
134
174
|
return
|
|
135
175
|
|
|
136
|
-
#
|
|
137
|
-
current_time
|
|
138
|
-
if current_time - indexing_status["last_check"] < 300: # 5 minutes
|
|
176
|
+
# Check immediately on first call, then every 60 seconds to avoid overhead
|
|
177
|
+
if indexing_status["last_check"] > 0 and current_time - indexing_status["last_check"] < 60: # 1 minute
|
|
139
178
|
return
|
|
140
179
|
|
|
141
180
|
indexing_status["is_checking"] = True
|
|
@@ -151,47 +190,108 @@ async def update_indexing_status():
|
|
|
151
190
|
jsonl_files = list(projects_dir.glob("**/*.jsonl"))
|
|
152
191
|
total_files = len(jsonl_files)
|
|
153
192
|
|
|
154
|
-
# Check imported-files.json to see what's been imported
|
|
155
|
-
# The
|
|
156
|
-
|
|
193
|
+
# Check imported-files.json AND watcher state files to see what's been imported
|
|
194
|
+
# The system uses multiple state files that need to be merged
|
|
195
|
+
all_imported_files = set() # Use set to avoid duplicates
|
|
196
|
+
file_metadata = {}
|
|
197
|
+
|
|
198
|
+
# 1. Check imported-files.json (batch importer)
|
|
157
199
|
possible_paths = [
|
|
158
200
|
Path.home() / ".claude-self-reflect" / "config" / "imported-files.json",
|
|
159
201
|
Path(__file__).parent.parent.parent / "config" / "imported-files.json",
|
|
160
202
|
Path("/config/imported-files.json") # Docker path if running in container
|
|
161
203
|
]
|
|
162
204
|
|
|
163
|
-
imported_files_path = None
|
|
164
205
|
for path in possible_paths:
|
|
165
206
|
if path.exists():
|
|
166
|
-
|
|
167
|
-
|
|
207
|
+
try:
|
|
208
|
+
with open(path, 'r') as f:
|
|
209
|
+
imported_data = json.load(f)
|
|
210
|
+
imported_files_dict = imported_data.get("imported_files", {})
|
|
211
|
+
file_metadata.update(imported_data.get("file_metadata", {}))
|
|
212
|
+
# Normalize paths before adding to set
|
|
213
|
+
normalized_files = {normalize_path(k) for k in imported_files_dict.keys()}
|
|
214
|
+
all_imported_files.update(normalized_files)
|
|
215
|
+
except (json.JSONDecodeError, IOError) as e:
|
|
216
|
+
logger.debug(f"Failed to read state file {path}: {e}")
|
|
217
|
+
pass # Continue if file is corrupted
|
|
168
218
|
|
|
169
|
-
|
|
170
|
-
|
|
171
|
-
|
|
172
|
-
|
|
173
|
-
|
|
174
|
-
|
|
175
|
-
|
|
176
|
-
|
|
177
|
-
|
|
178
|
-
|
|
179
|
-
|
|
180
|
-
|
|
181
|
-
|
|
182
|
-
|
|
183
|
-
|
|
184
|
-
|
|
185
|
-
|
|
186
|
-
|
|
187
|
-
|
|
188
|
-
|
|
189
|
-
|
|
190
|
-
|
|
191
|
-
|
|
192
|
-
|
|
193
|
-
|
|
194
|
-
|
|
219
|
+
# 2. Check csr-watcher.json (streaming watcher - local mode)
|
|
220
|
+
watcher_paths = [
|
|
221
|
+
Path.home() / ".claude-self-reflect" / "config" / "csr-watcher.json",
|
|
222
|
+
Path("/config/csr-watcher.json") # Docker path
|
|
223
|
+
]
|
|
224
|
+
|
|
225
|
+
for path in watcher_paths:
|
|
226
|
+
if path.exists():
|
|
227
|
+
try:
|
|
228
|
+
with open(path, 'r') as f:
|
|
229
|
+
watcher_data = json.load(f)
|
|
230
|
+
watcher_files = watcher_data.get("imported_files", {})
|
|
231
|
+
# Normalize paths before adding to set
|
|
232
|
+
normalized_files = {normalize_path(k) for k in watcher_files.keys()}
|
|
233
|
+
all_imported_files.update(normalized_files)
|
|
234
|
+
# Add to metadata with normalized paths
|
|
235
|
+
for file_path, info in watcher_files.items():
|
|
236
|
+
normalized = normalize_path(file_path)
|
|
237
|
+
if normalized not in file_metadata:
|
|
238
|
+
file_metadata[normalized] = {
|
|
239
|
+
"position": 1,
|
|
240
|
+
"chunks": info.get("chunks", 0)
|
|
241
|
+
}
|
|
242
|
+
except (json.JSONDecodeError, IOError) as e:
|
|
243
|
+
logger.debug(f"Failed to read watcher state file {path}: {e}")
|
|
244
|
+
pass # Continue if file is corrupted
|
|
245
|
+
|
|
246
|
+
# 3. Check csr-watcher-cloud.json (streaming watcher - cloud mode)
|
|
247
|
+
cloud_watcher_path = Path.home() / ".claude-self-reflect" / "config" / "csr-watcher-cloud.json"
|
|
248
|
+
if cloud_watcher_path.exists():
|
|
249
|
+
try:
|
|
250
|
+
with open(cloud_watcher_path, 'r') as f:
|
|
251
|
+
cloud_data = json.load(f)
|
|
252
|
+
cloud_files = cloud_data.get("imported_files", {})
|
|
253
|
+
# Normalize paths before adding to set
|
|
254
|
+
normalized_files = {normalize_path(k) for k in cloud_files.keys()}
|
|
255
|
+
all_imported_files.update(normalized_files)
|
|
256
|
+
# Add to metadata with normalized paths
|
|
257
|
+
for file_path, info in cloud_files.items():
|
|
258
|
+
normalized = normalize_path(file_path)
|
|
259
|
+
if normalized not in file_metadata:
|
|
260
|
+
file_metadata[normalized] = {
|
|
261
|
+
"position": 1,
|
|
262
|
+
"chunks": info.get("chunks", 0)
|
|
263
|
+
}
|
|
264
|
+
except (json.JSONDecodeError, IOError) as e:
|
|
265
|
+
logger.debug(f"Failed to read cloud watcher state file {cloud_watcher_path}: {e}")
|
|
266
|
+
pass # Continue if file is corrupted
|
|
267
|
+
|
|
268
|
+
# Convert set to list for compatibility
|
|
269
|
+
imported_files_list = list(all_imported_files)
|
|
270
|
+
|
|
271
|
+
# Count files that have been imported
|
|
272
|
+
for file_path in jsonl_files:
|
|
273
|
+
# Normalize the current file path for consistent comparison
|
|
274
|
+
normalized_file = normalize_path(str(file_path))
|
|
275
|
+
|
|
276
|
+
# Try multiple path formats to match Docker's state file
|
|
277
|
+
file_str = str(file_path).replace(str(Path.home()), "/logs").replace("\\", "/")
|
|
278
|
+
# Also try without .claude/projects prefix (Docker mounts directly)
|
|
279
|
+
file_str_alt = file_str.replace("/.claude/projects", "")
|
|
280
|
+
|
|
281
|
+
# Normalize alternative paths as well
|
|
282
|
+
normalized_alt = normalize_path(file_str)
|
|
283
|
+
normalized_alt2 = normalize_path(file_str_alt)
|
|
284
|
+
|
|
285
|
+
# Check if file is in imported_files list (fully imported)
|
|
286
|
+
if normalized_file in imported_files_list or normalized_alt in imported_files_list or normalized_alt2 in imported_files_list:
|
|
287
|
+
indexed_files += 1
|
|
288
|
+
# Or if it has metadata with position > 0 (partially imported)
|
|
289
|
+
elif normalized_file in file_metadata and file_metadata[normalized_file].get("position", 0) > 0:
|
|
290
|
+
indexed_files += 1
|
|
291
|
+
elif normalized_alt in file_metadata and file_metadata[normalized_alt].get("position", 0) > 0:
|
|
292
|
+
indexed_files += 1
|
|
293
|
+
elif normalized_alt2 in file_metadata and file_metadata[normalized_alt2].get("position", 0) > 0:
|
|
294
|
+
indexed_files += 1
|
|
195
295
|
|
|
196
296
|
# Update status
|
|
197
297
|
indexing_status["last_check"] = current_time
|
|
@@ -203,9 +303,14 @@ async def update_indexing_status():
|
|
|
203
303
|
indexing_status["percentage"] = (indexed_files / total_files) * 100
|
|
204
304
|
else:
|
|
205
305
|
indexing_status["percentage"] = 100.0
|
|
306
|
+
|
|
307
|
+
# Update cache
|
|
308
|
+
_indexing_cache["result"] = indexing_status.copy()
|
|
309
|
+
_indexing_cache["timestamp"] = current_time
|
|
206
310
|
|
|
207
311
|
except Exception as e:
|
|
208
312
|
print(f"[WARNING] Failed to update indexing status: {e}")
|
|
313
|
+
logger.error(f"Failed to update indexing status: {e}", exc_info=True)
|
|
209
314
|
finally:
|
|
210
315
|
indexing_status["is_checking"] = False
|
|
211
316
|
|
|
@@ -1422,4 +1527,8 @@ if __name__ == "__main__":
|
|
|
1422
1527
|
sys.exit(0)
|
|
1423
1528
|
|
|
1424
1529
|
# Normal MCP server operation
|
|
1530
|
+
print(f"[STARTUP] Starting FastMCP server in stdio mode...", file=sys.stderr)
|
|
1531
|
+
print(f"[STARTUP] Server name: {mcp.name}", file=sys.stderr)
|
|
1532
|
+
print(f"[STARTUP] Calling mcp.run()...", file=sys.stderr)
|
|
1425
1533
|
mcp.run()
|
|
1534
|
+
print(f"[STARTUP] Server exited normally", file=sys.stderr)
|
package/mcp-server/src/status.py
CHANGED
|
@@ -5,6 +5,7 @@ Designed for <20ms execution time to support status bars and shell scripts.
|
|
|
5
5
|
"""
|
|
6
6
|
|
|
7
7
|
import json
|
|
8
|
+
import time
|
|
8
9
|
from pathlib import Path
|
|
9
10
|
from collections import defaultdict
|
|
10
11
|
|
|
@@ -53,11 +54,36 @@ def normalize_file_path(file_path: str) -> str:
|
|
|
53
54
|
return file_path
|
|
54
55
|
|
|
55
56
|
|
|
57
|
+
def get_watcher_status() -> dict:
    """Get streaming watcher status if available.

    Reads the watcher's JSON state file and infers liveness from the file's
    modification time (a running watcher rewrites its state continuously).

    Returns:
        dict: Always contains "running" (bool) and "status" (str); when the
        state file is readable it also includes "files_processed" and
        "last_update_seconds".
    """
    # NOTE(review): path is $HOME/config/csr-watcher.json — confirm this
    # shouldn't be a dotted config dir (e.g. ~/.claude-self-reflect/config).
    watcher_state_file = Path.home() / "config" / "csr-watcher.json"

    if not watcher_state_file.exists():
        return {"running": False, "status": "not configured"}

    try:
        with open(watcher_state_file) as f:
            state = json.load(f)

        # Treat the watcher as alive only if it touched its state file
        # within the last 2 minutes.
        file_age = time.time() - watcher_state_file.stat().st_mtime
        is_active = file_age < 120

        return {
            "running": is_active,
            "files_processed": len(state.get("imported_files", {})),
            "last_update_seconds": int(file_age),
            "status": "🟢 active" if is_active else "🔴 inactive"
        }
    except (OSError, ValueError):
        # Narrowed from a bare `except:` so KeyboardInterrupt/SystemExit
        # propagate; ValueError covers json.JSONDecodeError.
        return {"running": False, "status": "error reading state"}
|
|
80
|
+
|
|
81
|
+
|
|
56
82
|
def get_status() -> dict:
|
|
57
83
|
"""Get indexing status with overall stats and per-project breakdown.
|
|
58
84
|
|
|
59
85
|
Returns:
|
|
60
|
-
dict: JSON structure with overall and per-project indexing status
|
|
86
|
+
dict: JSON structure with overall and per-project indexing status, plus watcher status
|
|
61
87
|
"""
|
|
62
88
|
projects_dir = Path.home() / ".claude" / "projects"
|
|
63
89
|
project_stats = defaultdict(lambda: {"indexed": 0, "total": 0})
|
|
@@ -154,6 +180,9 @@ def get_status() -> dict:
|
|
|
154
180
|
"total": stats["total"]
|
|
155
181
|
}
|
|
156
182
|
|
|
183
|
+
# Add watcher status
|
|
184
|
+
result["watcher"] = get_watcher_status()
|
|
185
|
+
|
|
157
186
|
return result
|
|
158
187
|
|
|
159
188
|
|
package/package.json
CHANGED
|
@@ -1,374 +0,0 @@
|
|
|
1
|
-
#!/usr/bin/env python3
|
|
2
|
-
"""
|
|
3
|
-
Streaming importer with true line-by-line processing to prevent OOM.
|
|
4
|
-
Processes JSONL files without loading entire file into memory.
|
|
5
|
-
"""
|
|
6
|
-
|
|
7
|
-
import json
|
|
8
|
-
import os
|
|
9
|
-
import sys
|
|
10
|
-
import hashlib
|
|
11
|
-
import gc
|
|
12
|
-
from pathlib import Path
|
|
13
|
-
from datetime import datetime
|
|
14
|
-
from typing import List, Dict, Any, Optional
|
|
15
|
-
import logging
|
|
16
|
-
|
|
17
|
-
# Add the project root to the Python path
|
|
18
|
-
project_root = Path(__file__).parent.parent
|
|
19
|
-
sys.path.insert(0, str(project_root))
|
|
20
|
-
|
|
21
|
-
from qdrant_client import QdrantClient
|
|
22
|
-
from qdrant_client.models import PointStruct, Distance, VectorParams
|
|
23
|
-
|
|
24
|
-
# Set up logging
|
|
25
|
-
logging.basicConfig(
|
|
26
|
-
level=logging.INFO,
|
|
27
|
-
format='%(asctime)s - %(levelname)s - %(message)s'
|
|
28
|
-
)
|
|
29
|
-
logger = logging.getLogger(__name__)
|
|
30
|
-
|
|
31
|
-
# Environment variables
|
|
32
|
-
QDRANT_URL = os.getenv("QDRANT_URL", "http://localhost:6333")
|
|
33
|
-
STATE_FILE = os.getenv("STATE_FILE", "/config/imported-files.json")
|
|
34
|
-
PREFER_LOCAL_EMBEDDINGS = os.getenv("PREFER_LOCAL_EMBEDDINGS", "true").lower() == "true"
|
|
35
|
-
VOYAGE_API_KEY = os.getenv("VOYAGE_KEY")
|
|
36
|
-
MAX_CHUNK_SIZE = int(os.getenv("MAX_CHUNK_SIZE", "50")) # Messages per chunk
|
|
37
|
-
|
|
38
|
-
# Initialize Qdrant client
|
|
39
|
-
client = QdrantClient(url=QDRANT_URL)
|
|
40
|
-
|
|
41
|
-
# Initialize embedding provider
|
|
42
|
-
embedding_provider = None
|
|
43
|
-
embedding_dimension = None
|
|
44
|
-
|
|
45
|
-
if PREFER_LOCAL_EMBEDDINGS or not VOYAGE_API_KEY:
|
|
46
|
-
logger.info("Using local embeddings (fastembed)")
|
|
47
|
-
from fastembed import TextEmbedding
|
|
48
|
-
embedding_provider = TextEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")
|
|
49
|
-
embedding_dimension = 384
|
|
50
|
-
collection_suffix = "local"
|
|
51
|
-
else:
|
|
52
|
-
logger.info("Using Voyage AI embeddings")
|
|
53
|
-
import voyageai
|
|
54
|
-
embedding_provider = voyageai.Client(api_key=VOYAGE_API_KEY)
|
|
55
|
-
embedding_dimension = 1024
|
|
56
|
-
collection_suffix = "voyage"
|
|
57
|
-
|
|
58
|
-
def normalize_project_name(project_name: str) -> str:
    """Normalize a flattened Claude project directory name.

    Claude stores projects under names like "-Users-<name>-projects-my-app".
    The original implementation hard-coded one specific username; this strips
    the home-derived prefix for any alphanumeric username (backward
    compatible), then folds the remainder to lowercase snake_case.
    """
    import re  # local import keeps the module's top-level imports untouched

    stripped = re.sub(r"-Users-[A-Za-z0-9]+-projects-", "", project_name)
    return stripped.replace("-", "_").lower()
|
|
61
|
-
|
|
62
|
-
def get_collection_name(project_path: Path) -> str:
    """Derive the Qdrant collection name for a project directory.

    The name embeds a short MD5 digest (naming only, not security) of the
    normalized project name plus the active embedding suffix.
    """
    normalized_name = normalize_project_name(project_path.name)
    short_hash = hashlib.md5(normalized_name.encode()).hexdigest()[:8]
    return f"conv_{short_hash}_{collection_suffix}"
|
|
67
|
-
|
|
68
|
-
def ensure_collection(collection_name: str):
    """Create the named Qdrant collection if it does not already exist."""
    existing_names = {c.name for c in client.get_collections().collections}
    if collection_name in existing_names:
        return
    logger.info(f"Creating collection: {collection_name}")
    client.create_collection(
        collection_name=collection_name,
        vectors_config=VectorParams(size=embedding_dimension, distance=Distance.COSINE)
    )
|
|
77
|
-
|
|
78
|
-
def generate_embeddings(texts: List[str]) -> List[List[float]]:
    """Embed a batch of texts with the active provider (local or Voyage)."""
    use_local = PREFER_LOCAL_EMBEDDINGS or not VOYAGE_API_KEY
    if not use_local:
        # The Voyage client already returns plain Python lists.
        return embedding_provider.embed(texts, model="voyage-3").embeddings
    # fastembed yields numpy vectors; coerce each to a plain list.
    vectors = []
    for vec in embedding_provider.passage_embed(texts):
        vectors.append(vec.tolist() if hasattr(vec, 'tolist') else vec)
    return vectors
|
|
86
|
-
|
|
87
|
-
def process_and_upload_chunk(messages: List[Dict[str, Any]], chunk_index: int,
                             conversation_id: str, created_at: str,
                             metadata: Dict[str, Any], collection_name: str,
                             project_path: Path) -> int:
    """Embed one chunk of conversation messages and upsert it into Qdrant.

    Returns:
        int: 1 when the chunk was uploaded, 0 when empty or on failure.
    """
    if not messages:
        return 0

    # Render each message as "ROLE: content", dropping empty contents.
    rendered = [
        f"{m.get('role', 'unknown').upper()}: {m.get('content', '')}"
        for m in messages
        if m.get("content", "")
    ]
    if not rendered:
        return 0

    chunk_text = "\n".join(rendered)

    try:
        vectors = generate_embeddings([chunk_text])

        # Deterministic point id derived from conversation + chunk index.
        digest = hashlib.md5(
            f"{conversation_id}_{chunk_index}".encode()
        ).hexdigest()[:16]

        payload = {
            "text": chunk_text,
            "conversation_id": conversation_id,
            "chunk_index": chunk_index,
            "timestamp": created_at,
            "project": normalize_project_name(project_path.name),
            "start_role": messages[0].get("role", "unknown") if messages else "unknown",
            "message_count": len(messages)
        }
        if metadata:
            payload.update(metadata)

        # Fold the 64-bit hex digest into Qdrant's unsigned-int id space.
        point = PointStruct(
            id=int(digest, 16) % (2**63),
            vector=vectors[0],
            payload=payload
        )

        # Upload synchronously so state bookkeeping stays accurate.
        client.upsert(
            collection_name=collection_name,
            points=[point],
            wait=True
        )
        return 1

    except Exception as e:
        logger.error(f"Error processing chunk {chunk_index}: {e}")
        return 0
|
|
151
|
-
|
|
152
|
-
def extract_metadata_single_pass(file_path: str) -> tuple[Dict[str, Any], str]:
    """Scan a JSONL conversation once, collecting tool and file metadata.

    Returns:
        tuple: (metadata dict, ISO timestamp of the first stamped entry,
        falling back to "now" when none is found).
    """
    metadata = {
        "files_analyzed": [],
        "files_edited": [],
        "tools_used": [],
        "concepts": []
    }
    first_timestamp = None

    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            for raw_line in f:
                if not raw_line.strip():
                    continue

                try:
                    entry = json.loads(raw_line)

                    # Remember the first timestamp seen in the file.
                    if first_timestamp is None and 'timestamp' in entry:
                        first_timestamp = entry.get('timestamp')

                    message = entry.get('message')
                    if not message:
                        continue
                    content = message.get('content')
                    if not content or not isinstance(content, list):
                        continue

                    for part in content:
                        if not (isinstance(part, dict) and part.get('type') == 'tool_use'):
                            continue

                        tool_name = part.get('name', '')
                        if tool_name and tool_name not in metadata['tools_used']:
                            metadata['tools_used'].append(tool_name)

                        # Record any file paths the tool call referenced.
                        if 'input' in part:
                            input_data = part['input']
                            if isinstance(input_data, dict):
                                for key in ('file_path', 'path'):
                                    if key in input_data:
                                        file_ref = input_data[key]
                                        if file_ref not in metadata['files_analyzed']:
                                            metadata['files_analyzed'].append(file_ref)

                except json.JSONDecodeError:
                    continue
                except Exception:
                    continue

    except Exception as e:
        logger.warning(f"Error extracting metadata: {e}")

    return metadata, first_timestamp or datetime.now().isoformat()
|
|
210
|
-
|
|
211
|
-
def stream_import_file(jsonl_file: Path, collection_name: str, project_path: Path) -> int:
    """Import one JSONL conversation file in fixed-size chunks.

    Messages are buffered up to MAX_CHUNK_SIZE and flushed to Qdrant as they
    accumulate, so the whole file is never held in memory.

    Returns:
        int: number of chunks successfully uploaded (0 on failure).
    """
    logger.info(f"Streaming import of {jsonl_file.name}")

    # Lightweight first pass: tool/file metadata plus the first timestamp.
    metadata, created_at = extract_metadata_single_pass(str(jsonl_file))

    def flatten_content(content):
        # Claude content may be a list of text parts and/or plain strings.
        if not isinstance(content, list):
            return content
        pieces = []
        for part in content:
            if isinstance(part, dict) and part.get('type') == 'text':
                pieces.append(part.get('text', ''))
            elif isinstance(part, str):
                pieces.append(part)
        return '\n'.join(pieces)

    pending = []
    chunk_index = 0
    total_chunks = 0
    conversation_id = jsonl_file.stem

    try:
        with open(jsonl_file, 'r', encoding='utf-8') as f:
            for line_num, raw in enumerate(f, 1):
                raw = raw.strip()
                if not raw:
                    continue

                try:
                    record = json.loads(raw)

                    # Summary records carry no importable messages.
                    if record.get('type') == 'summary':
                        continue

                    msg = record.get('message')
                    if msg and msg.get('role') and msg.get('content'):
                        text = flatten_content(msg['content'])
                        if text:
                            pending.append({'role': msg['role'], 'content': text})

                    # Flush a full buffer immediately to bound memory use.
                    if len(pending) >= MAX_CHUNK_SIZE:
                        total_chunks += process_and_upload_chunk(
                            pending, chunk_index, conversation_id,
                            created_at, metadata, collection_name, project_path
                        )
                        pending = []
                        chunk_index += 1
                        gc.collect()
                        if chunk_index % 10 == 0:
                            logger.info(f"Processed {chunk_index} chunks from {jsonl_file.name}")

                except json.JSONDecodeError:
                    logger.debug(f"Skipping invalid JSON at line {line_num}")
                except Exception as e:
                    logger.debug(f"Error processing line {line_num}: {e}")

        # Flush whatever is left after the file ends.
        if pending:
            total_chunks += process_and_upload_chunk(
                pending, chunk_index, conversation_id,
                created_at, metadata, collection_name, project_path
            )

        logger.info(f"Imported {total_chunks} chunks from {jsonl_file.name}")
        return total_chunks

    except Exception as e:
        logger.error(f"Failed to import {jsonl_file}: {e}")
        return 0
|
|
295
|
-
|
|
296
|
-
def load_state() -> dict:
    """Load the import state file, falling back to an empty state.

    Returns:
        dict: previously imported files keyed by path under "imported_files".
    """
    if os.path.exists(STATE_FILE):
        try:
            with open(STATE_FILE, 'r') as f:
                return json.load(f)
        except (OSError, ValueError):
            # Narrowed from a bare `except:` so KeyboardInterrupt/SystemExit
            # propagate; ValueError covers json.JSONDecodeError. A corrupt
            # state file simply means everything gets re-imported.
            pass
    return {"imported_files": {}}
|
|
305
|
-
|
|
306
|
-
def save_state(state: dict):
    """Persist import state as pretty-printed JSON, creating the parent dir."""
    state_dir = os.path.dirname(STATE_FILE)
    if state_dir:
        # Guard: dirname is "" when STATE_FILE is a bare filename, and
        # os.makedirs("") raises FileNotFoundError.
        os.makedirs(state_dir, exist_ok=True)
    with open(STATE_FILE, 'w') as f:
        json.dump(state, f, indent=2)
|
|
311
|
-
|
|
312
|
-
def should_import_file(file_path: Path, state: dict) -> bool:
    """Return True unless the file was already imported and is unchanged.

    "Unchanged" means the recorded mtime matches the file's current mtime.
    """
    imported = state.get("imported_files", {})
    key = str(file_path)
    if key in imported:
        # Only stat the file when we actually have a prior record for it.
        if imported[key].get("last_modified") == file_path.stat().st_mtime:
            logger.info(f"Skipping unchanged file: {file_path.name}")
            return False
    return True
|
|
322
|
-
|
|
323
|
-
def update_file_state(file_path: Path, state: dict, chunks: int):
    """Record a successful import of *file_path* in the state dict.

    Stores the import time, the file's mtime (used by should_import_file
    to detect changes), and how many chunks were uploaded.
    """
    state["imported_files"][str(file_path)] = {
        "imported_at": datetime.now().isoformat(),
        "last_modified": file_path.stat().st_mtime,
        "chunks": chunks
    }
|
|
331
|
-
|
|
332
|
-
def main():
    """Run one full import cycle over every project under LOGS_DIR."""
    state = load_state()
    logger.info(f"Loaded state with {len(state.get('imported_files', {}))} previously imported files")

    logs_dir = Path(os.getenv("LOGS_DIR", "/logs"))
    project_dirs = [entry for entry in logs_dir.iterdir() if entry.is_dir()]
    logger.info(f"Found {len(project_dirs)} projects to import")

    total_imported = 0

    for project_dir in project_dirs:
        collection_name = get_collection_name(project_dir)
        logger.info(f"Importing project: {project_dir.name} -> {collection_name}")
        ensure_collection(collection_name)

        # Cap how many files are handled per project each cycle.
        max_files = int(os.getenv("MAX_FILES_PER_CYCLE", "1000"))
        for jsonl_file in sorted(project_dir.glob("*.jsonl"))[:max_files]:
            if should_import_file(jsonl_file, state):
                chunks = stream_import_file(jsonl_file, collection_name, project_dir)
                if chunks > 0:
                    update_file_state(jsonl_file, state, chunks)
                    # Persist after every file so a crash loses little work.
                    save_state(state)
                    total_imported += 1

            # Keep the long-running importer's memory footprint flat.
            gc.collect()

    logger.info(f"Import complete: processed {total_imported} files")
|
|
372
|
-
|
|
373
|
-
# Script entry point: run a full import cycle when executed directly.
if __name__ == "__main__":
    main()
|