claude-self-reflect 4.0.2 → 5.0.2
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +125 -71
- package/mcp-server/src/code_reload_tool.py +207 -98
- package/mcp-server/src/parallel_search.py +22 -12
- package/mcp-server/src/rich_formatting.py +10 -6
- package/mcp-server/src/safe_getters.py +217 -0
- package/mcp-server/src/search_tools.py +48 -9
- package/package.json +1 -1
- package/scripts/auto-migrate.cjs +87 -10
package/README.md
CHANGED
@@ -24,24 +24,30 @@
 
 Give Claude perfect memory of all your conversations. Search past discussions instantly. Never lose context again.
 
-**100% Local by Default** • **…
+**100% Local by Default** • **20x Faster** • **Zero Configuration** • **Production Ready**
+
+## Why This Exists
+
+Claude starts fresh every conversation. You've solved complex bugs, designed architectures, made critical decisions - all forgotten. Until now.
 
 ## Table of Contents
 
-- [Quick Install](…
+- [Quick Install](#quick-install)
+- [Performance](#performance)
 - [The Magic](#the-magic)
 - [Before & After](#before--after)
 - [Real Examples](#real-examples)
 - [NEW: Real-time Indexing Status](#new-real-time-indexing-status-in-your-terminal)
 - [Key Features](#key-features)
+- [Code Quality Insights](#code-quality-insights)
 - [Architecture](#architecture)
 - [Requirements](#requirements)
 - [Documentation](#documentation)
-- […
+- [Keeping Up to Date](#keeping-up-to-date)
 - [Troubleshooting](#troubleshooting)
 - [Contributors](#contributors)
 
-## …
+## Quick Install
 
 ```bash
 # Install and run automatic setup (5 minutes, everything automatic)
@@ -49,15 +55,18 @@ npm install -g claude-self-reflect
 claude-self-reflect setup
 
 # That's it! The setup will:
-# …
-# …
-# …
-# …
-# …
+# - Run everything in Docker (no Python issues!)
+# - Configure everything automatically
+# - Install the MCP in Claude Code
+# - Start monitoring for new conversations
+# - Keep all data local - no API keys needed
 ```
 
+> [!TIP]
+> **v4.0+ Auto-Migration**: Updates from v3.x automatically migrate during npm install - no manual steps needed!
+
 <details open>
-<summary>…
+<summary>Cloud Mode (Better Search Accuracy)</summary>
 
 ```bash
 # Step 1: Get your free Voyage AI key
@@ -67,7 +76,35 @@ claude-self-reflect setup
 npm install -g claude-self-reflect
 claude-self-reflect setup --voyage-key=YOUR_ACTUAL_KEY_HERE
 ```
-
+
+> [!NOTE]
+> Cloud mode provides 1024-dimensional embeddings (vs 384 local) for more accurate semantic search but sends conversation data to Voyage AI for processing.
+
+</details>
+
+## Performance
+
+<details open>
+<summary><b>v4.0 Performance Improvements</b></summary>
+
+| Metric | v3.x | v4.0 | Improvement |
+|--------|------|------|-------------|
+| **Status Check** | 119ms | 6ms | **20x faster** |
+| **Storage Usage** | 100MB | 50MB | **50% reduction** |
+| **Import Speed** | 10/sec | 100/sec | **10x faster** |
+| **Memory Usage** | 500MB | 50MB | **90% reduction** |
+| **Search Latency** | 15ms | 3ms | **5x faster** |
+
+### How We Compare
+
+| Feature | Claude Self-Reflect | MemGPT | LangChain Memory |
+|---------|---------------------|---------|------------------|
+| **Local-first** | Yes | No | Partial |
+| **No API keys** | Yes | No | No |
+| **Real-time indexing** | Yes 2-sec | Manual | No |
+| **Search speed** | <3ms | ~50ms | ~100ms |
+| **Setup time** | 5 min | 30+ min | 20+ min |
+| **Docker required** | Yes | Python | Python |
 
 </details>
 
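A note on the cloud-mode hunk above: local (384-dimension) and cloud (1024-dimension) embeddings cannot share a Qdrant collection, because vector similarity is only defined between vectors of equal length - which is why the package keeps pairs like `reflections_local` and `reflections_voyage` separate. A minimal numpy sketch of the constraint (illustrative only, not package code):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity is only defined for equal-length vectors."""
    if a.shape != b.shape:
        raise ValueError(f"Dimension mismatch: {a.shape} vs {b.shape}")
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

local_vec = np.random.rand(384)   # shaped like a local embedding
cloud_vec = np.random.rand(1024)  # shaped like a Voyage AI embedding

print(cosine_similarity(local_vec, np.random.rand(384)))  # fine: same dimension
cosine_similarity(local_vec, cloud_vec)  # raises ValueError: mixed dimensions
```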
@@ -82,20 +119,20 @@ claude-self-reflect setup --voyage-key=YOUR_ACTUAL_KEY_HERE
 ## Real Examples
 
 ```
-You: "…
-Claude: "Found it - …
-
-
-
-You: "…
-Claude: "…
-
-
-
-
-
-
+You: "How did we fix that 100% CPU usage bug?"
+Claude: "Found it - we fixed the circular reference causing 100% CPU usage
+in the server modularization. Also fixed store_reflection dimension
+mismatch by creating separate reflections_local and reflections_voyage."
+
+You: "What about that Docker memory issue?"
+Claude: "The container was limited to 2GB but only using 266MB. We found
+the issue only happened with MAX_QUEUE_SIZE=1000 outside Docker.
+With proper Docker limits, memory stays stable at 341MB."
+
+You: "Have we worked with JWT authentication?"
+Claude: "Found conversations about JWT patterns including User.authenticate
+methods, TokenHandler classes, and concepts like token rotation,
+PKCE, and social login integration."
 ```
 
 ## NEW: Real-time Indexing Status in Your Terminal
@@ -110,6 +147,39 @@ See your conversation indexing progress directly in your statusline:
 
 Works with [Claude Code Statusline](https://github.com/sirmalloc/ccstatusline) - shows progress bars, percentages, and indexing lag in real-time! The statusline also displays MCP connection status (✓ Connected) and collection counts (28/29 indexed).
 
+## Code Quality Insights
+
+<details>
+<summary><b>AST-GREP Pattern Analysis (100+ Patterns)</b></summary>
+
+### Real-time Quality Scoring in Statusline
+Your code quality displayed live as you work:
+- 🟢 **A+** (95-100): Exceptional code quality
+- 🟢 **A** (90-95): Excellent, production-ready
+- 🟢 **B** (80-90): Good, minor improvements possible
+- 🟡 **C** (60-80): Fair, needs refactoring
+- 🔴 **D** (40-60): Poor, significant issues
+- 🔴 **F** (0-40): Critical problems detected
+
+### Pattern Categories Analyzed
+- **Security Patterns**: SQL injection, XSS vulnerabilities, hardcoded secrets
+- **Performance Patterns**: N+1 queries, inefficient loops, memory leaks
+- **Error Handling**: Bare exceptions, missing error boundaries
+- **Type Safety**: Missing type hints, unsafe casts
+- **Async Patterns**: Missing await, promise handling
+- **Testing Patterns**: Test coverage, assertion quality
+
+### How It Works
+1. **During Import**: AST elements extracted from all code blocks
+2. **Pattern Matching**: 100+ patterns from unified registry
+3. **Quality Scoring**: Weighted scoring normalized by lines of code
+4. **Statusline Display**: Real-time feedback as you code
+
+> [!TIP]
+> Run `python scripts/session_quality_tracker.py` to analyze your current session quality!
+
+</details>
+
 ## Key Features
 
 <details>
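The "How It Works" list in the hunk above describes weighted pattern scoring normalized by lines of code. A minimal sketch of that idea - the weights, pattern names, and per-100-LOC normalization below are illustrative assumptions; only the grade cutoffs come from the README:

```python
# Hypothetical pattern weights: findings subtract from a 100-point base.
PATTERN_WEIGHTS = {
    "sql_injection": 15.0,      # security findings weigh heaviest
    "bare_except": 4.0,
    "missing_type_hint": 1.0,
}

def quality_score(findings: dict[str, int], lines_of_code: int) -> float:
    """Weighted penalty per finding, normalized per 100 lines of code."""
    penalty = sum(PATTERN_WEIGHTS.get(name, 1.0) * count
                  for name, count in findings.items())
    normalized = penalty / max(lines_of_code, 1) * 100
    return max(0.0, 100.0 - normalized)

def grade(score: float) -> str:
    # Cutoffs match the README's grade bands.
    for cutoff, letter in [(95, "A+"), (90, "A"), (80, "B"), (60, "C"), (40, "D")]:
        if score >= cutoff:
            return letter
    return "F"

score = quality_score({"bare_except": 2, "missing_type_hint": 5}, lines_of_code=400)
print(grade(score), round(score, 1))  # A+ 96.8
```

Normalizing by LOC keeps a large file with a few findings from scoring worse than a tiny file riddled with them.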
@@ -128,11 +198,21 @@ Works with [Claude Code Statusline](https://github.com/sirmalloc/ccstatusline) -
 - `search_by_recency` - Time-constrained search like "docker issues last week"
 - `get_timeline` - Activity timeline with statistics and patterns
 
+**Runtime Configuration Tools (v4.0):**
+- `switch_embedding_mode` - Switch between local/cloud modes without restart
+- `get_embedding_mode` - Check current embedding configuration
+- `reload_code` - Hot reload Python code changes
+- `reload_status` - Check reload state
+- `clear_module_cache` - Clear Python cache
+
 **Status & Monitoring Tools:**
 - `get_status` - Real-time import progress and system status
 - `get_health` - Comprehensive system health check
 - `collection_status` - Check Qdrant collection health and stats
 
+> [!TIP]
+> Use `reflect_on_past --mode quick` for instant existence checks - returns count + top match only!
+
 All tools are automatically available when the MCP server is connected to Claude Code.
 
 </details>
@@ -175,6 +255,9 @@ Recent conversations matter more. Old ones fade. Like your brain, but reliable.
 - **Graceful aging**: Old information fades naturally
 - **Configurable**: Adjust decay rate to your needs
 
+> [!NOTE]
+> Memory decay ensures recent solutions are prioritized while still maintaining historical context.
+
 </details>
 
 <details>
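The memory-decay hunk above describes recency weighting with a configurable rate. Exponential decay with a half-life is the standard way to implement that; a minimal sketch (the half-life constant and exact formula are illustrative assumptions, not necessarily the package's scoring):

```python
import math
from datetime import datetime, timezone

HALF_LIFE_DAYS = 90.0  # hypothetical decay rate; "configurable" per the README

def decayed_score(similarity: float, timestamp: str,
                  now: datetime | None = None) -> float:
    """Multiply raw similarity by an exponential recency factor."""
    now = now or datetime.now(timezone.utc)
    then = datetime.fromisoformat(timestamp.replace('Z', '+00:00'))
    age_days = (now - then).total_seconds() / 86400
    decay = math.exp(-math.log(2) * age_days / HALF_LIFE_DAYS)
    return similarity * decay

# With a 90-day half-life, a 90-day-old match at 0.80 raw similarity
# ranks like a fresh match at 0.40.
print(decayed_score(0.80, "2025-01-01T00:00:00Z"))
```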
@@ -186,6 +269,9 @@ Recent conversations matter more. Old ones fade. Like your brain, but reliable.
 - **Memory**: 96% reduction from v2.5.15
 - **Real-time**: HOT/WARM/COLD intelligent prioritization
 
+> [!TIP]
+> For best performance, keep Docker allocated 4GB+ RAM and use SSD storage.
+
 </details>
 
 ## Architecture
@@ -213,6 +299,9 @@ Files are categorized by age and processed with priority queuing to ensure newes…
 
 ## Requirements
 
+> [!WARNING]
+> **Breaking Change in v4.0**: Collections now use prefixed naming (e.g., `csr_project_local_384d`). Run migration automatically via `npm update`.
+
 <details>
 <summary><b>System Requirements</b></summary>
 
@@ -288,55 +377,20 @@ npm uninstall -g claude-self-reflect
 
 </details>
 
-## …
-
-<details>
-<summary>v3.3.0 - Latest Release</summary>
-
-- **🚀 Major Architecture Overhaul**: Server modularized from 2,321 to 728 lines (68% reduction) for better maintainability
-- **🔧 Critical Bug Fixes**: Fixed 100% CPU usage, store_reflection dimension mismatches, and SearchResult type errors
-- **🕒 New Temporal Tools Suite**: `get_recent_work`, `search_by_recency`, `get_timeline` for time-based search and analysis
-- **🎯 Enhanced UX**: Restored rich formatting with emojis for better readability and information hierarchy
-- **⚡ All 15+ MCP Tools Operational**: Complete functionality with both local and cloud embedding modes
-- **🏗️ Production Infrastructure**: Real-time indexing with smart intervals (2s hot files, 60s normal)
-- **🔍 Enhanced Metadata**: Tool usage analysis, file tracking, and concept extraction for better search
-
-</details>
-
-<details>
-<summary>v2.5.19 - Metadata Enrichment</summary>
+## Keeping Up to Date
 
-
-
-# Update to latest version
-npm update -g claude-self-reflect
-
-# Run setup - it will detect your existing installation
-claude-self-reflect setup
-# Choose "yes" when asked about metadata enrichment
-
-# Or manually enrich metadata anytime:
-docker compose run --rm importer python /app/scripts/delta-metadata-update-safe.py
-```
-
-### What You Get
-- `search_by_concept("docker")` - Find conversations by topic
-- `search_by_file("server.py")` - Find conversations that touched specific files
-- Better search accuracy with metadata-based filtering
-
-</details>
+> [!TIP]
+> **For Existing Users**: Simply run `npm update -g claude-self-reflect` to get the latest features and improvements. Updates are automatic and preserve your data.
 
 <details>
-<summary>…
-
-- **…
-- **…
-- **…
-- **…
-- **…
-- **…
-- **v2.5.10** - Emergency hotfix for MCP server startup
-- **v2.5.6** - Tool Output Extraction
+<summary>Recent Improvements</summary>
+
+- **20x faster performance** - Status checks, search, and imports
+- **Runtime configuration** - Switch modes without restarting
+- **Unified state management** - Single source of truth
+- **AST-GREP integration** - Code quality analysis
+- **Temporal search tools** - Find recent work and time-based queries
+- **Auto-migration** - Updates handle breaking changes automatically
 
 [Full changelog](docs/release-history.md)
 
package/mcp-server/src/code_reload_tool.py
CHANGED

@@ -5,11 +5,22 @@ import sys
 import importlib
 import logging
 from pathlib import Path
-from typing import Dict, List, Optional
+from typing import Dict, List, Optional
 from fastmcp import Context
 from pydantic import Field
 import hashlib
 import json
+import asyncio
+
+# Import security module - handle both relative and absolute imports
+try:
+    from .security_patches import ModuleWhitelist
+except ImportError:
+    try:
+        from security_patches import ModuleWhitelist
+    except ImportError:
+        # Security module is required - fail closed, not open
+        raise RuntimeError("Security module 'security_patches' is required for code reload functionality")
 
 logger = logging.getLogger(__name__)
 
@@ -19,20 +30,36 @@ class CodeReloader:
 
     def __init__(self):
         """Initialize the code reloader."""
-        self.module_hashes: Dict[str, str] = {}
-        self.reload_history: List[Dict] = []
         self.cache_dir = Path.home() / '.claude-self-reflect' / 'reload_cache'
         self.cache_dir.mkdir(parents=True, exist_ok=True)
-
-
+        self.hash_file = self.cache_dir / 'module_hashes.json'
+        self._lock = asyncio.Lock()  # Thread safety for async operations
+
+        # Load persisted hashes from disk with error handling
+        if self.hash_file.exists():
+            try:
+                with open(self.hash_file, 'r') as f:
+                    self.module_hashes: Dict[str, str] = json.load(f)
+            except (json.JSONDecodeError, IOError) as e:
+                logger.error(f"Failed to load module hashes: {e}. Starting fresh.")
+                self.module_hashes: Dict[str, str] = {}
+        else:
+            self.module_hashes: Dict[str, str] = {}
+
+        self.reload_history: List[Dict] = []
+        logger.info(f"CodeReloader initialized with {len(self.module_hashes)} cached hashes")
 
     def _get_file_hash(self, filepath: Path) -> str:
        """Get SHA256 hash of a file."""
        with open(filepath, 'rb') as f:
            return hashlib.sha256(f.read()).hexdigest()
 
-    def …
-        """Detect which modules have changed since last check.
+    def _detect_changed_modules(self) -> List[str]:
+        """Detect which modules have changed since last check.
+
+        This method ONLY detects changes, it does NOT update the stored hashes.
+        Use _update_module_hashes() to update hashes after successful reload.
+        """
         changed = []
         src_dir = Path(__file__).parent
 
@@ -43,13 +70,61 @@ class CodeReloader:
             module_name = f"src.{py_file.stem}"
             current_hash = self._get_file_hash(py_file)
 
+            # Only detect changes, DO NOT update hashes here
             if module_name in self.module_hashes:
                 if self.module_hashes[module_name] != current_hash:
                     changed.append(module_name)
+                    logger.debug(f"Change detected in {module_name}: {self.module_hashes[module_name][:8]} -> {current_hash[:8]}")
+            else:
+                # New module not seen before
+                changed.append(module_name)
+                logger.debug(f"New module detected: {module_name}")
+
+        return changed
+
+    def _update_module_hashes(self, modules: Optional[List[str]] = None) -> None:
+        """Update the stored hashes for specified modules or all modules.
+
+        This should be called AFTER successful reload to mark modules as up-to-date.
+
+        Args:
+            modules: List of module names to update. If None, updates all modules.
+        """
+        src_dir = Path(__file__).parent
+        updated = []
+
+        for py_file in src_dir.glob("*.py"):
+            if py_file.name == "__pycache__":
+                continue
 
+            module_name = f"src.{py_file.stem}"
+
+            # If specific modules provided, only update those
+            if modules is not None and module_name not in modules:
+                continue
+
+            current_hash = self._get_file_hash(py_file)
+            old_hash = self.module_hashes.get(module_name, "new")
             self.module_hashes[module_name] = current_hash
+
+            if old_hash != current_hash:
+                updated.append(module_name)
+                logger.debug(f"Updated hash for {module_name}: {old_hash[:8] if old_hash != 'new' else 'new'} -> {current_hash[:8]}")
 
-
+        # Persist the updated hashes to disk using atomic write
+        temp_file = Path(str(self.hash_file) + '.tmp')
+        try:
+            with open(temp_file, 'w') as f:
+                json.dump(self.module_hashes, f, indent=2)
+            # Atomic rename on POSIX systems
+            temp_file.replace(self.hash_file)
+        except Exception as e:
+            logger.error(f"Failed to persist module hashes: {e}")
+            if temp_file.exists():
+                temp_file.unlink()  # Clean up temp file on failure
+
+        if updated:
+            logger.info(f"Updated hashes for {len(updated)} modules: {', '.join(updated)}")
 
     async def reload_modules(
         self,
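The persistence code in the hunk above writes to a `.tmp` file and then renames it over `module_hashes.json`, so a crash mid-write can never leave a truncated hash file behind. A standalone sketch of the same pattern (not package code; note the rename is only atomic when the temp file sits on the same filesystem as the target, which is why it is created in the target's directory):

```python
import json
import os
import tempfile
from pathlib import Path

def atomic_write_json(target: Path, data: dict) -> None:
    """Write JSON to a temp file beside the target, then atomically rename.

    Readers always see either the old file or the complete new file,
    never a partially written one.
    """
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w') as f:
            json.dump(data, f, indent=2)
        os.replace(tmp_name, target)  # atomic on POSIX within one filesystem
    except BaseException:
        os.unlink(tmp_name)  # clean up the temp file on any failure
        raise

atomic_write_json(Path('/tmp/module_hashes.json'), {'src.example': 'abc123'})
```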
@@ -61,93 +136,98 @@ class CodeReloader:
 
         await ctx.debug("Starting code reload process...")
 
-        …
-        response += …
-        …
+        async with self._lock:  # Ensure thread safety for reload operations
+            try:
+                # Track what we're reloading
+                reload_targets = []
+
+                if auto_detect:
+                    # Detect changed modules (without updating hashes)
+                    changed = self._detect_changed_modules()
+                    if changed:
+                        reload_targets.extend(changed)
+                        await ctx.debug(f"Auto-detected changes in: {changed}")
+
+                if modules:
+                    # Add explicitly requested modules
+                    reload_targets.extend(modules)
+
+                if not reload_targets:
+                    return "📊 No modules to reload. All code is up to date!"
+
+                # Perform the reload
+                reloaded = []
+                failed = []
+
+                for module_name in reload_targets:
+                    try:
+                        # SECURITY FIX: Validate module is in whitelist
+                        if not ModuleWhitelist.is_allowed_module(module_name):
+                            logger.warning(f"Module not in whitelist, skipping: {module_name}")
+                            failed.append((module_name, "Module not in whitelist"))
+                            continue
+
+                        if module_name in sys.modules:
+                            # Store old module reference for rollback
+                            old_module = sys.modules[module_name]
+
+                            # Reload the module
+                            logger.info(f"Reloading module: {module_name}")
+                            reloaded_module = importlib.reload(sys.modules[module_name])
+
+                            # Update any global references if needed
+                            self._update_global_references(module_name, reloaded_module)
+
+                            reloaded.append(module_name)
+                            await ctx.debug(f"✅ Reloaded: {module_name}")
+                        else:
+                            # Module not loaded yet, import it
+                            importlib.import_module(module_name)
+                            reloaded.append(module_name)
+                            await ctx.debug(f"✅ Imported: {module_name}")
+
+                    except Exception as e:
+                        logger.error(f"Failed to reload {module_name}: {e}", exc_info=True)
+                        failed.append((module_name, str(e)))
+                        await ctx.debug(f"❌ Failed: {module_name} - {e}")
+
+                # Update hashes ONLY for successfully reloaded modules
+                if reloaded:
+                    self._update_module_hashes(reloaded)
+                    await ctx.debug(f"Updated hashes for {len(reloaded)} successfully reloaded modules")
+
+                # Record reload history
+                self.reload_history.append({
+                    "timestamp": os.environ.get('MCP_REQUEST_ID', 'unknown'),
+                    "reloaded": reloaded,
+                    "failed": failed
+                })
+
+                # Build response
+                response = "🔄 **Code Reload Results**\n\n"
+
+                if reloaded:
+                    response += f"**Successfully Reloaded ({len(reloaded)}):**\n"
+                    for module in reloaded:
+                        response += f"- ✅ {module}\n"
+                    response += "\n"
+
+                if failed:
+                    response += f"**Failed to Reload ({len(failed)}):**\n"
+                    for module, error in failed:
+                        response += f"- ❌ {module}: {error}\n"
+                    response += "\n"
+
+                response += "**Important Notes:**\n"
+                response += "- Class instances created before reload keep old code\n"
+                response += "- New requests will use the reloaded code\n"
+                response += "- Some changes may require full restart (e.g., new tools)\n"
+
+                return response
+
+            except Exception as e:
+                logger.error(f"Code reload failed: {e}", exc_info=True)
+                return f"❌ Code reload failed: {str(e)}"
 
     def _update_global_references(self, module_name: str, new_module):
         """Update global references after module reload."""
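The response text in the hunk above warns that class instances created before a reload keep the old code. That is standard `importlib.reload` semantics - the module object is re-executed in place, but existing instances still reference the old class object. A self-contained demo (illustrative only, not package code):

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep stale .pyc files out of the demo

src_dir = Path(tempfile.mkdtemp())
(src_dir / "demo_mod.py").write_text(
    "class Greeter:\n    def hello(self):\n        return 'version one'\n")
sys.path.insert(0, str(src_dir))

import demo_mod  # noqa: E402
old_instance = demo_mod.Greeter()

# Simulate an edit, then hot-reload the module.
(src_dir / "demo_mod.py").write_text(
    "class Greeter:\n    def hello(self):\n        return 'version two'\n")
importlib.invalidate_caches()
importlib.reload(demo_mod)

print(old_instance.hello())        # 'version one' - old instance keeps old class
print(demo_mod.Greeter().hello())  # 'version two' - new instances get new code
```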
@@ -171,8 +251,8 @@ class CodeReloader:
         """Get the current reload status and history."""
 
         try:
-            # Check for changed files
-            changed = self.…
+            # Check for changed files (WITHOUT updating hashes)
+            changed = self._detect_changed_modules()
 
             response = "📊 **Code Reload Status**\n\n"
 
@@ -224,6 +304,24 @@ class CodeReloader:
             logger.error(f"Failed to clear cache: {e}", exc_info=True)
             return f"❌ Failed to clear cache: {str(e)}"
 
+    async def force_update_hashes(self, ctx: Context) -> str:
+        """Force update all module hashes to current state.
+
+        This is useful when you want to mark all current code as 'baseline'
+        without actually reloading anything.
+        """
+        try:
+            await ctx.debug("Force updating all module hashes...")
+
+            # Update all module hashes
+            self._update_module_hashes(modules=None)
+
+            return f"✅ Force updated hashes for all {len(self.module_hashes)} tracked modules"
+
+        except Exception as e:
+            logger.error(f"Failed to force update hashes: {e}", exc_info=True)
+            return f"❌ Failed to force update hashes: {str(e)}"
+
 
 def register_code_reload_tool(mcp, get_embedding_manager):
     """Register the code reloading tool with the MCP server."""
@@ -257,6 +355,8 @@ def register_code_reload_tool(mcp, get_embedding_manager):
 
         Shows which files have been modified since last reload and
         the history of recent reload operations.
+
+        Note: This only checks for changes, it does not update the stored hashes.
         """
         return await reloader.get_reload_status(ctx)
 
@@ -267,5 +367,14 @@ def register_code_reload_tool(mcp, get_embedding_manager):
         Useful when reload isn't working due to cached bytecode.
         """
         return await reloader.clear_python_cache(ctx)
+
+    @mcp.tool()
+    async def force_update_module_hashes(ctx: Context) -> str:
+        """Force update all module hashes to mark current code as baseline.
+
+        Use this when you want to ignore current changes and treat
+        the current state as the new baseline without reloading.
+        """
+        return await reloader.force_update_hashes(ctx)
 
-    logger.info("Code reload tools registered successfully")
+    logger.info("Code reload tools registered successfully")
package/mcp-server/src/parallel_search.py
CHANGED

@@ -8,6 +8,7 @@ import time
 from typing import List, Dict, Any, Optional, Tuple
 from datetime import datetime
 import logging
+from .safe_getters import safe_get_list, safe_get_str
 
 logger = logging.getLogger(__name__)
 
@@ -88,15 +89,20 @@ async def search_single_collection(
         logger.warning(f"Search returned None for collection {collection_name}")
         search_results = []
 
+    # Ensure search_results is iterable (additional safety check)
+    if not hasattr(search_results, '__iter__'):
+        logger.error(f"Search results not iterable for collection {collection_name}: {type(search_results)}")
+        search_results = []
+
     # Debug: Log search results
-    logger.debug(f"Search of {collection_name} returned {len(search_results)} results")
+    logger.debug(f"Search of {collection_name} returned {len(search_results) if search_results else 0} results")
 
-    if should_use_decay and not USE_NATIVE_DECAY:
+    if should_use_decay and not USE_NATIVE_DECAY and search_results:
         # Apply client-side decay
         await ctx.debug(f"Using CLIENT-SIDE decay for {collection_name}")
         decay_results = []
 
-        for point in search_results:
+        for point in (search_results or []):
             try:
                 raw_timestamp = point.payload.get('timestamp', datetime.now().isoformat())
                 clean_timestamp = raw_timestamp.replace('Z', '+00:00') if raw_timestamp.endswith('Z') else raw_timestamp
@@ -171,15 +177,15 @@ async def search_single_collection(
                     'collection_name': collection_name,
                     'raw_payload': point.payload,  # Renamed from 'payload' for consistency
                     'code_patterns': point.payload.get('code_patterns'),
-                    'files_analyzed': point.payload…
-                    'tools_used': …
-                    'concepts': point.payload…
+                    'files_analyzed': safe_get_list(point.payload, 'files_analyzed'),
+                    'tools_used': safe_get_list(point.payload, 'tools_used'),
+                    'concepts': safe_get_list(point.payload, 'concepts')
                 }
                 results.append(search_result)
         else:
             # Process standard search results without decay
-            logger.debug(f"Processing {len(search_results)} results from {collection_name}")
-            for point in search_results:
+            logger.debug(f"Processing {len(search_results) if search_results else 0} results from {collection_name}")
+            for point in (search_results or []):
                 raw_timestamp = point.payload.get('timestamp', datetime.now().isoformat())
                 clean_timestamp = raw_timestamp.replace('Z', '+00:00') if raw_timestamp.endswith('Z') else raw_timestamp
 
@@ -214,9 +220,9 @@ async def search_single_collection(
                     'collection_name': collection_name,
                     'raw_payload': point.payload,
                     'code_patterns': point.payload.get('code_patterns'),
-                    'files_analyzed': point.payload…
-                    'tools_used': …
-                    'concepts': point.payload…
+                    'files_analyzed': safe_get_list(point.payload, 'files_analyzed'),
+                    'tools_used': safe_get_list(point.payload, 'tools_used'),
+                    'concepts': safe_get_list(point.payload, 'concepts')
                 }
                 results.append(search_result)
 
@@ -307,7 +313,11 @@ async def parallel_search_collections(
             continue
 
         collection_name, results, timing = result
-
+        # Handle None results safely
+        if results is not None:
+            all_results.extend(results)
+        else:
+            logger.warning(f"Collection {collection_name} returned None results")
         collection_timings.append(timing)
 
     await ctx.debug(f"Parallel search complete: {len(all_results)} total results")
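`parallel_search_collections` consumes `(collection_name, results, timing)` tuples and skips failures and `None` results so one bad collection cannot sink the whole search. The usual shape for this is `asyncio.gather(..., return_exceptions=True)`; a minimal sketch of the pattern (an assumption about the scheduling, not the package's exact code):

```python
import asyncio
import time

async def search_one(name: str) -> tuple[str, list, float]:
    """Stand-in for a per-collection search; returns (name, results, seconds)."""
    start = time.monotonic()
    if name == "bad_collection":
        raise RuntimeError("collection unavailable")
    await asyncio.sleep(0.01)  # pretend to query Qdrant
    return name, [f"hit from {name}"], time.monotonic() - start

async def parallel_search(names: list[str]) -> list:
    tasks = [search_one(n) for n in names]
    # return_exceptions=True: failures come back as values, not raised
    outcomes = await asyncio.gather(*tasks, return_exceptions=True)
    all_results = []
    for outcome in outcomes:
        if isinstance(outcome, Exception):
            continue  # skip failed collections, keep the rest
        name, results, timing = outcome
        if results is not None:
            all_results.extend(results)
    return all_results

print(asyncio.run(parallel_search(["csr_a_local_384d", "bad_collection"])))
```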
package/mcp-server/src/rich_formatting.py
CHANGED

@@ -5,6 +5,7 @@ import time
 from datetime import datetime, timezone
 from typing import List, Dict, Any, Optional
 import logging
+from .safe_getters import safe_get_list, safe_get_str
 
 logger = logging.getLogger(__name__)
 
@@ -114,16 +115,19 @@ def format_search_results_rich(
     concept_frequency = {}
 
     for result in results:
-        # Count file modifications
-        …
+        # Count file modifications - using safe_get_list for consistency
+        files = safe_get_list(result, 'files_analyzed')
+        for file in files:
             file_frequency[file] = file_frequency.get(file, 0) + 1
 
-        # Count tool usage
-        …
+        # Count tool usage - using safe_get_list for consistency
+        tools = safe_get_list(result, 'tools_used')
+        for tool in tools:
             tool_frequency[tool] = tool_frequency.get(tool, 0) + 1
 
-        # Count concepts
-        …
+        # Count concepts - using safe_get_list for consistency
+        concepts = safe_get_list(result, 'concepts')
+        for concept in concepts:
             concept_frequency[concept] = concept_frequency.get(concept, 0) + 1
 
     # Show most frequently modified files
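The frequency counting above is hand-rolled with `dict.get`; `collections.Counter` expresses the same aggregation more compactly. A sketch of the equivalent, shown as a stylistic alternative rather than what the package ships:

```python
from collections import Counter

results = [
    {'files_analyzed': ['server.py', 'utils.py'], 'tools_used': ['Edit']},
    {'files_analyzed': ['server.py'], 'tools_used': ['Edit', 'Bash']},
]

file_frequency = Counter(f for r in results for f in (r.get('files_analyzed') or []))
tool_frequency = Counter(t for r in results for t in (r.get('tools_used') or []))

print(file_frequency.most_common(1))  # [('server.py', 2)]
print(tool_frequency)                 # Counter({'Edit': 2, 'Bash': 1})
```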
package/mcp-server/src/safe_getters.py
ADDED

@@ -0,0 +1,217 @@
+"""Safe getter utilities for handling None values consistently."""
+
+import logging
+from typing import Any, Dict, List, Optional, Set, Union
+
+logger = logging.getLogger(__name__)
+
+
+def safe_get_list(
+    data: Optional[Dict[str, Any]],
+    key: str,
+    default: Optional[List] = None
+) -> List[Any]:
+    """
+    Safely get a list field from a dictionary, handling None and non-list values.
+
+    Args:
+        data: Dictionary to get value from (can be None)
+        key: Key to retrieve
+        default: Default value if key not found or value is None
+
+    Returns:
+        A list, either the value, converted value, or default/empty list
+    """
+    if data is None:
+        return default if default is not None else []
+
+    value = data.get(key)
+
+    if value is None:
+        return default if default is not None else []
+
+    # Handle sets and tuples by converting to list
+    if isinstance(value, (set, tuple)):
+        return list(value)
+
+    # If it's already a list, return it
+    if isinstance(value, list):
+        return value
+
+    # If it's not a list-like type, log warning and return empty list
+    logger.warning(
+        f"Expected list-like type for key '{key}', got {type(value).__name__}. "
+        f"Value: {repr(value)[:100]}"
+    )
+    return default if default is not None else []
+
+
+def safe_get_str(
+    data: Optional[Dict[str, Any]],
+    key: str,
+    default: str = ""
+) -> str:
+    """
+    Safely get a string field from a dictionary.
+
+    Args:
+        data: Dictionary to get value from (can be None)
+        key: Key to retrieve
+        default: Default value if key not found or value is None
+
+    Returns:
+        A string, either the value or the default
+    """
+    if data is None:
+        return default
+
+    value = data.get(key)
+
+    if value is None:
+        return default
+
+    # Convert to string if needed
+    return str(value)
+
+
+def safe_get_dict(
+    data: Optional[Dict[str, Any]],
+    key: str,
+    default: Optional[Dict] = None
+) -> Dict[str, Any]:
+    """
+    Safely get a dictionary field from another dictionary.
+
+    Args:
+        data: Dictionary to get value from (can be None)
+        key: Key to retrieve
+        default: Default value if key not found or value is None
+
+    Returns:
+        A dictionary, either the value or the default/empty dict
+    """
+    if data is None:
+        return default if default is not None else {}
+
+    value = data.get(key)
+
+    if value is None:
+        return default if default is not None else {}
+
+    if isinstance(value, dict):
+        return value
+
+    logger.warning(
+        f"Expected dict for key '{key}', got {type(value).__name__}. "
+        f"Value: {repr(value)[:100]}"
+    )
+    return default if default is not None else {}
+
+
+def safe_get_float(
+    data: Optional[Dict[str, Any]],
+    key: str,
+    default: float = 0.0
+) -> float:
+    """
+    Safely get a float field from a dictionary.
+
+    Args:
+        data: Dictionary to get value from (can be None)
+        key: Key to retrieve
+        default: Default value if key not found or value is None/non-numeric
+
+    Returns:
+        A float, either the converted value or the default
+    """
+    if data is None:
+        return default
+
+    value = data.get(key)
+
+    if value is None:
+        return default
+
+    try:
+        return float(value)
+    except (TypeError, ValueError) as e:
+        logger.warning(
+            f"Could not convert key '{key}' value to float: {repr(value)[:100]}. "
+            f"Error: {e}"
+        )
+        return default
+
+
+def safe_get_int(
+    data: Optional[Dict[str, Any]],
+    key: str,
+    default: int = 0
+) -> int:
+    """
+    Safely get an integer field from a dictionary.
+
+    Args:
+        data: Dictionary to get value from (can be None)
+        key: Key to retrieve
+        default: Default value if key not found or value is None/non-numeric
+
+    Returns:
+        An integer, either the converted value or the default
+    """
+    if data is None:
+        return default
+
+    value = data.get(key)
+
+    if value is None:
+        return default
+
+    try:
+        return int(value)
+    except (TypeError, ValueError) as e:
+        logger.warning(
+            f"Could not convert key '{key}' value to int: {repr(value)[:100]}. "
+            f"Error: {e}"
+        )
+        return default
+
+
+def safe_get_bool(
+    data: Optional[Dict[str, Any]],
+    key: str,
+    default: bool = False
+) -> bool:
+    """
+    Safely get a boolean field from a dictionary.
+
+    Args:
+        data: Dictionary to get value from (can be None)
+        key: Key to retrieve
+        default: Default value if key not found or value is None
+
+    Returns:
+        A boolean, either the value or the default
+    """
+    if data is None:
+        return default
+
+    value = data.get(key)
+
+    if value is None:
+        return default
+
+    if isinstance(value, bool):
+        return value
+
+    # Handle string booleans
+    if isinstance(value, str):
+        return value.lower() in ('true', '1', 'yes', 'on')
+
+    # Handle numeric booleans
+    try:
+        return bool(int(value))
+    except (TypeError, ValueError):
+        logger.warning(
+            f"Could not convert key '{key}' value to bool: {repr(value)[:100]}"
+        )
+        return default
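A short usage sketch of the getters defined above, exercising the behaviors the docstrings promise (None tolerance, tuple-to-list coercion, string/numeric conversion); the payload dict is made up for illustration:

```python
from safe_getters import (  # assumes the module above is importable
    safe_get_list, safe_get_float, safe_get_bool, safe_get_int,
)

payload = {
    'files_analyzed': ('server.py', 'utils.py'),  # tuple -> coerced to list
    'concepts': None,                             # None -> default
    'score': '0.87',
    'debug': 'true',
}

print(safe_get_list(payload, 'files_analyzed'))     # ['server.py', 'utils.py']
print(safe_get_list(payload, 'concepts'))           # []
print(safe_get_list(None, 'anything'))              # [] - a None dict is fine too
print(safe_get_float(payload, 'score'))             # 0.87
print(safe_get_bool(payload, 'debug'))              # True ('true'/'1'/'yes'/'on')
print(safe_get_int(payload, 'missing', default=7))  # 7
```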
package/mcp-server/src/search_tools.py
CHANGED

@@ -20,6 +20,26 @@ from .rich_formatting import format_search_results_rich
 logger = logging.getLogger(__name__)
 
 
+def is_searchable_collection(name: str) -> bool:
+    """
+    Check if collection name matches searchable patterns.
+    Supports both v3 and v4 collection naming conventions.
+    """
+    return (
+        # v3 patterns
+        name.endswith('_local')
+        or name.endswith('_voyage')
+        # v4 patterns
+        or name.endswith('_384d')   # Local v4 collections
+        or name.endswith('_1024d')  # Cloud v4 collections
+        or '_cloud_' in name        # Cloud v4 intermediate naming
+        # Reflections
+        or name.startswith('reflections')
+        # CSR prefixed collections
+        or name.startswith('csr_')
+    )
+
+
 class SearchTools:
     """Handles all search operations for the MCP server."""
 
@@ -114,6 +134,11 @@ class SearchTools:
         # Convert results to dict format
         results = []
         for result in search_results:
+            # Guard against None payload
+            if result.payload is None:
+                logger.warning(f"Result in {collection_name} has None payload, skipping")
+                continue
+
             results.append({
                 'conversation_id': result.payload.get('conversation_id'),
                 'timestamp': result.payload.get('timestamp'),
@@ -260,19 +285,36 @@ class SearchTools:
         else:
             # Use all collections INCLUDING reflections (with decay)
             collections_response = await self.qdrant_client.get_collections()
+
+            # Handle None response from Qdrant
+            if collections_response is None or not hasattr(collections_response, 'collections'):
+                await ctx.debug(f"WARNING: Qdrant returned None or invalid response")
+                return "<search_results><message>Unable to retrieve collections from Qdrant</message></search_results>"
+
             collections = collections_response.collections
+
+            # Ensure collections is not None
+            if collections is None:
+                await ctx.debug(f"WARNING: collections is None!")
+                return "<search_results><message>No collections available</message></search_results>"
+
             # Include both conversation collections and reflection collections
+            # Use module-level function for consistency
             filtered_collections = [
                 c for c in collections
-                if (c.name…
-                    c.name.startswith('reflections'))
+                if is_searchable_collection(c.name)
             ]
             await ctx.debug(f"Searching across {len(filtered_collections)} collections")
 
         if not filtered_collections:
             return "<search_results><message>No collections found for the specified project</message></search_results>"
-
+
         # Perform PARALLEL search across collections to avoid freeze
+        # Ensure filtered_collections is not None before iterating
+        if filtered_collections is None:
+            await ctx.debug(f"WARNING: filtered_collections is None!")
+            return "<search_results><message>No collections available for search</message></search_results>"
+
         collection_names = [c.name for c in filtered_collections]
         await ctx.debug(f"Starting parallel search across {len(collection_names)} collections")
 
@@ -386,8 +428,7 @@ class SearchTools:
         # Include both conversation collections and reflection collections
         filtered_collections = [
             c for c in collections
-            if (c.name…
-                c.name.startswith('reflections'))
+            if is_searchable_collection(c.name)
         ]
 
         # Quick PARALLEL count across collections
@@ -476,8 +517,7 @@ class SearchTools:
         # Include both conversation collections and reflection collections
         filtered_collections = [
             c for c in collections
-            if (c.name…
-                c.name.startswith('reflections'))
+            if is_searchable_collection(c.name)
         ]
 
         # Gather results for summary using PARALLEL search
@@ -573,8 +613,7 @@ class SearchTools:
         # Include both conversation collections and reflection collections
         filtered_collections = [
             c for c in collections
-            if (c.name…
-                c.name.startswith('reflections'))
+            if is_searchable_collection(c.name)
         ]
 
         # Gather all results using PARALLEL search
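A quick check of `is_searchable_collection` against the naming conventions this release mentions (v3 `_local`/`_voyage` suffixes, v4 dimension suffixes, and the `csr_` prefix from the README's breaking-change warning); the collection names are illustrative:

```python
from search_tools import is_searchable_collection  # assumes the module above is importable

expected_by_name = {
    "myproject_local": True,          # v3 local
    "myproject_voyage": True,         # v3 cloud
    "csr_project_local_384d": True,   # v4 prefixed local (README example)
    "someproject_1024d": True,        # v4 cloud by dimension suffix
    "reflections_local": True,        # reflections
    "qdrant_internal_stuff": False,   # unrelated collection: excluded
}

for name, expected in expected_by_name.items():
    assert is_searchable_collection(name) is expected, name
print("all collection-name checks pass")
```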
package/package.json
CHANGED

-  "version": "4.0.2",
+  "version": "5.0.2",
package/scripts/auto-migrate.cjs
CHANGED
@@ -1,6 +1,6 @@
 #!/usr/bin/env node
 
-const {…
+const { spawnSync } = require('child_process');
 const fs = require('fs');
 const path = require('path');
 const os = require('os');
@@ -23,7 +23,22 @@ const needsMigration = legacyFiles.some(file =>
   fs.existsSync(path.join(csrConfigDir, file))
 );
 
-if …
+// Check if unified state exists and has proper structure
+let unifiedStateValid = false;
+if (fs.existsSync(unifiedStateFile)) {
+  try {
+    const state = JSON.parse(fs.readFileSync(unifiedStateFile, 'utf8'));
+    // Check for v5.0 structure
+    unifiedStateValid = state.version === '5.0.0' &&
+                        state.files &&
+                        state.collections &&
+                        state.metadata;
+  } catch {
+    unifiedStateValid = false;
+  }
+}
+
+if (!needsMigration && unifiedStateValid) {
   console.log('✅ Already using Unified State Management v5.0');
   process.exit(0);
 }
@@ -34,9 +49,12 @@ if (needsMigration) {
 
   try {
     // Check if Python is available
-
-
-
+    const pythonCheck = spawnSync('python3', ['--version'], {
+      stdio: 'ignore',
+      shell: false
+    });
+
+    if (pythonCheck.error || pythonCheck.status !== 0) {
       console.log('⚠️ Python 3 not found. Migration will run when you first use the MCP server.');
       console.log('   To run migration manually: python3 scripts/migrate-to-unified-state.py');
       process.exit(0);
@@ -62,19 +80,78 @@ if (needsMigration) {
       process.exit(0);
     }
 
-    // Run the migration
+    // Run the migration safely using spawnSync to prevent shell injection
     console.log(`🚀 Running migration from: ${migrationScript}`);
-    const result = …
+    const result = spawnSync('python3', [migrationScript], {
       encoding: 'utf-8',
-      stdio: 'pipe'
+      stdio: 'pipe',
+      shell: false // Explicitly disable shell to prevent injection
     });
 
-
+    if (result.error) {
+      throw result.error;
+    }
+
+    if (result.status !== 0) {
+      // Categorize errors for better user guidance
+      const stderr = result.stderr || '';
+      const stdout = result.stdout || '';
+
+      if (stderr.includes('ModuleNotFoundError')) {
+        console.log('⚠️ Missing Python dependencies. The MCP server will install them on first run.');
+        console.log('   To install manually: pip install -r requirements.txt');
+      } else if (stderr.includes('PermissionError') || stderr.includes('Permission denied')) {
+        console.log('⚠️ Permission issue accessing state files.');
+        console.log('   Try running with appropriate permissions or check file ownership.');
+      } else if (stderr.includes('FileNotFoundError')) {
+        console.log('⚠️ State files not found at expected location.');
+        console.log('   This is normal for fresh installations.');
+      } else {
+        console.log('⚠️ Migration encountered an issue:');
+        console.log(stderr || stdout || `Exit code: ${result.status}`);
+      }
+
+      console.log('   Your existing state files are preserved.');
+      console.log('   To run migration manually: python3 scripts/migrate-to-unified-state.py');
+      console.log('   For help: https://github.com/ramakay/claude-self-reflect/issues');
+      process.exit(0); // Exit gracefully, don't fail npm install
+    }
+
+    if (result.stdout) {
+      console.log(result.stdout);
+    }
+
+    // Clean up legacy files after successful migration
+    console.log('🧹 Cleaning up legacy state files...');
+    let cleanedCount = 0;
+    for (const file of legacyFiles) {
+      const filePath = path.join(csrConfigDir, file);
+      if (fs.existsSync(filePath)) {
+        try {
+          // Move to archive instead of deleting (safer)
+          const archiveDir = path.join(csrConfigDir, 'archive');
+          if (!fs.existsSync(archiveDir)) {
+            fs.mkdirSync(archiveDir, { recursive: true });
+          }
+          const archivePath = path.join(archiveDir, `migrated-${file}`);
+          fs.renameSync(filePath, archivePath);
+          cleanedCount++;
+        } catch (err) {
+          console.log(`   ⚠️ Could not archive ${file}: ${err.message}`);
+        }
+      }
+    }
+
+    if (cleanedCount > 0) {
+      console.log(`   ✓ Archived ${cleanedCount} legacy files to config/archive/`);
+    }
+
     console.log('✅ Migration completed successfully!');
     console.log('🎉 Now using Unified State Management v5.0 (20x faster!)');
 
   } catch (error) {
-
+    // Handle unexpected errors
+    console.log('⚠️ Migration encountered an unexpected issue:', error.message);
     console.log('   Your existing state files are preserved.');
    console.log('   To run migration manually: python3 scripts/migrate-to-unified-state.py');
     console.log('   For help: https://github.com/ramakay/claude-self-reflect/issues');