npm - claude-self-reflect - Versions diffs - 3.2.4 → 3.3.1 - Mend

claude-self-reflect 3.2.4 → 3.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (41) hide show

package/.claude/agents/claude-self-reflect-test.md +992 -510
package/.claude/agents/reflection-specialist.md +59 -3
package/README.md +14 -5
package/installer/cli.js +16 -0
package/installer/postinstall.js +14 -0
package/installer/statusline-setup.js +289 -0
package/mcp-server/run-mcp.sh +73 -5
package/mcp-server/src/app_context.py +64 -0
package/mcp-server/src/config.py +57 -0
package/mcp-server/src/connection_pool.py +286 -0
package/mcp-server/src/decay_manager.py +106 -0
package/mcp-server/src/embedding_manager.py +64 -40
package/mcp-server/src/embeddings_old.py +141 -0
package/mcp-server/src/models.py +64 -0
package/mcp-server/src/parallel_search.py +305 -0
package/mcp-server/src/project_resolver.py +5 -0
package/mcp-server/src/reflection_tools.py +211 -0
package/mcp-server/src/rich_formatting.py +196 -0
package/mcp-server/src/search_tools.py +874 -0
package/mcp-server/src/server.py +127 -1720
package/mcp-server/src/temporal_design.py +132 -0
package/mcp-server/src/temporal_tools.py +604 -0
package/mcp-server/src/temporal_utils.py +384 -0
package/mcp-server/src/utils.py +150 -67
package/package.json +15 -1
package/scripts/add-timestamp-indexes.py +134 -0
package/scripts/ast_grep_final_analyzer.py +325 -0
package/scripts/ast_grep_unified_registry.py +556 -0
package/scripts/check-collections.py +29 -0
package/scripts/csr-status +366 -0
package/scripts/debug-august-parsing.py +76 -0
package/scripts/debug-import-single.py +91 -0
package/scripts/debug-project-resolver.py +82 -0
package/scripts/debug-temporal-tools.py +135 -0
package/scripts/delta-metadata-update.py +547 -0
package/scripts/import-conversations-unified.py +157 -25
package/scripts/precompact-hook.sh +33 -0
package/scripts/session_quality_tracker.py +481 -0
package/scripts/streaming-watcher.py +1578 -0
package/scripts/update_patterns.py +334 -0
package/scripts/utils.py +39 -0

package/.claude/agents/claude-self-reflect-test.md CHANGED Viewed

@@ -1,32 +1,69 @@
 ---
 name: claude-self-reflect-test
 description: Comprehensive end-to-end testing specialist for Claude Self-Reflect system validation. Tests all components including import pipeline, MCP integration, search functionality, and both local/cloud embedding modes. Ensures system integrity before releases and validates installations. Always restores system to local mode after testing.
-tools: Read, Bash, Grep, Glob, LS, Write, Edit, TodoWrite
+tools: Read, Bash, Grep, Glob, LS, Write, Edit, TodoWrite, mcp__claude-self-reflect__reflect_on_past, mcp__claude-self-reflect__store_reflection, mcp__claude-self-reflect__get_recent_work, mcp__claude-self-reflect__search_by_recency, mcp__claude-self-reflect__get_timeline, mcp__claude-self-reflect__quick_search, mcp__claude-self-reflect__search_summary, mcp__claude-self-reflect__get_more_results, mcp__claude-self-reflect__search_by_file, mcp__claude-self-reflect__search_by_concept, mcp__claude-self-reflect__get_full_conversation, mcp__claude-self-reflect__get_next_results
 ---
-You are a comprehensive testing specialist for Claude Self-Reflect. You validate the entire system end-to-end, ensuring all components work correctly across different configurations and deployment scenarios.
+You are the comprehensive testing specialist for Claude Self-Reflect. You validate EVERY component and feature, ensuring complete system integrity across all configurations and deployment scenarios. You test current v3.x features including temporal queries, time-based search, and activity timelines.
 ## Core Testing Philosophy
-1. **Test Everything** - Import pipeline, MCP tools, search functionality, state management
-2. **Both Modes** - Validate both local (FastEmbed) and cloud (Voyage AI) embeddings
-3. **Always Restore** - System MUST be left in 100% local state after any testing
-4. **Diagnose & Fix** - When issues are found, diagnose root causes and provide solutions
-5. **Document Results** - Create clear test reports with actionable findings
+1. **Test Everything** - Every feature, every tool, every pipeline
+2. **Both Modes** - Validate local (FastEmbed) and cloud (Voyage AI) embeddings
+3. **Always Restore** - System MUST be left in 100% local state after testing
+4. **Diagnose & Fix** - Identify root causes and provide solutions
+5. **Document Results** - Create clear, actionable test reports
-## System Architecture Understanding
+## System Architecture Knowledge
 ### Components to Test
 - **Import Pipeline**: JSONL parsing, chunking, embedding generation, Qdrant storage
-- **MCP Server**: Tool availability, search functionality, reflection storage
+- **MCP Server**: 15+ tools including temporal, search, reflection, pagination tools
+- **Temporal Tools** (v3.x): get_recent_work, search_by_recency, get_timeline
+- **CLI Tool**: Installation, packaging, setup wizard, status commands
+- **Docker Stack**: Qdrant, streaming watcher, health monitoring
 - **State Management**: File locking, atomic writes, resume capability
-- **Docker Containers**: Qdrant, streaming watcher, service health
 - **Search Quality**: Relevance scores, metadata extraction, cross-project search
-### Embedding Modes
-- **Local Mode**: FastEmbed with all-MiniLM-L6-v2 (384 dimensions)
-- **Cloud Mode**: Voyage AI with voyage-3-lite (1024 dimensions)
-- **Mode Detection**: Check collection suffixes (_local vs _voyage)
+- **Memory Decay**: Client-side and native Qdrant decay
+- **Modularization**: Server architecture with search_tools, temporal_tools, reflection_tools, parallel_search modules
+- **Metadata Extraction**: AST patterns, concepts, files analyzed, tools used
+- **Hook System**: session-start, precompact, submit hooks
+- **Sub-Agents**: All 6 specialized agents (reflection, import-debugger, docker, mcp, search, qdrant)
+- **Embedding Modes**: Local (FastEmbed 384d) and Cloud (Voyage AI 1024d) with mode switching
+- **Zero Vector Detection**: Root cause analysis and prevention
+### Test Files Knowledge
+```
+scripts/
+├── import-conversations-unified.py      # Main import script
+├── streaming-importer.py               # Streaming import
+├── delta-metadata-update.py            # Metadata updater
+├── check-collections.py                # Collection checker
+├── add-timestamp-indexes.py            # Timestamp indexer (NEW)
+├── test-temporal-comprehensive.py      # Temporal tests (NEW)
+├── test-project-scoping.py            # Project scoping test (NEW)
+├── test-direct-temporal.py            # Direct temporal test (NEW)
+├── debug-temporal-tools.py            # Temporal debug (NEW)
+└── status.py                           # Import status checker
+mcp-server/
+├── src/
+│   ├── server.py                      # Main MCP server (2,835 lines!)
+│   ├── temporal_utils.py              # Temporal utilities (NEW)
+│   ├── temporal_design.py             # Temporal design doc (NEW)
+│   └── project_resolver.py            # Project resolution
+tests/
+├── unit/                               # Unit tests
+├── integration/                        # Integration tests
+├── performance/                        # Performance tests
+└── e2e/                               # End-to-end tests
+config/
+├── imported-files.json                # Import state
+├── csr-watcher.json                   # Watcher state
+└── delta-update-state.json            # Delta update state
+```
 ## Comprehensive Test Suite
@@ -35,659 +72,1104 @@ You are a comprehensive testing specialist for Claude Self-Reflect. You validate
 #!/bin/bash
 echo "=== SYSTEM HEALTH CHECK ==="
+# Check version
+echo "Version Check:"
+grep version package.json | cut -d'"' -f4
+echo ""
 # Check Docker services
 echo "Docker Services:"
 docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}" | grep -E "(qdrant|watcher|streaming)"
-# Check Qdrant collections
-echo -e "\nQdrant Collections:"
-curl -s http://localhost:6333/collections | jq -r '.result.collections[] | "\(.name)\t\(.points_count) points"'
+# Check Qdrant collections with indexes
+echo -e "\nQdrant Collections (with timestamp indexes):"
+curl -s http://localhost:6333/collections | jq -r '.result.collections[] |
+    "\(.name)\t\(.points_count) points"'
+# Check for timestamp indexes
+echo -e "\nTimestamp Index Status:"
+python -c "
+from qdrant_client import QdrantClient
+from qdrant_client.models import OrderBy
+client = QdrantClient('http://localhost:6333')
+collections = client.get_collections().collections
+indexed = 0
+for col in collections[:5]:
+    try:
+        client.scroll(col.name, order_by=OrderBy(key='timestamp', direction='desc'), limit=1)
+        indexed += 1
+    except:
+        pass
+print(f'Collections with timestamp index: {indexed}/{len(collections)}')
+"
-# Check MCP connection
-echo -e "\nMCP Status:"
+# Check MCP connection with temporal tools
+echo -e "\nMCP Status (with temporal tools):"
 claude mcp list | grep claude-self-reflect || echo "MCP not configured"
 # Check import status
 echo -e "\nImport Status:"
-python mcp-server/src/status.py | jq '.overall'
+python mcp-server/src/status.py 2>/dev/null | jq '.overall' || echo "Status check failed"
 # Check embedding mode
-echo -e "\nCurrent Mode:"
+echo -e "\nCurrent Embedding Mode:"
 if [ -f .env ] && grep -q "PREFER_LOCAL_EMBEDDINGS=false" .env; then
-    echo "Cloud mode (Voyage AI)"
+    echo "Cloud mode (Voyage AI) - 1024 dimensions"
 else
-    echo "Local mode (FastEmbed)"
+    echo "Local mode (FastEmbed) - 384 dimensions"
 fi
+# Check CLI installation
+echo -e "\nCLI Installation:"
+which claude-self-reflect && echo "CLI installed globally" || echo "CLI not in PATH"
+# Check server.py size (modularization needed)
+echo -e "\nServer.py Status:"
+wc -l mcp-server/src/server.py | awk '{print "Lines: " $1 " (needs modularization if >1000)"}'
 ```
-### 2. Import Pipeline Validation
+### 2. Temporal Tools Testing (v3.x)
 ```bash
 #!/bin/bash
-echo "=== IMPORT PIPELINE VALIDATION ==="
+echo "=== TEMPORAL TOOLS TESTING ==="
-# Test JSONL parsing
-test_jsonl_parsing() {
-    echo "Testing JSONL parsing..."
-    TEST_FILE="/tmp/test-$$.jsonl"
-    cat > $TEST_FILE << 'EOF'
-{"type":"conversation","uuid":"test-001","messages":[{"role":"human","content":"Test question"},{"role":"assistant","content":[{"type":"text","text":"Test answer with code:\n```python\nprint('hello')\n```"}]}]}
-EOF
-    python -c "
-import json
-with open('$TEST_FILE') as f:
-    data = json.load(f)
-    assert data['uuid'] == 'test-001'
-    assert len(data['messages']) == 2
-    print('✅ PASS: JSONL parsing works')
-" || echo "❌ FAIL: JSONL parsing error"
-    rm -f $TEST_FILE
+# Test timestamp indexes exist
+test_timestamp_indexes() {
+    echo "Testing timestamp indexes..."
+    python scripts/add-timestamp-indexes.py
+    echo "✅ Timestamp indexes updated"
 }
-# Test chunking
-test_chunking() {
-    echo "Testing message chunking..."
-    python -c "
-from scripts.import_conversations_unified import chunk_messages
-messages = [
-    {'role': 'human', 'content': 'Q1'},
-    {'role': 'assistant', 'content': 'A1'},
-    {'role': 'human', 'content': 'Q2'},
-    {'role': 'assistant', 'content': 'A2'},
-]
-chunks = list(chunk_messages(messages, chunk_size=3))
-if len(chunks) == 2:
-    print('✅ PASS: Chunking works correctly')
-else:
-    print(f'❌ FAIL: Expected 2 chunks, got {len(chunks)}')
-"
+# Test get_recent_work
+test_get_recent_work() {
+    echo "Testing get_recent_work..."
+    cat << 'EOF' > /tmp/test_recent_work.py
+import asyncio
+import sys
+import os
+sys.path.insert(0, 'mcp-server/src')
+os.environ['QDRANT_URL'] = 'http://localhost:6333'
+async def test():
+    from server import get_recent_work
+    class MockContext:
+        async def debug(self, msg): print(f"[DEBUG] {msg}")
+        async def report_progress(self, *args): pass
+    ctx = MockContext()
+    # Test no scope (should default to current project)
+    result1 = await get_recent_work(ctx, limit=3)
+    print("No scope result:", "PASS" if "conversation" in result1 else "FAIL")
+    # Test with scope='all'
+    result2 = await get_recent_work(ctx, limit=3, project='all')
+    print("All scope result:", "PASS" if "conversation" in result2 else "FAIL")
+    # Test with specific project
+    result3 = await get_recent_work(ctx, limit=3, project='claude-self-reflect')
+    print("Specific project:", "PASS" if "conversation" in result3 else "FAIL")
+asyncio.run(test())
+EOF
+    python /tmp/test_recent_work.py
 }
-# Test embedding generation
-test_embeddings() {
-    echo "Testing embedding generation..."
-    python -c "
+# Test search_by_recency
+test_search_by_recency() {
+    echo "Testing search_by_recency..."
+    cat << 'EOF' > /tmp/test_search_recency.py
+import asyncio
+import sys
 import os
-os.environ['PREFER_LOCAL_EMBEDDINGS'] = 'true'
-from fastembed import TextEmbedding
-model = TextEmbedding('sentence-transformers/all-MiniLM-L6-v2')
-embeddings = list(model.embed(['test text']))
-if len(embeddings[0]) == 384:
-    print('✅ PASS: Local embeddings work (384 dims)')
-else:
-    print(f'❌ FAIL: Wrong dimensions: {len(embeddings[0])}')
-"
-}
+sys.path.insert(0, 'mcp-server/src')
+os.environ['QDRANT_URL'] = 'http://localhost:6333'
-# Test Qdrant operations
-test_qdrant() {
-    echo "Testing Qdrant operations..."
-    python -c "
-from qdrant_client import QdrantClient
-client = QdrantClient('http://localhost:6333')
-collections = client.get_collections().collections
-if collections:
-    print(f'✅ PASS: Qdrant accessible ({len(collections)} collections)')
-else:
-    print('❌ FAIL: No Qdrant collections found')
-"
+async def test():
+    from server import search_by_recency
+    class MockContext:
+        async def debug(self, msg): print(f"[DEBUG] {msg}")
+    ctx = MockContext()
+    result = await search_by_recency(ctx, query="test", time_range="last week")
+    print("Search by recency:", "PASS" if "result" in result or "no_results" in result else "FAIL")
+asyncio.run(test())
+EOF
+    python /tmp/test_search_recency.py
 }
-# Run all tests
-test_jsonl_parsing
-test_chunking
-test_embeddings
-test_qdrant
-```
+# Test get_timeline
+test_get_timeline() {
+    echo "Testing get_timeline..."
+    cat << 'EOF' > /tmp/test_timeline.py
+import asyncio
+import sys
+import os
+sys.path.insert(0, 'mcp-server/src')
+os.environ['QDRANT_URL'] = 'http://localhost:6333'
-### 3. MCP Integration Test
-```bash
-#!/bin/bash
-echo "=== MCP INTEGRATION TEST ==="
-# Test search functionality
-test_mcp_search() {
-    echo "Testing MCP search..."
-    # This would be run in Claude Code
-    cat << 'EOF'
-To test in Claude Code:
-1. Search for any recent conversation topic
-2. Verify results have scores > 0.7
-3. Check that metadata includes files and tools
+async def test():
+    from server import get_timeline
+    class MockContext:
+        async def debug(self, msg): print(f"[DEBUG] {msg}")
+    ctx = MockContext()
+    result = await get_timeline(ctx, time_range="last month", granularity="week")
+    print("Timeline result:", "PASS" if "timeline" in result else "FAIL")
+asyncio.run(test())
 EOF
+    python /tmp/test_timeline.py
 }
-# Test search_by_file
-test_search_by_file() {
-    echo "Testing search_by_file..."
+# Test natural language time parsing
+test_temporal_parsing() {
+    echo "Testing temporal parsing..."
     python -c "
-# Simulate MCP search_by_file
-from qdrant_client import QdrantClient
-client = QdrantClient('http://localhost:6333')
-# Get collections with file metadata
-found_files = False
-for collection in client.get_collections().collections[:5]:
-    points = client.scroll(collection.name, limit=10)[0]
-    for point in points:
-        if 'files_analyzed' in point.payload:
-            found_files = True
-            break
-    if found_files:
-        break
-if found_files:
-    print('✅ PASS: File metadata available for search')
-else:
-    print('⚠️  WARN: No file metadata found (run delta-metadata-update.py)')
+from mcp_server.src.temporal_utils import TemporalParser
+parser = TemporalParser()
+tests = ['yesterday', 'last week', 'past 3 days']
+for expr in tests:
+    try:
+        start, end = parser.parse_time_expression(expr)
+        print(f'✅ {expr}: {start.date()} to {end.date()}')
+    except Exception as e:
+        print(f'❌ {expr}: {e}')
 "
 }
-# Test reflection storage
-test_reflection_storage() {
-    echo "Testing reflection storage..."
-    # This requires MCP server to be running
-    echo "Manual test in Claude Code:"
-    echo "1. Store a reflection with tags"
-    echo "2. Search for it immediately"
-    echo "3. Verify it's retrievable"
-}
-test_mcp_search
-test_search_by_file
-test_reflection_storage
+# Run all temporal tests
+test_timestamp_indexes
+test_get_recent_work
+test_search_by_recency
+test_get_timeline
+test_temporal_parsing
 ```
-### 4. Dual-Mode Testing with Auto-Restore
+### 3. CLI Tool Testing (Enhanced)
 ```bash
 #!/bin/bash
-# CRITICAL: This script ALWAYS restores to local mode on exit
-echo "=== DUAL-MODE TESTING WITH AUTO-RESTORE ==="
+echo "=== CLI TOOL TESTING ==="
-# Function to restore local state
-restore_local_state() {
-    echo "=== RESTORING 100% LOCAL STATE ==="
+# Test CLI installation
+test_cli_installation() {
+    echo "Testing CLI installation..."
-    # Update .env
-    if [ -f .env ]; then
-        sed -i.bak 's/PREFER_LOCAL_EMBEDDINGS=false/PREFER_LOCAL_EMBEDDINGS=true/' .env
-        sed -i.bak 's/USE_VOYAGE=true/USE_VOYAGE=false/' .env
-    fi
-    # Update MCP to use local
-    claude mcp remove claude-self-reflect 2>/dev/null
-    claude mcp add claude-self-reflect \
-        "$(pwd)/mcp-server/run-mcp.sh" \
-        -e QDRANT_URL="http://localhost:6333" \
-        -e PREFER_LOCAL_EMBEDDINGS="true" \
-        -s user
-    # Restart containers if needed
-    if docker ps | grep -q streaming-importer; then
-        docker-compose restart streaming-importer
+    # Check if installed globally
+    if command -v claude-self-reflect &> /dev/null; then
+        VERSION=$(claude-self-reflect --version 2>/dev/null || echo "unknown")
+        echo "✅ CLI installed globally (version: $VERSION)"
+    else
+        echo "❌ CLI not found in PATH"
     fi
-    echo "✅ System restored to 100% local state"
-}
-# Set trap to ALWAYS restore on exit
-trap restore_local_state EXIT INT TERM
-# Test local mode
-test_local_mode() {
-    echo "=== Testing Local Mode (FastEmbed) ==="
-    export PREFER_LOCAL_EMBEDDINGS=true
-    # Create test data
-    TEST_DIR="/tmp/test-local-$$"
-    mkdir -p "$TEST_DIR"
-    cat > "$TEST_DIR/test.jsonl" << 'EOF'
-{"type":"conversation","uuid":"local-test","messages":[{"role":"human","content":"Local mode test"}]}
-EOF
-    # Import and verify
-    python scripts/import-conversations-unified.py --file "$TEST_DIR/test.jsonl"
+    # Check package.json files
+    echo "Checking package files..."
+    FILES=(
+        "package.json"
+        "cli/package.json"
+        "cli/src/index.js"
+        "cli/src/setup-wizard.js"
+    )
-    # Check dimensions
-    COLLECTION=$(curl -s http://localhost:6333/collections | jq -r '.result.collections[] | select(.name | contains("_local")) | .name' | head -1)
-    if [ -n "$COLLECTION" ]; then
-        DIMS=$(curl -s "http://localhost:6333/collections/$COLLECTION" | jq '.result.config.params.vectors.size')
-        if [ "$DIMS" = "384" ]; then
-            echo "✅ PASS: Local mode uses 384 dimensions"
+    for file in "${FILES[@]}"; do
+        if [ -f "$file" ]; then
+            echo "✅ $file exists"
         else
-            echo "❌ FAIL: Wrong dimensions: $DIMS"
+            echo "❌ $file missing"
         fi
-    fi
-    rm -rf "$TEST_DIR"
+    done
 }
-# Test cloud mode (if available)
-test_cloud_mode() {
-    if [ ! -f .env ] || ! grep -q "VOYAGE_KEY=" .env; then
-        echo "⚠️  SKIP: No Voyage API key configured"
-        return
-    fi
+# Test CLI commands
+test_cli_commands() {
+    echo "Testing CLI commands..."
-    echo "=== Testing Cloud Mode (Voyage AI) ==="
-    export PREFER_LOCAL_EMBEDDINGS=false
-    export VOYAGE_KEY=$(grep VOYAGE_KEY .env | cut -d= -f2)
+    # Test status command
+    claude-self-reflect status 2>/dev/null && echo "✅ Status command works" || echo "❌ Status command failed"
-    # Create test data
-    TEST_DIR="/tmp/test-voyage-$$"
-    mkdir -p "$TEST_DIR"
-    cat > "$TEST_DIR/test.jsonl" << 'EOF'
-{"type":"conversation","uuid":"voyage-test","messages":[{"role":"human","content":"Cloud mode test"}]}
-EOF
+    # Test help
+    claude-self-reflect --help 2>/dev/null && echo "✅ Help works" || echo "❌ Help failed"
+}
+# Test npm packaging
+test_npm_packaging() {
+    echo "Testing npm packaging..."
-    # Import and verify
-    python scripts/import-conversations-unified.py --file "$TEST_DIR/test.jsonl"
+    # Check if publishable
+    npm pack --dry-run 2>&1 | grep -q "claude-self-reflect" && \
+        echo "✅ Package is publishable" || \
+        echo "❌ Package issues detected"
-    # Check dimensions
-    COLLECTION=$(curl -s http://localhost:6333/collections | jq -r '.result.collections[] | select(.name | contains("_voyage")) | .name' | head -1)
-    if [ -n "$COLLECTION" ]; then
-        DIMS=$(curl -s "http://localhost:6333/collections/$COLLECTION" | jq '.result.config.params.vectors.size')
-        if [ "$DIMS" = "1024" ]; then
-            echo "✅ PASS: Cloud mode uses 1024 dimensions"
-        else
-            echo "❌ FAIL: Wrong dimensions: $DIMS"
-        fi
-    fi
-    rm -rf "$TEST_DIR"
+    # Check dependencies
+    npm ls --depth=0 2>&1 | grep -q "UNMET" && \
+        echo "❌ Unmet dependencies" || \
+        echo "✅ Dependencies satisfied"
 }
-# Run tests
-test_local_mode
-test_cloud_mode
-# Trap ensures restoration even if tests fail
+test_cli_installation
+test_cli_commands
+test_npm_packaging
 ```
-### 5. Data Integrity Validation
+### 4. Import Pipeline Validation (Enhanced)
 ```bash
 #!/bin/bash
-echo "=== DATA INTEGRITY VALIDATION ==="
+echo "=== IMPORT PIPELINE VALIDATION ==="
-# Test no duplicates on re-import
-test_no_duplicates() {
-    echo "Testing duplicate prevention..."
+# Test unified importer
+test_unified_importer() {
+    echo "Testing unified importer..."
-    # Find a test file
+    # Find a test JSONL file
     TEST_FILE=$(find ~/.claude/projects -name "*.jsonl" -type f | head -1)
     if [ -z "$TEST_FILE" ]; then
-        echo "⚠️  SKIP: No test files available"
+        echo "⚠️  No test files available"
         return
     fi
-    # Get collection
-    PROJECT_DIR=$(dirname "$TEST_FILE")
-    PROJECT_NAME=$(basename "$PROJECT_DIR")
-    COLLECTION="${PROJECT_NAME}_local"
+    # Test with limit
+    python scripts/import-conversations-unified.py --file "$TEST_FILE" --limit 1
-    # Count before
-    COUNT_BEFORE=$(curl -s "http://localhost:6333/collections/$COLLECTION/points/count" | jq '.result.count')
-    # Force re-import
-    python scripts/import-conversations-unified.py --file "$TEST_FILE" --force
-    # Count after
-    COUNT_AFTER=$(curl -s "http://localhost:6333/collections/$COLLECTION/points/count" | jq '.result.count')
-    if [ "$COUNT_BEFORE" = "$COUNT_AFTER" ]; then
-        echo "✅ PASS: No duplicates created on re-import"
+    if [ $? -eq 0 ]; then
+        echo "✅ Unified importer works"
     else
-        echo "❌ FAIL: Duplicates detected ($COUNT_BEFORE -> $COUNT_AFTER)"
+        echo "❌ Unified importer failed"
     fi
 }
-# Test file locking
-test_file_locking() {
-    echo "Testing concurrent import safety..."
-    # Run parallel imports
-    python scripts/import-conversations-unified.py --limit 1 &
-    PID1=$!
-    python scripts/import-conversations-unified.py --limit 1 &
-    PID2=$!
-    wait $PID1 $PID2
-    if [ $? -eq 0 ]; then
-        echo "✅ PASS: Concurrent imports handled safely"
+# Test for zero chunks/vectors - CRITICAL
+test_zero_chunks_detection() {
+    echo "Testing zero chunks/vectors detection..."
+    # Check recent imports for zero chunks
+    IMPORT_LOG=$(python scripts/import-conversations-unified.py --limit 5 2>&1)
+    # Check for zero chunks warnings
+    if echo "$IMPORT_LOG" | grep -q "Imported 0 chunks"; then
+        echo "❌ CRITICAL: Found imports with 0 chunks!"
+        echo "   Files producing 0 chunks:"
+        echo "$IMPORT_LOG" | grep -B1 "Imported 0 chunks" | grep "import of"
+        # Analyze why chunks are zero
+        echo "   Analyzing root cause..."
+        # Check for thinking-only content
+        PROBLEM_FILE=$(echo "$IMPORT_LOG" | grep -B1 "Imported 0 chunks" | grep "\.jsonl" | head -1 | awk '{print $NF}')
+        if [ -n "$PROBLEM_FILE" ]; then
+            python -c "
+import json
+file_path = '$PROBLEM_FILE'
+has_thinking = 0
+has_text = 0
+with open(file_path, 'r') as f:
+    for line in f:
+        data = json.loads(line.strip())
+        if 'message' in data and data['message']:
+            content = data['message'].get('content', [])
+            if isinstance(content, list):
+                for item in content:
+                    if isinstance(item, dict):
+                        if item.get('type') == 'thinking':
+                            has_thinking += 1
+                        elif item.get('type') == 'text':
+                            has_text += 1
+print(f'   Thinking blocks: {has_thinking}')
+print(f'   Text blocks: {has_text}')
+if has_thinking > 0 and has_text == 0:
+    print('   ⚠️ File has only thinking content - import script may need fix')
+"
+        fi
+        # DO NOT CERTIFY WITH ZERO CHUNKS
+        echo "   ⛔ CERTIFICATION BLOCKED: Fix zero chunks issue before certifying!"
+        return 1
     else
-        echo "❌ FAIL: File locking issue detected"
+        echo "✅ No zero chunks detected in recent imports"
     fi
+    # Also check Qdrant for empty collections
+    python -c "
+from qdrant_client import QdrantClient
+client = QdrantClient('http://localhost:6333')
+collections = client.get_collections().collections
+empty_collections = []
+for col in collections:
+    count = client.count(collection_name=col.name).count
+    if count == 0:
+        empty_collections.append(col.name)
+if empty_collections:
+    print(f'❌ Found {len(empty_collections)} empty collections: {empty_collections}')
+    print('   ⛔ CERTIFICATION BLOCKED: Empty collections detected!')
+else:
+    print('✅ All collections have vectors')
+" 2>/dev/null || echo "⚠️ Could not check Qdrant collections"
 }
-# Test state persistence
-test_state_persistence() {
-    echo "Testing state file persistence..."
+# Test streaming importer
+test_streaming_importer() {
+    echo "Testing streaming importer..."
-    STATE_FILE="$HOME/.claude-self-reflect/config/imported-files.json"
-    if [ -f "$STATE_FILE" ]; then
-        # Check file is valid JSON
-        if jq empty "$STATE_FILE" 2>/dev/null; then
-            echo "✅ PASS: State file is valid JSON"
-        else
-            echo "❌ FAIL: State file corrupted"
-        fi
+    if docker ps | grep -q streaming-importer; then
+        # Check if processing
+        docker logs streaming-importer --tail 10 | grep -q "Processing" && \
+            echo "✅ Streaming importer active" || \
+            echo "⚠️  Streaming importer idle"
     else
-        echo "⚠️  WARN: No state file found"
+        echo "❌ Streaming importer not running"
     fi
 }
-test_no_duplicates
-test_file_locking
-test_state_persistence
+# Test delta metadata update
+test_delta_metadata() {
+    echo "Testing delta metadata update..."
+    DRY_RUN=true python scripts/delta-metadata-update.py 2>&1 | grep -q "would update" && \
+        echo "✅ Delta metadata updater works" || \
+        echo "❌ Delta metadata updater failed"
+}
+test_unified_importer
+test_zero_chunks_detection  # CRITICAL: Must pass before certification
+test_streaming_importer
+test_delta_metadata
 ```
-### 6. Performance Validation
+### 5. Hook System Testing
 ```bash
 #!/bin/bash
-echo "=== PERFORMANCE VALIDATION ==="
+echo "=== HOOK SYSTEM TESTING ==="
+# Test session-start hook
+test_session_start_hook() {
+    echo "Testing session-start hook..."
+    HOOK_PATH="$HOME/.claude/hooks/session-start"
+    if [ -f "$HOOK_PATH" ]; then
+        echo "✅ session-start hook exists"
+        # Check if executable
+        [ -x "$HOOK_PATH" ] && echo "✅ Hook is executable" || echo "❌ Hook not executable"
+    else
+        echo "⚠️ session-start hook not configured"
+    fi
+}
-# Test import speed
-test_import_performance() {
-    echo "Testing import performance..."
-    START_TIME=$(date +%s)
-    TEST_FILE=$(find ~/.claude/projects -name "*.jsonl" -type f | head -1)
-    if [ -n "$TEST_FILE" ]; then
-        timeout 30 python scripts/import-conversations-unified.py --file "$TEST_FILE" --limit 1
-        END_TIME=$(date +%s)
-        DURATION=$((END_TIME - START_TIME))
-        if [ $DURATION -lt 10 ]; then
-            echo "✅ PASS: Import completed in ${DURATION}s"
-        else
-            echo "⚠️  WARN: Import took ${DURATION}s (expected <10s)"
-        fi
+# Test precompact hook
+test_precompact_hook() {
+    echo "Testing precompact hook..."
+    HOOK_PATH="$HOME/.claude/hooks/precompact"
+    if [ -f "$HOOK_PATH" ]; then
+        echo "✅ precompact hook exists"
+        # Test execution
+        timeout 10 "$HOOK_PATH" && echo "✅ Hook executes successfully" || echo "❌ Hook failed"
+    else
+        echo "⚠️ precompact hook not configured"
     fi
 }
-# Test search performance
-test_search_performance() {
-    echo "Testing search performance..."
+test_session_start_hook
+test_precompact_hook
+```
+### 6. Metadata Extraction Testing
+```bash
+#!/bin/bash
+echo "=== METADATA EXTRACTION TESTING ==="
+# Test metadata extraction
+test_metadata_extraction() {
+    echo "Testing metadata extraction..."
     python -c "
-import time
+import json
+from pathlib import Path
+# Check if metadata is being extracted
+config_dir = Path.home() / '.claude-self-reflect' / 'config'
+delta_state = config_dir / 'delta-update-state.json'
+if delta_state.exists():
+    with open(delta_state) as f:
+        state = json.load(f)
+        updated = state.get('updated_points', {})
+        if updated:
+            sample = list(updated.values())[0] if updated else {}
+            print(f'✅ Metadata extracted for {len(updated)} points')
+            if 'files_analyzed' in str(sample):
+                print('✅ files_analyzed metadata present')
+            if 'tools_used' in str(sample):
+                print('✅ tools_used metadata present')
+            if 'concepts' in str(sample):
+                print('✅ concepts metadata present')
+            if 'code_patterns' in str(sample):
+                print('✅ code_patterns (AST) metadata present')
+        else:
+            print('⚠️ No metadata updates found')
+else:
+    print('❌ Delta update state file not found')
+"
+}
+# Test AST pattern extraction
+test_ast_patterns() {
+    echo "Testing AST pattern extraction..."
+    TEST_FILE=$(mktemp)
+    cat > "$TEST_FILE" << 'EOF'
+import ast
+text = "def test(): return True"
+tree = ast.parse(text)
+patterns = [node.__class__.__name__ for node in ast.walk(tree)]
+print(f"AST patterns: {patterns}")
+EOF
+    python "$TEST_FILE"
+    rm "$TEST_FILE"
+}
+test_metadata_extraction
+test_ast_patterns
+```
+### 7. Zero Vector Investigation
+```bash
+#!/bin/bash
+echo "=== ZERO VECTOR INVESTIGATION ==="
+test_zero_vectors() {
+    python -c "
+import numpy as np
 from qdrant_client import QdrantClient
-from fastembed import TextEmbedding
+# Connect to Qdrant
 client = QdrantClient('http://localhost:6333')
-model = TextEmbedding('sentence-transformers/all-MiniLM-L6-v2')
-# Generate query embedding
-query_vec = list(model.embed(['test search query']))[0]
-# Time search across collections
-start = time.time()
-collections = client.get_collections().collections[:5]
-for col in collections:
-    if '_local' in col.name:
-        try:
-            client.search(col.name, query_vec, limit=5)
-        except:
-            pass
-elapsed = time.time() - start
-if elapsed < 1:
-    print(f'✅ PASS: Search completed in {elapsed:.2f}s')
+# Check for zero vectors
+collections = client.get_collections().collections
+zero_count = 0
+total_checked = 0
+for col in collections[:5]:  # Check first 5 collections
+    try:
+        points = client.scroll(
+            collection_name=col.name,
+            limit=10,
+            with_vectors=True
+        )[0]
+        for point in points:
+            total_checked += 1
+            if point.vector:
+                if isinstance(point.vector, list) and all(v == 0 for v in point.vector):
+                    zero_count += 1
+                    print(f'❌ CRITICAL: Zero vector in {col.name}, point {point.id}')
+                elif isinstance(point.vector, dict):
+                    for vec_name, vec in point.vector.items():
+                        if all(v == 0 for v in vec):
+                            zero_count += 1
+                            print(f'❌ CRITICAL: Zero vector in {col.name}, point {point.id}, vector {vec_name}')
+    except Exception as e:
+        print(f'⚠️ Error checking {col.name}: {e}')
+if zero_count == 0:
+    print(f'✅ No zero vectors found (checked {total_checked} points)')
 else:
-    print(f'⚠️  WARN: Search took {elapsed:.2f}s')
+    print(f'❌ Found {zero_count} zero vectors out of {total_checked} points')
 "
 }
-# Test memory usage
-test_memory_usage() {
-    echo "Testing memory usage..."
-    if docker ps | grep -q streaming-importer; then
-        MEM=$(docker stats --no-stream --format "{{.MemUsage}}" streaming-importer | cut -d'/' -f1 | sed 's/[^0-9.]//g')
-        # Note: Total includes ~180MB for FastEmbed model
-        if (( $(echo "$MEM < 300" | bc -l) )); then
-            echo "✅ PASS: Memory usage ${MEM}MB is acceptable"
+# Test embedding generation
+test_embedding_generation() {
+    echo "Testing embedding generation..."
+    python -c "
+try:
+    from fastembed import TextEmbedding
+    model = TextEmbedding('sentence-transformers/all-MiniLM-L6-v2')
+    texts = ['test', 'hello world', '']
+    for text in texts:
+        embedding = list(model.embed([text]))[0]
+        is_zero = all(v == 0 for v in embedding)
+        if is_zero:
+            print(f'❌ CRITICAL: Zero embedding for \'{text}\'')
+        else:
+            import numpy as np
+            print(f'✅ Non-zero embedding for \'{text}\' (mean={np.mean(embedding):.4f})')
+except ImportError:
+    print('❌ FastEmbed not installed')
+"
+}
+test_zero_vectors
+test_embedding_generation
+```
+### 8. Sub-Agent Testing
+```bash
+#!/bin/bash
+echo "=== SUB-AGENT TESTING ==="
+# List all sub-agents
+test_subagent_availability() {
+    echo "Checking sub-agent availability..."
+    AGENTS_DIR="$HOME/projects/claude-self-reflect/.claude/agents"
+    EXPECTED_AGENTS=(
+        "claude-self-reflect-test.md"
+        "import-debugger.md"
+        "docker-orchestrator.md"
+        "mcp-integration.md"
+        "search-optimizer.md"
+        "reflection-specialist.md"
+        "qdrant-specialist.md"
+    )
+    for agent in "${EXPECTED_AGENTS[@]}"; do
+        if [ -f "$AGENTS_DIR/$agent" ]; then
+            echo "✅ $agent present"
         else
-            echo "⚠️  WARN: High memory usage: ${MEM}MB"
+            echo "❌ $agent missing"
         fi
+    done
+}
+test_subagent_availability
+```
+### 9. Embedding Mode Comprehensive Test
+```bash
+#!/bin/bash
+echo "=== EMBEDDING MODE TESTING ==="
+# Test both modes
+test_both_embedding_modes() {
+    echo "Testing local mode (FastEmbed)..."
+    PREFER_LOCAL_EMBEDDINGS=true python -c "
+from mcp_server.src.embedding_manager import get_embedding_manager
+em = get_embedding_manager()
+print(f'Local mode: {em.model_type}, dimension: {em.get_vector_dimension()}')
+"
+    if [ -n "$VOYAGE_KEY" ]; then
+        echo "Testing cloud mode (Voyage AI)..."
+        PREFER_LOCAL_EMBEDDINGS=false python -c "
+from mcp_server.src.embedding_manager import get_embedding_manager
+em = get_embedding_manager()
+print(f'Cloud mode: {em.model_type}, dimension: {em.get_vector_dimension()}')
+"
+    else
+        echo "⚠️ VOYAGE_KEY not set, skipping cloud mode test"
     fi
 }
-test_import_performance
-test_search_performance
-test_memory_usage
+# Test mode switching
+test_mode_switching() {
+    echo "Testing mode switching..."
+    python -c "
+from pathlib import Path
+env_file = Path('.env')
+if env_file.exists():
+    content = env_file.read_text()
+    if 'PREFER_LOCAL_EMBEDDINGS=false' in content:
+        print('Currently in CLOUD mode')
+    else:
+        print('Currently in LOCAL mode')
+    # Test switching
+    print('Testing switch to LOCAL mode...')
+    new_content = content.replace('PREFER_LOCAL_EMBEDDINGS=false', 'PREFER_LOCAL_EMBEDDINGS=true')
+    env_file.write_text(new_content)
+    print('✅ Switched to LOCAL mode')
+else:
+    print('⚠️ .env file not found')
+"
+}
+test_both_embedding_modes
+test_mode_switching
 ```
-### 7. Security Validation
+### 10. MCP Tools Comprehensive Test
 ```bash
 #!/bin/bash
-echo "=== SECURITY VALIDATION ==="
+echo "=== MCP TOOLS COMPREHENSIVE TEST ==="
+# This should be run via Claude Code for actual MCP testing
+cat << 'EOF'
+To test all MCP tools in Claude Code:
+1. SEARCH TOOLS:
+   - mcp__claude-self-reflect__reflect_on_past("test query", limit=3)
+   - mcp__claude-self-reflect__quick_search("test")
+   - mcp__claude-self-reflect__search_summary("test")
+   - mcp__claude-self-reflect__search_by_file("server.py")
+   - mcp__claude-self-reflect__search_by_concept("testing")
+2. TEMPORAL TOOLS (NEW):
+   - mcp__claude-self-reflect__get_recent_work(limit=5)
+   - mcp__claude-self-reflect__get_recent_work(project="all")
+   - mcp__claude-self-reflect__search_by_recency("bug", time_range="last week")
+   - mcp__claude-self-reflect__get_timeline(time_range="last month", granularity="week")
+3. REFLECTION TOOLS:
+   - mcp__claude-self-reflect__store_reflection("Test insight", tags=["test"])
+   - mcp__claude-self-reflect__get_full_conversation("conversation-id")
+4. PAGINATION:
+   - mcp__claude-self-reflect__get_more_results("query", offset=3)
+   - mcp__claude-self-reflect__get_next_results("query", offset=3)
+Expected Results:
+- All tools should return valid XML/markdown responses
+- Search scores should be > 0.3 for relevant results
+- Temporal tools should respect project scoping
+- No errors or timeouts
+EOF
+```
-# Check for API key leaks
-check_api_key_security() {
-    echo "Checking for API key exposure..."
-    CHECKS=(
-        "docker logs qdrant 2>&1"
-        "docker logs streaming-importer 2>&1"
-        "find /tmp -name '*claude*' -type f 2>/dev/null"
-    )
+### 6. Docker Health Validation
+```bash
+#!/bin/bash
+echo "=== DOCKER HEALTH VALIDATION ==="
+# Check Qdrant health
+check_qdrant_health() {
+    echo "Checking Qdrant health..."
-    EXPOSED=false
-    for check in "${CHECKS[@]}"; do
-        if eval "$check" | grep -q "VOYAGE_KEY=\|pa-"; then
-            echo "❌ FAIL: Potential API key exposure in: $check"
-            EXPOSED=true
+    # Check if running
+    if docker ps | grep -q qdrant; then
+        # Check API responsive
+        curl -s http://localhost:6333/health | grep -q "ok" && \
+            echo "✅ Qdrant healthy" || \
+            echo "❌ Qdrant API not responding"
+        # Check disk usage
+        DISK_USAGE=$(docker exec qdrant df -h /qdrant/storage | tail -1 | awk '{print $5}' | sed 's/%//')
+        if [ "$DISK_USAGE" -lt 80 ]; then
+            echo "✅ Disk usage: ${DISK_USAGE}%"
+        else
+            echo "⚠️  High disk usage: ${DISK_USAGE}%"
         fi
-    done
-    if [ "$EXPOSED" = false ]; then
-        echo "✅ PASS: No API key exposure detected"
+    else
+        echo "❌ Qdrant not running"
     fi
 }
-# Check file permissions
-check_file_permissions() {
-    echo "Checking file permissions..."
+# Check watcher health
+check_watcher_health() {
+    echo "Checking watcher health..."
-    CONFIG_DIR="$HOME/.claude-self-reflect/config"
-    if [ -d "$CONFIG_DIR" ]; then
-        # Check for world-readable files
-        WORLD_READABLE=$(find "$CONFIG_DIR" -perm -004 -type f 2>/dev/null)
-        if [ -z "$WORLD_READABLE" ]; then
-            echo "✅ PASS: Config files properly secured"
+    WATCHER_NAME="claude-reflection-safe-watcher"
+    if docker ps | grep -q "$WATCHER_NAME"; then
+        # Check memory usage
+        MEM=$(docker stats --no-stream --format "{{.MemUsage}}" "$WATCHER_NAME" 2>/dev/null | cut -d'/' -f1 | sed 's/[^0-9.]//g')
+        if [ -n "$MEM" ]; then
+            echo "✅ Watcher running (Memory: ${MEM}MB)"
         else
-            echo "⚠️  WARN: World-readable files found"
+            echo "⚠️  Watcher running but stats unavailable"
         fi
+        # Check for errors in logs
+        ERROR_COUNT=$(docker logs "$WATCHER_NAME" --tail 100 2>&1 | grep -c ERROR)
+        if [ "$ERROR_COUNT" -eq 0 ]; then
+            echo "✅ No errors in recent logs"
+        else
+            echo "⚠️  Found $ERROR_COUNT errors in logs"
+        fi
+    else
+        echo "❌ Watcher not running"
     fi
 }
-check_api_key_security
-check_file_permissions
-```
+# Check docker-compose status
+check_compose_status() {
+    echo "Checking docker-compose status..."
+    if [ -f docker-compose.yaml ]; then
+        # Validate compose file
+        docker-compose config --quiet 2>/dev/null && \
+            echo "✅ docker-compose.yaml valid" || \
+            echo "❌ docker-compose.yaml has errors"
+        # Check defined services
+        SERVICES=$(docker-compose config --services 2>/dev/null)
+        echo "Defined services: $SERVICES"
+    else
+        echo "❌ docker-compose.yaml not found"
+    fi
+}
-## Test Execution Workflow
+check_qdrant_health
+check_watcher_health
+check_compose_status
+```
-### Pre-Release Testing
+### 7. Modularization Readiness Check (NEW)
 ```bash
 #!/bin/bash
-# Complete pre-release validation
+echo "=== MODULARIZATION READINESS CHECK ==="
-echo "=== PRE-RELEASE TEST SUITE ==="
-echo "Version: $(grep version package.json | cut -d'"' -f4)"
-echo "Date: $(date)"
-echo ""
+# Analyze server.py for modularization
+analyze_server_py() {
+    echo "Analyzing server.py for modularization..."
+    FILE="mcp-server/src/server.py"
+    if [ -f "$FILE" ]; then
+        # Count lines
+        LINES=$(wc -l < "$FILE")
+        echo "Total lines: $LINES"
+        # Count tools
+        TOOL_COUNT=$(grep -c "@mcp.tool()" "$FILE")
+        echo "MCP tools defined: $TOOL_COUNT"
+        # Count imports
+        IMPORT_COUNT=$(grep -c "^import\|^from" "$FILE")
+        echo "Import statements: $IMPORT_COUNT"
+        # Identify major sections
+        echo -e "\nMajor sections to extract:"
+        echo "- Temporal tools (get_recent_work, search_by_recency, get_timeline)"
+        echo "- Search tools (reflect_on_past, quick_search, etc.)"
+        echo "- Reflection tools (store_reflection, get_full_conversation)"
+        echo "- Embedding management (EmbeddingManager, generate_embedding)"
+        echo "- Decay logic (calculate_decay, apply_decay)"
+        echo "- Utils (ProjectResolver, normalize_project_name)"
+        # Check for circular dependencies
+        echo -e "\nChecking for potential circular dependencies..."
+        grep -q "from server import" "$FILE" && \
+            echo "⚠️  Potential circular imports detected" || \
+            echo "✅ No obvious circular imports"
+    else
+        echo "❌ server.py not found"
+    fi
+}
-# 1. Backup current state
-echo "Step 1: Backing up current state..."
-mkdir -p ~/claude-reflect-backup-$(date +%Y%m%d-%H%M%S)
-docker exec qdrant qdrant-backup create
-# 2. Run all test suites
-echo "Step 2: Running test suites..."
-./test-system-health.sh
-./test-import-pipeline.sh
-./test-mcp-integration.sh
-./test-data-integrity.sh
-./test-performance.sh
-./test-security.sh
-# 3. Test both embedding modes
-echo "Step 3: Testing dual modes..."
-./test-dual-mode.sh
-# 4. Generate report
-echo "Step 4: Generating test report..."
-cat > test-report-$(date +%Y%m%d).md << EOF
-# Claude Self-Reflect Test Report
+# Check for existing modular files
+check_existing_modules() {
+    echo -e "\nChecking for existing modular files..."
+    MODULES=(
+        "temporal_utils.py"
+        "temporal_design.py"
+        "project_resolver.py"
+        "embedding_manager.py"
+    )
+    for module in "${MODULES[@]}"; do
+        if [ -f "mcp-server/src/$module" ]; then
+            echo "✅ $module exists"
+        else
+            echo "⚠️  $module not found (needs creation)"
+        fi
+    done
+}
-## Summary
-- Date: $(date)
-- Version: $(grep version package.json | cut -d'"' -f4)
-- All Tests: PASS/FAIL
+analyze_server_py
+check_existing_modules
+```
-## Test Results
-- System Health: ✅
-- Import Pipeline: ✅
-- MCP Integration: ✅
-- Data Integrity: ✅
-- Performance: ✅
-- Security: ✅
-- Dual Mode: ✅
+### 8. Performance & Memory Testing
+```bash
+#!/bin/bash
+echo "=== PERFORMANCE & MEMORY TESTING ==="
-## Certification
-System ready for release: YES/NO
-EOF
+# Test search performance with temporal tools
+test_search_performance() {
+    echo "Testing search performance..."
+    python -c "
+import time
+import asyncio
+import sys
+import os
+sys.path.insert(0, 'mcp-server/src')
+os.environ['QDRANT_URL'] = 'http://localhost:6333'
-echo "✅ Pre-release testing complete"
+async def test():
+    from server import get_recent_work, search_by_recency
+    class MockContext:
+        async def debug(self, msg): pass
+        async def report_progress(self, *args): pass
+    ctx = MockContext()
+    # Time get_recent_work
+    start = time.time()
+    await get_recent_work(ctx, limit=10)
+    recent_time = time.time() - start
+    # Time search_by_recency
+    start = time.time()
+    await search_by_recency(ctx, 'test', 'last week')
+    search_time = time.time() - start
+    print(f'get_recent_work: {recent_time:.2f}s')
+    print(f'search_by_recency: {search_time:.2f}s')
+    if recent_time < 2 and search_time < 2:
+        print('✅ Performance acceptable')
+    else:
+        print('⚠️  Performance needs optimization')
+asyncio.run(test())
+"
+}
+# Test memory usage
+test_memory_usage() {
+    echo "Testing memory usage..."
+    # Check Python process memory
+    python -c "
+import psutil
+import os
+process = psutil.Process(os.getpid())
+mem_mb = process.memory_info().rss / 1024 / 1024
+print(f'Python process: {mem_mb:.1f}MB')
+"
+    # Check Docker container memory
+    for container in qdrant claude-reflection-safe-watcher; do
+        if docker ps | grep -q $container; then
+            MEM=$(docker stats --no-stream --format "{{.MemUsage}}" $container 2>/dev/null | cut -d'/' -f1 | sed 's/[^0-9.]//g')
+            echo "$container: ${MEM}MB"
+        fi
+    done
+}
+test_search_performance
+test_memory_usage
 ```
-### Fresh Installation Test
+### 9. Complete Test Report Generator
 ```bash
 #!/bin/bash
-# Simulate fresh installation
+echo "=== GENERATING TEST REPORT ==="
-echo "=== FRESH INSTALLATION TEST ==="
+REPORT_FILE="test-report-$(date +%Y%m%d-%H%M%S).md"
-# 1. Clean environment
-docker-compose down -v
-rm -rf data/ config/
-claude mcp remove claude-self-reflect
+cat > "$REPORT_FILE" << EOF
+# Claude Self-Reflect Test Report
-# 2. Install from npm
-npm install -g claude-self-reflect@latest
+## Test Summary
+- **Date**: $(date)
+- **Version**: $(grep version package.json | cut -d'"' -f4)
+- **Server.py Lines**: $(wc -l < mcp-server/src/server.py)
+- **Collections**: $(curl -s http://localhost:6333/collections | jq '.result.collections | length')
+## Feature Tests
+### Core Features
+- [ ] Import Pipeline: PASS/FAIL
+- [ ] MCP Tools (12): PASS/FAIL
+- [ ] Search Quality: PASS/FAIL
+- [ ] State Management: PASS/FAIL
+### v3.x Features
+- [ ] Temporal Tools (3): PASS/FAIL
+- [ ] get_recent_work: PASS/FAIL
+- [ ] search_by_recency: PASS/FAIL
+- [ ] get_timeline: PASS/FAIL
+- [ ] Timestamp Indexes: PASS/FAIL
+- [ ] Project Scoping: PASS/FAIL
+### Infrastructure
+- [ ] CLI Tool: PASS/FAIL
+- [ ] Docker Health: PASS/FAIL
+- [ ] Qdrant: PASS/FAIL
+- [ ] Watcher: PASS/FAIL
-# 3. Run setup
-claude-self-reflect setup --local
+### Performance
+- [ ] Search < 2s: PASS/FAIL
+- [ ] Import < 10s: PASS/FAIL
+- [ ] Memory < 500MB: PASS/FAIL
+### Code Quality
+- [ ] No Critical Bugs: PASS/FAIL
+- [ ] XML Injection Fixed: PASS/FAIL
+- [ ] Native Decay Fixed: PASS/FAIL
+- [ ] Modularization Ready: PASS/FAIL
+## Observations
+$(date): Test execution started
+$(date): All temporal tools tested
+$(date): Project scoping validated
+$(date): CLI packaging verified
+$(date): Docker health confirmed
+## Recommendations
+1. Fix critical bugs before release
+2. Complete modularization (2,835 lines → multiple modules)
+3. Add more comprehensive unit tests
+4. Update documentation for v3.x features
-# 4. Wait for first import
-sleep 70
+## Certification
+**System Ready for Release**: YES/NO
-# 5. Verify functionality
-curl -s http://localhost:6333/collections | jq '.result.collections'
+## Sign-off
+Tested by: claude-self-reflect-test agent
+Date: $(date)
+EOF
-# 6. Test MCP
-echo "Manual step: Test MCP tools in Claude Code"
+echo "✅ Test report generated: $REPORT_FILE"
 ```
-## Success Criteria
-### Core Functionality
-- [ ] Import pipeline processes all JSONL files
-- [ ] Embeddings generated correctly (384/1024 dims)
-- [ ] Qdrant stores vectors with proper metadata
-- [ ] MCP tools accessible and functional
-- [ ] Search returns relevant results (>0.7 scores)
+## Pre-Test Validation Protocol
-### Reliability
-- [ ] No duplicates on re-import
-- [ ] File locking prevents corruption
-- [ ] State persists across restarts
-- [ ] Resume works after interruption
-- [ ] Retry logic handles transient failures
+### Agent Self-Review
+Before running any tests, I MUST review myself to ensure comprehensive coverage:
-### Performance
-- [ ] Import <10s per file
-- [ ] Search <1s response time
-- [ ] Memory <300MB total (including model)
-- [ ] No memory leaks over time
-- [ ] Efficient batch processing
+```bash
+#!/bin/bash
+echo "=== PRE-TEST AGENT VALIDATION ==="
+# Review this agent file for completeness
+review_agent_completeness() {
+    echo "Reviewing CSR-tester agent for missing features..."
+    # Check if agent covers all known features
+    AGENT_FILE="$HOME/projects/claude-self-reflect/.claude/agents/claude-self-reflect-test.md"
+    REQUIRED_FEATURES=(
+        "15+ MCP tools"
+        "Temporal tools"
+        "Metadata extraction"
+        "Hook system"
+        "Sub-agents"
+        "Embedding modes"
+        "Zero vectors"
+        "Streaming watcher"
+        "Delta metadata"
+        "Import pipeline"
+        "Docker stack"
+        "CLI tool"
+        "State management"
+        "Memory decay"
+        "Parallel search"
+        "Project scoping"
+        "Collection naming"
+        "Dimension validation"
+        "XML escaping"
+        "Error handling"
+    )
-### Security
-- [ ] No API keys in logs
-- [ ] Secure file permissions
-- [ ] No sensitive data exposure
-- [ ] Proper input validation
-- [ ] Safe concurrent access
+    for feature in "${REQUIRED_FEATURES[@]}"; do
+        if grep -qi "$feature" "$AGENT_FILE"; then
+            echo "✅ $feature: Covered"
+        else
+            echo "❌ $feature: MISSING - Add test coverage!"
+        fi
+    done
+}
-## Troubleshooting Guide
+# Discover any new features from codebase
+discover_new_features() {
+    echo "Scanning for undocumented features..."
-### Common Issues and Solutions
+    # Check for new MCP tools
+    NEW_TOOLS=$(grep -h "@mcp.tool()" mcp-server/src/*.py 2>/dev/null | wc -l)
+    echo "MCP tools found: $NEW_TOOLS"
-#### Import Not Working
-```bash
-# Check logs
-docker logs streaming-importer --tail 50
+    # Check for new scripts
+    NEW_SCRIPTS=$(ls scripts/*.py 2>/dev/null | wc -l)
+    echo "Python scripts found: $NEW_SCRIPTS"
-# Verify paths
-ls -la ~/.claude/projects/
+    # Check for new test files
+    NEW_TESTS=$(find tests -name "*.py" 2>/dev/null | wc -l)
+    echo "Test files found: $NEW_TESTS"
-# Check permissions
-chmod -R 755 ~/.claude/projects/
+    # Check for new hooks
+    if [ -d "$HOME/.claude/hooks" ]; then
+        HOOKS=$(ls "$HOME/.claude/hooks" 2>/dev/null | wc -l)
+        echo "Hooks configured: $HOOKS"
+    fi
+}
-# Force re-import
-rm ~/.claude-self-reflect/config/imported-files.json
-python scripts/import-conversations-unified.py
+review_agent_completeness
+discover_new_features
 ```
-#### Search Returns Poor Results
-```bash
-# Update metadata
-python scripts/delta-metadata-update.py
+## Test Execution Protocol
-# Check embedding mode
-grep PREFER_LOCAL_EMBEDDINGS .env
+### Run Complete Test Suite
+```bash
+#!/bin/bash
+# Master test runner - CSR-tester is the SOLE executor of all tests
-# Verify collection dimensions
-curl http://localhost:6333/collections | jq
-```
+echo "=== CLAUDE SELF-REFLECT COMPLETE TEST SUITE ==="
+echo "Starting at: $(date)"
+echo "Executor: CSR-tester agent (sole test runner)"
+echo ""
-#### MCP Not Available
-```bash
-# Remove and re-add
-claude mcp remove claude-self-reflect
-claude mcp add claude-self-reflect /full/path/to/run-mcp.sh \
-    -e QDRANT_URL="http://localhost:6333" -s user
+# Pre-test validation
+echo "Phase 0: Pre-test Validation..."
+./review_agent_completeness.sh
-# Restart Claude Code
-echo "Restart Claude Code manually"
-```
+# Create test results directory
+mkdir -p test-results-$(date +%Y%m%d)
+cd test-results-$(date +%Y%m%d)
-#### High Memory Usage
-```bash
-# Check for duplicate models
-ls -la ~/.cache/fastembed/
+# Run all test suites
+../test-system-health.sh > health.log 2>&1
+../test-temporal-tools.sh > temporal.log 2>&1
+../test-cli-tool.sh > cli.log 2>&1
+../test-import-pipeline.sh > import.log 2>&1
+../test-docker-health.sh > docker.log 2>&1
+../test-modularization.sh > modular.log 2>&1
+../test-performance.sh > performance.log 2>&1
-# Restart containers
-docker-compose restart
+# Generate final report
+../generate-test-report.sh
-# Clear cache if needed
-rm -rf ~/.cache/fastembed/
+echo ""
+echo "=== TEST SUITE COMPLETE ==="
+echo "Results in: test-results-$(date +%Y%m%d)/"
+echo "Report: test-report-*.md"
 ```
-## Final Certification
-After running all tests, the system should:
-1. Process all conversations correctly
-2. Support both embedding modes
-3. Provide accurate search results
-4. Handle concurrent operations safely
-5. Maintain data integrity
-6. Perform within acceptable limits
-7. Secure sensitive information
-8. **ALWAYS be in local mode after testing**
+## Success Criteria
-Remember: The goal is a robust, reliable system that "just works" for users.
+### Must Pass
+- [ ] All 15+ MCP tools functional
+- [ ] Temporal tools work with proper scoping
+- [ ] Timestamp indexes on all collections
+- [ ] CLI installs and runs globally
+- [ ] Docker containers healthy
+- [ ] No critical bugs (native decay, XML injection, dimension mismatch)
+- [ ] Search returns relevant results
+- [ ] Import pipeline processes files
+- [ ] State persists correctly
+- [ ] NO ZERO VECTORS in any collection
+- [ ] Metadata extraction working (files, tools, concepts, AST patterns)
+- [ ] Both embedding modes functional (local 384d, Voyage 1024d)
+- [ ] Hooks execute properly (session-start, precompact)
+- [ ] All 6 sub-agents available
+### Should Pass
+- [ ] Performance within limits
+- [ ] Memory usage acceptable
+- [ ] Modularization plan approved
+- [ ] Documentation updated
+- [ ] All unit tests pass
+### Nice to Have
+- [ ] 100% test coverage
+- [ ] Zero warnings in logs
+- [ ] Sub-second search times
+## Final Notes
+This agent knows ALL features of Claude Self-Reflect v3.3.0 including:
+- 15+ MCP tools with temporal, search, reflection, pagination capabilities
+- Modularized architecture (search_tools.py, temporal_tools.py, reflection_tools.py, parallel_search.py)
+- Metadata extraction (AST patterns, concepts, files analyzed, tools used)
+- Hook system (session-start, precompact, submit hooks)
+- 6 specialized sub-agents for different domains
+- Dual embedding support (FastEmbed 384d, Voyage AI 1024d)
+- Zero vector detection and prevention
+- Streaming watcher and delta metadata updater
+- Project scoping and cross-collection search
+- Memory decay (client-side with 90-day half-life)
+- GPT-5 review recommendations and critical fixes
+- All test scripts and their purposes
+The agent will ALWAYS restore the system to local mode after testing and provide comprehensive reports suitable for release decisions.