claude-self-reflect 2.5.16 → 2.5.18

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -13,6 +13,26 @@ You are a Docker orchestration specialist for the memento-stack project. You man
  - Services run on host network for local development
  - Production uses Railway deployment

+ ## CRITICAL GUARDRAILS (from v2.5.17 crisis)
+
+ ### Resource Limit Guidelines
+ ⚠️ **Memory limits must include baseline usage**
+ - Measure baseline: `docker stats --no-stream`
+ - Add 200MB+ headroom above baseline
+ - Default: 600MB minimum (not 400MB)
+
+ ⚠️ **CPU monitoring in containers**
+ - Containers see all host CPUs but have cgroup limits
+ - 1437% CPU = ~90% of actual allocation
+ - Use cgroup-aware monitoring: `/sys/fs/cgroup/cpu/cpu.cfs_quota_us`
+
+ ### Pre-Deployment Checklist
+ ✅ Test with production data volumes (600+ files)
+ ✅ Verify STATE_FILE paths match between config and container
+ ✅ Check volume mounts are writable
+ ✅ Confirm memory/CPU limits are realistic
+ ✅ Test graceful shutdown handling
+
  ## Key Responsibilities

  1. **Service Management**
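The cgroup guardrail added above is subtle in practice: host-oriented tools report CPU as a percentage of all host cores, so a figure like 1437% is only meaningful relative to the container's quota. A minimal sketch of cgroup-aware quota detection, assuming a Linux container with either cgroup v1 (the `cpu.cfs_quota_us` path named in the guardrail) or cgroup v2 (`cpu.max`):

```bash
#!/usr/bin/env bash
# Sketch: report the CPU allocation this container is actually entitled to,
# so host-style readings like "1437%" can be interpreted against the quota.

if [ -f /sys/fs/cgroup/cpu.max ]; then
    # cgroup v2: file contains "<quota> <period>", or "max <period>" if unlimited
    read -r quota period < /sys/fs/cgroup/cpu.max
elif [ -f /sys/fs/cgroup/cpu/cpu.cfs_quota_us ]; then
    # cgroup v1: the path named in the guardrail above; quota is -1 if unlimited
    quota=$(cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us)
    period=$(cat /sys/fs/cgroup/cpu/cpu.cfs_period_us)
fi

if [ "${quota:-max}" = "max" ] || [ "$quota" -lt 0 ]; then
    echo "No cgroup CPU quota; container may use all $(nproc) host CPUs"
else
    # e.g. quota=150000, period=100000 -> 1.5 CPUs of real allocation
    echo "CPU quota: $(awk "BEGIN {print $quota / $period}") CPUs"
fi
```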
@@ -16,6 +16,21 @@ You are an import pipeline debugging expert for the memento-stack project. You s
  - Project name must be correctly extracted from path for proper collection naming
  - Collections named using MD5 hash of project name

+ ## CRITICAL GUARDRAILS (from v2.5.17 crisis)
+
+ ### Pre-Release Testing Checklist
+ ✅ **Test with actual Claude JSONL files** - Real ~/.claude/projects/*.jsonl files
+ ✅ **Verify processing metrics** - files_processed, chunks_created must be > 0
+ ✅ **Memory limits = baseline + headroom** - Measure actual usage first (typically 400MB base)
+ ✅ **Run tests to completion** - Don't mark as done without execution proof
+ ✅ **Handle production backlogs** - Test with 600+ file queues
+
+ ### Common Failure Patterns
+ 🚨 **State updates without progress** - high_water_mark changes but processed_files = 0
+ 🚨 **Memory limit blocking** - "Memory limit exceeded" on every file = limit too low
+ 🚨 **CPU misreporting** - 1437% CPU might be 90% of container limit
+ 🚨 **Wrong file format** - Testing with .json when production uses .jsonl
+
  ## Key Responsibilities

  1. **JSONL Processing**
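The "state updates without progress" pattern added above can be caught mechanically. A minimal sketch, assuming the importer writes a JSON state file; the default path and the `high_water_mark`/`files_processed`/`chunks_created` field names are assumptions taken from the checklist wording, not a confirmed layout:

```bash
#!/usr/bin/env bash
# Sketch: fail fast when state advances but nothing was actually processed.
# STATE_FILE default and JSON field names are assumptions for illustration.
STATE_FILE="${STATE_FILE:-./config/imported-files.json}"

processed=$(jq -r '.files_processed // 0' "$STATE_FILE")
chunks=$(jq -r '.chunks_created // 0' "$STATE_FILE")
watermark=$(jq -r '.high_water_mark // empty' "$STATE_FILE")

if [ -n "$watermark" ] && [ "$processed" -eq 0 ]; then
    echo "FAIL: high_water_mark=$watermark advanced but files_processed=0" >&2
    exit 1
fi
echo "OK: files_processed=$processed chunks_created=$chunks"
```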
@@ -36,6 +36,25 @@ You are a Qdrant vector database specialist for the memento-stack project. Your
  - Analyze embedding quality
  - Compare different embedding models (Voyage vs OpenAI)

+ ## CRITICAL GUARDRAILS (from v2.5.17 crisis)
+
+ ### Testing Requirements
+ - **ALWAYS test with real JSONL files** - Claude uses .jsonl, not .json
+ - **Verify actual processing** - Check files_processed > 0, not just state updates
+ - **Test memory limits with baseline** - System uses 400MB baseline, set limits accordingly
+ - **Never mark tests complete without execution** - Run and verify output
+
+ ### Resource Management
+ - **Default memory to 600MB minimum** - 400MB is too conservative
+ - **Monitor baseline + headroom** - Measure actual usage before setting limits
+ - **Use cgroup-aware CPU monitoring** - Docker shows all CPUs but has limits
+
+ ### Quality Gates
+ - **Follow the workflow**: implementation → review → test → docs → release
+ - **Use pre-releases for major changes** - Better to test than break production
+ - **Document quantitative metrics** - "0 files processed" is clearer than "test failed"
+ - **Rollback immediately on failure** - Don't push through broken releases
+
  ## Essential Commands

  ### Collection Operations
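The "verify actual processing" requirement added above can be checked against Qdrant itself rather than the importer's state, using Qdrant's REST endpoint `GET /collections/{name}` (which reports `points_count`). In this sketch the project name is hypothetical and the `conv_<md5>` collection name is an assumed illustration of the MD5-of-project-name scheme noted earlier; verify the real prefix before relying on it:

```bash
#!/usr/bin/env bash
# Sketch: confirm chunks actually landed in Qdrant instead of trusting
# state updates alone. The conv_<md5> collection name is an assumption.
QDRANT_URL="${QDRANT_URL:-http://localhost:6333}"
project="my-project"                                   # hypothetical project
hash=$(printf '%s' "$project" | md5sum | cut -d' ' -f1)
collection="conv_${hash}"

count=$(curl -s "$QDRANT_URL/collections/$collection" |
        jq -r '.result.points_count // 0')
if [ "$count" -gt 0 ]; then
    echo "OK: $collection holds $count points"
else
    echo "FAIL: $collection is empty - import produced no chunks" >&2
    exit 1
fi
```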
@@ -14,11 +14,12 @@ RUN pip install --upgrade pip setuptools wheel
  # Install PyTorch CPU version for smaller size
  RUN pip install --no-cache-dir torch==2.3.0 --index-url https://download.pytorch.org/whl/cpu

- # Install other dependencies
+ # Install other dependencies with updated versions for security
+ # Note: fastembed 0.4.0 requires numpy<2, so using compatible version
  RUN pip install --no-cache-dir \
  qdrant-client==1.15.0 \
- fastembed==0.2.7 \
- numpy==1.26.0 \
+ fastembed==0.4.0 \
+ numpy==1.26.4 \
  psutil==7.0.0 \
  tenacity==8.2.3 \
  python-dotenv==1.0.0 \
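Since the new Dockerfile comment pins `fastembed==0.4.0` against its `numpy<2` constraint, a quick check of the pinned set outside Docker can surface resolver conflicts before a slow image build. A minimal sketch using a throwaway virtualenv:

```bash
# Sketch: vet the pinned dependency set in a throwaway virtualenv so an
# incompatible pin (e.g. numpy>=2 with fastembed 0.4.0) fails here, not
# inside a slow docker build.
python3 -m venv /tmp/pin-check
. /tmp/pin-check/bin/activate
pip install --quiet fastembed==0.4.0 numpy==1.26.4 qdrant-client==1.15.0
pip check                      # reports broken/conflicting requirements
python -c "import numpy, fastembed; print('numpy', numpy.__version__)"
deactivate
```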
package/README.md CHANGED
@@ -6,10 +6,7 @@ Claude forgets everything. This fixes that.

  - [What You Get](#what-you-get)
  - [Requirements](#requirements)
- - [Quick Install](#quick-install)
- - [Local Mode (Default)](#local-mode-default---your-data-stays-private)
- - [Cloud Mode](#cloud-mode-better-search-accuracy)
- - [Uninstall Instructions](#uninstall-instructions)
+ - [Quick Install/Uninstall](#quick-installuninstall)
  - [The Magic](#the-magic)
  - [Before & After](#before--after)
  - [Real Examples](#real-examples-that-made-us-build-this)
@@ -18,10 +15,9 @@ Claude forgets everything. This fixes that.
  - [Using It](#using-it)
  - [Key Features](#key-features)
  - [Performance](#performance)
- - [V2.5.16 Critical Updates](#v2516-critical-updates)
  - [Configuration](#configuration)
  - [Technical Stack](#the-technical-stack)
- - [Problems?](#problems)
+ - [Problems](#problems)
  - [What's New](#whats-new)
  - [Advanced Topics](#advanced-topics)
  - [Contributors](#contributors)
@@ -30,12 +26,12 @@ Claude forgets everything. This fixes that.

  Ask Claude about past conversations. Get actual answers. **100% local by default** - your conversations never leave your machine. Cloud-enhanced search available when you need it.

- **✅ Proven at Scale**: Successfully indexed 682 conversation files with 100% reliability. No data loss, no corruption, just seamless conversation memory that works.
+ **Proven at Scale**: Successfully indexed 682 conversation files with 100% reliability. No data loss, no corruption, just seamless conversation memory that works.

  **Before**: "I don't have access to previous conversations"
  **After**:
  ```
- reflection-specialist(Search FastEmbed vs cloud embedding decision)
+ reflection-specialist(Search FastEmbed vs cloud embedding decision)
  ⎿ Done (3 tool uses · 8.2k tokens · 12.4s)

  "Found it! Yesterday we decided on FastEmbed for local mode - better privacy,
@@ -52,9 +48,11 @@ Your conversations become searchable. Your decisions stay remembered. Your conte
  - **Node.js** 16+ (for the setup wizard)
  - **Claude Desktop** app

- ## Quick Install
+ ## Quick Install/Uninstall

- ### Local Mode (Default - Your Data Stays Private)
+ ### Install
+
+ #### Local Mode (Default - Your Data Stays Private)
  ```bash
  # Install and run automatic setup
  npm install -g claude-self-reflect
@@ -69,7 +67,7 @@ claude-self-reflect setup
  # 🔒 Keep all data local - no API keys needed
  ```

- ### Cloud Mode (Better Search Accuracy)
+ #### Cloud Mode (Better Search Accuracy)
  ```bash
  # Step 1: Get your free Voyage AI key
  # Sign up at https://www.voyageai.com/ - it takes 30 seconds
@@ -122,7 +120,7 @@ Here's how your conversations get imported and prioritized:
  ![Import Architecture](docs/diagrams/import-architecture.png)

  **The system intelligently prioritizes your conversations:**
- - **🔥 HOT** (< 5 minutes): Switches to 2-second intervals for near real-time import
+ - **HOT** (< 5 minutes): Switches to 2-second intervals for near real-time import
  - **🌡️ WARM** (< 24 hours): Normal priority, processed every 60 seconds
  - **❄️ COLD** (> 24 hours): Batch processed, max 5 per cycle to prevent blocking

@@ -138,7 +136,7 @@ The reflection specialist automatically activates. No special commands needed.

  ## Key Features

- ### 🎯 Project-Scoped Search
+ ### Project-Scoped Search
  Searches are **project-aware by default**. Claude automatically searches within your current project:

  ```
@@ -162,40 +160,6 @@ Recent conversations matter more. Old ones fade. Like your brain, but reliable.
  - **Scale**: 100% indexing success rate across all conversation types
  - **V2 Migration**: 100% complete - all conversations use token-aware chunking

- ## V2.5.16 Critical Updates
-
- ### 🚨 CPU Performance Fix - RESOLVED
- **Issue**: Streaming importer was consuming **1437% CPU** causing system overload
- **Solution**: Complete rewrite with production-grade throttling and monitoring
- **Result**: CPU usage reduced to **<1%** (99.93% improvement)
-
- ### ✅ Production-Ready Streaming Importer
- - **Non-blocking CPU monitoring** with cgroup awareness
- - **Queue overflow protection** - data deferred, never dropped
- - **Atomic state persistence** with fsync for crash recovery
- - **Memory management** with 15% GC buffer and automatic cleanup
- - **Proper async signal handling** for clean shutdowns
-
- ### 🎯 100% V2 Token-Aware Chunking
- - **Complete Migration**: All collections now use optimized chunking
- - **Configuration**: 400 tokens/1600 chars with 75 token/300 char overlap
- - **Search Quality**: Improved semantic boundaries and context preservation
- - **Memory Efficiency**: Streaming processing prevents OOM during imports
-
- ### 📊 Performance Metrics (v2.5.16)
- | Metric | Before | After | Improvement |
- |--------|--------|-------|-------------|
- | CPU Usage | 1437% | <1% | 99.93% ↓ |
- | Memory | 8GB peak | 302MB | 96.2% ↓ |
- | Search Latency | Variable | 3.16ms avg | Consistent |
- | Test Success | Unstable | 21/25 passing | Reliable |
-
- ### 🔧 CLI Status Command Fix
- Fixed broken `--status` command in MCP server - now returns:
- - Collection counts and health
- - Real-time CPU and memory usage
- - Search performance metrics
- - Import processing status

  ## The Technical Stack

@@ -206,7 +170,7 @@ Fixed broken `--status` command in MCP server - now returns:
  - **MCP Server**: Python + FastMCP
  - **Search**: Semantic similarity with time decay

- ## Problems?
+ ## Problems

  - [Troubleshooting Guide](docs/troubleshooting.md)
  - [GitHub Issues](https://github.com/ramakay/claude-self-reflect/issues)
@@ -214,7 +178,8 @@ Fixed broken `--status` command in MCP server - now returns:

  ## What's New

- - **v2.5.16** - **CRITICAL PERFORMANCE UPDATE** - Fixed 1437% CPU overload, 100% V2 migration complete, production streaming importer
+ - **v2.5.17** - Critical CPU fix and memory limit adjustment. [Full release notes](docs/releases/v2.5.17-release-notes.md)
+ - **v2.5.16** - (Pre-release only) Initial streaming importer with CPU throttling
  - **v2.5.15** - Critical bug fixes and collection creation improvements
  - **v2.5.14** - Async importer collection fix - All conversations now searchable
  - **v2.5.11** - Critical cloud mode fix - Environment variables now properly passed to MCP server
@@ -231,6 +196,22 @@ Fixed broken `--status` command in MCP server - now returns:
  - [Architecture details](docs/architecture-details.md)
  - [Contributing](CONTRIBUTING.md)

+ ### Uninstall
+
+ For complete uninstall instructions, see [docs/UNINSTALL.md](docs/UNINSTALL.md).
+
+ Quick uninstall:
+ ```bash
+ # Remove MCP server
+ claude mcp remove claude-self-reflect
+
+ # Stop Docker containers
+ docker-compose down
+
+ # Uninstall npm package
+ npm uninstall -g claude-self-reflect
+ ```
+
  ## Contributors

  Special thanks to our contributors:
@@ -240,6 +221,4 @@ Special thanks to our contributors:

  ---

- Stop reading. Start installing. Your future self will thank you.
-
- MIT License. Built with ❤️ for the Claude community.
+ Built with ❤️ by [ramakay](https://github.com/ramakay) for the Claude community.
@@ -110,7 +110,7 @@ services:
  - MAX_CONCURRENT_QDRANT=2 # Limit concurrent Qdrant operations
  - IMPORT_FREQUENCY=15 # Check every 15 seconds instead of 1
  - BATCH_SIZE=3 # Process only 3 files at a time
- - MEMORY_LIMIT_MB=400 # Tight memory limit
+ - MEMORY_LIMIT_MB=600 # Memory limit (increased from 400MB for stability)
  - MAX_QUEUE_SIZE=100 # Limit queue size
  - MAX_BACKLOG_HOURS=24 # Alert if backlog > 24 hours
  - QDRANT_TIMEOUT=10 # 10 second timeout for Qdrant ops
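The `MEMORY_LIMIT_MB` bump from 400 to 600 follows the baseline-plus-headroom guardrail added to the agent docs above. A minimal sketch for deriving the value from a running container, assuming an illustrative container name and that `docker stats` reports current usage in MiB:

```bash
# Sketch: derive MEMORY_LIMIT_MB from measured baseline + headroom, per the
# "baseline + 200MB" guardrail. Container name is illustrative; assumes
# docker stats reports the current usage in MiB.
CONTAINER="${CONTAINER:-streaming-importer}"
HEADROOM_MB=200

# MemUsage looks like "412.3MiB / 600MiB"; keep the first (current) figure
baseline_mb=$(docker stats --no-stream --format '{{.MemUsage}}' "$CONTAINER" |
              awk '{gsub(/MiB.*/,"",$1); print int($1)}')

suggested=$((baseline_mb + HEADROOM_MB))
[ "$suggested" -lt 600 ] && suggested=600   # enforce the 600MB floor
echo "Baseline ${baseline_mb}MB -> set MEMORY_LIMIT_MB=${suggested}"
```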
@@ -1,6 +1,6 @@
  [project]
  name = "claude-self-reflect-mcp"
- version = "2.5.16"
+ version = "2.5.18"
  description = "MCP server for Claude self-reflection with memory decay"
  # readme = "README.md"
  requires-python = ">=3.10"
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "claude-self-reflect",
- "version": "2.5.16",
+ "version": "2.5.18",
  "description": "Give Claude perfect memory of all your conversations - Installation wizard for Python MCP server",
  "keywords": [
  "claude",