claude-self-reflect 2.4.4 → 2.4.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -69,6 +69,34 @@ Search for relevant past conversations using semantic similarity.
69
69
  project: "all", // Search across all projects
70
70
  limit: 10
71
71
  }
72
+
73
+ // Debug mode with raw Qdrant data (NEW in v2.4.5)
74
+ {
75
+ query: "search quality issues",
76
+ project: "all",
77
+ limit: 5,
78
+ include_raw: true // Include full payload for debugging
79
+ }
80
+
81
+ // Choose response format (NEW in v2.4.5)
82
+ {
83
+ query: "playwright issues",
84
+ limit: 5,
85
+ response_format: "xml" // Use XML format (default)
86
+ }
87
+
88
+ {
89
+ query: "playwright issues",
90
+ limit: 5,
91
+ response_format: "markdown" // Use original markdown format
92
+ }
93
+
94
+ // Brief mode for minimal responses (NEW in v2.4.5)
95
+ {
96
+ query: "error handling patterns",
97
+ limit: 3,
98
+ brief: true // Returns minimal excerpts (100 chars) for faster response
99
+ }
72
100
  ```
73
101
 
74
102
  #### Default Behavior: Project-Scoped Search (NEW in v2.4.3)
@@ -89,6 +117,87 @@ Save important insights and decisions for future retrieval.
89
117
  }
90
118
  ```
91
119
 
120
+ ### Specialized Search Tools (NEW in v2.4.5)
121
+
122
+ **Note**: These specialized tools are available through this reflection-specialist agent. Due to FastMCP limitations, they cannot be called directly via MCP (e.g., `mcp__claude-self-reflect__quick_search`), but work perfectly when used through this agent.
123
+
124
+ #### quick_search
125
+ Fast search that returns only the count and top result. Perfect for quick checks and overview.
126
+
127
+ ```javascript
128
+ // Quick overview of matches
129
+ {
130
+ query: "authentication patterns",
131
+ min_score: 0.5, // Optional, defaults to 0.7
132
+ project: "all" // Optional, defaults to current project
133
+ }
134
+ ```
135
+
136
+ Returns:
137
+ - Total match count across all results
138
+ - Details of only the top result
139
+ - Minimal response size for fast performance
140
+
141
+ #### search_summary
142
+ Get aggregated insights without individual result details. Ideal for pattern analysis.
143
+
144
+ ```javascript
145
+ // Analyze patterns across conversations
146
+ {
147
+ query: "error handling",
148
+ project: "all", // Optional
149
+ limit: 10 // Optional, how many results to analyze
150
+ }
151
+ ```
152
+
153
+ Returns:
154
+ - Total matches and average relevance score
155
+ - Project distribution (which projects contain matches)
156
+ - Common themes extracted from results
157
+ - No individual result details (faster response)
158
+
159
+ #### get_more_results
160
+ Pagination support for getting additional results after an initial search.
161
+
162
+ ```javascript
163
+ // Get next batch of results
164
+ {
165
+ query: "original search query", // Must match original query
166
+ offset: 3, // Skip first 3 results
167
+ limit: 3, // Get next 3 results
168
+ min_score: 0.7, // Optional
169
+ project: "all" // Optional
170
+ }
171
+ ```
172
+
173
+ Note: Since Qdrant doesn't support true offset, this fetches offset+limit results and returns only the requested slice. Best used for exploring beyond initial results.
174
+
175
+ ## Debug Mode (NEW in v2.4.5)
176
+
177
+ ### Using include_raw for Troubleshooting
178
+ When search quality issues arise or you need to understand why certain results are returned, enable debug mode:
179
+
180
+ ```javascript
181
+ {
182
+ query: "your search query",
183
+ include_raw: true // Adds full Qdrant payload to results
184
+ }
185
+ ```
186
+
187
+ **Warning**: Debug mode significantly increases response size (3-5x larger). Use only when necessary.
188
+
189
+ ### What's Included in Raw Data
190
+ - **full-text**: Complete conversation text (not just 500 char excerpt)
191
+ - **point-id**: Qdrant's unique identifier for the chunk
192
+ - **vector-distance**: Raw similarity score (1 - cosine_similarity)
193
+ - **metadata**: All stored fields including timestamps, roles, project paths
194
+
195
+ ### When to Use Debug Mode
196
+ 1. **Search Quality Issues**: Understanding why irrelevant results rank high
197
+ 2. **Project Filtering Problems**: Debugging project scoping issues
198
+ 3. **Embedding Analysis**: Comparing similarity scores across models
199
+ 4. **Data Validation**: Verifying what's actually stored in Qdrant
200
+
92
201
  ## Search Strategy Guidelines
93
202
 
94
203
  ### Understanding Score Ranges
@@ -111,10 +220,23 @@ Save important insights and decisions for future retrieval.
111
220
  4. **Use Context**: Include technology names, error messages, or specific terms
112
221
  5. **Cross-Project When Needed**: Similar problems may have been solved elsewhere
113
222
 
114
- ## Response Format
223
+ ## Response Format (NEW in v2.4.5)
115
224
 
116
- ### XML-Structured Output
117
- To facilitate better parsing and metadata handling, structure your responses using XML format:
225
+ ### Choosing Response Format
226
+ The MCP server now supports two response formats:
227
+ - **XML** (default): Structured format for better parsing and metadata handling
228
+ - **Markdown**: Original format for compatibility and real-time playback
229
+
230
+ Use the `response_format` parameter to select:
231
+ ```javascript
232
+ {
233
+ query: "your search",
234
+ response_format: "xml" // or "markdown"
235
+ }
236
+ ```
237
+
238
+ ### XML-Structured Output (Default)
239
+ The XML format provides better structure for parsing and includes performance metadata:
118
240
 
119
241
  ```xml
120
242
  <reflection-search>
@@ -135,6 +257,16 @@ To facilitate better parsing and metadata handling, structure your responses usi
135
257
  <key-finding>One-line summary of the main insight</key-finding>
136
258
  <excerpt>Most relevant quote or context from the conversation</excerpt>
137
259
  <conversation-id>optional-id</conversation-id>
260
+ <!-- Optional: Only when include_raw=true -->
261
+ <raw-data>
262
+ <full-text>Complete conversation text...</full-text>
263
+ <point-id>qdrant-uuid</point-id>
264
+ <vector-distance>0.275</vector-distance>
265
+ <metadata>
266
+ <field1>value1</field1>
267
+ <field2>value2</field2>
268
+ </metadata>
269
+ </raw-data>
138
270
  </result>
139
271
 
140
272
  <result rank="2">
@@ -156,13 +288,48 @@ To facilitate better parsing and metadata handling, structure your responses usi
156
288
  </reflection-search>
157
289
  ```
158
290
 
291
+ ### Markdown Format (For Compatibility)
292
+ The original markdown format is simpler and enables real-time playback in Claude:
293
+
294
+ ```
295
+ Found 3 relevant conversation(s) for 'your query':
296
+
297
+ **Result 1** (Score: 0.725)
298
+ Time: 2024-01-15 10:30:00
299
+ Project: ProjectName
300
+ Role: assistant
301
+ Excerpt: The relevant excerpt from the conversation...
302
+ ---
303
+
304
+ **Result 2** (Score: 0.612)
305
+ Time: 2024-01-14 15:45:00
306
+ Project: ProjectName
307
+ Role: user
308
+ Excerpt: Another relevant excerpt...
309
+ ---
310
+ ```
311
+
312
+ ### When to Use Each Format
313
+
314
+ **Use XML format when:**
315
+ - Main agent needs to parse and process results
316
+ - Performance metrics are important
317
+ - Debugging search quality issues
318
+ - Need structured metadata access
319
+
320
+ **Use Markdown format when:**
321
+ - Testing real-time playback in Claude UI
322
+ - Simple manual searches
323
+ - Compatibility with older workflows
324
+ - Prefer simpler output
325
+
159
326
  ### Response Best Practices
160
327
 
161
- 1. **Always use XML structure** for main content
162
- 2. **Indicate Search Scope** in the summary section
328
+ 1. **Use XML format by default** unless markdown is specifically requested
329
+ 2. **Indicate Search Scope** in the summary section (XML) or header (markdown)
163
330
  3. **Order results by relevance** (highest score first)
164
- 4. **Include actionable insights** in the analysis section
165
- 5. **Provide metadata** for transparency
331
+ 4. **Include actionable insights** in the analysis section (XML format)
332
+ 5. **Provide metadata** for transparency and debugging
166
333
 
167
334
  ### Proactive Cross-Project Search Suggestions
168
335
 
package/README.md CHANGED
@@ -123,50 +123,70 @@ Once installed, just talk naturally:
123
123
 
124
124
  The reflection specialist automatically activates. No special commands needed.
125
125
 
126
- ## Project-Scoped Search (New in v2.4.3)
126
+ ## Performance & Usage Guide
127
127
 
128
- **⚠️ Breaking Change**: Searches now default to current project only. Previously searched all projects.
128
+ ### 🚀 Lightning Fast Search
129
+ Optimized to deliver results in **200-350ms** (10-40x faster than v2.4.4)
129
130
 
130
- Conversations are now **project-aware by default**. When you ask about past conversations, Claude automatically searches within your current project directory, keeping results focused and relevant.
131
+ ### 🎯 Recommended Usage: Through Reflection-Specialist Agent
131
132
 
132
- ### How It Works
133
+ **Why use the agent instead of direct MCP tools?**
134
+ - **Preserves your main conversation context** - Search results don't clutter your working memory
135
+ - **Rich formatted responses** - Clean markdown instead of raw XML in your conversation
136
+ - **Better user experience** - Real-time streaming feedback and progress indicators
137
+ - **Proper tool counting** - Shows actual tool usage instead of "0 tool uses"
138
+ - **Automatic cross-project search** - Agent suggests searching across projects when relevant
139
+ - **Specialized search tools** - Access to quick_search, search_summary, and pagination
133
140
 
141
+ **Context Preservation Benefit:**
142
+ When you use the reflection-specialist agent, all the search results and processing happen in an isolated context. This means:
143
+ - Your main conversation stays clean and focused
144
+ - No XML dumps or raw data in your chat history
145
+ - Multiple searches won't exhaust your context window
146
+ - You get just the insights, not the implementation details
147
+
148
+ **Example:**
149
+ ```
150
+ You: "What Docker issues did we solve?"
151
+ [Claude automatically spawns reflection-specialist agent]
152
+ ⏺ reflection-specialist(Search Docker issues)
153
+ ⎿ Searching 57 collections...
154
+ ⎿ Found 5 relevant conversations
155
+ ⎿ Done (1 tool use · 12k tokens · 2.3s)
156
+ [Returns clean, formatted insights without cluttering your context]
134
157
  ```
135
- # Example: Working in ~/projects/ShopifyMCPMockShop
136
- You: "What authentication method did we implement?"
137
- Claude: [Searches ONLY ShopifyMCPMockShop conversations]
138
- "Found 3 conversations about JWT authentication..."
139
158
 
140
- # To search everywhere (like pre-v2.4.3 behavior)
141
- You: "Search all projects for WebSocket implementations"
142
- Claude: [Searches across ALL your projects]
143
- "Found implementations in 5 projects: ..."
159
+ ### Performance Baselines
144
160
 
145
- # To search a specific project
146
- You: "Find Docker setup in claude-self-reflect project"
147
- Claude: [Searches only claude-self-reflect conversations]
148
- ```
161
+ | Method | Search Time | Total Time | Context Impact | Best For |
162
+ |--------|------------|------------|----------------|----------|
163
+ | Direct MCP | 200-350ms | 200-350ms | Uses main context | Programmatic use, when context space matters |
164
+ | Via Agent | 200-350ms | 24-30s* | Isolated context | Interactive use, exploration, multiple searches |
149
165
 
150
- ### Key Behaviors
166
+ *Note: The 24-30s includes context preservation overhead, which keeps your main conversation clean
151
167
 
152
- | Search Type | How to Trigger | Example |
153
- |------------|----------------|---------|
154
- | **Current Project** (default) | Just ask normally | "What did we discuss about caching?" |
155
- | **All Projects** | Say "all projects" or "across projects" | "Search all projects for error handling" |
156
- | **Specific Project** | Mention the project name | "Find auth code in MyApp project" |
168
+ **Note**: The specialized tools (`quick_search`, `search_summary`, `get_more_results`) only work through the reflection-specialist agent due to MCP protocol limitations.
157
169
 
158
- ### Why This Change?
170
+ ## Key Features
159
171
 
160
- - **Focused Results**: No more sifting through unrelated conversations
161
- - **Better Performance**: Single-project search is ~100ms faster
162
- - **Natural Workflow**: Results match your current working context
163
- - **Privacy**: Work and personal projects stay isolated
172
+ ### 🎯 Project-Scoped Search
173
+ Searches are **project-aware by default** (v2.4.3+). Claude automatically searches within your current project:
164
174
 
165
- ### Upgrading from Earlier Versions?
175
+ ```
176
+ # In ~/projects/MyApp
177
+ You: "What authentication method did we use?"
178
+ Claude: [Searches ONLY MyApp conversations]
166
179
 
167
- Your existing conversations remain searchable. The only change is that searches now default to your current project. To get the old behavior, simply ask to "search all projects".
180
+ # To search everywhere
181
+ You: "Search all projects for WebSocket implementations"
182
+ Claude: [Searches across ALL your projects]
183
+ ```
168
184
 
169
- See [Project-Scoped Search Guide](docs/project-scoped-search.md) for detailed examples and advanced usage.
185
+ | Search Scope | How to Trigger | Example |
186
+ |------------|----------------|---------|
187
+ | Current Project (default) | Just ask normally | "What did we discuss about caching?" |
188
+ | All Projects | Say "all projects" | "Search all projects for error handling" |
189
+ | Specific Project | Name the project | "Find auth code in MyApp project" |
170
190
 
171
191
  ## Memory Decay
172
192
 
@@ -237,10 +257,14 @@ Both embedding options work well. Local mode uses FastEmbed for privacy and offl
237
257
  - [GitHub Issues](https://github.com/ramakay/claude-self-reflect/issues)
238
258
  - [Discussions](https://github.com/ramakay/claude-self-reflect/discussions)
239
259
 
240
- ## Latest Updates
260
+ ## What's New
261
+
262
+ ### Recent Updates
263
+ - **v2.4.5** - 10-40x performance boost, context preservation
264
+ - **v2.4.3** - Project-scoped search (breaking change)
265
+ - **v2.3.7** - Local embeddings by default for privacy
241
266
 
242
- - 📢 [v2.4.x Announcement](https://github.com/ramakay/claude-self-reflect/discussions/19) - Major improvements including Docker setup and project-scoped search
243
- - 💬 [Project-Scoped Search Feedback](https://github.com/ramakay/claude-self-reflect/discussions/17) - Share your experience with the breaking change
267
+ 📚 [Full Release History](docs/release-history.md) | 💬 [Discussions](https://github.com/ramakay/claude-self-reflect/discussions)
244
268
 
245
269
  ## Contributing
246
270
 
@@ -4,10 +4,11 @@ import os
4
4
  import asyncio
5
5
  from pathlib import Path
6
6
  from typing import Any, Optional, List, Dict, Union
7
- from datetime import datetime
7
+ from datetime import datetime, timezone
8
8
  import json
9
9
  import numpy as np
10
10
  import hashlib
11
+ import time
11
12
 
12
13
  from fastmcp import FastMCP, Context
13
14
  from pydantic import BaseModel, Field
@@ -87,6 +88,7 @@ class SearchResult(BaseModel):
87
88
  project_name: str
88
89
  conversation_id: Optional[str] = None
89
90
  collection_name: str
91
+ raw_payload: Optional[Dict[str, Any]] = None # Full Qdrant payload when debug mode enabled
90
92
 
91
93
 
92
94
  # Initialize FastMCP instance
@@ -151,16 +153,25 @@ async def reflect_on_past(
151
153
  limit: int = Field(default=5, description="Maximum number of results to return"),
152
154
  min_score: float = Field(default=0.7, description="Minimum similarity score (0-1)"),
153
155
  use_decay: Union[int, str] = Field(default=-1, description="Apply time-based decay: 1=enable, 0=disable, -1=use environment default (accepts int or str)"),
154
- project: Optional[str] = Field(default=None, description="Search specific project only. If not provided, searches current project based on working directory. Use 'all' to search across all projects.")
156
+ project: Optional[str] = Field(default=None, description="Search specific project only. If not provided, searches current project based on working directory. Use 'all' to search across all projects."),
157
+ include_raw: bool = Field(default=False, description="Include raw Qdrant payload data for debugging (increases response size)"),
158
+ response_format: str = Field(default="xml", description="Response format: 'xml' or 'markdown'"),
159
+ brief: bool = Field(default=False, description="Brief mode: returns minimal information for faster response")
155
160
  ) -> str:
156
161
  """Search for relevant past conversations using semantic search with optional time decay."""
157
162
 
163
+ # Start timing
164
+ start_time = time.time()
165
+ timing_info = {}
166
+
158
167
  # Normalize use_decay to integer
168
+ timing_info['param_parsing_start'] = time.time()
159
169
  if isinstance(use_decay, str):
160
170
  try:
161
171
  use_decay = int(use_decay)
162
172
  except ValueError:
163
173
  raise ValueError("use_decay must be '1', '0', or '-1'")
174
+ timing_info['param_parsing_end'] = time.time()
164
175
 
165
176
  # Parse decay parameter using integer approach
166
177
  should_use_decay = (
@@ -207,10 +218,15 @@ async def reflect_on_past(
207
218
 
208
219
  try:
209
220
  # Generate embedding
221
+ timing_info['embedding_start'] = time.time()
210
222
  query_embedding = await generate_embedding(query)
223
+ timing_info['embedding_end'] = time.time()
211
224
 
212
225
  # Get all collections
226
+ timing_info['get_collections_start'] = time.time()
213
227
  all_collections = await get_all_collections()
228
+ timing_info['get_collections_end'] = time.time()
229
+
214
230
  if not all_collections:
215
231
  return "No conversation collections found. Please import conversations first."
216
232
 
@@ -241,7 +257,22 @@ async def reflect_on_past(
241
257
  all_results = []
242
258
 
243
259
  # Search each collection
244
- for collection_name in collections_to_search:
260
+ timing_info['search_all_start'] = time.time()
261
+ collection_timings = []
262
+
263
+ # Report initial progress
264
+ await ctx.report_progress(progress=0, total=len(collections_to_search))
265
+
266
+ for idx, collection_name in enumerate(collections_to_search):
267
+ collection_timing = {'name': collection_name, 'start': time.time()}
268
+
269
+ # Report progress before searching each collection
270
+ await ctx.report_progress(
271
+ progress=idx,
272
+ total=len(collections_to_search),
273
+ message=f"Searching {collection_name}"
274
+ )
275
+
245
276
  try:
246
277
  if should_use_decay and USE_NATIVE_DECAY and NATIVE_DECAY_AVAILABLE:
247
278
  # Use native Qdrant decay with newer API
@@ -353,10 +384,11 @@ async def reflect_on_past(
353
384
  score=point.score, # Score already includes decay
354
385
  timestamp=clean_timestamp,
355
386
  role=point.payload.get('start_role', point.payload.get('role', 'unknown')),
356
- excerpt=(point.payload.get('text', '')[:500] + '...'),
387
+ excerpt=(point.payload.get('text', '')[:350] + '...' if len(point.payload.get('text', '')) > 350 else point.payload.get('text', '')),
357
388
  project_name=point_project,
358
389
  conversation_id=point.payload.get('conversation_id'),
359
- collection_name=collection_name
390
+ collection_name=collection_name,
391
+ raw_payload=point.payload if include_raw else None
360
392
  ))
361
393
 
362
394
  elif should_use_decay:
@@ -372,7 +404,7 @@ async def reflect_on_past(
372
404
  )
373
405
 
374
406
  # Apply decay scoring manually
375
- now = datetime.now()
407
+ now = datetime.now(timezone.utc)
376
408
  scale_ms = DECAY_SCALE_DAYS * 24 * 60 * 60 * 1000
377
409
 
378
410
  decay_results = []
@@ -382,6 +414,9 @@ async def reflect_on_past(
382
414
  timestamp_str = point.payload.get('timestamp')
383
415
  if timestamp_str:
384
416
  timestamp = datetime.fromisoformat(timestamp_str.replace('Z', '+00:00'))
417
+ # Ensure timestamp is timezone-aware
418
+ if timestamp.tzinfo is None:
419
+ timestamp = timestamp.replace(tzinfo=timezone.utc)
385
420
  age_ms = (now - timestamp).total_seconds() * 1000
386
421
 
387
422
  # Calculate decay factor
@@ -428,10 +463,11 @@ async def reflect_on_past(
428
463
  score=adjusted_score, # Use adjusted score
429
464
  timestamp=clean_timestamp,
430
465
  role=point.payload.get('start_role', point.payload.get('role', 'unknown')),
431
- excerpt=(point.payload.get('text', '')[:500] + '...'),
466
+ excerpt=(point.payload.get('text', '')[:350] + '...' if len(point.payload.get('text', '')) > 350 else point.payload.get('text', '')),
432
467
  project_name=point_project,
433
468
  conversation_id=point.payload.get('conversation_id'),
434
- collection_name=collection_name
469
+ collection_name=collection_name,
470
+ raw_payload=point.payload if include_raw else None
435
471
  ))
436
472
  else:
437
473
  # Standard search without decay
@@ -463,34 +499,151 @@ async def reflect_on_past(
463
499
  score=point.score,
464
500
  timestamp=clean_timestamp,
465
501
  role=point.payload.get('start_role', point.payload.get('role', 'unknown')),
466
- excerpt=(point.payload.get('text', '')[:500] + '...'),
502
+ excerpt=(point.payload.get('text', '')[:350] + '...' if len(point.payload.get('text', '')) > 350 else point.payload.get('text', '')),
467
503
  project_name=point_project,
468
504
  conversation_id=point.payload.get('conversation_id'),
469
- collection_name=collection_name
505
+ collection_name=collection_name,
506
+ raw_payload=point.payload if include_raw else None
470
507
  ))
471
508
 
472
509
  except Exception as e:
473
510
  await ctx.debug(f"Error searching {collection_name}: {str(e)}")
474
- continue
511
+ collection_timing['error'] = str(e)
512
+
513
+ collection_timing['end'] = time.time()
514
+ collection_timings.append(collection_timing)
515
+
516
+ timing_info['search_all_end'] = time.time()
517
+
518
+ # Report completion of search phase
519
+ await ctx.report_progress(
520
+ progress=len(collections_to_search),
521
+ total=len(collections_to_search),
522
+ message="Search complete, processing results"
523
+ )
475
524
 
476
525
  # Sort by score and limit
526
+ timing_info['sort_start'] = time.time()
477
527
  all_results.sort(key=lambda x: x.score, reverse=True)
478
528
  all_results = all_results[:limit]
529
+ timing_info['sort_end'] = time.time()
479
530
 
480
531
  if not all_results:
481
532
  return f"No conversations found matching '{query}'. Try different keywords or check if conversations have been imported."
482
533
 
483
- # Format results
484
- result_text = f"Found {len(all_results)} relevant conversation(s) for '{query}':\n\n"
485
- for i, result in enumerate(all_results):
486
- result_text += f"**Result {i+1}** (Score: {result.score:.3f})\n"
487
- # Handle timezone suffix 'Z' properly
488
- timestamp_clean = result.timestamp.replace('Z', '+00:00') if result.timestamp.endswith('Z') else result.timestamp
489
- result_text += f"Time: {datetime.fromisoformat(timestamp_clean).strftime('%Y-%m-%d %H:%M:%S')}\n"
490
- result_text += f"Project: {result.project_name}\n"
491
- result_text += f"Role: {result.role}\n"
492
- result_text += f"Excerpt: {result.excerpt}\n"
493
- result_text += "---\n\n"
534
+ # Format results based on response_format
535
+ timing_info['format_start'] = time.time()
536
+
537
+ if response_format == "xml":
538
+ # XML format (compact tags for performance)
539
+ result_text = "<search>\n"
540
+ result_text += f" <meta>\n"
541
+ result_text += f" <q>{query}</q>\n"
542
+ result_text += f" <scope>{target_project if target_project != 'all' else 'all'}</scope>\n"
543
+ result_text += f" <count>{len(all_results)}</count>\n"
544
+ if all_results:
545
+ result_text += f" <range>{all_results[-1].score:.3f}-{all_results[0].score:.3f}</range>\n"
546
+ result_text += f" <embed>{'local' if PREFER_LOCAL_EMBEDDINGS or not voyage_client else 'voyage'}</embed>\n"
547
+
548
+ # Add timing metadata
549
+ total_time = time.time() - start_time
550
+ result_text += f" <perf>\n"
551
+ result_text += f" <ttl>{int(total_time * 1000)}</ttl>\n"
552
+ result_text += f" <emb>{int((timing_info.get('embedding_end', 0) - timing_info.get('embedding_start', 0)) * 1000)}</emb>\n"
553
+ result_text += f" <srch>{int((timing_info.get('search_all_end', 0) - timing_info.get('search_all_start', 0)) * 1000)}</srch>\n"
554
+ result_text += f" <cols>{len(collections_to_search)}</cols>\n"
555
+ result_text += f" </perf>\n"
556
+ result_text += f" </meta>\n"
557
+
558
+ result_text += " <results>\n"
559
+ for i, result in enumerate(all_results):
560
+ result_text += f' <r rank="{i+1}">\n'
561
+ result_text += f" <s>{result.score:.3f}</s>\n"
562
+ result_text += f" <p>{result.project_name}</p>\n"
563
+
564
+ # Calculate relative time
565
+ timestamp_clean = result.timestamp.replace('Z', '+00:00') if result.timestamp.endswith('Z') else result.timestamp
566
+ timestamp_dt = datetime.fromisoformat(timestamp_clean)
567
+ # Ensure both datetimes are timezone-aware
568
+ if timestamp_dt.tzinfo is None:
569
+ timestamp_dt = timestamp_dt.replace(tzinfo=timezone.utc)
570
+ now = datetime.now(timezone.utc)
571
+ days_ago = (now - timestamp_dt).days
572
+ if days_ago == 0:
573
+ time_str = "today"
574
+ elif days_ago == 1:
575
+ time_str = "yesterday"
576
+ else:
577
+ time_str = f"{days_ago}d"
578
+ result_text += f" <t>{time_str}</t>\n"
579
+
580
+ if not brief:
581
+ # Extract title from first line of excerpt
582
+ excerpt_lines = result.excerpt.split('\n')
583
+ title = excerpt_lines[0][:80] + "..." if len(excerpt_lines[0]) > 80 else excerpt_lines[0]
584
+ result_text += f" <title>{title}</title>\n"
585
+
586
+ # Key finding - summarize the main point
587
+ key_finding = result.excerpt[:100] + "..." if len(result.excerpt) > 100 else result.excerpt
588
+ result_text += f" <key-finding>{key_finding.strip()}</key-finding>\n"
589
+
590
+ # Always include excerpt, but shorter in brief mode
591
+ if brief:
592
+ brief_excerpt = result.excerpt[:100] + "..." if len(result.excerpt) > 100 else result.excerpt
593
+ result_text += f" <excerpt>{brief_excerpt.strip()}</excerpt>\n"
594
+ else:
595
+ result_text += f" <excerpt><![CDATA[{result.excerpt}]]></excerpt>\n"
596
+
597
+ if result.conversation_id:
598
+ result_text += f" <cid>{result.conversation_id}</cid>\n"
599
+
600
+ # Include raw data if requested
601
+ if include_raw and result.raw_payload:
602
+ result_text += " <raw>\n"
603
+ result_text += f" <txt><![CDATA[{result.raw_payload.get('text', '')}]]></txt>\n"
604
+ result_text += f" <id>{result.id}</id>\n"
605
+ result_text += f" <dist>{1 - result.score:.3f}</dist>\n"
606
+ result_text += " <meta>\n"
607
+ for key, value in result.raw_payload.items():
608
+ if key != 'text':
609
+ result_text += f" <{key}>{value}</{key}>\n"
610
+ result_text += " </meta>\n"
611
+ result_text += " </raw>\n"
612
+
613
+ result_text += " </r>\n"
614
+ result_text += " </results>\n"
615
+ result_text += "</search>"
616
+
617
+ else:
618
+ # Markdown format (original)
619
+ result_text = f"Found {len(all_results)} relevant conversation(s) for '{query}':\n\n"
620
+ for i, result in enumerate(all_results):
621
+ result_text += f"**Result {i+1}** (Score: {result.score:.3f})\n"
622
+ # Handle timezone suffix 'Z' properly
623
+ timestamp_clean = result.timestamp.replace('Z', '+00:00') if result.timestamp.endswith('Z') else result.timestamp
624
+ result_text += f"Time: {datetime.fromisoformat(timestamp_clean).strftime('%Y-%m-%d %H:%M:%S')}\n"
625
+ result_text += f"Project: {result.project_name}\n"
626
+ result_text += f"Role: {result.role}\n"
627
+ result_text += f"Excerpt: {result.excerpt}\n"
628
+ result_text += "---\n\n"
629
+
630
+ timing_info['format_end'] = time.time()
631
+
632
+ # Log detailed timing breakdown
633
+ await ctx.debug(f"\n=== TIMING BREAKDOWN ===")
634
+ await ctx.debug(f"Total time: {(time.time() - start_time) * 1000:.1f}ms")
635
+ await ctx.debug(f"Embedding generation: {(timing_info.get('embedding_end', 0) - timing_info.get('embedding_start', 0)) * 1000:.1f}ms")
636
+ await ctx.debug(f"Get collections: {(timing_info.get('get_collections_end', 0) - timing_info.get('get_collections_start', 0)) * 1000:.1f}ms")
637
+ await ctx.debug(f"Search all collections: {(timing_info.get('search_all_end', 0) - timing_info.get('search_all_start', 0)) * 1000:.1f}ms")
638
+ await ctx.debug(f"Sorting results: {(timing_info.get('sort_end', 0) - timing_info.get('sort_start', 0)) * 1000:.1f}ms")
639
+ await ctx.debug(f"Formatting output: {(timing_info.get('format_end', 0) - timing_info.get('format_start', 0)) * 1000:.1f}ms")
640
+
641
+ # Log per-collection timings
642
+ await ctx.debug(f"\n=== PER-COLLECTION TIMINGS ===")
643
+ for ct in collection_timings:
644
+ duration = (ct.get('end', 0) - ct.get('start', 0)) * 1000
645
+ status = "ERROR" if 'error' in ct else "OK"
646
+ await ctx.debug(f"{ct['name']}: {duration:.1f}ms ({status})")
494
647
 
495
648
  return result_text
496
649
 
@@ -498,6 +651,7 @@ async def reflect_on_past(
498
651
  await ctx.error(f"Search failed: {str(e)}")
499
652
  return f"Failed to search conversations: {str(e)}"
500
653
 
654
+
501
655
  @mcp.tool()
502
656
  async def store_reflection(
503
657
  ctx: Context,
@@ -555,5 +709,185 @@ async def store_reflection(
555
709
  return f"Failed to store reflection: {str(e)}"
556
710
 
557
711
 
712
+ @mcp.tool()
713
+ async def quick_search(
714
+ ctx: Context,
715
+ query: str = Field(description="The search query to find semantically similar conversations"),
716
+ min_score: float = Field(default=0.7, description="Minimum similarity score (0-1)"),
717
+ project: Optional[str] = Field(default=None, description="Search specific project only. If not provided, searches current project based on working directory. Use 'all' to search across all projects.")
718
+ ) -> str:
719
+ """Quick search that returns only the count and top result for fast overview."""
720
+ try:
721
+ # Leverage reflect_on_past with optimized parameters
722
+ result = await reflect_on_past(
723
+ ctx=ctx,
724
+ query=query,
725
+ limit=1, # Only get the top result
726
+ min_score=min_score,
727
+ project=project,
728
+ response_format="xml",
729
+ brief=True, # Use brief mode for minimal response
730
+ include_raw=False
731
+ )
732
+
733
+ # Parse and reformat for quick overview
734
+ import re
735
+
736
+ # Extract count from metadata
737
+ count_match = re.search(r'<tc>(\d+)</tc>', result)
738
+ total_count = count_match.group(1) if count_match else "0"
739
+
740
+ # Extract top result
741
+ score_match = re.search(r'<s>([\d.]+)</s>', result)
742
+ project_match = re.search(r'<p>([^<]+)</p>', result)
743
+ title_match = re.search(r'<t>([^<]+)</t>', result)
744
+
745
+ if score_match and project_match and title_match:
746
+ return f"""<quick_search>
747
+ <total_matches>{total_count}</total_matches>
748
+ <top_result>
749
+ <score>{score_match.group(1)}</score>
750
+ <project>{project_match.group(1)}</project>
751
+ <title>{title_match.group(1)}</title>
752
+ </top_result>
753
+ </quick_search>"""
754
+ else:
755
+ return f"""<quick_search>
756
+ <total_matches>{total_count}</total_matches>
757
+ <message>No relevant matches found</message>
758
+ </quick_search>"""
759
+ except Exception as e:
760
+ await ctx.error(f"Quick search failed: {str(e)}")
761
+ return f"<quick_search><error>{str(e)}</error></quick_search>"
762
+
763
+
764
+ @mcp.tool()
765
+ async def search_summary(
766
+ ctx: Context,
767
+ query: str = Field(description="The search query to find semantically similar conversations"),
768
+ project: Optional[str] = Field(default=None, description="Search specific project only. If not provided, searches current project based on working directory. Use 'all' to search across all projects.")
769
+ ) -> str:
770
+ """Get aggregated insights from search results without individual result details."""
771
+ # Get more results for better summary
772
+ result = await reflect_on_past(
773
+ ctx=ctx,
774
+ query=query,
775
+ limit=10, # Get more results for analysis
776
+ min_score=0.6, # Lower threshold for broader context
777
+ project=project,
778
+ response_format="xml",
779
+ brief=False, # Get full excerpts for analysis
780
+ include_raw=False
781
+ )
782
+
783
+ # Parse results for summary generation
784
+ import re
785
+ from collections import Counter
786
+
787
+ # Extract all projects
788
+ projects = re.findall(r'<p>([^<]+)</p>', result)
789
+ project_counts = Counter(projects)
790
+
791
+ # Extract scores for statistics
792
+ scores = [float(s) for s in re.findall(r'<s>([\d.]+)</s>', result)]
793
+ avg_score = sum(scores) / len(scores) if scores else 0
794
+
795
+ # Extract themes from titles and excerpts
796
+ titles = re.findall(r'<t>([^<]+)</t>', result)
797
+ excerpts = re.findall(r'<e>([^<]+)</e>', result)
798
+
799
+ # Extract metadata
800
+ count_match = re.search(r'<tc>(\d+)</tc>', result)
801
+ total_count = count_match.group(1) if count_match else "0"
802
+
803
+ # Generate summary
804
+ summary = f"""<search_summary>
805
+ <total_matches>{total_count}</total_matches>
806
+ <searched_projects>{len(project_counts)}</searched_projects>
807
+ <average_relevance>{avg_score:.2f}</average_relevance>
808
+ <project_distribution>"""
809
+
810
+ for proj, count in project_counts.most_common(3):
811
+ summary += f"\n <project name='{proj}' matches='{count}'/>"
812
+
813
+ summary += f"""
814
+ </project_distribution>
815
+ <common_themes>"""
816
+
817
+ # Simple theme extraction from titles
818
+ theme_words = []
819
+ for title in titles[:5]: # Top 5 results
820
+ words = [w.lower() for w in title.split() if len(w) > 4]
821
+ theme_words.extend(words)
822
+
823
+ theme_counts = Counter(theme_words)
824
+ for theme, count in theme_counts.most_common(5):
825
+ if count > 1: # Only show repeated themes
826
+ summary += f"\n <theme>{theme}</theme>"
827
+
828
+ summary += """
829
+ </common_themes>
830
+ </search_summary>"""
831
+
832
+ return summary
833
+
834
+
835
+ @mcp.tool()
836
+ async def get_more_results(
837
+ ctx: Context,
838
+ query: str = Field(description="The original search query"),
839
+ offset: int = Field(default=3, description="Number of results to skip (for pagination)"),
840
+ limit: int = Field(default=3, description="Number of additional results to return"),
841
+ min_score: float = Field(default=0.7, description="Minimum similarity score (0-1)"),
842
+ project: Optional[str] = Field(default=None, description="Search specific project only")
843
+ ) -> str:
844
+ """Get additional search results after an initial search (pagination support)."""
845
+ # Note: Since Qdrant doesn't support true offset in our current implementation,
846
+ # we'll fetch offset+limit results and slice
847
+ total_limit = offset + limit
848
+
849
+ # Get the larger result set
850
+ result = await reflect_on_past(
851
+ ctx=ctx,
852
+ query=query,
853
+ limit=total_limit,
854
+ min_score=min_score,
855
+ project=project,
856
+ response_format="xml",
857
+ brief=False,
858
+ include_raw=False
859
+ )
860
+
861
+ # Parse and extract only the additional results
862
+ import re
863
+
864
+ # Find all result blocks
865
+ result_pattern = r'<r>.*?</r>'
866
+ all_results = re.findall(result_pattern, result, re.DOTALL)
867
+
868
+ # Get only the results after offset
869
+ additional_results = all_results[offset:offset+limit] if len(all_results) > offset else []
870
+
871
+ if not additional_results:
872
+ return """<more_results>
873
+ <message>No additional results found</message>
874
+ </more_results>"""
875
+
876
+ # Reconstruct response with only additional results
877
+ response = f"""<more_results>
878
+ <offset>{offset}</offset>
879
+ <count>{len(additional_results)}</count>
880
+ <results>
881
+ {''.join(additional_results)}
882
+ </results>
883
+ </more_results>"""
884
+
885
+ return response
886
+
887
+
558
888
  # Debug output
559
889
  print(f"[DEBUG] FastMCP server created with name: {mcp.name}")
890
+
891
+ # Run the server
892
+ if __name__ == "__main__":
893
+ mcp.run()
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "claude-self-reflect",
3
- "version": "2.4.4",
3
+ "version": "2.4.6",
4
4
  "description": "Give Claude perfect memory of all your conversations - Installation wizard for Python MCP server",
5
5
  "keywords": [
6
6
  "claude",