claude-self-reflect 2.4.4 → 2.4.5
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude/agents/reflection-specialist.md +174 -7
- package/README.md +35 -0
- package/mcp-server/src/server.py +356 -22
- package/package.json +1 -1
|
@@ -69,6 +69,34 @@ Search for relevant past conversations using semantic similarity.
|
|
|
69
69
|
project: "all", // Search across all projects
|
|
70
70
|
limit: 10
|
|
71
71
|
}
|
|
72
|
+
|
|
73
|
+
// Debug mode with raw Qdrant data (NEW in v2.4.5)
|
|
74
|
+
{
|
|
75
|
+
query: "search quality issues",
|
|
76
|
+
project: "all",
|
|
77
|
+
limit: 5,
|
|
78
|
+
include_raw: true // Include full payload for debugging
|
|
79
|
+
}
|
|
80
|
+
|
|
81
|
+
// Choose response format (NEW in v2.4.5)
|
|
82
|
+
{
|
|
83
|
+
query: "playwright issues",
|
|
84
|
+
limit: 5,
|
|
85
|
+
response_format: "xml" // Use XML format (default)
|
|
86
|
+
}
|
|
87
|
+
|
|
88
|
+
{
|
|
89
|
+
query: "playwright issues",
|
|
90
|
+
limit: 5,
|
|
91
|
+
response_format: "markdown" // Use original markdown format
|
|
92
|
+
}
|
|
93
|
+
|
|
94
|
+
// Brief mode for minimal responses (NEW in v2.4.5)
|
|
95
|
+
{
|
|
96
|
+
query: "error handling patterns",
|
|
97
|
+
limit: 3,
|
|
98
|
+
brief: true // Returns minimal excerpts (100 chars) for faster response
|
|
99
|
+
}
|
|
72
100
|
```
|
|
73
101
|
|
|
74
102
|
#### Default Behavior: Project-Scoped Search (NEW in v2.4.3)
|
|
@@ -89,6 +117,87 @@ Save important insights and decisions for future retrieval.
|
|
|
89
117
|
}
|
|
90
118
|
```
|
|
91
119
|
|
|
120
|
+
### Specialized Search Tools (NEW in v2.4.5)
|
|
121
|
+
|
|
122
|
+
**Note**: These specialized tools are available through this reflection-specialist agent. Due to FastMCP limitations, they cannot be called directly via MCP (e.g., `mcp__claude-self-reflect__quick_search`), but work perfectly when used through this agent.
|
|
123
|
+
|
|
124
|
+
#### quick_search
|
|
125
|
+
Fast search that returns only the count and top result. Perfect for quick checks and overview.
|
|
126
|
+
|
|
127
|
+
```javascript
|
|
128
|
+
// Quick overview of matches
|
|
129
|
+
{
|
|
130
|
+
query: "authentication patterns",
|
|
131
|
+
min_score: 0.5, // Optional, defaults to 0.7
|
|
132
|
+
project: "all" // Optional, defaults to current project
|
|
133
|
+
}
|
|
134
|
+
```
|
|
135
|
+
|
|
136
|
+
Returns:
|
|
137
|
+
- Total match count across all results
|
|
138
|
+
- Details of only the top result
|
|
139
|
+
- Minimal response size for fast performance
|
|
140
|
+
|
|
141
|
+
#### search_summary
|
|
142
|
+
Get aggregated insights without individual result details. Ideal for pattern analysis.
|
|
143
|
+
|
|
144
|
+
```javascript
|
|
145
|
+
// Analyze patterns across conversations
|
|
146
|
+
{
|
|
147
|
+
query: "error handling",
|
|
148
|
+
project: "all", // Optional
|
|
149
|
+
limit: 10 // Optional, how many results to analyze
|
|
150
|
+
}
|
|
151
|
+
```
|
|
152
|
+
|
|
153
|
+
Returns:
|
|
154
|
+
- Total matches and average relevance score
|
|
155
|
+
- Project distribution (which projects contain matches)
|
|
156
|
+
- Common themes extracted from results
|
|
157
|
+
- No individual result details (faster response)
|
|
158
|
+
|
|
159
|
+
#### get_more_results
|
|
160
|
+
Pagination support for getting additional results after an initial search.
|
|
161
|
+
|
|
162
|
+
```javascript
|
|
163
|
+
// Get next batch of results
|
|
164
|
+
{
|
|
165
|
+
query: "original search query", // Must match original query
|
|
166
|
+
offset: 3, // Skip first 3 results
|
|
167
|
+
limit: 3, // Get next 3 results
|
|
168
|
+
min_score: 0.7, // Optional
|
|
169
|
+
project: "all" // Optional
|
|
170
|
+
}
|
|
171
|
+
```
|
|
172
|
+
|
|
173
|
+
Note: Since Qdrant doesn't support true offset, this fetches offset+limit results and returns only the requested slice. Best used for exploring beyond initial results.
|
|
174
|
+
|
|
175
|
+
## Debug Mode (NEW in v2.4.5)
|
|
176
|
+
|
|
177
|
+
### Using include_raw for Troubleshooting
|
|
178
|
+
When search quality issues arise or you need to understand why certain results are returned, enable debug mode:
|
|
179
|
+
|
|
180
|
+
```javascript
|
|
181
|
+
{
|
|
182
|
+
query: "your search query",
|
|
183
|
+
include_raw: true // Adds full Qdrant payload to results
|
|
184
|
+
}
|
|
185
|
+
```
|
|
186
|
+
|
|
187
|
+
**Warning**: Debug mode significantly increases response size (3-5x larger). Use only when necessary.
|
|
188
|
+
|
|
189
|
+
### What's Included in Raw Data
|
|
190
|
+
- **full-text**: Complete conversation text (not just 500 char excerpt)
|
|
191
|
+
- **point-id**: Qdrant's unique identifier for the chunk
|
|
192
|
+
- **vector-distance**: Raw similarity score (1 - cosine_similarity)
|
|
193
|
+
- **metadata**: All stored fields including timestamps, roles, project paths
|
|
194
|
+
|
|
195
|
+
### When to Use Debug Mode
|
|
196
|
+
1. **Search Quality Issues**: Understanding why irrelevant results rank high
|
|
197
|
+
2. **Project Filtering Problems**: Debugging project scoping issues
|
|
198
|
+
3. **Embedding Analysis**: Comparing similarity scores across models
|
|
199
|
+
4. **Data Validation**: Verifying what's actually stored in Qdrant
|
|
200
|
+
|
|
92
201
|
## Search Strategy Guidelines
|
|
93
202
|
|
|
94
203
|
### Understanding Score Ranges
|
|
@@ -111,10 +220,23 @@ Save important insights and decisions for future retrieval.
|
|
|
111
220
|
4. **Use Context**: Include technology names, error messages, or specific terms
|
|
112
221
|
5. **Cross-Project When Needed**: Similar problems may have been solved elsewhere
|
|
113
222
|
|
|
114
|
-
## Response Format
|
|
223
|
+
## Response Format (NEW in v2.4.5)
|
|
115
224
|
|
|
116
|
-
###
|
|
117
|
-
|
|
225
|
+
### Choosing Response Format
|
|
226
|
+
The MCP server now supports two response formats:
|
|
227
|
+
- **XML** (default): Structured format for better parsing and metadata handling
|
|
228
|
+
- **Markdown**: Original format for compatibility and real-time playback
|
|
229
|
+
|
|
230
|
+
Use the `response_format` parameter to select:
|
|
231
|
+
```javascript
|
|
232
|
+
{
|
|
233
|
+
query: "your search",
|
|
234
|
+
response_format: "xml" // or "markdown"
|
|
235
|
+
}
|
|
236
|
+
```
|
|
237
|
+
|
|
238
|
+
### XML-Structured Output (Default)
|
|
239
|
+
The XML format provides better structure for parsing and includes performance metadata:
|
|
118
240
|
|
|
119
241
|
```xml
|
|
120
242
|
<reflection-search>
|
|
@@ -135,6 +257,16 @@ To facilitate better parsing and metadata handling, structure your responses usi
|
|
|
135
257
|
<key-finding>One-line summary of the main insight</key-finding>
|
|
136
258
|
<excerpt>Most relevant quote or context from the conversation</excerpt>
|
|
137
259
|
<conversation-id>optional-id</conversation-id>
|
|
260
|
+
<!-- Optional: Only when include_raw=true -->
|
|
261
|
+
<raw-data>
|
|
262
|
+
<full-text>Complete conversation text...</full-text>
|
|
263
|
+
<point-id>qdrant-uuid</point-id>
|
|
264
|
+
<vector-distance>0.275</vector-distance>
|
|
265
|
+
<metadata>
|
|
266
|
+
<field1>value1</field1>
|
|
267
|
+
<field2>value2</field2>
|
|
268
|
+
</metadata>
|
|
269
|
+
</raw-data>
|
|
138
270
|
</result>
|
|
139
271
|
|
|
140
272
|
<result rank="2">
|
|
@@ -156,13 +288,48 @@ To facilitate better parsing and metadata handling, structure your responses usi
|
|
|
156
288
|
</reflection-search>
|
|
157
289
|
```
|
|
158
290
|
|
|
291
|
+
### Markdown Format (For Compatibility)
|
|
292
|
+
The original markdown format is simpler and enables real-time playback in Claude:
|
|
293
|
+
|
|
294
|
+
```
|
|
295
|
+
Found 3 relevant conversation(s) for 'your query':
|
|
296
|
+
|
|
297
|
+
**Result 1** (Score: 0.725)
|
|
298
|
+
Time: 2024-01-15 10:30:00
|
|
299
|
+
Project: ProjectName
|
|
300
|
+
Role: assistant
|
|
301
|
+
Excerpt: The relevant excerpt from the conversation...
|
|
302
|
+
---
|
|
303
|
+
|
|
304
|
+
**Result 2** (Score: 0.612)
|
|
305
|
+
Time: 2024-01-14 15:45:00
|
|
306
|
+
Project: ProjectName
|
|
307
|
+
Role: user
|
|
308
|
+
Excerpt: Another relevant excerpt...
|
|
309
|
+
---
|
|
310
|
+
```
|
|
311
|
+
|
|
312
|
+
### When to Use Each Format
|
|
313
|
+
|
|
314
|
+
**Use XML format when:**
|
|
315
|
+
- Main agent needs to parse and process results
|
|
316
|
+
- Performance metrics are important
|
|
317
|
+
- Debugging search quality issues
|
|
318
|
+
- Need structured metadata access
|
|
319
|
+
|
|
320
|
+
**Use Markdown format when:**
|
|
321
|
+
- Testing real-time playback in Claude UI
|
|
322
|
+
- Simple manual searches
|
|
323
|
+
- Compatibility with older workflows
|
|
324
|
+
- Prefer simpler output
|
|
325
|
+
|
|
159
326
|
### Response Best Practices
|
|
160
327
|
|
|
161
|
-
1. **
|
|
162
|
-
2. **Indicate Search Scope** in the summary section
|
|
328
|
+
1. **Use XML format by default** unless markdown is specifically requested
|
|
329
|
+
2. **Indicate Search Scope** in the summary section (XML) or header (markdown)
|
|
163
330
|
3. **Order results by relevance** (highest score first)
|
|
164
|
-
4. **Include actionable insights** in the analysis section
|
|
165
|
-
5. **Provide metadata** for transparency
|
|
331
|
+
4. **Include actionable insights** in the analysis section (XML format)
|
|
332
|
+
5. **Provide metadata** for transparency and debugging
|
|
166
333
|
|
|
167
334
|
### Proactive Cross-Project Search Suggestions
|
|
168
335
|
|
package/README.md
CHANGED
|
@@ -123,6 +123,41 @@ Once installed, just talk naturally:
|
|
|
123
123
|
|
|
124
124
|
The reflection specialist automatically activates. No special commands needed.
|
|
125
125
|
|
|
126
|
+
## Performance & Usage Guide (v2.4.5)
|
|
127
|
+
|
|
128
|
+
### 🚀 10-40x Faster Performance
|
|
129
|
+
Search response times improved from 28.9s-2min down to **200-350ms** through optimizations:
|
|
130
|
+
- Compressed XML response format (40% smaller)
|
|
131
|
+
- Optimized excerpts (350 chars for context, 100 chars in brief mode)
|
|
132
|
+
- Smart defaults (5 results to avoid missing relevant conversations)
|
|
133
|
+
|
|
134
|
+
### 🎯 Recommended Usage: Through Reflection-Specialist Agent
|
|
135
|
+
|
|
136
|
+
**Why use the agent instead of direct MCP tools?**
|
|
137
|
+
- Rich formatted responses with analysis and insights
|
|
138
|
+
- Proper handling of specialized search tools
|
|
139
|
+
- Better user experience with streaming feedback
|
|
140
|
+
- Automatic cross-project search suggestions
|
|
141
|
+
|
|
142
|
+
**Example:**
|
|
143
|
+
```
|
|
144
|
+
You: "What Docker issues did we solve?"
|
|
145
|
+
[Claude automatically spawns reflection-specialist agent]
|
|
146
|
+
⏺ reflection-specialist(Search Docker issues)
|
|
147
|
+
⎿ Searching 57 collections...
|
|
148
|
+
⎿ Found 5 relevant conversations
|
|
149
|
+
⎿ Done (1 tool use · 12k tokens · 2.3s)
|
|
150
|
+
```
|
|
151
|
+
|
|
152
|
+
### ⚡ Performance Baselines
|
|
153
|
+
|
|
154
|
+
| Method | Search Time | Total Time | Best For |
|
|
155
|
+
|--------|------------|------------|----------|
|
|
156
|
+
| Direct MCP | 200-350ms | 200-350ms | Programmatic use, integrations |
|
|
157
|
+
| Via Agent | 200-350ms | 2-3s | Interactive use, rich analysis |
|
|
158
|
+
|
|
159
|
+
**Note**: The specialized tools (`quick_search`, `search_summary`, `get_more_results`) only work through the reflection-specialist agent due to MCP protocol limitations.
|
|
160
|
+
|
|
126
161
|
## Project-Scoped Search (New in v2.4.3)
|
|
127
162
|
|
|
128
163
|
**⚠️ Breaking Change**: Searches now default to current project only. Previously searched all projects.
|
package/mcp-server/src/server.py
CHANGED
|
@@ -4,10 +4,11 @@ import os
|
|
|
4
4
|
import asyncio
|
|
5
5
|
from pathlib import Path
|
|
6
6
|
from typing import Any, Optional, List, Dict, Union
|
|
7
|
-
from datetime import datetime
|
|
7
|
+
from datetime import datetime, timezone
|
|
8
8
|
import json
|
|
9
9
|
import numpy as np
|
|
10
10
|
import hashlib
|
|
11
|
+
import time
|
|
11
12
|
|
|
12
13
|
from fastmcp import FastMCP, Context
|
|
13
14
|
from pydantic import BaseModel, Field
|
|
@@ -87,6 +88,7 @@ class SearchResult(BaseModel):
|
|
|
87
88
|
project_name: str
|
|
88
89
|
conversation_id: Optional[str] = None
|
|
89
90
|
collection_name: str
|
|
91
|
+
raw_payload: Optional[Dict[str, Any]] = None # Full Qdrant payload when debug mode enabled
|
|
90
92
|
|
|
91
93
|
|
|
92
94
|
# Initialize FastMCP instance
|
|
@@ -151,16 +153,25 @@ async def reflect_on_past(
|
|
|
151
153
|
limit: int = Field(default=5, description="Maximum number of results to return"),
|
|
152
154
|
min_score: float = Field(default=0.7, description="Minimum similarity score (0-1)"),
|
|
153
155
|
use_decay: Union[int, str] = Field(default=-1, description="Apply time-based decay: 1=enable, 0=disable, -1=use environment default (accepts int or str)"),
|
|
154
|
-
project: Optional[str] = Field(default=None, description="Search specific project only. If not provided, searches current project based on working directory. Use 'all' to search across all projects.")
|
|
156
|
+
project: Optional[str] = Field(default=None, description="Search specific project only. If not provided, searches current project based on working directory. Use 'all' to search across all projects."),
|
|
157
|
+
include_raw: bool = Field(default=False, description="Include raw Qdrant payload data for debugging (increases response size)"),
|
|
158
|
+
response_format: str = Field(default="xml", description="Response format: 'xml' or 'markdown'"),
|
|
159
|
+
brief: bool = Field(default=False, description="Brief mode: returns minimal information for faster response")
|
|
155
160
|
) -> str:
|
|
156
161
|
"""Search for relevant past conversations using semantic search with optional time decay."""
|
|
157
162
|
|
|
163
|
+
# Start timing
|
|
164
|
+
start_time = time.time()
|
|
165
|
+
timing_info = {}
|
|
166
|
+
|
|
158
167
|
# Normalize use_decay to integer
|
|
168
|
+
timing_info['param_parsing_start'] = time.time()
|
|
159
169
|
if isinstance(use_decay, str):
|
|
160
170
|
try:
|
|
161
171
|
use_decay = int(use_decay)
|
|
162
172
|
except ValueError:
|
|
163
173
|
raise ValueError("use_decay must be '1', '0', or '-1'")
|
|
174
|
+
timing_info['param_parsing_end'] = time.time()
|
|
164
175
|
|
|
165
176
|
# Parse decay parameter using integer approach
|
|
166
177
|
should_use_decay = (
|
|
@@ -207,10 +218,15 @@ async def reflect_on_past(
|
|
|
207
218
|
|
|
208
219
|
try:
|
|
209
220
|
# Generate embedding
|
|
221
|
+
timing_info['embedding_start'] = time.time()
|
|
210
222
|
query_embedding = await generate_embedding(query)
|
|
223
|
+
timing_info['embedding_end'] = time.time()
|
|
211
224
|
|
|
212
225
|
# Get all collections
|
|
226
|
+
timing_info['get_collections_start'] = time.time()
|
|
213
227
|
all_collections = await get_all_collections()
|
|
228
|
+
timing_info['get_collections_end'] = time.time()
|
|
229
|
+
|
|
214
230
|
if not all_collections:
|
|
215
231
|
return "No conversation collections found. Please import conversations first."
|
|
216
232
|
|
|
@@ -241,7 +257,22 @@ async def reflect_on_past(
|
|
|
241
257
|
all_results = []
|
|
242
258
|
|
|
243
259
|
# Search each collection
|
|
244
|
-
|
|
260
|
+
timing_info['search_all_start'] = time.time()
|
|
261
|
+
collection_timings = []
|
|
262
|
+
|
|
263
|
+
# Report initial progress
|
|
264
|
+
await ctx.report_progress(progress=0, total=len(collections_to_search))
|
|
265
|
+
|
|
266
|
+
for idx, collection_name in enumerate(collections_to_search):
|
|
267
|
+
collection_timing = {'name': collection_name, 'start': time.time()}
|
|
268
|
+
|
|
269
|
+
# Report progress before searching each collection
|
|
270
|
+
await ctx.report_progress(
|
|
271
|
+
progress=idx,
|
|
272
|
+
total=len(collections_to_search),
|
|
273
|
+
message=f"Searching {collection_name}"
|
|
274
|
+
)
|
|
275
|
+
|
|
245
276
|
try:
|
|
246
277
|
if should_use_decay and USE_NATIVE_DECAY and NATIVE_DECAY_AVAILABLE:
|
|
247
278
|
# Use native Qdrant decay with newer API
|
|
@@ -353,10 +384,11 @@ async def reflect_on_past(
|
|
|
353
384
|
score=point.score, # Score already includes decay
|
|
354
385
|
timestamp=clean_timestamp,
|
|
355
386
|
role=point.payload.get('start_role', point.payload.get('role', 'unknown')),
|
|
356
|
-
excerpt=(point.payload.get('text', '')[:
|
|
387
|
+
excerpt=(point.payload.get('text', '')[:350] + '...' if len(point.payload.get('text', '')) > 350 else point.payload.get('text', '')),
|
|
357
388
|
project_name=point_project,
|
|
358
389
|
conversation_id=point.payload.get('conversation_id'),
|
|
359
|
-
collection_name=collection_name
|
|
390
|
+
collection_name=collection_name,
|
|
391
|
+
raw_payload=point.payload if include_raw else None
|
|
360
392
|
))
|
|
361
393
|
|
|
362
394
|
elif should_use_decay:
|
|
@@ -372,7 +404,7 @@ async def reflect_on_past(
|
|
|
372
404
|
)
|
|
373
405
|
|
|
374
406
|
# Apply decay scoring manually
|
|
375
|
-
now = datetime.now()
|
|
407
|
+
now = datetime.now(timezone.utc)
|
|
376
408
|
scale_ms = DECAY_SCALE_DAYS * 24 * 60 * 60 * 1000
|
|
377
409
|
|
|
378
410
|
decay_results = []
|
|
@@ -382,6 +414,9 @@ async def reflect_on_past(
|
|
|
382
414
|
timestamp_str = point.payload.get('timestamp')
|
|
383
415
|
if timestamp_str:
|
|
384
416
|
timestamp = datetime.fromisoformat(timestamp_str.replace('Z', '+00:00'))
|
|
417
|
+
# Ensure timestamp is timezone-aware
|
|
418
|
+
if timestamp.tzinfo is None:
|
|
419
|
+
timestamp = timestamp.replace(tzinfo=timezone.utc)
|
|
385
420
|
age_ms = (now - timestamp).total_seconds() * 1000
|
|
386
421
|
|
|
387
422
|
# Calculate decay factor
|
|
@@ -428,10 +463,11 @@ async def reflect_on_past(
|
|
|
428
463
|
score=adjusted_score, # Use adjusted score
|
|
429
464
|
timestamp=clean_timestamp,
|
|
430
465
|
role=point.payload.get('start_role', point.payload.get('role', 'unknown')),
|
|
431
|
-
excerpt=(point.payload.get('text', '')[:
|
|
466
|
+
excerpt=(point.payload.get('text', '')[:350] + '...' if len(point.payload.get('text', '')) > 350 else point.payload.get('text', '')),
|
|
432
467
|
project_name=point_project,
|
|
433
468
|
conversation_id=point.payload.get('conversation_id'),
|
|
434
|
-
collection_name=collection_name
|
|
469
|
+
collection_name=collection_name,
|
|
470
|
+
raw_payload=point.payload if include_raw else None
|
|
435
471
|
))
|
|
436
472
|
else:
|
|
437
473
|
# Standard search without decay
|
|
@@ -463,34 +499,151 @@ async def reflect_on_past(
|
|
|
463
499
|
score=point.score,
|
|
464
500
|
timestamp=clean_timestamp,
|
|
465
501
|
role=point.payload.get('start_role', point.payload.get('role', 'unknown')),
|
|
466
|
-
excerpt=(point.payload.get('text', '')[:
|
|
502
|
+
excerpt=(point.payload.get('text', '')[:350] + '...' if len(point.payload.get('text', '')) > 350 else point.payload.get('text', '')),
|
|
467
503
|
project_name=point_project,
|
|
468
504
|
conversation_id=point.payload.get('conversation_id'),
|
|
469
|
-
collection_name=collection_name
|
|
505
|
+
collection_name=collection_name,
|
|
506
|
+
raw_payload=point.payload if include_raw else None
|
|
470
507
|
))
|
|
471
508
|
|
|
472
509
|
except Exception as e:
|
|
473
510
|
await ctx.debug(f"Error searching {collection_name}: {str(e)}")
|
|
474
|
-
|
|
511
|
+
collection_timing['error'] = str(e)
|
|
512
|
+
|
|
513
|
+
collection_timing['end'] = time.time()
|
|
514
|
+
collection_timings.append(collection_timing)
|
|
515
|
+
|
|
516
|
+
timing_info['search_all_end'] = time.time()
|
|
517
|
+
|
|
518
|
+
# Report completion of search phase
|
|
519
|
+
await ctx.report_progress(
|
|
520
|
+
progress=len(collections_to_search),
|
|
521
|
+
total=len(collections_to_search),
|
|
522
|
+
message="Search complete, processing results"
|
|
523
|
+
)
|
|
475
524
|
|
|
476
525
|
# Sort by score and limit
|
|
526
|
+
timing_info['sort_start'] = time.time()
|
|
477
527
|
all_results.sort(key=lambda x: x.score, reverse=True)
|
|
478
528
|
all_results = all_results[:limit]
|
|
529
|
+
timing_info['sort_end'] = time.time()
|
|
479
530
|
|
|
480
531
|
if not all_results:
|
|
481
532
|
return f"No conversations found matching '{query}'. Try different keywords or check if conversations have been imported."
|
|
482
533
|
|
|
483
|
-
# Format results
|
|
484
|
-
|
|
485
|
-
|
|
486
|
-
|
|
487
|
-
#
|
|
488
|
-
|
|
489
|
-
result_text += f"
|
|
490
|
-
result_text += f"
|
|
491
|
-
result_text += f"
|
|
492
|
-
result_text += f"
|
|
493
|
-
|
|
534
|
+
# Format results based on response_format
|
|
535
|
+
timing_info['format_start'] = time.time()
|
|
536
|
+
|
|
537
|
+
if response_format == "xml":
|
|
538
|
+
# XML format (compact tags for performance)
|
|
539
|
+
result_text = "<search>\n"
|
|
540
|
+
result_text += f" <meta>\n"
|
|
541
|
+
result_text += f" <q>{query}</q>\n"
|
|
542
|
+
result_text += f" <scope>{target_project if target_project != 'all' else 'all'}</scope>\n"
|
|
543
|
+
result_text += f" <count>{len(all_results)}</count>\n"
|
|
544
|
+
if all_results:
|
|
545
|
+
result_text += f" <range>{all_results[-1].score:.3f}-{all_results[0].score:.3f}</range>\n"
|
|
546
|
+
result_text += f" <embed>{'local' if PREFER_LOCAL_EMBEDDINGS or not voyage_client else 'voyage'}</embed>\n"
|
|
547
|
+
|
|
548
|
+
# Add timing metadata
|
|
549
|
+
total_time = time.time() - start_time
|
|
550
|
+
result_text += f" <perf>\n"
|
|
551
|
+
result_text += f" <ttl>{int(total_time * 1000)}</ttl>\n"
|
|
552
|
+
result_text += f" <emb>{int((timing_info.get('embedding_end', 0) - timing_info.get('embedding_start', 0)) * 1000)}</emb>\n"
|
|
553
|
+
result_text += f" <srch>{int((timing_info.get('search_all_end', 0) - timing_info.get('search_all_start', 0)) * 1000)}</srch>\n"
|
|
554
|
+
result_text += f" <cols>{len(collections_to_search)}</cols>\n"
|
|
555
|
+
result_text += f" </perf>\n"
|
|
556
|
+
result_text += f" </meta>\n"
|
|
557
|
+
|
|
558
|
+
result_text += " <results>\n"
|
|
559
|
+
for i, result in enumerate(all_results):
|
|
560
|
+
result_text += f' <r rank="{i+1}">\n'
|
|
561
|
+
result_text += f" <s>{result.score:.3f}</s>\n"
|
|
562
|
+
result_text += f" <p>{result.project_name}</p>\n"
|
|
563
|
+
|
|
564
|
+
# Calculate relative time
|
|
565
|
+
timestamp_clean = result.timestamp.replace('Z', '+00:00') if result.timestamp.endswith('Z') else result.timestamp
|
|
566
|
+
timestamp_dt = datetime.fromisoformat(timestamp_clean)
|
|
567
|
+
# Ensure both datetimes are timezone-aware
|
|
568
|
+
if timestamp_dt.tzinfo is None:
|
|
569
|
+
timestamp_dt = timestamp_dt.replace(tzinfo=timezone.utc)
|
|
570
|
+
now = datetime.now(timezone.utc)
|
|
571
|
+
days_ago = (now - timestamp_dt).days
|
|
572
|
+
if days_ago == 0:
|
|
573
|
+
time_str = "today"
|
|
574
|
+
elif days_ago == 1:
|
|
575
|
+
time_str = "yesterday"
|
|
576
|
+
else:
|
|
577
|
+
time_str = f"{days_ago}d"
|
|
578
|
+
result_text += f" <t>{time_str}</t>\n"
|
|
579
|
+
|
|
580
|
+
if not brief:
|
|
581
|
+
# Extract title from first line of excerpt
|
|
582
|
+
excerpt_lines = result.excerpt.split('\n')
|
|
583
|
+
title = excerpt_lines[0][:80] + "..." if len(excerpt_lines[0]) > 80 else excerpt_lines[0]
|
|
584
|
+
result_text += f" <title>{title}</title>\n"
|
|
585
|
+
|
|
586
|
+
# Key finding - summarize the main point
|
|
587
|
+
key_finding = result.excerpt[:100] + "..." if len(result.excerpt) > 100 else result.excerpt
|
|
588
|
+
result_text += f" <key-finding>{key_finding.strip()}</key-finding>\n"
|
|
589
|
+
|
|
590
|
+
# Always include excerpt, but shorter in brief mode
|
|
591
|
+
if brief:
|
|
592
|
+
brief_excerpt = result.excerpt[:100] + "..." if len(result.excerpt) > 100 else result.excerpt
|
|
593
|
+
result_text += f" <excerpt>{brief_excerpt.strip()}</excerpt>\n"
|
|
594
|
+
else:
|
|
595
|
+
result_text += f" <excerpt><![CDATA[{result.excerpt}]]></excerpt>\n"
|
|
596
|
+
|
|
597
|
+
if result.conversation_id:
|
|
598
|
+
result_text += f" <cid>{result.conversation_id}</cid>\n"
|
|
599
|
+
|
|
600
|
+
# Include raw data if requested
|
|
601
|
+
if include_raw and result.raw_payload:
|
|
602
|
+
result_text += " <raw>\n"
|
|
603
|
+
result_text += f" <txt><![CDATA[{result.raw_payload.get('text', '')}]]></txt>\n"
|
|
604
|
+
result_text += f" <id>{result.id}</id>\n"
|
|
605
|
+
result_text += f" <dist>{1 - result.score:.3f}</dist>\n"
|
|
606
|
+
result_text += " <meta>\n"
|
|
607
|
+
for key, value in result.raw_payload.items():
|
|
608
|
+
if key != 'text':
|
|
609
|
+
result_text += f" <{key}>{value}</{key}>\n"
|
|
610
|
+
result_text += " </meta>\n"
|
|
611
|
+
result_text += " </raw>\n"
|
|
612
|
+
|
|
613
|
+
result_text += " </r>\n"
|
|
614
|
+
result_text += " </results>\n"
|
|
615
|
+
result_text += "</search>"
|
|
616
|
+
|
|
617
|
+
else:
|
|
618
|
+
# Markdown format (original)
|
|
619
|
+
result_text = f"Found {len(all_results)} relevant conversation(s) for '{query}':\n\n"
|
|
620
|
+
for i, result in enumerate(all_results):
|
|
621
|
+
result_text += f"**Result {i+1}** (Score: {result.score:.3f})\n"
|
|
622
|
+
# Handle timezone suffix 'Z' properly
|
|
623
|
+
timestamp_clean = result.timestamp.replace('Z', '+00:00') if result.timestamp.endswith('Z') else result.timestamp
|
|
624
|
+
result_text += f"Time: {datetime.fromisoformat(timestamp_clean).strftime('%Y-%m-%d %H:%M:%S')}\n"
|
|
625
|
+
result_text += f"Project: {result.project_name}\n"
|
|
626
|
+
result_text += f"Role: {result.role}\n"
|
|
627
|
+
result_text += f"Excerpt: {result.excerpt}\n"
|
|
628
|
+
result_text += "---\n\n"
|
|
629
|
+
|
|
630
|
+
timing_info['format_end'] = time.time()
|
|
631
|
+
|
|
632
|
+
# Log detailed timing breakdown
|
|
633
|
+
await ctx.debug(f"\n=== TIMING BREAKDOWN ===")
|
|
634
|
+
await ctx.debug(f"Total time: {(time.time() - start_time) * 1000:.1f}ms")
|
|
635
|
+
await ctx.debug(f"Embedding generation: {(timing_info.get('embedding_end', 0) - timing_info.get('embedding_start', 0)) * 1000:.1f}ms")
|
|
636
|
+
await ctx.debug(f"Get collections: {(timing_info.get('get_collections_end', 0) - timing_info.get('get_collections_start', 0)) * 1000:.1f}ms")
|
|
637
|
+
await ctx.debug(f"Search all collections: {(timing_info.get('search_all_end', 0) - timing_info.get('search_all_start', 0)) * 1000:.1f}ms")
|
|
638
|
+
await ctx.debug(f"Sorting results: {(timing_info.get('sort_end', 0) - timing_info.get('sort_start', 0)) * 1000:.1f}ms")
|
|
639
|
+
await ctx.debug(f"Formatting output: {(timing_info.get('format_end', 0) - timing_info.get('format_start', 0)) * 1000:.1f}ms")
|
|
640
|
+
|
|
641
|
+
# Log per-collection timings
|
|
642
|
+
await ctx.debug(f"\n=== PER-COLLECTION TIMINGS ===")
|
|
643
|
+
for ct in collection_timings:
|
|
644
|
+
duration = (ct.get('end', 0) - ct.get('start', 0)) * 1000
|
|
645
|
+
status = "ERROR" if 'error' in ct else "OK"
|
|
646
|
+
await ctx.debug(f"{ct['name']}: {duration:.1f}ms ({status})")
|
|
494
647
|
|
|
495
648
|
return result_text
|
|
496
649
|
|
|
@@ -498,6 +651,7 @@ async def reflect_on_past(
|
|
|
498
651
|
await ctx.error(f"Search failed: {str(e)}")
|
|
499
652
|
return f"Failed to search conversations: {str(e)}"
|
|
500
653
|
|
|
654
|
+
|
|
501
655
|
@mcp.tool()
|
|
502
656
|
async def store_reflection(
|
|
503
657
|
ctx: Context,
|
|
@@ -555,5 +709,185 @@ async def store_reflection(
|
|
|
555
709
|
return f"Failed to store reflection: {str(e)}"
|
|
556
710
|
|
|
557
711
|
|
|
712
|
+
@mcp.tool()
|
|
713
|
+
async def quick_search(
|
|
714
|
+
ctx: Context,
|
|
715
|
+
query: str = Field(description="The search query to find semantically similar conversations"),
|
|
716
|
+
min_score: float = Field(default=0.7, description="Minimum similarity score (0-1)"),
|
|
717
|
+
project: Optional[str] = Field(default=None, description="Search specific project only. If not provided, searches current project based on working directory. Use 'all' to search across all projects.")
|
|
718
|
+
) -> str:
|
|
719
|
+
"""Quick search that returns only the count and top result for fast overview."""
|
|
720
|
+
try:
|
|
721
|
+
# Leverage reflect_on_past with optimized parameters
|
|
722
|
+
result = await reflect_on_past(
|
|
723
|
+
ctx=ctx,
|
|
724
|
+
query=query,
|
|
725
|
+
limit=1, # Only get the top result
|
|
726
|
+
min_score=min_score,
|
|
727
|
+
project=project,
|
|
728
|
+
response_format="xml",
|
|
729
|
+
brief=True, # Use brief mode for minimal response
|
|
730
|
+
include_raw=False
|
|
731
|
+
)
|
|
732
|
+
|
|
733
|
+
# Parse and reformat for quick overview
|
|
734
|
+
import re
|
|
735
|
+
|
|
736
|
+
# Extract count from metadata
|
|
737
|
+
count_match = re.search(r'<tc>(\d+)</tc>', result)
|
|
738
|
+
total_count = count_match.group(1) if count_match else "0"
|
|
739
|
+
|
|
740
|
+
# Extract top result
|
|
741
|
+
score_match = re.search(r'<s>([\d.]+)</s>', result)
|
|
742
|
+
project_match = re.search(r'<p>([^<]+)</p>', result)
|
|
743
|
+
title_match = re.search(r'<t>([^<]+)</t>', result)
|
|
744
|
+
|
|
745
|
+
if score_match and project_match and title_match:
|
|
746
|
+
return f"""<quick_search>
|
|
747
|
+
<total_matches>{total_count}</total_matches>
|
|
748
|
+
<top_result>
|
|
749
|
+
<score>{score_match.group(1)}</score>
|
|
750
|
+
<project>{project_match.group(1)}</project>
|
|
751
|
+
<title>{title_match.group(1)}</title>
|
|
752
|
+
</top_result>
|
|
753
|
+
</quick_search>"""
|
|
754
|
+
else:
|
|
755
|
+
return f"""<quick_search>
|
|
756
|
+
<total_matches>{total_count}</total_matches>
|
|
757
|
+
<message>No relevant matches found</message>
|
|
758
|
+
</quick_search>"""
|
|
759
|
+
except Exception as e:
|
|
760
|
+
await ctx.error(f"Quick search failed: {str(e)}")
|
|
761
|
+
return f"<quick_search><error>{str(e)}</error></quick_search>"
|
|
762
|
+
|
|
763
|
+
|
|
764
|
+
@mcp.tool()
|
|
765
|
+
async def search_summary(
|
|
766
|
+
ctx: Context,
|
|
767
|
+
query: str = Field(description="The search query to find semantically similar conversations"),
|
|
768
|
+
project: Optional[str] = Field(default=None, description="Search specific project only. If not provided, searches current project based on working directory. Use 'all' to search across all projects.")
|
|
769
|
+
) -> str:
|
|
770
|
+
"""Get aggregated insights from search results without individual result details."""
|
|
771
|
+
# Get more results for better summary
|
|
772
|
+
result = await reflect_on_past(
|
|
773
|
+
ctx=ctx,
|
|
774
|
+
query=query,
|
|
775
|
+
limit=10, # Get more results for analysis
|
|
776
|
+
min_score=0.6, # Lower threshold for broader context
|
|
777
|
+
project=project,
|
|
778
|
+
response_format="xml",
|
|
779
|
+
brief=False, # Get full excerpts for analysis
|
|
780
|
+
include_raw=False
|
|
781
|
+
)
|
|
782
|
+
|
|
783
|
+
# Parse results for summary generation
|
|
784
|
+
import re
|
|
785
|
+
from collections import Counter
|
|
786
|
+
|
|
787
|
+
# Extract all projects
|
|
788
|
+
projects = re.findall(r'<p>([^<]+)</p>', result)
|
|
789
|
+
project_counts = Counter(projects)
|
|
790
|
+
|
|
791
|
+
# Extract scores for statistics
|
|
792
|
+
scores = [float(s) for s in re.findall(r'<s>([\d.]+)</s>', result)]
|
|
793
|
+
avg_score = sum(scores) / len(scores) if scores else 0
|
|
794
|
+
|
|
795
|
+
# Extract themes from titles and excerpts
|
|
796
|
+
titles = re.findall(r'<t>([^<]+)</t>', result)
|
|
797
|
+
excerpts = re.findall(r'<e>([^<]+)</e>', result)
|
|
798
|
+
|
|
799
|
+
# Extract metadata
|
|
800
|
+
count_match = re.search(r'<tc>(\d+)</tc>', result)
|
|
801
|
+
total_count = count_match.group(1) if count_match else "0"
|
|
802
|
+
|
|
803
|
+
# Generate summary
|
|
804
|
+
summary = f"""<search_summary>
|
|
805
|
+
<total_matches>{total_count}</total_matches>
|
|
806
|
+
<searched_projects>{len(project_counts)}</searched_projects>
|
|
807
|
+
<average_relevance>{avg_score:.2f}</average_relevance>
|
|
808
|
+
<project_distribution>"""
|
|
809
|
+
|
|
810
|
+
for proj, count in project_counts.most_common(3):
|
|
811
|
+
summary += f"\n <project name='{proj}' matches='{count}'/>"
|
|
812
|
+
|
|
813
|
+
summary += f"""
|
|
814
|
+
</project_distribution>
|
|
815
|
+
<common_themes>"""
|
|
816
|
+
|
|
817
|
+
# Simple theme extraction from titles
|
|
818
|
+
theme_words = []
|
|
819
|
+
for title in titles[:5]: # Top 5 results
|
|
820
|
+
words = [w.lower() for w in title.split() if len(w) > 4]
|
|
821
|
+
theme_words.extend(words)
|
|
822
|
+
|
|
823
|
+
theme_counts = Counter(theme_words)
|
|
824
|
+
for theme, count in theme_counts.most_common(5):
|
|
825
|
+
if count > 1: # Only show repeated themes
|
|
826
|
+
summary += f"\n <theme>{theme}</theme>"
|
|
827
|
+
|
|
828
|
+
summary += """
|
|
829
|
+
</common_themes>
|
|
830
|
+
</search_summary>"""
|
|
831
|
+
|
|
832
|
+
return summary
|
|
833
|
+
|
|
834
|
+
|
|
835
|
+
@mcp.tool()
|
|
836
|
+
async def get_more_results(
|
|
837
|
+
ctx: Context,
|
|
838
|
+
query: str = Field(description="The original search query"),
|
|
839
|
+
offset: int = Field(default=3, description="Number of results to skip (for pagination)"),
|
|
840
|
+
limit: int = Field(default=3, description="Number of additional results to return"),
|
|
841
|
+
min_score: float = Field(default=0.7, description="Minimum similarity score (0-1)"),
|
|
842
|
+
project: Optional[str] = Field(default=None, description="Search specific project only")
|
|
843
|
+
) -> str:
|
|
844
|
+
"""Get additional search results after an initial search (pagination support)."""
|
|
845
|
+
# Note: Since Qdrant doesn't support true offset in our current implementation,
|
|
846
|
+
# we'll fetch offset+limit results and slice
|
|
847
|
+
total_limit = offset + limit
|
|
848
|
+
|
|
849
|
+
# Get the larger result set
|
|
850
|
+
result = await reflect_on_past(
|
|
851
|
+
ctx=ctx,
|
|
852
|
+
query=query,
|
|
853
|
+
limit=total_limit,
|
|
854
|
+
min_score=min_score,
|
|
855
|
+
project=project,
|
|
856
|
+
response_format="xml",
|
|
857
|
+
brief=False,
|
|
858
|
+
include_raw=False
|
|
859
|
+
)
|
|
860
|
+
|
|
861
|
+
# Parse and extract only the additional results
|
|
862
|
+
import re
|
|
863
|
+
|
|
864
|
+
# Find all result blocks
|
|
865
|
+
result_pattern = r'<r>.*?</r>'
|
|
866
|
+
all_results = re.findall(result_pattern, result, re.DOTALL)
|
|
867
|
+
|
|
868
|
+
# Get only the results after offset
|
|
869
|
+
additional_results = all_results[offset:offset+limit] if len(all_results) > offset else []
|
|
870
|
+
|
|
871
|
+
if not additional_results:
|
|
872
|
+
return """<more_results>
|
|
873
|
+
<message>No additional results found</message>
|
|
874
|
+
</more_results>"""
|
|
875
|
+
|
|
876
|
+
# Reconstruct response with only additional results
|
|
877
|
+
response = f"""<more_results>
|
|
878
|
+
<offset>{offset}</offset>
|
|
879
|
+
<count>{len(additional_results)}</count>
|
|
880
|
+
<results>
|
|
881
|
+
{''.join(additional_results)}
|
|
882
|
+
</results>
|
|
883
|
+
</more_results>"""
|
|
884
|
+
|
|
885
|
+
return response
|
|
886
|
+
|
|
887
|
+
|
|
558
888
|
# Debug output
|
|
559
889
|
print(f"[DEBUG] FastMCP server created with name: {mcp.name}")
|
|
890
|
+
|
|
891
|
+
# Run the server
|
|
892
|
+
if __name__ == "__main__":
|
|
893
|
+
mcp.run()
|