screenpipe-mcp 0.8.3 → 0.8.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -28,7 +28,47 @@ The easiest way to use screenpipe-mcp is with npx. Edit your Claude Desktop conf
   }
   ```
 
-### Option 2: From Source
+### Option 2: HTTP Server (Remote / Network Access)
+
+The MCP server can run over HTTP using the [Streamable HTTP transport](https://modelcontextprotocol.io/specification/2025-03-26/basic/transports#streamable-http), allowing remote MCP clients to connect over the network instead of stdio. This is ideal when your AI assistant (e.g., OpenClaw) runs on a different machine than screenpipe.
+
+```bash
+# from npm
+npx screenpipe-mcp-http --port 3031
+
+# or from source
+npm run start:http -- --port 3031
+```
+
+The server exposes:
+- **MCP endpoint**: `http://localhost:3031/mcp` — Streamable HTTP transport (POST for requests, GET for SSE stream)
+- **Health check**: `http://localhost:3031/health`
+
+**Options:**
+| Flag | Description | Default |
+|------|-------------|---------|
+| `--port` | Port for the MCP HTTP server | 3031 |
+| `--screenpipe-port` | Port where screenpipe API is running | 3030 |
+
+**Connecting a remote MCP client:**
+
+Point any MCP client that supports HTTP transport at the `/mcp` endpoint:
+
+```json
+{
+  "mcpServers": {
+    "screenpipe": {
+      "url": "http://<your-ip>:3031/mcp"
+    }
+  }
+}
+```
+
+If your machines are on different networks, expose port 3031 via Tailscale, SSH tunnel, or similar — see the [OpenClaw integration guide](https://docs.screenpi.pe/openclaw) for detailed examples.
+
+> **Note:** The HTTP server currently exposes `search_content` only. The stdio server has the full tool set (export-video, list-meetings, activity-summary, search-elements, frame-context). We're working on bringing HTTP to full parity.
+
+### Option 3: From Source
 
 Clone and build from source:
 
@@ -62,6 +102,13 @@ Test with MCP Inspector:
 npx @modelcontextprotocol/inspector npx screenpipe-mcp
 ```
 
+## Transport Modes
+
+| Mode | Command | Use Case |
+|------|---------|----------|
+| **stdio** (default) | `npx screenpipe-mcp` | Claude Desktop, local MCP clients |
+| **HTTP** | `npx screenpipe-mcp-http` | Remote clients, network access, OpenClaw on VPS |
+
 ## Available Tools
 
 ### search-content
@@ -79,6 +126,23 @@ Export screen recordings as video files:
 - Specify time range with start/end times
 - Configurable FPS for output video
 
+### activity-summary
+Get a lightweight compressed activity overview for a time range:
+- App usage with active minutes and frame counts
+- Recent accessibility texts
+- Audio speaker summary
+
+### list-meetings
+List detected meetings with duration, app, and attendees.
+
+### search-elements
+Search structured UI elements (accessibility tree nodes and OCR text blocks):
+- Filter by source, role, app, time range
+- Much lighter than search-content for targeted UI lookups
+
+### frame-context
+Get accessibility text, parsed tree nodes, and extracted URLs for a specific frame.
+
 ## Example Queries in Claude
 
 - "Search for any mentions of 'rust' in my screen recordings"
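The remote-client JSON block in the README above can also be generated from a host and port; a tiny sketch (the `buildRemoteMcpConfig` helper is hypothetical, not part of the package):

```typescript
// Build the mcpServers config block for a remote screenpipe MCP endpoint.
// Hypothetical helper for illustration; only the JSON shape comes from the README.
function buildRemoteMcpConfig(host: string, port: number = 3031): string {
  const config = {
    mcpServers: {
      screenpipe: {
        url: `http://${host}:${port}/mcp`,
      },
    },
  };
  return JSON.stringify(config, null, 2);
}
```

The returned string can be pasted into (or merged with) the client's MCP config file.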
package/dist/index.js CHANGED
@@ -69,7 +69,7 @@ const SCREENPIPE_API = `http://localhost:${port}`;
 // Initialize server
 const server = new index_js_1.Server({
     name: "screenpipe",
-    version: "0.8.3",
+    version: "0.8.5",
 }, {
     capabilities: {
         tools: {},
@@ -85,10 +85,14 @@ const BASE_TOOLS = [
         "Returns timestamped results with app context. " +
         "Call with no parameters to get recent activity. " +
         "Use the 'screenpipe://context' resource for current time when building time-based queries.\n\n" +
+        "WHEN TO USE WHICH content_type:\n" +
+        "- For meetings/calls/conversations: content_type='audio', do NOT use q param (transcriptions are noisy, q filters too aggressively)\n" +
+        "- For screen text/reading: content_type='all' or 'accessibility'\n" +
+        "- For time spent/app usage questions: use activity-summary tool instead (this tool returns content, not time stats)\n\n" +
         "SEARCH STRATEGY: First search with ONLY time params (start_time/end_time) — no q, no app_name, no content_type. " +
         "This gives ground truth of what's recorded. Scan results to find correct app_name values, then narrow with filters using exact observed values. " +
-        "App names are case-sensitive and may differ from user input (e.g. 'Discord' vs 'Discord.exe'). " +
-        "The q param searches captured text (accessibility/OCR), NOT app names. NEVER report 'no data' after one filtered search — verify with unfiltered time-only search first.\n\n" +
+        "App names are case-sensitive (e.g. 'Discord' vs 'Discord.exe'). " +
+        "The q param searches captured text, NOT app names. NEVER report 'no data' after one filtered search — verify with unfiltered time-only search first.\n\n" +
         "DEEP LINKS: When referencing specific moments, create clickable links using IDs from search results:\n" +
         "- OCR results (PREFERRED): [10:30 AM — Chrome](screenpipe://frame/12345) — use content.frame_id from the result\n" +
         "- Audio results: [meeting at 3pm](screenpipe://timeline?timestamp=2024-01-15T15:00:00Z) — use exact timestamp from result\n" +
@@ -102,12 +106,12 @@ const BASE_TOOLS = [
         properties: {
             q: {
                 type: "string",
-                description: "Search query. Optional - omit to return all recent content.",
+                description: "Search query (full-text search on captured text). Optional - omit to return all content in time range. IMPORTANT: Do NOT use q for audio/meeting searches — transcriptions are noisy and q filters too aggressively. Only use q when searching for specific text the user saw on screen.",
             },
             content_type: {
                 type: "string",
                 enum: ["all", "ocr", "audio", "input", "accessibility"],
-                description: "Content type filter: 'ocr' (screen text via OCR, legacy fallback), 'audio' (transcriptions), 'input' (clicks, keystrokes, clipboard, app switches), 'accessibility' (accessibility tree text, preferred for screen content), 'all'. Default: 'all'.",
+                description: "Content type filter: 'audio' (transcriptions; use for meetings/calls/conversations), 'accessibility' (accessibility tree text, preferred for screen content), 'ocr' (screen text via OCR, legacy fallback), 'input' (clicks, keystrokes, clipboard, app switches), 'all'. Default: 'all'. For meeting/call queries, ALWAYS use 'audio'.",
                 default: "all",
             },
             limit: {
@@ -123,12 +127,12 @@ const BASE_TOOLS = [
             start_time: {
                 type: "string",
                 format: "date-time",
-                description: "ISO 8601 UTC start time (e.g., 2024-01-15T10:00:00Z)",
+                description: "Start time: ISO 8601 UTC (e.g., 2024-01-15T10:00:00Z) or relative (e.g., '16h ago', '2d ago', 'now')",
             },
             end_time: {
                 type: "string",
                 format: "date-time",
-                description: "ISO 8601 UTC end time (e.g., 2024-01-15T18:00:00Z)",
+                description: "End time: ISO 8601 UTC (e.g., 2024-01-15T18:00:00Z) or relative (e.g., 'now', '1h ago')",
             },
             app_name: {
                 type: "string",
@@ -159,6 +163,10 @@ const BASE_TOOLS = [
                 type: "string",
                 description: "Filter audio by speaker name (case-insensitive partial match)",
             },
+            max_content_length: {
+                type: "integer",
+                description: "Truncate each result's text/transcription to this many characters using middle-truncation (keeps first half + last half). Useful for limiting token usage with small-context models.",
+            },
         },
     },
 },
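The new `max_content_length` parameter above is described as middle-truncation (keep the first half plus the last half). The package's truncation code is not in this diff; a sketch of the documented behavior (the helper name and the `[...]` joiner are assumptions):

```typescript
// Middle-truncate text to roughly maxLen kept characters: keep the first
// and last halves and mark the cut. Sketch of the documented behavior only;
// the package's real implementation may differ in details.
function middleTruncate(text: string, maxLen: number): string {
  if (text.length <= maxLen) return text;
  const half = Math.floor(maxLen / 2);
  // first half + marker + last (maxLen - half) characters
  return text.slice(0, half) + " [...] " + text.slice(text.length - (maxLen - half));
}
```

Middle-truncation keeps both the opening and the closing of a transcription, which usually carry the most context.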
@@ -166,7 +174,7 @@ const BASE_TOOLS = [
     name: "export-video",
     description: "Export a video of screen recordings for a specific time range. " +
         "Creates an MP4 video from the recorded frames between the start and end times.\n\n" +
-        "IMPORTANT: Use ISO 8601 UTC timestamps (e.g., 2024-01-15T10:00:00Z)\n\n" +
+        "IMPORTANT: Use ISO 8601 UTC timestamps (e.g., 2024-01-15T10:00:00Z) or relative times (e.g., '16h ago', 'now')\n\n" +
         "EXAMPLES:\n" +
         "- Last 30 minutes: Calculate timestamps from current time\n" +
         "- Specific meeting: Use the meeting's start and end times in UTC",
@@ -180,12 +188,12 @@ const BASE_TOOLS = [
     start_time: {
         type: "string",
         format: "date-time",
-        description: "Start time in ISO 8601 format UTC. MUST include timezone (Z for UTC). Example: '2024-01-15T10:00:00Z'",
+        description: "Start time: ISO 8601 UTC (e.g., '2024-01-15T10:00:00Z') or relative (e.g., '16h ago', 'now')",
     },
     end_time: {
         type: "string",
         format: "date-time",
-        description: "End time in ISO 8601 format UTC. MUST include timezone (Z for UTC). Example: '2024-01-15T10:30:00Z'",
+        description: "End time: ISO 8601 UTC (e.g., '2024-01-15T10:30:00Z') or relative (e.g., 'now', '1h ago')",
     },
     fps: {
         type: "number",
@@ -211,12 +219,12 @@ const BASE_TOOLS = [
     start_time: {
         type: "string",
         format: "date-time",
-        description: "ISO 8601 UTC start filter (e.g., 2024-01-15T10:00:00Z)",
+        description: "Start filter: ISO 8601 UTC (e.g., 2024-01-15T10:00:00Z) or relative (e.g., '16h ago', 'now')",
     },
     end_time: {
         type: "string",
         format: "date-time",
-        description: "ISO 8601 UTC end filter (e.g., 2024-01-15T18:00:00Z)",
+        description: "End filter: ISO 8601 UTC (e.g., 2024-01-15T18:00:00Z) or relative (e.g., 'now', '1h ago')",
     },
     limit: {
         type: "integer",
@@ -234,9 +242,16 @@ const BASE_TOOLS = [
 {
     name: "activity-summary",
     description: "Get a lightweight compressed activity overview for a time range (~200-500 tokens). " +
-        "Returns app usage (name, frame count, minutes), recent accessibility texts, and audio speaker summary. " +
-        "Use this FIRST for broad questions like 'what was I doing?' before drilling into search-content or search-elements. " +
-        "Much cheaper than search-content for getting an overview.",
+        "Returns app usage (name, frame count, active minutes, first/last seen), recent accessibility texts, and audio speaker summary. " +
+        "Minutes are based on active session time (consecutive frames with gaps < 5min count as active). " +
+        "first_seen/last_seen show the wall-clock span per app.\n\n" +
+        "USE THIS TOOL (not search-content or raw SQL) for:\n" +
+        "- 'how long did I spend on X?' → active_minutes per app\n" +
+        "- 'which apps did I use today?' → app list sorted by active_minutes\n" +
+        "- 'what was I doing?' → broad overview before drilling deeper\n" +
+        "- Any time-spent or app-usage question\n\n" +
+        "WARNING: Do NOT estimate time from raw frame counts or SQL queries — those are inaccurate. " +
+        "This endpoint calculates actual active session time correctly.",
     annotations: {
         title: "Activity Summary",
         readOnlyHint: true,
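The new activity-summary description defines minutes as active session time: consecutive frames whose gaps are under 5 minutes count as active. A sketch of one plausible reading of that rule (hypothetical helper, not the endpoint's actual implementation):

```typescript
// Sum "active" time from sorted frame timestamps (ms since epoch):
// consecutive frames with gaps < 5 min belong to one session and the gap
// counts as active; larger gaps are idle time and are skipped.
// Sketch of the documented rule only, not screenpipe's shipped code.
const GAP_MS = 5 * 60 * 1000;

function activeMinutes(timestamps: number[]): number {
  let activeMs = 0;
  for (let i = 1; i < timestamps.length; i++) {
    const gap = timestamps[i] - timestamps[i - 1];
    if (gap < GAP_MS) activeMs += gap; // still within the same session
    // gap >= 5 min: session boundary, idle time not counted
  }
  return Math.round(activeMs / 60_000);
}
```

This is why the description warns against estimating time from raw frame counts: frames captured across a long idle gap would inflate a naive count.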
@@ -247,12 +262,12 @@ const BASE_TOOLS = [
     start_time: {
         type: "string",
         format: "date-time",
-        description: "Start of time range in ISO 8601 UTC (e.g., 2024-01-15T10:00:00Z)",
+        description: "Start of time range: ISO 8601 UTC (e.g., 2024-01-15T10:00:00Z) or relative (e.g., '16h ago', 'now')",
     },
     end_time: {
         type: "string",
         format: "date-time",
-        description: "End of time range in ISO 8601 UTC (e.g., 2024-01-15T18:00:00Z)",
+        description: "End of time range: ISO 8601 UTC (e.g., 2024-01-15T18:00:00Z) or relative (e.g., 'now', '1h ago')",
     },
     app_name: {
         type: "string",
@@ -296,12 +311,12 @@ const BASE_TOOLS = [
     start_time: {
         type: "string",
         format: "date-time",
-        description: "ISO 8601 UTC start time",
+        description: "Start time: ISO 8601 UTC or relative (e.g., '16h ago', 'now')",
     },
     end_time: {
         type: "string",
         format: "date-time",
-        description: "ISO 8601 UTC end time",
+        description: "End time: ISO 8601 UTC or relative (e.g., 'now', '1h ago')",
     },
     app_name: {
         type: "string",
@@ -421,14 +436,28 @@ Screenpipe captures four types of data:
 - **Get keyboard input**: \`{"content_type": "input"}\`
 - **Get audio only**: \`{"content_type": "audio"}\`
 
+## Common User Requests → Correct Tool Choice
+| User says | Use this tool | Key params |
+|-----------|--------------|------------|
+| "summarize my meeting/call" | search-content | content_type:"audio", NO q param, start_time |
+| "what did they/I say about X" | search-content | content_type:"audio", NO q param (scan results manually) |
+| "how long on X" / "which apps" / "time spent" | activity-summary | start_time, end_time |
+| "what was I doing" | activity-summary | start_time, end_time (then drill into search-content) |
+| "what was I reading/looking at" | search-content | content_type:"all", start_time |
+
+## Behavior Rules
+- Act immediately on clear requests. NEVER ask "what time range?" or "which content type?" when the intent is obvious.
+- If search returns empty, silently retry with wider time range or fewer filters. Do NOT ask the user what to change.
+- For meetings: ALWAYS use content_type:"audio" and do NOT use the q param. Transcriptions are noisy — q filters too aggressively and misses relevant content.
+
 ## search-content
 | Parameter | Description | Default |
 |-----------|-------------|---------|
 | q | Search query | (none - returns all) |
 | content_type | all/ocr/audio/input/accessibility | all |
 | limit | Max results | 10 |
-| start_time | ISO 8601 UTC | (no filter) |
-| end_time | ISO 8601 UTC | (no filter) |
+| start_time | ISO 8601 UTC or relative (e.g. '16h ago') | (no filter) |
+| end_time | ISO 8601 UTC or relative (e.g. 'now') | (no filter) |
 | app_name | Filter by app | (no filter) |
 | include_frames | Include screenshots | false |
 
@@ -446,6 +475,19 @@ Screenpipe captures four types of data:
 4. **Fetch frame-context** for URLs and accessibility tree of specific frames
 5. **Screenshots** (include_frames=true) only when text isn't enough
 
+## Chat History
+Previous screenpipe chat conversations are stored as individual JSON files in ~/.screenpipe/chats/{conversation-id}.json
+Each file contains: id, title, messages[], createdAt, updatedAt. You can read these files to reference or search previous conversations.
+
+## Speaker Management
+screenpipe auto-identifies speakers in audio. API endpoints for managing them:
+- \`GET /speakers/unnamed?limit=10\` — list unnamed speakers
+- \`GET /speakers/search?name=John\` — search by name
+- \`POST /speakers/update\` with \`{"id": 5, "name": "John"}\` — rename a speaker
+- \`POST /speakers/merge\` with \`{"speaker_to_keep_id": 1, "speaker_to_merge_id": 2}\` — merge duplicates
+- \`GET /speakers/similar?speaker_id=5\` — find similar speakers for merging
+- \`POST /speakers/reassign\` — reassign audio chunk to different speaker
+
 ## Tips
 1. Read screenpipe://context first to get current timestamps
 2. Use activity-summary before search-content for broad overview questions
@@ -928,7 +970,12 @@ server.setRequestHandler(types_js_1.CallToolRequestSchema, async (request) => {
 }
 const data = await response.json();
 // Format apps
-const appsLines = (data.apps || []).map((a) => `  ${a.name}: ${a.minutes} min (${a.frame_count} frames)`);
+const appsLines = (data.apps || []).map((a) => {
+    const timeSpan = a.first_seen && a.last_seen
+        ? `, ${a.first_seen.slice(11, 16)}–${a.last_seen.slice(11, 16)} UTC`
+        : "";
+    return `  ${a.name}: ${a.minutes} min (${a.frame_count} frames${timeSpan})`;
+});
 // Format audio
 const speakerLines = (data.audio_summary?.speakers || []).map((s) => `  ${s.name}: ${s.segment_count} segments`);
 // Format recent texts
package/manifest.json CHANGED
@@ -2,7 +2,7 @@
   "manifest_version": "0.3",
   "name": "screenpipe",
   "display_name": "Screenpipe",
-  "version": "0.8.3",
+  "version": "0.8.4",
   "description": "Search your screen recordings and audio transcriptions with AI",
   "long_description": "Screenpipe is a 24/7 screen and audio recorder that lets you search everything you've seen or heard. This extension connects Claude to your local screenpipe instance, enabling AI-powered search through your digital memory.",
   "author": {
package/package.json CHANGED
@@ -1,10 +1,11 @@
 {
   "name": "screenpipe-mcp",
-  "version": "0.8.3",
+  "version": "0.8.6",
   "description": "MCP server for screenpipe - search your screen recordings and audio transcriptions",
   "main": "dist/index.js",
   "bin": {
-    "screenpipe-mcp": "dist/index.js"
+    "screenpipe-mcp": "dist/index.js",
+    "screenpipe-mcp-http": "dist/http-server.js"
   },
   "scripts": {
     "build": "tsc",
package/src/index.ts CHANGED
@@ -48,7 +48,7 @@ const SCREENPIPE_API = `http://localhost:${port}`;
 const server = new Server(
   {
     name: "screenpipe",
-    version: "0.8.3",
+    version: "0.8.5",
   },
   {
     capabilities: {
@@ -68,10 +68,14 @@ const BASE_TOOLS: Tool[] = [
       "Returns timestamped results with app context. " +
       "Call with no parameters to get recent activity. " +
       "Use the 'screenpipe://context' resource for current time when building time-based queries.\n\n" +
+      "WHEN TO USE WHICH content_type:\n" +
+      "- For meetings/calls/conversations: content_type='audio', do NOT use q param (transcriptions are noisy, q filters too aggressively)\n" +
+      "- For screen text/reading: content_type='all' or 'accessibility'\n" +
+      "- For time spent/app usage questions: use activity-summary tool instead (this tool returns content, not time stats)\n\n" +
       "SEARCH STRATEGY: First search with ONLY time params (start_time/end_time) — no q, no app_name, no content_type. " +
       "This gives ground truth of what's recorded. Scan results to find correct app_name values, then narrow with filters using exact observed values. " +
-      "App names are case-sensitive and may differ from user input (e.g. 'Discord' vs 'Discord.exe'). " +
-      "The q param searches captured text (accessibility/OCR), NOT app names. NEVER report 'no data' after one filtered search — verify with unfiltered time-only search first.\n\n" +
+      "App names are case-sensitive (e.g. 'Discord' vs 'Discord.exe'). " +
+      "The q param searches captured text, NOT app names. NEVER report 'no data' after one filtered search — verify with unfiltered time-only search first.\n\n" +
       "DEEP LINKS: When referencing specific moments, create clickable links using IDs from search results:\n" +
       "- OCR results (PREFERRED): [10:30 AM — Chrome](screenpipe://frame/12345) — use content.frame_id from the result\n" +
       "- Audio results: [meeting at 3pm](screenpipe://timeline?timestamp=2024-01-15T15:00:00Z) — use exact timestamp from result\n" +
@@ -85,12 +89,12 @@ const BASE_TOOLS: Tool[] = [
     properties: {
       q: {
         type: "string",
-        description: "Search query. Optional - omit to return all recent content.",
+        description: "Search query (full-text search on captured text). Optional - omit to return all content in time range. IMPORTANT: Do NOT use q for audio/meeting searches — transcriptions are noisy and q filters too aggressively. Only use q when searching for specific text the user saw on screen.",
       },
       content_type: {
         type: "string",
         enum: ["all", "ocr", "audio", "input", "accessibility"],
-        description: "Content type filter: 'ocr' (screen text via OCR, legacy fallback), 'audio' (transcriptions), 'input' (clicks, keystrokes, clipboard, app switches), 'accessibility' (accessibility tree text, preferred for screen content), 'all'. Default: 'all'.",
+        description: "Content type filter: 'audio' (transcriptions; use for meetings/calls/conversations), 'accessibility' (accessibility tree text, preferred for screen content), 'ocr' (screen text via OCR, legacy fallback), 'input' (clicks, keystrokes, clipboard, app switches), 'all'. Default: 'all'. For meeting/call queries, ALWAYS use 'audio'.",
         default: "all",
       },
       limit: {
@@ -106,12 +110,12 @@ const BASE_TOOLS: Tool[] = [
       start_time: {
         type: "string",
         format: "date-time",
-        description: "ISO 8601 UTC start time (e.g., 2024-01-15T10:00:00Z)",
+        description: "Start time: ISO 8601 UTC (e.g., 2024-01-15T10:00:00Z) or relative (e.g., '16h ago', '2d ago', 'now')",
       },
       end_time: {
         type: "string",
         format: "date-time",
-        description: "ISO 8601 UTC end time (e.g., 2024-01-15T18:00:00Z)",
+        description: "End time: ISO 8601 UTC (e.g., 2024-01-15T18:00:00Z) or relative (e.g., 'now', '1h ago')",
       },
       app_name: {
         type: "string",
@@ -142,6 +146,10 @@ const BASE_TOOLS: Tool[] = [
         type: "string",
         description: "Filter audio by speaker name (case-insensitive partial match)",
       },
+      max_content_length: {
+        type: "integer",
+        description: "Truncate each result's text/transcription to this many characters using middle-truncation (keeps first half + last half). Useful for limiting token usage with small-context models.",
+      },
     },
   },
 },
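Several descriptions in this release now accept relative times such as '16h ago', '2d ago', and 'now' alongside ISO 8601. The parser itself is not part of this diff; a minimal sketch of how such inputs might be normalized, assuming a grammar that covers only the documented examples (the `resolveTime` helper is hypothetical):

```typescript
// Normalize 'now' / '<n>m ago' / '<n>h ago' / '<n>d ago' to an ISO 8601 UTC
// string; pass anything else (assumed to be ISO 8601 already) through.
// Hypothetical helper; the package's real parser may accept more forms.
function resolveTime(input: string, now: Date = new Date()): string {
  if (input === "now") return now.toISOString();
  const m = input.match(/^(\d+)\s*([mhd])\s+ago$/);
  if (!m) return input; // assume already ISO 8601
  const n = parseInt(m[1], 10);
  const unitMs = { m: 60_000, h: 3_600_000, d: 86_400_000 }[m[2] as "m" | "h" | "d"];
  return new Date(now.getTime() - n * unitMs).toISOString();
}
```

Resolving everything to UTC up front keeps the downstream query path unchanged: the screenpipe API still only ever sees absolute timestamps.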
@@ -150,7 +158,7 @@ const BASE_TOOLS: Tool[] = [
     description:
       "Export a video of screen recordings for a specific time range. " +
       "Creates an MP4 video from the recorded frames between the start and end times.\n\n" +
-      "IMPORTANT: Use ISO 8601 UTC timestamps (e.g., 2024-01-15T10:00:00Z)\n\n" +
+      "IMPORTANT: Use ISO 8601 UTC timestamps (e.g., 2024-01-15T10:00:00Z) or relative times (e.g., '16h ago', 'now')\n\n" +
      "EXAMPLES:\n" +
      "- Last 30 minutes: Calculate timestamps from current time\n" +
      "- Specific meeting: Use the meeting's start and end times in UTC",
@@ -165,13 +173,13 @@ const BASE_TOOLS: Tool[] = [
       type: "string",
       format: "date-time",
       description:
-        "Start time in ISO 8601 format UTC. MUST include timezone (Z for UTC). Example: '2024-01-15T10:00:00Z'",
+        "Start time: ISO 8601 UTC (e.g., '2024-01-15T10:00:00Z') or relative (e.g., '16h ago', 'now')",
     },
     end_time: {
       type: "string",
       format: "date-time",
       description:
-        "End time in ISO 8601 format UTC. MUST include timezone (Z for UTC). Example: '2024-01-15T10:30:00Z'",
+        "End time: ISO 8601 UTC (e.g., '2024-01-15T10:30:00Z') or relative (e.g., 'now', '1h ago')",
     },
     fps: {
       type: "number",
@@ -199,12 +207,12 @@ const BASE_TOOLS: Tool[] = [
     start_time: {
       type: "string",
       format: "date-time",
-      description: "ISO 8601 UTC start filter (e.g., 2024-01-15T10:00:00Z)",
+      description: "Start filter: ISO 8601 UTC (e.g., 2024-01-15T10:00:00Z) or relative (e.g., '16h ago', 'now')",
     },
     end_time: {
       type: "string",
       format: "date-time",
-      description: "ISO 8601 UTC end filter (e.g., 2024-01-15T18:00:00Z)",
+      description: "End filter: ISO 8601 UTC (e.g., 2024-01-15T18:00:00Z) or relative (e.g., 'now', '1h ago')",
     },
     limit: {
       type: "integer",
@@ -223,9 +231,16 @@ const BASE_TOOLS: Tool[] = [
     name: "activity-summary",
     description:
       "Get a lightweight compressed activity overview for a time range (~200-500 tokens). " +
-      "Returns app usage (name, frame count, minutes), recent accessibility texts, and audio speaker summary. " +
-      "Use this FIRST for broad questions like 'what was I doing?' before drilling into search-content or search-elements. " +
-      "Much cheaper than search-content for getting an overview.",
+      "Returns app usage (name, frame count, active minutes, first/last seen), recent accessibility texts, and audio speaker summary. " +
+      "Minutes are based on active session time (consecutive frames with gaps < 5min count as active). " +
+      "first_seen/last_seen show the wall-clock span per app.\n\n" +
+      "USE THIS TOOL (not search-content or raw SQL) for:\n" +
+      "- 'how long did I spend on X?' → active_minutes per app\n" +
+      "- 'which apps did I use today?' → app list sorted by active_minutes\n" +
+      "- 'what was I doing?' → broad overview before drilling deeper\n" +
+      "- Any time-spent or app-usage question\n\n" +
+      "WARNING: Do NOT estimate time from raw frame counts or SQL queries — those are inaccurate. " +
+      "This endpoint calculates actual active session time correctly.",
     annotations: {
       title: "Activity Summary",
       readOnlyHint: true,
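The activity-summary output now carries a per-app first/last-seen span, rendered by the reworked appsLines mapper later in this diff. An adapted standalone copy shows what the formatted lines look like (the two-space indent is assumed; the rest mirrors the diff):

```typescript
// Adapted copy of this release's appsLines mapper, runnable on its own.
interface AppUsage {
  name: string;
  frame_count: number;
  minutes: number;
  first_seen?: string; // ISO 8601 UTC
  last_seen?: string;  // ISO 8601 UTC
}

function formatAppLine(a: AppUsage): string {
  // slice(11, 16) extracts HH:MM from an ISO 8601 UTC timestamp
  const timeSpan = a.first_seen && a.last_seen
    ? `, ${a.first_seen.slice(11, 16)}–${a.last_seen.slice(11, 16)} UTC`
    : "";
  return `  ${a.name}: ${a.minutes} min (${a.frame_count} frames${timeSpan})`;
}
```

Note the graceful fallback: apps without first_seen/last_seen (e.g., from an older screenpipe API) render exactly as before.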
@@ -236,12 +251,12 @@ const BASE_TOOLS: Tool[] = [
     start_time: {
       type: "string",
       format: "date-time",
-      description: "Start of time range in ISO 8601 UTC (e.g., 2024-01-15T10:00:00Z)",
+      description: "Start of time range: ISO 8601 UTC (e.g., 2024-01-15T10:00:00Z) or relative (e.g., '16h ago', 'now')",
     },
     end_time: {
       type: "string",
       format: "date-time",
-      description: "End of time range in ISO 8601 UTC (e.g., 2024-01-15T18:00:00Z)",
+      description: "End of time range: ISO 8601 UTC (e.g., 2024-01-15T18:00:00Z) or relative (e.g., 'now', '1h ago')",
     },
     app_name: {
       type: "string",
@@ -286,12 +301,12 @@ const BASE_TOOLS: Tool[] = [
     start_time: {
       type: "string",
       format: "date-time",
-      description: "ISO 8601 UTC start time",
+      description: "Start time: ISO 8601 UTC or relative (e.g., '16h ago', 'now')",
     },
     end_time: {
       type: "string",
       format: "date-time",
-      description: "ISO 8601 UTC end time",
+      description: "End time: ISO 8601 UTC or relative (e.g., 'now', '1h ago')",
     },
     app_name: {
       type: "string",
@@ -418,14 +433,28 @@ Screenpipe captures four types of data:
 - **Get keyboard input**: \`{"content_type": "input"}\`
 - **Get audio only**: \`{"content_type": "audio"}\`
 
+## Common User Requests → Correct Tool Choice
+| User says | Use this tool | Key params |
+|-----------|--------------|------------|
+| "summarize my meeting/call" | search-content | content_type:"audio", NO q param, start_time |
+| "what did they/I say about X" | search-content | content_type:"audio", NO q param (scan results manually) |
+| "how long on X" / "which apps" / "time spent" | activity-summary | start_time, end_time |
+| "what was I doing" | activity-summary | start_time, end_time (then drill into search-content) |
+| "what was I reading/looking at" | search-content | content_type:"all", start_time |
+
+## Behavior Rules
+- Act immediately on clear requests. NEVER ask "what time range?" or "which content type?" when the intent is obvious.
+- If search returns empty, silently retry with wider time range or fewer filters. Do NOT ask the user what to change.
+- For meetings: ALWAYS use content_type:"audio" and do NOT use the q param. Transcriptions are noisy — q filters too aggressively and misses relevant content.
+
 ## search-content
 | Parameter | Description | Default |
 |-----------|-------------|---------|
 | q | Search query | (none - returns all) |
 | content_type | all/ocr/audio/input/accessibility | all |
 | limit | Max results | 10 |
-| start_time | ISO 8601 UTC | (no filter) |
-| end_time | ISO 8601 UTC | (no filter) |
+| start_time | ISO 8601 UTC or relative (e.g. '16h ago') | (no filter) |
+| end_time | ISO 8601 UTC or relative (e.g. 'now') | (no filter) |
 | app_name | Filter by app | (no filter) |
 | include_frames | Include screenshots | false |
 
@@ -443,6 +472,19 @@ Screenpipe captures four types of data:
 4. **Fetch frame-context** for URLs and accessibility tree of specific frames
 5. **Screenshots** (include_frames=true) only when text isn't enough
 
+## Chat History
+Previous screenpipe chat conversations are stored as individual JSON files in ~/.screenpipe/chats/{conversation-id}.json
+Each file contains: id, title, messages[], createdAt, updatedAt. You can read these files to reference or search previous conversations.
+
+## Speaker Management
+screenpipe auto-identifies speakers in audio. API endpoints for managing them:
+- \`GET /speakers/unnamed?limit=10\` — list unnamed speakers
+- \`GET /speakers/search?name=John\` — search by name
+- \`POST /speakers/update\` with \`{"id": 5, "name": "John"}\` — rename a speaker
+- \`POST /speakers/merge\` with \`{"speaker_to_keep_id": 1, "speaker_to_merge_id": 2}\` — merge duplicates
+- \`GET /speakers/similar?speaker_id=5\` — find similar speakers for merging
+- \`POST /speakers/reassign\` — reassign audio chunk to different speaker
+
 ## Tips
 1. Read screenpipe://context first to get current timestamps
 2. Use activity-summary before search-content for broad overview questions
@@ -993,8 +1035,12 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
 
 // Format apps
 const appsLines = (data.apps || []).map(
-  (a: { name: string; frame_count: number; minutes: number }) =>
-    `  ${a.name}: ${a.minutes} min (${a.frame_count} frames)`
+  (a: { name: string; frame_count: number; minutes: number; first_seen?: string; last_seen?: string }) => {
+    const timeSpan = a.first_seen && a.last_seen
+      ? `, ${a.first_seen.slice(11, 16)}–${a.last_seen.slice(11, 16)} UTC`
+      : "";
+    return `  ${a.name}: ${a.minutes} min (${a.frame_count} frames${timeSpan})`;
+  }
 );
 
 // Format audio
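The Speaker Management endpoints listed in the prompt earlier in this diff can be wrapped in a small request builder. A sketch under the assumption that the paths and payloads shown there are accurate; the wrapper itself (names, types, default port 3030) is illustrative, not shipped code:

```typescript
// Build request descriptions for screenpipe's speaker endpoints.
// Endpoint paths and payload keys come from the documented list;
// the wrapper is a hypothetical convenience, not part of the package.
const SCREENPIPE_API = "http://localhost:3030";

interface SpeakerRequest {
  method: "GET" | "POST";
  url: string;
  body: string;
}

function renameSpeakerRequest(id: number, name: string): SpeakerRequest {
  return {
    method: "POST",
    url: `${SCREENPIPE_API}/speakers/update`,
    body: JSON.stringify({ id, name }),
  };
}

function mergeSpeakersRequest(keepId: number, mergeId: number): SpeakerRequest {
  return {
    method: "POST",
    url: `${SCREENPIPE_API}/speakers/merge`,
    body: JSON.stringify({ speaker_to_keep_id: keepId, speaker_to_merge_id: mergeId }),
  };
}
```

Each result can be passed to `fetch(r.url, { method: r.method, body: r.body, headers: { "Content-Type": "application/json" } })` against a running screenpipe instance.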