npm - botholomew - Versions diffs - 0.9.5 → 0.9.7 - Mend

botholomew 0.9.5 → 0.9.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (18)

package/package.json +1 -1
package/src/chat/agent.ts +19 -2
package/src/commands/context.ts +1 -0
package/src/context/chunker.ts +40 -3
package/src/context/ingest.ts +8 -1
package/src/context/refresh.ts +28 -18
package/src/db/connection.ts +2 -4
package/src/db/context.ts +16 -3
package/src/db/embeddings.ts +37 -2
package/src/db/schema.ts +10 -2
package/src/db/sql/11-rebuild_hnsw.sql +8 -9
package/src/db/sql/13-drive-paths.sql +0 -2
package/src/db/sql/14-drop_hnsw_index.sql +8 -0
package/src/db/sql/15-fts_index.sql +8 -0
package/src/db/sql/16-source_url.sql +7 -0
package/src/db/sql/6-vss_index.sql +7 -1
package/src/db/sql/7-drop_embeddings_fk.sql +0 -1
package/src/worker/prompt.ts +19 -2

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "botholomew",
-  "version": "0.9.5",
+  "version": "0.9.7",
   "description": "An autonomous AI agent for knowledge work — works your task queue while you sleep.",
   "type": "module",
   "bin": {

package/src/chat/agent.ts CHANGED

@@ -115,9 +115,26 @@ Format your responses using Markdown. Use headings, bold, italic, lists, and cod
     prompt += `
 ## External Tools (MCP)
-Before reaching for MCP tools to **find** information, check local context first — content from Drive, Gmail, GitHub, URLs, and prior agent runs is often already ingested. Use \`search_semantic\` (semantic) or \`context_search\` (keyword) across drives, then \`context_read\` / \`context_tree\` to drill in. Only fall through to \`mcp_exec\` when the data is fresh, write-side (sending an email, creating an issue), or genuinely missing locally.
+### Local context first
-You have access to external tools via MCP servers. Before calling any MCP tool you haven't used yet this session, you MUST fetch its schema first:
+**Before any MCP read, search local context.** Drive, Gmail, GitHub, URLs, and prior agent runs are usually already ingested — refetching is slower, costs tokens, and risks rate limits.
+Workflow for any "look up / find / read" intent:
+1. \`search_semantic\` (semantic) or \`context_search\` (keyword), then \`context_read\` / \`context_tree\` to drill in.
+2. If freshness matters, call \`context_info\` and check \`indexed_at\`. To re-pull a single stale item, use \`context_refresh\` rather than going to MCP for the whole document.
+3. Only call \`mcp_exec\` for reads when the data is genuinely missing locally **or** must be real-time (e.g., "what's on my calendar right now").
+Writes always go through MCP — sending an email, creating an issue, posting to Slack. Don't search context first for those.
+Examples:
+- "What does doc X say?" → \`search_semantic\` first.
+- "Any new emails from Y?" → check the \`gmail\` drive first; only hit Gmail MCP if the freshest indexed item is too old for the question.
+- "Send an email to Y" → MCP write directly; no context lookup.
+### Calling MCP tools
+Before calling any MCP tool you haven't used yet this session, you MUST fetch its schema first:
 1. Discover tools with \`mcp_search\` (preferred — semantic) or \`mcp_list_tools\`.
 2. Call \`mcp_info\` with the exact \`server\` and \`tool\` to read the tool's input schema, required fields, and types.

package/src/commands/context.ts CHANGED

@@ -871,6 +871,7 @@ async function addUrl(
       drive: target.drive,
       path: target.path,
       isTextual: true,
+      sourceUrl: fetched.sourceUrl,
     };
     const item =

package/src/context/chunker.ts CHANGED

@@ -1,5 +1,6 @@
 import Anthropic from "@anthropic-ai/sdk";
 import type { BotholomewConfig } from "../config/schemas.ts";
+import { logger } from "../utils/logger.ts";
 export interface Chunk {
   index: number;
@@ -16,6 +17,10 @@ const DEFAULT_OVERLAP_LINES = 2;
 // 8192-token limit, leaving headroom for the title/description prefix
 // prepended at embed time.
 const MAX_CHUNK_CHARS = 15_000;
+// Target size for deterministic fallback chunks. Smaller than MAX_CHUNK_CHARS
+// so a large doc produces multiple chunks of reasonable granularity when the
+// LLM chunker fails.
+const FALLBACK_TARGET_CHARS = 4_000;
 const CHUNKER_TOOL_NAME = "return_chunks";
 const CHUNKER_TOOL = {
@@ -152,6 +157,26 @@ export function addOverlapToChunks(
   });
 }
+export type LLMChunkerFn = (
+  content: string,
+  mimeType: string,
+  config: Required<BotholomewConfig>,
+) => Promise<Chunk[]>;
+/**
+ * Deterministic fallback that splits content on paragraph / line /
+ * hard-char boundaries. Used when the LLM chunker errors or times out.
+ */
+export function chunkByTextSplit(
+  content: string,
+  targetChars = FALLBACK_TARGET_CHARS,
+): Chunk[] {
+  return splitText(content, targetChars).map((c, i) => ({
+    index: i,
+    content: c,
+  }));
+}
 /**
  * LLM-driven chunker that asks Claude to identify semantic boundaries.
  * Uses structured outputs via tool_use with forced tool_choice.
@@ -167,7 +192,7 @@ export async function chunkWithLLM(
   const response = await Promise.race([
     client.messages.create({
       model: config.chunker_model,
-      max_tokens: 1024,
+      max_tokens: 2048,
       tools: [CHUNKER_TOOL],
       tool_choice: { type: "tool", name: CHUNKER_TOOL_NAME },
       messages: [
@@ -209,13 +234,15 @@ ${content}`,
 }
 /**
- * Chunk content using the LLM chunker.
+ * Chunk content using the LLM chunker, with a deterministic fallback
+ * when the LLM call fails (timeout, empty boundaries, API error, …).
  * Short content (<200 chars) is returned as a single chunk.
  */
 export async function chunk(
   content: string,
   mimeType: string,
   config: Required<BotholomewConfig>,
+  llmChunker: LLMChunkerFn = chunkWithLLM,
 ): Promise<Chunk[]> {
   if (content.length < SHORT_CONTENT_THRESHOLD) {
     return [{ index: 0, content }];
@@ -227,7 +254,17 @@ export async function chunk(
     );
   }
-  const chunks = await chunkWithLLM(content, mimeType, config);
+  let chunks: Chunk[];
+  try {
+    chunks = await llmChunker(content, mimeType, config);
+  } catch (err) {
+    const msg = err instanceof Error ? err.message : String(err);
+    logger.warn(
+      `chunker: LLM chunking failed (${msg}); falling back to deterministic text split`,
+    );
+    chunks = chunkByTextSplit(content);
+  }
   // Enforce a hard size cap before AND after overlap. The first pass handles
   // oversize chunks from the LLM (common for docs with very long lines); the
   // second pass handles the rare case where added overlap pushes a near-limit
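
The fallback path in this hunk can be sketched as follows. This is a minimal illustration, not the package's code: the diff does not show `splitText`, so `splitTextSketch` is a hypothetical stand-in that follows the documented order (paragraph, then line, then hard character boundaries), and `chunkWithFallback` is an assumed simplification of `chunk` that drops the `mimeType` and `config` parameters.

```typescript
interface Chunk {
  index: number;
  content: string;
}

// Hypothetical stand-in for `splitText`, which this diff does not show.
// Packs paragraphs up to `targetChars`, falls back to line boundaries for
// oversized paragraphs, and hard-splits any single line that is still too long.
function splitTextSketch(content: string, targetChars: number): string[] {
  if (targetChars <= 0) throw new RangeError("targetChars must be positive");
  const pieces: string[] = [];
  let current = "";

  const flush = (): void => {
    if (current.length > 0) pieces.push(current);
    current = "";
  };

  // Append `part` to the open piece, or start a new piece if it would overflow.
  const add = (part: string, separator: string): void => {
    const candidate = current.length > 0 ? current + separator + part : part;
    if (candidate.length <= targetChars) {
      current = candidate;
    } else {
      flush();
      current = part;
    }
  };

  for (const paragraph of content.split(/\n{2,}/)) {
    if (paragraph.length === 0) continue;
    if (paragraph.length <= targetChars) {
      add(paragraph, "\n\n");
      continue;
    }
    // Oversized paragraph: close the open piece and work line by line.
    flush();
    for (const line of paragraph.split("\n")) {
      if (line.length <= targetChars) {
        add(line, "\n");
        continue;
      }
      // Oversized line: hard split at fixed character offsets.
      flush();
      for (let i = 0; i < line.length; i += targetChars) {
        pieces.push(line.slice(i, i + targetChars));
      }
    }
    flush();
  }
  flush();
  return pieces;
}

// Simplified chunker signature (assumption: the real one also takes
// a MIME type and a config object).
type ChunkerFn = (content: string) => Promise<Chunk[]>;

// Try the injected primary chunker; on any rejection, split deterministically.
async function chunkWithFallback(
  content: string,
  primary: ChunkerFn,
  targetChars = 4_000,
): Promise<Chunk[]> {
  try {
    return await primary(content);
  } catch {
    return splitTextSketch(content, targetChars).map((c, i) => ({
      index: i,
      content: c,
    }));
  }
}
```

Passing the primary chunker as a parameter mirrors the `llmChunker` argument added above: a test can inject a function that always rejects and check the fallback output without calling the API.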

package/src/context/ingest.ts CHANGED

@@ -1,7 +1,11 @@
 import type { BotholomewConfig } from "../config/schemas.ts";
 import type { DbConnection } from "../db/connection.ts";
 import { getContextItem, getContextItemById } from "../db/context.ts";
-import { createEmbedding, deleteEmbeddingsForItem } from "../db/embeddings.ts";
+import {
+  createEmbedding,
+  deleteEmbeddingsForItem,
+  rebuildSearchIndex,
+} from "../db/embeddings.ts";
 import { logger } from "../utils/logger.ts";
 import { chunk } from "./chunker.ts";
 import { type DriveTarget, formatDriveRef } from "./drives.ts";
@@ -121,6 +125,9 @@ export async function storeIngestion(
     throw err;
   }
+  // FTS index is a snapshot and doesn't see the writes above until rebuilt.
+  await rebuildSearchIndex(conn);
   const action = isUpdate ? "updated" : "added";
   logger.info(
     `ingest: ${action} ${prepared.chunks.length} chunks for "${prepared.title}" (${prepared.itemId})`,

package/src/context/refresh.ts CHANGED

@@ -3,7 +3,7 @@ import type { BotholomewConfig } from "../config/schemas.ts";
 import type { DbConnection } from "../db/connection.ts";
 import { type ContextItem, updateContextItem } from "../db/context.ts";
 import { formatDriveRef } from "./drives.ts";
-import { fetchUrl } from "./fetcher.ts";
+import { type FetchedContent, fetchUrl } from "./fetcher.ts";
 import {
   type PreparedIngestion,
   prepareIngestion,
@@ -40,6 +40,13 @@ export interface RefreshOptions {
 type IngestEmbedFn = (texts: string[]) => Promise<number[][]>;
+/** Signature compatible with {@link fetchUrl}. Injectable for tests. */
+export type FetchUrlFn = (
+  url: string,
+  config: Required<BotholomewConfig>,
+  mcpxClient: McpxClient | null,
+) => Promise<FetchedContent>;
 /**
  * Refresh a batch of context items: re-read from origin, diff, update
  * content, and re-embed only the items that changed.
@@ -47,10 +54,12 @@ type IngestEmbedFn = (texts: string[]) => Promise<number[][]>;
  * Dispatches on `drive`:
  *   disk  → read from filesystem
  *   agent → skip (no external origin)
- *   other → re-fetch as a URL (the path is either a full URL for `url` drive
- *           or an origin-specific identifier that fetchUrl can re-derive via
- *           the MCP agent; for now this only refreshes items stored under
- *           `url:/<full-url>`)
+ *   other → re-fetch via `item.source_url` (captured at ingest time).
+ *           The built-in `url` drive stores the URL as its path so it can
+ *           also refresh directly from `path`. Any other drive with no
+ *           `source_url` surfaces a per-item error — the user must re-add
+ *           from URL. No code here knows anything about the remote
+ *           service behind a drive.
  */
 export async function refreshContextItems(
   conn: DbConnection,
@@ -59,6 +68,7 @@ export async function refreshContextItems(
   mcpxClient: McpxClient | null,
   opts: RefreshOptions = {},
   embedFn?: IngestEmbedFn,
+  fetchFn: FetchUrlFn = fetchUrl,
 ): Promise<RefreshResult> {
   const refreshable = items.filter((i) => i.drive !== "agent");
@@ -84,20 +94,20 @@ export async function refreshContextItems(
           continue;
         }
         content = await bunFile.text();
-      } else if (item.drive === "url") {
-        const url = item.path.startsWith("/") ? item.path.slice(1) : item.path;
-        const fetched = await fetchUrl(url, config, mcpxClient);
-        content = fetched.content;
       } else {
-        // Service-specific drives (google-docs, github, etc.) — only
-        // refreshable when the original URL can be reconstructed. For now,
-        // we punt: mark as error so the user knows to re-add from URL.
-        results.push({
-          ...base,
-          status: "error",
-          error: `Refresh not implemented for drive '${item.drive}' — re-add from the original URL.`,
-        });
-        continue;
+        const url =
+          item.source_url ??
+          (item.drive === "url" ? item.path.replace(/^\//, "") : null);
+        if (!url) {
+          results.push({
+            ...base,
+            status: "error",
+            error: `Cannot refresh ${formatDriveRef(item)}: no source_url recorded. Re-add from the original URL.`,
+          });
+          continue;
+        }
+        const fetched = await fetchFn(url, config, mcpxClient);
+        content = fetched.content;
       }
       if (content === item.content) {
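
The URL resolution rule introduced in this hunk can be isolated as a small pure function. The sketch below is illustrative only: `RefreshableItem` and `resolveRefreshUrl` are hypothetical names, and the interface keeps just the three fields the rule reads.

```typescript
// Hypothetical subset of ContextItem: only the fields the rule needs.
interface RefreshableItem {
  drive: string;
  path: string;
  source_url: string | null;
}

// Prefer the URL captured at ingest time. The built-in `url` drive stores the
// URL as its path (with a leading slash), so it can fall back to that. Any
// other drive without a source_url cannot be refreshed and yields null.
function resolveRefreshUrl(item: RefreshableItem): string | null {
  return (
    item.source_url ??
    (item.drive === "url" ? item.path.replace(/^\//, "") : null)
  );
}
```

A `null` result corresponds to the per-item error branch above, where the user is asked to re-add the item from its original URL.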

package/src/db/connection.ts CHANGED

@@ -186,8 +186,7 @@ export async function getConnection(dbPath?: string): Promise<DbConnection> {
   if (isMemoryPath(path)) {
     const instance = await DuckDBInstance.create(path);
     const conn = await instance.connect();
-    await conn.run("INSTALL vss; LOAD vss;");
-    await conn.run("SET hnsw_enable_experimental_persistence = true;");
+    await conn.run("INSTALL fts; LOAD fts;");
     return new DbConnection(conn, instance, path);
   }
@@ -197,8 +196,7 @@ export async function getConnection(dbPath?: string): Promise<DbConnection> {
     // INSTALL is a no-op after the first successful install (the extension
     // is persisted to the user's DuckDB extension directory). LOAD is
     // cheap per connection.
-    await conn.run("INSTALL vss; LOAD vss;");
-    await conn.run("SET hnsw_enable_experimental_persistence = true;");
+    await conn.run("INSTALL fts; LOAD fts;");
     return new DbConnection(conn, null, path);
   } catch (err) {
     releaseInstance(path);

package/src/db/context.ts CHANGED

@@ -17,6 +17,7 @@ export interface ContextItem {
   is_textual: boolean;
   drive: string;
   path: string;
+  source_url: string | null;
   indexed_at: Date | null;
   created_at: Date;
   updated_at: Date;
@@ -38,6 +39,7 @@ interface ContextItemRow {
   is_textual: boolean;
   drive: string;
   path: string;
+  source_url: string | null;
   indexed_at: string | null;
   created_at: string;
   updated_at: string;
@@ -53,6 +55,7 @@ function rowToContextItem(row: ContextItemRow): ContextItem {
     is_textual: !!row.is_textual,
     drive: row.drive,
     path: row.path,
+    source_url: row.source_url,
     indexed_at: row.indexed_at ? new Date(row.indexed_at) : null,
     created_at: new Date(row.created_at),
     updated_at: new Date(row.updated_at),
@@ -84,12 +87,13 @@ export async function createContextItem(
     path: string;
     description?: string;
     isTextual?: boolean;
+    sourceUrl?: string | null;
   },
 ): Promise<ContextItem> {
   const id = uuidv7();
   const row = await db.queryGet<ContextItemRow>(
-    `INSERT INTO context_items (id, title, description, content, mime_type, is_textual, drive, path)
-     VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8)
+    `INSERT INTO context_items (id, title, description, content, mime_type, is_textual, drive, path, source_url)
+     VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9)
      RETURNING *`,
     id,
     params.title,
@@ -99,6 +103,7 @@ export async function createContextItem(
     params.isTextual !== false,
     params.drive,
     params.path,
+    params.sourceUrl ?? null,
   );
   if (!row) throw new Error("INSERT did not return a row");
   return rowToContextItem(row);
@@ -122,6 +127,7 @@ export async function upsertContextItem(
     path: string;
     description?: string;
     isTextual?: boolean;
+    sourceUrl?: string | null;
   },
 ): Promise<ContextItem> {
   const existing = await getContextItem(db, {
@@ -133,6 +139,7 @@ export async function upsertContextItem(
       title: params.title,
       content: params.content,
       mime_type: params.mimeType,
+      source_url: params.sourceUrl,
     });
     if (!updated)
       throw new Error(
@@ -157,6 +164,7 @@ export async function createContextItemStrict(
     path: string;
     description?: string;
     isTextual?: boolean;
+    sourceUrl?: string | null;
   },
 ): Promise<ContextItem> {
   const existing = await getContextItem(db, {
@@ -426,7 +434,10 @@ export async function updateContextItem(
   db: DbConnection,
   id: string,
   updates: Partial<
-    Pick<ContextItem, "title" | "description" | "content" | "mime_type">
+    Pick<
+      ContextItem,
+      "title" | "description" | "content" | "mime_type" | "source_url"
+    >
   >,
 ): Promise<ContextItem | null> {
   const { setClauses, params } = buildSetClauses([
@@ -434,6 +445,7 @@ export async function updateContextItem(
     ["description", updates.description],
     ["content", updates.content],
     ["mime_type", updates.mime_type],
+    ["source_url", updates.source_url],
   ]);
   setClauses.push("updated_at = current_timestamp::VARCHAR");
@@ -514,6 +526,7 @@ export async function copyContextItem(
     drive: dst.drive,
     path: dst.path,
     isTextual: source.is_textual,
+    sourceUrl: source.source_url,
   });
 }

package/src/db/embeddings.ts CHANGED

@@ -45,6 +45,11 @@ function rowToEmbedding(row: EmbeddingRow): Embedding {
   };
 }
+/**
+ * Insert a single embedding row. Callers that bulk-write embeddings are
+ * responsible for calling `rebuildSearchIndex()` afterward — the FTS index is
+ * a snapshot and will not reflect new rows until rebuilt.
+ */
 export async function createEmbedding(
   conn: DbConnection,
   params: {
@@ -92,6 +97,11 @@ export async function getEmbeddingsForItem(
   return rows.map(rowToEmbedding);
 }
+/**
+ * Delete all embeddings for a context item. Callers are responsible for
+ * calling `rebuildSearchIndex()` afterward — the FTS index is a snapshot and
+ * will still reference the deleted rows until rebuilt.
+ */
 export async function deleteEmbeddingsForItem(
   conn: DbConnection,
   contextItemId: string,
@@ -138,6 +148,25 @@ export interface HybridSearchResult extends EmbeddingSearchResult {
   path: string | null;
 }
+/**
+ * Rebuild the FTS index over (chunk_content, title). DuckDB's FTS index is a
+ * snapshot — it does not update incrementally on INSERT/UPDATE/DELETE, so any
+ * batch writer must call this once its transaction commits. Cheap at our
+ * scale (hundreds to low thousands of rows).
+ *
+ * The trailing CHECKPOINT is load-bearing: `overwrite = 1` writes a
+ * `DROP SCHEMA fts_main_embeddings` record into the WAL. If the WAL still
+ * contains that drop on the next open, replay fails with "Cannot drop entry
+ * 'fts_main_embeddings' because there are entries that depend on it". Forcing
+ * a checkpoint flushes the WAL so the next open has nothing to replay.
+ */
+export async function rebuildSearchIndex(conn: DbConnection): Promise<void> {
+  await conn.exec(
+    "PRAGMA create_fts_index('embeddings', 'id', 'chunk_content', 'title', overwrite = 1)",
+  );
+  await conn.exec("CHECKPOINT");
+}
 export async function hybridSearch(
   conn: DbConnection,
   query: string,
@@ -146,10 +175,16 @@ export async function hybridSearch(
 ): Promise<HybridSearchResult[]> {
   const k = 60; // RRF constant
+  // Keyword side: BM25 over chunk_content + title via the FTS extension.
+  // `match_bm25` returns NULL for rows with no token overlap; we keep only
+  // scored rows and order by descending score so RRF sees the best matches
+  // at the lowest ranks. Stemming, stopwords, and tokenization are handled
+  // by FTS — more query terms produce higher scores, which is exactly the
+  // behaviour a naive per-token ILIKE loop fails to provide.
   const keywordRows = await conn.queryAll<EmbeddingRow>(
     `SELECT * FROM embeddings
-     WHERE chunk_content ILIKE '%' || ?1 || '%'
-        OR title ILIKE '%' || ?1 || '%'
+     WHERE fts_main_embeddings.match_bm25(id, ?1) IS NOT NULL
+     ORDER BY fts_main_embeddings.match_bm25(id, ?1) DESC
      LIMIT 100`,
     query,
   );
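
The hunk orders keyword rows by BM25 score so that reciprocal rank fusion (RRF) sees the best matches at the lowest ranks. The rest of `hybridSearch` is not shown in this diff, so the following is only a generic sketch of RRF with the same constant `k = 60`. The function name, the use of string ids, and the 1-based rank are assumptions.

```typescript
// Generic reciprocal rank fusion: each list contributes 1 / (k + rank) to the
// score of every id it contains, where rank is the 1-based position (assumed).
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, i) => {
      const rank = i + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((x, y) => y[1] - x[1])
    .map(([id]) => id);
}
```

An id that appears in both the keyword and the vector list accumulates two contributions, which is why result order within each input list matters: a keyword list in arbitrary order would give arbitrary rows the largest contribution.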

package/src/db/schema.ts CHANGED

@@ -1,5 +1,6 @@
 import { readdirSync, readFileSync } from "node:fs";
 import { join } from "node:path";
+import { logger } from "../utils/logger.ts";
 import type { DbConnection } from "./connection.ts";
 interface Migration {
@@ -45,9 +46,16 @@ export async function migrate(db: DbConnection): Promise<void> {
   const applied = new Set(rows.map((row) => row.id));
   // Run pending migrations in order
+  const pending = loadMigrations().filter((m) => !applied.has(m.id));
+  if (pending.length > 0) {
+    logger.info(
+      `applying ${pending.length} migration${pending.length === 1 ? "" : "s"}`,
+    );
+  }
   let appliedAny = false;
-  for (const migration of loadMigrations()) {
-    if (applied.has(migration.id)) continue;
+  for (const migration of pending) {
+    logger.info(`  ${migration.id}. ${migration.name}`);
     // Split on semicolons and run each statement individually
     const statements = migration.sql
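
The pending-migration filter and the summary log line from this hunk can be exercised in isolation. In this sketch `MigrationSketch`, `pendingMigrations`, and `summaryLine` are hypothetical names, and the numeric `id` type is an assumption, since the `Migration` interface body is not shown in the diff.

```typescript
// Assumed shape; the real Migration interface is elided in the diff.
interface MigrationSketch {
  id: number;
  name: string;
}

// Keep only migrations whose id is not yet recorded as applied,
// preserving the input order.
function pendingMigrations(
  all: MigrationSketch[],
  applied: Set<number>,
): MigrationSketch[] {
  return all.filter((m) => !applied.has(m.id));
}

// Same pluralization rule as the log line in the hunk above.
function summaryLine(count: number): string {
  return `applying ${count} migration${count === 1 ? "" : "s"}`;
}
```

Computing the pending list once, before the loop, is what lets the summary line report an accurate count up front.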

package/src/db/sql/11-rebuild_hnsw.sql CHANGED

@@ -1,9 +1,8 @@
--- The HNSW index from migration 6 can end up in an internally-inconsistent
--- state after a native-side crash during embedding writes: the buffered WAL
--- replay tries to re-insert a node that HNSW's high-level wrapper already has,
--- and search_semantic then fails with "Duplicate keys not allowed in
--- high-level wrappers". Dropping and recreating the index rebuilds it cleanly
--- from the current contents of the embeddings table.
-DROP INDEX IF EXISTS idx_embeddings_cosine;
-CREATE INDEX idx_embeddings_cosine ON embeddings USING HNSW (embedding) WITH (metric = 'cosine');
+-- Historical: this migration used to drop and recreate the HNSW index
+-- to clean up an internally-inconsistent state after native-side crashes
+-- during embedding writes. HNSW is now gone (see migration 14) and the
+-- VSS extension is no longer loaded at connection time, so the original
+-- DDL would fail on fresh DBs. Kept as a no-op to preserve migration
+-- numbering for existing databases that have already recorded id 11 in
+-- _migrations.
+SELECT 1;

package/src/db/sql/13-drive-paths.sql CHANGED

@@ -44,6 +44,4 @@ CREATE TABLE embeddings (
   UNIQUE(context_item_id, chunk_index)
 );
-CREATE INDEX idx_embeddings_cosine ON embeddings USING HNSW (embedding) WITH (metric = 'cosine');
 CHECKPOINT;

package/src/db/sql/14-drop_hnsw_index.sql ADDED

@@ -0,0 +1,8 @@
+-- HNSW has caused two separate corruption modes in this project: the
+-- "Duplicate keys not allowed in high-level wrappers" failure addressed by
+-- migration 11, and a second mode where the index silently returns zero rows
+-- for cosine top-K queries (its stored SQL loses the `WITH (metric = 'cosine')`
+-- clause). At our scale a linear scan of array_cosine_distance is plenty fast
+-- and array_cosine_distance is a core DuckDB function — no VSS extension
+-- required. Drop the index and move on.
+DROP INDEX IF EXISTS idx_embeddings_cosine;
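
The migration comment argues that a linear scan over cosine distance is sufficient at this scale. To make that concrete, here is an in-memory sketch of the same computation in TypeScript. It is not how the package queries DuckDB; `cosineDistance`, `topKByCosine`, and the row shape are hypothetical, and the distance is defined as one minus cosine similarity.

```typescript
interface EmbeddedRow {
  id: string;
  embedding: number[];
}

// Cosine distance = 1 - (a . b) / (|a| * |b|). Returns NaN for a zero vector.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new RangeError("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force top-K: score every row, sort ascending by distance, keep k.
function topKByCosine(query: number[], rows: EmbeddedRow[], k: number): string[] {
  return rows
    .map((row) => ({ id: row.id, distance: cosineDistance(query, row.embedding) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k)
    .map((scored) => scored.id);
}
```

The scan costs one pass over all rows per query, which is the trade the migration accepts in exchange for dropping the index and the extension it required.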

package/src/db/sql/15-fts_index.sql ADDED

@@ -0,0 +1,8 @@
+-- Keyword search uses DuckDB's FTS extension for BM25 ranking over
+-- chunk_content and title. The index is a snapshot and must be rebuilt
+-- after any write to the embeddings table. rebuildSearchIndex() in
+-- src/db/embeddings.ts is the single entry point and is called from the
+-- ingest transaction. overwrite = 1 makes this PRAGMA idempotent, which
+-- also gives us a first-run rebuild for users upgrading from a DB that
+-- never had FTS.
+PRAGMA create_fts_index('embeddings', 'id', 'chunk_content', 'title', overwrite = 1);

package/src/db/sql/16-source_url.sql ADDED

@@ -0,0 +1,7 @@
+-- Issue #145: preserve the original URL that produced each context item so
+-- `context refresh` can re-fetch loss-lessly for service-specific drives
+-- (google-docs, github, ...). Nullable — local-origin drives (disk, agent,
+-- tool writes) leave it NULL and use their own refresh path. Legacy rows
+-- ingested before this column existed also leave it NULL and surface a
+-- "re-add from URL" error on refresh.
+ALTER TABLE context_items ADD COLUMN source_url TEXT;

package/src/db/sql/6-vss_index.sql CHANGED

@@ -1 +1,7 @@
-CREATE INDEX IF NOT EXISTS idx_embeddings_cosine ON embeddings USING HNSW (embedding) WITH (metric = 'cosine');
+-- Historical: this migration used to CREATE an HNSW index on embeddings
+-- via the VSS extension. HNSW has since been removed (see migration 12)
+-- (see migration 14) and the VSS extension is no longer loaded at
+-- connection time, so running `CREATE INDEX ... USING HNSW` here would
+-- fail on fresh DBs. Kept as a no-op to preserve migration numbering
+-- for existing databases that have already recorded id 6 in _migrations.
+SELECT 1;

package/src/db/sql/7-drop_embeddings_fk.sql CHANGED

@@ -20,5 +20,4 @@ CREATE TABLE embeddings (
   created_at TEXT NOT NULL DEFAULT (current_timestamp::VARCHAR),
   UNIQUE(context_item_id, chunk_index)
 );
-CREATE INDEX IF NOT EXISTS idx_embeddings_cosine ON embeddings USING HNSW (embedding) WITH (metric = 'cosine');
 CHECKPOINT;

package/src/worker/prompt.ts CHANGED

@@ -131,9 +131,26 @@ When calling complete_task, write a summary that captures your key findings, dec
     prompt += `
 ## External Tools (MCP)
-Before reaching for MCP tools to **find** information, check local context first — content from Drive, Gmail, GitHub, URLs, and prior agent runs is often already ingested. Use \`search_semantic\` (semantic) or \`context_search\` (keyword) across drives, then \`context_read\` / \`context_tree\` to drill in. Only fall through to \`mcp_exec\` when the data is fresh, write-side (sending an email, creating an issue), or genuinely missing locally.
+### Local context first
-You have access to external tools via MCP servers. Before calling any MCP tool you haven't used yet this session, you MUST fetch its schema first:
+**Before any MCP read, search local context.** Drive, Gmail, GitHub, URLs, and prior agent runs are usually already ingested — refetching is slower, costs tokens, and risks rate limits.
+Workflow for any "look up / find / read" intent:
+1. \`search_semantic\` (semantic) or \`context_search\` (keyword), then \`context_read\` / \`context_tree\` to drill in.
+2. If freshness matters, call \`context_info\` and check \`indexed_at\`. To re-pull a single stale item, use \`context_refresh\` rather than going to MCP for the whole document.
+3. Only call \`mcp_exec\` for reads when the data is genuinely missing locally **or** must be real-time (e.g., "what's on my calendar right now").
+Writes always go through MCP — sending an email, creating an issue, posting to Slack. Don't search context first for those.
+Examples:
+- "What does doc X say?" → \`search_semantic\` first.
+- "Any new emails from Y?" → check the \`gmail\` drive first; only hit Gmail MCP if the freshest indexed item is too old for the question.
+- "Send an email to Y" → MCP write directly; no context lookup.
+### Calling MCP tools
+Before calling any MCP tool you haven't used yet this session, you MUST fetch its schema first:
 1. Discover tools with \`mcp_search\` (preferred — semantic) or \`mcp_list_tools\`.
 2. Call \`mcp_info\` with the exact \`server\` and \`tool\` to read the tool's input schema, required fields, and types.