npm - pi-web-toolkit - Versions diffs - 0.3.1 → 0.3.2 - Mend

pi-web-toolkit 0.3.1 → 0.3.2


Files changed (13)

package/CHANGELOG.md +16 -1
package/README.md +2 -2
package/docs/guide.md +2 -2
package/docs/tools.md +10 -7
package/extensions/firecrawl_interact.ts +13 -14
package/extensions/firecrawl_scrape.ts +13 -14
package/extensions/firecrawl_search.ts +6 -6
package/extensions/index.ts +25 -0
package/extensions/web_batch_fetch.ts +3 -7
package/extensions/web_browse.ts +5 -9
package/extensions/web_fetch.ts +5 -9
package/extensions/web_search.ts +5 -6
package/package.json +2 -2

package/CHANGELOG.md CHANGED

@@ -7,6 +7,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+## [0.3.2] - 2026-06-25
+### Fixed
+- Kept the agent's web-tool selection local-first: ordinary URL reads now prefer `web_fetch`, discovery prefers `web_search`, and interaction prefers `web_browse`; `firecrawl_*` tools are documented and prompted as fallback-only unless explicitly requested.
+- Fixed `firecrawl_scrape` and `firecrawl_interact` partial-result rendering type-check errors caused by reading `details` before declaration.
+### Changed
+- Reduced web-tool prompt metadata overhead by consolidating shared routing rules and shortening per-tool `promptSnippet`/`promptGuidelines` text.
+- Added a tool-routing prompt regression test and included it in `npm test`.
 ## [0.3.1] - 2026-06-23
 ### Changed
@@ -145,7 +157,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `web_browse` — interactive browser automation via agent-browser.
 - LLM-optimized `promptGuidelines` and `promptSnippet` for every tool.
-[Unreleased]: https://github.com/Wade11s/pi-web-toolkit/compare/v0.2.2...HEAD
+[Unreleased]: https://github.com/Wade11s/pi-web-toolkit/compare/v0.3.2...HEAD
+[0.3.2]: https://github.com/Wade11s/pi-web-toolkit/compare/v0.3.1...v0.3.2
+[0.3.1]: https://github.com/Wade11s/pi-web-toolkit/compare/v0.3.0...v0.3.1
+[0.3.0]: https://github.com/Wade11s/pi-web-toolkit/compare/v0.2.2...v0.3.0
 [0.2.2]: https://github.com/Wade11s/pi-web-toolkit/compare/v0.2.1...v0.2.2
 [0.2.1]: https://github.com/Wade11s/pi-web-toolkit/compare/v0.2.0...v0.2.1
 [0.2.0]: https://github.com/Wade11s/pi-web-toolkit/compare/v0.1.2...v0.2.0
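
The 0.3.2 changelog entry mentions a tool-routing prompt regression test, but `test/tool-routing/test.ts` itself is not part of this diff. A minimal sketch of what such a check might look like, assuming it only inspects prompt strings: `ToolPromptMeta`, `FIRECRAWL_TOOLS`, and `findRoutingRegressions` are hypothetical names, and the strings are copied from the 0.3.2 extension hunks in this diff.

```typescript
// Hypothetical regression check; the package's real test file is not shown in this diff.
interface ToolPromptMeta {
  name: string;
  localFirst: string; // the local tool that should be tried first
  promptSnippet: string;
  guideline: string;
}

// Prompt strings copied from the 0.3.2 hunks for the three firecrawl_* tools.
const FIRECRAWL_TOOLS: ToolPromptMeta[] = [
  {
    name: "firecrawl_search",
    localFirst: "web_search",
    promptSnippet: "Fallback-only Firecrawl search",
    guideline:
      "Use firecrawl_search only after web_search fails/returns nothing, for Firecrawl-only categories, or explicit cloud search.",
  },
  {
    name: "firecrawl_scrape",
    localFirst: "web_fetch",
    promptSnippet: "Fallback-only Firecrawl scrape",
    guideline: "Use firecrawl_scrape only after web_fetch fails or explicit cloud scraping/rendering.",
  },
  {
    name: "firecrawl_interact",
    localFirst: "web_browse",
    promptSnippet: "Fallback-only Firecrawl interaction",
    guideline:
      "Use firecrawl_interact only after web_browse fails, for needed NL interaction, or explicit cloud interaction.",
  },
];

// Returns the names of tools whose prompt text no longer signals fallback-only routing.
function findRoutingRegressions(tools: readonly ToolPromptMeta[]): string[] {
  return tools
    .filter(
      (t) =>
        !t.promptSnippet.startsWith("Fallback-only") ||
        !t.guideline.includes(`only after ${t.localFirst}`),
    )
    .map((t) => t.name);
}
```

A string-level check like this is cheap to run in `npm test`, and it would flag a later edit that removes the "Fallback-only" wording or the reference to the matching local tool.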

package/README.md CHANGED

@@ -22,7 +22,7 @@ Web research toolkit for [pi](https://pi.dev) agents. Search via SearXNG, fetch
 | **`firecrawl_scrape`** | [firecrawl-cli](https://github.com/firecrawl/cli) (keyless) | Cloud single-page fetch (anti-bot / JS / PDF) | — |
 | **`firecrawl_interact`** | [firecrawl-cli](https://github.com/firecrawl/cli) (keyless) | Cloud natural-language page interaction | — |
-> **Firecrawl fallback.** `web_search`, `web_fetch`, and `web_browse` automatically retry through Firecrawl Keyless (1,000 free credits/month, no API key) when their local backend errors out or search returns nothing. The three `firecrawl_*` tools are explicit escape hatches. Disable it with `PI_WEB_FIRECRAWL_FALLBACK=0`. Install the optional CLI: `npm install -g firecrawl-cli`.
+> **Firecrawl fallback.** `web_search`, `web_fetch`, and `web_browse` are the local-first primary tools and automatically retry through Firecrawl Keyless (1,000 free credits/month, no API key) only when their local backend errors out or search returns nothing. The three `firecrawl_*` tools are fallback-only escape hatches; agents are instructed not to call them first unless you explicitly ask for Firecrawl/cloud behavior or a local-first tool already failed. Disable fallback use with `PI_WEB_FIRECRAWL_FALLBACK=0`. Install the optional CLI: `npm install -g firecrawl-cli`.
 ## Tools Preview
@@ -198,7 +198,7 @@ export PI_WEB_FIRECRAWL_FALLBACK=0
 ### Optional: Firecrawl keyless fallback
-When a local backend (`web_search`/`web_fetch`/`web_browse`) fails or returns nothing, the tools automatically retry through [Firecrawl Keyless](https://www.firecrawl.dev/blog/firecrawl-keyless-launch) — 1,000 free credits/month, **no API key, no signup**. The `firecrawl_*` tools are explicit escape hatches for capabilities the local backends lack (search categories, cloud rendering, natural-language interaction).
+When a local backend (`web_search`/`web_fetch`/`web_browse`) fails or returns nothing, the tools automatically retry through [Firecrawl Keyless](https://www.firecrawl.dev/blog/firecrawl-keyless-launch) — 1,000 free credits/month, **no API key, no signup**. The `firecrawl_*` tools are fallback-only explicit escape hatches for capabilities the local backends lack (search categories, cloud rendering, natural-language interaction). Agents should use `web_fetch`/`web_search`/`web_browse` first unless you explicitly request Firecrawl/cloud behavior.
 Install the optional CLI (the fallback degrades gracefully if it is absent):

package/docs/guide.md CHANGED

@@ -46,7 +46,7 @@ User asks about something external / current
 ## Firecrawl Keyless fallback
-When a local backend cannot do the job, the tools automatically retry through **Firecrawl Keyless** (1,000 free credits/month, no API key, no signup) before giving up. It is **fallback-only** — never the primary path — and is **opt-out-able** with `PI_WEB_FIRECRAWL_FALLBACK=0`. Requires the optional `firecrawl-cli` (`npm install -g firecrawl-cli`); if it is absent the tools simply surface the original local error.
+When a local backend cannot do the job, the tools automatically retry through **Firecrawl Keyless** (1,000 free credits/month, no API key, no signup) before giving up. It is **fallback-only** — never the primary path — and is **opt-out-able** with `PI_WEB_FIRECRAWL_FALLBACK=0`. Requires the optional `firecrawl-cli` (`npm install -g firecrawl-cli`); if it is absent the tools simply surface the original local error. Agents should call `web_search`/`web_fetch`/`web_browse` first and call `firecrawl_*` directly only after the corresponding local-first tool failed, or when the user explicitly asks for Firecrawl/cloud behavior.
 | Tool | Falls back to Firecrawl when… |
 |------|-------------------------------|
@@ -55,7 +55,7 @@ When a local backend cannot do the job, the tools automatically retry through **
 | `web_browse` | agent-browser is missing or its batch fails (not on caller validation errors) |
 | `web_batch_fetch` | (no fallback — Firecrawl batch scrape is not keyless) |
-The three `firecrawl_*` tools are the explicit escape hatches for capabilities the local backends lack (`github`/`research`/`pdf` search categories, cloud rendering, natural-language interaction).
+The three `firecrawl_*` tools are fallback-only explicit escape hatches for capabilities the local backends lack (`github`/`research`/`pdf` search categories, cloud rendering, natural-language interaction). They are not the first step for ordinary URL reading; `web_fetch` already performs Firecrawl fallback internally when local fetching fails.
 **Graceful skip.** If the fallback itself cannot help — the CLI is missing, the IP is flagged as suspicious, the keyless quota is exhausted, or the fallback is disabled — the tool falls through to the original local-tool error so the user is never left worse off.
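
The fallback behaviour described in this guide hunk can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's implementation: `withFirecrawlFallback`, `Run`, and the `env` parameter are hypothetical, and the sketch is synchronous for brevity. Only the `PI_WEB_FIRECRAWL_FALLBACK=0` opt-out and the rule of surfacing the original local error come from the text above. It covers the error path only, not the "search returns nothing" case.

```typescript
// Hypothetical sketch of a local-first call with an opt-out fallback.
// None of these names come from pi-web-toolkit; only the env var name does.
type Run<T> = () => T;

function withFirecrawlFallback<T>(
  local: Run<T>,
  fallback: Run<T>,
  env: Record<string, string | undefined>,
): T {
  try {
    // Local backend is always tried first.
    return local();
  } catch (localError) {
    // Opt-out: PI_WEB_FIRECRAWL_FALLBACK=0 disables the retry entirely.
    if (env.PI_WEB_FIRECRAWL_FALLBACK === "0") throw localError;
    try {
      return fallback();
    } catch {
      // Graceful skip: if the fallback cannot help, surface the original local error.
      throw localError;
    }
  }
}
```

Passing the environment in as a parameter, rather than reading it globally, keeps the sketch easy to exercise with different settings.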

package/docs/tools.md CHANGED

@@ -12,7 +12,7 @@ Search the web via SearXNG. Returns ranked results with title, URL, and snippet.
 }
 ```
-**When to use:** The user asks about current events, facts, or anything requiring up-to-date information and has not already provided the source URLs.
+**When to use:** The user asks about current events, facts, or anything requiring up-to-date information and has not already provided the source URLs. Use `web_search` before `firecrawl_search`; `web_search` already performs Firecrawl fallback internally when SearXNG fails or returns nothing.
 **Empty results behavior:** When no results are found, `web_search` includes any query **suggestions** provided by SearXNG. The agent can use them to refine and retry the search.
@@ -35,9 +35,10 @@ Fetch a single page and convert it to clean markdown. Uses Scrapling's browser-b
 ```
 **When to use:**
-- After `web_search` finds a relevant result
+- As the first attempt for a user-provided URL or after `web_search` finds a relevant result
 - The page is static or loads its content on first request
 - You need to read **one** article, doc, or blog post
+- Before `firecrawl_scrape`; `web_fetch` already performs Firecrawl fallback internally when the local fetcher fails
 **Example flow:**
 ```
@@ -77,10 +78,12 @@ Uses the [agent-browser](https://github.com/vercel-labs/agent-browser) CLI with
 When `selector` is omitted, the tool returns agent-browser's interactive accessibility snapshot rather than full page text.
 **When to use:**
+- As the first attempt when the page requires interaction
 - The page requires **clicking** before showing target content (e.g. "Load more", pagination, tab switching)
 - The page requires **filling a form** (e.g. search box, login)
 - The page requires **scrolling** to load lazy content (infinite scroll)
 - The page requires **waiting** for JS to render content (SPA)
+- Before `firecrawl_interact`; `web_browse` already performs Firecrawl fallback internally when local browser automation fails
 **Example flows:**
@@ -163,11 +166,11 @@ User: "Compare Python asyncio, Trio, and curio"
 ---
-## Firecrawl keyless tools (optional cloud escape hatches)
+## Firecrawl keyless tools (optional fallback-only cloud escape hatches)
 These three tools talk to [Firecrawl](https://www.firecrawl.dev) in **keyless** mode: 1,000 free credits/month, **no API key and no signup**. They require the optional `firecrawl-cli` (`npm install -g firecrawl-cli`). **Privacy:** the URL/query/page content is sent to Firecrawl's cloud.
-They double as the implementation of the automatic fallback: `web_search`/`web_fetch`/`web_browse` retry through Firecrawl keyless when their local backend fails (or search returns nothing). Disable all Firecrawl usage with `PI_WEB_FIRECRAWL_FALLBACK=0`.
+They double as the implementation of the automatic fallback: `web_search`/`web_fetch`/`web_browse` retry through Firecrawl keyless when their local backend fails (or search returns nothing). Do not use `firecrawl_*` as the first attempt for ordinary search, URL reading, or page interaction; use the corresponding local-first tool first unless the user explicitly asks for Firecrawl/cloud behavior. Disable all Firecrawl usage with `PI_WEB_FIRECRAWL_FALLBACK=0`.
 ### `firecrawl_search`
@@ -187,7 +190,7 @@ Cloud web search via Firecrawl keyless, with capabilities the local SearXNG tool
 }
 ```
-**When to use:** `web_search` failed or returned nothing; or you need `github`/`research`/`pdf` categories, images/news sources, or domain scoping that SearXNG does not provide.
+**When to use:** `web_search` failed or returned nothing; you need `github`/`research`/`pdf` categories, images/news sources, or domain scoping that SearXNG does not provide; or the user explicitly asked for Firecrawl/cloud search. Do not use it before `web_search` for ordinary discovery.
 ### `firecrawl_scrape`
@@ -203,7 +206,7 @@ Cloud single-page fetch via Firecrawl keyless (anti-bot bypass, JS rendering, PD
 }
 ```
-**When to use:** `web_fetch` failed on an anti-bot-protected, JavaScript-heavy, or PDF page.
+**When to use:** `web_fetch` failed on an anti-bot-protected, JavaScript-heavy, or PDF page, or the user explicitly asked for Firecrawl/cloud scraping. Do not use it before `web_fetch` for ordinary URL reading.
 ### `firecrawl_interact`
@@ -219,6 +222,6 @@ Open a URL in a live Firecrawl browser session and drive it with a natural-langu
 }
 ```
-**When to use:** `web_browse` cannot run (agent-browser missing / OS deps missing), or you want natural-language page interaction without hand-written CSS selectors. Write each prompt as a single, focused task.
+**When to use:** `web_browse` cannot run (agent-browser missing / OS deps missing), you need natural-language page interaction without hand-written CSS selectors, or the user explicitly asked for Firecrawl/cloud interaction. Do not use it before `web_browse` for ordinary page interaction. Write each prompt as a single, focused task.
 ---

package/extensions/firecrawl_interact.ts CHANGED

@@ -39,18 +39,18 @@ const firecrawlInteractTool = defineTool({
   name: "firecrawl_interact",
   label: "Firecrawl Interact",
   description: [
+    "Fallback-only cloud browser interaction via Firecrawl keyless.",
+    "Do not use firecrawl_interact as the first attempt for ordinary page interaction; use web_browse first.",
     "Open a URL in a live Firecrawl browser session and drive it with a natural-language",
-    "prompt (or code), returning the result. Keyless — no API key, no signup.",
-    "Use firecrawl_interact when the local web_browse cannot run, or when you want",
-    "natural-language page interaction without CSS selectors.",
+    "prompt (or code), returning the result. Use only when web_browse cannot run,",
+    "when the user explicitly asks for Firecrawl/cloud interaction, or when you need natural-language page interaction without CSS selectors.",
     "Privacy: the URL, page content, and prompt are sent to Firecrawl's cloud.",
     `Output is truncated to ${DEFAULT_MAX_LINES} lines or ${formatSize(DEFAULT_MAX_BYTES)}; if truncated, full output is saved to a temp file.`,
   ].join(" "),
-  promptSnippet: "Drive a page via Firecrawl keyless (natural-language interaction)",
+  promptSnippet: "Fallback-only Firecrawl interaction",
   promptGuidelines: [
-    "Prefer web_browse first; reach for firecrawl_interact when web_browse can't run or you want NL interaction.",
-    "Write each prompt as a single, focused task; the session can be reused across calls.",
-    "Always pass the full URL including https://.",
+    "Use firecrawl_interact only after web_browse fails, for needed NL interaction, or explicit cloud interaction.",
+    "Keep firecrawl_interact prompt/code focused.",
   ],
   parameters: FirecrawlInteractParamsSchema,
@@ -95,13 +95,6 @@ const firecrawlInteractTool = defineTool({
   renderResult(result, { expanded, isPartial }, theme, context) {
     const isError = context?.isError ?? false;
-    if (isPartial) {
-      const domain = details?.url ? getDomain(details.url) : "";
-      const label = domain ? `Interacting with ${domain} via Firecrawl...` : "Interacting via Firecrawl...";
-      return new Text(theme.fg("warning", label), 0, 0);
-    }
     const details = result.details as {
       url?: string;
       output?: string;
@@ -111,6 +104,12 @@ const firecrawlInteractTool = defineTool({
       creditsUsed?: number;
     } | undefined;
+    if (isPartial) {
+      const domain = details?.url ? getDomain(details.url) : "";
+      const label = domain ? `Interacting with ${domain} via Firecrawl...` : "Interacting via Firecrawl...";
+      return new Text(theme.fg("warning", label), 0, 0);
+    }
     if (isError) {
       const errText = getErrorText(result);
       let text = theme.fg("error", "✗ Firecrawl interact failed");
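
The second hunk above moves the `isPartial` branch below the `details` declaration. A minimal sketch of why the order matters, assuming nothing about the real renderer: `labelFor` and `PartialDetails` are hypothetical names, and the domain is extracted with the standard `URL` class rather than the package's `getDomain` helper.

```typescript
// Hypothetical stand-in for the part of renderResult that builds the partial label.
type PartialDetails = { url?: string } | undefined;

function labelFor(result: { details?: unknown }, isPartial: boolean): string {
  // In the 0.3.1 ordering the isPartial branch sat above this declaration,
  // so `details` was read before it was declared. The type checker rejects
  // that for block-scoped variables, and at runtime the access would throw
  // a ReferenceError (temporal dead zone).
  const details = result.details as PartialDetails;

  if (isPartial) {
    const domain = details?.url ? new URL(details.url).hostname : "";
    return domain
      ? `Interacting with ${domain} via Firecrawl...`
      : "Interacting via Firecrawl...";
  }
  return "done";
}
```

Declaring `details` first lets both the partial and the final rendering paths share one typed view of `result.details`.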

package/extensions/firecrawl_scrape.ts CHANGED

@@ -41,17 +41,17 @@ const firecrawlScrapeTool = defineTool({
   name: "firecrawl_scrape",
   label: "Firecrawl Scrape",
   description: [
-    "Fetch a single page as clean markdown via Firecrawl (keyless — no API key, no signup).",
-    "Use firecrawl_scrape when the local web_fetch fails on a hard target (anti-bot,",
-    "JavaScript-heavy pages, PDFs) or when you need Firecrawl's cloud rendering directly.",
+    "Fallback-only cloud fetch via Firecrawl (keyless — no API key, no signup).",
+    "Do not use firecrawl_scrape as the first attempt for ordinary URL reading; use web_fetch first.",
+    "Use firecrawl_scrape only when web_fetch already failed on a hard target (anti-bot,",
+    "JavaScript-heavy pages, PDFs), or when the user explicitly asks for Firecrawl/cloud rendering.",
     "Privacy: the URL and page content are sent to Firecrawl's cloud.",
     `Output is truncated to ${DEFAULT_MAX_LINES} lines or ${formatSize(DEFAULT_MAX_BYTES)}; if truncated, full output is saved to a temp file.`,
   ].join(" "),
-  promptSnippet: "Fetch a single page via Firecrawl keyless (anti-bot / JS / PDF fallback)",
+  promptSnippet: "Fallback-only Firecrawl scrape",
   promptGuidelines: [
-    "Prefer web_fetch first; reach for firecrawl_scrape when web_fetch fails or you need cloud rendering.",
-    "firecrawl_scrape handles anti-bot protection, JS-heavy SPAs, and PDFs that scrapling may miss.",
-    "Always pass the full URL including https://.",
+    "Use firecrawl_scrape only after web_fetch fails or explicit cloud scraping/rendering.",
+    "Use firecrawl_scrape for anti-bot pages, heavy JS, and PDFs.",
   ],
   parameters: FirecrawlScrapeParamsSchema,
@@ -97,13 +97,6 @@ const firecrawlScrapeTool = defineTool({
   renderResult(result, { expanded, isPartial }, theme, context) {
     const isError = context?.isError ?? false;
-    if (isPartial) {
-      const domain = details?.url ? getDomain(details.url) : "";
-      const label = domain ? `Scraping ${domain} via Firecrawl...` : "Scraping via Firecrawl...";
-      return new Text(theme.fg("warning", label), 0, 0);
-    }
     const details = result.details as {
       url?: string;
       bytes?: number;
@@ -113,6 +106,12 @@ const firecrawlScrapeTool = defineTool({
       creditsUsed?: number;
     } | undefined;
+    if (isPartial) {
+      const domain = details?.url ? getDomain(details.url) : "";
+      const label = domain ? `Scraping ${domain} via Firecrawl...` : "Scraping via Firecrawl...";
+      return new Text(theme.fg("warning", label), 0, 0);
+    }
     if (isError) {
       const errText = getErrorText(result);
       let text = theme.fg("error", "✗ Firecrawl scrape failed");

package/extensions/firecrawl_search.ts CHANGED

@@ -42,17 +42,17 @@ const firecrawlSearchTool = defineTool({
   name: "firecrawl_search",
   label: "Firecrawl Search",
   description: [
-    "Search the web via Firecrawl (keyless — no API key, no signup).",
+    "Fallback-only cloud search via Firecrawl (keyless — no API key, no signup).",
+    "Do not use firecrawl_search as the first attempt for ordinary web discovery; use web_search first.",
     "Supports sources (web/images/news) and categories (github/research/pdf) that",
-    "SearXNG does not. Use as an escape hatch or when web_search returns nothing.",
+    "SearXNG does not. Use only as an escape hatch when web_search fails/returns nothing, or when the user explicitly asks for Firecrawl/cloud search.",
     "Privacy: the query is sent to Firecrawl's cloud.",
     `Output is truncated to ${DEFAULT_MAX_LINES} lines or ${formatSize(DEFAULT_MAX_BYTES)}; if truncated, full output is saved to a temp file.`,
   ].join(" "),
-  promptSnippet: "Search the web via Firecrawl keyless (categories, sources, domain filters)",
+  promptSnippet: "Fallback-only Firecrawl search",
   promptGuidelines: [
-    "Prefer web_search first; reach for firecrawl_search when web_search fails or returns nothing.",
-    "Use categories=[\"github\"], [\"research\"], or [\"pdf\"] for source-type-specific discovery.",
-    "Use includeDomains/excludeDomains to scope results to specific sites.",
+    "Use firecrawl_search only after web_search fails/returns nothing, for Firecrawl-only categories, or explicit cloud search.",
+    "Use categories=[\"github\"|\"research\"|\"pdf\"] and includeDomains/excludeDomains when needed.",
   ],
   parameters: FirecrawlSearchParamsSchema,

package/extensions/index.ts CHANGED

@@ -17,6 +17,26 @@ import registerFirecrawlScrape from "./firecrawl_scrape";
 import registerFirecrawlSearch from "./firecrawl_search";
 import registerFirecrawlInteract from "./firecrawl_interact";
+const WEB_TOOL_ROUTING_POLICY = [
+  "Web tools are local-first: web_search=discover, web_fetch=one static URL, web_batch_fetch=2–5 static URLs, web_browse=interaction.",
+  "Use firecrawl_* only after the matching local tool failed in this conversation, or when the user explicitly asks for Firecrawl/cloud.",
+  "web_search/web_fetch/web_browse already auto-fallback to Firecrawl; pass full URLs with scheme and selectors when useful.",
+].join("\n");
+const WEB_TOOL_NAMES = new Set([
+  "web_search",
+  "web_fetch",
+  "web_browse",
+  "web_batch_fetch",
+  "firecrawl_search",
+  "firecrawl_scrape",
+  "firecrawl_interact",
+]);
+function shouldInjectWebToolRoutingPolicy(selectedTools: readonly string[] | undefined): boolean {
+  return selectedTools?.some((tool) => WEB_TOOL_NAMES.has(tool)) ?? false;
+}
 export default function (pi: ExtensionAPI) {
   registerWebSearch(pi);
   registerWebFetch(pi);
@@ -25,4 +45,9 @@ export default function (pi: ExtensionAPI) {
   registerFirecrawlScrape(pi);
   registerFirecrawlSearch(pi);
   registerFirecrawlInteract(pi);
+  pi.on("before_agent_start", (event) => {
+    if (!shouldInjectWebToolRoutingPolicy(event.systemPromptOptions.selectedTools)) return;
+    return { systemPrompt: `${event.systemPrompt}\n\n${WEB_TOOL_ROUTING_POLICY}` };
+  });
 }
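
To see how the `before_agent_start` hook above behaves, here is a small self-contained sketch. `shouldInject` mirrors the predicate in the hunk. The `AgentStartEvent` shape, the shortened `POLICY` string, and `applyRoutingPolicy` are simplified assumptions for illustration, not the real `ExtensionAPI` types.

```typescript
// Tool names copied from WEB_TOOL_NAMES in the hunk above.
const WEB_TOOL_NAMES = new Set([
  "web_search",
  "web_fetch",
  "web_browse",
  "web_batch_fetch",
  "firecrawl_search",
  "firecrawl_scrape",
  "firecrawl_interact",
]);

// Shortened stand-in for WEB_TOOL_ROUTING_POLICY.
const POLICY = "Web tools are local-first.";

// Assumed event shape: only the two fields the hook reads.
interface AgentStartEvent {
  systemPrompt: string;
  systemPromptOptions: { selectedTools?: readonly string[] };
}

// Same logic as shouldInjectWebToolRoutingPolicy: true if any web tool is selected.
function shouldInject(selectedTools: readonly string[] | undefined): boolean {
  return selectedTools?.some((tool) => WEB_TOOL_NAMES.has(tool)) ?? false;
}

// Hypothetical helper returning the prompt the agent would start with.
function applyRoutingPolicy(event: AgentStartEvent): string {
  if (!shouldInject(event.systemPromptOptions.selectedTools)) return event.systemPrompt;
  return `${event.systemPrompt}\n\n${POLICY}`;
}
```

Because the policy is appended once per agent start instead of being repeated in every tool's `promptGuidelines`, sessions that select no web tools pay no prompt overhead at all.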

package/extensions/web_batch_fetch.ts CHANGED

@@ -113,14 +113,10 @@ const webBatchFetchTool = defineTool({
     "For a single page, use web_fetch instead.",
     `Output is truncated to ${DEFAULT_MAX_LINES} lines or ${formatSize(DEFAULT_MAX_BYTES)}; if truncated, full output is saved to a temp file.`,
   ].join(" "),
-  promptSnippet: "Fetch multiple URLs in parallel for research",
+  promptSnippet: "Parallel fetch for 2–5 URLs",
   promptGuidelines: [
-    "Use web_batch_fetch when web_search returns multiple (2–5) relevant pages and the agent needs to read them all at once.",
-    "Prefer web_batch_fetch over repeated web_fetch calls when reading multiple pages for comparison or synthesis.",
-    "Use web_batch_fetch for cross-referencing sources, comparing implementations, or synthesizing research from multiple sites.",
-    "For a single URL, always use web_fetch — it supports per-URL selectors and stealthy mode.",
-    "If a page in the batch fails, the tool reports the error but continues with the others.",
-    "Keep batch sizes reasonable (≤8) to avoid overwhelming the browser and token budget.",
+    "Use web_batch_fetch for 2–5 pages to compare/cross-reference/synthesize; single URL → web_fetch.",
+    "Keep batches small (≤8; schema max 15); failed pages are reported without stopping the batch.",
   ],
   parameters: WebBatchFetchParamsSchema,

package/extensions/web_browse.ts CHANGED

@@ -106,22 +106,18 @@ const webBrowseTool = defineTool({
   name: "web_browse",
   label: "Web Browse",
   description: [
-    "Interact with a web page through a browser: navigate, click, fill forms, scroll,",
+    "Primary local-first tool for interactive web pages: navigate, click, fill forms, scroll,",
     "wait for content, and then extract text.",
-    "Uses the agent-browser CLI with batched JSON commands.",
+    "Uses the agent-browser CLI with batched JSON commands, then automatically tries Firecrawl keyless only if local browser automation fails.",
     "Use web_browse when the target content requires interaction (clicking buttons,",
     "scrolling, filling search boxes, waiting for JS to load) before it becomes available.",
     "For pages that need no interaction, use web_fetch instead.",
     `Output is truncated to ${DEFAULT_MAX_LINES} lines or ${formatSize(DEFAULT_MAX_BYTES)}; if truncated, full output is saved to a temp file.`,
   ].join(" "),
-  promptSnippet: "Interact with a web page (click, scroll, fill) and extract content",
+  promptSnippet: "Local browser interaction and extraction",
   promptGuidelines: [
-    "Use web_browse when a page requires clicking, scrolling, or form submission before showing target content.",
-    "Use web_browse for SPAs, pagination (click 'Load more'), search forms, tab switching, and modal dialogs.",
-    "For static articles, docs, or blogs that load everything on first request, prefer web_fetch.",
-    "After web_search returns results, prefer web_fetch for reading individual articles.",
-    "Use web_browse directly when interaction is required; otherwise try web_fetch first.",
-    "Always provide a selector to extract only the relevant content area — avoid dumping full page text.",
+    "Use web_browse only when clicks/forms/scroll/wait are needed; otherwise use web_fetch.",
+    "Provide a selector to narrow extracted content when possible.",
   ],
   parameters: WebBrowseParamsSchema,

package/extensions/web_fetch.ts CHANGED

@@ -40,19 +40,15 @@ const webFetchTool = defineTool({
   name: "web_fetch",
   label: "Web Fetch",
   description: [
-    "Fetch and extract readable content from a web page URL.",
-    "Uses scrapling to download the page and convert it to clean markdown.",
-    "Use web_fetch to read the full content of a specific result or user-provided URL.",
+    "Primary local-first tool for reading a single web page URL.",
+    "Fetches and extracts readable content via scrapling, then automatically tries Firecrawl keyless only if the local fetcher fails.",
+    "Use web_fetch as the first attempt to read the full content of a specific result or user-provided URL.",
     "Callers remain responsible for robots.txt and site terms; Scrapling extract commands do not enforce them automatically.",
     `Output is truncated to ${DEFAULT_MAX_LINES} lines or ${formatSize(DEFAULT_MAX_BYTES)}; if truncated, full output is saved to a temp file.`,
   ].join(" "),
-  promptSnippet: "Fetch full page content from a URL as markdown",
+  promptSnippet: "Local-first fetch of one URL as markdown",
   promptGuidelines: [
-    "Use web_fetch to read a single page (article, doc, or blog) that needs no interaction.",
-    "For a single URL, always use web_fetch instead of web_batch_fetch.",
-    "If the page is dynamic/JavaScript-heavy, the tool automatically uses browser automation.",
-    "When reading multiple (2–5) pages at once (e.g., after web_search), prefer web_batch_fetch over repeated web_fetch calls.",
-    "Always pass the full URL including https://.",
+    "Use web_fetch for one non-interactive URL; use web_batch_fetch for 2–5 URLs.",
   ],
   parameters: WebFetchParamsSchema,

package/extensions/web_search.ts CHANGED

@@ -51,19 +51,18 @@ const webSearchTool = defineTool({
   name: "web_search",
   label: "Web Search",
   description: [
-    "Search the web using a SearXNG instance.",
+    "Primary local-first tool for web discovery via a SearXNG instance.",
     "Returns a list of results with title, URL, and snippet.",
     "Automatically aggregates up to 3 pages of SearXNG results when more than ~20 are needed.",
+    "Use web_search as the first attempt for web search; it automatically tries Firecrawl keyless only if SearXNG fails or returns nothing.",
     "Use web_search when the user asks about current events, facts, or anything",
     "that requires up-to-date information beyond the model's training data.",
     `Output is truncated to ${DEFAULT_MAX_LINES} lines or ${formatSize(DEFAULT_MAX_BYTES)}; if truncated, full output is saved to a temp file.`,
   ].join(" "),
-  promptSnippet: "Search the web for current information",
+  promptSnippet: "Local web search via SearXNG",
   promptGuidelines: [
-    "Use web_search when the user asks about recent events, current data, or external facts.",
-    "Use web_search to verify claims, find documentation, or discover resources online.",
-    "If web_search returns no results but includes suggestions, consider using a suggested query to refine your search.",
-    "If web_search returns multiple (2–5) relevant results that all need to be read, prefer web_batch_fetch to fetch them in parallel instead of calling web_fetch repeatedly.",
+    "Use web_search for current/external facts, verification, docs, and discovery.",
+    "If 2–5 results need reading, use web_batch_fetch; retry suggested queries when results are empty.",
   ],
   parameters: WebSearchParamsSchema,

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "pi-web-toolkit",
-  "version": "0.3.1",
+  "version": "0.3.2",
   "description": "Web research toolkit for the pi coding agent. Search via SearXNG, fetch pages with scrapling, browse interactively via agent-browser, batch-read sources in parallel, and optionally fall back to Firecrawl Keyless (no API key) when a local backend fails.",
   "author": "Wade Huang <fastwade11@gmail.com>",
   "license": "MIT",
@@ -19,7 +19,7 @@
   },
   "scripts": {
     "typecheck": "tsc --noEmit",
-    "test": "tsx test/content-preview/test.ts && tsx test/agent-browser/test.ts && tsx test/firecrawl/test.ts",
+    "test": "tsx test/content-preview/test.ts && tsx test/agent-browser/test.ts && tsx test/firecrawl/test.ts && tsx test/tool-routing/test.ts",
     "test:agent-browser": "tsx test/agent-browser/test.ts",
     "test:firecrawl": "tsx test/firecrawl/test.ts",
     "test:approve": "tsx test/content-preview/test.ts --approve"