@agent-sh/harness-websearch 0.3.0 → 0.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # @agent-sh/harness-websearch
2
2
 
3
- Web search for AI agent harnesses, backed by a self-hosted SearXNG instance — tool-layer SSRF defense, declarative provider-neutral query controls, a result-count cap, and a discriminated result surface.
3
+ Web search for AI agent harnesses **works with no API key and no setup**. With nothing configured it queries a bundled keyless fallback chain (Mojeek → Marginalia → Wikipedia) and returns the first backend that has results. Optionally upgrade to Brave/Tavily (API key) or a self-hosted SearXNG for higher coverage same tool, same output. Tool-layer SSRF defense, declarative provider-neutral query controls, a result-count cap, engine provenance, and a discriminated result surface.
4
4
 
5
5
  Part of the [`@agent-sh/harness-*`](https://github.com/avifenesh/tools) monorepo — see the top-level README for architectural context and the full tool surface. WebSearch finds URLs; [`webfetch`](../webfetch) reads them. They compose.
6
6
 
@@ -14,18 +14,39 @@ Requires Node ≥ 20.
14
14
 
15
15
  ## Usage
16
16
 
17
+ Zero-config — keyless, just works:
18
+
17
19
  ```ts
18
20
  import { websearch } from "@agent-sh/harness-websearch";
19
21
 
20
22
  const session = {
21
23
  permissions: { roots: [], sensitivePatterns: [], unsafeAllowSearchWithoutHook: true },
22
- searxngUrl: "http://127.0.0.1:8888",
23
- allowLoopback: true, // self-hosted SearXNG is usually on localhost
24
24
  };
25
25
 
26
26
  const r = await websearch({ query: "rust async runtime benchmarks", count: 5 }, session);
27
+ // r.meta.engine tells you which backend served the results (e.g. "mojeek").
28
+ ```
29
+
30
+ Upgrade to a reliable keyed provider, or a self-hosted SearXNG:
31
+
32
+ ```ts
33
+ // Brave (recommended; free tier at api-dashboard.search.brave.com):
34
+ const braveSession = { ...session, braveApiKey: process.env.BRAVE_API_KEY };
35
+
36
+ // Self-hosted SearXNG (loopback opt-in since it's usually on localhost):
37
+ const searxngSession = { ...session, searxngUrl: "http://127.0.0.1:8888", allowLoopback: true };
38
+
39
+ // An explicit backend is exclusive by default; opt into the keyless tail as a backstop:
40
+ const withFallback = { ...braveSession, fallbackToKeyless: true };
27
41
  ```
28
42
 
43
+ Notes:
44
+ - `disableMojeek: true` drops the Mojeek scrape engine (its robots.txt disallows `/search`; the documented Marginalia/Wikipedia APIs remain).
45
+ - `snippetCap` tunes the per-result snippet length (default 240 chars; clamped 80–600) to trade detail for tokens.
46
+ - When the leading backend returns fewer than `count`, the tool **gathers and merges** results across the chain (de-duplicating the same page by normalized URL) until the quota is met; a backend that already satisfies `count` short-circuits. Merged results are ranked by **Reciprocal Rank Fusion + engine weights** (a page multiple engines agree on is boosted; a niche/encyclopedic backstop won't outrank broad web), name the contributing engines in the header, and tag each result with its `source`.
47
+ - Output is compact ranked text: a `WEB "query" · engine (class) · N results` header then numbered entries; the engine's coverage class (`general web` / `indie/small-web index` / `encyclopedic`) and per-result `age` (when the backend provides it) are shown so the model can judge source quality and freshness. Parse `meta`/`results`, not the `output` string.
48
+ - Zero hits is a normal `kind: "empty"` result, not an error.
49
+
29
50
  ## Contract
30
51
 
31
52
  The full contract — input shape, output discriminated-union, error codes, permission model, and acceptance tests — lives in [`agent-knowledge/design/websearch.md`](https://github.com/avifenesh/tools/blob/main/agent-knowledge/design/websearch.md). Changes to this package must stay in sync with that spec.