pi-web-providers 0.3.0 → 1.0.0

Files changed (3)
  1. package/README.md +158 -223
  2. package/dist/index.js +5545 -1856
  3. package/package.json +3 -3
package/README.md CHANGED
@@ -1,43 +1,23 @@
1
1
  # 🌍 pi-web-providers
2
2
 
3
- A _meta_ web extension for [pi](https://pi.dev).
3
+ A _meta_ web extension for [pi](https://pi.dev) that routes search, content
4
+ extraction, answers, and research through configurable per-tool providers.
4
5
 
5
6
  ## Why?
6
7
 
7
- Most web extensions hard-wire a single search-and-fetch pipeline. That works
8
- until you want to swap providers, compare results, or use a capability—like deep
9
- research—that only one backend offers.
10
-
11
- **pi-web-providers** takes a different approach: it doesn't do web work itself.
12
- Instead it dispatches every request to a **configurable set of providers**,
13
- giving you maximum flexibility and choice when it comes to consuming web results.
14
-
15
- The tool surface is **capability-based, not static**. At startup the extension
16
- inspects which providers are available and what each one supports, then registers
17
- only the tools that make sense. If your active provider offers search and
18
- content extraction but not deep research, the agent never sees a research tool.
19
- Switch to a provider that supports it and the tool appears automatically.
20
-
21
- The extension also separates **available tools** from the **active tool set**.
22
- When a session starts, it can add every available managed tool. Before each
23
- agent run, it removes tools that are no longer available but keeps any managed
24
- tools that you explicitly removed from the active set disabled. That keeps the
25
- tool prompt aligned with the tools that the agent can actually call.
8
+ Most web extensions hard-wire a single backend. **pi-web-providers** lets you
9
+ mix and match providers per tool instead, so `web_search`, `web_contents`,
10
+ `web_answer`, and `web_research` can each use a different backend or be turned
11
+ off entirely.
26
12
 
27
13
  ## ✨ Features
28
14
 
29
- - **Provider-driven tool surface** — tools are injected based on what the active
30
- provider actually supports, not a fixed list
31
- - **Multiple providers**: Claude, Codex, Exa, Gemini, Perplexity, Parallel,
32
- Valyu — each with
33
- its own SDK, strengths, and capability set
34
- - **One config command** (`/web-providers`) with a TUI that adapts to the
35
- selected provider
36
- - **Transparent fallback** — search falls back to Codex when no provider is
37
- explicitly enabled and the local CLI is installed and authenticated
38
- - **Per-provider tool toggles** — disable individual capabilities you don't need
39
- without switching providers
40
- - **Truncated output with temp-file spillover** for large results
15
+ - **Multiple providers** — Claude, Codex, Exa, Gemini, Perplexity, Parallel,
16
+ Valyu
17
+ - **Batched search and answers** — run several related queries in a single
18
+ `web_search` or `web_answer` call and get grouped results back in one response
19
+ - **Async contents prefetch** — optionally start background `web_contents`
20
+ extraction from `web_search` results and reuse the cached pages later
41
21
 
42
22
  ## 📦 Install
43
23
 
@@ -53,265 +33,220 @@ Run:
53
33
  /web-providers
54
34
  ```
55
35
 
56
- This command edits a single global config file:
57
- `~/.pi/agent/web-providers.json`.
36
+ This edits the global config file `~/.pi/agent/web-providers.json`. The
37
+ settings UI mirrors the three sections below: tools, providers, and generic
38
+ settings.
39
+
40
+ Each tool can be routed to any compatible provider:
41
+
42
+ | Provider | search | contents | answer | research | Auth |
43
+ | -------------- | :----: | :------: | :----: | :------: | ---------------------- |
44
+ | **Claude** | ✔ | | ✔ | | Local Claude Code auth |
45
+ | **Codex** | ✔ | | | | Local Codex CLI auth |
46
+ | **Exa** | ✔ | ✔ | ✔ | ✔ | `EXA_API_KEY` |
47
+ | **Gemini** | ✔ | | ✔ | ✔ | `GOOGLE_API_KEY` |
48
+ | **Perplexity** | ✔ | | ✔ | ✔ | `PERPLEXITY_API_KEY` |
49
+ | **Parallel** | ✔ | ✔ | | | `PARALLEL_API_KEY` |
50
+ | **Valyu** | ✔ | ✔ | ✔ | ✔ | `VALYU_API_KEY` |
51
+
52
+ See [`example-config.json`](example-config.json) for a full default
53
+ configuration.
58
54
 
59
- The flow is provider-first: pick the active provider, then configure only that
60
- provider's tool toggles and settings. Each provider view surfaces the knobs that
61
- actually apply—Claude shows model/effort/turns settings; Codex shows
62
- reasoning-effort and web-search-mode toggles; Exa shows search type and
63
- text-content flags; and so on.
55
+ ### Tools
64
56
 
65
- ## 🔧 Tools
57
+ Each managed tool maps to one provider id or `null` for off under the top-level
58
+ `tools` key. A tool is only exposed when it is mapped to a compatible provider
59
+ and that provider is currently available. Tool-specific settings live under
60
+ `toolSettings`; today this covers `toolSettings.search.prefetch`.
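The exposure rule described above can be sketched in TypeScript. This is only an illustration under stated assumptions, not the extension's code: the capability key names (`search`, `contents`, `answer`, `research`), the `ToolMapping` type, and the `exposedTools` helper are hypothetical, while the capability matrix is copied from the provider table in the new README.

```typescript
// Hypothetical types for illustration; the real config keys may differ.
type Capability = "search" | "contents" | "answer" | "research";
type ProviderId =
  | "claude" | "codex" | "exa" | "gemini"
  | "perplexity" | "parallel" | "valyu";

// Capability matrix copied from the provider table in the 1.0.0 README.
const CAPABILITIES: Record<ProviderId, Capability[]> = {
  claude: ["search", "answer"],
  codex: ["search"],
  exa: ["search", "contents", "answer", "research"],
  gemini: ["search", "answer", "research"],
  perplexity: ["search", "answer", "research"],
  parallel: ["search", "contents"],
  valyu: ["search", "contents", "answer", "research"],
};

// Each tool maps to one provider id, or null when the tool is off.
type ToolMapping = Record<Capability, ProviderId | null>;

// A tool is exposed only when it is mapped to a compatible provider
// and that provider is currently available.
function exposedTools(tools: ToolMapping, available: ProviderId[]): Capability[] {
  const order: Capability[] = ["search", "contents", "answer", "research"];
  return order.filter((capability) => {
    const provider = tools[capability];
    return (
      provider !== null &&
      available.indexOf(provider) !== -1 &&
      CAPABILITIES[provider].indexOf(capability) !== -1
    );
  });
}

// Example: Exa is mapped for contents but is not available, research is off.
const exampleMapping: ToolMapping = {
  search: "codex",
  contents: "exa",
  answer: "claude",
  research: null,
};
const exampleExposed = exposedTools(exampleMapping, ["codex", "claude"]);
// exampleExposed is ["search", "answer"]
```

In this sketch an unavailable or incompatible provider simply hides the tool instead of raising an error, which matches the README's statement that a tool is "only exposed" under those two conditions.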
66
61
 
67
- Which of the tools below are registered depends on the capabilities of the
68
- available providers. If no provider supports a given capability, the
69
- corresponding tool is never exposed to the agent.
62
+ #### `web_search`
70
63
 
71
- ### `web_search`
64
+ Find likely sources on the public web for up to 10 queries in a single call
65
+ and return titles, URLs, and snippets grouped by query.
72
66
 
73
- Find likely sources on the public web and return titles, URLs, and snippets.
67
+ | Parameter | Type | Default | Description |
68
+ | ------------ | -------- | -------- | -------------------------------------------------------------------- |
69
+ | `queries` | string[] | required | One or more search queries to run (max 10) |
70
+ | `maxResults` | integer | `5` | Result count per query, clamped to `1–20` |
71
+ | `options` | object | — | Provider-specific search options plus local `prefetch` orchestration |
74
72
 
75
- | Parameter | Type | Default | Description |
76
- | ------------ | ------- | -------- | ----------------------------------------------------------------------------- |
77
- | `query` | string | required | What to search for |
78
- | `maxResults` | integer | `5` | Result count, clamped to `1–20` |
79
- | `options` | object | — | Provider-specific search options |
80
- | `provider` | string | auto | Optional override: `claude`, `codex`, `exa`, `gemini`, `perplexity`, `parallel`, or `valyu` |
73
+ `web_search.options.prefetch` is local-only and not forwarded into the provider
74
+ SDK. It accepts `provider`, `maxUrls`, `ttlMs`, and `contentsOptions`, and
75
+ starts a background page-extraction workflow only when `prefetch.provider` is
76
+ set. `/web-providers` can also persist default search prefetch settings under
77
+ `toolSettings.search.prefetch`.
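To make the `web_search` input rules concrete, here is a hedged TypeScript sketch of how such an input could be normalized. The field names (`queries`, `maxResults`, `options.prefetch` with `provider`, `maxUrls`, `ttlMs`, `contentsOptions`) come from the README; the `normalizeSearchInput` helper, the interface names, and the choice to reject (rather than truncate) more than 10 queries are assumptions for illustration.

```typescript
// Field names follow the README; the interfaces themselves are hypothetical.
interface PrefetchOptions {
  provider?: string; // prefetch only starts when this is set
  maxUrls?: number;
  ttlMs?: number;
  contentsOptions?: Record<string, unknown>;
}

interface SearchOptions {
  prefetch?: PrefetchOptions;
  [key: string]: unknown; // provider-specific options
}

interface WebSearchInput {
  queries: string[];
  maxResults?: number;
  options?: SearchOptions;
}

function normalizeSearchInput(input: WebSearchInput) {
  // Assumption: out-of-range query counts are rejected.
  if (input.queries.length < 1 || input.queries.length > 10) {
    throw new Error("queries must contain between 1 and 10 entries");
  }
  // Default of 5, clamped to the documented 1..20 range.
  const maxResults = Math.min(20, Math.max(1, input.maxResults ?? 5));

  // `prefetch` is local-only, so it is split off before anything
  // would be forwarded to a provider SDK.
  const options: SearchOptions = input.options ?? {};
  const { prefetch, ...providerOptions } = options;

  return {
    queries: input.queries,
    maxResults,
    providerOptions,
    prefetch: prefetch && prefetch.provider ? prefetch : undefined,
  };
}

// Example call with an illustrative provider option and prefetch settings.
const normalizedSearch = normalizeSearchInput({
  queries: ["pi coding agent", "pi web providers"],
  maxResults: 50,
  options: { country: "US", prefetch: { provider: "exa", maxUrls: 3 } },
});
// normalizedSearch.maxResults is 20, and providerOptions no longer has `prefetch`.
```

Separating `prefetch` from the remaining options keeps the provider call free of fields the SDK would not recognize.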
81
78
 
82
- ### `web_contents`
79
+ #### `web_contents`
83
80
 
84
81
  Read and extract the main contents of one or more web pages.
85
82
 
86
- | Parameter | Type | Default | Description |
87
- | ---------- | -------- | -------- | ------------------------------------------------------- |
88
- | `urls` | string[] | required | One or more URLs to extract |
89
- | `options` | object | — | Provider-specific extraction options |
90
- | `provider` | string | auto | Optional override among providers that support contents |
83
+ | Parameter | Type | Default | Description |
84
+ | --------- | -------- | -------- | ------------------------------------ |
85
+ | `urls` | string[] | required | One or more URLs to extract |
86
+ | `options` | object | — | Provider-specific extraction options |
91
87
 
92
- ### `web_answer`
88
+ `web_contents` reuses any matching cached pages already present in the local
89
+ content store—whether they came from prefetch or an earlier read—and only
90
+ fetches missing or stale URLs.
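The cache-reuse behavior described for `web_contents` can be pictured with a small TypeScript sketch. Everything here is hypothetical: the `CachedPage` shape, the in-memory `Record` store, and the `planContentsFetch` helper are assumptions, and the real content store may track freshness differently.

```typescript
// Hypothetical cache entry; the real store's shape is not documented.
interface CachedPage {
  url: string;
  text: string;
  fetchedAt: number; // epoch milliseconds
  ttlMs: number;
}

// Split requested URLs into those served from cache and those that
// still need a provider fetch because they are missing or stale.
function planContentsFetch(
  urls: string[],
  cache: Record<string, CachedPage | undefined>,
  now: number,
): { cached: CachedPage[]; toFetch: string[] } {
  const cached: CachedPage[] = [];
  const toFetch: string[] = [];
  for (const url of urls) {
    const entry = cache[url];
    if (entry !== undefined && now - entry.fetchedAt < entry.ttlMs) {
      cached.push(entry);
    } else {
      toFetch.push(url);
    }
  }
  return { cached, toFetch };
}

// Example: one fresh page, one stale page, one page never fetched.
const examplePlan = planContentsFetch(
  ["https://a.example", "https://b.example", "https://c.example"],
  {
    "https://a.example": { url: "https://a.example", text: "A", fetchedAt: 1000, ttlMs: 5000 },
    "https://b.example": { url: "https://b.example", text: "B", fetchedAt: 0, ttlMs: 1000 },
  },
  2000,
);
// examplePlan.cached holds the page for a.example;
// examplePlan.toFetch is ["https://b.example", "https://c.example"].
```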
93
91
 
94
- Answer a question using web-grounded evidence.
92
+ #### `web_answer`
95
93
 
96
- | Parameter | Type | Default | Description |
97
- | ---------- | ------ | -------- | ------------------------------------------------------ |
98
- | `query` | string | required | Question to answer |
99
- | `options` | object | — | Provider-specific answer options |
100
- | `provider` | string | auto | Optional override among providers that support answers |
94
+ Answer one or more questions using web-grounded evidence.
101
95
 
102
- ### `web_research`
96
+ | Parameter | Type | Default | Description |
97
+ | --------- | -------- | -------- | ---------------------------------------------------- |
98
+ | `queries` | string[] | required | One or more questions to answer in one call (max 10) |
99
+ | `options` | object | — | Provider-specific options |
100
+
101
+ Responses are grouped into per-question sections when more than one question is provided.
102
+
103
+ #### `web_research`
103
104
 
104
105
  Investigate a topic across web sources and produce a longer report.
105
106
 
106
- | Parameter | Type | Default | Description |
107
- | ---------- | ------ | -------- | ------------------------------------------------------- |
108
- | `input` | string | required | Research brief or question |
109
- | `options` | object | — | Provider-specific research options |
110
- | `provider` | string | auto | Optional override among providers that support research |
107
+ | Parameter | Type | Default | Description |
108
+ | --------- | ------ | -------- | -------------------------- |
109
+ | `input` | string | required | Research brief or question |
110
+ | `options` | object | — | Provider-specific options |
111
+
112
+ `options` are provider-native and provider-specific. Equivalent concepts can use
113
+ different field names across SDKs—for example Perplexity uses `country`, Exa
114
+ uses `userLocation`, and Valyu uses `countryCode`. Runtime `options` override
115
+ provider-native config, but managed tool inputs and tool wiring stay fixed.
116
+
117
+ <details>
118
+ <summary><strong>Timeout, retry, and delivery modes</strong></summary>
119
+
120
+ The extension accepts local control fields for robustness: `requestTimeoutMs`,
121
+ `retryCount`, and `retryDelayMs` on request/response tools, plus
122
+ `pollIntervalMs`, `timeoutMs`, `maxConsecutivePollErrors`, and `resumeId` on
123
+ `web_research` for lifecycle-based research providers. These fields are handled
124
+ by the extension and are not forwarded into the provider SDK call.
125
+
126
+ - Exa and Valyu research support polling, overall deadlines, and resume IDs
127
+ but reject `requestTimeoutMs` and do not retry non-idempotent job creation.
128
+ - Perplexity research runs in streaming foreground mode and only supports
129
+ `requestTimeoutMs`, `retryCount`, and `retryDelayMs`.
111
130
 
112
- `options` are provider-native and provider-specific. Equivalent concepts can
113
- use different field names across SDKs, for example Perplexity uses `country`,
114
- Exa uses `userLocation`, and Valyu uses `countryCode`. Runtime `options`
115
- override provider defaults, but managed tool inputs and tool wiring stay fixed.
131
+ Providers deliver results in one of three modes:
116
132
 
117
- ## 🔌 Providers
133
+ - **Silent foreground** — no intermediate output; result returned when done.
134
+ - **Streaming foreground** — progress updates while running, but the result is
135
+ still only usable after the tool finishes.
136
+ - **Background research** — the provider runs in the background; if
137
+ interrupted, the run can be resumed later via `resumeId`.
118
138
 
119
- Every provider is a thin adapter around an official SDK. The table below
120
- summarises which capabilities each provider exposes:
139
+ </details>
121
140
 
122
- | Provider | search | contents | answer | research | Auth |
123
- | ------------ | :----: | :------: | :----: | :------: | ---------------------- |
124
- | **Claude** | ✓ | | ✓ | | Local Claude Code auth |
124
- | **Codex** | ✓ | | | | Local Codex CLI auth |
125
- | **Exa** | ✓ | ✓ | ✓ | ✓ | `EXA_API_KEY` |
126
- | **Gemini** | ✓ | ✓ | ✓ | ✓ | `GOOGLE_API_KEY` |
127
- | **Perplexity** | ✓ | | ✓ | ✓ | `PERPLEXITY_API_KEY` |
128
- | **Parallel** | ✓ | ✓ | | | `PARALLEL_API_KEY` |
129
- | **Valyu** | ✓ | ✓ | ✓ | ✓ | `VALYU_API_KEY` |
141
+ ### Providers
131
142
 
132
- ### Claude
143
+ Every provider is a thin adapter around an official SDK. Each provider has an
144
+ `enabled` toggle that controls whether it is eligible for tool mappings.
145
+ Provider config is split into `native` settings (forwarded to the SDK) and
146
+ `policy` settings (local overrides that take precedence over generic settings);
147
+ legacy `defaults` blocks are still accepted when reading. Secret-like values
148
+ can be literal strings, environment variable names (e.g., `EXA_API_KEY`), or
149
+ shell commands prefixed with `!`.
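The three accepted forms of secret-like values suggest a resolution step along the following lines. This TypeScript sketch is an assumption-heavy illustration: the `resolveSecret` helper, its resolution order (shell command first, then environment lookup, then literal), and the injected `env` and `runCommand` parameters are all hypothetical, and the extension may distinguish variable names from literals differently.

```typescript
// Environment and command runner are injected so the sketch stays pure.
type Env = Record<string, string | undefined>;
type RunCommand = (command: string) => string;

function resolveSecret(value: string, env: Env, runCommand: RunCommand): string {
  // A leading "!" marks a shell command whose output is the secret.
  if (value.charAt(0) === "!") {
    return runCommand(value.slice(1)).trim();
  }
  // Assumption: a value that names a set environment variable is
  // replaced by that variable's content.
  const fromEnv = env[value];
  if (fromEnv !== undefined) {
    return fromEnv;
  }
  // Otherwise the value is treated as a literal string.
  return value;
}

// Example with a fake environment and a fake command runner.
const fakeEnv: Env = { EXA_API_KEY: "key-from-env" };
const fakeRun: RunCommand = (command) => "output of " + command + "\n";

const secretFromEnv = resolveSecret("EXA_API_KEY", fakeEnv, fakeRun);
const secretFromCommand = resolveSecret("!print-my-key", fakeEnv, fakeRun);
const secretLiteral = resolveSecret("literal-value", fakeEnv, fakeRun);
```

In a Node.js process the same function could be given `process.env` and a thin wrapper around `child_process.execSync`; injecting them keeps the resolution logic easy to test.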
150
+
151
+ <details>
152
+ <summary><strong>Claude</strong></summary>
133
153
 
134
154
  - SDK: `@anthropic-ai/claude-agent-sdk`
135
155
  - Uses Claude Code's built-in `WebSearch` and `WebFetch` tools behind a
136
156
  structured JSON adapter
157
+ - Runs in **silent foreground** mode
137
158
  - Supports request-shaping `options` such as `model`, `thinking`, `effort`, and
138
159
  `maxTurns`
139
160
  - Great for search plus grounded answers if you already use Claude Code locally
140
161
 
141
- ### Codex
162
+ </details>
163
+
164
+ <details>
165
+ <summary><strong>Codex</strong></summary>
142
166
 
143
167
  - SDK: `@openai/codex-sdk`
144
168
  - Runs in read-only mode with web search enabled
169
+ - Runs in **silent foreground** mode
145
170
  - Supports request-shaping `web_search.options` such as `model`,
146
171
  `modelReasoningEffort`, and `webSearchMode`
147
172
  - Best if you already use the local Codex CLI and auth flow
148
173
 
149
- ### Exa
174
+ </details>
175
+
176
+ <details>
177
+ <summary><strong>Exa</strong></summary>
150
178
 
151
179
  - SDK: `exa-js`
180
+ - Search, contents, and answer run in **silent foreground** mode
181
+ - Research runs in **background research** mode and supports `resumeId`
152
182
  - Neural, keyword, hybrid, and deep-research search modes
153
183
  - Inline text-content extraction on search results
154
184
 
155
- ### Gemini
185
+ </details>
186
+
187
+ <details>
188
+ <summary><strong>Gemini</strong></summary>
156
189
 
157
190
  - SDK: `@google/genai`
158
- - Google Search grounding for answers and URL Context extraction for page contents
191
+ - Search and answer run in **silent foreground** mode
192
+ - Research runs in **background research** mode and supports `resumeId`
193
+ - Google Search grounding for answers
159
194
  - Deep-research agents via Google's Gemini API
160
195
  - Supports provider-native request options such as `model`, `config`,
161
196
  `generation_config`, and `agent_config` depending on the tool
162
197
 
163
- ### Perplexity
198
+ </details>
199
+
200
+ <details>
201
+ <summary><strong>Perplexity</strong></summary>
164
202
 
165
203
  - SDK: `@perplexity-ai/perplexity_ai`
204
+ - `web_search` and `web_answer` run in **silent foreground** mode
205
+ - `web_research` runs in **streaming foreground** mode (no `resumeId` support)
166
206
  - Uses Perplexity Search for `web_search`
167
207
  - Uses Sonar for `web_answer` and `sonar-deep-research` for `web_research`
168
208
  - Supports provider-specific `web_search.options` such as `country`,
169
209
  `search_mode`, `search_domain_filter`, and `search_recency_filter`
170
210
 
171
- ### Parallel
211
+ </details>
212
+
213
+ <details>
214
+ <summary><strong>Parallel</strong></summary>
172
215
 
173
216
  - SDK: `parallel-web`
217
+ - Runs in **silent foreground** mode
174
218
  - Agentic and one-shot search modes
175
219
  - Page content extraction with excerpt and full-content toggles
176
220
  - Supports provider-native search and extraction options from the Parallel SDK
177
221
 
178
- ### Valyu
222
+ </details>
223
+
224
+ <details>
225
+ <summary><strong>Valyu</strong></summary>
179
226
 
180
227
  - SDK: `valyu-js`
228
+ - Search, contents, and answer run in **silent foreground** mode
229
+ - Research runs in **background research** mode and supports `resumeId`
181
230
  - Web, proprietary, and news search types
182
231
  - Supports provider-native options such as `countryCode`, `responseLength`, and
183
232
  search/source filters
184
233
  - Configurable response length for answers and research
185
234
 
186
- ## 📝 Config Notes
187
-
188
- - `/web-providers` keeps exactly one provider active by writing `enabled: true`
189
- for the selected provider and `enabled: false` for the others
190
- - Each provider can also enable or disable its individual tools through a `tools`
191
- block
192
- - Managed tools are registered from available provider capabilities, but the
193
- active tool set can still be narrower if you removed a tool from the session
194
- - If no provider is explicitly enabled for search, the extension falls back to
195
- Codex when the local CLI is installed and authenticated, unless Codex was
196
- explicitly configured as disabled
197
- - Tools stay inactive when no provider is available for their capability, so
198
- they are not injected into the LLM prompt
199
- - Before each agent run, the extension removes newly unavailable managed tools
200
- and keeps manually pruned managed tools inactive instead of re-adding them
201
- - Secret-like values can be:
202
- - literal strings
203
- - environment variable names such as `EXA_API_KEY`
204
- - shell commands prefixed with `!`
205
-
206
- Example:
207
-
208
- ```json
209
- {
210
- "version": 1,
211
- "providers": {
212
- "claude": {
213
- "enabled": false,
214
- "tools": {
215
- "search": true,
216
- "answer": true
217
- }
218
- },
219
- "codex": {
220
- "enabled": true,
221
- "tools": {
222
- "search": true
223
- },
224
- "defaults": {
225
- "webSearchMode": "live",
226
- "networkAccessEnabled": true
227
- }
228
- },
229
- "exa": {
230
- "enabled": false,
231
- "tools": {
232
- "search": true,
233
- "contents": true,
234
- "answer": true,
235
- "research": true
236
- },
237
- "apiKey": "EXA_API_KEY",
238
- "defaults": {
239
- "type": "auto",
240
- "contents": {
241
- "text": true
242
- }
243
- }
244
- },
245
- "gemini": {
246
- "enabled": false,
247
- "tools": {
248
- "search": true,
249
- "contents": true,
250
- "answer": true,
251
- "research": true
252
- },
253
- "apiKey": "GOOGLE_API_KEY",
254
- "defaults": {
255
- "searchModel": "gemini-2.5-flash",
256
- "contentsModel": "gemini-2.5-flash",
257
- "answerModel": "gemini-2.5-flash",
258
- "researchAgent": "deep-research-pro-preview-12-2025"
259
- }
260
- },
261
- "perplexity": {
262
- "enabled": false,
263
- "tools": {
264
- "search": true,
265
- "answer": true,
266
- "research": true
267
- },
268
- "apiKey": "PERPLEXITY_API_KEY",
269
- "defaults": {
270
- "search": {
271
- "country": "US"
272
- },
273
- "answer": {
274
- "model": "sonar"
275
- },
276
- "research": {
277
- "model": "sonar-deep-research"
278
- }
279
- }
280
- },
281
- "parallel": {
282
- "enabled": false,
283
- "tools": {
284
- "search": true,
285
- "contents": true
286
- },
287
- "apiKey": "PARALLEL_API_KEY",
288
- "defaults": {
289
- "search": {
290
- "mode": "agentic"
291
- },
292
- "extract": {
293
- "excerpts": true,
294
- "full_content": false
295
- }
296
- }
297
- },
298
- "valyu": {
299
- "enabled": false,
300
- "tools": {
301
- "search": true,
302
- "contents": true,
303
- "answer": true,
304
- "research": true
305
- },
306
- "apiKey": "VALYU_API_KEY",
307
- "defaults": {
308
- "searchType": "all",
309
- "responseLength": "short"
310
- }
311
- }
312
- }
313
- }
314
- ```
235
+ </details>
236
+
237
+ ### Generic settings
238
+
239
+ The `genericSettings` block sets shared execution defaults that apply to all
240
+ providers unless overridden in a provider's `policy` block:
241
+
242
+ | Field | Default | Description |
243
+ | ---------------------------------- | ---------- | ---------------------------------------------- |
244
+ | `requestTimeoutMs` | `30000` | Maximum time for a single provider request |
245
+ | `retryCount` | `3` | Retries for transient failures |
246
+ | `retryDelayMs` | `2000` | Initial delay before retrying |
247
+ | `researchPollIntervalMs` | `3000` | How often to poll long-running research jobs |
248
+ | `researchTimeoutMs` | `21600000` | Overall deadline for research before returning |
249
+ | `researchMaxConsecutivePollErrors` | `3` | Consecutive poll failures before stopping |
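The precedence between `genericSettings` and a provider's `policy` block can be sketched as a layered merge. The default values below are taken from the table above; the `ExecutionSettings` interface, the `effectiveSettings` helper, and the assumption that `policy` uses the same field names as `genericSettings` are hypothetical.

```typescript
// Field names and defaults follow the generic settings table.
interface ExecutionSettings {
  requestTimeoutMs: number;
  retryCount: number;
  retryDelayMs: number;
  researchPollIntervalMs: number;
  researchTimeoutMs: number;
  researchMaxConsecutivePollErrors: number;
}

const GENERIC_DEFAULTS: ExecutionSettings = {
  requestTimeoutMs: 30000,
  retryCount: 3,
  retryDelayMs: 2000,
  researchPollIntervalMs: 3000,
  researchTimeoutMs: 21600000, // 6 hours
  researchMaxConsecutivePollErrors: 3,
};

// Later layers win: built-in defaults, then the user's generic
// settings, then the provider's policy overrides.
function effectiveSettings(
  generic: Partial<ExecutionSettings>,
  policy: Partial<ExecutionSettings>,
): ExecutionSettings {
  return { ...GENERIC_DEFAULTS, ...generic, ...policy };
}

// Example: a global retry count plus a longer timeout for one provider.
const exampleSettings = effectiveSettings(
  { retryCount: 5 },
  { requestTimeoutMs: 60000 },
);
// exampleSettings.requestTimeoutMs is 60000, retryCount is 5,
// and every other field keeps its documented default.
```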
315
250
 
316
251
  ## 🛠️ Development
317
252