pi-web-access 0.7.0 → 0.7.2

package/CHANGELOG.md CHANGED
@@ -2,6 +2,21 @@
2
2
 
3
3
  All notable changes to this project will be documented in this file.
4
4
 
5
+ ## [0.7.2] - 2026-02-03
6
+
7
+ ### Added
8
+ - `model` parameter on `fetch_content` to override the Gemini model per-request (e.g. `model: "gemini-2.5-flash"`)
9
+ - Collapsed TUI results now show a 200-char text preview instead of just the status line
10
+ - LICENSE file (MIT)
11
+
12
+ ### Changed
13
+ - Default Gemini model updated from `gemini-2.5-flash` to `gemini-3-flash-preview` across all API, search, URL context, YouTube, and video paths. Gemini Web gracefully falls back to `gemini-2.5-flash` when the model header isn't available.
14
+ - README rewritten: added tagline, badges, "Why" section, Quick Start, corrected "How It Works" routing order, fixed inaccurate env var precedence claim, added missing `/v/` YouTube format, restored `/search` command docs, collapsible Files table
15
+
16
+ ### Fixed
17
+ - `PERPLEXITY_API_KEY` env var now takes precedence over config file value, matching `GEMINI_API_KEY` behavior and README documentation (was reversed)
18
+ - `package.json` now includes `repository`, `homepage`, `bugs`, and `description` fields (repo link was missing from pi packages site)
19
+
5
20
  ## [0.7.0] - 2026-02-03
6
21
 
7
22
  ### Added
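Note on the 0.7.2 `Fixed` entry above: the `PERPLEXITY_API_KEY` change comes down to an ordering rule, with the env var checked first and the config file value used as the fallback. The sketch below illustrates that rule only. `resolveApiKey`, `WebSearchConfig`, and the `Env` type are hypothetical names, not taken from the package; the config field names follow the README's `~/.pi/web-search.json` example.

```typescript
// Hypothetical config shape; field names follow the README's web-search.json example.
interface WebSearchConfig {
  perplexityApiKey?: string;
  geminiApiKey?: string;
}

// The environment is passed in explicitly so the sketch is easy to test.
type Env = Record<string, string | undefined>;

// Hypothetical helper: the env var wins, the config file value is the fallback.
function resolveApiKey(
  envName: "PERPLEXITY_API_KEY" | "GEMINI_API_KEY",
  configValue: string | undefined,
  env: Env,
): string | undefined {
  const fromEnv = env[envName]?.trim();
  // Assumption: an empty env var counts as unset, so it cannot mask a valid config value.
  return fromEnv ? fromEnv : configValue;
}

const config: WebSearchConfig = { perplexityApiKey: "pplx-from-config" };
const key = resolveApiKey("PERPLEXITY_API_KEY", config.perplexityApiKey, {
  PERPLEXITY_API_KEY: "pplx-from-env",
});
console.log(key);
```

Before this release the changelog describes the Perplexity order as reversed, which in terms of this sketch would mean returning `configValue` whenever it was set.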
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Nico Bailon
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md CHANGED
@@ -4,12 +4,23 @@
4
4
 
5
5
  # Pi Web Access
6
6
 
7
- An extension for [Pi coding agent](https://github.com/badlogic/pi-mono/) that gives Pi web capabilities: search via Perplexity AI or Gemini, fetch and extract content from URLs, clone GitHub repos for local exploration, read PDFs, understand YouTube videos, and analyze local video files.
7
+ **Web search, content extraction, and video understanding for Pi agent. Zero config with Chrome, or bring your own API keys.**
8
8
 
9
- ```typescript
10
- web_search({ query: "TypeScript best practices 2025" })
11
- fetch_content({ url: "https://docs.example.com/guide" })
12
- ```
9
+ [![npm version](https://img.shields.io/npm/v/pi-web-access?style=for-the-badge)](https://www.npmjs.com/package/pi-web-access)
10
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=for-the-badge)](https://opensource.org/licenses/MIT)
11
+ [![Platform](https://img.shields.io/badge/Platform-macOS%20%7C%20Linux%20%7C%20Windows*-blue?style=for-the-badge)]()
12
+
13
+ https://github.com/user-attachments/assets/cac6a17a-1eeb-4dde-9818-cdf85d8ea98f
14
+
15
+ ## Why Pi Web Access
16
+
17
+ **Zero Config** — Signed into Google in Chrome? That's it. The extension reads your Chrome session cookies to access Gemini directly. No API keys, no setup, no subscriptions.
18
+
19
+ **Video Understanding** — Point it at a YouTube video or local screen recording and ask questions about what's on screen. Full transcripts, visual descriptions, and frame extraction at exact timestamps.
20
+
21
+ **Smart Fallbacks** — Every capability has a fallback chain. Search tries Perplexity, then Gemini API, then Gemini Web. YouTube tries Gemini Web, then API, then Perplexity. Blocked pages retry through Gemini extraction. Something always works.
22
+
23
+ **GitHub Cloning** — GitHub URLs are cloned locally instead of scraped. The agent gets real file contents and a local path to explore, not rendered HTML.
13
24
 
14
25
  ## Install
15
26
 
@@ -17,281 +28,183 @@ fetch_content({ url: "https://docs.example.com/guide" })
17
28
  pi install npm:pi-web-access
18
29
  ```
19
30
 
20
- Configure at least one search provider:
21
-
22
- ```bash
23
- # Option 1: Sign into gemini.google.com in Chrome (free, zero config)
24
-
25
- # Option 2: Gemini API key
26
- echo '{"geminiApiKey": "AIza..."}' > ~/.pi/web-search.json
31
+ If you're not signed into Chrome, or prefer a different provider, add API keys to `~/.pi/web-search.json`:
27
32
 
28
- # Option 3: Perplexity API key
29
- echo '{"perplexityApiKey": "pplx-..."}' > ~/.pi/web-search.json
33
+ ```json
34
+ {
35
+ "perplexityApiKey": "pplx-...",
36
+ "geminiApiKey": "AIza..."
37
+ }
30
38
  ```
31
39
 
32
- All three work simultaneously. In `auto` mode (default), the extension tries Perplexity first, then Gemini API, then Gemini Web.
33
-
34
- **Requires:** Pi v0.37.3+
40
+ You can configure one or both. In `auto` mode (default), `web_search` tries Perplexity first, then Gemini API, then Gemini Web.
35
41
 
36
- **Optional dependencies** for video frame extraction:
42
+ Optional dependencies for video frame extraction:
37
43
 
38
44
  ```bash
39
45
  brew install ffmpeg # frame extraction, video thumbnails, local video duration
40
- brew install yt-dlp # YouTube frame extraction (stream URL + duration lookup)
46
+ brew install yt-dlp # YouTube stream URLs for frame extraction
41
47
  ```
42
48
 
43
- Without these, video content analysis (transcripts via Gemini) still works. The binaries are only needed for extracting visual frames from videos. `ffprobe` (bundled with ffmpeg) is used for local video duration lookup when sampling frames across an entire video.
49
+ Without these, video content analysis (transcripts, visual descriptions via Gemini) still works. The binaries are only needed for extracting individual frames as images.
44
50
 
45
- ## Tools
46
-
47
- ### web_search
51
+ Requires Pi v0.37.3+.
48
52
 
49
- Search the web via Perplexity AI or Gemini. Returns synthesized answer with source citations.
53
+ ## Quick Start
50
54
 
51
55
  ```typescript
52
- // Single query
53
- web_search({ query: "rust async programming" })
56
+ // Search the web
57
+ web_search({ query: "TypeScript best practices 2025" })
54
58
 
55
- // Multiple queries (batch)
56
- web_search({ queries: ["query 1", "query 2"] })
59
+ // Fetch a page
60
+ fetch_content({ url: "https://docs.example.com/guide" })
57
61
 
58
- // With options
59
- web_search({
60
- query: "latest news",
61
- numResults: 10, // Default: 5, max: 20
62
- recencyFilter: "week", // day, week, month, year
63
- domainFilter: ["github.com"] // Prefix with - to exclude
64
- })
62
+ // Clone a GitHub repo
63
+ fetch_content({ url: "https://github.com/owner/repo" })
65
64
 
66
- // Explicit provider
67
- web_search({ query: "...", provider: "gemini" }) // auto, perplexity, gemini
65
+ // Understand a YouTube video
66
+ fetch_content({ url: "https://youtube.com/watch?v=abc", prompt: "What libraries are shown?" })
68
67
 
69
- // Fetch full page content (async)
70
- web_search({ query: "...", includeContent: true })
68
+ // Analyze a screen recording
69
+ fetch_content({ url: "/path/to/recording.mp4", prompt: "What error appears on screen?" })
71
70
  ```
72
71
 
73
- When `includeContent: true`, sources are fetched in the background. Agent receives notification when ready.
74
-
75
- Provider selection in `auto` mode: Perplexity (if key configured) → Gemini API (if key configured, uses Google Search grounding) → Gemini Web (if signed into Chrome). Gemini API returns structured citations with source mappings. Gemini Web returns markdown with embedded links.
72
+ ## Tools
76
73
 
77
- ### fetch_content
74
+ ### web_search
78
75
 
79
- Fetch URL(s) and extract readable content as markdown.
76
+ Search the web via Perplexity AI or Gemini. Returns a synthesized answer with source citations.
80
77
 
81
78
  ```typescript
82
- // Single URL - returns content directly (also stored for retrieval)
83
- fetch_content({ url: "https://example.com/article" })
84
-
85
- // Multiple URLs - returns summary (content stored for retrieval)
86
- fetch_content({ urls: ["url1", "url2", "url3"] })
87
-
88
- // PDFs - extracted and saved to ~/Downloads/
89
- fetch_content({ url: "https://arxiv.org/pdf/1706.03762" })
90
- // → "PDF extracted and saved to: ~/Downloads/arxiv-170603762.md"
79
+ web_search({ query: "rust async programming" })
80
+ web_search({ queries: ["query 1", "query 2"] })
81
+ web_search({ query: "latest news", numResults: 10, recencyFilter: "week" })
82
+ web_search({ query: "...", domainFilter: ["github.com"] })
83
+ web_search({ query: "...", provider: "gemini" })
84
+ web_search({ query: "...", includeContent: true })
91
85
  ```
92
86
 
93
- **GitHub repos:** GitHub code URLs are automatically detected and cloned locally instead of scraping HTML. The agent gets actual file contents and a local path to explore with `read` and `bash`.
94
-
95
- ```typescript
96
- // Clone a repo - returns structure + README
97
- fetch_content({ url: "https://github.com/owner/repo" })
98
- // "Repository cloned to: /tmp/pi-github-repos/owner/repo"
99
-
100
- // Specific file - returns file contents
101
- fetch_content({ url: "https://github.com/owner/repo/blob/main/src/index.ts" })
102
-
103
- // Directory - returns listing
104
- fetch_content({ url: "https://github.com/owner/repo/tree/main/src" })
105
-
106
- // Force-clone a large repo that exceeds the size threshold
107
- fetch_content({ url: "https://github.com/big/repo", forceClone: true })
108
- ```
87
+ | Parameter | Description |
88
+ |-----------|-------------|
89
+ | `query` / `queries` | Single query or batch of queries |
90
+ | `numResults` | Results per query (default: 5, max: 20) |
91
+ | `recencyFilter` | `day`, `week`, `month`, or `year` |
92
+ | `domainFilter` | Limit to domains (prefix with `-` to exclude) |
93
+ | `provider` | `auto` (default), `perplexity`, or `gemini` |
94
+ | `includeContent` | Fetch full page content from sources in background |
109
95
 
110
- Repos over 350MB get a lightweight API-based view instead of a full clone. Commit SHA URLs are also handled via the API. Clones are cached for the session -- multiple files from the same repo share one clone, but clones are wiped on session change/shutdown and re-cloned as needed.
96
+ ### fetch_content
111
97
 
112
- **YouTube videos:** YouTube URLs are automatically detected and processed via Gemini for full video understanding (visual + audio + transcript). Three-tier fallback:
98
+ Fetch URL(s) and extract readable content as markdown. Automatically detects and handles GitHub repos, YouTube videos, PDFs, local video files, and regular web pages.
113
99
 
114
100
  ```typescript
115
- // Returns transcript with timestamps, visual descriptions, chapter markers
116
- fetch_content({ url: "https://youtube.com/watch?v=dQw4w9WgXcQ" })
117
-
118
- // Ask a specific question about the video
119
- fetch_content({ url: "https://youtube.com/watch?v=abc", prompt: "What libraries are imported?" })
101
+ fetch_content({ url: "https://example.com/article" })
102
+ fetch_content({ urls: ["url1", "url2", "url3"] })
103
+ fetch_content({ url: "https://github.com/owner/repo" })
104
+ fetch_content({ url: "https://youtube.com/watch?v=abc", prompt: "What libraries are shown?" })
105
+ fetch_content({ url: "/path/to/recording.mp4", prompt: "What error appears on screen?" })
106
+ fetch_content({ url: "https://youtube.com/watch?v=abc", timestamp: "23:41-25:00", frames: 4 })
120
107
  ```
121
108
 
122
- 1. **Gemini Web** (primary) -- reads your Chrome session cookies. Zero config if you're signed into Google.
123
- 2. **Gemini API** (secondary) -- uses `GEMINI_API_KEY` env var or `geminiApiKey` in config.
124
- 3. **Perplexity** (fallback) -- topic summary when neither Gemini path is available.
109
+ | Parameter | Description |
110
+ |-----------|-------------|
111
+ | `url` / `urls` | Single URL/path or multiple URLs |
112
+ | `prompt` | Question to ask about a YouTube video or local video file |
113
+ | `timestamp` | Extract frame(s) — single (`"23:41"`), range (`"23:41-25:00"`), or seconds (`"85"`) |
114
+ | `frames` | Number of frames to extract (max 12) |
115
+ | `forceClone` | Clone GitHub repos that exceed the 350MB size threshold |
125
116
 
126
- YouTube results include the video thumbnail as an image content part, so the agent receives visual context alongside the transcript.
127
-
128
- Handles all YouTube URL formats: `/watch?v=`, `youtu.be/`, `/shorts/`, `/live/`, `/embed/`, `/v/`, `m.youtube.com`. Playlist-only URLs fall through to normal extraction.
117
+ ### get_search_content
129
118
 
130
- **Local video files:** Pass a file path to analyze video content via Gemini. Supports MP4, MOV, WebM, AVI, and other common formats. Max 50MB (configurable).
119
+ Retrieve stored content from previous searches or fetches. Content over 30,000 chars is truncated in tool responses but stored in full for retrieval here.
131
120
 
132
121
  ```typescript
133
- // Analyze a screen recording
134
- fetch_content({ url: "/path/to/recording.mp4" })
135
-
136
- // Ask about specific content in the video
137
- fetch_content({ url: "./demo.mov", prompt: "What error message appears on screen?" })
122
+ get_search_content({ responseId: "abc123", urlIndex: 0 })
123
+ get_search_content({ responseId: "abc123", url: "https://..." })
124
+ get_search_content({ responseId: "abc123", query: "original query" })
138
125
  ```
139
126
 
140
- Two-tier fallback: Gemini API (needs key, proper Files API with MIME types) → Gemini Web (free, needs Chrome login). File paths are detected by prefix (`/`, `./`, `../`, `file://`). If ffmpeg is installed, a frame from the video is included as a thumbnail image alongside the analysis.
141
-
142
- **Video frame extraction (YouTube + local):** Use `timestamp` and/or `frames` to pull visuals for scanning.
143
-
144
- ```typescript
145
- // Single frame at an exact time
146
- fetch_content({ url: "https://youtube.com/watch?v=abc", timestamp: "23:41" })
127
+ ## Capabilities
147
128
 
148
- // Range scan (default 6 frames)
149
- fetch_content({ url: "https://youtube.com/watch?v=abc", timestamp: "23:41-25:00" })
129
+ ### GitHub repos
150
130
 
151
- // Custom density across a range
152
- fetch_content({ url: "https://youtube.com/watch?v=abc", timestamp: "23:41-25:00", frames: 3 })
131
+ GitHub URLs are cloned locally instead of scraped. The agent gets real file contents and a local path to explore with `read` and `bash`. Root URLs return the repo tree + README, `/tree/` paths return directory listings, `/blob/` paths return file contents.
153
132
 
154
- // N frames at 5s intervals starting from a single timestamp
155
- fetch_content({ url: "https://youtube.com/watch?v=abc", timestamp: "23:41", frames: 5 })
133
+ Repos over 350MB get a lightweight API-based view instead of a full clone (override with `forceClone: true`). Commit SHA URLs are handled via the API. Clones are cached for the session and wiped on session change. Private repos require the `gh` CLI.
156
134
 
157
- // Whole-video sampling (no timestamp)
158
- fetch_content({ url: "https://youtube.com/watch?v=abc", frames: 6 })
159
- ```
135
+ ### YouTube videos
160
136
 
161
- The same `timestamp`/`frames` syntax works with local file paths (e.g. `/path/to/video.mp4`).
137
+ YouTube URLs are processed via Gemini for full video understanding — visual descriptions, transcripts with timestamps, and chapter markers. Pass a `prompt` to ask specific questions about the video. Results include the video thumbnail so the agent gets visual context alongside the transcript.
162
138
 
163
- Requirements: YouTube frame extraction needs `yt-dlp` + `ffmpeg`. Local video frames need `ffmpeg` (and `ffprobe`, bundled with ffmpeg, for whole-video sampling).
139
+ Fallback: Gemini Web → Gemini API → Perplexity (text summary only). Handles all URL formats: `/watch?v=`, `youtu.be/`, `/shorts/`, `/live/`, `/embed/`, `/v/`.
164
140
 
165
- Common errors include missing binaries, private/age-restricted videos, region blocks, live streams, expired stream URLs (403), and timestamps beyond the video duration.
141
+ ### Local video files
166
142
 
167
- **Gemini extraction fallback:** When Readability fails or a site blocks bot traffic (403, 429), the extension automatically retries via Gemini URL Context (API) or Gemini Web. This handles SPAs, JS-heavy pages, and anti-bot protections that the HTTP pipeline can't.
143
+ Pass a file path (`/`, `./`, `../`, or `file://` prefix) to analyze video content via Gemini. Supports MP4, MOV, WebM, AVI, and other common formats up to 50MB. Pass a `prompt` to ask about specific content. If ffmpeg is installed, a thumbnail frame is included alongside the analysis.
168
144
 
169
- **PDF handling:** When fetching a PDF URL, the extension extracts text and saves it as a markdown file in `~/Downloads/`. The agent can then use `read` to access specific sections without loading 200K+ chars into context.
145
+ Fallback: Gemini API (Files API upload) → Gemini Web.
170
146
 
171
- ### get_search_content
147
+ ### Video frame extraction
172
148
 
173
- Retrieve stored content from previous searches or fetches.
149
+ Use `timestamp` and/or `frames` on any YouTube URL or local video file to extract visual frames as images.
174
150
 
175
151
  ```typescript
176
- // By response ID (from web_search or fetch_content)
177
- get_search_content({ responseId: "abc123", urlIndex: 0 })
178
-
179
- // By URL
180
- get_search_content({ responseId: "abc123", url: "https://..." })
181
-
182
- // By query (for search results)
183
- get_search_content({ responseId: "abc123", query: "original query" })
152
+ fetch_content({ url: "...", timestamp: "23:41" }) // single frame
153
+ fetch_content({ url: "...", timestamp: "23:41-25:00" }) // range, 6 frames
154
+ fetch_content({ url: "...", timestamp: "23:41-25:00", frames: 3 }) // range, custom count
155
+ fetch_content({ url: "...", timestamp: "23:41", frames: 5 }) // 5 frames at 5s intervals
156
+ fetch_content({ url: "...", frames: 6 }) // sample whole video
184
157
  ```
185
158
 
186
- ## Features
187
-
188
- ### Activity Monitor (Ctrl+Shift+W)
159
+ Requires `ffmpeg` (and `yt-dlp` for YouTube). Timestamps accept `H:MM:SS`, `MM:SS`, or bare seconds.
189
160
 
190
- Toggle live request/response activity:
191
-
192
- ```
193
- ─── Web Search Activity ────────────────────────────────────
194
- API "typescript best practices" 200 2.1s ✓
195
- GET docs.example.com/article 200 0.8s ✓
196
- GET blog.example.com/post 404 0.3s ✗
197
- GET news.example.com/latest ... 1.2s ⋯
198
- ────────────────────────────────────────────────────────────
199
- Rate: 3/10 (resets in 42s)
200
- ```
161
+ ### PDFs
201
162
 
202
- ### RSC Content Extraction
163
+ PDF URLs are extracted as text and saved to `~/Downloads/` as markdown. The agent can then `read` specific sections without loading the full document into context. Text-based extraction only — no OCR.
203
164
 
204
- Next.js App Router pages embed content as RSC (React Server Components) flight data in script tags. When Readability fails, the extension parses these JSON payloads directly, reconstructing markdown with headings, tables, code blocks, and links.
165
+ ### Blocked pages
205
166
 
206
- ### TUI Rendering
167
+ When Readability fails or a site blocks bot traffic, the extension retries via Gemini URL Context API or Gemini Web extraction. Handles SPAs, JS-heavy pages, and anti-bot protections transparently. Also parses Next.js RSC flight data when present.
207
168
 
208
- Tool calls render with real-time progress:
169
+ ## How It Works
209
170
 
210
171
  ```
211
- ┌─ search "TypeScript best practices 2025" ─────────────────────────┐
212
- [████████░░] searching │
213
- └───────────────────────────────────────────────────────────────────┘
172
+ fetch_content(url)
173
+ → Video file? Gemini API (Files API) → Gemini Web
174
+ → GitHub URL? Clone repo, return file contents + local path
175
+ → YouTube URL? Gemini Web → Gemini API → Perplexity
176
+ → HTTP fetch → PDF? Extract text, save to ~/Downloads/
177
+ → HTML? Readability → RSC parser → Gemini fallback
178
+ → Text/JSON/Markdown? Return directly
214
179
  ```
215
180
 
216
181
  ## Skills
217
182
 
218
- Skills are bundled with the extension and available automatically after install -- no extra setup needed.
219
-
220
183
  ### librarian
221
184
 
222
- Structured research workflow for open-source libraries with evidence-backed answers and GitHub permalinks. Loaded automatically when the task involves understanding library internals, finding implementation details, or tracing code history.
223
-
224
- Combines `fetch_content` (GitHub cloning), `web_search` (recent info), and git operations (blame, log, show). Pi auto-detects when to load it based on your prompt. If you have [pi-skill-palette](https://github.com/nicobailon/pi-skill-palette) installed, you can also load it explicitly via `/skill:librarian`.
185
+ Bundled research workflow for investigating open-source libraries. Combines GitHub cloning, web search, and git operations (blame, log, show) to produce evidence-backed answers with permalinks. Pi loads it automatically based on your prompt. Also available via `/skill:librarian` with [pi-skill-palette](https://github.com/nicobailon/pi-skill-palette).
225
186
 
226
187
  ## Commands
227
188
 
228
189
  ### /search
229
190
 
230
- Browse stored search results interactively.
231
-
232
- ## How It Works
233
-
234
- ### fetch_content routing
191
+ Browse stored search results interactively. Lists all results from the current session with their response IDs for easy retrieval.
235
192
 
236
- ```
237
- fetch_content(url_or_path, prompt?)
238
-
239
- ├── Local video file? ──→ Gemini API → Gemini Web
240
- │ ↓
241
- │ Video analysis (prompt forwarded)
242
-
243
- ├── github.com code URL? ──→ Clone repo (gh/git --depth 1)
244
- │ │
245
- │ ┌───────┼───────┐
246
- │ ↓ ↓ ↓
247
- │ root tree blob
248
- │ ↓ ↓ ↓
249
- │ tree + dir file
250
- │ README listing contents
251
- │ │ │ │
252
- │ └───────┼───────┘
253
- │ ↓
254
- │ Return content + local
255
- │ path for read/bash
256
-
257
- ├── YouTube URL? ──→ Gemini Web → Gemini API → Perplexity
258
- │ ↓ (prompt forwarded)
259
- │ Transcript + visual descriptions
260
-
261
- ├── PDF? ──→ unpdf → Save to ~/Downloads/
262
-
263
- ├── Plain text/markdown/JSON? ──→ Return directly
264
-
265
- └── HTML ──→ Readability → Markdown
266
-
267
- [if fails]
268
-
269
- RSC Parser → Markdown
270
-
271
- [if all fail]
272
-
273
- Gemini URL Context → Gemini Web extraction
274
- ```
193
+ ## Activity Monitor
275
194
 
276
- ### web_search routing
195
+ Toggle with **Ctrl+Shift+W** to see live request/response activity:
277
196
 
278
197
  ```
279
- web_search(query, provider?)
280
-
281
- ├── provider = "perplexity" ──→ Perplexity API
282
- ├── provider = "gemini" ──→ Gemini API → Gemini Web
283
- └── provider = "auto"
284
- ├── Perplexity key? ──→ Perplexity API
285
- ├── Gemini API key? ──→ Gemini API (grounded search)
286
- ├── Chrome cookies? ──→ Gemini Web (grounded search)
287
- └── Error
198
+ ─── Web Search Activity ────────────────────────────────────
199
+ API "typescript best practices" 200 2.1s ✓
200
+ GET docs.example.com/article 200 0.8s ✓
201
+ GET blog.example.com/post 404 0.3s ✗
202
+ ────────────────────────────────────────────────────────────
288
203
  ```
289
204
 
290
- When `includeContent: true`, sources are fetched in the background using the fetch_content routing above, and the agent receives a notification when ready.
291
-
292
205
  ## Configuration
293
206
 
294
- All config lives in `~/.pi/web-search.json`:
207
+ All config lives in `~/.pi/web-search.json`. Every field is optional.
295
208
 
296
209
  ```json
297
210
  {
@@ -306,61 +219,51 @@ All config lives in `~/.pi/web-search.json`:
306
219
  },
307
220
  "youtube": {
308
221
  "enabled": true,
309
- "preferredModel": "gemini-2.5-flash"
222
+ "preferredModel": "gemini-3-flash-preview"
310
223
  },
311
224
  "video": {
312
225
  "enabled": true,
313
- "preferredModel": "gemini-2.5-flash",
226
+ "preferredModel": "gemini-3-flash-preview",
314
227
  "maxSizeMB": 50
315
228
  }
316
229
  }
317
230
  ```
318
231
 
319
- All fields are optional. `GEMINI_API_KEY` and `PERPLEXITY_API_KEY` env vars take precedence over config file values. Set `"enabled": false` under `githubClone`, `youtube`, or `video` to disable those features.
232
+ `GEMINI_API_KEY` and `PERPLEXITY_API_KEY` env vars take precedence over config file values. `searchProvider` sets the `web_search` default: `"auto"`, `"perplexity"`, or `"gemini"`. Set `"enabled": false` under any feature to disable it. Config changes require a Pi restart.
320
233
 
321
- `searchProvider` controls `web_search` default: `"auto"` (Perplexity → Gemini API → Gemini Web), `"perplexity"`, or `"gemini"` (API → Web).
234
+ Rate limits: Perplexity is capped at 10 requests/minute (client-side). Content fetches run 3 concurrent with a 30s timeout per URL.
322
235
 
323
- ## Rate Limits
236
+ ## Limitations
324
237
 
325
- - **Perplexity API**: 10 requests/minute (enforced client-side)
326
- - **Content Fetch**: 3 concurrent requests, 30s timeout per URL
327
- - **Cache TTL**: 1 hour
238
+ - Chrome cookie extraction is macOS-only — other platforms fall through to API keys. First-time access may trigger a Keychain dialog.
239
+ - YouTube private/age-restricted videos may fail on all extraction paths.
240
+ - Gemini can process videos up to ~1 hour; longer videos may be truncated.
241
+ - PDFs are text-extracted only (no OCR for scanned documents).
242
+ - GitHub branch names with slashes may misresolve file paths; the clone still works and the agent can navigate manually.
243
+ - Non-code GitHub URLs (issues, PRs, wiki) fall through to normal web extraction.
328
244
 
329
- ## Files
245
+ <details>
246
+ <summary>Files</summary>
330
247
 
331
248
  | File | Purpose |
332
249
  |------|---------|
333
250
  | `index.ts` | Extension entry, tool definitions, commands, widget |
334
- | `perplexity.ts` | Perplexity API client, rate limiting |
335
- | `gemini-search.ts` | Gemini search providers (Web + API with grounding), search routing |
336
- | `extract.ts` | URL/file path routing, HTTP extraction, Gemini fallback orchestration |
251
+ | `extract.ts` | URL/file path routing, HTTP extraction, fallback orchestration |
252
+ | `gemini-search.ts` | Search routing across Perplexity, Gemini API, Gemini Web |
337
253
  | `gemini-url-context.ts` | Gemini URL Context + Web extraction fallbacks |
338
- | `video-extract.ts` | Local video file detection, upload, Gemini Web/API analysis |
339
- | `youtube-extract.ts` | YouTube URL detection, three-tier extraction orchestrator |
340
- | `chrome-cookies.ts` | macOS Chrome cookie extraction (Keychain + SQLite) |
341
254
  | `gemini-web.ts` | Gemini Web client (cookie auth, StreamGenerate) |
342
- | `gemini-api.ts` | Gemini REST API client (generateContent, file upload) |
343
- | `utils.ts` | Shared formatting (`formatSeconds`) and error helpers for frame extraction |
344
- | `github-extract.ts` | GitHub URL parser, clone cache, content generation |
345
- | `github-api.ts` | GitHub API fallback for oversized repos and commit SHAs |
255
+ | `gemini-api.ts` | Gemini REST API client (generateContent) |
256
+ | `chrome-cookies.ts` | macOS Chrome cookie extraction (Keychain + SQLite) |
257
+ | `youtube-extract.ts` | YouTube detection, three-tier extraction, frame extraction |
258
+ | `video-extract.ts` | Local video detection, Files API upload, Gemini analysis |
259
+ | `github-extract.ts` | GitHub URL parsing, clone cache, content generation |
260
+ | `github-api.ts` | GitHub API fallback for large repos and commit SHAs |
261
+ | `perplexity.ts` | Perplexity API client with rate limiting |
346
262
  | `pdf-extract.ts` | PDF text extraction, saves to markdown |
347
263
  | `rsc-extract.ts` | RSC flight data parser for Next.js pages |
264
+ | `utils.ts` | Shared formatting and error helpers |
348
265
  | `storage.ts` | Session-aware result storage |
349
- | `activity.ts` | Activity tracking for observability widget |
350
- | `skills/librarian/` | Bundled skill for library research with permalinks |
351
-
352
- ## Limitations
266
+ | `activity.ts` | Activity tracking for the observability widget |
267
+ | `skills/librarian/` | Bundled skill for library research |
353
268
 
354
- - Content extraction works best on article-style pages; JS-heavy sites fall back to Gemini extraction when available
355
- - Gemini extraction fallback requires either a Gemini API key or Chrome login to Google
356
- - PDFs are extracted as text (no OCR for scanned documents)
357
- - Max response size: 20MB for PDFs, 5MB for HTML
358
- - Max inline content: 30,000 chars per URL (larger content stored for retrieval via get_search_content)
359
- - GitHub cloning requires `gh` CLI for private repos (public repos fall back to `git clone`)
360
- - GitHub branch names with slashes (e.g. `feature/foo`) may resolve the wrong file path; the clone still succeeds and the agent can navigate manually
361
- - Non-code GitHub URLs (issues, PRs, wiki, etc.) fall through to normal Readability extraction
362
- - YouTube extraction via Gemini Web requires macOS (Chrome cookie decryption is OS-specific); other platforms fall through to Gemini API or Perplexity
363
- - YouTube private/age-restricted videos may fail on all paths
364
- - Gemini can process videos up to ~1 hour at default resolution; longer videos may be truncated
365
- - First-time Chrome cookie access may trigger a macOS Keychain permission dialog
366
- - Requires Pi restart after config file changes
269
+ </details>
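Note on the `timestamp` formats documented in the README above (`H:MM:SS`, `MM:SS`, bare seconds, and `start-end` ranges): a minimal sketch of how such strings could be turned into seconds. This diff shows only the signature of `parseTimestamp` in `extract.ts`, not its body, so `parseTimestampSketch` and `parseRangeSketch` are hypothetical stand-ins, and the validation rules (digits only, minutes and seconds below 60, range end after start) are assumptions.

```typescript
// Illustrative only: the real parseTimestamp body in extract.ts is not shown in this diff.
function parseTimestampSketch(ts: string): number | null {
  const parts = ts.trim().split(":");
  if (parts.length < 1 || parts.length > 3) return null;
  // Every component must be plain digits.
  if (!parts.every((p) => /^\d+$/.test(p))) return null;
  const nums = parts.map(Number);
  // In MM:SS and H:MM:SS, every component after the first must be below 60.
  if (nums.slice(1).some((n) => n >= 60)) return null;
  // Fold left: ((h * 60) + m) * 60 + s
  return nums.reduce((total, n) => total * 60 + n, 0);
}

// Hypothetical range parser for "start-end" values such as "23:41-25:00".
function parseRangeSketch(ts: string): { start: number; end: number } | null {
  const [a, b, ...rest] = ts.split("-");
  if (a === undefined || b === undefined || rest.length > 0) return null;
  const start = parseTimestampSketch(a);
  const end = parseTimestampSketch(b);
  if (start === null || end === null || end <= start) return null;
  return { start, end };
}

console.log(parseTimestampSketch("23:41"));
console.log(parseRangeSketch("23:41-25:00"));
```

Returning `null` for anything malformed mirrors the `number | null` return type visible in the `extract.ts` hunk and leaves it to the caller to decide how to report a bad timestamp.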
package/extract.ts CHANGED
@@ -48,6 +48,7 @@ export interface ExtractOptions {
48
48
  prompt?: string;
49
49
  timestamp?: string;
50
50
  frames?: number;
51
+ model?: string;
51
52
  }
52
53
 
53
54
  function parseTimestamp(ts: string): number | null {
@@ -269,7 +270,7 @@ export async function extractContent(
269
270
  const ytInfo = isYouTubeURL(url);
270
271
  if (ytInfo.isYouTube && isYouTubeEnabled()) {
271
272
  try {
272
- const ytResult = await extractYouTube(url, signal, options?.prompt);
273
+ const ytResult = await extractYouTube(url, signal, options?.prompt, options?.model);
273
274
  if (ytResult) return ytResult;
274
275
  } catch {}
275
276
  return {
package/gemini-api.ts CHANGED
@@ -4,7 +4,7 @@ import { join } from "node:path";
4
4
 
5
5
  export const API_BASE = "https://generativelanguage.googleapis.com/v1beta";
6
6
  const CONFIG_PATH = join(homedir(), ".pi", "web-search.json");
7
- export const DEFAULT_MODEL = "gemini-2.5-flash";
7
+ export const DEFAULT_MODEL = "gemini-3-flash-preview";
8
8
 
9
9
  interface GeminiApiConfig {
10
10
  geminiApiKey?: string;
package/gemini-search.ts CHANGED
@@ -123,7 +123,7 @@ async function searchWithGeminiWeb(query: string, options: SearchOptions = {}):
 
  try {
  const text = await queryWithCookies(prompt, cookies, {
- model: "gemini-2.5-flash",
+ model: "gemini-3-flash-preview",
  signal: options.signal,
  timeoutMs: 60000,
  });
@@ -80,7 +80,7 @@ export async function extractWithGeminiWeb(
 
  try {
  const text = await queryWithCookies(EXTRACTION_PROMPT + url, cookies, {
- model: "gemini-2.5-flash",
+ model: "gemini-3-flash-preview",
  signal,
  timeoutMs: 60000,
  });
package/index.ts CHANGED
@@ -420,6 +420,9 @@ export default function (pi: ExtensionAPI) {
  maximum: 12,
  description: "Number of frames to extract. Use with timestamp range for custom density, with single timestamp to get N frames at 5s intervals, or alone to sample across the entire video. Requires yt-dlp + ffmpeg for YouTube, ffmpeg for local video.",
  })),
+ model: Type.Optional(Type.String({
+ description: "Override the Gemini model for video/YouTube analysis (e.g. 'gemini-2.5-flash', 'gemini-3-flash-preview'). Defaults to config or gemini-3-flash-preview.",
+ })),
  }),
 
  async execute(_toolCallId, params, signal, onUpdate, _ctx) {
@@ -441,6 +444,7 @@ export default function (pi: ExtensionAPI) {
  prompt: params.prompt,
  timestamp: params.timestamp,
  frames: params.frames,
+ model: params.model,
  });
  const successful = fetchResults.filter((r) => !r.error).length;
  const totalChars = fetchResults.reduce((sum, r) => sum + r.content.length, 0);
@@ -527,7 +531,7 @@ export default function (pi: ExtensionAPI) {
  },
 
  renderCall(args, theme) {
- const { url, urls, prompt, timestamp, frames } = args as { url?: string; urls?: string[]; prompt?: string; timestamp?: string; frames?: number };
+ const { url, urls, prompt, timestamp, frames, model } = args as { url?: string; urls?: string[]; prompt?: string; timestamp?: string; frames?: number; model?: string };
  const urlList = urls ?? (url ? [url] : []);
  if (urlList.length === 0) {
  return new Text(theme.fg("toolTitle", theme.bold("fetch ")) + theme.fg("error", "(no URL)"), 0, 0);
@@ -556,6 +560,9 @@ export default function (pi: ExtensionAPI) {
  const display = prompt.length > 250 ? prompt.slice(0, 247) + "..." : prompt;
  lines.push(theme.fg("dim", " prompt: ") + theme.fg("muted", `"${display}"`));
  }
+ if (model) {
+ lines.push(theme.fg("dim", " model: ") + theme.fg("warning", model));
+ }
  return new Text(lines.join("\n"), 0, 0);
  },
 
@@ -603,8 +610,10 @@ export default function (pi: ExtensionAPI) {
  if (typeof details?.duration === "number") {
  statusLine += theme.fg("muted", ` | ${formatSeconds(Math.floor(details.duration))} total`);
  }
+ const textContent = result.content.find((c) => c.type === "text")?.text || "";
  if (!expanded) {
- return new Text(statusLine, 0, 0);
+ const brief = textContent.length > 200 ? textContent.slice(0, 200) + "..." : textContent;
+ return new Text(statusLine + "\n" + theme.fg("dim", brief), 0, 0);
  }
  const lines = [statusLine];
  if (details?.prompt) {
@@ -617,7 +626,6 @@ export default function (pi: ExtensionAPI) {
  if (typeof details?.frames === "number") {
  lines.push(theme.fg("dim", ` frames: ${details.frames}`));
  }
- const textContent = result.content.find((c) => c.type === "text")?.text || "";
  const preview = textContent.length > 500 ? textContent.slice(0, 500) + "..." : textContent;
  lines.push(theme.fg("dim", preview));
  return new Text(lines.join("\n"), 0, 0);
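The collapsed-view change above truncates the first text block to 200 characters before rendering it under the status line. A minimal stand-alone sketch of that truncation rule follows; `briefPreview` and its `limit` parameter are hypothetical names introduced here for illustration and do not exist in the package.

```typescript
// Hypothetical helper, not part of pi-web-access: same truncation rule as the collapsed view.
function briefPreview(text: string, limit = 200): string {
  // Text longer than `limit` is cut at `limit` characters and suffixed with "...",
  // so the returned string can be up to `limit + 3` characters long.
  return text.length > limit ? text.slice(0, limit) + "..." : text;
}

// Short text is returned unchanged.
const shortText = briefPreview("hello");

// new Array(251).join("x") builds a 250-character string of "x".
const longText = briefPreview(new Array(251).join("x"));
```

Note that this rule differs slightly from the prompt display in `renderCall`, which slices to 247 characters so that the result including the ellipsis stays at 250.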
package/package.json CHANGED
@@ -1,8 +1,28 @@
  {
  "name": "pi-web-access",
- "version": "0.7.0",
+ "version": "0.7.2",
+ "description": "Web search, URL fetching, GitHub repo cloning, PDF extraction, YouTube video understanding, and local video analysis for Pi coding agent",
  "type": "module",
- "keywords": ["pi-package", "pi", "pi-coding-agent", "extension", "web-search", "perplexity", "fetch", "scraping"],
+ "keywords": [
+ "pi-package",
+ "pi",
+ "pi-coding-agent",
+ "extension",
+ "web-search",
+ "perplexity",
+ "fetch",
+ "scraping"
+ ],
+ "author": "Nico Bailon",
+ "license": "MIT",
+ "repository": {
+ "type": "git",
+ "url": "git+https://github.com/nicobailon/pi-web-access.git"
+ },
+ "bugs": {
+ "url": "https://github.com/nicobailon/pi-web-access/issues"
+ },
+ "homepage": "https://github.com/nicobailon/pi-web-access#readme",
  "dependencies": {
  "@mozilla/readability": "^0.5.0",
  "linkedom": "^0.16.0",
@@ -11,8 +31,12 @@
  "unpdf": "^1.4.0"
  },
  "pi": {
- "extensions": ["./index.ts"],
- "skills": ["./skills"],
+ "extensions": [
+ "./index.ts"
+ ],
+ "skills": [
+ "./skills"
+ ],
  "video": "https://github.com/nicobailon/pi-web-access/raw/refs/heads/main/pi-web-fetch-demo.mp4"
  }
  }
package/perplexity.ts CHANGED
@@ -56,7 +56,7 @@ function loadConfig(): WebSearchConfig {
 
  function getApiKey(): string {
  const config = loadConfig();
- const key = config.perplexityApiKey || process.env.PERPLEXITY_API_KEY;
+ const key = process.env.PERPLEXITY_API_KEY || config.perplexityApiKey;
  if (!key) {
  throw new Error(
  "Perplexity API key not found. Either:\n" +
@@ -93,7 +93,7 @@ function validateDomainFilter(domains: string[]): string[] {
 
  export function isPerplexityAvailable(): boolean {
  const config = loadConfig();
- return Boolean(config.perplexityApiKey || process.env.PERPLEXITY_API_KEY);
+ return Boolean(process.env.PERPLEXITY_API_KEY || config.perplexityApiKey);
  }
 
  export async function searchWithPerplexity(query: string, options: SearchOptions = {}): Promise<SearchResponse> {
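The two `perplexity.ts` hunks above swap the operands of `||` so that the environment variable is consulted before the config file value. The sketch below isolates that lookup order; `resolvePerplexityKey`, `WebSearchConfigSketch`, and the explicit `env` parameter are hypothetical names used only for illustration (the package reads `process.env` and its own `loadConfig()` directly).

```typescript
// Hypothetical stand-in for the package's config shape; only the one field matters here.
interface WebSearchConfigSketch {
  perplexityApiKey?: string;
}

// Hypothetical helper, not part of pi-web-access: mirrors the 0.7.2 lookup order.
function resolvePerplexityKey(
  env: { PERPLEXITY_API_KEY?: string },
  config: WebSearchConfigSketch,
): string | undefined {
  // Environment variable first, config file value second.
  // Because this uses `||`, an empty env var also falls through to the config value.
  return env.PERPLEXITY_API_KEY || config.perplexityApiKey;
}

const fromEnv = resolvePerplexityKey({ PERPLEXITY_API_KEY: "env-key" }, { perplexityApiKey: "file-key" });
const fromFile = resolvePerplexityKey({}, { perplexityApiKey: "file-key" });
const emptyEnv = resolvePerplexityKey({ PERPLEXITY_API_KEY: "" }, { perplexityApiKey: "file-key" });
```

With the 0.7.0 operand order, the first call would have returned the config file value instead, which is the reversal the changelog describes.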
package/video-extract.ts CHANGED
@@ -46,7 +46,7 @@ interface VideoConfig {
 
  const VIDEO_CONFIG_DEFAULTS: VideoConfig = {
  enabled: true,
- preferredModel: "gemini-2.5-flash",
+ preferredModel: "gemini-3-flash-preview",
  maxSizeMB: 50,
  };
 
@@ -123,11 +123,12 @@ export async function extractVideo(
  ): Promise<ExtractedContent | null> {
  const config = loadVideoConfig();
  const effectivePrompt = options?.prompt ?? DEFAULT_VIDEO_PROMPT;
+ const effectiveModel = options?.model ?? config.preferredModel;
  const displayName = basename(info.absolutePath);
  const activityId = activityMonitor.logStart({ type: "fetch", url: `video:${displayName}` });
 
- const result = await tryVideoGeminiApi(info, effectivePrompt, config, signal)
- ?? await tryVideoGeminiWeb(info, effectivePrompt, config, signal);
+ const result = await tryVideoGeminiApi(info, effectivePrompt, effectiveModel, signal)
+ ?? await tryVideoGeminiWeb(info, effectivePrompt, effectiveModel, signal);
 
  if (result) {
  const thumbnail = await extractVideoFrame(info.absolutePath);
@@ -183,7 +184,7 @@ export async function getLocalVideoDuration(filePath: string): Promise<number |
  async function tryVideoGeminiWeb(
  info: VideoFileInfo,
  prompt: string,
- config: VideoConfig,
+ model: string,
  signal?: AbortSignal,
  ): Promise<ExtractedContent | null> {
  try {
@@ -193,7 +194,7 @@ async function tryVideoGeminiWeb(
 
  const text = await queryWithCookies(prompt, cookies, {
  files: [info.absolutePath],
- model: config.preferredModel,
+ model,
  signal,
  timeoutMs: 180000,
  });
@@ -212,7 +213,7 @@ async function tryVideoGeminiWeb(
  async function tryVideoGeminiApi(
  info: VideoFileInfo,
  prompt: string,
- config: VideoConfig,
+ model: string,
  signal?: AbortSignal,
  ): Promise<ExtractedContent | null> {
  const apiKey = getApiKey();
@@ -227,7 +228,7 @@ async function tryVideoGeminiApi(
  await pollFileState(fileName, apiKey, signal, 120000);
 
  const text = await queryGeminiApiWithVideo(prompt, uploaded.uri, {
- model: config.preferredModel,
+ model,
  mimeType: info.mimeType,
  signal,
  timeoutMs: 120000,
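The `video-extract.ts` hunks above resolve the model once, as `options?.model ?? config.preferredModel`, and then pass the resulting string to both providers instead of the whole config object. A minimal sketch of that resolution step, assuming a hypothetical `resolveModel` helper and `VideoConfigSketch` type that are not in the package:

```typescript
// Hypothetical stand-in for the package's video config; only `preferredModel` is modeled.
interface VideoConfigSketch {
  preferredModel: string;
}

// Hypothetical helper, not part of pi-web-access: per-request override, else configured default.
function resolveModel(requested: string | undefined, config: VideoConfigSketch): string {
  // `??` falls back only on null/undefined, so an empty string is passed through as-is.
  return requested ?? config.preferredModel;
}

const sketchConfig: VideoConfigSketch = { preferredModel: "gemini-3-flash-preview" };
const overridden = resolveModel("gemini-2.5-flash", sketchConfig);
const defaulted = resolveModel(undefined, sketchConfig);
const emptyString = resolveModel("", sketchConfig);
```

The choice of `??` over `||` means a caller that sends `model: ""` would not get the configured default; whether the tool schema ever lets an empty string through is not shown in this diff.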
@@ -26,7 +26,7 @@ interface YouTubeConfig {
  preferredModel: string;
  }
 
- const defaults: YouTubeConfig = { enabled: true, preferredModel: "gemini-2.5-flash" };
+ const defaults: YouTubeConfig = { enabled: true, preferredModel: "gemini-3-flash-preview" };
  let cachedConfig: YouTubeConfig | null = null;
 
  function loadYouTubeConfig(): YouTubeConfig {
@@ -69,6 +69,7 @@ export async function extractYouTube(
  url: string,
  signal?: AbortSignal,
  prompt?: string,
+ model?: string,
  ): Promise<ExtractedContent | null> {
  const config = loadYouTubeConfig();
  const { videoId } = isYouTubeURL(url);
@@ -76,11 +77,12 @@ export async function extractYouTube(
  ? `https://www.youtube.com/watch?v=${videoId}`
  : url;
  const effectivePrompt = prompt ?? YOUTUBE_PROMPT;
+ const effectiveModel = model ?? config.preferredModel;
 
  const activityId = activityMonitor.logStart({ type: "fetch", url: `youtube.com/${videoId ?? "video"}` });
 
- const result = await tryGeminiWeb(canonicalUrl, effectivePrompt, config, signal)
- ?? await tryGeminiApi(canonicalUrl, effectivePrompt, config, signal)
+ const result = await tryGeminiWeb(canonicalUrl, effectivePrompt, effectiveModel, signal)
+ ?? await tryGeminiApi(canonicalUrl, effectivePrompt, effectiveModel, signal)
  ?? await tryPerplexity(url, effectivePrompt, signal);
 
  if (result) {
@@ -190,7 +192,7 @@ export async function fetchYouTubeThumbnail(videoId: string): Promise<{ data: st
  async function tryGeminiWeb(
  url: string,
  prompt: string,
- config: YouTubeConfig,
+ model: string,
  signal?: AbortSignal,
  ): Promise<ExtractedContent | null> {
  try {
@@ -201,7 +203,7 @@ async function tryGeminiWeb(
 
  const text = await queryWithCookies(prompt, cookies, {
  youtubeUrl: url,
- model: config.preferredModel,
+ model,
  signal,
  timeoutMs: 120000,
  });
@@ -220,7 +222,7 @@ async function tryGeminiWeb(
  async function tryGeminiApi(
  url: string,
  prompt: string,
- config: YouTubeConfig,
+ model: string,
  signal?: AbortSignal,
  ): Promise<ExtractedContent | null> {
  try {
@@ -229,7 +231,7 @@ async function tryGeminiApi(
  if (signal?.aborted) return null;
 
  const text = await queryGeminiApiWithVideo(prompt, url, {
- model: config.preferredModel,
+ model,
  signal,
  timeoutMs: 120000,
  });
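The YouTube hunks above keep the `??`-chained provider order (Gemini Web, then Gemini API, then Perplexity) and change only what is threaded through: the first two providers now receive the resolved model string, while the Perplexity call takes no model. The sketch below illustrates how such a chain short-circuits. It is synchronous for brevity, whereas the real functions are async and awaited, and every name in it (`tryWebSketch`, `tryApiSketch`, `tryPerplexitySketch`, `extractSketch`, `calls`) is a hypothetical stand-in.

```typescript
// Records which stand-in providers actually ran, in order.
const calls: string[] = [];

// Hypothetical stand-in: pretends the cookie-based path is unavailable.
function tryWebSketch(model: string): string | null {
  calls.push("web:" + model);
  return null;
}

// Hypothetical stand-in: pretends the API path succeeds with the given model.
function tryApiSketch(model: string): string | null {
  calls.push("api:" + model);
  return "summary from " + model;
}

// Hypothetical stand-in: takes no model, like the Perplexity call in the diff.
function tryPerplexitySketch(): string | null {
  calls.push("perplexity");
  return "summary from perplexity";
}

function extractSketch(model: string): string | null {
  // `??` short-circuits: a later provider runs only if every earlier one returned null.
  return tryWebSketch(model) ?? tryApiSketch(model) ?? tryPerplexitySketch();
}

const chained = extractSketch("gemini-2.5-flash");
```

In this sketch the Perplexity stand-in never runs, because the second provider already produced a non-null result.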