@apmantza/greedysearch-pi 1.6.4 → 1.6.6
- package/README.md +45 -45
- package/cdp.mjs +1004 -0
- package/coding-task.mjs +392 -0
- package/extractors/bing-copilot.mjs +167 -0
- package/extractors/common.mjs +237 -0
- package/extractors/consent.mjs +273 -0
- package/extractors/gemini.mjs +160 -0
- package/extractors/google-ai.mjs +156 -0
- package/extractors/perplexity.mjs +128 -0
- package/extractors/selectors.mjs +52 -0
- package/launch.mjs +288 -0
- package/package.json +8 -2
- package/search.mjs +1242 -0
package/README.md
CHANGED
@@ -1,6 +1,6 @@
 # GreedySearch for Pi

-Pi extension that adds `greedy_search`, `deep_research`, and `coding_task` tools
+Pi extension that adds `greedy_search`, `deep_research`, and `coding_task` tools -- multi-engine AI search via browser automation. **NO API KEYS needed.**

 Fans out queries to Perplexity, Bing Copilot, and Google AI simultaneously. Returns AI-synthesized answers with deduped sources. Streams progress as each engine completes.

@@ -8,7 +8,7 @@ Forked from [GreedySearch-claude](https://github.com/apmantza/GreedySearch-claud

 ## Quick Note

-**No API keys required**
+**No API keys required** -- this tool uses Chrome DevTools Protocol (CDP) to interact with search engines directly through a browser. It launches its own isolated Chrome instance, so it won't interfere with your main browser session.

 ## Install

@@ -43,15 +43,15 @@ greedy_search({ query: "What's new in React 19?", depth: "standard" })

 | Depth | Engines | Synthesis | Source Fetch | Time | Best For |
 |-------|---------|-----------|--------------|------|----------|
-| `fast` | 1 |
-| `standard` | 3 |
-| `deep` | 3 |
+| `fast` | 1 | no | no | 15-30s | Quick lookup, single perspective |
+| `standard` | 3 | yes | no | 30-90s | Default -- balanced speed/quality |
+| `deep` | 3 | yes | yes (top 5) | 60-180s | Research that matters -- architecture decisions |
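
The depth table above can be read as a small lookup. The sketch below is a hypothetical illustration, not code from the package: the names `DEPTH_PRESETS` and `resolveDepth` and the field names are assumptions, and only the values are taken from the table.

```javascript
// Hypothetical lookup mirroring the README depth table; not the package's actual code.
const DEPTH_PRESETS = {
  fast:     { engines: 1, synthesis: false, sourceFetch: 0 },
  standard: { engines: 3, synthesis: true,  sourceFetch: 0 },
  deep:     { engines: 3, synthesis: true,  sourceFetch: 5 }, // "yes (top 5)" in the table
};

// Resolve a depth name to its preset; "standard" is the default per the table.
function resolveDepth(depth = "standard") {
  if (!Object.hasOwn(DEPTH_PRESETS, depth)) {
    const known = Object.keys(DEPTH_PRESETS).join(", ");
    throw new Error(`Unknown depth "${depth}"; expected one of: ${known}`);
  }
  return { depth, ...DEPTH_PRESETS[depth] };
}
```

Rejecting unknown names early, rather than silently falling back, keeps a typo such as `"depp"` from quietly running a slower or shallower search than intended.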

 ## Engines (for fast mode)

 | Engine | Alias | Best for |
 |--------|-------|----------|
-| `all` |
+| `all` | - | All 3 engines -- but for fast single-engine, pick one below |
 | `perplexity` | `p` | Technical Q&A, code explanations, documentation |
 | `bing` | `b` | Recent news, Microsoft ecosystem |
 | `google` | `g` | Broad coverage, multiple perspectives |
@@ -62,15 +62,15 @@ greedy_search({ query: "What's new in React 19?", depth: "standard" })
 When using `engine: "all"`, the tool streams progress as each engine completes:

 ```
-**Searching...**
-**Searching...**
-**Searching...**
-**Searching...**
+**Searching...** pending: perplexity, bing, google
+**Searching...** done: perplexity, pending: bing, google
+**Searching...** done: perplexity, done: bing, pending: google
+**Searching...** done: perplexity, done: bing, done: google
 ```
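
The progress lines shown above follow a simple pattern: each finished engine gets its own `done:` entry, and the remaining engines are grouped under one `pending:` entry. A minimal sketch of a formatter that reproduces those four lines, assuming engines are listed in a fixed order rather than in completion order (`formatProgress` is a hypothetical name, not a function from the package):

```javascript
// Hypothetical formatter reproducing the README progress lines; not the package's code.
function formatProgress(engines, done) {
  // One "done: <engine>" entry per finished engine, in the fixed engine order.
  const finished = engines.filter((e) => done.has(e)).map((e) => `done: ${e}`);
  // All unfinished engines share a single "pending: a, b" entry.
  const pending = engines.filter((e) => !done.has(e));
  const parts = [...finished];
  if (pending.length > 0) parts.push(`pending: ${pending.join(", ")}`);
  return `**Searching...** ${parts.join(", ")}`;
}

const engines = ["perplexity", "bing", "google"];
console.log(formatProgress(engines, new Set(["perplexity"])));
```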

 ## Deep Research Mode

-For research that matters
+For research that matters -- architecture decisions, library comparisons -- use `depth: "deep"`:

 ```
 greedy_search({ query: "best auth patterns for SaaS in 2026", depth: "deep" })
@@ -82,7 +82,7 @@ Deep mode: 3 engines + source fetching (top 5) + synthesis + confidence scores.
 - `standard` (default): 3 engines + synthesis. Good for most research.
 - `deep`: Same + fetches source content for grounded answers. Use when the answer really matters.

-**Legacy:** `deep_research` tool still works
+**Legacy:** `deep_research` tool still works -- aliases to `greedy_search` with `depth: "deep"`.

 ## Full vs Short Answers

@@ -120,8 +120,8 @@ greedy_search({ query: "Error: Cannot find module 'react-dom/client' Next.js 15"

 ## Requirements

-- **Chrome**
-- **Node.js 22+**
+- **Chrome** -- must be installed. The extension auto-launches a dedicated Chrome instance on port 9222 with its own isolated profile and DevTools port file, separate from your main browser session.
+- **Node.js 22+** -- for built-in `fetch` and WebSocket support.

 ## Setup (first time)

@@ -156,7 +156,7 @@ Run the test suite to verify everything works:
 Tests verify:
 - Single engine mode (perplexity, bing, google)
 - Sequential "all" mode searches
-- Parallel "all" mode (5 concurrent searches)
+- Parallel "all" mode (5 concurrent searches) -- detects tab race conditions
 - Synthesis mode with Gemini

 ## Troubleshooting
@@ -180,7 +180,7 @@ node ~/.pi/agent/git/GreedySearch-pi/launch.mjs

 ### Google / Bing "verify you're human"

-The extension auto-clicks verification buttons and Cloudflare Turnstile challenges using broad keyword matching
+The extension auto-clicks verification buttons and Cloudflare Turnstile challenges using broad keyword matching -- resilient to variations like "Verify you are human" or localised button text. For hard CAPTCHAs (image puzzles), solve manually in the Chrome window that opens.
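
A minimal sketch of the broad keyword matching described above, assuming it is a case-insensitive substring test on the button label. The changelog further down confirms only the `includes('verify')` and `includes('human')` checks in `consent.mjs`; the function name, the normalisation step, and the contrasting anchored regex here are illustrative assumptions.

```javascript
// Illustrative only: the real logic lives in consent.mjs and may differ.
const VERIFY_KEYWORDS = ["verify", "human"];

// True when the label contains any keyword, ignoring case and extra whitespace.
function looksLikeVerifyButton(label) {
  const text = String(label ?? "").toLowerCase().replace(/\s+/g, " ").trim();
  return VERIFY_KEYWORDS.some((keyword) => text.includes(keyword));
}

// A hypothetical anchored pattern, shown for contrast: it only accepts the exact word.
const anchoredPattern = /^verify$/i;

console.log(looksLikeVerifyButton("Verify you are human"));  // substring match succeeds
console.log(anchoredPattern.test("Verify you are human"));   // anchored match fails
```

The trade-off is precision: a substring test also matches unrelated labels that happen to contain a keyword, so it suits a flow where a stray click is cheaper than a missed challenge.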

 ### Parallel searches failing

@@ -192,60 +192,60 @@ Chrome may be unresponsive. Restart it with `launch.mjs --kill` then `launch.mjs

 ### Sources are empty or junk links

-Sources are now extracted by regex-parsing Markdown links (`[title](url)`) from the clipboard text captured after each engine responds
+Sources are now extracted by regex-parsing Markdown links (`[title](url)`) from the clipboard text captured after each engine responds -- not from DOM selectors that break when the engine's UI updates. If sources are empty, the engine's clipboard copy didn't include formatted links (Bing Copilot currently falls into this category).
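
To make the regex-parsing approach concrete, here is a minimal sketch that pulls `[title](url)` links out of a Markdown string. It is not the package's extractor: `extractSources` and the exact pattern are assumptions, and the pattern is deliberately simple (it stops at the first `)` in a URL and also matches image links).

```javascript
// Hypothetical helper: extract [title](url) links from clipboard Markdown text.
function extractSources(markdown) {
  // Group 1 is the link text, group 2 an http(s) URL without spaces or ")".
  const linkPattern = /\[([^\]]+)\]\((https?:\/\/[^\s)]+)\)/g;
  const seen = new Set();
  const sources = [];
  for (const match of markdown.matchAll(linkPattern)) {
    const [, title, url] = match;
    if (seen.has(url)) continue; // keep only the first occurrence of each URL
    seen.add(url);
    sources.push({ title: title.trim(), url });
  }
  return sources;
}

const clipboard =
  "See [React docs](https://react.dev/blog) and [MDN](https://developer.mozilla.org/en-US/).";
console.log(extractSources(clipboard));
```

When the clipboard text contains no Markdown links, the function returns an empty array, which corresponds to the empty-sources case described above.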

 ## How It Works

-- `index.ts`
-- `search.mjs`
-- `launch.mjs`
-- `extractors/`
-- `cdp.mjs`
-- `skills/greedy-search/SKILL.md`
+- `index.ts` -- Pi extension, registers `greedy_search` tool with streaming progress
+- `search.mjs` -- CLI runner, spawns extractors in parallel, emits `PROGRESS:` events to stderr
+- `launch.mjs` -- launches dedicated Chrome on port 9222 with isolated profile
+- `extractors/` -- per-engine CDP scrapers (Perplexity, Bing Copilot, Google AI, Gemini)
+- `cdp.mjs` -- Chrome DevTools Protocol CLI for browser automation
+- `skills/greedy-search/SKILL.md` -- skill file that guides the model on when/how to use greedy_search

 ## Changelog

 ### v1.6.1 (2026-03-31)
-- **Single-engine full answers by default**
-- **Codebase refactored**
-- **Removed codebase search confusion**
+- **Single-engine full answers by default** -- `engine: "google"` (or any single engine) now returns complete answers instead of truncated previews. Multi-engine (`all`) still truncates to save tokens during synthesis.
+- **Codebase refactored** -- extracted 438 lines from `index.ts` into modular formatters (`src/formatters/`) reducing cognitive complexity from 360 to ~60 and maintainability index from 11.2 to ~40+
+- **Removed codebase search confusion** -- clarified that `greedy_search` is WEB SEARCH ONLY (not for searching local code)

 ### v1.6.0 (2026-03-29)
-- **Merged deep_research into greedy_search**
-- **Simpler API**
-- **Backward compatible**
-- **Updated documentation**
+- **Merged deep_research into greedy_search** -- new `depth` parameter: `fast` (1 engine), `standard` (3 engines + synthesis), `deep` (3 engines + fetch + synthesis + confidence)
+- **Simpler API** -- one tool with clear speed/quality tradeoffs instead of separate tools with overlapping flags
+- **Backward compatible** -- `deep_research` still works as alias, `--synthesize` and `--deep-research` flags still function
+- **Updated documentation** -- README and skill docs now use `depth` parameter throughout

 ### v1.5.1 (2026-03-29)
-- Fixed npm package
+- Fixed npm package -- added `.pi-lens/` and test files to `.npmignore`

 ### v1.5.0 (2026-03-29)

-- **Code extraction fixed**
-- **Chrome targeting hardened**
-- **Shared utilities**
-- **Documentation leaner**
-- **NO API KEYS**
+- **Code extraction fixed** -- `coding_task` now uses clipboard interception to preserve markdown code blocks (was losing them via DOM scraping)
+- **Chrome targeting hardened** -- all tools now consistently target the dedicated GreedySearch Chrome via `CDP_PROFILE_DIR`, preventing fallback to user's main Chrome session
+- **Shared utilities** -- extracted ~220 lines of duplicate code from extractors into `common.mjs` (cdp wrapper, tab management, clipboard interception)
+- **Documentation leaner** -- skill documentation reduced 61% (180 -> 70 lines) while preserving all decision-making info
+- **NO API KEYS** -- updated messaging to emphasize this works via browser automation, no API keys needed

 ### v1.4.2 (2026-03-25)

-- **Fresh isolated tabs**
-- **Regex-based citation extraction**
-- **Relaxed verification detection**
+- **Fresh isolated tabs** -- each search now always creates a new `about:blank` tab via `Target.createTarget` and refreshes the CDP page cache immediately after, preventing SPA navigation failures and stale DOM state from prior queries
+- **Regex-based citation extraction** -- all extractors (Perplexity, Bing, Gemini) now parse sources from clipboard Markdown links (`[title](url)`) instead of DOM selectors that break on UI updates
+- **Relaxed verification detection** -- `consent.mjs` now uses broad keyword matching (`includes('verify')`, `includes('human')`) instead of anchored regexes, correctly catching button text variants like "Verify you are human" across Cloudflare, Microsoft, and generic modals
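
The "fresh isolated tabs" entry above relies on the CDP `Target.createTarget` command. As a rough sketch of what that exchange looks like on the wire, the code below builds the JSON command and reads the `targetId` from a response. The helper names are hypothetical and the WebSocket transport is left out; only the `{ id, method, params }` envelope and the `Target.createTarget` method with a `url` parameter come from the protocol itself.

```javascript
// Sketch of the CDP message for opening a fresh tab; sending it over a WebSocket is omitted.
let nextId = 0;

// Build a Chrome DevTools Protocol command envelope: { id, method, params }.
function cdpCommand(method, params = {}) {
  nextId += 1;
  return { id: nextId, method, params };
}

// Target.createTarget opens a new page target; "about:blank" gives a clean starting state.
function newBlankTabCommand() {
  return cdpCommand("Target.createTarget", { url: "about:blank" });
}

// Read the new tab's targetId from the matching response, or throw on a protocol error.
function readTargetId(command, response) {
  if (response.id !== command.id) throw new Error("Response does not match command id");
  if (response.error) throw new Error(`CDP error: ${response.error.message}`);
  return response.result.targetId;
}

console.log(JSON.stringify(newBlankTabCommand()));
```

Matching responses to commands by `id` matters when several searches share one browser connection, since replies can arrive in a different order than the commands were sent.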

 ---

 ### v1.4.1

-- **Fixed parallel synthesis**
+- **Fixed parallel synthesis** -- multiple `greedy_search` calls with `synthesize: true` now run safely in parallel. Each search creates a fresh Gemini tab that gets cleaned up after synthesis, preventing tab conflicts and "Uncaught" errors.

 ### v1.4.0

-- **Grounded synthesis**
-- **Real deep research**
-- **Richer source metadata**
-- **Cleaner tab lifecycle**
-- **Isolated Chrome targeting**
+- **Grounded synthesis** -- Gemini now receives a normalized source registry with stable source IDs, agreement summaries, caveats, and cited claims
+- **Real deep research** -- top sources are fetched before synthesis so deep research answers are grounded in fetched evidence, not just engine summaries
+- **Richer source metadata** -- source output now includes canonical URLs, domains, source types, per-engine attribution, and confidence metadata
+- **Cleaner tab lifecycle** -- temporary Perplexity, Bing, and Google tabs are closed after each fan-out search, and synthesis finishes on the Gemini tab
+- **Isolated Chrome targeting** -- GreedySearch now refuses to fall back to your normal Chrome session, preventing stray remote-debugging prompts
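
The "richer source metadata" entry above mentions canonical URLs, domains, and per-engine attribution. One way such a merge could work is sketched below. This is an assumption-heavy illustration, not the package's implementation: `canonicalize`, `mergeSources`, the canonicalization rules (drop the fragment, `utm_*` parameters, and trailing slashes), and the output shape are all hypothetical.

```javascript
// Hypothetical canonicalizer: drops the fragment, utm_* params, and trailing slashes.
function canonicalize(rawUrl) {
  const url = new URL(rawUrl); // the URL parser lowercases the host
  url.hash = "";
  for (const key of [...url.searchParams.keys()]) {
    if (key.startsWith("utm_")) url.searchParams.delete(key);
  }
  const path = url.pathname.replace(/\/+$/, "");
  return `${url.protocol}//${url.host}${path}${url.search}`;
}

// Merge per-engine source lists, recording which engines cited each canonical URL.
function mergeSources(resultsByEngine) {
  const merged = new Map();
  for (const [engine, sources] of Object.entries(resultsByEngine)) {
    for (const { title, url } of sources) {
      const canonicalUrl = canonicalize(url);
      const entry = merged.get(canonicalUrl) ??
        { canonicalUrl, domain: new URL(canonicalUrl).hostname, title, engines: [] };
      if (!entry.engines.includes(engine)) entry.engines.push(engine);
      merged.set(canonicalUrl, entry);
    }
  }
  // Sources cited by more engines sort first.
  return [...merged.values()].sort((a, b) => b.engines.length - a.engines.length);
}

console.log(mergeSources({
  perplexity: [{ title: "React", url: "https://react.dev/blog/?utm_source=x#top" }],
  google: [{ title: "React blog", url: "https://React.dev/blog" }],
}));
```

In this sketch the number of engines citing a source serves as a simple ordering signal; the package's actual confidence metadata may be computed differently.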

 ## License
