@apmantza/greedysearch-pi 1.4.0 → 1.4.1

package/README.md CHANGED
@@ -1,204 +1,208 @@
1
- # GreedySearch for Pi
2
-
3
- Pi extension that adds a `greedy_search` tool — fans out queries to Perplexity, Bing Copilot, and Google AI simultaneously and returns AI-synthesized answers with deduped sources. Streams progress as each engine completes.
4
-
5
- Forked from [GreedySearch-claude](https://github.com/apmantza/GreedySearch-claude).
6
-
7
- ## What's New (v1.4.0)
8
-
9
- - **Grounded synthesis** — Gemini now receives a normalized source registry with stable source IDs, agreement summaries, caveats, and cited claims
10
- - **Real deep research** — top sources are fetched before synthesis so deep research answers are grounded in fetched evidence, not just engine summaries
11
- - **Richer source metadata** — source output now includes canonical URLs, domains, source types, per-engine attribution, and confidence metadata
12
- - **Cleaner tab lifecycle** — temporary Perplexity, Bing, and Google tabs are closed after each fan-out search, and synthesis finishes on the Gemini tab
13
- - **Isolated Chrome targeting** — GreedySearch now refuses to fall back to your normal Chrome session, preventing stray remote-debugging prompts
14
-
15
- ## Install
16
-
17
- ```bash
18
- pi install npm:@apmantza/greedysearch-pi
19
- ```
20
-
21
- Or directly from git:
22
-
23
- ```bash
24
- pi install git:github.com/apmantza/GreedySearch-pi
25
- ```
26
-
27
- ## Quick Start
28
-
29
- Once installed, Pi gains a `greedy_search` tool. The model will use it automatically for questions about current libraries, error messages, version-specific docs, etc.
30
-
31
- ```
32
- greedy_search({ query: "What's new in React 19?", engine: "all" })
33
- ```
34
-
35
- ## Parameters
36
-
37
- | Parameter | Type | Default | Description |
38
- |-----------|------|---------|-------------|
39
- | `query` | string | required | The search question |
40
- | `engine` | string | `"all"` | Engine to use (see below) |
41
- | `synthesize` | boolean | `false` | Synthesize results into one answer via Gemini |
42
- | `fullAnswer` | boolean | `false` | Return complete answer (~3000+ chars) vs truncated preview (~300 chars) |
43
-
44
- ## Engines
45
-
46
- | Engine | Alias | Latency | Best for |
47
- |--------|-------|---------|----------|
48
- | `all` | — | 30-90s | Highest confidence — all 3 engines in parallel (default) |
49
- | `perplexity` | `p` | 15-30s | Technical Q&A, code explanations, documentation |
50
- | `bing` | `b` | 15-30s | Recent news, Microsoft ecosystem |
51
- | `google` | `g` | 15-30s | Broad coverage, multiple perspectives |
52
- | `gemini` | `gem` | 15-30s | Google's AI with different training data |
53
-
54
- ## Streaming Progress
55
-
56
- When using `engine: "all"`, the tool streams progress as each engine completes:
57
-
58
- ```
59
- **Searching...** ⏳ perplexity · ⏳ bing · ⏳ google
60
- **Searching...** ✅ perplexity done · ⏳ bing · ⏳ google
61
- **Searching...** ✅ perplexity done · ✅ bing done · ⏳ google
62
- **Searching...** ✅ perplexity done · ✅ bing done · ✅ google done
63
- ```
64
-
65
- ## Synthesis Mode
66
-
67
- For complex research questions, use `synthesize: true` with `engine: "all"`:
68
-
69
- ```
70
- greedy_search({ query: "best auth patterns for SaaS in 2026", engine: "all", synthesize: true })
71
- ```
72
-
73
- This deduplicates sources across engines, builds a normalized source registry, and feeds that context to Gemini for one clean synthesized answer. Adds ~30s but now returns agreement summaries, caveats, key claims, and better-labeled top sources.
74
-
75
- For the most grounded mode, use deep research from the CLI:
76
-
77
- ```bash
78
- node search.mjs all "best auth patterns for SaaS in 2026" --deep-research
79
- ```
80
-
81
- Deep research fetches top source pages before synthesis and reports source confidence metadata such as agreement level, fetched-source success rate, and source mix.
82
-
83
- **Use synthesis when:**
84
- - You need one definitive answer, not multiple perspectives
85
- - You're researching a topic to write about or make a decision
86
- - Token efficiency matters (one answer vs three)
87
-
88
- **Skip synthesis when:**
89
- - You want to see where engines disagree
90
- - Speed matters
91
-
92
- ## Full vs Short Answers
93
-
94
- Default mode returns ~300 char summaries to save tokens. Use `fullAnswer: true` for complete responses:
95
-
96
- ```
97
- greedy_search({ query: "explain the React compiler", engine: "perplexity", fullAnswer: true })
98
- ```
99
-
100
- ## Examples
101
-
102
- **Quick technical lookup:**
103
- ```
104
- greedy_search({ query: "How to use async await in Python", engine: "perplexity" })
105
- ```
106
-
107
- **Compare tools (see where engines agree/disagree):**
108
- ```
109
- greedy_search({ query: "Prisma vs Drizzle in 2026", engine: "all" })
110
- ```
111
-
112
- **Research with synthesis:**
113
- ```
114
- greedy_search({ query: "Best practices for monorepo structure", engine: "all", synthesize: true })
115
- ```
116
-
117
- **Debug an error:**
118
- ```
119
- greedy_search({ query: "Error: Cannot find module 'react-dom/client' Next.js 15", engine: "all" })
120
- ```
121
-
122
- ## Requirements
123
-
124
- - **Chrome** — must be installed. The extension auto-launches a dedicated Chrome instance on port 9222 with its own isolated profile and DevTools port file, separate from your main browser session.
125
- - **Node.js 22+** — for built-in `fetch` and WebSocket support.
126
-
127
- ## Setup (first time)
128
-
129
- To pre-launch the dedicated GreedySearch Chrome instance:
130
-
131
- ```bash
132
- node ~/.pi/agent/git/GreedySearch-pi/launch.mjs
133
- ```
134
-
135
- Stop it when done:
136
-
137
- ```bash
138
- node ~/.pi/agent/git/GreedySearch-pi/launch.mjs --kill
139
- ```
140
-
141
- Check status:
142
-
143
- ```bash
144
- node ~/.pi/agent/git/GreedySearch-pi/launch.mjs --status
145
- ```
146
-
147
- ## Testing
148
-
149
- Run the test suite to verify everything works:
150
-
151
- ```bash
152
- ./test.sh # full suite (~3-4 min)
153
- ./test.sh quick # skip parallel tests (~1 min)
154
- ./test.sh parallel # parallel race condition tests only
155
- ```
156
-
157
- Tests verify:
158
- - Single engine mode (perplexity, bing, google)
159
- - Sequential "all" mode searches
160
- - Parallel "all" mode (5 concurrent searches) — detects tab race conditions
161
- - Synthesis mode with Gemini
162
-
163
- ## Troubleshooting
164
-
165
- ### "Chrome not found"
166
- Set the path explicitly:
167
- ```bash
168
- export CHROME_PATH="/path/to/chrome"
169
- ```
170
-
171
- ### "CDP timeout" or "Chrome may have crashed"
172
- Restart GreedySearch Chrome:
173
- ```bash
174
- node ~/.pi/agent/git/GreedySearch-pi/launch.mjs --kill
175
- node ~/.pi/agent/git/GreedySearch-pi/launch.mjs
176
- ```
177
-
178
- ### Google / Bing "verify you're human"
179
- The extension auto-clicks verification buttons and Cloudflare Turnstile challenges. For hard CAPTCHAs (image puzzles), solve manually in the Chrome window that opens.
180
-
181
- ### Parallel searches failing
182
- Earlier versions shared Chrome tabs between concurrent searches, causing `ERR_ABORTED` errors. Version 1.2.0+ creates fresh tabs for each search, allowing safe parallel execution.
183
-
184
- ### Search hangs
185
- Chrome may be unresponsive. Restart it with `launch.mjs --kill` then `launch.mjs`.
186
-
187
- ### Sources are junk links
188
- This was a known issue with Gemini sources. If you're on an older version, update:
189
- ```bash
190
- pi install npm:@apmantza/greedysearch-pi
191
- ```
192
-
193
- ## How It Works
194
-
195
- - `index.ts` — Pi extension, registers `greedy_search` tool with streaming progress
196
- - `search.mjs` — CLI runner, spawns extractors in parallel, emits `PROGRESS:` events to stderr
197
- `launch.mjs` — launches dedicated Chrome on port 9222 with isolated profile
198
- - `extractors/` — per-engine CDP scrapers (Perplexity, Bing Copilot, Google AI, Gemini)
199
- - `cdp.mjs` — Chrome DevTools Protocol CLI for browser automation
200
- - `skills/greedy-search/SKILL.md` — skill file that guides the model on when/how to use greedy_search
201
-
202
- ## License
203
-
204
- MIT
1
+ # GreedySearch for Pi
2
+
3
+ Pi extension that adds a `greedy_search` tool — fans out queries to Perplexity, Bing Copilot, and Google AI simultaneously and returns AI-synthesized answers with deduped sources. Streams progress as each engine completes.
4
+
5
+ Forked from [GreedySearch-claude](https://github.com/apmantza/GreedySearch-claude).
6
+
7
+ ## What's New (v1.4.1)
8
+
9
+ - **Fixed parallel synthesis** — multiple `greedy_search` calls with `synthesize: true` now run safely in parallel. Each search creates a fresh Gemini tab that gets cleaned up after synthesis, preventing tab conflicts and "Uncaught" errors.
10
+
11
+ ## What's New (v1.4.0)
12
+
13
+ - **Grounded synthesis** — Gemini now receives a normalized source registry with stable source IDs, agreement summaries, caveats, and cited claims
14
+ - **Real deep research** — top sources are fetched before synthesis so deep research answers are grounded in fetched evidence, not just engine summaries
15
+ - **Richer source metadata** — source output now includes canonical URLs, domains, source types, per-engine attribution, and confidence metadata
16
+ - **Cleaner tab lifecycle** — temporary Perplexity, Bing, and Google tabs are closed after each fan-out search, and synthesis finishes on the Gemini tab
17
+ - **Isolated Chrome targeting** — GreedySearch now refuses to fall back to your normal Chrome session, preventing stray remote-debugging prompts
18
+
19
+ ## Install
20
+
21
+ ```bash
22
+ pi install npm:@apmantza/greedysearch-pi
23
+ ```
24
+
25
+ Or directly from git:
26
+
27
+ ```bash
28
+ pi install git:github.com/apmantza/GreedySearch-pi
29
+ ```
30
+
31
+ ## Quick Start
32
+
33
+ Once installed, Pi gains a `greedy_search` tool. The model will use it automatically for questions about current libraries, error messages, version-specific docs, etc.
34
+
35
+ ```
36
+ greedy_search({ query: "What's new in React 19?", engine: "all" })
37
+ ```
38
+
39
+ ## Parameters
40
+
41
+ | Parameter | Type | Default | Description |
42
+ |-----------|------|---------|-------------|
43
+ | `query` | string | required | The search question |
44
+ | `engine` | string | `"all"` | Engine to use (see below) |
45
+ | `synthesize` | boolean | `false` | Synthesize results into one answer via Gemini |
46
+ | `fullAnswer` | boolean | `false` | Return complete answer (~3000+ chars) vs truncated preview (~300 chars) |
47
+
48
+ ## Engines
49
+
50
+ | Engine | Alias | Latency | Best for |
51
+ |--------|-------|---------|----------|
52
+ | `all` | — | 30-90s | Highest confidence — all 3 engines in parallel (default) |
53
+ | `perplexity` | `p` | 15-30s | Technical Q&A, code explanations, documentation |
54
+ | `bing` | `b` | 15-30s | Recent news, Microsoft ecosystem |
55
+ | `google` | `g` | 15-30s | Broad coverage, multiple perspectives |
56
+ | `gemini` | `gem` | 15-30s | Google's AI with different training data |
57
+
58
+ ## Streaming Progress
59
+
60
+ When using `engine: "all"`, the tool streams progress as each engine completes:
61
+
62
+ ```
63
+ **Searching...** ⏳ perplexity · ⏳ bing · ⏳ google
64
+ **Searching...** ✅ perplexity done · ⏳ bing · ⏳ google
65
+ **Searching...** ✅ perplexity done · ✅ bing done · ⏳ google
66
+ **Searching...** ✅ perplexity done · ✅ bing done · ✅ google done
67
+ ```
68
+
69
+ ## Synthesis Mode
70
+
71
+ For complex research questions, use `synthesize: true` with `engine: "all"`:
72
+
73
+ ```
74
+ greedy_search({ query: "best auth patterns for SaaS in 2026", engine: "all", synthesize: true })
75
+ ```
76
+
77
+ This deduplicates sources across engines, builds a normalized source registry, and feeds that context to Gemini for one clean synthesized answer. Adds ~30s but now returns agreement summaries, caveats, key claims, and better-labeled top sources.
78
+
79
+ For the most grounded mode, use deep research from the CLI:
80
+
81
+ ```bash
82
+ node search.mjs all "best auth patterns for SaaS in 2026" --deep-research
83
+ ```
84
+
85
+ Deep research fetches top source pages before synthesis and reports source confidence metadata such as agreement level, fetched-source success rate, and source mix.
86
+
87
+ **Use synthesis when:**
88
+ - You need one definitive answer, not multiple perspectives
89
+ - You're researching a topic to write about or make a decision
90
+ - Token efficiency matters (one answer vs three)
91
+
92
+ **Skip synthesis when:**
93
+ - You want to see where engines disagree
94
+ - Speed matters
95
+
96
+ ## Full vs Short Answers
97
+
98
+ Default mode returns ~300 char summaries to save tokens. Use `fullAnswer: true` for complete responses:
99
+
100
+ ```
101
+ greedy_search({ query: "explain the React compiler", engine: "perplexity", fullAnswer: true })
102
+ ```
103
+
104
+ ## Examples
105
+
106
+ **Quick technical lookup:**
107
+ ```
108
+ greedy_search({ query: "How to use async await in Python", engine: "perplexity" })
109
+ ```
110
+
111
+ **Compare tools (see where engines agree/disagree):**
112
+ ```
113
+ greedy_search({ query: "Prisma vs Drizzle in 2026", engine: "all" })
114
+ ```
115
+
116
+ **Research with synthesis:**
117
+ ```
118
+ greedy_search({ query: "Best practices for monorepo structure", engine: "all", synthesize: true })
119
+ ```
120
+
121
+ **Debug an error:**
122
+ ```
123
+ greedy_search({ query: "Error: Cannot find module 'react-dom/client' Next.js 15", engine: "all" })
124
+ ```
125
+
126
+ ## Requirements
127
+
128
+ - **Chrome** — must be installed. The extension auto-launches a dedicated Chrome instance on port 9222 with its own isolated profile and DevTools port file, separate from your main browser session.
129
+ - **Node.js 22+** — for built-in `fetch` and WebSocket support.
130
+
131
+ ## Setup (first time)
132
+
133
+ To pre-launch the dedicated GreedySearch Chrome instance:
134
+
135
+ ```bash
136
+ node ~/.pi/agent/git/GreedySearch-pi/launch.mjs
137
+ ```
138
+
139
+ Stop it when done:
140
+
141
+ ```bash
142
+ node ~/.pi/agent/git/GreedySearch-pi/launch.mjs --kill
143
+ ```
144
+
145
+ Check status:
146
+
147
+ ```bash
148
+ node ~/.pi/agent/git/GreedySearch-pi/launch.mjs --status
149
+ ```
150
+
151
+ ## Testing
152
+
153
+ Run the test suite to verify everything works:
154
+
155
+ ```bash
156
+ ./test.sh # full suite (~3-4 min)
157
+ ./test.sh quick # skip parallel tests (~1 min)
158
+ ./test.sh parallel # parallel race condition tests only
159
+ ```
160
+
161
+ Tests verify:
162
+ - Single engine mode (perplexity, bing, google)
163
+ - Sequential "all" mode searches
164
+ - Parallel "all" mode (5 concurrent searches) — detects tab race conditions
165
+ - Synthesis mode with Gemini
166
+
167
+ ## Troubleshooting
168
+
169
+ ### "Chrome not found"
170
+ Set the path explicitly:
171
+ ```bash
172
+ export CHROME_PATH="/path/to/chrome"
173
+ ```
174
+
175
+ ### "CDP timeout" or "Chrome may have crashed"
176
+ Restart GreedySearch Chrome:
177
+ ```bash
178
+ node ~/.pi/agent/git/GreedySearch-pi/launch.mjs --kill
179
+ node ~/.pi/agent/git/GreedySearch-pi/launch.mjs
180
+ ```
181
+
182
+ ### Google / Bing "verify you're human"
183
+ The extension auto-clicks verification buttons and Cloudflare Turnstile challenges. For hard CAPTCHAs (image puzzles), solve manually in the Chrome window that opens.
184
+
185
+ ### Parallel searches failing
186
+ Earlier versions shared Chrome tabs between concurrent searches, causing `ERR_ABORTED` errors. Version 1.2.0+ creates fresh tabs for each search, allowing safe parallel execution.
187
+
188
+ ### Search hangs
189
+ Chrome may be unresponsive. Restart it with `launch.mjs --kill` then `launch.mjs`.
190
+
191
+ ### Sources are junk links
192
+ This was a known issue with Gemini sources. If you're on an older version, update:
193
+ ```bash
194
+ pi install npm:@apmantza/greedysearch-pi
195
+ ```
196
+
197
+ ## How It Works
198
+
199
+ - `index.ts` — Pi extension, registers `greedy_search` tool with streaming progress
200
+ - `search.mjs` — CLI runner, spawns extractors in parallel, emits `PROGRESS:` events to stderr
201
+ - `launch.mjs` — launches dedicated Chrome on port 9222 with isolated profile
202
+ - `extractors/` — per-engine CDP scrapers (Perplexity, Bing Copilot, Google AI, Gemini)
203
+ - `cdp.mjs` — Chrome DevTools Protocol CLI for browser automation
204
+ - `skills/greedy-search/SKILL.md` — skill file that guides the model on when/how to use greedy_search
205
+
206
+ ## License
207
+
208
+ MIT
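The progress lines shown under "Streaming Progress" in the README can be reproduced with a small formatter. This is only a sketch: `renderProgress` and its arguments are hypothetical names, not functions taken from `index.ts`, and the extension itself may assemble the line differently.

```javascript
// Hypothetical helper, not part of the package: formats one progress line
// in the style shown in the README's "Streaming Progress" section.
function renderProgress(engines, done) {
  const parts = engines.map((name) =>
    done.has(name) ? `✅ ${name} done` : `⏳ ${name}`
  );
  return `**Searching...** ${parts.join(' · ')}`;
}

// Simulate engines finishing one at a time.
const engines = ['perplexity', 'bing', 'google'];
const done = new Set();
console.log(renderProgress(engines, done));
for (const name of engines) {
  done.add(name);
  console.log(renderProgress(engines, done));
}
```

Keeping the completed engines in a `Set` means the line can be re-rendered from scratch on every update, so the order in which engines finish does not matter.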
package/cdp.mjs CHANGED
@@ -37,22 +37,22 @@ function getDevToolsActivePortPath() {
37
37
  return join(homedir(), '.config', 'google-chrome', 'DevToolsActivePort');
38
38
  }
39
39
 
40
- function getWsUrl() {
41
- // If CDP_PROFILE_DIR is set (by search.mjs), prefer that profile's port file
42
- // so GreedySearch targets its own Chrome, not the user's main session.
43
- const profileDir = process.env.CDP_PROFILE_DIR;
44
- if (profileDir) {
45
- const p = profileDir.replace(/\\/g, '/') + '/DevToolsActivePort';
46
- if (existsSync(p)) {
47
- const lines = readFileSync(p, 'utf8').trim().split('\n');
48
- return `ws://127.0.0.1:${lines[0]}${lines[1]}`;
49
- }
50
- throw new Error(`GreedySearch DevToolsActivePort not found at ${p}. Refusing to fall back to the main Chrome session.`);
51
- }
52
- const portFile = getDevToolsActivePortPath();
53
- const lines = readFileSync(portFile, 'utf8').trim().split('\n');
54
- return `ws://127.0.0.1:${lines[0]}${lines[1]}`;
55
- }
40
+ function getWsUrl() {
41
+ // If CDP_PROFILE_DIR is set (by search.mjs), prefer that profile's port file
42
+ // so GreedySearch targets its own Chrome, not the user's main session.
43
+ const profileDir = process.env.CDP_PROFILE_DIR;
44
+ if (profileDir) {
45
+ const p = profileDir.replace(/\\/g, '/') + '/DevToolsActivePort';
46
+ if (existsSync(p)) {
47
+ const lines = readFileSync(p, 'utf8').trim().split('\n');
48
+ return `ws://127.0.0.1:${lines[0]}${lines[1]}`;
49
+ }
50
+ throw new Error(`GreedySearch DevToolsActivePort not found at ${p}. Refusing to fall back to the main Chrome session.`);
51
+ }
52
+ const portFile = getDevToolsActivePortPath();
53
+ const lines = readFileSync(portFile, 'utf8').trim().split('\n');
54
+ return `ws://127.0.0.1:${lines[0]}${lines[1]}`;
55
+ }
56
56
 
57
57
  const sleep = (ms) => new Promise(r => setTimeout(r, ms));
58
58
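`getWsUrl` builds the WebSocket URL from the two lines of Chrome's `DevToolsActivePort` file: the port first, then the browser endpoint path. As a sketch, that parsing step can be isolated into a pure function that also validates its input. `parseDevToolsActivePort` is a hypothetical name, and the validation is an assumption added for illustration; the code in the diff reads the two lines without checks.

```javascript
// Hypothetical helper, not in cdp.mjs: turns DevToolsActivePort contents
// into a WebSocket URL, with validation the original code does not perform.
function parseDevToolsActivePort(contents) {
  // Line 1 is the port, line 2 is the browser endpoint path.
  const [port, path] = contents.trim().split(/\r?\n/);
  if (!/^\d+$/.test(port ?? '') || !path || !path.startsWith('/')) {
    throw new Error('Malformed DevToolsActivePort contents');
  }
  return `ws://127.0.0.1:${port}${path}`;
}

// Example contents; the browser ID here is made up for illustration.
const sample = '9222\n/devtools/browser/0123abcd\n';
console.log(parseDevToolsActivePort(sample));
// → ws://127.0.0.1:9222/devtools/browser/0123abcd
```

Failing loudly on a malformed file follows the same spirit as the existing refusal to fall back to the main Chrome session: a clear error is easier to diagnose than a connection attempt against `ws://127.0.0.1:undefined`.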