@apmantza/greedysearch-pi 1.6.6 → 1.7.2
- package/CHANGELOG.md +113 -81
- package/LICENSE +21 -21
- package/README.md +73 -252
- package/{cdp.mjs → bin/cdp.mjs} +1004 -1004
- package/{coding-task.mjs → bin/coding-task.mjs} +392 -392
- package/{launch.mjs → bin/launch.mjs} +288 -288
- package/{search.mjs → bin/search.mjs} +1484 -1176
- package/extractors/bing-copilot.mjs +167 -167
- package/extractors/common.mjs +237 -237
- package/extractors/consent.mjs +273 -273
- package/extractors/google-ai.mjs +156 -156
- package/extractors/perplexity.mjs +141 -128
- package/extractors/selectors.mjs +52 -52
- package/index.ts +18 -18
- package/package.json +46 -39
- package/skills/greedy-search/SKILL.md +117 -117
- package/src/fetcher.mjs +589 -0
- package/src/formatters/coding.ts +68 -68
- package/src/formatters/sources.ts +116 -116
- package/src/formatters/synthesis.ts +91 -100
- package/src/github.mjs +323 -0
- package/src/utils/content.mjs +56 -0
- package/src/utils/helpers.ts +40 -40
package/CHANGELOG.md
CHANGED
@@ -1,83 +1,115 @@
 # Changelog

-[old lines 3-83 removed; their text did not survive extraction beyond fragments such as "## v1.", "###", and "- **"]
+## v1.7.2 (2026-04-08)
+
+### Release
+- **Patch release** — version bump and npm package verification for the `bin/` runtime layout (`bin/search.mjs`, `bin/launch.mjs`, `bin/cdp.mjs`, `bin/coding-task.mjs`).
+
+## v1.7.1 (2026-04-08)
+
+### Performance
+- **Bounded source-fetch concurrency** — source fetching now uses a small worker pool (default `2`, configurable via `GREEDY_FETCH_CONCURRENCY`) to reduce burstiness while keeping deep-research fast.
+
+### Project structure
+- **Runtime scripts moved to `bin/`** — `search.mjs`, `launch.mjs`, `cdp.mjs`, and `coding-task.mjs` now live under `bin/` for a cleaner repository root.
+- **Path references updated** — extension runtime, tests, extractor shared utilities, and docs now point to `bin/*` paths.
+
+### Packaging & docs
+- **Package file list updated** — npm package now includes `bin/` directly instead of root script entries.
+- **README simplified** — rewritten into a shorter, concise format with quick install, usage, and layout guidance.
+
+## v1.6.5 (2026-04-04)
+
+### Security
+- **Private URL blocking** — Added validation to block requests to localhost, RFC1918 private addresses (10.x, 192.168.x), and .local/.internal domains. Prevents accidental exposure of internal services.
+
+### Features
+- **GitHub URL rewriting** — GitHub blob URLs (`github.com/owner/repo/blob/...`) are automatically rewritten to `raw.githubusercontent.com` for faster, cleaner raw file access.
+- **GitHub repo cloning** — Root and tree URLs now trigger `git clone --depth 1` for complete repo access. Agent can explore files locally instead of parsing rendered HTML. Includes README preview and directory tree listing.
+- **Head+tail content trimming** — Large documents now use smart truncation: keeps 75% from the beginning (introduction) + 25% from the end (conclusions/examples) with `[...content trimmed...]` marker, instead of simple truncation.
+- **Anubis bot detection** — Added detection for the new Anubis proof-of-work anti-bot system (`protected by anubis`, `anubis uses a proof-of-work`).
+
+### Fixes
+- **Perplexity clipboard retry** — Added single retry with 2s delay when clipboard extraction fails, improving reliability.
+
+## v1.6.4 (2026-04-02)
+
+### Fixes
+- **Gemini scroll-to-bottom** — Changed from small random jitter scrolls to actual bottom-of-page scrolls every ~6 seconds while waiting for the copy button. This ensures lazy-loaded content is triggered and the full answer is captured.
+- **Restored missing files** — `.mjs` source files (extractors, search.mjs, launch.mjs, etc.) were incorrectly removed in v1.6.2 cleanup; now properly tracked again.
+
+## v1.6.3 (2026-04-02)
+
+### Fixes
+- **Debug output removed** — Cleaned up stderr passthrough that was causing CDP connection issues in some environments.
+
+## v1.6.2 (2026-04-01)
+
+### Fixes
+- **Anti-bot detection evasion** — Gemini synthesis now performs gentle scroll every ~6 seconds while waiting for the copy button. This prevents the button from hanging due to anti-bot "human activity" checks.
+
+## v1.6.1 (2026-03-31)
+
+### Features
+- **Single-engine full answers by default** — when using `engine: "perplexity"`, `engine: "bing"`, `engine: "google"`, or `engine: "gemini"`, the full answer is now returned by default instead of truncated previews. Multi-engine (`engine: "all"`) still uses truncated previews (~300 chars) to save tokens during synthesis. Explicit `fullAnswer: true/false` always overrides.
+
+### Code Quality
+- **Major refactoring** — extracted 438 lines from `index.ts` (856 → 418 lines) into modular formatters:
+- `src/formatters/coding.ts` — coding task formatting
+- `src/formatters/results.ts` — search and deep research formatting
+- `src/formatters/sources.ts` — source utilities (URL, label, consensus, formatting)
+- `src/formatters/synthesis.ts` — synthesis rendering
+- `src/utils/helpers.ts` — shared formatting utilities
+- **Complexity reduced** — cognitive complexity dropped from 360 to ~60, maintainability index improved from 11.2 to ~40+
+- **Eliminated code duplication** — removed 6 duplicate blocks, consolidated 4+ single-use helper functions
+
+### Documentation
+- Clarified `greedy_search` is WEB SEARCH ONLY — removed "NOT for codebase search" from tool description (still in skill documentation)
+
+## v1.6.0 (2026-03-29)
+
+### Breaking Changes (Backward Compatible)
+- **Merged deep_research into greedy_search** — new `depth` parameter with three levels:
+- `fast`: single engine (~15-30s)
+- `standard`: 3 engines + synthesis (~30-90s, default for `engine: "all"`)
+- `deep`: 3 engines + source fetching + synthesis + confidence (~60-180s)
+- **Simpler mental model** — one tool with clear speed/quality tradeoffs instead of separate tools with overlapping flags
+- **Deprecated flags still work** — `--synthesize` maps to `depth: "standard"`, `--deep-research` maps to `depth: "deep"`
+- **deep_research tool aliased** — still works, calls `greedy_search` with `depth: "deep"`
+
+### Documentation
+- Updated README with new `depth` parameter and examples
+- Updated skill documentation (SKILL.md) to reflect simplified API
+
+## v1.5.1 (2026-03-29)
+
+- **Fixed npm package** — added `.pi-lens/` and test files to `.npmignore` to reduce package size
+
+## v1.5.0 (2026-03-29)
+
+### Features
+- **Code extraction fixed** — `coding_task` now uses clipboard interception to preserve markdown code blocks (was losing them via DOM scraping)
+- **Chrome targeting hardened** — all tools now consistently target the dedicated GreedySearch Chrome via `CDP_PROFILE_DIR`, preventing fallback to user's main Chrome session
+- **Shared utilities** — extracted ~220 lines of duplicate code from extractors into `common.mjs` (cdp wrapper, tab management, clipboard interception)
+- **Documentation leaner** — skill documentation reduced 61% (180 → 70 lines) while preserving all decision-making info
+
+### Notable
+- **NO API KEYS** — updated messaging to emphasize this works via browser automation, no API keys needed
+
+## v1.4.2 (2026-03-25)
+
+- **Fresh isolated tabs** — each search now always creates a new `about:blank` tab via `Target.createTarget` and refreshes the CDP page cache immediately after, preventing SPA navigation failures and stale DOM state from prior queries
+- **Regex-based citation extraction** — all extractors (Perplexity, Bing, Gemini) now parse sources from clipboard Markdown links (`[title](url)`) instead of DOM selectors that break on UI updates
+- **Relaxed verification detection** — `consent.mjs` now uses broad keyword matching (`includes('verify')`, `includes('human')`) instead of anchored regexes, correctly catching button text variants like "Verify you are human" across Cloudflare, Microsoft, and generic modals
+
+## v1.4.1
+
+- **Fixed parallel synthesis** — multiple `greedy_search` calls with `synthesize: true` now run safely in parallel. Each search creates a fresh Gemini tab that gets cleaned up after synthesis, preventing tab conflicts and "Uncaught" errors.
+
+## v1.4.0
+
+- **Grounded synthesis** — Gemini now receives a normalized source registry with stable source IDs, agreement summaries, caveats, and cited claims
+- **Real deep research** — top sources are fetched before synthesis so deep research answers are grounded in fetched evidence, not just engine summaries
+- **Richer source metadata** — source output now includes canonical URLs, domains, source types, per-engine attribution, and confidence metadata
+- **Cleaner tab lifecycle** — temporary Perplexity, Bing, and Google tabs are closed after each fan-out search, and synthesis finishes on the Gemini tab
+- **Isolated Chrome targeting** — GreedySearch now refuses to fall back to your normal Chrome session, preventing stray remote-debugging prompts
|
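The v1.7.1 entry above describes bounded source-fetch concurrency: a small worker pool, default size `2`, configurable through `GREEDY_FETCH_CONCURRENCY`. This excerpt does not show the contents of `src/fetcher.mjs`, so the following is only a minimal sketch of that general pattern. `resolveConcurrency` and `mapWithConcurrency` are hypothetical names, not functions from the package, and recording failures per item is an assumption.

```javascript
// Hypothetical sketch of a bounded worker pool; names are illustrative only
// and are not taken from src/fetcher.mjs.

// Read the pool size from GREEDY_FETCH_CONCURRENCY, falling back to 2
// when the variable is unset, non-numeric, or not a positive integer.
function resolveConcurrency(env = process.env) {
  const parsed = Number.parseInt(env.GREEDY_FETCH_CONCURRENCY ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : 2;
}

// Run `task` over `items` with at most `limit` tasks in flight.
// Results keep input order; a failed task is stored as { error } (assumption).
async function mapWithConcurrency(items, limit, task) {
  const results = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  async function worker() {
    while (next < items.length) {
      const index = next++; // claimed synchronously, before any await
      try {
        results[index] = { value: await task(items[index], index) };
      } catch (error) {
        results[index] = { error };
      }
    }
  }

  const poolSize = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: poolSize }, () => worker()));
  return results;
}
```

Under these assumptions, a call such as `mapWithConcurrency(urls, resolveConcurrency(), fetchOne)` (with `fetchOne` standing in for whatever fetch routine is used) would process the list two at a time by default, which matches the "reduce burstiness" goal stated in the entry.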
package/LICENSE
CHANGED
@@ -1,21 +1,21 @@
-MIT License
-
-Copyright (c) 2026
-
-Permission is hereby granted, free of charge, to any person obtaining a copy
-of this software and associated documentation files (the "Software"), to deal
-in the Software without restriction, including without limitation the rights
-to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-copies of the Software, and to permit persons to whom the Software is
-furnished to do so, subject to the following conditions:
-
-The above copyright notice and this permission notice shall be included in all
-copies or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-SOFTWARE.
+MIT License
+
+Copyright (c) 2026
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
package/README.md
CHANGED
@@ -1,252 +1,73 @@
-# GreedySearch for Pi
-[old lines 2-74: text largely lost in extraction; surviving fragments are "##" (lines 9, 60, 71), a code-fence marker (line 17), and `greedy_search({ query: "` (line 30)]
-```
-greedy_search({ query: "best auth patterns for SaaS in 2026", depth: "deep" })
-```
-
-Deep mode: 3 engines + source fetching (top 5) + synthesis + confidence scores. ~60-180s but returns grounded synthesis with fetched evidence.
-
-**Standard vs Deep:**
-- `standard` (default): 3 engines + synthesis. Good for most research.
-- `deep`: Same + fetches source content for grounded answers. Use when the answer really matters.
-
-**Legacy:** `deep_research` tool still works -- aliases to `greedy_search` with `depth: "deep"`.
-
-## Full vs Short Answers
-
-Default mode returns ~300 char summaries to save tokens. Use `fullAnswer: true` for complete responses:
-
-```
-greedy_search({ query: "explain the React compiler", engine: "perplexity", fullAnswer: true })
-```
-
-## Examples
-
-**Quick lookup (fast):**
-
-```
-greedy_search({ query: "How to use async await in Python", depth: "fast", engine: "perplexity" })
-```
-
-**Compare tools (standard):**
-
-```
-greedy_search({ query: "Prisma vs Drizzle in 2026", depth: "standard" })
-```
-
-**Deep research (architecture decision):**
-
-```
-greedy_search({ query: "Best practices for monorepo structure", depth: "deep" })
-```
-
-**Debug an error:**
-
-```
-greedy_search({ query: "Error: Cannot find module 'react-dom/client' Next.js 15", depth: "standard" })
-```
-
-## Requirements
-
-- **Chrome** -- must be installed. The extension auto-launches a dedicated Chrome instance on port 9222 with its own isolated profile and DevTools port file, separate from your main browser session.
-- **Node.js 22+** -- for built-in `fetch` and WebSocket support.
-
-## Setup (first time)
-
-To pre-launch the dedicated GreedySearch Chrome instance:
-
-```bash
-node ~/.pi/agent/git/GreedySearch-pi/launch.mjs
-```
-
-Stop it when done:
-
-```bash
-node ~/.pi/agent/git/GreedySearch-pi/launch.mjs --kill
-```
-
-Check status:
-
-```bash
-node ~/.pi/agent/git/GreedySearch-pi/launch.mjs --status
-```
-
-## Testing
-
-Run the test suite to verify everything works:
-
-```bash
-./test.sh # full suite (~3-4 min)
-./test.sh quick # skip parallel tests (~1 min)
-./test.sh parallel # parallel race condition tests only
-```
-
-Tests verify:
-- Single engine mode (perplexity, bing, google)
-- Sequential "all" mode searches
-- Parallel "all" mode (5 concurrent searches) -- detects tab race conditions
-- Synthesis mode with Gemini
-
-## Troubleshooting
-
-### "Chrome not found"
-
-Set the path explicitly:
-
-```bash
-export CHROME_PATH="/path/to/chrome"
-```
-
-### "CDP timeout" or "Chrome may have crashed"
-
-Restart GreedySearch Chrome:
-
-```bash
-node ~/.pi/agent/git/GreedySearch-pi/launch.mjs --kill
-node ~/.pi/agent/git/GreedySearch-pi/launch.mjs
-```
-
-### Google / Bing "verify you're human"
-
-The extension auto-clicks verification buttons and Cloudflare Turnstile challenges using broad keyword matching -- resilient to variations like "Verify you are human" or localised button text. For hard CAPTCHAs (image puzzles), solve manually in the Chrome window that opens.
-
-### Parallel searches failing
-
-Each search creates a fresh isolated browser tab that is closed after completion, allowing safe parallel execution without tab state conflicts.
-
-### Search hangs
-
-Chrome may be unresponsive. Restart it with `launch.mjs --kill` then `launch.mjs`.
-
-### Sources are empty or junk links
-
-Sources are now extracted by regex-parsing Markdown links (`[title](url)`) from the clipboard text captured after each engine responds -- not from DOM selectors that break when the engine's UI updates. If sources are empty, the engine's clipboard copy didn't include formatted links (Bing Copilot currently falls into this category).
-
-## How It Works
-
-- `index.ts` -- Pi extension, registers `greedy_search` tool with streaming progress
-- `search.mjs` -- CLI runner, spawns extractors in parallel, emits `PROGRESS:` events to stderr
-- `launch.mjs` -- launches dedicated Chrome on port 9222 with isolated profile
-- `extractors/` -- per-engine CDP scrapers (Perplexity, Bing Copilot, Google AI, Gemini)
-- `cdp.mjs` -- Chrome DevTools Protocol CLI for browser automation
-- `skills/greedy-search/SKILL.md` -- skill file that guides the model on when/how to use greedy_search
-
-## Changelog
-
-### v1.6.1 (2026-03-31)
-- **Single-engine full answers by default** -- `engine: "google"` (or any single engine) now returns complete answers instead of truncated previews. Multi-engine (`all`) still truncates to save tokens during synthesis.
-- **Codebase refactored** -- extracted 438 lines from `index.ts` into modular formatters (`src/formatters/`) reducing cognitive complexity from 360 to ~60 and maintainability index from 11.2 to ~40+
-- **Removed codebase search confusion** -- clarified that `greedy_search` is WEB SEARCH ONLY (not for searching local code)
-
-### v1.6.0 (2026-03-29)
-- **Merged deep_research into greedy_search** -- new `depth` parameter: `fast` (1 engine), `standard` (3 engines + synthesis), `deep` (3 engines + fetch + synthesis + confidence)
-- **Simpler API** -- one tool with clear speed/quality tradeoffs instead of separate tools with overlapping flags
-- **Backward compatible** -- `deep_research` still works as alias, `--synthesize` and `--deep-research` flags still function
-- **Updated documentation** -- README and skill docs now use `depth` parameter throughout
-
-### v1.5.1 (2026-03-29)
-- Fixed npm package -- added `.pi-lens/` and test files to `.npmignore`
-
-### v1.5.0 (2026-03-29)
-
-- **Code extraction fixed** -- `coding_task` now uses clipboard interception to preserve markdown code blocks (was losing them via DOM scraping)
-- **Chrome targeting hardened** -- all tools now consistently target the dedicated GreedySearch Chrome via `CDP_PROFILE_DIR`, preventing fallback to user's main Chrome session
-- **Shared utilities** -- extracted ~220 lines of duplicate code from extractors into `common.mjs` (cdp wrapper, tab management, clipboard interception)
-- **Documentation leaner** -- skill documentation reduced 61% (180 -> 70 lines) while preserving all decision-making info
-- **NO API KEYS** -- updated messaging to emphasize this works via browser automation, no API keys needed
-
-### v1.4.2 (2026-03-25)
-
-- **Fresh isolated tabs** -- each search now always creates a new `about:blank` tab via `Target.createTarget` and refreshes the CDP page cache immediately after, preventing SPA navigation failures and stale DOM state from prior queries
-- **Regex-based citation extraction** -- all extractors (Perplexity, Bing, Gemini) now parse sources from clipboard Markdown links (`[title](url)`) instead of DOM selectors that break on UI updates
-- **Relaxed verification detection** -- `consent.mjs` now uses broad keyword matching (`includes('verify')`, `includes('human')`) instead of anchored regexes, correctly catching button text variants like "Verify you are human" across Cloudflare, Microsoft, and generic modals
-
----
-
-### v1.4.1
-
-- **Fixed parallel synthesis** -- multiple `greedy_search` calls with `synthesize: true` now run safely in parallel. Each search creates a fresh Gemini tab that gets cleaned up after synthesis, preventing tab conflicts and "Uncaught" errors.
-
-### v1.4.0
-
-- **Grounded synthesis** -- Gemini now receives a normalized source registry with stable source IDs, agreement summaries, caveats, and cited claims
-- **Real deep research** -- top sources are fetched before synthesis so deep research answers are grounded in fetched evidence, not just engine summaries
-- **Richer source metadata** -- source output now includes canonical URLs, domains, source types, per-engine attribution, and confidence metadata
-- **Cleaner tab lifecycle** -- temporary Perplexity, Bing, and Google tabs are closed after each fan-out search, and synthesis finishes on the Gemini tab
-- **Isolated Chrome targeting** -- GreedySearch now refuses to fall back to your normal Chrome session, preventing stray remote-debugging prompts
-
-## License
-
-MIT
+# GreedySearch for Pi
+
+Multi-engine AI web search for Pi via browser automation.
+
+- No API keys
+- Real browser results (Perplexity, Bing Copilot, Google AI)
+- Optional Gemini synthesis with source grounding
+
+## Install
+
+```bash
+pi install npm:@apmantza/greedysearch-pi
+```
+
+Or from git:
+
+```bash
+pi install git:github.com/apmantza/GreedySearch-pi
+```
+
+## Tools
+
+- `greedy_search` - fast or grounded multi-engine search
+- `coding_task` - browser-routed Gemini/Copilot coding assistance
+
+## Quick usage
+
+```js
+greedy_search({ query: "React 19 changes" })
+greedy_search({ query: "Prisma vs Drizzle", engine: "all", depth: "fast" })
+greedy_search({ query: "Best auth architecture 2026", engine: "all", depth: "deep" })
+```
+
+## Parameters (`greedy_search`)
+
+- `query` (required)
+- `engine`: `all` (default), `perplexity`, `bing`, `google`, `gemini`
+- `depth`: `standard` (default), `fast`, `deep`
+- `fullAnswer`: return full single-engine output instead of preview
+
+## Depth modes
+
+- `fast` - quickest, no synthesis/source fetching
+- `standard` - balanced default for `engine: "all"` (synthesis + fetched sources)
+- `deep` - strongest grounding and confidence metadata
+
+## Runtime commands
+
+```bash
+node ~/.pi/agent/git/GreedySearch-pi/bin/launch.mjs
+node ~/.pi/agent/git/GreedySearch-pi/bin/launch.mjs --status
+node ~/.pi/agent/git/GreedySearch-pi/bin/launch.mjs --kill
+```
+
+## Requirements
+
+- Chrome
+- Node.js 22+
+
+## Project layout
+
+- `bin/` - runtime CLIs (`search.mjs`, `launch.mjs`, `cdp.mjs`, `coding-task.mjs`)
+- `extractors/` - engine-specific automation
+- `src/` - ranking/fetching/formatting internals
+- `skills/` - Pi skill metadata
+
+## Changelog
+
+See `CHANGELOG.md`.
+
+## License
+
+MIT
|
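The parameter list in the new README, read together with the v1.6.1 changelog entry, suggests how defaults might resolve. `engine` defaults to `all` and `depth` to `standard`. `fullAnswer` defaults to full output for a single engine and to previews for `all`, unless it is set explicitly. The sketch below illustrates only that reading. `resolveSearchOptions` is a hypothetical helper, and the actual option handling in `index.ts` is not visible in this diff and may differ.

```javascript
// Hypothetical helper illustrating the documented defaults; this is not
// code from the package, and the validation rules here are assumptions.
const ENGINES = ["all", "perplexity", "bing", "google", "gemini"];
const DEPTHS = ["fast", "standard", "deep"];

function resolveSearchOptions({ query, engine = "all", depth = "standard", fullAnswer } = {}) {
  if (typeof query !== "string" || query.trim() === "") {
    throw new TypeError("query is required");
  }
  if (!ENGINES.includes(engine)) throw new RangeError(`unknown engine: ${engine}`);
  if (!DEPTHS.includes(depth)) throw new RangeError(`unknown depth: ${depth}`);

  // An explicit boolean always wins. Otherwise a single engine returns the
  // full answer, and engine "all" returns short previews.
  const resolvedFullAnswer =
    typeof fullAnswer === "boolean" ? fullAnswer : engine !== "all";

  return { query, engine, depth, fullAnswer: resolvedFullAnswer };
}
```

Under these assumptions, `resolveSearchOptions({ query: "React 19 changes" })` would resolve to `engine: "all"`, `depth: "standard"`, and preview-length answers, while adding `engine: "perplexity"` would switch to the full answer unless `fullAnswer: false` is passed.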