@apmantza/greedysearch-pi 1.4.0 → 1.4.2
- package/README.md +39 -24
- package/extractors/bing-copilot.mjs +195 -204
- package/extractors/consent.mjs +255 -248
- package/extractors/gemini.mjs +12 -53
- package/extractors/google-ai.mjs +162 -165
- package/extractors/perplexity.mjs +181 -184
- package/package.json +2 -2
- package/search.mjs +997 -996
package/README.md
CHANGED
@@ -4,14 +4,6 @@ Pi extension that adds a `greedy_search` tool — fans out queries to Perplexity
 
 Forked from [GreedySearch-claude](https://github.com/apmantza/GreedySearch-claude).
 
-## What's New (v1.4.0)
-
-- **Grounded synthesis** — Gemini now receives a normalized source registry with stable source IDs, agreement summaries, caveats, and cited claims
-- **Real deep research** — top sources are fetched before synthesis so deep research answers are grounded in fetched evidence, not just engine summaries
-- **Richer source metadata** — source output now includes canonical URLs, domains, source types, per-engine attribution, and confidence metadata
-- **Cleaner tab lifecycle** — temporary Perplexity, Bing, and Google tabs are closed after each fan-out search, and synthesis finishes on the Gemini tab
-- **Isolated Chrome targeting** — GreedySearch now refuses to fall back to your normal Chrome session, preventing stray remote-debugging prompts
-
 ## Install
 
 ```bash
@@ -70,15 +62,7 @@ For complex research questions, use `synthesize: true` with `engine: "all"`:
 greedy_search({ query: "best auth patterns for SaaS in 2026", engine: "all", synthesize: true })
 ```
 
-This deduplicates sources across engines, builds a normalized source registry, and feeds that context to Gemini for one clean synthesized answer. Adds ~30s but
-
-For the most grounded mode, use deep research from the CLI:
-
-```bash
-node search.mjs all "best auth patterns for SaaS in 2026" --deep-research
-```
-
-Deep research fetches top source pages before synthesis and reports source confidence metadata such as agreement level, fetched-source success rate, and source mix.
+This deduplicates sources across engines, builds a normalized source registry, and feeds that context to Gemini for one clean synthesized answer. Adds ~30s but returns agreement summaries, caveats, key claims, and better-labeled top sources.
 
 **Use synthesis when:**
 - You need one definitive answer, not multiple perspectives
@@ -100,21 +84,25 @@ greedy_search({ query: "explain the React compiler", engine: "perplexity", fullA
 ## Examples
 
 **Quick technical lookup:**
+
 ```
 greedy_search({ query: "How to use async await in Python", engine: "perplexity" })
 ```
 
 **Compare tools (see where engines agree/disagree):**
+
 ```
 greedy_search({ query: "Prisma vs Drizzle in 2026", engine: "all" })
 ```
 
 **Research with synthesis:**
+
 ```
 greedy_search({ query: "Best practices for monorepo structure", engine: "all", synthesize: true })
 ```
 
 **Debug an error:**
+
 ```
 greedy_search({ query: "Error: Cannot find module 'react-dom/client' Next.js 15", engine: "all" })
 ```
@@ -163,32 +151,37 @@ Tests verify:
 ## Troubleshooting
 
 ### "Chrome not found"
+
 Set the path explicitly:
+
 ```bash
 export CHROME_PATH="/path/to/chrome"
 ```
 
 ### "CDP timeout" or "Chrome may have crashed"
+
 Restart GreedySearch Chrome:
+
 ```bash
 node ~/.pi/agent/git/GreedySearch-pi/launch.mjs --kill
 node ~/.pi/agent/git/GreedySearch-pi/launch.mjs
 ```
 
 ### Google / Bing "verify you're human"
-
+
+The extension auto-clicks verification buttons and Cloudflare Turnstile challenges using broad keyword matching — resilient to variations like "Verify you are human" or localised button text. For hard CAPTCHAs (image puzzles), solve manually in the Chrome window that opens.
 
 ### Parallel searches failing
-
+
+Each search creates a fresh isolated browser tab that is closed after completion, allowing safe parallel execution without tab state conflicts.
 
 ### Search hangs
+
 Chrome may be unresponsive. Restart it with `launch.mjs --kill` then `launch.mjs`.
 
-### Sources are junk links
-
-
-pi install npm:@apmantza/greedysearch-pi
-```
+### Sources are empty or junk links
+
+Sources are now extracted by regex-parsing Markdown links (`[title](url)`) from the clipboard text captured after each engine responds — not from DOM selectors that break when the engine's UI updates. If sources are empty, the engine's clipboard copy didn't include formatted links (Bing Copilot currently falls into this category).
 
 ## How It Works
 
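The "Sources are empty or junk links" text in the hunk above describes pulling sources out of clipboard Markdown with a regular expression. The following is a minimal sketch of that idea, not the package's code: the helper name `extractMarkdownSources`, its `limit` parameter, and the sample text are assumptions made for illustration. The pattern and the cap of 10 follow the `bing-copilot.mjs` listing in this diff.

```javascript
// Hypothetical helper (name and signature are assumptions, not part of the package).
// Collects unique [title](url) links from a block of Markdown text.
function extractMarkdownSources(text, limit = 10) {
  // Title: anything except ']'. URL: http(s) up to the first whitespace or ')'.
  const linkPattern = /\[([^\]]+)\]\((https?:\/\/[^\s)]+)\)/g;
  const seen = new Set();
  const sources = [];
  for (const match of text.matchAll(linkPattern)) {
    const [, title, url] = match;
    if (seen.has(url)) continue; // keep only the first occurrence of each URL
    seen.add(url);
    sources.push({ title, url });
    if (sources.length === limit) break;
  }
  return sources;
}

// Made-up clipboard text for illustration only.
const sample = [
  'Use [MDN](https://developer.mozilla.org/) for reference,',
  'see [Node docs](https://nodejs.org/api/) and again [MDN mirror](https://developer.mozilla.org/).',
  'Plain URLs such as https://example.com are not matched.',
].join('\n');

console.log(extractMarkdownSources(sample));
```

Because only `[title](url)` pairs match, a clipboard copy that contains bare URLs or no links at all yields an empty list, which is consistent with the empty-sources case the README text mentions.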
@@ -199,6 +192,28 @@ pi install npm:@apmantza/greedysearch-pi
 - `cdp.mjs` — Chrome DevTools Protocol CLI for browser automation
 - `skills/greedy-search/SKILL.md` — skill file that guides the model on when/how to use greedy_search
 
+## Changelog
+
+### v1.4.2 (2026-03-25)
+
+- **Fresh isolated tabs** — each search now always creates a new `about:blank` tab via `Target.createTarget` and refreshes the CDP page cache immediately after, preventing SPA navigation failures and stale DOM state from prior queries
+- **Regex-based citation extraction** — all extractors (Perplexity, Bing, Gemini) now parse sources from clipboard Markdown links (`[title](url)`) instead of DOM selectors that break on UI updates
+- **Relaxed verification detection** — `consent.mjs` now uses broad keyword matching (`includes('verify')`, `includes('human')`) instead of anchored regexes, correctly catching button text variants like "Verify you are human" across Cloudflare, Microsoft, and generic modals
+
+---
+
+### v1.4.1
+
+- **Fixed parallel synthesis** — multiple `greedy_search` calls with `synthesize: true` now run safely in parallel. Each search creates a fresh Gemini tab that gets cleaned up after synthesis, preventing tab conflicts and "Uncaught" errors.
+
+### v1.4.0
+
+- **Grounded synthesis** — Gemini now receives a normalized source registry with stable source IDs, agreement summaries, caveats, and cited claims
+- **Real deep research** — top sources are fetched before synthesis so deep research answers are grounded in fetched evidence, not just engine summaries
+- **Richer source metadata** — source output now includes canonical URLs, domains, source types, per-engine attribution, and confidence metadata
+- **Cleaner tab lifecycle** — temporary Perplexity, Bing, and Google tabs are closed after each fan-out search, and synthesis finishes on the Gemini tab
+- **Isolated Chrome targeting** — GreedySearch now refuses to fall back to your normal Chrome session, preventing stray remote-debugging prompts
+
 ## License
 
 MIT
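The v1.4.2 changelog entry above contrasts anchored regexes with broad keyword matching for verification buttons. The sketch below illustrates that difference. The `consent.mjs` changes are not shown in this excerpt, so everything here is an assumption: the function name `looksLikeVerification`, the example anchored pattern, the sample labels, and the choice to combine the two `includes` checks with OR.

```javascript
// Hypothetical stand-ins; the real consent.mjs logic is not shown in this excerpt.

// An anchored pattern accepts exactly one phrasing (assumed example of the old style).
const anchoredPattern = /^verify you're human$/i;

// Broad keyword matching in the spirit of includes('verify') / includes('human').
// Combining the two checks with OR is an assumption.
function looksLikeVerification(label) {
  const text = label.toLowerCase();
  return text.includes('verify') || text.includes('human');
}

// Made-up button labels for illustration.
const labels = [
  "Verify you're human",
  'Verify you are human',
  'I am human',
  'Accept all cookies',
];

for (const label of labels) {
  console.log(label, '| anchored:', anchoredPattern.test(label), '| keyword:', looksLikeVerification(label));
}
```

The trade-off in a scheme like this is that a substring check tolerates wording changes but can also match unrelated buttons that happen to contain the same keywords, so it is usually paired with some check on where the button appears.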
package/extractors/bing-copilot.mjs
CHANGED
@@ -1,204 +1,195 @@
-#!/usr/bin/env node
-// extractors/bing-copilot.mjs
-// Navigate copilot.microsoft.com, wait for answer to complete, return clean answer + sources.
-//
-// Usage:
-//   node extractors/bing-copilot.mjs "<query>" [--tab <prefix>]
-//
-// Output (stdout): JSON { answer, sources, query, url }
-// Errors go to stderr only — stdout is always clean JSON for piping.
-
-import { readFileSync, existsSync } from 'fs';
-import { spawn } from 'child_process';
-import { tmpdir } from 'os';
-import { join, dirname } from 'path';
-import { fileURLToPath } from 'url';
-import { dismissConsent, handleVerification } from './consent.mjs';
-import { SELECTORS } from './selectors.mjs';
-
-const __dir = dirname(fileURLToPath(import.meta.url));
-const CDP = join(__dir, '..', 'cdp.mjs');
-const PAGES_CACHE = `${tmpdir().replace(/\\/g, '/')}/cdp-pages.json`;
-
-const COPY_POLL_INTERVAL = 700;
-const COPY_TIMEOUT = 60000;
-
-const S = SELECTORS.bing;
-
-// ---------------------------------------------------------------------------
-
-function cdp(args, timeoutMs = 30000) {
-  return new Promise((resolve, reject) => {
-    const proc = spawn('node', [CDP, ...args], { stdio: ['ignore', 'pipe', 'pipe'] });
-    let out = '';
-    let err = '';
-    proc.stdout.on('data', d => out += d);
-    proc.stderr.on('data', d => err += d);
-    const timer = setTimeout(() => { proc.kill(); reject(new Error(`cdp timeout: ${args[0]}`)); }, timeoutMs);
-    proc.on('close', code => {
-      clearTimeout(timer);
-      if (code !== 0) reject(new Error(err.trim() || `cdp exit ${code}`));
-      else resolve(out.trim());
-    });
-  });
-}
-
-async function getOrOpenTab(tabPrefix) {
-  if (tabPrefix) return tabPrefix;
-
-
-
-
-
-}
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-}
-
-
-await cdp(['eval', tab, `
-
-
-
-
-
-
-(
-
-
-
-
-
-
-
-
-
-
-
-}
-
-
-
-
-const
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-await cdp
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-//
-const
-
-
-
-
-
-await
-
-
-
-
-
-
-
-
-
-await
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-const finalUrl = await cdp(['eval', tab, 'document.location.href']).catch(() => '');
-process.stdout.write(JSON.stringify({ query, url: finalUrl, answer: out, sources }, null, 2) + '\n');
-} catch (e) {
-process.stderr.write(`Error: ${e.message}\n`);
-process.exit(1);
-}
-}
-
-main();
+#!/usr/bin/env node
+// extractors/bing-copilot.mjs
+// Navigate copilot.microsoft.com, wait for answer to complete, return clean answer + sources.
+//
+// Usage:
+//   node extractors/bing-copilot.mjs "<query>" [--tab <prefix>]
+//
+// Output (stdout): JSON { answer, sources, query, url }
+// Errors go to stderr only — stdout is always clean JSON for piping.
+
+import { readFileSync, existsSync } from 'fs';
+import { spawn } from 'child_process';
+import { tmpdir } from 'os';
+import { join, dirname } from 'path';
+import { fileURLToPath } from 'url';
+import { dismissConsent, handleVerification } from './consent.mjs';
+import { SELECTORS } from './selectors.mjs';
+
+const __dir = dirname(fileURLToPath(import.meta.url));
+const CDP = join(__dir, '..', 'cdp.mjs');
+const PAGES_CACHE = `${tmpdir().replace(/\\/g, '/')}/cdp-pages.json`;
+
+const COPY_POLL_INTERVAL = 700;
+const COPY_TIMEOUT = 60000;
+
+const S = SELECTORS.bing;
+
+// ---------------------------------------------------------------------------
+
+function cdp(args, timeoutMs = 30000) {
+  return new Promise((resolve, reject) => {
+    const proc = spawn('node', [CDP, ...args], { stdio: ['ignore', 'pipe', 'pipe'] });
+    let out = '';
+    let err = '';
+    proc.stdout.on('data', d => out += d);
+    proc.stderr.on('data', d => err += d);
+    const timer = setTimeout(() => { proc.kill(); reject(new Error(`cdp timeout: ${args[0]}`)); }, timeoutMs);
+    proc.on('close', code => {
+      clearTimeout(timer);
+      if (code !== 0) reject(new Error(err.trim() || `cdp exit ${code}`));
+      else resolve(out.trim());
+    });
+  });
+}
+
+async function getOrOpenTab(tabPrefix) {
+  if (tabPrefix) return tabPrefix;
+  // Always open a fresh tab to avoid SPA navigation issues
+  const list = await cdp(['list']);
+  const anchor = list.split('\n')[0]?.slice(0, 8);
+  if (!anchor) throw new Error('No Chrome tabs found. Is Chrome running with --remote-debugging-port=9222?');
+  const raw = await cdp(['evalraw', anchor, 'Target.createTarget', '{"url":"about:blank"}']);
+  const { targetId } = JSON.parse(raw);
+  await cdp(['list']); // refresh cache
+  return targetId.slice(0, 8);
+}
+
+async function injectClipboardInterceptor(tab) {
+  await cdp(['eval', tab, `
+    window.__bingClipboard = null;
+    const _origWriteText = navigator.clipboard.writeText.bind(navigator.clipboard);
+    navigator.clipboard.writeText = function(text) {
+      window.__bingClipboard = text;
+      return _origWriteText(text);
+    };
+    const _origWrite = navigator.clipboard.write.bind(navigator.clipboard);
+    navigator.clipboard.write = async function(items) {
+      try {
+        for (const item of items) {
+          if (item.types && item.types.includes('text/plain')) {
+            const blob = await item.getType('text/plain');
+            window.__bingClipboard = await blob.text();
+            break;
+          }
+        }
+      } catch(e) {}
+      return _origWrite(items);
+    };
+  `]);
+}
+
+async function waitForCopyButton(tab) {
+  const deadline = Date.now() + COPY_TIMEOUT;
+  while (Date.now() < deadline) {
+    await new Promise(r => setTimeout(r, COPY_POLL_INTERVAL));
+    const found = await cdp(['eval', tab,
+      `!!document.querySelector('${S.copyButton}')`
+    ]).catch(() => 'false');
+    if (found === 'true') return;
+  }
+  throw new Error(`Copilot copy button did not appear within ${COPY_TIMEOUT}ms`);
+}
+
+async function extractAnswer(tab) {
+  await cdp(['eval', tab, `document.querySelector('${S.copyButton}')?.click()`]);
+  await new Promise(r => setTimeout(r, 400));
+
+  const answer = await cdp(['eval', tab, `window.__bingClipboard || ''`]);
+  if (!answer) throw new Error('Clipboard interceptor returned empty text');
+
+  // Regex parse Markdown links from clipboard — robust against DOM changes
+  const sources = Array.from(answer.matchAll(/\[([^\]]+)\]\((https?:\/\/[^\s\)]+)\)/g))
+    .map(m => ({ title: m[1], url: m[2] }))
+    .filter((v, i, arr) => arr.findIndex(x => x.url === v.url) === i)
+    .slice(0, 10);
+
+  return { answer: answer.trim(), sources };
+}
+
+// ---------------------------------------------------------------------------
+
+async function main() {
+  const args = process.argv.slice(2);
+  if (!args.length || args[0] === '--help') {
+    process.stderr.write('Usage: node extractors/bing-copilot.mjs "<query>" [--tab <prefix>]\n');
+    process.exit(1);
+  }
+
+  const short = args.includes('--short');
+  const rest = args.filter(a => a !== '--short');
+  const tabFlagIdx = rest.indexOf('--tab');
+  const tabPrefix = tabFlagIdx !== -1 ? rest[tabFlagIdx + 1] : null;
+  const query = tabFlagIdx !== -1
+    ? rest.filter((_, i) => i !== tabFlagIdx && i !== tabFlagIdx + 1).join(' ')
+    : rest.join(' ');
+
+  try {
+    await cdp(['list']);
+    const tab = await getOrOpenTab(tabPrefix);
+
+    // Navigate to Copilot homepage and use the chat input
+    await cdp(['nav', tab, 'https://copilot.microsoft.com/'], 35000);
+    await new Promise(r => setTimeout(r, 2000));
+    await dismissConsent(tab, cdp);
+
+    // Handle verification challenges (Cloudflare Turnstile, Microsoft auth, etc.)
+    const verifyResult = await handleVerification(tab, cdp, 90000);
+    if (verifyResult === 'needs-human') {
+      throw new Error('Copilot verification required — please solve it manually in the browser window');
+    }
+
+    // After verification, page may have redirected or reloaded — wait for it to settle
+    if (verifyResult === 'clicked') {
+      await new Promise(r => setTimeout(r, 3000));
+
+      // Re-navigate if we got redirected
+      const currentUrl = await cdp(['eval', tab, 'document.location.href']).catch(() => '');
+      if (!currentUrl.includes('copilot.microsoft.com')) {
+        await cdp(['nav', tab, 'https://copilot.microsoft.com/'], 35000);
+        await new Promise(r => setTimeout(r, 2000));
+        await dismissConsent(tab, cdp);
+      }
+    }
+
+    // Wait for React app to mount input (up to 15s, longer after verification)
+    const inputDeadline = Date.now() + 15000;
+    while (Date.now() < inputDeadline) {
+      const found = await cdp(['eval', tab, `!!document.querySelector('${S.input}')`]).catch(() => 'false');
+      if (found === 'true') break;
+      await new Promise(r => setTimeout(r, 500));
+    }
+    await new Promise(r => setTimeout(r, 300));
+
+    // Verify input is actually there before proceeding
+    const inputReady = await cdp(['eval', tab, `!!document.querySelector('${S.input}')`]).catch(() => 'false');
+    if (inputReady !== 'true') {
+      throw new Error('Copilot input not found — verification may have failed or page is in unexpected state');
+    }
+
+    await injectClipboardInterceptor(tab);
+    await cdp(['click', tab, S.input]);
+    await new Promise(r => setTimeout(r, 400));
+    await cdp(['type', tab, query]);
+    await new Promise(r => setTimeout(r, 400));
+
+    // Submit with Enter (most reliable across locales and Chrome instances)
+    await cdp(['eval', tab,
+      `document.querySelector('${S.input}')?.dispatchEvent(new KeyboardEvent('keydown',{key:'Enter',bubbles:true,keyCode:13})), 'ok'`
+    ]);
+
+    await waitForCopyButton(tab);
+
+    const { answer, sources } = await extractAnswer(tab);
+    if (!answer) throw new Error('No answer extracted — Copilot may not have responded');
+    const out = short ? answer.slice(0, 300).replace(/\s+\S*$/, '') + '…' : answer;
+
+    const finalUrl = await cdp(['eval', tab, 'document.location.href']).catch(() => '');
+    process.stdout.write(JSON.stringify({ query, url: finalUrl, answer: out, sources }, null, 2) + '\n');
+  } catch (e) {
+    process.stderr.write(`Error: ${e.message}\n`);
+    process.exit(1);
+  }
+}
+
+main();
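The `--short` branch in the listing above trims the answer with `answer.slice(0, 300).replace(/\s+\S*$/, '') + '…'`, which cuts at 300 characters and then drops the trailing partial word. Here is a small sketch of that expression in isolation. The helper name `truncateAtWord` and the sample string are assumptions, and the early return for text that already fits is an added guard that the listing does not have (there, the ellipsis is appended unconditionally).

```javascript
// Hypothetical helper wrapping the truncation expression used for --short.
function truncateAtWord(text, maxLength = 300) {
  // Assumed guard, not in the listing: leave text that already fits untouched.
  if (text.length <= maxLength) return text;
  // Cut at maxLength, then remove the last whitespace run plus any partial word after it.
  return text.slice(0, maxLength).replace(/\s+\S*$/, '') + '…';
}

// Made-up input; a small maxLength keeps the example easy to follow.
console.log(truncateAtWord('alpha beta gamma delta', 12)); // cut lands inside "gamma"
console.log(truncateAtWord('short', 12));                  // already fits
```

In the first call the 12-character slice is `alpha beta g`, the regex strips ` g`, and the result is `alpha beta…`. If the slice contains no whitespace at all, nothing is stripped and the ellipsis is appended to the raw cut.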