agentbrowse 0.0.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/AGENTS.md +47 -0
- package/README.md +71 -0
- package/dist/cli.js +857 -0
- package/package.json +33 -0
package/AGENTS.md
ADDED
@@ -0,0 +1,47 @@
+# Using agentbrowse (for AI agents)
+
+You can drive a real browser from the shell with `agentbrowse`. Prefer it over guessing at URLs or asking the user to click things.
+
+## Mental model
+
+- A **session** is a live browser that persists between your commands. Run `agentbrowse open <url>` once, then `read`, `click`, `find`, `fill`, `submit` operate on that same page. The session keeps cookies and scroll/DOM state.
+- Default session is `default`. Use `--session <id>` to keep independent sessions (e.g. one per site or task).
+- When done, run `agentbrowse stop` to free the browser. It also self-stops after a few minutes idle.
+
+## The loop you'll usually run
+
+```bash
+agentbrowse open https://docs.example.com # go somewhere
+agentbrowse read # read it (clean markdown, token-bounded)
+agentbrowse links --filter api # find where to go next (numbered)
+agentbrowse click 3 # follow link #3
+agentbrowse read --page 2 # next chunk if it was truncated
+```
+
+## Rules that make this reliable
+
+- **Read with `--json`** when you need to parse: `read --json` gives `{ title, markdown, page, totalPages, state }`. Every command's text output ends with a `url | title | links` footer so you always know where you are.
+- **Targeting `click`/`type`**, in priority order: (1) visible text — `click "Sign in"`; (2) a number from the last `links`/`find` — `click 2`; (3) a CSS selector — `click "button.primary"`. Bare words are treated as visible text; use explicit CSS for elements without text.
+- **Forms:** `fill -f email=me@x.com -f password=...` then `submit`. Or `type <field> <text>` for one field. Fields match by `name`, or pass a CSS selector.
+- **Truncation:** `read` is capped (`--max-chars`, default 8000). If `truncated`, request `--page 2`, etc. Don't assume you've seen the whole page from page 1.
+- **Errors:** non-zero exit codes mean failure; the reason is on **stderr** as `{ "error": { code, message } }`. `4` = target not found (re-run `links`/`find` to get fresh numbers), `3` = navigation problem, `2` = bad usage, `5` = daemon problem.
+
+## Authentication
+
+Do **not** try to type passwords. If a page needs login, tell the user to run:
+
+```bash
+agentbrowse login https://site/login --session <id>
+```
+
+A real browser opens for them to log in; the session is then saved and you can operate headlessly. Never put credentials in command arguments.
+
+## Site manifests
+
+If a `site.agent.json` exists for the target, you get named shortcuts:
+
+```bash
+agentbrowse --site ./site.agent.json <command> [args...]
+```
+
+These run predefined multi-step flows (open → type → submit → read) and return the final read output. Check the manifest's `commands` for what's available.
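Reviewer note on the error and truncation rules in AGENTS.md: a minimal sketch of how a calling agent might act on them. The exit-code meanings come from the file above (code `1`, internal, appears only in `dist/cli.js`); the helper names `parseError`, `nextStep`, and `remainingPages` and the action labels they return are hypothetical and not part of the package.

```javascript
// Exit codes as documented in AGENTS.md; 1 = internal is taken from EXIT in dist/cli.js.
const EXIT_MEANING = { 0: "ok", 1: "internal", 2: "usage", 3: "navigation", 4: "target-not-found", 5: "daemon" };

// Parse the `{ "error": { code, message } }` object written to stderr.
// Returns null when stderr does not contain that shape.
function parseError(stderrText) {
  try {
    const parsed = JSON.parse(stderrText.trim());
    return parsed && typeof parsed === "object" && parsed.error ? parsed.error : null;
  } catch {
    return null;
  }
}

// Hypothetical policy: map an exit code to a coarse next action (labels are assumptions).
function nextStep(exitCode) {
  switch (EXIT_MEANING[exitCode]) {
    case "ok":
      return "continue";
    case "target-not-found":
      return "refresh-targets"; // re-run links/find, then retry with fresh numbers
    case "navigation":
      return "reopen";
    case "usage":
      return "fix-arguments";
    case "daemon":
      return "restart-session";
    default:
      return "report";
  }
}

// Given parsed `read --json` output, list the --page values not yet requested.
function remainingPages(readResult) {
  const pages = [];
  for (let p = readResult.page + 1; p <= readResult.totalPages; p++) pages.push(p);
  return pages;
}
```

An agent would call `nextStep` with the process exit status and, after a successful `read --json`, request each value from `remainingPages` via `--page` before treating the page as fully read.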
package/README.md
ADDED
@@ -0,0 +1,71 @@
+# agentbrowse
+
+**Drive any website from the terminal — built for AI coding agents.**
+
+Agents (Claude Code, Codex, …) are great at running CLIs and clumsy at clicking through web UIs. `agentbrowse` gives them a clean, parseable surface: open a page, read it as token-bounded markdown, follow links, fill and submit forms, and operate behind a login — all from terminal commands, with a persistent browser session that survives across invocations.
+
+> There is no separate web interface to wire up. The agent runs `agentbrowse` and gets structured output back.
+
+## Install
+
+```bash
+npm install -g agentbrowse
+# first run downloads the browser:
+npx playwright install chromium
+```
+
+## Quickstart
+
+```bash
+agentbrowse open https://example.com # navigate the session
+agentbrowse read # current page as clean markdown
+agentbrowse links # numbered, followable links
+agentbrowse click "Learn more" # click by visible text...
+agentbrowse click 2 # ...or by a number from links/find
+agentbrowse read --json # structured output for machines
+agentbrowse stop # end the session (frees the browser)
+```
+
+`read`/`links` accept an optional URL to open first, so `agentbrowse read https://x.com` is "open then read" in one step.
+
+## How it works
+
+A **background browser daemon** (auto-spawned per session, local socket only) holds a live Playwright page, so state persists between separate commands — `open`, then later `click`, then `read`, all hit the same page. The daemon self-stops after inactivity.
+
+Sessions are isolated by `--session <id>` (default `default`), each with its own cookies and saved auth.
+
+## Commands
+
+| Command | What it does |
+|---|---|
+| `open <url>` | Navigate the session to a URL |
+| `read [url]` | Current page (or open `<url>` first) as token-bounded markdown (`--max-chars`, `--page`) |
+| `links [url]` | Numbered, followable links (`--filter`) |
+| `find <text>` | Locate elements by visible text; numbers reusable by `click` |
+| `click <target>` | Click by visible text, a number from `links`/`find`, or a CSS selector |
+| `type <field> <text>` | Type into a field (CSS selector or bare `name`) |
+| `fill -f name=value …` | Fill form fields |
+| `submit [form]` | Submit the current form |
+| `login <url>` | Open a real browser to authenticate once; persists the session for headless reuse |
+| `session save\|load\|clear` | Manage saved auth/session state |
+| `stop` | Stop the session's browser daemon |
+
+Add `--json` to any command for structured output. Errors go to **stderr** as `{ "error": { code, message } }` with a non-zero exit code (2 usage, 3 navigation, 4 target-not-found, 5 daemon).
+
+## Authentication
+
+`agentbrowse login https://site/login` opens a **real browser window** for *you* to log in (handling SSO, MFA, captchas an agent can't). On success it saves the session's cookies locally (`~/.webcli/sessions/<id>/`, gitignored); the agent then operates headlessly with that session. **Credentials are typed into the browser by a human — never passed as CLI arguments.**
+
+## Site manifests (optional)
+
+Point `agentbrowse` at a `site.agent.json` to expose named, high-level commands for a specific site:
+
+```bash
+agentbrowse --site ./notion.agent.json search "roadmap"
+```
+
+A manifest declares `pages`, `selectors`, and `commands` as ordered steps with `{pages.*}` / `{selectors.*}` / `{arg}` / `${ENV}` interpolation. Schema version: `webcli-manifest-v0`. See `AGENTS.md` for the agent-usage guide and the manifest format.
+
+## License
+
+MIT
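Reviewer note on the manifest description in the README: a sketch of what a manifest and its interpolation could look like, following the `MANIFEST_SCHEMA` and `interpolate` function in `dist/cli.js`. The site name, page path, and selector below are invented for illustration, and passing `env` as a parameter (rather than reading `process.env`) is a change made here only so the example is easy to test.

```javascript
// Hypothetical manifest: name, baseUrl, page paths, and selectors are made up.
const manifest = {
  schemaVersion: "webcli-manifest-v0",
  name: "example-docs",
  baseUrl: "https://docs.example.com",
  pages: { search: "/search" },
  selectors: { searchBox: "input[name=q]" },
  commands: {
    search: {
      args: ["query"],
      steps: [
        { open: "{pages.search}" },
        { type: { selector: "{selectors.searchBox}", text: "{query}" } },
        { submit: true },
        { read: true },
      ],
    },
  },
};

// Same substitution order as dist/cli.js: ${ENV} first, then {pages.*}, {selectors.*}, {arg}.
// Unknown keys resolve to an empty string.
function interpolate(template, ctx, env = {}) {
  return template
    .replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_, name) => env[name] ?? "")
    .replace(/\{([^}]+)\}/g, (_, raw) => {
      const key = raw.trim();
      if (key.startsWith("pages.")) return ctx.manifest.pages?.[key.slice(6)] ?? "";
      if (key.startsWith("selectors.")) return ctx.manifest.selectors?.[key.slice(10)] ?? "";
      return ctx.args[key] ?? "";
    });
}

// Positional argv values are bound to the names listed in the command's `args`.
const ctx = { manifest, args: { query: "roadmap" } };

// An `open` step is resolved against baseUrl, as runCommand does in dist/cli.js.
const openUrl = new URL(interpolate("{pages.search}", ctx), manifest.baseUrl).toString();
console.log(openUrl);
```

Because `${ENV}` substitution runs before the brace pass, an environment value that itself contains `{...}` would be interpolated a second time; that is a property of the ordering shown in `dist/cli.js`, not something this sketch adds.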
package/dist/cli.js
ADDED
@@ -0,0 +1,857 @@
+#!/usr/bin/env node
+
+// src/cli.ts
+import { Command } from "commander";
+
+// src/daemon/client.ts
+import net from "net";
+import path2 from "path";
+import { spawn } from "child_process";
+import { fileURLToPath } from "url";
+
+// src/daemon/protocol.ts
+function encode(msg) {
+  return JSON.stringify(msg) + "\n";
+}
+var _seq = 0;
+function nextId() {
+  return String(++_seq);
+}
+var LineBuffer = class {
+  buf = "";
+  push(chunk) {
+    this.buf += chunk;
+    const lines = [];
+    let idx;
+    while ((idx = this.buf.indexOf("\n")) >= 0) {
+      lines.push(this.buf.slice(0, idx));
+      this.buf = this.buf.slice(idx + 1);
+    }
+    return lines;
+  }
+};
+
+// src/daemon/paths.ts
+import os from "os";
+import path from "path";
+function socketPath(sessionId, platform = process.platform) {
+  if (platform === "win32") return `\\\\.\\pipe\\webcli-${sessionId}`;
+  return path.join(os.tmpdir(), `webcli-${sessionId}.sock`);
+}
+function profileDir(sessionId, home = process.env.WEBCLI_HOME || os.homedir()) {
+  return path.join(home, ".webcli", "sessions", sessionId);
+}
+
+// src/daemon/client.ts
+function trySend(sockPath, req) {
+  return new Promise((resolve, reject) => {
+    const conn = net.connect(sockPath);
+    const buf = new LineBuffer();
+    let settled = false;
+    conn.on("connect", () => conn.write(encode(req)));
+    conn.on("data", (chunk) => {
+      const lines = buf.push(chunk.toString());
+      if (lines.length && !settled) {
+        settled = true;
+        resolve(JSON.parse(lines[0]));
+        conn.end();
+      }
+    });
+    conn.on("error", (e) => {
+      if (!settled) {
+        settled = true;
+        reject(e);
+      }
+    });
+  });
+}
+function daemonCommand(sessionId) {
+  const here = fileURLToPath(import.meta.url);
+  if (here.endsWith(".ts")) {
+    const cliEntry = path2.resolve(path2.dirname(here), "..", "cli.ts");
+    return { cmd: process.execPath, args: ["--import", "tsx", cliEntry, "__daemon", sessionId] };
+  }
+  return { cmd: process.execPath, args: [here, "__daemon", sessionId] };
+}
+function canConnect(sockPath) {
+  return new Promise((resolve) => {
+    const c = net.connect(sockPath);
+    c.on("connect", () => {
+      c.end();
+      resolve(true);
+    });
+    c.on("error", () => resolve(false));
+  });
+}
+async function waitForSocket(sockPath, timeoutMs) {
+  const deadline = Date.now() + timeoutMs;
+  while (Date.now() < deadline) {
+    if (await canConnect(sockPath)) return;
+    await new Promise((r) => setTimeout(r, 100));
+  }
+  throw new Error(`daemon for socket ${sockPath} did not start within ${timeoutMs}ms`);
+}
+async function spawnDaemon(sessionId, sockPath) {
+  const { cmd, args } = daemonCommand(sessionId);
+  const child = spawn(cmd, args, { detached: true, stdio: "ignore" });
+  child.unref();
+  await waitForSocket(sockPath, 15e3);
+}
+async function sendRequest(sessionId, req, opts = {}) {
+  const sockPath = socketPath(sessionId);
+  const autoSpawn = opts.autoSpawn !== false;
+  try {
+    return await trySend(sockPath, req);
+  } catch (e) {
+    if (!autoSpawn) throw e;
+    await spawnDaemon(sessionId, sockPath);
+    return await trySend(sockPath, req);
+  }
+}
+
+// src/commands/errors.ts
+var EXIT = {
+  ok: 0,
+  internal: 1,
+  usage: 2,
+  navigation: 3,
+  targetNotFound: 4,
+  daemon: 5
+};
+var WebcliError = class extends Error {
+  constructor(code, message, exitCode, hint) {
+    super(message);
+    this.code = code;
+    this.exitCode = exitCode;
+    this.hint = hint;
+    this.name = "WebcliError";
+  }
+  code;
+  exitCode;
+  hint;
+};
+var DAEMON_CODE_EXIT = {
+  bad_args: EXIT.usage,
+  unknown_cmd: EXIT.usage,
+  target_not_found: EXIT.targetNotFound,
+  exec_error: EXIT.navigation
+};
+function fromDaemon(res) {
+  const e = res.error ?? { code: "unknown", message: "request failed" };
+  return new WebcliError(e.code, e.message, DAEMON_CODE_EXIT[e.code] ?? EXIT.internal, e.hint);
+}
+function formatError(err2) {
+  if (err2 instanceof WebcliError) {
+    return JSON.stringify({ error: { code: err2.code, message: err2.message, hint: err2.hint } });
+  }
+  return JSON.stringify({ error: { code: "internal", message: err2.message } });
+}
+
+// src/commands/open.ts
+async function runOpen(opts) {
+  const res = await sendRequest(opts.session, { id: nextId(), cmd: "open", args: { url: opts.url } });
+  if (!res.ok) throw fromDaemon(res);
+  const d = res.data;
+  if (opts.json) return JSON.stringify(d, null, 2);
+  return `opened [${d.status}] ${d.title}
+---
+url: ${d.url}`;
+}
+
+// src/core/bound.ts
+function boundText(text, maxChars, page) {
+  const totalPages = Math.max(1, Math.ceil(text.length / maxChars));
+  const clamped = Math.min(Math.max(1, page), totalPages);
+  const start = (clamped - 1) * maxChars;
+  const slice = text.slice(start, start + maxChars);
+  return {
+    text: slice,
+    page: clamped,
+    totalPages,
+    truncated: totalPages > 1
+  };
+}
+
+// src/core/output.ts
+function footer(state) {
+  const fields = state.fields.length ? ` | fields: ${state.fields.join(", ")}` : "";
+  return `
+---
+url: ${state.url} | title: ${state.title} | links: ${state.links}${fields}`;
+}
+
+// src/commands/read.ts
+async function runRead(opts) {
+  if (opts.url) {
+    const opened = await sendRequest(opts.session, { id: nextId(), cmd: "open", args: { url: opts.url } });
+    if (!opened.ok) throw fromDaemon(opened);
+  }
+  const res = await sendRequest(opts.session, { id: nextId(), cmd: "read" });
+  if (!res.ok) throw fromDaemon(res);
+  const data = res.data;
+  const bounded = boundText(data.markdown, opts.maxChars, opts.page);
+  const state = { url: data.url, title: data.title, links: data.links, fields: data.fields };
+  if (opts.json) {
+    return JSON.stringify(
+      {
+        title: data.title,
+        markdown: bounded.text,
+        page: bounded.page,
+        totalPages: bounded.totalPages,
+        truncated: bounded.truncated,
+        state
+      },
+      null,
+      2
+    );
+  }
+  const note = bounded.truncated ? `
+[truncated: page ${bounded.page}/${bounded.totalPages}, use --page ${Math.min(bounded.page + 1, bounded.totalPages)}]` : "";
+  return bounded.text + note + footer(state);
+}
+
+// src/commands/links.ts
+async function runLinks(opts) {
+  if (opts.url) {
+    const opened = await sendRequest(opts.session, { id: nextId(), cmd: "open", args: { url: opts.url } });
+    if (!opened.ok) throw fromDaemon(opened);
+  }
+  const res = await sendRequest(opts.session, {
+    id: nextId(),
+    cmd: "links",
+    args: opts.filter ? { filter: opts.filter } : {}
+  });
+  if (!res.ok) throw fromDaemon(res);
+  const data = res.data;
+  if (opts.json) return JSON.stringify({ url: data.url, links: data.links }, null, 2);
+  if (data.links.length === 0) return "(no links found)";
+  return data.links.map((l) => `${l.n}. ${l.text} -> ${l.href}`).join("\n");
+}
+
+// src/commands/stop.ts
+async function runStop(session) {
+  try {
+    const res = await sendRequest(session, { id: nextId(), cmd: "stop" }, { autoSpawn: false });
+    return res.ok ? `stopped session '${session}'` : `error: ${res.error?.message ?? "stop failed"}`;
+  } catch {
+    return `no daemon running for session '${session}'`;
+  }
+}
+
+// src/commands/interact.ts
+function stateLine(d) {
+  return `---
+url: ${d.url}${d.title ? ` | title: ${d.title}` : ""}`;
+}
+async function runClick(opts) {
+  const res = await sendRequest(opts.session, { id: nextId(), cmd: "click", args: { target: opts.target } });
+  if (!res.ok) throw fromDaemon(res);
+  const d = res.data;
+  return opts.json ? JSON.stringify(d, null, 2) : `clicked: ${opts.target}
+${stateLine(d)}`;
+}
+async function runFind(opts) {
+  const res = await sendRequest(opts.session, { id: nextId(), cmd: "find", args: { text: opts.text } });
+  if (!res.ok) throw fromDaemon(res);
+  const d = res.data;
+  if (opts.json) return JSON.stringify(d, null, 2);
+  if (d.matches.length === 0) return `(no matches for "${opts.text}")`;
+  return d.matches.map((m) => `${m.n}. <${m.tag}> ${m.text}`).join("\n");
+}
+async function runType(opts) {
+  const res = await sendRequest(opts.session, {
+    id: nextId(),
+    cmd: "type",
+    args: { selector: opts.selector, text: opts.text }
+  });
+  if (!res.ok) throw fromDaemon(res);
+  return opts.json ? JSON.stringify(res.data, null, 2) : `typed into ${opts.selector}`;
+}
+async function runFill(opts) {
+  const res = await sendRequest(opts.session, { id: nextId(), cmd: "fill", args: { fields: opts.fields } });
+  if (!res.ok) throw fromDaemon(res);
+  const d = res.data;
+  return opts.json ? JSON.stringify(d, null, 2) : `filled: ${d.filled.join(", ")}`;
+}
+async function runSubmit(opts) {
+  const res = await sendRequest(opts.session, {
+    id: nextId(),
+    cmd: "submit",
+    args: opts.form ? { form: opts.form } : {}
+  });
+  if (!res.ok) throw fromDaemon(res);
+  const d = res.data;
+  return opts.json ? JSON.stringify(d, null, 2) : `submitted
+${stateLine(d)}`;
+}
+
+// src/commands/login.ts
+import readline from "readline";
+import { chromium } from "playwright";
+
+// src/core/session.ts
+import fs from "fs";
+import path3 from "path";
+function stateFilePath(session) {
+  return path3.join(profileDir(session), "storageState.json");
+}
+function hasState(session) {
+  return fs.existsSync(stateFilePath(session));
+}
+function loadStatePath(session) {
+  return hasState(session) ? stateFilePath(session) : void 0;
+}
+function ensureStatePath(session) {
+  const p = stateFilePath(session);
+  fs.mkdirSync(path3.dirname(p), { recursive: true });
+  return p;
+}
+function clearState(session) {
+  const p = stateFilePath(session);
+  if (fs.existsSync(p)) {
+    fs.rmSync(p);
+    return true;
+  }
+  return false;
+}
+
+// src/commands/login.ts
+function waitForEnter() {
+  const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
+  return new Promise(
+    (resolve) => rl.question("Log in in the browser window, then press Enter here... ", () => {
+      rl.close();
+      resolve();
+    })
+  );
+}
+async function runLogin(opts) {
+  const browser = await chromium.launch({ headless: opts.headless ?? false });
+  try {
+    const context = await browser.newContext();
+    const page = await context.newPage();
+    await page.goto(opts.url, { waitUntil: "domcontentloaded" });
+    if (opts.driveForTest) await opts.driveForTest(page);
+    if (opts.until) {
+      await page.waitForURL((u) => u.toString().includes(opts.until), {
+        timeout: opts.timeoutMs ?? 12e4
+      });
+    } else {
+      await waitForEnter();
+    }
+    const statePath = ensureStatePath(opts.session);
+    await context.storageState({ path: statePath });
+    return `logged in \u2014 session '${opts.session}' saved`;
+  } finally {
+    await browser.close();
+    try {
+      await sendRequest(opts.session, { id: nextId(), cmd: "stop" }, { autoSpawn: false });
+    } catch {
+    }
+  }
+}
+
+// src/commands/session.ts
+async function runSessionSave(session) {
+  try {
+    const res = await sendRequest(session, { id: nextId(), cmd: "savestate" }, { autoSpawn: false });
+    return res.ok ? `saved session '${session}'` : `error: ${res.error?.message ?? "save failed"}`;
+  } catch {
+    return `no daemon running for session '${session}' (nothing to save)`;
+  }
+}
+async function runSessionLoad(session) {
+  try {
+    await sendRequest(session, { id: nextId(), cmd: "stop" }, { autoSpawn: false });
+  } catch {
+  }
+  return hasState(session) ? `session '${session}' will load its saved state on the next command` : `no saved state for session '${session}'`;
+}
+async function runSessionClear(session) {
+  try {
+    await sendRequest(session, { id: nextId(), cmd: "stop" }, { autoSpawn: false });
+  } catch {
+  }
+  return clearState(session) ? `cleared session '${session}'` : `no saved state for session '${session}'`;
+}
+
+// src/manifest/load.ts
+import fs2 from "fs";
+import path4 from "path";
+
+// src/manifest/schema.ts
+import Ajv from "ajv";
+var MANIFEST_SCHEMA = {
+  type: "object",
+  additionalProperties: false,
+  required: ["schemaVersion", "name", "baseUrl", "commands"],
+  properties: {
+    schemaVersion: { const: "webcli-manifest-v0" },
+    name: { type: "string", minLength: 1 },
+    baseUrl: { type: "string", minLength: 1 },
+    auth: {
+      type: "object",
+      additionalProperties: false,
+      properties: {
+        type: { enum: ["headed-login", "recipe"] },
+        loginUrl: { type: "string" },
+        recipe: { type: "array", items: { type: "object" } }
+      }
+    },
+    selectors: { type: "object", additionalProperties: { type: "string" } },
+    pages: { type: "object", additionalProperties: { type: "string" } },
+    commands: {
+      type: "object",
+      additionalProperties: {
+        type: "object",
+        additionalProperties: false,
+        required: ["steps"],
+        properties: {
+          args: { type: "array", items: { type: "string" } },
+          steps: {
+            type: "array",
+            items: {
+              type: "object",
+              additionalProperties: false,
+              properties: {
+                open: { type: "string" },
+                click: { type: "string" },
+                find: { type: "string" },
+                type: {
+                  type: "object",
+                  additionalProperties: false,
+                  required: ["selector", "text"],
+                  properties: { selector: { type: "string" }, text: { type: "string" } }
+                },
+                fill: { type: "object", additionalProperties: { type: "string" } },
+                submit: { type: ["boolean", "string"] },
+                read: { type: "boolean" },
+                links: { type: "boolean" }
+              }
+            }
+          }
+        }
+      }
+    }
+  }
+};
+var ajv = new Ajv({ allErrors: true, allowUnionTypes: true });
+var validate = ajv.compile(MANIFEST_SCHEMA);
+function validateManifest(obj) {
+  const valid = validate(obj);
+  const errors = (validate.errors ?? []).map((e) => `${e.instancePath || "/"} ${e.message ?? ""}`.trim());
+  return { valid: !!valid, errors };
+}
+
+// src/manifest/load.ts
+function resolveManifestPath(value) {
+  if (fs2.existsSync(value)) return value;
+  const named = path4.resolve(process.cwd(), `${value}.agent.json`);
+  if (fs2.existsSync(named)) return named;
+  throw new Error(`manifest not found: '${value}' (looked for the file and ${value}.agent.json)`);
+}
+function loadManifest(filePath) {
+  const obj = JSON.parse(fs2.readFileSync(filePath, "utf8"));
+  const { valid, errors } = validateManifest(obj);
+  if (!valid) throw new Error(`invalid manifest ${filePath}: ${errors.join("; ")}`);
+  return obj;
+}
+
+// src/manifest/run.ts
+function interpolate(template, ctx) {
+  return template.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_, env) => process.env[env] ?? "").replace(/\{([^}]+)\}/g, (_, raw) => {
+    const key = raw.trim();
+    if (key.startsWith("pages.")) return ctx.manifest.pages?.[key.slice(6)] ?? "";
+    if (key.startsWith("selectors.")) return ctx.manifest.selectors?.[key.slice(10)] ?? "";
+    return ctx.args[key] ?? "";
+  });
+}
+var send = async (session, cmd, args) => {
+  const res = await sendRequest(session, { id: nextId(), cmd, args });
+  if (!res.ok) throw fromDaemon(res);
+  return res;
+};
+async function runCommand(manifest, name, argv, session) {
+  const cmd = manifest.commands[name];
+  if (!cmd) {
+    throw new Error(`unknown command '${name}' in manifest '${manifest.name}'`);
+  }
+  const args = {};
+  (cmd.args ?? []).forEach((argName, i) => {
+    args[argName] = argv[i] ?? "";
+  });
+  const ctx = { manifest, args };
+  let lastOutput = "";
+  for (const step of cmd.steps) {
+    if (step.open !== void 0) {
+      const url = new URL(interpolate(step.open, ctx), manifest.baseUrl).toString();
+      await send(session, "open", { url });
+    } else if (step.click !== void 0) {
+      await send(session, "click", { target: interpolate(step.click, ctx) });
+    } else if (step.find !== void 0) {
+      await send(session, "find", { text: interpolate(step.find, ctx) });
+    } else if (step.type !== void 0) {
+      await send(session, "type", {
+        selector: interpolate(step.type.selector, ctx),
+        text: interpolate(step.type.text, ctx)
+      });
+    } else if (step.fill !== void 0) {
+      const fields = {};
+      for (const [k, v] of Object.entries(step.fill)) fields[k] = interpolate(v, ctx);
+      await send(session, "fill", { fields });
+    } else if (step.submit !== void 0) {
+      await send(session, "submit", typeof step.submit === "string" ? { form: interpolate(step.submit, ctx) } : {});
+    } else if (step.read) {
+      const res = await send(session, "read", {});
+      lastOutput = res.data?.markdown ?? "";
+    } else if (step.links) {
+      const res = await send(session, "links", {});
+      lastOutput = JSON.stringify(res.data?.links ?? []);
+    }
+  }
+  return lastOutput;
+}
+
|
|
515
|
+
// src/daemon/server.ts
|
|
516
|
+
import net2 from "net";
|
|
517
|
+
import fs3 from "fs";
|
|
518
|
+
import { chromium as chromium2 } from "playwright";
|
|
519
|
+
|
|
520
|
+
// src/core/markdown.ts
|
|
521
|
+
import { JSDOM } from "jsdom";
|
|
522
|
+
import { Readability } from "@mozilla/readability";
|
|
523
|
+
import TurndownService from "turndown";
|
|
524
|
+
function htmlToMarkdown(html, baseUrl) {
|
|
525
|
+
const dom = new JSDOM(html, { url: baseUrl });
|
|
526
|
+
const doc = dom.window.document;
|
|
527
|
+
for (const a of Array.from(doc.querySelectorAll("a[href]"))) {
|
|
528
|
+
const href = a.getAttribute("href");
|
|
529
|
+
if (href) {
|
|
530
|
+
try {
|
|
531
|
+
a.setAttribute("href", new dom.window.URL(href, baseUrl).toString());
|
|
532
|
+
} catch {
|
|
533
|
+
}
|
|
534
|
+
}
|
|
535
|
+
}
|
|
536
|
+
const article = new Readability(doc).parse();
|
|
537
|
+
const contentHtml = article?.content ?? doc.body?.innerHTML ?? "";
|
|
538
|
+
const turndown = new TurndownService({ headingStyle: "atx", codeBlockStyle: "fenced" });
|
|
539
|
+
turndown.remove(["script", "style", "noscript"]);
|
|
540
|
+
return turndown.turndown(contentHtml).trim();
|
|
541
|
+
}
|
|
542
|
+
|
|
543
|
+
// src/core/links.ts
|
|
544
|
+
import { JSDOM as JSDOM2 } from "jsdom";
|
|
545
|
+
function extractLinks(html, baseUrl, filter) {
|
|
546
|
+
const dom = new JSDOM2(html, { url: baseUrl });
|
|
547
|
+
const needle = filter?.toLowerCase();
|
|
548
|
+
const out = [];
|
|
549
|
+
for (const a of Array.from(dom.window.document.querySelectorAll("a[href]"))) {
|
|
550
|
+
const rawHref = a.getAttribute("href") ?? "";
|
|
551
|
+
if (rawHref.startsWith("#") || rawHref.trim() === "") continue;
|
|
552
|
+
const text = (a.textContent ?? "").replace(/\s+/g, " ").trim();
|
|
553
|
+
if (!text) continue;
|
|
554
|
+
let href;
|
|
555
|
+
try {
|
|
556
|
+
href = new dom.window.URL(rawHref, baseUrl).toString();
|
|
557
|
+
} catch {
|
|
558
|
+
continue;
|
|
559
|
+
}
|
|
560
|
+
if (needle && !`${text} ${href}`.toLowerCase().includes(needle)) continue;
|
|
561
|
+
out.push({ n: out.length + 1, text, href });
|
|
562
|
+
}
|
|
563
|
+
return out;
|
|
564
|
+
}
|
|
565
|
+
|
|
566
|
+
// src/core/target.ts
|
|
567
|
+
function looksLikeSelector(s) {
|
|
568
|
+
const t = s.trim();
|
|
569
|
+
if (/^[.#[]/.test(t)) return true;
|
|
570
|
+
if (/[>[\]]/.test(t)) return true;
|
|
571
|
+
if (/^[a-zA-Z][\w-]*\.[\w-]+$/.test(t)) return true;
|
|
572
|
+
return false;
|
|
573
|
+
}
|
|
574
|
+
function resolveLocator(page, target) {
|
|
575
|
+
if (looksLikeSelector(target)) return page.locator(target).first();
|
|
576
|
+
return page.getByText(target, { exact: false }).first();
|
|
577
|
+
}
|
|
578
|
+
function resolveField(page, ref) {
|
|
579
|
+
if (looksLikeSelector(ref)) return page.locator(ref).first();
|
|
580
|
+
return page.locator(`[name=${JSON.stringify(ref)}]`).first();
|
|
581
|
+
}
|
|
582
|
+
|
|
583
|
+
// src/daemon/server.ts
|
|
584
|
+
function ok(req, data) {
|
|
585
|
+
return { id: req.id, ok: true, data };
|
|
586
|
+
}
|
|
587
|
+
function err(req, code, message) {
|
|
588
|
+
return { id: req.id, ok: false, error: { code, message } };
|
|
589
|
+
}
|
|
590
|
+
async function startDaemon(sessionId, opts = {}) {
|
|
591
|
+
const idleMs = opts.idleMs ?? Number(process.env.WEBCLI_IDLE_MS ?? 3e5);
|
|
592
|
+
const browser = await chromium2.launch({ headless: true });
|
|
593
|
+
const statePath = loadStatePath(sessionId);
|
|
594
|
+
const context = await browser.newContext(statePath ? { storageState: statePath } : {});
|
|
595
|
+
const page = await context.newPage();
|
|
596
|
+
let lastRefs = [];
|
|
597
|
+
const settle = () => page.waitForLoadState("domcontentloaded").catch(() => {
|
|
598
|
+
});
|
|
599
|
+
async function dispatch(req) {
|
|
600
|
+
try {
|
|
601
|
+
switch (req.cmd) {
|
|
602
|
+
case "open": {
|
|
603
|
+
const url = String(req.args?.url ?? "");
|
|
604
|
+
if (!url) return err(req, "bad_args", "open requires a url");
|
|
605
|
+
const resp = await page.goto(url, { waitUntil: "domcontentloaded" });
|
|
606
|
+
return ok(req, { title: await page.title(), url: page.url(), status: resp?.status() ?? 0 });
|
|
607
|
+
}
|
|
608
|
+
case "read": {
|
|
609
|
+
const html = await page.content();
|
|
610
|
+
const url = page.url();
|
|
611
|
+
return ok(req, {
|
|
612
|
+
markdown: htmlToMarkdown(html, url),
|
|
613
|
+
url,
|
|
614
|
+
title: await page.title(),
|
|
615
|
+
links: extractLinks(html, url).length,
|
|
616
|
+
fields: []
|
|
617
|
+
});
|
|
618
|
+
}
|
|
619
|
+
case "links": {
|
|
620
|
+
const html = await page.content();
|
|
621
|
+
const filter = req.args?.filter ? String(req.args.filter) : void 0;
|
|
622
|
+
const links = extractLinks(html, page.url(), filter);
|
|
623
|
+
lastRefs = links.map((l) => ({ kind: "link", href: l.href }));
|
|
624
|
+
return ok(req, { links, url: page.url() });
|
|
625
|
+
}
|
|
626
|
+
case "find": {
|
|
627
|
+
const text = String(req.args?.text ?? "");
|
|
628
|
+
if (!text) return err(req, "bad_args", "find requires text");
|
|
629
|
+
const matches = await page.getByText(text, { exact: false }).all();
|
|
630
|
+
lastRefs = matches.map((locator) => ({ kind: "element", locator }));
|
|
631
|
+
const items = await Promise.all(
|
|
632
|
+
matches.map(async (loc, i) => ({
|
|
633
|
+
n: i + 1,
|
|
634
|
+
tag: await loc.evaluate((el) => el.tagName.toLowerCase()).catch(() => "?"),
|
|
635
|
+
text: (await loc.textContent().catch(() => "") ?? "").replace(/\s+/g, " ").trim().slice(0, 120)
|
|
636
|
+
}))
|
|
637
|
+
);
|
|
638
|
+
return ok(req, { matches: items, url: page.url() });
|
|
639
|
+
}
|
|
640
|
+
case "click": {
|
|
641
|
+
const target = String(req.args?.target ?? "");
|
|
642
|
+
if (!target) return err(req, "bad_args", "click requires a target");
|
|
643
|
+
if (/^\d+$/.test(target)) {
|
|
644
|
+
const ref = lastRefs[parseInt(target, 10) - 1];
|
|
645
|
+
if (!ref) return err(req, "target_not_found", `no ref #${target}; run links or find first`);
|
|
646
|
+
if (ref.kind === "link") {
|
|
647
|
+
await page.goto(ref.href, { waitUntil: "domcontentloaded" });
|
|
648
|
+
} else {
|
|
649
|
+
await ref.locator.click();
|
|
650
|
+
await settle();
|
|
651
|
+
}
|
|
652
|
+
} else {
|
|
653
|
+
const loc = resolveLocator(page, target);
|
|
654
|
+
if (await loc.count() === 0) return err(req, "target_not_found", `no element matching: ${target}`);
|
|
655
|
+
await loc.click();
|
|
656
|
+
await settle();
|
|
657
|
+
}
|
|
658
|
+
return ok(req, { url: page.url(), title: await page.title() });
|
|
659
|
+
}
|
|
660
|
+
case "type": {
|
|
661
|
+
const selector = String(req.args?.selector ?? "");
|
|
662
|
+
const text = String(req.args?.text ?? "");
|
|
663
|
+
if (!selector) return err(req, "bad_args", "type requires a selector");
|
|
664
|
+
const loc = resolveField(page, selector);
|
|
665
|
+
if (await loc.count() === 0) return err(req, "target_not_found", `no field matching: ${selector}`);
|
|
666
|
+
await loc.fill(text);
|
|
667
|
+
return ok(req, { typed: selector, url: page.url() });
|
|
668
|
+
}
|
|
669
|
+
case "fill": {
|
|
670
|
+
const fields = req.args?.fields ?? {};
|
|
671
|
+
const filled = [];
|
|
672
|
+
for (const [name, value] of Object.entries(fields)) {
|
|
673
|
+
const loc = resolveField(page, name);
|
|
674
|
+
if (await loc.count() === 0) return err(req, "target_not_found", `no field matching: ${name}`);
|
|
675
|
+
await loc.fill(String(value));
|
|
676
|
+
filled.push(name);
|
|
677
|
+
}
|
|
678
|
+
return ok(req, { filled, url: page.url() });
|
|
679
|
+
}
|
|
680
|
+
case "submit": {
|
|
681
|
+
const formSel = req.args?.form ? String(req.args.form) : void 0;
|
|
682
|
+
const submitBtn = (formSel ? page.locator(formSel).first() : page).locator('button[type="submit"], input[type="submit"]').first();
|
|
683
|
+
if (await submitBtn.count() > 0) {
|
|
684
|
+
await submitBtn.click();
|
|
685
|
+
} else {
|
|
686
|
+
const form = formSel ? page.locator(formSel).first() : page.locator("form").first();
|
|
687
|
+
if (await form.count() === 0) return err(req, "target_not_found", "no form to submit");
|
|
688
|
+
await form.evaluate((f) => f.requestSubmit());
|
|
689
|
+
}
|
|
690
|
+
await settle();
|
|
691
|
+
return ok(req, { url: page.url(), title: await page.title() });
|
|
692
|
+
}
|
|
693
|
+
case "savestate": {
|
|
694
|
+
const p = ensureStatePath(sessionId);
|
|
695
|
+
await context.storageState({ path: p });
|
|
696
|
+
return ok(req, { saved: p });
|
|
697
|
+
}
|
|
698
|
+
case "stop":
|
|
699
|
+
return ok(req, { stopping: true });
|
|
700
|
+
default:
|
|
701
|
+
return err(req, "unknown_cmd", `unknown command: ${req.cmd}`);
|
|
702
|
+
}
|
|
703
|
+
} catch (e) {
|
|
704
|
+
return err(req, "exec_error", e.message);
|
|
705
|
+
}
|
|
706
|
+
}
|
|
707
|
+
const sockPath = socketPath(sessionId);
|
|
708
|
+
if (process.platform !== "win32" && fs3.existsSync(sockPath)) {
|
|
709
|
+
try {
|
|
710
|
+
fs3.unlinkSync(sockPath);
|
|
711
|
+
} catch {
|
|
712
|
+
}
|
|
713
|
+
}
|
|
714
|
+
let stopping = false;
|
|
715
|
+
async function stop() {
|
|
716
|
+
if (stopping) return;
|
|
717
|
+
stopping = true;
|
|
718
|
+
if (idleTimer) clearTimeout(idleTimer);
|
|
719
|
+
await new Promise((r) => server.close(() => r()));
|
|
720
|
+
await browser.close();
|
|
721
|
+
if (process.platform !== "win32" && fs3.existsSync(sockPath)) {
|
|
722
|
+
try {
|
|
723
|
+
fs3.unlinkSync(sockPath);
|
|
724
|
+
} catch {
|
|
725
|
+
}
|
|
726
|
+
}
|
|
727
|
+
}
|
|
728
|
+
let idleTimer;
|
|
729
|
+
function resetIdle() {
|
|
730
|
+
if (idleTimer) clearTimeout(idleTimer);
|
|
731
|
+
if (idleMs > 0) {
|
|
732
|
+
idleTimer = setTimeout(() => void stop(), idleMs);
|
|
733
|
+
idleTimer.unref();
|
|
734
|
+
}
|
|
735
|
+
}
|
|
736
|
+
const server = net2.createServer((conn) => {
|
|
737
|
+
const buf = new LineBuffer();
|
|
738
|
+
conn.on("data", async (chunk) => {
|
|
739
|
+
for (const line of buf.push(chunk.toString())) {
|
|
740
|
+
let req;
|
|
741
|
+
try {
|
|
742
|
+
req = JSON.parse(line);
|
|
743
|
+
} catch {
|
|
744
|
+
continue;
|
|
745
|
+
}
|
|
746
|
+
const res = await dispatch(req);
|
|
747
|
+
conn.write(encode(res));
|
|
748
|
+
resetIdle();
|
|
749
|
+
if (req.cmd === "stop") {
|
|
750
|
+
conn.end();
|
|
751
|
+
void stop();
|
|
752
|
+
}
|
|
753
|
+
}
|
|
754
|
+
});
|
|
755
|
+
conn.on("error", () => {
|
|
756
|
+
});
|
|
757
|
+
});
|
|
758
|
+
await new Promise((resolve, reject) => {
|
|
759
|
+
server.once("error", reject);
|
|
760
|
+
server.listen(sockPath, () => resolve());
|
|
761
|
+
});
|
|
762
|
+
resetIdle();
|
|
763
|
+
return { socket: sockPath, stop };
|
|
764
|
+
}
|
|
765
|
+
|
|
766
|
+
// src/cli.ts
|
|
767
|
+
async function emit(produce) {
|
|
768
|
+
try {
|
|
769
|
+
process.stdout.write(await produce() + "\n");
|
|
770
|
+
} catch (e) {
|
|
771
|
+
process.stderr.write(formatError(e) + "\n");
|
|
772
|
+
process.exitCode = e instanceof WebcliError ? e.exitCode : EXIT.internal;
|
|
773
|
+
}
|
|
774
|
+
}
|
|
775
|
+
function buildProgram() {
|
|
776
|
+
const program = new Command();
|
|
777
|
+
program.name("agentbrowse").description("Agent-browser CLI: drive any website from the terminal.").version("0.0.1").option("--session <id>", "session name (isolated browser + cookies)", "default");
|
|
778
|
+
const session = (cmd) => cmd.optsWithGlobals().session;
|
|
779
|
+
program.command("open").description("Navigate the session's browser to a URL.").argument("<url>", "URL to open").option("--json", "structured JSON output", false).action((url, opts, cmd) => emit(() => runOpen({ session: session(cmd), json: !!opts.json, url })));
|
|
780
|
+
program.command("read").description("Read the current page (or open <url> first) as token-bounded markdown.").argument("[url]", "optional URL to open before reading").option("--json", "structured JSON output", false).option("--max-chars <n>", "max characters per page", (v) => parseInt(v, 10), 8e3).option("--page <n>", "page number when output is truncated", (v) => parseInt(v, 10), 1).action(
|
|
781
|
+
(url, opts, cmd) => emit(
|
|
782
|
+
() => runRead({ session: session(cmd), json: !!opts.json, maxChars: opts.maxChars, page: opts.page, url })
|
|
783
|
+
)
|
|
784
|
+
);
|
|
785
|
+
program.command("links").description("List navigable links on the current page (or open <url> first).").argument("[url]", "optional URL to open before listing").option("--json", "structured JSON output", false).option("--filter <text>", "case-insensitive substring filter").action(
|
|
786
|
+
(url, opts, cmd) => emit(() => runLinks({ session: session(cmd), json: !!opts.json, filter: opts.filter, url }))
|
|
787
|
+
);
|
|
788
|
+
program.command("find").description("Find elements on the current page by visible text (numbers reusable by click).").argument("<text...>", "visible text to search for").option("--json", "structured JSON output", false).action(
|
|
789
|
+
(text, opts, cmd) => emit(() => runFind({ session: session(cmd), json: !!opts.json, text: text.join(" ") }))
|
|
790
|
+
);
|
|
791
|
+
program.command("click").description("Click by visible text, a number from links/find, or a CSS selector.").argument("<target...>", "text, #number, or selector").option("--json", "structured JSON output", false).action(
|
|
792
|
+
(target, opts, cmd) => emit(() => runClick({ session: session(cmd), json: !!opts.json, target: target.join(" ") }))
|
|
793
|
+
);
|
|
794
|
+
program.command("type").description("Type text into a field (CSS selector or bare field name).").argument("<selector>", "field selector or name").argument("<text...>", "text to type").option("--json", "structured JSON output", false).action(
|
|
795
|
+
(selector, text, opts, cmd) => emit(
|
|
796
|
+
() => runType({ session: session(cmd), json: !!opts.json, selector, text: text.join(" ") })
|
|
797
|
+
)
|
|
798
|
+
);
|
|
799
|
+
program.command("fill").description("Fill one or more form fields.").option(
|
|
800
|
+
"-f, --field <name=value>",
|
|
801
|
+
"form field (repeatable)",
|
|
802
|
+
(v, acc) => {
|
|
803
|
+
acc.push(v);
|
|
804
|
+
return acc;
|
|
805
|
+
},
|
|
806
|
+
[]
|
|
807
|
+
).option("--json", "structured JSON output", false).action((opts, cmd) => {
|
|
808
|
+
const fields = {};
|
|
809
|
+
for (const pair of opts.field) {
|
|
810
|
+
const i = pair.indexOf("=");
|
|
811
|
+
if (i > 0) fields[pair.slice(0, i)] = pair.slice(i + 1);
|
|
812
|
+
}
|
|
813
|
+
return emit(() => runFill({ session: session(cmd), json: !!opts.json, fields }));
|
|
814
|
+
});
|
|
815
|
+
program.command("submit").description("Submit the current form (or a specific one by selector).").argument("[form]", "optional form selector").option("--json", "structured JSON output", false).action((form, opts, cmd) => emit(() => runSubmit({ session: session(cmd), json: !!opts.json, form })));
|
|
816
|
+
program.command("login").description("Open a real browser to log in once; persists the session for headless reuse.").argument("<url>", "login page URL").option("--until <text>", "auto-finish when the URL contains this (else waits for Enter)").action((url, opts, cmd) => emit(() => runLogin({ session: session(cmd), url, until: opts.until })));
|
|
817
|
+
const sessionCmd = program.command("session").description("Manage saved session/auth state.");
|
|
818
|
+
sessionCmd.command("save").description("Persist the running daemon's cookies/localStorage.").action((_o, cmd) => emit(() => runSessionSave(session(cmd))));
|
|
819
|
+
sessionCmd.command("load").description("Reload saved state into the session.").action((_o, cmd) => emit(() => runSessionLoad(session(cmd))));
|
|
820
|
+
sessionCmd.command("clear").description("Delete saved state for the session.").action((_o, cmd) => emit(() => runSessionClear(session(cmd))));
|
|
821
|
+
program.command("stop").description("Stop the session's background browser daemon.").action((_opts, cmd) => emit(() => runStop(session(cmd))));
|
|
822
|
+
const daemonCmd = new Command("__daemon").argument("<sessionId>").action(async (sessionId) => {
|
|
823
|
+
await startDaemon(sessionId);
|
|
824
|
+
});
|
|
825
|
+
program.addCommand(daemonCmd, { hidden: true });
|
|
826
|
+
return program;
|
|
827
|
+
}
|
|
828
|
+
async function dispatchSite(argv) {
|
|
829
|
+
let site = "";
|
|
830
|
+
let session = "default";
|
|
831
|
+
const rest = [];
|
|
832
|
+
for (let i = 0; i < argv.length; i++) {
|
|
833
|
+
if (argv[i] === "--site") site = argv[++i] ?? "";
|
|
834
|
+
else if (argv[i] === "--session") session = argv[++i] ?? "default";
|
|
835
|
+
else rest.push(argv[i]);
|
|
836
|
+
}
|
|
837
|
+
const [command, ...cmdArgs] = rest;
|
|
838
|
+
if (!command) throw new WebcliError("usage", "usage: agentbrowse --site <manifest> <command> [args...]", EXIT.usage);
|
|
839
|
+
const manifest = loadManifest(resolveManifestPath(site));
|
|
840
|
+
process.stdout.write(await runCommand(manifest, command, cmdArgs, session) + "\n");
|
|
841
|
+
}
|
|
842
|
+
var isMain = (() => { try { return /cli\.(ts|js)$/.test(fs3.realpathSync(process.argv[1] ?? "")); } catch { return false; } })();
|
|
843
|
+
if (isMain) {
|
|
844
|
+
const argv = process.argv.slice(2);
|
|
845
|
+
if (argv.includes("--site")) {
|
|
846
|
+
dispatchSite(argv).catch((e) => {
|
|
847
|
+
process.stderr.write(formatError(e) + "\n");
|
|
848
|
+
process.exitCode = e instanceof WebcliError ? e.exitCode : EXIT.internal;
|
|
849
|
+
});
|
|
850
|
+
} else {
|
|
851
|
+
buildProgram().parseAsync(process.argv);
|
|
852
|
+
}
|
|
853
|
+
}
|
|
854
|
+
export {
|
|
855
|
+
buildProgram,
|
|
856
|
+
dispatchSite
|
|
857
|
+
};
|
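The daemon loop in `dist/cli.js` above reads each connection through `LineBuffer` and replies with `encode`, and it calls `JSON.parse` on every line it receives. Neither helper is defined in this part of the diff, so the following is only a minimal sketch of newline-delimited JSON framing under the assumption that each message is one JSON object terminated by `\n`. `SketchLineBuffer` and `sketchEncode` are hypothetical stand-ins, not the package's implementations; the `{ id, cmd, args }` request shape is taken from `dispatch`.

```javascript
// Hypothetical stand-in for the package's `encode`: one JSON object per line.
function sketchEncode(message) {
  return JSON.stringify(message) + "\n";
}

// Hypothetical stand-in for the package's `LineBuffer`: accumulates chunks and
// returns only complete lines, keeping any trailing partial line for later.
class SketchLineBuffer {
  constructor() {
    this.pending = "";
  }
  push(chunk) {
    this.pending += chunk;
    const parts = this.pending.split("\n");
    this.pending = parts.pop(); // last piece is an incomplete line (or "")
    return parts.filter((line) => line.length > 0);
  }
}

// Two requests arrive split at an arbitrary byte boundary, as a socket may deliver them.
const buf = new SketchLineBuffer();
const wire =
  sketchEncode({ id: 1, cmd: "open", args: { url: "https://example.com" } }) +
  sketchEncode({ id: 2, cmd: "read" });
const first = buf.push(wire.slice(0, 20)); // partial first request only
const second = buf.push(wire.slice(20));   // remainder of both requests
console.log(first.length, second.map((line) => JSON.parse(line).cmd));
```

Framing on newlines means a request is dispatched only once its terminating `\n` has arrived, which is why the server loop can parse each yielded line on its own.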
package/package.json
ADDED
|
@@ -0,0 +1,33 @@
|
|
|
1
|
+
{
|
|
2
|
+
"name": "agentbrowse",
|
|
3
|
+
"version": "0.0.1",
|
|
4
|
+
"description": "Agent-browser CLI: drive any website from the terminal.",
|
|
5
|
+
"type": "module",
|
|
6
|
+
"bin": { "agentbrowse": "./dist/cli.js" },
|
|
7
|
+
"files": ["dist", "AGENTS.md"],
|
|
8
|
+
"engines": { "node": ">=20" },
|
|
9
|
+
"scripts": {
|
|
10
|
+
"build": "tsup",
|
|
11
|
+
"dev": "tsx src/cli.ts",
|
|
12
|
+
"test": "vitest run",
|
|
13
|
+
"test:watch": "vitest",
|
|
14
|
+
"prepublishOnly": "npm run build && npm test"
|
|
15
|
+
},
|
|
16
|
+
"dependencies": {
|
|
17
|
+
"commander": "^12.1.0",
|
|
18
|
+
"playwright": "^1.45.0",
|
|
19
|
+
"jsdom": "^24.1.0",
|
|
20
|
+
"@mozilla/readability": "^0.5.0",
|
|
21
|
+
"turndown": "^7.2.0",
|
|
22
|
+
"ajv": "^8.17.1"
|
|
23
|
+
},
|
|
24
|
+
"devDependencies": {
|
|
25
|
+
"typescript": "^5.5.0",
|
|
26
|
+
"tsup": "^8.1.0",
|
|
27
|
+
"tsx": "^4.16.0",
|
|
28
|
+
"vitest": "^2.0.0",
|
|
29
|
+
"@types/node": "^20.14.0",
|
|
30
|
+
"@types/jsdom": "^21.1.0",
|
|
31
|
+
"@types/turndown": "^5.0.4"
|
|
32
|
+
}
|
|
33
|
+
}
|
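For readers following the targeting rules in `AGENTS.md`, the `looksLikeSelector` heuristic from `src/core/target.ts` in `dist/cli.js` decides whether a `click` or `type` argument is treated as a CSS selector or as visible text (or a field name). The sketch below copies that function from the diff and runs it over a few inputs; the sample strings and the `classify` helper are illustrative assumptions, not part of the package.

```javascript
// Copied from src/core/target.ts as shown in the diff; comments added here.
function looksLikeSelector(s) {
  const t = s.trim();
  if (/^[.#[]/.test(t)) return true;                    // starts with ".", "#", or "["
  if (/[>[\]]/.test(t)) return true;                    // contains ">", "[", or "]"
  if (/^[a-zA-Z][\w-]*\.[\w-]+$/.test(t)) return true;  // tag.class with no spaces
  return false;
}

// Hypothetical helper, for illustration only.
function classify(target) {
  return looksLikeSelector(target) ? "selector" : "text";
}

const samples = ["Sign in", "button.primary", "#login", "form > button", "button", "README.md"];
for (const s of samples) {
  console.log(`${JSON.stringify(s)} -> ${classify(s)}`);
}
```

Under this heuristic a dotted word with no spaces, such as a visible file name like `README.md`, is classified as a selector rather than text. Clicking such an element by its text would likely need `find` followed by `click <n>`, since numbered references bypass the heuristic.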