aether-mcp-server 2.0.1 → 2.1.0

package/README.md CHANGED
@@ -1,28 +1,25 @@
1
1
  # Aether — AI Browser Controller
2
2
 
3
- Aether is an MCP (Model Context Protocol) server that lets AI coding agents inspect and control a real browser through the Chrome DevTools Protocol (CDP). No browser extension needed.
3
+ Give your AI agent a real browser. Aether is an MCP server that lets Claude, Cursor, Kilo Code, Codex, and other AI coding agents inspect and control a live browser through the Chrome DevTools Protocol, no extension needed.
4
4
 
5
5
  ## Install
6
6
 
7
7
  ```bash
8
- npx aether-mcp-server
8
+ npx -y aether-mcp-server
9
9
  ```
10
10
 
11
11
  Or globally:
12
12
 
13
13
  ```bash
14
14
  npm install -g aether-mcp-server
15
- aether-mcp-server
16
15
  ```
17
16
 
18
17
  ## MCP Client Config
19
18
 
20
- Add this to your MCP client (Cursor, Claude Code, Codex, KiloCode, etc.):
21
-
22
19
  ```json
23
20
  {
24
21
  "mcpServers": {
25
- "aether-browser": {
22
+ "aether": {
26
23
  "command": "npx",
27
24
  "args": ["-y", "aether-mcp-server"]
28
25
  }
@@ -35,7 +32,7 @@ If installed globally:
35
32
  ```json
36
33
  {
37
34
  "mcpServers": {
38
- "aether-browser": {
35
+ "aether": {
39
36
  "command": "aether-mcp-server",
40
37
  "args": []
41
38
  }
@@ -43,72 +40,52 @@ If installed globally:
43
40
  }
44
41
  ```
45
42
 
46
- That's it. No cloning. No building. No absolute paths.
43
+ No cloning. No building. No absolute paths.
47
44
 
48
45
  ## What It Does
49
46
 
50
47
  - Launches or connects to Chrome, Edge, Brave, or Firefox with remote debugging
51
- - Exposes MCP tools for navigation, clicking, typing, form filling, screenshots, tabs, logs, cookies, network controls, PDF printing, and page inspection
52
- - Uses native CDP input events instead of fragile DOM-only hacks
53
- - Provides Set-of-Marks style visual element references for screenshot-based workflows
54
- - Resolves elements by selector, text, role, accessible name, label, placeholder, XPath, coordinates, same-origin iframe content, and shadow DOM content
55
- - Caches compact page snapshots and invalidates them on DOM/navigation/runtime events
48
+ - Click by text, role, label, or placeholder, not fragile CSS selectors
49
+ - Native CDP keyboard and mouse events, not DOM simulation
50
+ - Smart page snapshots that auto-invalidate on DOM changes
51
+ - Set-of-Marks visual element references for screenshot-based workflows
52
+ - Shadow DOM and same-origin iframe support built in
56
53
 
57
54
  ## Browser Setup
58
55
 
59
- Launch a clean browser automatically:
60
-
61
56
  ```
62
57
  launch_browser()
63
- ```
64
-
65
- Pick a specific browser:
66
-
67
- ```
68
58
  launch_browser(browser="chrome")
69
59
  launch_browser(browser="edge")
70
60
  launch_browser(browser="brave")
71
61
  ```
72
62
 
73
- Connect to an existing browser with remote debugging:
63
+ Connect to an existing browser:
74
64
 
75
65
  ```bash
76
66
  chrome --remote-debugging-port=9222
77
67
  ```
78
68
 
79
- Then:
80
-
81
69
  ```
82
70
  connect_browser(mode="connect", port=9222)
83
71
  ```
84
72
 
85
73
  ## Key Tools
86
74
 
87
- - `browser_status`: connection and active tab status
88
- - `snapshot_compact` — title, URL, readyState, and interactive element list
89
- - `list_interactive_elements`: element refs for click/fill flows
90
- - `click_by_ref`: click refs returned by compact snapshots
91
- - `click_text`, `click_role`, `fill_label`: semantic actions powered by the locator engine
92
- - `get_state`: optional screenshot, tabs, DOM snapshot, and elements
93
- - `get_logs`, `get_network_errors`: compact debugging output
94
- - `act`: broad compatibility action tool
75
+ | Tool | What it does |
76
+ |---|---|
77
+ | `browser_status` | Connection and active tab status |
78
+ | `snapshot_compact` | Fast title, URL, and interactive element list |
79
+ | `list_interactive_elements` | Element refs for click/fill flows |
80
+ | `click_text`, `click_role`, `fill_label` | Semantic actions |
81
+ | `click_by_ref`, `fill_by_selector` | Direct element targeting |
82
+ | `get_state` | Screenshot, tabs, DOM snapshot |
83
+ | `get_logs`, `get_network_errors` | Live debugging output |
84
+ | `act` | Broad compatibility action tool |
95
85
 
96
86
  ## Project-Local Learning
97
87
 
98
- Aether stores learned lessons and skills in `.aether/` inside your project:
99
-
100
- ```
101
- <project>/.aether/
102
- memory/
103
- lessons.jsonl
104
- learned.json
105
- skills/
106
- <skill-name>/SKILL.md
107
- _registry.json
108
- memory-config.json
109
- ```
110
-
111
- Call `configure_aether_memory` with your project root to enable it. Aether creates `.aether/` and adds it to `.gitignore`.
88
+ Aether stores learned lessons and reusable skills inside your project under `.aether/`: lightweight automation notes that make future runs faster. Call `configure_aether_memory` with your project root to enable it.
112
89
 
113
90
  ## Environment Variables
114
91
 
@@ -120,21 +97,7 @@ Call `configure_aether_memory` with your project root to enable it. Aether creat
120
97
  ## Requirements
121
98
 
122
99
  - Node.js >= 18
123
- - Chrome, Edge, Brave, or Firefox installed locally
124
-
125
- ## Architecture
126
-
127
- ```
128
- AI Agent / MCP Client
129
- |
130
- | stdio JSON-RPC
131
- v
132
- Aether MCP Server
133
- |
134
- | Chrome DevTools Protocol
135
- v
136
- Browser Target
137
- ```
100
+ - Chrome, Edge, Brave, or Firefox
138
101
 
139
102
  ## License
140
103
 
@@ -1,4 +1,7 @@
1
1
  "use strict";
2
+ var __importDefault = (this && this.__importDefault) || function (mod) {
3
+ return (mod && mod.__esModule) ? mod : { "default": mod };
4
+ };
2
5
  Object.defineProperty(exports, "__esModule", { value: true });
3
6
  exports.CdpBridge = void 0;
4
7
  exports.getCdpBridge = getCdpBridge;
@@ -6,6 +9,8 @@ const cdp_client_1 = require("./cdp-client");
6
9
  const captcha_solver_1 = require("./captcha-solver");
7
10
  const locator_engine_1 = require("./locator-engine");
8
11
  const page_snapshot_cache_1 = require("./page-snapshot-cache");
12
+ const fs_1 = require("fs");
13
+ const path_1 = __importDefault(require("path"));
9
14
  /**
10
15
  * Bridge layer that translates old extension-style commands to CDP commands.
11
16
  * This allows the MCP server to work without the Chrome extension.
@@ -169,6 +174,15 @@ class CdpBridge {
169
174
  case "get_dom_snapshot":
170
175
  await this.ensureConnected();
171
176
  return this.getDomSnapshot(params);
177
+ case "get_page_text":
178
+ await this.ensureConnected();
179
+ return this.getPageText(params);
180
+ case "save_auth_state":
181
+ await this.ensureConnected();
182
+ return this.saveAuthState(params);
183
+ case "load_auth_state":
184
+ await this.ensureConnected();
185
+ return this.loadAuthState(params);
172
186
  case "get_tabs":
173
187
  await this.ensureConnected();
174
188
  return this.getTabs(params);
@@ -354,72 +368,6 @@ class CdpBridge {
354
368
  elements: snapshot.elements
355
369
  };
356
370
  }
357
- async getCompactElements(maxElements, includeText, withOverlay) {
358
- const result = await this.client.sendCommand("Runtime.evaluate", {
359
- expression: `
360
- (function() {
361
- const max = ${JSON.stringify(maxElements)};
362
- const includeText = ${JSON.stringify(includeText)};
363
- const selectors = [
364
- 'a[href]', 'button', 'input:not([type="hidden"])', 'select', 'textarea',
365
- '[onclick]', '[role="button"]', '[role="link"]', '[role="checkbox"]',
366
- '[tabindex]:not([tabindex="-1"])', 'label', 'summary'
367
- ].join(', ');
368
-
369
- function cssPath(el) {
370
- if (el.id) return '#' + CSS.escape(el.id);
371
- const path = [];
372
- while (el && el.nodeType === Node.ELEMENT_NODE && el !== document.body) {
373
- let selector = el.nodeName.toLowerCase();
374
- if (el.classList && el.classList.length) {
375
- selector += '.' + Array.from(el.classList).slice(0, 2).map(c => CSS.escape(c)).join('.');
376
- }
377
- const parent = el.parentElement;
378
- if (parent) {
379
- const siblings = Array.from(parent.children).filter(child => child.nodeName === el.nodeName);
380
- if (siblings.length > 1) selector += ':nth-of-type(' + (siblings.indexOf(el) + 1) + ')';
381
- }
382
- path.unshift(selector);
383
- el = parent;
384
- }
385
- return path.length ? path.join(' > ') : '';
386
- }
387
-
388
- return Array.from(document.querySelectorAll(selectors)).map((el, index) => {
389
- const rect = el.getBoundingClientRect();
390
- const computed = window.getComputedStyle(el);
391
- const visible = computed.display !== 'none' && computed.visibility !== 'hidden' && rect.width > 0 && rect.height > 0;
392
- if (!visible) return null;
393
- const selector = cssPath(el);
394
- if (!selector) return null;
395
- const text = ((el.innerText || el.textContent || el.getAttribute('aria-label') || el.getAttribute('placeholder') || '') + '').trim().replace(/\\s+/g, ' ').substring(0, 120);
396
- return {
397
- ref: 'css:' + selector,
398
- index: index + 1,
399
- tag: el.tagName.toLowerCase(),
400
- role: el.getAttribute('role') || '',
401
- type: el.getAttribute('type') || '',
402
- name: el.getAttribute('name') || '',
403
- text: includeText ? text : undefined,
404
- bounds: {
405
- x: Math.round(rect.left),
406
- y: Math.round(rect.top),
407
- width: Math.round(rect.width),
408
- height: Math.round(rect.height)
409
- }
410
- };
411
- }).filter(Boolean).slice(0, max);
412
- })()
413
- `,
414
- returnByValue: true,
415
- awaitPromise: true
416
- });
417
- const elements = result.result?.value || [];
418
- if (withOverlay && elements.length > 0) {
419
- await this.client.getInteractiveElements(true).catch(() => ({ elements: [], somInjected: false }));
420
- }
421
- return elements;
422
- }
423
371
  async clickByRef(params) {
424
372
  const ref = String(params.ref || "");
425
373
  if (!ref)
@@ -1143,28 +1091,91 @@ class CdpBridge {
1143
1091
  this.snapshotCache.invalidate("click_element_by_selector");
1144
1092
  return "Clicked element by point";
1145
1093
  }
1146
- // Get element bounds via CDP
1147
- const result = await this.client.sendCommand("Runtime.evaluate", {
1148
- expression: `
1149
- (function() {
1150
- const el = document.querySelector(${JSON.stringify(selector)});
1151
- if (!el) return null;
1152
- el.scrollIntoView({ block: 'center', inline: 'center', behavior: 'instant' });
1153
- const rect = el.getBoundingClientRect();
1154
- const computed = window.getComputedStyle(el);
1155
- if (computed.display === 'none' || computed.visibility === 'hidden' || rect.width === 0 || rect.height === 0) return null;
1156
- return { x: rect.left + rect.width/2, y: rect.top + rect.height/2, w: rect.width };
1157
- })()
1158
- `,
1159
- returnByValue: true,
1160
- });
1161
- if (result.result?.value) {
1162
- const { x, y, w } = result.result.value;
1163
- await this.client.click(x, y, params.button, w);
1164
- this.snapshotCache.invalidate("click_element_by_selector");
1165
- return "Clicked element by selector";
1094
+ // Run the full actionability gate (visible, enabled, in-viewport, stable
1095
+ // bounds, not obscured) before committing the click.
1096
+ const point = await this.resolveActionablePoint(selector, params.timeout ?? 4000);
1097
+ if (!point.ok) {
1098
+ const detail = point.reason === "obscured" && point.obscuredBy
1099
+ ? `${point.reason} by <${point.obscuredBy}>`
1100
+ : point.reason;
1101
+ throw new Error(`Element not actionable (${detail}): ${selector}`);
1102
+ }
1103
+ await this.client.click(point.x, point.y, params.button, point.w);
1104
+ this.snapshotCache.invalidate("click_element_by_selector");
1105
+ return "Clicked element by selector";
1106
+ }
1107
+ /**
1108
+ * Centralized actionability gate. Polls until the selector resolves to an
1109
+ * element that is visible, enabled, scrolled into the viewport, has stable
1110
+ * bounds across frames, and is the topmost element at its own click point
1111
+ * (i.e. not covered by an overlay/cookie banner). Returns the verified click
1112
+ * point, or the blocking reason so the caller can surface it.
1113
+ */
1114
+ async resolveActionablePoint(selector, timeout = 4000) {
1115
+ const start = Date.now();
1116
+ let lastBoxKey = "";
1117
+ let stableHits = 0;
1118
+ let last = { ok: false, reason: "not_found" };
1119
+ while (Date.now() - start < timeout) {
1120
+ const result = await this.client.sendCommand("Runtime.evaluate", {
1121
+ expression: `
1122
+ (function() {
1123
+ const el = document.querySelector(${JSON.stringify(selector)});
1124
+ if (!el) return { ok: false, reason: 'not_found' };
1125
+ el.scrollIntoView({ block: 'center', inline: 'center', behavior: 'instant' });
1126
+ const rect = el.getBoundingClientRect();
1127
+ const style = window.getComputedStyle(el);
1128
+ if (style.display === 'none' || style.visibility === 'hidden' || style.opacity === '0' || rect.width === 0 || rect.height === 0) {
1129
+ return { ok: false, reason: 'not_visible' };
1130
+ }
1131
+ if (el.disabled === true || el.getAttribute('aria-disabled') === 'true' || style.pointerEvents === 'none') {
1132
+ return { ok: false, reason: 'disabled' };
1133
+ }
1134
+ const cx = rect.left + rect.width / 2;
1135
+ const cy = rect.top + rect.height / 2;
1136
+ const vw = window.innerWidth || document.documentElement.clientWidth;
1137
+ const vh = window.innerHeight || document.documentElement.clientHeight;
1138
+ if (cx < 0 || cy < 0 || cx > vw || cy > vh) {
1139
+ return { ok: false, reason: 'offscreen' };
1140
+ }
1141
+ const top = document.elementFromPoint(cx, cy);
1142
+ const reachable = top === el || el.contains(top) || (top && top.contains(el));
1143
+ let obscuredBy = '';
1144
+ if (!reachable && top) {
1145
+ obscuredBy = top.tagName.toLowerCase() + (top.id ? '#' + top.id : '');
1146
+ }
1147
+ return {
1148
+ ok: !!reachable,
1149
+ reason: reachable ? 'ok' : 'obscured',
1150
+ obscuredBy: obscuredBy,
1151
+ x: cx,
1152
+ y: cy,
1153
+ w: rect.width,
1154
+ boxKey: Math.round(rect.left) + ',' + Math.round(rect.top) + ',' + Math.round(rect.width) + ',' + Math.round(rect.height)
1155
+ };
1156
+ })()
1157
+ `,
1158
+ returnByValue: true,
1159
+ });
1160
+ const state = result.result?.value;
1161
+ if (state) {
1162
+ last = state;
1163
+ if (state.ok) {
1164
+ if (state.boxKey === lastBoxKey) {
1165
+ stableHits++;
1166
+ if (stableHits >= 1) {
1167
+ return { ok: true, reason: "ok", x: state.x, y: state.y, w: state.w };
1168
+ }
1169
+ }
1170
+ else {
1171
+ lastBoxKey = state.boxKey;
1172
+ stableHits = 0;
1173
+ }
1174
+ }
1175
+ }
1176
+ await new Promise(r => setTimeout(r, 90));
1166
1177
  }
1167
- throw new Error(`Element not found: ${selector}`);
1178
+ return { ok: false, reason: last?.reason || "timeout", obscuredBy: last?.obscuredBy };
1168
1179
  }
1169
1180
  async type(params) {
1170
1181
  const text = params.text || params.value || "";
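The stable-bounds part of the actionability gate in the hunk above can be read in isolation from the CDP polling loop. The following is a minimal sketch of that decision only, assuming the same rule as the diff: a sample counts as stable when it is `ok` and its rounded bounding box matches the previous `ok` sample. The names `boxKey` and `createStabilityGate` are hypothetical and do not exist in the package, and the sample rectangles are made up.

```javascript
// Hypothetical sketch: these helpers are illustrative, not part of aether-mcp-server.

// Build a comparable key from a bounding box, rounding each value the way the
// in-page script in the diff does.
function boxKey(rect) {
  return [rect.left, rect.top, rect.width, rect.height].map(Math.round).join(",");
}

// Returns a function that is fed one sample per poll and answers
// "has the element kept the same rounded bounds for two ok samples in a row?"
function createStabilityGate() {
  let lastBoxKey = null;
  return function isStable(state) {
    if (!state || !state.ok) return false; // failed samples never count as stable
    if (state.boxKey === lastBoxKey) return true;
    lastBoxKey = state.boxKey; // bounds changed: remember them and keep polling
    return false;
  };
}

// Made-up samples: the element moves once, then settles.
const gate = createStabilityGate();
const samples = [
  { ok: true, boxKey: boxKey({ left: 10.2, top: 20, width: 100, height: 40 }) },
  { ok: true, boxKey: boxKey({ left: 10.4, top: 35, width: 100, height: 40 }) },
  { ok: true, boxKey: boxKey({ left: 10.1, top: 35.3, width: 100, height: 40 }) },
];
const verdicts = samples.map((sample) => gate(sample));
console.log(verdicts); // [ false, false, true ]
```

Rounding before comparison means sub-pixel jitter does not reset the gate, while a real layout shift (here, 15 pixels vertically) does.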
@@ -1213,6 +1224,266 @@ class CdpBridge {
1213
1224
  async getDomSnapshot(params) {
1214
1225
  return await this.client.getDOMSnapshot();
1215
1226
  }
1227
+ /**
1228
+ * Extract clean, readable page content as Markdown (or plain text) — a
1229
+ * token-cheap alternative to screenshots or full DOM dumps for reading a page.
1230
+ * Scopes to `selector` when provided, otherwise picks the main content region.
1231
+ */
1232
+ async getPageText(params) {
1233
+ const format = params.format === "text" ? "text" : "markdown";
1234
+ const selector = params.selector ? String(params.selector) : "";
1235
+ const maxLength = Math.max(500, Math.min(Number(params.maxLength ?? 20000), 200000));
1236
+ const includeLinks = params.includeLinks !== false;
1237
+ const extracted = await this.client.evaluate(`
1238
+ (function() {
1239
+ const FORMAT = ${JSON.stringify(format)};
1240
+ const SELECTOR = ${JSON.stringify(selector)};
1241
+ const INCLUDE_LINKS = ${JSON.stringify(includeLinks)};
1242
+ const SKIP = new Set(['SCRIPT','STYLE','NOSCRIPT','SVG','CANVAS','TEMPLATE','IFRAME','OBJECT','EMBED','NAV','FOOTER','HEADER','ASIDE']);
1243
+
1244
+ function pickRoot() {
1245
+ if (SELECTOR) {
1246
+ const el = document.querySelector(SELECTOR);
1247
+ if (el) return el;
1248
+ }
1249
+ return document.querySelector('main, article, [role="main"]') || document.body || document.documentElement;
1250
+ }
1251
+
1252
+ function isHidden(el) {
1253
+ if (el.getAttribute && el.getAttribute('aria-hidden') === 'true') return true;
1254
+ const style = window.getComputedStyle(el);
1255
+ if (style.display === 'none' || style.visibility === 'hidden' || style.opacity === '0') return true;
1256
+ const rect = el.getBoundingClientRect();
1257
+ return rect.width === 0 && rect.height === 0 && el.tagName !== 'BR' && el.tagName !== 'HR';
1258
+ }
1259
+
1260
+ function inline(node) {
1261
+ let out = '';
1262
+ node.childNodes.forEach(function(child) {
1263
+ if (child.nodeType === Node.TEXT_NODE) {
1264
+ out += child.textContent.replace(/\\s+/g, ' ');
1265
+ } else if (child.nodeType === Node.ELEMENT_NODE) {
1266
+ if (SKIP.has(child.tagName) || isHidden(child)) return;
1267
+ const tag = child.tagName.toLowerCase();
1268
+ const inner = inline(child);
1269
+ if (FORMAT === 'markdown') {
1270
+ if (tag === 'a' && INCLUDE_LINKS && child.getAttribute('href')) {
1271
+ const href = child.getAttribute('href');
1272
+ out += inner.trim() ? '[' + inner.trim() + '](' + href + ')' : '';
1273
+ } else if (tag === 'strong' || tag === 'b') {
1274
+ out += inner.trim() ? '**' + inner.trim() + '**' : '';
1275
+ } else if (tag === 'em' || tag === 'i') {
1276
+ out += inner.trim() ? '*' + inner.trim() + '*' : '';
1277
+ } else if (tag === 'code') {
1278
+ out += inner.trim() ? '\`' + inner.trim() + '\`' : '';
1279
+ } else if (tag === 'br') {
1280
+ out += '\\n';
1281
+ } else {
1282
+ out += inner;
1283
+ }
1284
+ } else {
1285
+ out += (tag === 'br') ? '\\n' : inner;
1286
+ }
1287
+ }
1288
+ });
1289
+ return out;
1290
+ }
1291
+
1292
+ const BLOCK = new Set(['P','DIV','SECTION','ARTICLE','UL','OL','LI','TABLE','TR','BLOCKQUOTE','PRE','H1','H2','H3','H4','H5','H6','HR','FIGURE','FIGCAPTION']);
1293
+ const lines = [];
1294
+
1295
+ function walk(node, depth) {
1296
+ if (node.nodeType !== Node.ELEMENT_NODE) return;
1297
+ if (SKIP.has(node.tagName) || isHidden(node)) return;
1298
+ const tag = node.tagName.toLowerCase();
1299
+
1300
+ if (/^h[1-6]$/.test(tag)) {
1301
+ const t = inline(node).trim();
1302
+ if (t) lines.push(FORMAT === 'markdown' ? '#'.repeat(Number(tag[1])) + ' ' + t : t);
1303
+ return;
1304
+ }
1305
+ if (tag === 'hr') { lines.push(FORMAT === 'markdown' ? '---' : ''); return; }
1306
+ if (tag === 'pre') {
1307
+ const t = (node.innerText || node.textContent || '').replace(/\\s+$/,'');
1308
+ if (t) lines.push(FORMAT === 'markdown' ? '\\n\`\`\`\\n' + t + '\\n\`\`\`\\n' : t);
1309
+ return;
1310
+ }
1311
+ if (tag === 'li') {
1312
+ const t = inline(node).trim();
1313
+ if (t) {
1314
+ const indent = ' '.repeat(Math.max(0, depth));
1315
+ lines.push(FORMAT === 'markdown' ? indent + '- ' + t : indent + '• ' + t);
1316
+ }
1317
+ return;
1318
+ }
1319
+ if (tag === 'blockquote') {
1320
+ const t = inline(node).trim();
1321
+ if (t) lines.push(FORMAT === 'markdown' ? '> ' + t : t);
1322
+ return;
1323
+ }
1324
+ if (tag === 'tr') {
1325
+ const cells = Array.from(node.children).map(function(c){ return inline(c).trim(); });
1326
+ if (cells.some(Boolean)) lines.push(FORMAT === 'markdown' ? '| ' + cells.join(' | ') + ' |' : cells.join('\\t'));
1327
+ return;
1328
+ }
1329
+
1330
+ // Leaf-ish block (no block descendants): emit its inline text once.
1331
+ const hasBlockChild = Array.from(node.children).some(function(c){ return BLOCK.has(c.tagName); });
1332
+ if (!hasBlockChild && (tag === 'p' || tag === 'div' || tag === 'section' || tag === 'figcaption' || tag === 'td' || tag === 'th')) {
1333
+ const t = inline(node).trim();
1334
+ if (t) lines.push(t);
1335
+ return;
1336
+ }
1337
+
1338
+ const nextDepth = (tag === 'ul' || tag === 'ol') ? depth + 1 : depth;
1339
+ node.childNodes.forEach(function(child){ walk(child, nextDepth); });
1340
+ }
1341
+
1342
+ const root = pickRoot();
1343
+ walk(root, 0);
1344
+
1345
+ let text = lines.join('\\n\\n').replace(/\\n{3,}/g, '\\n\\n').replace(/[ \\t]+\\n/g, '\\n').trim();
1346
+ return { title: document.title || '', url: location.href, text: text };
1347
+ })()
1348
+ `).catch((e) => ({ title: "", url: "", text: "", error: e?.message }));
1349
+ const text = String(extracted?.text ?? "");
1350
+ const truncated = text.length > maxLength;
1351
+ return {
1352
+ title: extracted?.title ?? "",
1353
+ url: extracted?.url ?? "",
1354
+ format,
1355
+ length: text.length,
1356
+ truncated,
1357
+ text: truncated ? text.slice(0, maxLength) + "\n\n…[truncated]" : text,
1358
+ };
1359
+ }
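The length handling in `getPageText` above combines a clamp on `maxLength` with a truncation marker. As a rough sketch of just that arithmetic, assuming the same bounds (500 to 200000, default 20000) and the same marker string as the diff, with hypothetical helper names `clampMaxLength` and `truncateText` that are not part of the package:

```javascript
// Hypothetical helpers; only the bounds and the marker string come from the diff.

// Clamp the requested length into [500, 200000], defaulting to 20000 when the
// caller passes null or undefined.
function clampMaxLength(value) {
  return Math.max(500, Math.min(Number(value ?? 20000), 200000));
}

// Cut the text at maxLength and report the original length alongside it.
function truncateText(text, maxLength) {
  const truncated = text.length > maxLength;
  return {
    length: text.length, // length before truncation
    truncated,
    text: truncated ? text.slice(0, maxLength) + "\n\n…[truncated]" : text,
  };
}

console.log(clampMaxLength(undefined)); // 20000
console.log(clampMaxLength(10)); // 500
console.log(clampMaxLength(1e9)); // 200000

const result = truncateText("a".repeat(600), clampMaxLength(10));
console.log(result.length, result.truncated); // 600 true
```

One consequence of this arithmetic is that a non-numeric `maxLength` such as `"abc"` clamps to `NaN`, and `text.length > NaN` is always false, so nothing is truncated in that case.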
1360
+ defaultAuthStatePath(custom) {
1361
+ if (custom)
1362
+ return path_1.default.resolve(String(custom));
1363
+ return path_1.default.resolve(process.cwd(), ".aether", "auth-state.json");
1364
+ }
1365
+ // CDP Network.getAllCookies returns Cookie objects with read-only fields
1366
+ // (size, session, …) that Network.setCookies rejects. Keep only CookieParam fields.
1367
+ toCookieParam(c) {
1368
+ const param = {
1369
+ name: c.name,
1370
+ value: c.value,
1371
+ domain: c.domain,
1372
+ path: c.path,
1373
+ secure: c.secure,
1374
+ httpOnly: c.httpOnly,
1375
+ };
1376
+ if (typeof c.expires === "number" && c.expires > 0)
1377
+ param.expires = c.expires;
1378
+ if (c.sameSite)
1379
+ param.sameSite = c.sameSite;
1380
+ if (c.priority)
1381
+ param.priority = c.priority;
1382
+ if (c.sourceScheme)
1383
+ param.sourceScheme = c.sourceScheme;
1384
+ if (typeof c.sourcePort === "number")
1385
+ param.sourcePort = c.sourcePort;
1386
+ if (c.partitionKey)
1387
+ param.partitionKey = c.partitionKey;
1388
+ return param;
1389
+ }
1390
+ /**
1391
+ * Export the current session (cookies + localStorage + sessionStorage of the
1392
+ * active origin) to a JSON file so a logged-in state can be reused later.
1393
+ */
1394
+ async saveAuthState(params) {
1395
+ const filePath = this.defaultAuthStatePath(params.path);
1396
+ const cookiesRes = await this.client.sendCommand("Network.getAllCookies", {})
1397
+ .catch(() => this.client.sendCommand("Storage.getCookies", {}).catch(() => ({ cookies: [] })));
1398
+ const cookies = (cookiesRes?.cookies || []).map((c) => this.toCookieParam(c));
1399
+ const storage = await this.client.evaluate(`
1400
+ (function() {
1401
+ const ls = {}, ss = {};
1402
+ try { for (let i = 0; i < localStorage.length; i++) { const k = localStorage.key(i); ls[k] = localStorage.getItem(k); } } catch (e) {}
1403
+ try { for (let i = 0; i < sessionStorage.length; i++) { const k = sessionStorage.key(i); ss[k] = sessionStorage.getItem(k); } } catch (e) {}
1404
+ return { origin: location.origin, localStorage: ls, sessionStorage: ss };
1405
+ })()
1406
+ `).catch(() => null);
1407
+ const state = {
1408
+ version: 1,
1409
+ savedAt: new Date().toISOString(),
1410
+ cookies,
1411
+ origins: storage ? [storage] : [],
1412
+ };
1413
+ await fs_1.promises.mkdir(path_1.default.dirname(filePath), { recursive: true });
1414
+ await fs_1.promises.writeFile(filePath, JSON.stringify(state, null, 2), "utf8");
1415
+ return {
1416
+ success: true,
1417
+ path: filePath,
1418
+ cookies: cookies.length,
1419
+ origins: state.origins.length,
1420
+ storageKeys: storage ? Object.keys(storage.localStorage).length + Object.keys(storage.sessionStorage).length : 0,
1421
+ };
1422
+ }
1423
+ /**
1424
+ * Restore a session saved by saveAuthState. Cookies are set globally; storage
1425
+ * is restored for the active origin (navigate to the site first), then the tab
1426
+ * is reloaded so the session takes effect.
1427
+ */
1428
+ async loadAuthState(params) {
1429
+ const filePath = this.defaultAuthStatePath(params.path);
1430
+ let raw;
1431
+ try {
1432
+ raw = await fs_1.promises.readFile(filePath, "utf8");
1433
+ }
1434
+ catch (e) {
1435
+ return { success: false, path: filePath, message: `Could not read auth state: ${e?.message}` };
1436
+ }
1437
+ let state;
1438
+ try {
1439
+ state = JSON.parse(raw);
1440
+ }
1441
+ catch (e) {
1442
+ return { success: false, path: filePath, message: `Invalid auth state JSON: ${e?.message}` };
1443
+ }
1444
+ let cookiesSet = 0;
1445
+ if (Array.isArray(state.cookies) && state.cookies.length) {
1446
+ const params2 = state.cookies.map((c) => this.toCookieParam(c));
1447
+ await this.client.setCookies(params2).catch((err) => {
1448
+ console.error("[Aether] setCookies failed during loadAuthState:", err?.message);
1449
+ });
1450
+ cookiesSet = params2.length;
1451
+ }
1452
+ let storageRestored = 0;
1453
+ let storageSkipped = 0;
1454
+ const currentOrigin = await this.client.evaluate("location.origin").catch(() => "");
1455
+ for (const entry of (state.origins || [])) {
1456
+ if (entry.origin && currentOrigin && entry.origin !== currentOrigin) {
1457
+ storageSkipped++;
1458
+ continue;
1459
+ }
1460
+ const data = JSON.stringify({ localStorage: entry.localStorage || {}, sessionStorage: entry.sessionStorage || {} });
1461
+ const ok = await this.client.evaluate(`
1462
+ (function() {
1463
+ try {
1464
+ const data = ${data};
1465
+ for (const k in data.localStorage) localStorage.setItem(k, data.localStorage[k]);
1466
+ for (const k in data.sessionStorage) sessionStorage.setItem(k, data.sessionStorage[k]);
1467
+ return true;
1468
+ } catch (e) { return false; }
1469
+ })()
1470
+ `).catch(() => false);
1471
+ if (ok)
1472
+ storageRestored++;
1473
+ }
1474
+ if (params.reload !== false) {
1475
+ await this.client.reload(false).catch(() => { });
1476
+ }
1477
+ this.snapshotCache.invalidate("load_auth_state");
1478
+ return {
1479
+ success: true,
1480
+ path: filePath,
1481
+ cookiesSet,
1482
+ storageRestored,
1483
+ storageSkipped,
1484
+ note: storageSkipped > 0 ? "Some storage origins were skipped; navigate to that origin before loading to restore them." : undefined,
1485
+ };
1486
+ }
1216
1487
  // ==================== AGENT-CENTRIC APIs ====================
1217
1488
  async agentAction(params) {
1218
1489
  const { action, target, verify, waitFor, timeout } = params;
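The `saveAuthState` and `loadAuthState` methods in the hunk above agree on a small JSON document and on a rule for which storage entries are restored. The sketch below illustrates both under stated assumptions: the document shape follows the fields written by `saveAuthState`, every value in `exampleState` is made up, and `splitOrigins` is a hypothetical helper that mirrors the skip condition rather than code from the package.

```javascript
// Illustrative auth-state document; all values are invented for the example.
const exampleState = {
  version: 1,
  savedAt: "2024-01-01T00:00:00.000Z",
  cookies: [
    { name: "sid", value: "abc", domain: "example.com", path: "/", secure: true, httpOnly: true },
  ],
  origins: [
    { origin: "https://example.com", localStorage: { theme: "dark" }, sessionStorage: {} },
    { origin: "https://other.example", localStorage: { token: "x" }, sessionStorage: {} },
  ],
};

// Hypothetical helper mirroring the skip rule in loadAuthState: an entry is
// skipped only when both origins are known and they differ.
function splitOrigins(origins, currentOrigin) {
  const restore = [];
  const skipped = [];
  for (const entry of origins || []) {
    const mismatch = Boolean(entry.origin && currentOrigin && entry.origin !== currentOrigin);
    if (mismatch) {
      skipped.push(entry);
    } else {
      restore.push(entry);
    }
  }
  return { restore, skipped };
}

const onSite = splitOrigins(exampleState.origins, "https://example.com");
console.log(onSite.restore.length, onSite.skipped.length); // 1 1

// If the current origin could not be read (empty string), nothing is skipped.
const unknown = splitOrigins(exampleState.origins, "");
console.log(unknown.restore.length, unknown.skipped.length); // 2 0
```

Because `localStorage` and `sessionStorage` are scoped per origin, this rule is why the doc comment asks the caller to navigate to the site before loading a saved state.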
@@ -13,6 +13,7 @@ const fs_1 = require("fs");
13
13
  const path_1 = __importDefault(require("path"));
14
14
  const os_1 = __importDefault(require("os"));
15
15
  const stealth_1 = require("./stealth");
16
+ const element_collector_1 = require("./element-collector");
16
17
  class CdpClient {
17
18
  ws = null;
18
19
  messageId = 0;
@@ -208,7 +209,8 @@ class CdpClient {
208
209
  expression: `
209
210
  (function() {
210
211
  const withSoM = ${JSON.stringify(withSoM)};
211
-
212
+ ${element_collector_1.SHARED_DOM_HELPERS}
213
+
212
214
  // Remove existing overlays
213
215
  const oldContainer = document.getElementById('aether-som-container');
214
216
  if (oldContainer) oldContainer.remove();
@@ -245,11 +247,8 @@ class CdpClient {
245
247
  let text = el.innerText || el.textContent || '';
246
248
  text = text.trim().substring(0, 100);
247
249
 
248
- // Get selector
249
- let selector = '';
250
- if (el.id) selector = '#' + CSS.escape(el.id);
251
- else if (el.className && typeof el.className === 'string') selector = '.' + el.className.split(' ')[0];
252
- else selector = el.tagName.toLowerCase();
250
+ // Get a stable selector (shared with the locator engine)
251
+ const selector = aetherStableSelector(el);
253
252
 
254
253
  if (withSoM && container) {
255
254
  const id = String(validIndex);
@@ -285,7 +284,7 @@ class CdpClient {
285
284
  attributes: {
286
285
  type: el.getAttribute('type') || '',
287
286
  href: el.getAttribute('href') || '',
288
- role: el.getAttribute('role') || '',
287
+ role: aetherImplicitRole(el),
289
288
  'aria-label': el.getAttribute('aria-label') || ''
290
289
  }
291
290
  };
@@ -0,0 +1,198 @@
1
+ "use strict";
2
+ Object.defineProperty(exports, "__esModule", { value: true });
3
+ exports.SHARED_DOM_HELPERS = void 0;
4
+ /**
5
+ * Shared in-page DOM collection helpers.
6
+ *
7
+ * These helpers are injected (as source text) into every element-collection
8
+ * script so that the LocatorEngine, the Set-of-Marks overlay collector, and the
9
+ * compact snapshot all derive selectors, roles, names, and visibility the SAME
10
+ * way. Previously each call site had its own divergent copy, so `get_state`,
11
+ * `list_interactive_elements`, and the semantic click resolver could disagree
12
+ * about the same element.
13
+ *
14
+ * Role/name resolution follows the WAI-ARIA implicit-role mapping and the
15
+ * accessible-name computation closely enough that role/label targeting matches
16
+ * what the browser's own accessibility tree reports.
17
+ */
18
+ exports.SHARED_DOM_HELPERS = `
19
+ function aetherNorm(value) {
20
+ return String(value == null ? '' : value).trim().replace(/\\s+/g, ' ');
21
+ }
22
+
23
+ function aetherVisible(el) {
24
+ if (!el || el.nodeType !== Node.ELEMENT_NODE) return false;
25
+ const rect = el.getBoundingClientRect();
26
+ if (rect.width <= 0 || rect.height <= 0) return false;
27
+ const style = window.getComputedStyle(el);
28
+ return style.display !== 'none' &&
29
+ style.visibility !== 'hidden' &&
30
+ style.opacity !== '0';
31
+ }
32
+
33
+ // WAI-ARIA implicit role mapping. Mirrors what the accessibility tree reports
34
+ // so role-based targeting (click_role, fill_label) lines up with the browser.
35
+ function aetherImplicitRole(el) {
36
+ const explicit = (el.getAttribute('role') || '').trim().toLowerCase();
37
+ if (explicit) return explicit.split(/\\s+/)[0];
38
+ const tag = el.tagName.toLowerCase();
39
+ const type = (el.getAttribute('type') || '').toLowerCase();
40
+ switch (tag) {
41
+ case 'a':
42
+ case 'area':
43
+ return el.hasAttribute('href') ? 'link' : 'generic';
44
+ case 'button':
45
+ return 'button';
46
+ case 'summary':
47
+ return 'button';
48
+ case 'select':
49
+ return (el.multiple || el.size > 1) ? 'listbox' : 'combobox';
50
+ case 'textarea':
51
+ return 'textbox';
52
+ case 'progress':
53
+ return 'progressbar';
54
+ case 'output':
55
+ return 'status';
56
+ case 'input':
57
+ switch (type) {
58
+ case 'button':
59
+ case 'submit':
60
+ case 'reset':
61
+ case 'image':
62
+ return 'button';
63
+ case 'checkbox':
64
+ return 'checkbox';
65
+ case 'radio':
66
+ return 'radio';
67
+ case 'range':
68
+ return 'slider';
69
+ case 'number':
70
+ return 'spinbutton';
71
+ case 'search':
72
+ return el.getAttribute('list') ? 'combobox' : 'searchbox';
73
+ case 'email':
74
+ case 'tel':
75
+ case 'text':
76
+ case 'url':
77
+ case '':
78
+ return el.getAttribute('list') ? 'combobox' : 'textbox';
79
+ default:
80
+ return 'textbox';
81
+ }
82
+ }
83
+ if (el.isContentEditable) return 'textbox';
84
+ if (/^h[1-6]$/.test(tag)) return 'heading';
85
+ return tag;
86
+ }
87
+
88
+ function aetherTextById(doc, id) {
89
+ if (!id) return '';
90
+ try {
91
+ const el = doc.getElementById(id);
92
+ return el ? aetherNorm(el.innerText || el.textContent) : '';
93
+ } catch (e) {
94
+ return '';
95
+ }
96
+ }
97
+
98
+ function aetherLabelFor(el) {
99
+ const doc = el.ownerDocument;
100
+ const labelledBy = aetherNorm(
101
+ (el.getAttribute('aria-labelledby') || '')
102
+ .split(/\\s+/)
103
+ .map(function (id) { return aetherTextById(doc, id); })
104
+ .join(' ')
105
+ );
106
+ if (labelledBy) return labelledBy;
107
+ if (el.id) {
108
+ try {
109
+ const direct = doc.querySelector('label[for="' + CSS.escape(el.id) + '"]');
110
+ if (direct) return aetherNorm(direct.innerText || direct.textContent);
111
+ } catch (e) {}
112
+ }
113
+ const wrapping = el.closest && el.closest('label');
114
+ return wrapping ? aetherNorm(wrapping.innerText || wrapping.textContent) : '';
115
+ }
116
+
117
+ // Accessible-name computation (simplified): aria-label > aria-labelledby/label >
118
+ // placeholder > alt > title > control value > visible text > name attribute.
119
+ function aetherAccessibleName(el) {
120
+ return aetherNorm(
121
+ el.getAttribute('aria-label') ||
122
+ aetherLabelFor(el) ||
123
+ el.getAttribute('placeholder') ||
124
+ el.getAttribute('alt') ||
125
+ el.getAttribute('title') ||
126
+ ((el.tagName === 'INPUT' || el.tagName === 'BUTTON') ? el.getAttribute('value') : '') ||
127
+ el.innerText ||
128
+ el.textContent ||
129
+ el.getAttribute('name') ||
130
+ ''
131
+ );
132
+ }
133
+
134
+ function aetherIsUnique(root, selector) {
135
+ try {
136
+ return root.querySelectorAll(selector).length === 1;
137
+ } catch (e) {
138
+ return false;
139
+ }
140
+ }
141
+
142
+ function aetherStructuralPath(el) {
143
+ const path = [];
144
+ let node = el;
145
+ const stop = el.ownerDocument ? el.ownerDocument.body : null;
146
+ while (node && node.nodeType === Node.ELEMENT_NODE && node !== stop) {
147
+ let part = node.nodeName.toLowerCase();
148
+ if (node.classList && node.classList.length) {
149
+ part += '.' + Array.from(node.classList).slice(0, 2).map(function (c) { return CSS.escape(c); }).join('.');
150
+ }
151
+ const parent = node.parentElement;
152
+ if (parent) {
153
+ const same = Array.from(parent.children).filter(function (child) { return child.nodeName === node.nodeName; });
154
+ if (same.length > 1) part += ':nth-of-type(' + (same.indexOf(node) + 1) + ')';
155
+ }
156
+ path.unshift(part);
157
+ node = parent;
158
+ }
159
+ return path.join(' > ');
160
+ }
161
+
162
+ // Prefer stable, intent-revealing selectors (test ids, id, name, aria-label)
163
+ // and only fall back to a brittle structural path when nothing stable+unique
164
+ // is available.
165
+ function aetherStableSelector(el) {
166
+ if (!el || el.nodeType !== Node.ELEMENT_NODE) return '';
167
+ const root = el.getRootNode ? el.getRootNode() : el.ownerDocument;
168
+ const tag = el.tagName.toLowerCase();
169
+
170
+ const testAttrs = ['data-testid', 'data-test-id', 'data-test', 'data-cy', 'data-qa', 'data-automation-id'];
171
+ for (let i = 0; i < testAttrs.length; i++) {
172
+ const v = el.getAttribute(testAttrs[i]);
173
+ if (v) {
174
+ const sel = '[' + testAttrs[i] + '=' + JSON.stringify(v) + ']';
175
+ if (aetherIsUnique(root, sel)) return sel;
176
+ }
177
+ }
178
+
179
+ if (el.id) {
180
+ const sel = '#' + CSS.escape(el.id);
181
+ if (aetherIsUnique(root, sel)) return sel;
182
+ }
183
+
184
+ const name = el.getAttribute('name');
185
+ if (name && (tag === 'input' || tag === 'select' || tag === 'textarea' || tag === 'button')) {
186
+ const sel = tag + '[name=' + JSON.stringify(name) + ']';
187
+ if (aetherIsUnique(root, sel)) return sel;
188
+ }
189
+
190
+ const aria = el.getAttribute('aria-label');
191
+ if (aria) {
192
+ const sel = tag + '[aria-label=' + JSON.stringify(aria) + ']';
193
+ if (aetherIsUnique(root, sel)) return sel;
194
+ }
195
+
196
+ return aetherStructuralPath(el);
197
+ }
198
+ `;
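The selector priority in `aetherStableSelector` above (test attributes, then `id`, then `name` on form controls, then `aria-label`, then the structural path) can be sketched without a DOM. This is a minimal illustration, not the package's code: `candidateSelectors`, `pickSelector`, the plain `attrs` object, and the `isUnique` callback are hypothetical stand-ins, and the sketch skips the `CSS.escape` call the real helper applies to ids.

```javascript
// Hypothetical, DOM-free sketch of the priority order used by aetherStableSelector.
// `attrs` is a plain object of attribute values; `isUnique` stands in for
// aetherIsUnique, which checks querySelectorAll(selector).length === 1.
const TEST_ATTRS = ['data-testid', 'data-test-id', 'data-test', 'data-cy', 'data-qa', 'data-automation-id'];
const NAMED_TAGS = ['input', 'select', 'textarea', 'button'];

function candidateSelectors(tag, attrs) {
  const out = [];
  for (const attr of TEST_ATTRS) {
    if (attrs[attr]) out.push('[' + attr + '=' + JSON.stringify(attrs[attr]) + ']');
  }
  // Assumption: the id needs no escaping (the real helper calls CSS.escape).
  if (attrs.id) out.push('#' + attrs.id);
  if (attrs.name && NAMED_TAGS.includes(tag)) {
    out.push(tag + '[name=' + JSON.stringify(attrs.name) + ']');
  }
  if (attrs['aria-label']) {
    out.push(tag + '[aria-label=' + JSON.stringify(attrs['aria-label']) + ']');
  }
  return out;
}

// Return the first candidate that is unique, else the structural fallback.
function pickSelector(tag, attrs, isUnique, structuralFallback) {
  const hit = candidateSelectors(tag, attrs).find(isUnique);
  return hit || structuralFallback;
}

// A duplicated id is skipped in favour of the name-based selector.
const picked = pickSelector(
  'input',
  { id: 'email', name: 'email' },
  (sel) => sel !== '#email',
  'form > input'
);
```

Here `picked` is `input[name="email"]`, because the pretend uniqueness check rejects `#email`. With no usable attributes the function returns whatever structural path the caller supplies.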
@@ -1,6 +1,7 @@
1
1
  "use strict";
2
2
  Object.defineProperty(exports, "__esModule", { value: true });
3
3
  exports.LocatorEngine = void 0;
4
+ const element_collector_1 = require("./element-collector");
4
5
  const DEFAULT_TIMEOUT = 7000;
5
6
  class LocatorEngine {
6
7
  client;
@@ -97,40 +98,16 @@ function locatorScript(input) {
97
98
  '[contenteditable="true"]', '[aria-label]', '[placeholder]'
98
99
  ].join(', ');
99
100
 
100
- function norm(value) {
101
- return String(value || '').trim().replace(/\\s+/g, ' ');
102
- }
101
+ ${element_collector_1.SHARED_DOM_HELPERS}
103
102
 
104
- function visible(el) {
105
- const rect = el.getBoundingClientRect();
106
- const style = window.getComputedStyle(el);
107
- return style.display !== 'none' &&
108
- style.visibility !== 'hidden' &&
109
- style.opacity !== '0' &&
110
- rect.width > 0 &&
111
- rect.height > 0;
112
- }
113
-
114
- function cssPath(el) {
115
- if (!el || el.nodeType !== Node.ELEMENT_NODE) return '';
116
- if (el.id) return '#' + CSS.escape(el.id);
117
- const path = [];
118
- let node = el;
119
- while (node && node.nodeType === Node.ELEMENT_NODE && node !== node.ownerDocument.body) {
120
- let part = node.nodeName.toLowerCase();
121
- if (node.classList && node.classList.length) {
122
- part += '.' + Array.from(node.classList).slice(0, 2).map((c) => CSS.escape(c)).join('.');
123
- }
124
- const parent = node.parentElement;
125
- if (parent) {
126
- const same = Array.from(parent.children).filter((child) => child.nodeName === node.nodeName);
127
- if (same.length > 1) part += ':nth-of-type(' + (same.indexOf(node) + 1) + ')';
128
- }
129
- path.unshift(part);
130
- node = parent;
131
- }
132
- return path.join(' > ');
133
- }
103
+ // Thin aliases so the rest of this resolver reads naturally while the
104
+ // implementations stay shared with every other collector.
105
+ const norm = aetherNorm;
106
+ const visible = aetherVisible;
107
+ const cssPath = aetherStableSelector;
108
+ const inferRole = aetherImplicitRole;
109
+ const labelFor = aetherLabelFor;
110
+ const textFor = aetherAccessibleName;
134
111
 
135
112
  function xpath(el) {
136
113
  const parts = [];
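The hunk above pastes a shared helper string into the generated page script and then aliases the shared names to short local ones. The sketch below shows that pattern in isolation, assuming Node's built-in `vm` module as a stand-in for the browser page. `SHARED_HELPERS`, `sharedNorm`, and `buildScript` are hypothetical names, not the package's exports. It also shows why the regex is written `\\s+` in the source: inside a template literal the backslash has to be doubled to reach the generated script as `\s`.

```javascript
const vm = require('node:vm');

// Hypothetical shared helper source. It lives inside a template literal, so the
// regex backslash is doubled to arrive in the generated script as \s.
const SHARED_HELPERS = `
  function sharedNorm(value) {
    return String(value || '').trim().replace(/\\s+/g, ' ');
  }
`;

// Build a self-contained script: paste the shared source, alias it to a short
// local name, then use the alias.
function buildScript(input) {
  return `(() => {
    ${SHARED_HELPERS}
    const norm = sharedNorm;
    return norm(${JSON.stringify(input)});
  })()`;
}

// Evaluate in a fresh context as a stand-in for the page the script is sent to.
const normalized = vm.runInNewContext(buildScript('  hello \n  world  '));
```

`normalized` is `hello world`. Keeping one copy of the helper source and interpolating it into each generated script means every collector collapses whitespace the same way.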
@@ -148,55 +125,6 @@ function locatorScript(input) {
148
125
  return '/' + parts.join('/');
149
126
  }
150
127
 
151
- function inferRole(el) {
152
- const explicit = (el.getAttribute('role') || '').toLowerCase();
153
- if (explicit) return explicit;
154
- const tag = el.tagName.toLowerCase();
155
- const type = (el.getAttribute('type') || '').toLowerCase();
156
- if (tag === 'button' || type === 'button' || type === 'submit' || type === 'reset') return 'button';
157
- if (tag === 'a') return 'link';
158
- if (tag === 'textarea') return 'textbox';
159
- if (tag === 'select') return 'combobox';
160
- if (tag === 'input' && ['checkbox', 'radio'].includes(type)) return type;
161
- if (tag === 'input') return 'textbox';
162
- if (el.isContentEditable) return 'textbox';
163
- if (tag === 'summary') return 'button';
164
- return tag;
165
- }
166
-
167
- function byId(doc, id) {
168
- if (!id) return '';
169
- const el = doc.getElementById(id);
170
- return el ? norm(el.innerText || el.textContent) : '';
171
- }
172
-
173
- function labelFor(el) {
174
- const doc = el.ownerDocument;
175
- const labelledBy = norm((el.getAttribute('aria-labelledby') || '').split(/\\s+/).map((id) => byId(doc, id)).join(' '));
176
- if (labelledBy) return labelledBy;
177
- if (el.id) {
178
- const direct = doc.querySelector('label[for="' + CSS.escape(el.id) + '"]');
179
- if (direct) return norm(direct.innerText || direct.textContent);
180
- }
181
- const wrapping = el.closest && el.closest('label');
182
- return wrapping ? norm(wrapping.innerText || wrapping.textContent) : '';
183
- }
184
-
185
- function textFor(el) {
186
- return norm(
187
- el.getAttribute('aria-label') ||
188
- labelFor(el) ||
189
- el.getAttribute('placeholder') ||
190
- el.getAttribute('alt') ||
191
- el.getAttribute('title') ||
192
- el.innerText ||
193
- el.textContent ||
194
- el.getAttribute('value') ||
195
- el.getAttribute('name') ||
196
- ''
197
- );
198
- }
199
-
200
128
  function scoreField(field, value, exact, includes) {
201
129
  const text = norm(value);
202
130
  const lower = text.toLowerCase();
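The removed `inferRole` reported most `input` types as `textbox`, while the shared `aetherImplicitRole` branches on the input type. The comparison below is a DOM-free sketch of only the `input` branch. `inputRole`, `oldInputRole`, and the `hasList` flag are hypothetical names, and it assumes the type string is lowercased before the switch, as the removed function did.

```javascript
// Hypothetical rendition of the input branch of the shared role helper.
// `hasList` stands in for el.getAttribute('list') being non-empty.
function inputRole(type, hasList) {
  switch ((type || '').toLowerCase()) {
    case 'button':
    case 'submit':
    case 'reset':
    case 'image':
      return 'button';
    case 'checkbox':
      return 'checkbox';
    case 'radio':
      return 'radio';
    case 'range':
      return 'slider';
    case 'number':
      return 'spinbutton';
    case 'search':
      return hasList ? 'combobox' : 'searchbox';
    case 'email':
    case 'tel':
    case 'text':
    case 'url':
    case '':
      return hasList ? 'combobox' : 'textbox';
    default:
      return 'textbox';
  }
}

// The removed inferRole, reduced to the same inputs, for comparison.
function oldInputRole(type) {
  const t = (type || '').toLowerCase();
  if (t === 'button' || t === 'submit' || t === 'reset') return 'button';
  if (t === 'checkbox' || t === 'radio') return t;
  return 'textbox';
}

// Rows of [type, old role, new role] for types whose role changed.
const changed = ['image', 'range', 'number', 'search'].map(
  (t) => [t, oldInputRole(t), inputRole(t, false)]
);
```

Each row in `changed` moves from `textbox` to a more specific role (`button`, `slider`, `spinbutton`, `searchbox`). That matters for callers that locate elements by role.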
@@ -663,6 +663,41 @@ const Tools = [
663
663
  required: ["fields"]
664
664
  }
665
665
  },
666
+ {
667
+ name: "get_page_text",
668
+ description: "READ TOOL. Extract clean, readable page content as Markdown (or plain text). Token-cheap alternative to screenshots or full DOM dumps for reading/understanding a page. Scopes to a CSS selector when given, otherwise auto-detects the main content region.",
669
+ inputSchema: {
670
+ type: "object",
671
+ properties: {
672
+ format: { type: "string", enum: ["markdown", "text"], description: "Output format. Default markdown." },
673
+ selector: { type: "string", description: "Optional CSS selector to scope extraction to a region." },
674
+ includeLinks: { type: "boolean", description: "Render anchors as [text](href) in markdown. Default true." },
675
+ maxLength: { type: "number", description: "Max characters returned before truncation. Default 20000." }
676
+ }
677
+ }
678
+ },
679
+ {
680
+ name: "save_auth_state",
681
+ description: "SESSION TOOL. Export the current browser session (cookies + localStorage + sessionStorage) to a JSON file so a logged-in session can be reused later with load_auth_state. Avoids repeating logins.",
682
+ inputSchema: {
683
+ type: "object",
684
+ properties: {
685
+ path: { type: "string", description: "File path to write the auth state JSON. Defaults to <cwd>/.aether/auth-state.json." },
686
+ origins: { type: "array", items: { type: "string" }, description: "Optional list of origins to capture storage for. Defaults to the current origin." }
687
+ }
688
+ }
689
+ },
690
+ {
691
+ name: "load_auth_state",
692
+ description: "SESSION TOOL. Restore a previously saved session (cookies + localStorage + sessionStorage) from a JSON file written by save_auth_state. Navigate to the target site first, then load, then reload.",
693
+ inputSchema: {
694
+ type: "object",
695
+ properties: {
696
+ path: { type: "string", description: "File path to read the auth state JSON. Defaults to <cwd>/.aether/auth-state.json." },
697
+ reload: { type: "boolean", description: "Reload the active tab after restoring so storage takes effect. Default true." }
698
+ }
699
+ }
700
+ },
666
701
  {
667
702
  name: "page_snapshot",
668
703
  description: "Capture page context optimized for LLM consumption. Lightweight by default; opt into screenshots, cookies, accessibility tree, or full DOM snapshot when needed.",
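As a rough illustration of how a client might invoke the three tools added above, the sketch below builds MCP `tools/call` JSON-RPC requests. The tool and argument names come from the schemas in this hunk. The `toolCall` helper, the request ids, and the argument values (path, selector, length) are made up for the example.

```javascript
// Sketch: build MCP `tools/call` JSON-RPC 2.0 requests for the new tools.
// toolCall is a hypothetical helper; argument values are illustrative only.
let nextId = 1;

function toolCall(name, args) {
  return {
    jsonrpc: '2.0',
    id: nextId++,
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

const requests = [
  // Export cookies and storage after logging in once.
  toolCall('save_auth_state', { path: '.aether/auth-state.json' }),
  // Later session: restore the saved state and reload the active tab.
  toolCall('load_auth_state', { path: '.aether/auth-state.json', reload: true }),
  // Read a region of the page as Markdown without link syntax.
  toolCall('get_page_text', {
    format: 'markdown',
    selector: 'main',
    includeLinks: false,
    maxLength: 5000,
  }),
];
```

All properties are optional in the schemas, so an empty `arguments` object would fall back to the documented defaults (Markdown output, links included, 20000 characters, `<cwd>/.aether/auth-state.json`).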
@@ -1221,6 +1256,30 @@ function RegisterMcpTools(server, wsServer) {
1221
1256
  }
1222
1257
  return { content };
1223
1258
  }
1259
+ if (name === "get_page_text") {
1260
+ const result = await bridge.sendCommand("get_page_text", {
1261
+ format: a?.format,
1262
+ selector: a?.selector,
1263
+ includeLinks: a?.includeLinks,
1264
+ maxLength: a?.maxLength,
1265
+ });
1266
+ const header = `Title: ${result.title}\nURL: ${result.url}\nFormat: ${result.format} | ${result.length} chars${result.truncated ? " (truncated)" : ""}`;
1267
+ return { content: [{ type: "text", text: `${header}\n\n${result.text}` }] };
1268
+ }
1269
+ if (name === "save_auth_state") {
1270
+ const result = await bridge.sendCommand("save_auth_state", {
1271
+ path: a?.path,
1272
+ origins: a?.origins,
1273
+ });
1274
+ return { content: [{ type: "text", text: JSON.stringify(result) }] };
1275
+ }
1276
+ if (name === "load_auth_state") {
1277
+ const result = await bridge.sendCommand("load_auth_state", {
1278
+ path: a?.path,
1279
+ reload: a?.reload,
1280
+ });
1281
+ return { content: [{ type: "text", text: JSON.stringify(result) }] };
1282
+ }
1224
1283
  throw new Error(`Unknown tool: ${name}`);
1225
1284
  }
1226
1285
  catch (error) {
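The `get_page_text` handler above prefixes the extracted text with a three-line header. The sketch below reproduces that formatting as a standalone function so the output shape is easy to see. `formatPageText` is a hypothetical name and the `sample` object is fabricated data, not real tool output.

```javascript
// Mirrors the header built by the get_page_text handler in the hunk above.
function formatPageText(result) {
  const header =
    `Title: ${result.title}\nURL: ${result.url}\n` +
    `Format: ${result.format} | ${result.length} chars` +
    `${result.truncated ? ' (truncated)' : ''}`;
  return `${header}\n\n${result.text}`;
}

// Fabricated sample result; field names follow the ones the handler reads.
const sample = {
  title: 'Example',
  url: 'https://example.com/',
  format: 'markdown',
  length: 13,
  truncated: true,
  text: '# Example\n\nHi',
};

const rendered = formatPageText(sample);
const headerLines = rendered.split('\n').slice(0, 3);
```

`headerLines[2]` is `Format: markdown | 13 chars (truncated)`. When `truncated` is false the suffix is omitted, which is how a caller can tell whether to ask again with a narrower `selector` or a larger `maxLength`.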
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "aether-mcp-server",
3
- "version": "2.0.1",
3
+ "version": "2.1.0",
4
4
  "description": "Aether MCP Server - AI Browser Controller",
5
5
  "main": "dist/index.js",
6
6
  "bin": {