textweb 0.2.0 → 0.2.3

package/README.md CHANGED
@@ -92,6 +92,12 @@ npx textweb-mcp
 
  Then just ask: *"Go to hacker news and find posts about AI"* — the agent uses text grids instead of screenshots.
 
+ **New (v0.2.1-style MCP capabilities):**
+ - `session_id` on every tool call for isolated parallel workflows
+ - `textweb_storage_save` / `textweb_storage_load` for persistent auth/session state
+ - `textweb_wait_for` for multi-step async UI transitions
+ - `textweb_assert_field` for flow guards before submit
+
  ### 🛠️ OpenAI / Anthropic Function Calling
 
  Drop-in tool definitions for any function-calling model. See [`tools/tool_definitions.json`](tools/tool_definitions.json).
@@ -174,10 +180,18 @@ const { view, elements, meta } = await browser.navigate('https://example.com');
 
  console.log(view); // The text grid
  console.log(elements); // { 0: { selector, tag, text, href }, ... }
+ console.log(meta.stats); // { totalElements, interactiveElements, renderMs }
 
  await browser.click(3); // Click element [3]
  await browser.type(7, 'hello'); // Type into element [7]
  await browser.scroll('down'); // Scroll down
+ await browser.waitFor({ selector: '.step-2.active' }); // Wait for next step
+ await browser.assertField(7, 'hello', { comparator: 'equals' }); // Validate field state
+ await browser.saveStorageState('/tmp/textweb-state.json');
+ await browser.loadStorageState('/tmp/textweb-state.json');
+ await browser.query('nav a'); // Find elements by CSS selector
+ await browser.screenshot(); // PNG buffer (for debugging)
+ console.log(browser.getCurrentUrl());// Current page URL
  await browser.close();
  ```
 
@@ -218,6 +232,67 @@ await browser.close();
  3. **Map** pixel coordinates to character grid positions (spatial layout preserved)
  4. **Annotate** interactive elements with `[ref]` numbers for agent interaction
 
+ ## Selector Strategy
+
+ TextWeb builds stable CSS selectors for each interactive element, preferring resilient strategies over brittle positional ones:
+
+ | Priority | Strategy | Example |
+ |----------|----------|---------|
+ | 1 | `#id` | `#email` |
+ | 2 | `[data-testid]` | `[data-testid="submit-btn"]` |
+ | 3 | `[aria-label]` | `input[aria-label="Search"]` |
+ | 4 | `[role]` (if unique) | `[role="navigation"]` |
+ | 5 | `[name]` | `input[name="email"]` |
+ | 6 | `a[href]` (if unique) | `a[href="/about"]` |
+ | 7 | `nth-child` (fallback) | `div > a:nth-child(3)` |
+
+ This means selectors survive DOM changes between snapshots — critical for multi-step agent workflows.
+
+ ## ATS Workflow Examples (Greenhouse / Lever)
+
+ For multi-step ATS flows, use a stable `session_id` and combine wait/assert guards:
+
+ ```javascript
+ // Keep one session for the whole application
+ await textweb_navigate({ url: 'https://job-boards.greenhouse.io/acme/jobs/123', session_id: 'apply-acme' });
+
+ // Fill + continue
+ await textweb_type({ ref: 12, text: 'Christopher', session_id: 'apply-acme' });
+ await textweb_type({ ref: 15, text: 'Robison', session_id: 'apply-acme' });
+ await textweb_click({ ref: 42, session_id: 'apply-acme', retries: 3, retry_delay_ms: 400 });
+
+ // Guard transition
+ await textweb_wait_for({ selector: '#step-2.active', timeout_ms: 8000, session_id: 'apply-acme', retries: 2 });
+
+ // Validate before submit
+ await textweb_assert_field({ ref: 77, expected: 'San Francisco', comparator: 'includes', session_id: 'apply-acme' });
+
+ // Persist auth/session for follow-up flow
+ await textweb_storage_save({ path: '/tmp/ats-state.json', session_id: 'apply-acme' });
+ ```
+
+ Useful session tools:
+ - `textweb_session_list` → inspect active sessions
+ - `textweb_session_close` → close one session or all
+
+ ## Testing
+
+ ```bash
+ # Run all tests (form + live + ATS e2e)
+ npm test
+
+ # Form fixture tests
+ npm run test:form
+
+ # Live site tests — example.com, HN, Wikipedia
+ npm run test:live
+
+ # ATS multi-step fixture test
+ npm run test:ats
+ ```
+
+ Test fixtures are in `test/fixtures/` — includes a comprehensive HTML form and an ATS-style multi-step application fixture.
+
  ## Design Principles
 
  1. **Text is native to LLMs** — no vision model middleman
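
The priority table added under "Selector Strategy" reads as an ordered fallback chain. The block below is a minimal sketch of that idea under stated assumptions, not TextWeb's actual implementation: `pickSelector`, the plain-object element shape (`tag`, `attrs`, `parentTag`, `index`), and the `isUnique` callback are all hypothetical names introduced for illustration.

```javascript
// Hypothetical sketch of the priority order from the README table.
// `el` is a plain description of an element, not a real DOM node.
// `isUnique(selector)` is an assumed callback reporting whether the
// selector matches exactly one element on the page.
function pickSelector(el, isUnique = () => true) {
  const tag = el.tag || '*';
  const attrs = el.attrs || {};
  // JSON.stringify yields a double-quoted, escaped attribute value.
  const quote = (v) => JSON.stringify(String(v));

  // 1. id (a real implementation would also escape special characters)
  if (attrs.id) return `#${attrs.id}`;
  // 2. data-testid
  if (attrs['data-testid']) return `[data-testid=${quote(attrs['data-testid'])}]`;
  // 3. aria-label
  if (attrs['aria-label']) return `${tag}[aria-label=${quote(attrs['aria-label'])}]`;
  // 4. role, only when it identifies a single element
  if (attrs.role) {
    const s = `[role=${quote(attrs.role)}]`;
    if (isUnique(s)) return s;
  }
  // 5. name
  if (attrs.name) return `${tag}[name=${quote(attrs.name)}]`;
  // 6. link href, only when unique
  if (tag === 'a' && attrs.href) {
    const s = `a[href=${quote(attrs.href)}]`;
    if (isUnique(s)) return s;
  }
  // 7. Positional fallback: brittle, used only when nothing else applies.
  return `${el.parentTag || '*'} > ${tag}:nth-child(${el.index || 1})`;
}
```

The ordering matters more than the individual checks: the positional selector is reached only when every attribute-based option is missing or ambiguous, which is why it is the one most likely to break between snapshots.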
package/logo.svg ADDED
@@ -0,0 +1,30 @@
+ <?xml version="1.0" encoding="utf-8"?>
+ <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 425.697 360.796" xmlns:bx="https://boxy-svg.com">
+ <defs>
+ <style>
+ @keyframes blink {
+ 0%, 100% { opacity: 1; }
+ 50% { opacity: 0; }
+ }
+ .cursor {
+ animation: blink 1.1s step-start infinite;
+ }
+ </style>
+ <filter id="ds" x="-20%" y="-20%" width="140%" height="140%">
+ <feDropShadow dx="0" dy="10" stdDeviation="12" flood-color="#000" flood-opacity="0.35"/>
+ </filter>
+ <linearGradient id="g" x1="0" y1="0" x2="1" y2="1">
+ <stop offset="0" stop-color="var(--bg)"/>
+ <stop offset="1" stop-color="var(--bg2)"/>
+ </linearGradient>
+ <bx:export>
+ <bx:file format="png"/>
+ </bx:export>
+ </defs>
+ <rect x="6.482" y="11.817" width="400" height="320" rx="72" ry="72" fill="url(#g)" filter="url(#ds)" style="stroke-width: 1;"/>
+ <rect x="26.482" y="31.817" width="360" height="280" rx="58" ry="58" fill="none" stroke-width="4" style="stroke-width: 4; stroke: rgba(255, 255, 255, 0.255);"/>
+ <g class="mono" fill="var(--green)" style="font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, &quot;Liberation Mono&quot;, monospace;" transform="matrix(1, 0, 0, 1, -75.701332, -120.82827)">
+ <text style="fill: rgb(0, 201, 22); font-size: 56px; font-weight: 700; letter-spacing: 0.5px; white-space: pre;" x="112.671" y="313.449">&gt; textweb<tspan class="cursor">_</tspan><tspan x="112.6709976196289" dy="1em">​</tspan></text>
+ </g>
+ <circle cx="90.479" cy="155.817" r="70" fill="var(--green)" opacity="0.05" style="stroke-width: 1;"/>
+ </svg>
package/mcp/index.js CHANGED
@@ -2,11 +2,11 @@
 
  /**
  * TextWeb MCP Server
- *
+ *
  * Model Context Protocol server that gives any MCP client
  * (Claude Desktop, Cursor, Windsurf, Cline, OpenClaw, etc.)
  * text-based web browsing capabilities.
- *
+ *
  * Communicates over stdio using JSON-RPC 2.0.
  */
 
@@ -14,9 +14,11 @@ const { AgentBrowser } = require('../src/browser');
 
  const SERVER_INFO = {
  name: 'textweb',
- version: '0.1.0',
+ version: '0.2.2',
  };
 
+ const SESSION_NOTE = 'Optional session_id to isolate state across flows. Defaults to "default".';
+
  const TOOLS = [
  {
  name: 'textweb_navigate',
@@ -26,6 +28,9 @@ const TOOLS = [
  properties: {
  url: { type: 'string', description: 'The URL to navigate to' },
  cols: { type: 'number', description: 'Grid width in characters (default: 120)' },
+ session_id: { type: 'string', description: SESSION_NOTE },
+ retries: { type: 'number', description: 'Retry attempts for flaky transitions' },
+ retry_delay_ms: { type: 'number', description: 'Delay between retries in ms' },
  },
  required: ['url'],
  },
@@ -37,6 +42,9 @@ const TOOLS = [
  type: 'object',
  properties: {
  ref: { type: 'number', description: 'Element reference number from the text grid (e.g., 3 for [3])' },
+ session_id: { type: 'string', description: SESSION_NOTE },
+ retries: { type: 'number', description: 'Retry attempts for flaky transitions' },
+ retry_delay_ms: { type: 'number', description: 'Delay between retries in ms' },
  },
  required: ['ref'],
  },
@@ -49,6 +57,9 @@ const TOOLS = [
  properties: {
  ref: { type: 'number', description: 'Element reference number of the input field' },
  text: { type: 'string', description: 'Text to type into the field' },
+ session_id: { type: 'string', description: SESSION_NOTE },
+ retries: { type: 'number', description: 'Retry attempts for flaky transitions' },
+ retry_delay_ms: { type: 'number', description: 'Delay between retries in ms' },
  },
  required: ['ref', 'text'],
  },
@@ -61,6 +72,9 @@ const TOOLS = [
  properties: {
  ref: { type: 'number', description: 'Element reference number of the select/dropdown' },
  value: { type: 'string', description: 'Value or visible text of the option to select' },
+ session_id: { type: 'string', description: SESSION_NOTE },
+ retries: { type: 'number', description: 'Retry attempts for flaky transitions' },
+ retry_delay_ms: { type: 'number', description: 'Delay between retries in ms' },
  },
  required: ['ref', 'value'],
  },
@@ -73,6 +87,7 @@ const TOOLS = [
  properties: {
  direction: { type: 'string', enum: ['up', 'down', 'top'], description: 'Scroll direction' },
  amount: { type: 'number', description: 'Number of pages to scroll (default: 1)' },
+ session_id: { type: 'string', description: SESSION_NOTE },
  },
  required: ['direction'],
  },
@@ -82,7 +97,9 @@ const TOOLS = [
  description: 'Re-render the current page as a text grid without navigating. Useful after waiting for dynamic content to load.',
  inputSchema: {
  type: 'object',
- properties: {},
+ properties: {
+ session_id: { type: 'string', description: SESSION_NOTE },
+ },
  },
  },
  {
@@ -92,10 +109,32 @@ const TOOLS = [
  type: 'object',
  properties: {
  key: { type: 'string', description: 'Key to press (e.g., "Enter", "Tab", "Escape", "ArrowDown")' },
+ session_id: { type: 'string', description: SESSION_NOTE },
+ retries: { type: 'number', description: 'Retry attempts for flaky transitions' },
+ retry_delay_ms: { type: 'number', description: 'Delay between retries in ms' },
  },
  required: ['key'],
  },
  },
+ {
+ name: 'textweb_session_list',
+ description: 'List active textweb sessions and basic metadata (url, age).',
+ inputSchema: {
+ type: 'object',
+ properties: {},
+ },
+ },
+ {
+ name: 'textweb_session_close',
+ description: 'Close one session by session_id, or all sessions when all=true.',
+ inputSchema: {
+ type: 'object',
+ properties: {
+ session_id: { type: 'string', description: 'Session id to close (default: default)' },
+ all: { type: 'boolean', description: 'Close all active sessions' },
+ },
+ },
+ },
  {
  name: 'textweb_upload',
  description: 'Upload a file to a file input element by its reference number.',
@@ -104,22 +143,93 @@ const TOOLS = [
  properties: {
  ref: { type: 'number', description: 'Element reference number of the file input' },
  path: { type: 'string', description: 'Absolute path to the file to upload' },
+ session_id: { type: 'string', description: SESSION_NOTE },
+ retries: { type: 'number', description: 'Retry attempts for flaky transitions' },
+ retry_delay_ms: { type: 'number', description: 'Delay between retries in ms' },
  },
  required: ['ref', 'path'],
  },
  },
+ {
+ name: 'textweb_storage_save',
+ description: 'Save current browser storage state (cookies/localStorage/sessionStorage) to disk for later restore.',
+ inputSchema: {
+ type: 'object',
+ properties: {
+ path: { type: 'string', description: 'Absolute path to write storage state JSON' },
+ session_id: { type: 'string', description: SESSION_NOTE },
+ },
+ required: ['path'],
+ },
+ },
+ {
+ name: 'textweb_storage_load',
+ description: 'Load storage state from disk into a fresh browser context.',
+ inputSchema: {
+ type: 'object',
+ properties: {
+ path: { type: 'string', description: 'Absolute path of previously saved storage state JSON' },
+ cols: { type: 'number', description: 'Grid width in characters (default: 120)' },
+ session_id: { type: 'string', description: SESSION_NOTE },
+ },
+ required: ['path'],
+ },
+ },
+ {
+ name: 'textweb_wait_for',
+ description: 'Wait for UI state in multi-step flows. Supports selector, text, and url_includes checks.',
+ inputSchema: {
+ type: 'object',
+ properties: {
+ selector: { type: 'string', description: 'CSS selector that must appear (or match state)' },
+ text: { type: 'string', description: 'Text that must appear in page body' },
+ url_includes: { type: 'string', description: 'Substring that must appear in current URL' },
+ state: { type: 'string', enum: ['attached', 'detached', 'visible', 'hidden'], description: 'Selector wait state (default: visible)' },
+ timeout_ms: { type: 'number', description: 'Timeout in milliseconds (default: 30000)' },
+ poll_ms: { type: 'number', description: 'Polling interval for text/url waits (default: 100)' },
+ retries: { type: 'number', description: 'Retry attempts for flaky transitions' },
+ retry_delay_ms: { type: 'number', description: 'Delay between retries in ms' },
+ session_id: { type: 'string', description: SESSION_NOTE },
+ },
+ },
+ },
+ {
+ name: 'textweb_assert_field',
+ description: 'Assert a field value/text by element ref. Useful in multi-step forms before submitting.',
+ inputSchema: {
+ type: 'object',
+ properties: {
+ ref: { type: 'number', description: 'Element reference number from current snapshot' },
+ expected: { type: 'string', description: 'Expected value/content' },
+ comparator: { type: 'string', enum: ['equals', 'includes', 'regex', 'not_empty'], description: 'Comparison mode (default: equals)' },
+ attribute: { type: 'string', description: 'Optional DOM attribute name to validate (e.g., aria-invalid)' },
+ session_id: { type: 'string', description: SESSION_NOTE },
+ },
+ required: ['ref', 'expected'],
+ },
+ },
  ];
 
- // ─── Browser Instance ────────────────────────────────────────────────────────
+ // ─── Browser Sessions ───────────────────────────────────────────────────────
+
+ /** @type {Map<string, AgentBrowser>} */
+ const sessions = new Map();
 
- let browser = null;
+ function resolveSessionId(args = {}) {
+ return (args.session_id || 'default').trim() || 'default';
+ }
+
+ async function getBrowser(args = {}) {
+ const sessionId = resolveSessionId(args);
+ let browser = sessions.get(sessionId);
 
- async function getBrowser(cols) {
  if (!browser) {
- browser = new AgentBrowser({ cols: cols || 120, headless: true });
+ browser = new AgentBrowser({ cols: args.cols || 120, headless: true });
  await browser.launch();
+ sessions.set(sessionId, browser);
  }
- return browser;
+
+ return { browser, sessionId };
  }
 
  function formatResult(result) {
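
The session lookup added in this hunk normalizes `session_id` before using it as a map key. The sketch below copies the `resolveSessionId` expression from the diff and runs it on a few illustrative inputs (the sample values are assumptions, not taken from the package's tests), to show which inputs collapse to `"default"`.

```javascript
// Same expression as resolveSessionId in the diff above.
function resolveSessionId(args = {}) {
  return (args.session_id || 'default').trim() || 'default';
}

// Illustrative inputs: missing, empty, whitespace-only, and padded ids.
const samples = [{}, { session_id: '' }, { session_id: '   ' }, { session_id: ' apply-acme ' }];
const resolved = samples.map(resolveSessionId);
console.log(resolved); // [ 'default', 'default', 'default', 'apply-acme' ]
```

One caveat worth noting: the expression calls `.trim()` on whatever truthy value arrives, so a non-string `session_id` such as the number `42` would throw a `TypeError`. As far as this diff shows, the JSON schema type is the only guard against that.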
@@ -130,26 +240,78 @@ function formatResult(result) {
  return `URL: ${result.meta?.url || 'unknown'}\nTitle: ${result.meta?.title || 'unknown'}\nRefs: ${result.meta?.totalRefs || 0}\n\n${result.view}\n\nInteractive elements:\n${refs}`;
  }
 
+ function retryOptions(args = {}) {
+ return {
+ retries: args.retries,
+ retryDelayMs: args.retry_delay_ms,
+ };
+ }
+
+ async function listSessions() {
+ const out = [];
+ for (const [sessionId, browser] of sessions.entries()) {
+ out.push({
+ session_id: sessionId,
+ url: browser.getCurrentUrl() || null,
+ initialized: Boolean(browser.page),
+ refs: browser.lastResult?.meta?.totalRefs ?? null,
+ });
+ }
+ return out;
+ }
+
+ async function closeSession({ session_id, all } = {}) {
+ if (all) {
+ const closed = [];
+ for (const [sid, browser] of sessions.entries()) {
+ await browser.close();
+ closed.push(sid);
+ }
+ sessions.clear();
+ return { closed };
+ }
+
+ const sid = (session_id || 'default').trim() || 'default';
+ const browser = sessions.get(sid);
+ if (!browser) {
+ return { closed: [], missing: [sid] };
+ }
+
+ await browser.close();
+ sessions.delete(sid);
+ return { closed: [sid] };
+ }
+
  // ─── Tool Execution ──────────────────────────────────────────────────────────
 
- async function executeTool(name, args) {
- const b = await getBrowser(args.cols);
+ async function executeTool(name, args = {}) {
+ if (name === 'textweb_session_list') {
+ const active = await listSessions();
+ return JSON.stringify({ count: active.length, sessions: active }, null, 2);
+ }
+
+ if (name === 'textweb_session_close') {
+ const out = await closeSession({ session_id: args.session_id, all: args.all });
+ return JSON.stringify(out, null, 2);
+ }
+
+ const { browser: b, sessionId } = await getBrowser(args);
 
  switch (name) {
  case 'textweb_navigate': {
- const result = await b.navigate(args.url);
+ const result = await b.navigate(args.url, retryOptions(args));
  return formatResult(result);
  }
  case 'textweb_click': {
- const result = await b.click(args.ref);
+ const result = await b.click(args.ref, retryOptions(args));
  return formatResult(result);
  }
  case 'textweb_type': {
- const result = await b.type(args.ref, args.text);
+ const result = await b.type(args.ref, args.text, retryOptions(args));
  return formatResult(result);
  }
  case 'textweb_select': {
- const result = await b.select(args.ref, args.value);
+ const result = await b.select(args.ref, args.value, retryOptions(args));
  return formatResult(result);
  }
  case 'textweb_scroll': {
@@ -161,13 +323,40 @@ async function executeTool(name, args) {
  return formatResult(result);
  }
  case 'textweb_press': {
- const result = await b.press(args.key);
+ const result = await b.press(args.key, retryOptions(args));
  return formatResult(result);
  }
  case 'textweb_upload': {
- const result = await b.upload(args.ref, args.path);
+ const result = await b.upload(args.ref, args.path, retryOptions(args));
+ return formatResult(result);
+ }
+ case 'textweb_storage_save': {
+ const out = await b.saveStorageState(args.path);
+ return `Saved storage state for session "${sessionId}" to ${out.path}`;
+ }
+ case 'textweb_storage_load': {
+ const out = await b.loadStorageState(args.path);
+ return `Loaded storage state for session "${sessionId}" from ${out.path}`;
+ }
+ case 'textweb_wait_for': {
+ const result = await b.waitFor({
+ selector: args.selector,
+ text: args.text,
+ urlIncludes: args.url_includes,
+ timeoutMs: args.timeout_ms,
+ pollMs: args.poll_ms,
+ state: args.state,
+ ...retryOptions(args),
+ });
  return formatResult(result);
  }
+ case 'textweb_assert_field': {
+ const out = await b.assertField(args.ref, args.expected, {
+ comparator: args.comparator,
+ attribute: args.attribute,
+ });
+ return `ASSERT ${out.pass ? 'PASS' : 'FAIL'} | ref=${out.ref} | comparator=${out.comparator} | expected="${out.expected}" | actual="${out.actual}" | selector=${out.selector}`;
+ }
  default:
  throw new Error(`Unknown tool: ${name}`);
  }
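
The `textweb_assert_field` schema lists four comparators (`equals`, `includes`, `regex`, `not_empty`), but the comparison itself lives in `src/browser.js`, which this diff does not show. The block below is a sketch of one plausible reading of those modes; `compareField` is a hypothetical helper, and the real `assertField` may differ, for example in trimming or case handling.

```javascript
// Hypothetical comparator semantics for the four modes named in the schema.
// This is NOT the package's implementation, only an illustration.
function compareField(actual, expected, comparator = 'equals') {
  const value = String(actual ?? '');
  switch (comparator) {
    case 'equals':
      return value === String(expected);
    case 'includes':
      return value.includes(String(expected));
    case 'regex':
      // `expected` is treated as a pattern source string.
      return new RegExp(expected).test(value);
    case 'not_empty':
      // `expected` is ignored in this mode.
      return value.trim().length > 0;
    default:
      throw new Error(`Unknown comparator: ${comparator}`);
  }
}
```

The schema marks `expected` as required for every comparator, so a caller using `not_empty` presumably still has to pass some string even though a check of this kind has no use for it.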
@@ -232,7 +421,7 @@ function main() {
  process.stdin.setEncoding('utf8');
  process.stdin.on('data', async (chunk) => {
  buffer += chunk;
-
+
  // Process complete lines (newline-delimited JSON)
  const lines = buffer.split('\n');
  buffer = lines.pop(); // Keep incomplete line in buffer
@@ -257,17 +446,18 @@ function main() {
  });
 
  process.stdin.on('end', async () => {
- if (browser) await browser.close();
+ for (const [, browser] of sessions) {
+ await browser.close();
+ }
+ sessions.clear();
  process.exit(0);
  });
 
  process.on('SIGINT', async () => {
- if (browser) await browser.close();
- process.exit(0);
- });
-
- process.on('SIGTERM', async () => {
- if (browser) await browser.close();
+ for (const [, browser] of sessions) {
+ await browser.close();
+ }
+ sessions.clear();
  process.exit(0);
  });
  }
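
As far as this hunk shows, the `SIGTERM` handler is removed rather than updated, and the close-all loop is repeated in the `end` and `SIGINT` handlers. The block below sketches one way to share that cleanup across both signals. `closeAllSessions` and `installShutdownHandlers` are hypothetical names, not part of the package, and the `sessions` argument only needs to be a `Map` whose values expose an async `close()`.

```javascript
// Sketch only: `sessions` mirrors the Map<string, AgentBrowser> in the diff,
// but any object with an async close() method works as a value here.
async function closeAllSessions(sessions) {
  // allSettled keeps one failing close() from skipping the remaining sessions.
  const results = await Promise.allSettled(
    [...sessions.values()].map((browser) => browser.close())
  );
  sessions.clear();
  // Report how many sessions failed to close cleanly.
  return results.filter((r) => r.status === 'rejected').length;
}

// Assumed wiring, not present in the package: one handler for both signals.
function installShutdownHandlers(sessions, proc = process) {
  for (const signal of ['SIGINT', 'SIGTERM']) {
    proc.once(signal, async () => {
      await closeAllSessions(sessions);
      proc.exit(0);
    });
  }
}
```

This differs from the code in the diff in one deliberate way: the diff closes sessions sequentially, so an exception from one `close()` aborts the loop, whereas the sketch closes them concurrently and tolerates individual failures.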
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "textweb",
- "version": "0.2.0",
+ "version": "0.2.3",
  "description": "A text-grid web renderer for AI agents — see the web without screenshots",
  "main": "src/browser.js",
  "bin": {
@@ -10,8 +10,19 @@
  "scripts": {
  "start": "node src/cli.js",
  "serve": "node src/server.js",
- "test": "node test/basic.js"
+ "test": "node test/test-form.js && node test/test-live.js && node test/test-ats-e2e.js",
+ "test:form": "node test/test-form.js",
+ "test:live": "node test/test-live.js",
+ "test:ats": "node test/test-ats-e2e.js"
  },
+ "files": [
+ "src/",
+ "mcp/",
+ "tools/",
+ "logo.svg",
+ "README.md",
+ "LICENSE"
+ ],
  "keywords": [
  "ai",
  "agent",