chromeflow 0.2.3 → 0.4.0

package/CLAUDE.md CHANGED
@@ -34,15 +34,19 @@ Do NOT ask "should I open the browser?" — just do it. The user expects seamles
34
34
  ## Guided flow pattern
35
35
 
36
36
  ```
37
- 1. open_page(url) — navigate to the right page (add new_tab=true to keep current tab open)
37
+ 1. open_page(url) — navigate to the right page (add new_tab=true to keep current tab open; add background=true to keep the current tab focused if its form auto-saves on blur)
38
38
  2. For each step:
39
39
  a. Claude acts directly:
40
40
  click_element("Save") — press buttons/links Claude can press
41
- get_page_text() or wait_for_selector(".success") — ALWAYS confirm after click; click_element returns after 600ms regardless of outcome
42
- fill_form([{label, value}, ...]) — fill multiple fields in one call; prefer over repeated fill_input
43
- fill_input("Product name", "Pro") — fill a single field (works on React, CodeMirror, and contenteditable)
41
+ click_element("Save", until_selector=".success-toast") — when synthetic clicks may silently no-op on a React-heavy site, require an observable post-click condition (or until_url_contains / until_text_contains)
42
+ get_page_text() or wait_for_selector(".success") — confirm after click without an until-clause; click_element returns after 600ms regardless of outcome unless until_* was used
43
+ fill_form([{label, value}, ...], exact=true) — fill multiple fields in one call; pass exact=true on dense forms to refuse fuzzy text-walk matches
44
+ fill_input("Product name", "Pro") — fill a single field (works on React, CodeMirror, and contenteditable). Always check the response — it names the matched element so you can spot wrong-field matches
45
+ fill_input("Rate", "5", exact=true) — exact-match mode for short generic labels that may collide with neighbouring fields
46
+ react_set_input("input[name=email]", "x@y") — for inputs where fill_input fails (or for iframe-hosted inputs via frame=...) — handles the prototype-from-instance gotcha automatically
44
47
  type_text("hello world") — type via trusted keyboard events (use when fill_input fails isTrusted checks)
45
- set_file_input("Upload", "/abs/path/to/file.zip") — upload a file to a file input (even hidden inputs)
48
+ type_text("description", frame="iframe.se-rte") — type into a same-origin iframe's contenteditable (eBay description editor pattern)
49
+ set_file_input("Upload", "/abs/path/to/file.zip") — upload a file; returns success only after the upload is observably committed (no manual sleep needed between rapid uploads)
46
50
  clear_overlays() — call this immediately after fill_input/fill_form succeeds
47
51
  scroll_to_element("label text") — jump directly to a known field; prefer this over scroll_page when the target is known
48
52
  scroll_page("down") — reveal off-screen content when target location is unknown
@@ -50,7 +54,7 @@ Do NOT ask "should I open the browser?" — just do it. The user expects seamles
50
54
  get_page_text() — read errors/status after actions
51
55
  wait_for_selector(".success") — wait for a new element to appear
52
56
  wait_for_change(".toast") — wait for an existing element's content to mutate, then read it (uses MutationObserver, cheaper than polling)
53
- execute_script("document.title") — query DOM state programmatically
57
+ execute_script("return await fetch('/api/x').then(r => r.json())") — top-level await is supported, no window.__variable + sleep dance needed
54
58
  c. When an element can't be found or clicked:
55
59
  scroll_page("down") and retry — always try this first
56
60
  get_elements() — get EXACT DOM coords when needed
@@ -154,19 +158,48 @@ screenshot to check what happened.
154
158
  **Multiple elements with the same label** (e.g. many "Remove" buttons):
155
159
  `click_element("Remove", nth=3)` — use `nth` (1-based) to target the specific one by order top-to-bottom. Check `get_form_fields` or `get_page_text` first to determine which index corresponds to the right section.
156
160
 
161
+ **`fill_input` matched the wrong field** (always read the response — it names the matched element):
162
+ - If you wanted "Ad rate" and got back `<input name="title">`, the fuzzy text walker latched onto a neighbour. Retry with `exact=true` and a more specific hint, or use `react_set_input(selector, value)` with a precise CSS selector.
163
+ - The match-strength is reported as `aria-eq`, `placeholder-eq`, `name-eq`, `id-eq`, `label-text-eq`, or fuzzier kinds. Anything labeled `fuzzy-text-walk` or `*-includes` is the lowest-confidence kind — verify the matched element really was what you wanted.
164
+
157
165
  **`fill_input` not found or rejected by the page:**
158
166
  1. `click_element(hint)` to focus the field, then retry `fill_input`
159
- 2. If the site rejects programmatic input (isTrusted check, shadow DOM, custom editors):
167
+ 2. `react_set_input("input[name=...]", value)` uses the input's own prototype to set the value, dispatches input/change. Handles the "Illegal invocation" iframe gotcha and the prototype-from-instance ceremony for you.
168
+ 3. If the site rejects programmatic input (isTrusted check, shadow DOM, custom editors):
160
169
  - `click_element(hint)` to focus the field
161
170
  - `execute_script("document.execCommand('selectAll')")` to clear existing content
162
171
  - `type_text("new value")` — uses CDP trusted keyboard events that pass isTrusted checks
163
- 3. `find_and_highlight(hint, "Click here — I'll fill it in")` (no `valueToType`) then
172
+ 4. For iframe-hosted contenteditable rich-text editors (eBay's description, etc.):
173
+ - `type_text("body content", frame="iframe.selector")` — same-origin only. Focuses the iframe's contenteditable, types via CDP, dispatches input/change in the iframe's context so React reads the new value.
174
+ 5. `find_and_highlight(hint, "Click here — I'll fill it in")` (no `valueToType`) then
164
175
  `wait_for_click()` — the user's click focuses the field and `fill_input`'s active-element
165
176
  fallback fills it automatically
166
- 4. Call `clear_overlays()` after `fill_input` succeeds
167
- 5. Only use `valueToType` when the user must personally type the value (password, personal data)
177
+ 6. Call `clear_overlays()` after `fill_input` succeeds
178
+ 7. Only use `valueToType` when the user must personally type the value (password, personal data)
168
179
 
169
- **Waiting for async results** (build, save, deploy): `wait_for_selector(selector, timeout)` — never poll with screenshots.
180
+ **`click_element` returned success but the page didn't change** (common on React-heavy sites where synthetic clicks no-op):
181
+ Pass an `until_*` clause to require an observable post-click condition. `click_element` returns success=false if the condition isn't met within `until_timeout_ms` (default 5000):
182
+ ```
183
+ click_element("List with displayed fees", until_url_contains="/listing-published")
184
+ click_element("Save", until_selector=".success-toast")
185
+ click_element("Confirm", until_text_contains="Order placed")
186
+ ```
187
+ If success=false: fall back to `execute_script("document.querySelector(...).click()")` to fire the click from inside the page.
188
+
189
+ **`set_file_input` not committing on rapid back-to-back uploads:**
190
+ The default 3000ms commit-wait is enough for most uploaders. For batch photo uploads on slow React file handlers (eBay's 25-photo carousel, Stripe Connect document upload), increase `wait_ms` to 6000–8000 OR pass `verify_selector` pointing at the thumbnail/Remove-button that should appear:
191
+ ```
192
+ set_file_input("Photos", "/path/1.jpg", verify_selector=".photo-thumbnail:nth-of-type(1)")
193
+ set_file_input("Photos", "/path/2.jpg", verify_selector=".photo-thumbnail:nth-of-type(2)")
194
+ ```
195
+ The page-level file count is reported in the response — use it to spot uploaders that consume-and-reset the input vs uploaders that keep the file there.
196
+
197
+ **Waiting for async results** (build, save, deploy): `wait_for_selector(selector, timeout)` — never poll with screenshots. `wait_for_selector` pierces open shadow roots, so a selector inside a web component (Outlier task UI, Lit/Stencil widget) matches without ceremony.
198
+
199
+ **Waiting for a shadow host's tree to attach** (e.g. SPA route flips where `<my-host>` appears 10s before its shadow content hydrates, and `wait_for_selector("my-host")` resolves while `host.shadowRoot` is still null): pass `shadow_root=true`. The wait then requires the matched element's `.shadowRoot` to be non-null, not just for the host element to exist.
200
+ ```
201
+ wait_for_selector("my-host", shadow_root=true) — wait until the host element both exists AND has an attached shadowRoot
202
+ ```
170
203
 
171
204
  **Waiting for an existing region to update** (e.g. click Save, then get the confirmation toast; send a chat message, then get the reply): `wait_for_change(selector)` uses a MutationObserver on the element's subtree and returns its new text content as soon as the mutation settles. Prefer this over `wait_for_selector` + `get_page_text` when the element already exists and you just need its next state — one call instead of two, no polling.
172
205
 
@@ -179,26 +212,25 @@ set_dialog_response(type="confirm", value="true") — next confirm() re
179
212
  Then trigger the action (e.g. `click_element("Save As")`). The response is consumed once.
180
213
 
181
214
  **React Select / custom styled dropdowns** (e.g. "Select..." components on DataAnnotation):
182
- `click_element` and `fill_input` do NOT work on these — they intercept native events. Use
183
- `execute_script` with the hidden combobox input approach (most reliable):
215
+ `click_element` and `fill_input` do NOT work on these — they intercept native events. The cleanest path is `react_set_input` (which handles the prototype-from-instance setter for you) followed by a click on the filtered option:
216
+
217
+ ```
218
+ 1. react_set_input('input[id*="react-select-3-input"]', "Target Option")
219
+ — sets the hidden combobox input via its own prototype's value-setter and dispatches the input event React's onChange listens for
220
+ 2. (300ms pause for the dropdown to filter)
221
+ 3. execute_script("document.querySelector('[id*=\"react-select-3-option-0\"]').click()")
222
+ 4. Verify the control shows the selected value:
223
+ execute_script("document.querySelector('[class*=\"singleValue\"]').textContent.trim()")
224
+ ```
225
+
226
+ If you must hand-roll this with `execute_script` (older React-Select versions, weird custom wrappers), prefer reading the prototype FROM the instance to avoid "Illegal invocation" inside iframes:
184
227
 
185
228
  ```js
186
- // 1. Find the hidden combobox input (each React Select has one: input[id*="react-select-N-input"])
187
229
  var input = document.querySelector('input[id*="react-select-3-input"]');
188
230
  input.focus();
189
-
190
- // 2. Set value via native setter to trigger React's onChange
191
- var setter = Object.getOwnPropertyDescriptor(HTMLInputElement.prototype, 'value').set;
231
+ var setter = Object.getOwnPropertyDescriptor(Object.getPrototypeOf(input), 'value').set;
192
232
  setter.call(input, 'Target Option');
193
- input.dispatchEvent(new Event('input', {bubbles: true}));
194
-
195
- // 3. Wait 300ms for the dropdown to filter, then click the first matching option
196
- // (run this as a separate execute_script call after a brief pause)
197
- var option = document.querySelector('[id*="react-select-3-option-0"]');
198
- if (option) option.click();
199
-
200
- // 4. Verify — the control div should show the selected value
201
- document.querySelector('[class*="singleValue"]').textContent.trim();
233
+ input.dispatchEvent(new Event('input', { bubbles: true }));
202
234
  ```
203
235
 
204
236
  Fallback if the combobox approach doesn't work (older React Select versions):
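
Note on the prototype-from-instance snippet earlier in this hunk (not the fallback itself): the sketch below shows, outside a browser, why the setter is read from `Object.getPrototypeOf(input)` and called on the instance. `FakeInput` and `attachTracker` are hypothetical stand-ins. The tracker is only an assumed model of a framework that shadows `value` on the element instance to remember the last value it saw; it is not taken from React's source or from this package.

```javascript
// Hypothetical stand-in for an input element: `value` lives on the prototype.
class FakeInput {
  #value = "";
  get value() { return this.#value; }
  set value(v) { this.#value = String(v); }
}

// Assumed model of a framework-side tracker: it shadows `value` on the
// instance so that plain assignments also update its own "last seen" copy.
function attachTracker(node) {
  const desc = Object.getOwnPropertyDescriptor(Object.getPrototypeOf(node), "value");
  let lastSeen = desc.get.call(node);
  Object.defineProperty(node, "value", {
    configurable: true,
    get() { return desc.get.call(this); },
    set(v) { lastSeen = String(v); desc.set.call(this, v); },
  });
  return { lastSeen: () => lastSeen };
}

// Same lookup as the diffed snippet: take the setter from the instance's own
// prototype and call it on the instance, bypassing the shadowing property.
function setViaPrototype(node, v) {
  const setter = Object.getOwnPropertyDescriptor(Object.getPrototypeOf(node), "value").set;
  setter.call(node, v);
}

const input = new FakeInput();
const tracker = attachTracker(input);

input.value = "typed by script";         // tracker follows along
setViaPrototype(input, "Target Option"); // tracker is not updated

console.log(input.value, "|", tracker.lastSeen());
```

Under this model, a plain assignment leaves the tracker and the element in agreement, so a later `input` event has nothing new to report. The prototype setter leaves them out of sync, which is the state a change handler would react to.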
@@ -6,15 +6,18 @@ import { execSync } from "child_process";
6
6
  function registerBrowserTools(server, bridge) {
7
7
  server.tool(
8
8
  "open_page",
9
- "Navigate to a URL. By default reuses the active tab. Set new_tab=true to open alongside the current tab without losing it. After navigating, call get_page_text to read the page \u2014 do NOT take a screenshot.",
9
+ `Navigate to a URL. By default reuses the active tab. Set new_tab=true to open alongside the current tab without losing it. After navigating, call get_page_text to read the page \u2014 do NOT take a screenshot.
10
+
11
+ Set background=true (only with new_tab=true) to open the new tab WITHOUT switching focus to it. Use this when the current tab has a partially-filled form whose page auto-saves on focus loss (e.g. eBay seller listings) \u2014 switching away would trigger the auto-save and corrupt the in-progress draft.`,
10
12
  {
11
13
  url: z.string().url().describe("The URL to navigate to"),
12
- new_tab: z.boolean().optional().describe("Open in a new tab instead of replacing the current one (default false)")
14
+ new_tab: z.boolean().optional().describe("Open in a new tab instead of replacing the current one (default false)"),
15
+ background: z.boolean().optional().describe("If new_tab=true, do not switch focus to the new tab. Default false. Ignored when new_tab is false.")
13
16
  },
14
- async ({ url, new_tab }) => {
15
- await bridge.request({ type: "navigate", url, newTab: new_tab ?? false });
17
+ async ({ url, new_tab, background }) => {
18
+ await bridge.request({ type: "navigate", url, newTab: new_tab ?? false, background: background ?? false });
16
19
  return {
17
- content: [{ type: "text", text: `Navigated to ${url}${new_tab ? " (new tab)" : ""}` }]
20
+ content: [{ type: "text", text: `Navigated to ${url}${new_tab ? background ? " (new background tab)" : " (new tab)" : ""}` }]
18
21
  };
19
22
  }
20
23
  );
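
Review note: the nested ternary in the new response text is easy to misread. Below is a minimal sketch of the same expression as a standalone function. `navigationLabel` is a hypothetical name that does not appear in the package.

```javascript
// Hypothetical helper mirroring the template literal in the open_page handler.
// `background` only changes the label when `newTab` is true.
function navigationLabel(url, newTab, background) {
  const suffix = newTab ? (background ? " (new background tab)" : " (new tab)") : "";
  return `Navigated to ${url}${suffix}`;
}

console.log(navigationLabel("https://example.com", false, true));
console.log(navigationLabel("https://example.com", true, false));
console.log(navigationLabel("https://example.com", true, true));
```

The first call shows that `background=true` without `new_tab=true` yields the plain label, matching the "Ignored when new_tab is false" wording in the parameter description.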
@@ -259,15 +262,21 @@ Unlike fill_input (which sets .value programmatically), this produces real keyst
259
262
  - fill_input fails because the site validates event.isTrusted (e.g. Outlier, DataAnnotation code editors)
260
263
  - The target is a shadow DOM input, custom web component, or heavily guarded editor
261
264
  - You need to type into a CodeMirror/Monaco/Ace editor that rejects programmatic value changes
265
+ - The target lives inside a same-origin iframe (e.g. eBay's "se-rte" rich-text description editor) \u2014 pass the iframe's CSS selector via the \`frame\` parameter
262
266
 
263
267
  Usage: first click_element or execute_script to focus the target field, then call type_text with the content.
264
- To clear existing content before typing, use execute_script("document.execCommand('selectAll')") first.`,
268
+ To clear existing content before typing, use execute_script("document.execCommand('selectAll')") first.
269
+
270
+ For iframe contenteditables: pass \`frame\` (a CSS selector for the iframe). type_text descends into the iframe, focuses its first editable element, types via CDP, then dispatches input/change in the iframe's context so React picks up the change. Same-origin iframes only \u2014 cross-origin iframes will return an error.`,
265
271
  {
266
- text: z.string().describe("The text to type into the focused element")
272
+ text: z.string().describe("The text to type into the focused element"),
273
+ frame: z.string().optional().describe(
274
+ "CSS selector for an iframe whose contents you want to type into (e.g. 'iframe.se-rte-frame__summary'). Same-origin only. Before typing, the first contenteditable/input inside the iframe is focused; after typing, input/change events are dispatched in the iframe's context."
275
+ )
267
276
  },
268
- async ({ text }) => {
277
+ async ({ text, frame }) => {
269
278
  const timeoutMs = Math.max(3e4, text.length * 90 + 15e3);
270
- const response = await bridge.request({ type: "type_text", text }, timeoutMs);
279
+ const response = await bridge.request({ type: "type_text", text, frame }, timeoutMs);
271
280
  const r = response;
272
281
  return {
273
282
  content: [{ type: "text", text: r.message ?? (r.success ? "Text typed successfully" : "Failed to type text") }]
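
Review note: the `timeoutMs` expression in this hunk scales with the length of the text. Below is a small worked example of that arithmetic. `typeTextTimeoutMs` is a hypothetical wrapper name; only the formula comes from the diff.

```javascript
// Same arithmetic as the handler: a budget of 90 ms per character plus a
// 15 s margin, with a floor of 30 s.
function typeTextTimeoutMs(text) {
  return Math.max(3e4, text.length * 90 + 15e3);
}

console.log(typeTextTimeoutMs("hello world"));    // short text: the 30 s floor applies
console.log(typeTextTimeoutMs("x".repeat(167)));  // first length that exceeds the floor
console.log(typeTextTimeoutMs("x".repeat(1000))); // long text: length dominates
```

With this formula the floor applies to any text of 166 characters or fewer. Longer inputs get a proportionally larger request timeout.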
@@ -278,29 +287,65 @@ To clear existing content before typing, use execute_script("document.execComman
278
287
  "set_file_input",
279
288
  `Upload a file to a file input field. Works even when the input is visually hidden behind a custom drag-and-drop zone.
280
289
  Uses Chrome DevTools Protocol to set the file \u2014 the only way to bypass the browser's file-input script restriction.
281
- hint: label text or name of the file input (or empty string to target the first file input on the page).
282
- file_path: absolute path to the file on the local filesystem (e.g. /Users/you/Downloads/task.zip).
283
- After calling this, verify the upload was accepted: use execute_script to check that the input's files.length > 0, or use get_page_text to look for a success indicator (e.g. a Remove button appearing). If not accepted, call set_file_input again \u2014 occasional React timing issues may require a retry.`,
290
+
291
+ Returns success=true ONLY if an observable change is detected within wait_ms: either the page-level file count goes up, or the file is consumed by the page's React handler (input is reset), or verify_selector matches a new element on the page. Otherwise success=false with a clear message \u2014 typically because the page rejected the file (size/type) or the React handler hasn't run yet.
292
+
293
+ For rapid batch uploads (multiple set_file_input calls in a row), this commit-wait prevents the second CDP call from overwriting the first before React reads it \u2014 no manual sleep needed between calls.
294
+
295
+ hint: label text, name, or CSS selector of the file input (or empty string to target the first file input on the page).
296
+ file_path: absolute path to the file on the local filesystem (e.g. /Users/you/Downloads/task.zip).`,
284
297
  {
285
298
  hint: z.string().describe("Label text, name, or surrounding text of the file input. Use empty string to target the first file input on the page."),
286
- file_path: z.string().describe("Absolute path to the file to upload (e.g. /Users/you/Downloads/task.zip)")
299
+ file_path: z.string().describe("Absolute path to the file to upload (e.g. /Users/you/Downloads/task.zip)"),
300
+ wait_ms: z.number().int().min(0).optional().describe("How long to wait for an observable change after setting the file (default 3000). Increase for slow uploaders that take a moment to render thumbnails."),
301
+ verify_selector: z.string().optional().describe('Optional CSS selector that should appear after a successful upload (e.g. ".photo-thumbnail", "[data-uploaded=true]"). When matched, set_file_input returns success immediately.')
287
302
  },
288
- async ({ hint, file_path }) => {
289
- const response = await bridge.request({ type: "set_file_input", hint, filePath: file_path });
303
+ async ({ hint, file_path, wait_ms, verify_selector }) => {
304
+ const wsTimeout = Math.max(3e4, (wait_ms ?? 3e3) + 1e4);
305
+ const response = await bridge.request(
306
+ { type: "set_file_input", hint, filePath: file_path, waitMs: wait_ms, verifySelector: verify_selector },
307
+ wsTimeout
308
+ );
290
309
  const r = response;
291
310
  return {
292
311
  content: [{ type: "text", text: r.message ?? (r.success ? "File set successfully" : "Failed to set file") }]
293
312
  };
294
313
  }
295
314
  );
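
Review note: the new `set_file_input` description names three signals, any one of which counts as a committed upload. The check itself runs on the extension side and is not part of this diff. The sketch below expresses that "any of three" rule as a pure function. The snapshot shape and every field name in it are hypothetical.

```javascript
// Hypothetical snapshots: { pageFileCount, inputFileCount, verifySelectorMatched }.
// `atSet` is taken right after the file is set, `later` at some point within wait_ms.
function uploadCommitted(atSet, later) {
  // Signal 1: the page-level file count went up.
  const countWentUp = later.pageFileCount > atSet.pageFileCount;
  // Signal 2: the page consumed the file and reset the input.
  const inputConsumed = atSet.inputFileCount > 0 && later.inputFileCount === 0;
  // Signal 3: the caller-supplied verify_selector matched.
  const verifyMatched = later.verifySelectorMatched === true;
  return countWentUp || inputConsumed || verifyMatched;
}

const atSet = { pageFileCount: 0, inputFileCount: 1 };
console.log(uploadCommitted(atSet, { pageFileCount: 0, inputFileCount: 0, verifySelectorMatched: false }));
console.log(uploadCommitted(atSet, { pageFileCount: 0, inputFileCount: 1, verifySelectorMatched: false }));
```

The first call models an uploader that consumes and resets the input. The second models a page where nothing observable happened, which under the description above would be reported as success=false.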
315
+ server.tool(
316
+ "react_set_input",
317
+ `Set the value of a React-controlled input via the native value-setter, dispatching the input/change events that React's onChange handler listens for.
318
+
319
+ Use this instead of writing your own \`Object.getOwnPropertyDescriptor(HTMLInputElement.prototype, 'value').set\` script \u2014 this helper handles the prototype-from-instance gotcha automatically (inputs inside iframes have their own HTMLInputElement constructor, and using the outer-window prototype throws "Illegal invocation").
320
+
321
+ Common cases:
322
+ - A standard input that fill_input fails on because the page validates event.isTrusted or uses an exotic React Hook Form setup.
323
+ - An input inside a same-origin iframe (pass frame="iframe.selector").
324
+ - A hidden React-Select combobox input (selector='input[id*="react-select-3-input"]').
325
+
326
+ Returns the matched element's tag/name/id/type so you can verify it was the right field, and the read-back value so you can spot when React rejected the new value.`,
327
+ {
328
+ selector: z.string().describe("CSS selector of the input to set (e.g. 'input[name=email]', '#promoted-rate-input')"),
329
+ value: z.string().describe("The value to set"),
330
+ frame: z.string().optional().describe('Optional CSS selector for a same-origin iframe whose contents contain the input (e.g. "iframe.se-rte-frame"). Cross-origin iframes are not supported.')
331
+ },
332
+ async ({ selector, value, frame }) => {
333
+ const response = await bridge.request({ type: "react_set_input", selector, value, frame });
334
+ const r = response;
335
+ return {
336
+ content: [{ type: "text", text: r.message ?? (r.success ? "Set" : "Failed to set") }]
337
+ };
338
+ }
339
+ );
296
340
  server.tool(
297
341
  "execute_script",
298
342
  `Execute JavaScript in the current page's context and return the result as a string.
299
343
  Use this to read framework state, check DOM properties, or interact with page APIs that aren't reachable via text.
300
344
  Prefer get_page_text for reading visible content. Use this for programmatic DOM queries (e.g. checking an element's attribute, reading a value not visible in text).
301
345
  Top-level return statements are supported (e.g. multi-statement scripts with \`return value;\`).
346
+ Top-level \`await\` is supported \u2014 write \`return await fetch(url).then(r => r.json())\` directly without the window.__variable + sleep + re-read pattern. Detected automatically when the code contains the \`await\` keyword.
302
347
  If the page called alert()/confirm()/prompt() since the last check, the message will appear as PAGE ALERT in the result \u2014 read it and act on it.
303
- NOTE: Pages with strict Content Security Policy (e.g. Stripe, GitHub) will block eval and return a CSP error \u2014 do not retry, use get_page_text or fill_input instead.`,
348
+ NOTE: On pages with a strict Content Security Policy (e.g. Stripe, GitHub), execution falls through to a CDP path that bypasses CSP, so the script still runs and retries usually aren't needed.`,
304
349
  {
305
350
  code: z.string().describe(
306
351
  "JavaScript expression or multi-statement script to evaluate in the page. Top-level `return` is supported."
@@ -9,14 +9,19 @@ function registerCaptureTools(server, bridge) {
9
9
  `Fill a form input field with a value automatically.
10
10
  Use this for fields Claude knows the answer to (product name, price, description, tier name, URLs, etc.).
11
11
  DO NOT use for: email address, password, payment/billing info, phone number \u2014 highlight those instead and tell the user what to enter.
12
- After filling, call wait_for_click only if the user needs to review/confirm; otherwise proceed directly to the next step.`,
12
+ After filling, call wait_for_click only if the user needs to review/confirm; otherwise proceed directly to the next step.
13
+
14
+ The response always includes the matched element's identifying attributes (e.g. \`<input name="title" id="..." placeholder="...">\`) and the match-strength (aria-eq, name-eq, fuzzy-text-walk, etc.). VERIFY this is the field you intended \u2014 fuzzy-text-walk matches are the lowest-confidence kind and have historically caused fill_input to land on the wrong field on dense forms.
15
+
16
+ Pass \`exact: true\` to refuse fuzzy text-walk matches entirely. Use this for short generic labels like "Rate", "Price", or "Amount" on dense forms with many similarly-labeled fields. If no exact match exists, fill_input returns success=false instead of silently filling the wrong field.`,
13
17
  {
14
18
  textHint: z.string().describe("The label, placeholder, or nearby text identifying the input (e.g. 'Product name', 'Amount', 'Description')"),
15
19
  value: z.string().describe("The value to fill in"),
16
- nth: z.number().int().min(1).optional().describe("Which match to fill when multiple inputs share the same label (1 = first/topmost, default 1)")
20
+ nth: z.number().int().min(1).optional().describe("Which match to fill when multiple inputs share the same label (1 = first/topmost, default 1)"),
21
+ exact: z.boolean().optional().describe("If true, only match aria-label/placeholder/name/id/label-text equal to the hint \u2014 refuse fuzzy text-walk matches. Default false.")
17
22
  },
18
- async ({ textHint, value, nth }) => {
19
- const response = await bridge.request({ type: "fill_input", textHint, value, nth });
23
+ async ({ textHint, value, nth, exact }) => {
24
+ const response = await bridge.request({ type: "fill_input", textHint, value, nth, exact });
20
25
  if (response.type !== "fill_response") throw new Error("Unexpected response");
21
26
  const r = response;
22
27
  return {
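
Review note: the new description asks the caller to check the reported match-strength. The sketch below shows one way a caller could bucket the kinds named in this diff. The diff does not show the exact response format, so the sketch assumes the kind is already available as a plain string. `matchConfidence` and the `"placeholder-includes"` sample are hypothetical.

```javascript
// Kinds the diff lists as exact matches.
const EXACT_KINDS = new Set(["aria-eq", "placeholder-eq", "name-eq", "id-eq", "label-text-eq"]);

// Hypothetical classifier: "exact" for the listed *-eq kinds, "low" for
// fuzzy-text-walk and anything ending in -includes, "unknown" otherwise.
function matchConfidence(kind) {
  if (EXACT_KINDS.has(kind)) return "exact";
  if (kind === "fuzzy-text-walk" || kind.endsWith("-includes")) return "low";
  return "unknown";
}

console.log(matchConfidence("name-eq"));
console.log(matchConfidence("fuzzy-text-walk"));
console.log(matchConfidence("placeholder-includes"));
```

A "low" result is the case where retrying with `exact: true`, or switching to `react_set_input` with a precise selector, is the safer path according to the CLAUDE.md guidance in this same diff.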
@@ -18,22 +18,36 @@ function registerFlowTools(server, bridge) {
18
18
  Use this whenever Claude can press a button without needing user input \u2014 e.g. "Save", "Continue", "Create product", "Add pricing", "Confirm", "Next".
19
19
  After clicking, use get_page_text to check the result \u2014 only use take_screenshot if you need pixel positions.
20
20
  Do NOT use for: elements that require the user to make a personal choice, consent to terms, or enter sensitive data.
21
- When multiple elements share the same label (e.g. many "Remove" buttons), use nth to target a specific one (1 = first/topmost, 2 = second, etc.).`,
21
+ When multiple elements share the same label (e.g. many "Remove" buttons), use nth to target a specific one (1 = first/topmost, 2 = second, etc.).
22
+
23
+ Verifying the click took effect: on React-heavy sites the synthetic click sometimes returns success but the handler never ran. Pass an "until" condition that should hold AFTER the click \u2014 click_element will then poll for it and return success only if the page actually changed:
24
+ - until_selector: a CSS selector that should appear (e.g. ".success-toast", "#confirm-modal")
25
+ - until_url_contains: a substring that should appear in the URL (e.g. "/listing-published")
26
+ - until_text_contains: a substring that should appear anywhere in page text (e.g. "Listing created")
27
+ If the until-condition is not met within until_timeout_ms (default 5000ms), click_element returns success=false with a clear message so the caller can retry or take a different path.`,
22
28
  {
23
29
  textHint: z.string().describe(
24
30
  "The visible label of the button or link (e.g. 'Save product', 'Continue', 'Add a product', 'Create')"
25
31
  ),
26
- nth: z.number().int().min(1).optional().describe("Which match to click when multiple elements share the same label (1 = first/topmost, default 1)")
32
+ nth: z.number().int().min(1).optional().describe("Which match to click when multiple elements share the same label (1 = first/topmost, default 1)"),
33
+ until_selector: z.string().optional().describe('Wait until this CSS selector appears on the page after the click (e.g. ".success-toast"). Returns success=false if it does not appear within until_timeout_ms.'),
34
+ until_url_contains: z.string().optional().describe('Wait until the URL contains this substring after the click (e.g. "/checkout/complete"). Returns success=false if it does not.'),
35
+ until_text_contains: z.string().optional().describe('Wait until the visible page text contains this substring after the click (e.g. "Listing published"). Returns success=false if it does not.'),
36
+ until_timeout_ms: z.number().int().min(500).optional().describe("How long to wait for the until-condition, in milliseconds (default 5000). Only used if one of until_* is set.")
27
37
  },
28
- async ({ textHint, nth }) => {
29
- const response = await bridge.request({ type: "click_element", textHint, nth });
38
+ async ({ textHint, nth, until_selector, until_url_contains, until_text_contains, until_timeout_ms }) => {
39
+ const wsTimeout = Math.max(3e4, (until_timeout_ms ?? 0) + 1e4);
40
+ const response = await bridge.request(
41
+ { type: "click_element", textHint, nth, until_selector, until_url_contains, until_text_contains, until_timeout_ms },
42
+ wsTimeout
43
+ );
30
44
  const r = response;
31
45
  if (!r.success) {
32
46
  return {
33
47
  content: [
34
48
  {
35
49
  type: "text",
36
- text: `Could not click "${textHint}": ${r.message}. Call take_screenshot() to locate the element visually.`
50
+ text: `Could not click "${textHint}": ${r.message}`
37
51
  }
38
52
  ]
39
53
  };
@@ -78,7 +92,15 @@ If the click causes page navigation, this resolves when the new page finishes lo
78
92
  Examples: wait for a build to finish, a success/error message to appear, a modal to open.
79
93
  After it resolves, use get_page_text to read the result rather than taking a screenshot.
80
94
  For long-running server-side processes (e.g. a query job that may take minutes), set poll_interval
81
- to 15 seconds so the page is checked gently rather than hammered every 500ms.`,
95
+ to 15 seconds so the page is checked gently rather than hammered every 500ms.
96
+
97
+ Pierces open shadow roots automatically \u2014 selectors for elements inside web components
98
+ (Outlier task UI, Lit/Stencil widgets) match without needing a shadow-DOM-aware caller.
99
+
100
+ Pass \`shadow_root: true\` when the matched element is itself a shadow host whose tree
101
+ hasn't attached yet \u2014 common after SPA route transitions where the host element appears
102
+ seconds before its shadow content hydrates. Without this, wait_for_selector("the-host")
103
+ resolves on the empty host and the next execute_script(host.shadowRoot) returns null.`,
82
104
  {
83
105
  selector: z.string().describe(
84
106
  `CSS selector to wait for (e.g. '.deploy-ready', '[data-status="error"]', '.toast-error')`
@@ -86,14 +108,21 @@ to 15 seconds so the page is checked gently rather than hammered every 500ms.`,
86
108
  timeout: z.number().optional().describe("Max seconds to wait (default 30)"),
87
109
  poll_interval: z.number().optional().describe(
88
110
  "How often to check for the selector, in seconds (default 0.5). Set to 15 when waiting for a slow server-side process."
111
+ ),
112
+ shadow_root: z.boolean().optional().describe(
113
+ "If true, also require the matched element to have an attached shadowRoot (not null). Use after SPA navigations where the shadow host appears before its tree hydrates. Default false."
89
114
  )
90
115
  },
91
- async ({ selector, timeout = 30, poll_interval }) => {
116
+ async ({ selector, timeout = 30, poll_interval, shadow_root }) => {
92
117
  const timeoutMs = timeout * 1e3;
93
118
  const pollMs = poll_interval ? poll_interval * 1e3 : void 0;
94
- await bridge.request({ type: "wait_for_selector", selector, timeout: timeoutMs, refresh: pollMs }, timeoutMs + 5e3);
119
+ await bridge.request(
120
+ { type: "wait_for_selector", selector, timeout: timeoutMs, refresh: pollMs, shadow_root },
121
+ timeoutMs + 5e3
122
+ );
123
+ const suffix = shadow_root ? " (with attached shadowRoot)" : "";
95
124
  return {
96
- content: [{ type: "text", text: `Selector "${selector}" found on page.` }]
125
+ content: [{ type: "text", text: `Selector "${selector}" found on page${suffix}.` }]
97
126
  };
98
127
  }
99
128
  );
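
Review note: the `shadow_root` flag adds a second condition on top of "the element exists". The polling itself lives in the extension and is not shown in this diff. The sketch below captures only the condition as described, with plain objects standing in for DOM elements. `selectorSatisfied` is a hypothetical name.

```javascript
// Hypothetical readiness predicate for one polling tick.
// `element` is whatever the selector lookup returned (or null if nothing matched).
function selectorSatisfied(element, requireShadowRoot) {
  if (element == null) return false;      // nothing matched yet
  if (!requireShadowRoot) return true;    // default: existence is enough
  return element.shadowRoot != null;      // shadow_root=true: tree must be attached
}

const emptyHost = { shadowRoot: null };   // host rendered, shadow tree not attached yet
const hydratedHost = { shadowRoot: {} };  // stand-in for an attached ShadowRoot

console.log(selectorSatisfied(emptyHost, false));
console.log(selectorSatisfied(emptyHost, true));
console.log(selectorSatisfied(hydratedHost, true));
```

One DOM detail to keep in mind: a host with a closed shadow root also reports `shadowRoot === null`, so a check of this shape can only ever succeed for open shadow roots.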
@@ -163,17 +192,22 @@ Examples: scroll_to_element("#submit-btn"), scroll_to_element("Billing address")
163
192
  `Fill multiple form fields in a single call by targeting each field by its label text.
164
193
  Use this instead of calling fill_input repeatedly \u2014 it fills all fields in one round trip and returns a per-field success report.
165
194
  Ideal for forms with many textareas or inputs where each fill would otherwise require a separate tool call.
166
- fields is an array of {label, value} pairs. label should match the field's visible label, placeholder, or aria-label.`,
195
+ fields is an array of {label, value} pairs. label should match the field's visible label, placeholder, or aria-label.
196
+
197
+ Each per-field result includes the matched element description (e.g. \`<input name="title" id="..." placeholder="...">\`) so Claude can spot when fill_form picked the wrong field.
198
+
199
+ Pass \`exact: true\` for forms with short generic labels (like "Rate" or "Amount") that may collide with similarly-labeled neighbours \u2014 fields without an exact aria-label/placeholder/name/id/label-text match will return success=false instead of silently filling the wrong field.`,
167
200
  {
168
201
  fields: z.array(
169
202
  z.object({
170
203
  label: z.string().describe("Visible label, placeholder, or aria-label of the field"),
171
204
  value: z.string().describe("Value to fill in")
172
205
  })
173
- ).describe("List of fields to fill")
206
+ ).describe("List of fields to fill"),
207
+ exact: z.boolean().optional().describe("If true, refuse fuzzy text-walk matches for every field. Default false.")
174
208
  },
175
- async ({ fields }) => {
176
- const response = await bridge.request({ type: "fill_form", fields });
209
+ async ({ fields, exact }) => {
210
+ const response = await bridge.request({ type: "fill_form", fields, exact });
177
211
  const r = response;
178
212
  const lines = r.results.map((f) => `${f.success ? "\u2713" : "\u2717"} "${f.label}": ${f.message}`);
179
213
  return {
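
Review note: the per-field report is built by the `lines` expression at the end of this hunk. The sketch below wraps that same expression in a function and runs it on sample data. `formatFillReport` is a hypothetical name and both `message` strings are invented for illustration; the real messages come from the extension.

```javascript
// Same mapping as the handler: a check mark or cross, the quoted label, then the message.
function formatFillReport(results) {
  return results.map((f) => `${f.success ? "\u2713" : "\u2717"} "${f.label}": ${f.message}`);
}

// Invented sample results, only to show the output shape.
const sampleResults = [
  { success: true, label: "Title", message: 'filled <input name="title">' },
  { success: false, label: "Rate", message: "no exact match" },
];

console.log(formatFillReport(sampleResults).join("\n"));
```

With `exact: true`, a field such as "Rate" that has no exact match would appear as a failed line like the second one, instead of being filled into a neighbouring field.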
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "chromeflow",
3
- "version": "0.2.3",
3
+ "version": "0.4.0",
4
4
  "description": "Browser guidance MCP server for Claude Code — highlights, clicks, fills, and captures from the web so you don't have to.",
5
5
  "type": "module",
6
6
  "bin": {