chromeflow 0.1.8 → 0.1.10

package/CLAUDE.md CHANGED
@@ -22,11 +22,13 @@ Do NOT ask "should I open the browser?" — just do it. The user expects seamles
  `scroll_page` then retry, or use `highlight_region` to show the user. Never use
  `osascript`, `applescript`, or any shell command to control the browser.

- 2. **`take_screenshot` is only for pixel-position lookups before `highlight_region`.** Every
- other state check after navigation, after a click, after `wait_for_click`, to confirm an
- action succeeded must use `get_page_text` or `wait_for_selector`. Never take a screenshot
- as a "let me see what's on screen" step. Correct order: try `click_element` if it fails,
- THEN `take_screenshot` `highlight_region`. Never screenshot preemptively.
+ 2. **Use `get_elements` to get pixel coordinates, not `take_screenshot`.** `get_elements`
+ returns exact DOM coordinates (always accurate). `take_screenshot` is a last resort only
+ when you need to SEE the visual layout not for finding positions. Correct order when
+ `click_element` or `fill_input` fails: try `get_elements` first to get exact coords →
+ then `highlight_region` using those coords. Only use `take_screenshot` if you genuinely
+ need to see what the page looks like. Screenshots now include a red coordinate grid to
+ help read positions — use the grid labels, not visual estimates.

  3. **`open_page` already waits for navigation.** Never call `wait_for_navigation`
  immediately after `open_page` — it will time out.
@@ -54,11 +56,12 @@ Do NOT ask "should I open the browser?" — just do it. The user expects seamles
  get_page_text() — read errors/status after actions
  wait_for_selector(".success") — wait for async changes (builds, modals)
  execute_script("document.title") — query DOM state programmatically
- c. Only take a screenshot when click_element failed and you need pixel coords:
+ c. When click_element or fill_input fails and you need pixel coords:
  click_element("Save") — try this first, ALWAYS
- [if fails] take_screenshot() now get coords, not before
- highlight_region(x,y,w,h,msg) — point user to exact location
+ [if fails] get_elements() — get EXACT DOM coords, use these in highlight_region
+ highlight_region(x,y,w,h,msg) — use exact coords from get_elements, not estimates
  [after wait_for_click] get_page_text() — confirm result, NOT take_screenshot
+ [last resort only] take_screenshot() — only if you need to see the visual layout
  d. Pause for the user when needed:
  find_and_highlight(text, msg) — show the user what to do
  wait_for_click() — wait for user interaction
@@ -100,7 +103,7 @@ Use the absolute path for `envPath` — it's the Claude Code working directory +
  - After any action → `get_page_text()` to check for errors (not `take_screenshot`)
  - After `click_element("Save")` / form submission → use `get_page_text()` or `wait_for_selector` to confirm. Never use `wait_for_navigation` — most form saves don't navigate.
  - `click_element` not found → `scroll_page("down")` then retry
- - Still not found → `take_screenshot()` then `highlight_region(x,y,w,h,msg)`
+ - Still not found → `get_elements()` to get exact coords, then `highlight_region(x,y,w,h,msg)` using those coords. Only use `take_screenshot()` if you need to visually inspect the page.
  - `fill_input` not found → `click_element(hint)` to focus the field, then retry `fill_input`. If still failing, use `find_and_highlight(hint, "Click here — I'll fill it in")` (NO `valueToType`) then `wait_for_click()` then retry `fill_input` — after the user focuses the field by clicking, the active-element fallback fills it automatically. `find_and_highlight` uses DOM positioning (pixel-perfect) — only fall back to `take_screenshot` + `highlight_region` if `find_and_highlight` returns false. After `fill_input` succeeds, immediately call `clear_overlays()` to remove the highlight. Only use `valueToType` when the user genuinely must type the value themselves (e.g. password, personal data).
  - Waiting for async result (build, save, deploy) → `wait_for_selector(selector, timeout)`
  - Never use Bash to work around a stuck browser interaction
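
The fallback order that the updated CLAUDE.md prescribes for a failed click can be sketched as a small routine. This is an illustrative sketch only: the `tools` object, the `clickWithFallback` name, and the return shapes (a boolean from `click_element`, an array of `{ label, x, y, width, height }` objects from `get_elements`) are assumptions made for the example, not chromeflow's actual client API.

```javascript
// Hypothetical sketch of the fallback order described in CLAUDE.md.
// `tools` is an assumed object whose async methods mirror the MCP tool names;
// the return shapes are assumptions for illustration only.
async function clickWithFallback(tools, label, msg) {
  // 1. Always try click_element first.
  if (await tools.click_element(label)) return "clicked";

  // 2. Not found: scroll down once and retry.
  await tools.scroll_page("down");
  if (await tools.click_element(label)) return "clicked";

  // 3. Still not found: ask for exact DOM coordinates instead of a screenshot.
  const elements = await tools.get_elements();
  const match = elements.find((e) => e.label === label);
  if (match) {
    // Pass the coordinates through unchanged; the guidance says not to adjust them.
    await tools.highlight_region(match.x, match.y, match.width, match.height, msg);
    return "highlighted";
  }

  // 4. Last resort only: inspect the visual layout.
  await tools.take_screenshot();
  return "screenshot";
}
```

The point of the ordering is that `take_screenshot` is reached only when both the DOM-based click and the DOM-based coordinate lookup have come up empty.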
package/dist/index.js CHANGED
@@ -27,7 +27,7 @@ async function main() {
    const bridge = new WsBridge();
    const server = new McpServer({
      name: "chromeflow",
-     version: "0.1.8"
+     version: "0.1.10"
    });
    registerBrowserTools(server, bridge);
    registerHighlightTools(server, bridge);
@@ -46,6 +46,31 @@ function registerBrowserTools(server, bridge) {
        };
      }
    );
+   server.tool(
+     "get_elements",
+     `Get the exact pixel positions of all visible interactive elements on the page (inputs, buttons, links, selects).
+ Use this INSTEAD OF take_screenshot when you need coordinates for highlight_region \u2014 the coordinates are exact DOM values, not estimates.
+ Returns a numbered list with element type, label, and precise x/y/width/height in CSS pixels.
+ After calling this, use those exact coordinates in highlight_region \u2014 do NOT adjust them.`,
+     {},
+     async () => {
+       const response = await bridge.request({ type: "get_elements" });
+       if (response.type !== "elements_response") throw new Error("Unexpected response");
+       const els = response.elements;
+       if (els.length === 0) {
+         return { content: [{ type: "text", text: "No visible interactive elements found on page." }] };
+       }
+       const lines = els.map(
+         (e) => `${e.index}. ${e.type} "${e.label}" \u2014 x:${e.x} y:${e.y} w:${e.width} h:${e.height}`
+       );
+       return {
+         content: [{ type: "text", text: `Visible interactive elements:
+ ${lines.join("\n")}
+
+ Use these exact x/y values in highlight_region.` }]
+       };
+     }
+   );
    server.tool(
      "execute_script",
      `Execute JavaScript in the current page's context and return the result as a string.
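
The `get_elements` handler in the hunk above returns its results as text, one element per line. A consumer that wants numeric coordinates for `highlight_region` would have to parse those lines back. The sketch below shows one way to do that. `formatElement` copies the template from the diff; `parseElementLine` and `LINE_RE` are hypothetical helpers that are not part of chromeflow, and the sketch assumes the element type contains no whitespace and the coordinates are plain decimal numbers.

```javascript
// Mirrors the per-element line format used by the get_elements handler in the diff.
function formatElement(e) {
  return `${e.index}. ${e.type} "${e.label}" \u2014 x:${e.x} y:${e.y} w:${e.width} h:${e.height}`;
}

// Hypothetical helper (not part of chromeflow): parse one formatted line.
// Assumes the type has no whitespace and coordinates are plain decimals.
// The greedy label group tolerates double quotes inside the label.
const LINE_RE = /^(\d+)\. (\S+) "(.*)" \u2014 x:(-?[\d.]+) y:(-?[\d.]+) w:([\d.]+) h:([\d.]+)$/;

function parseElementLine(line) {
  const m = LINE_RE.exec(line);
  if (!m) return null; // not a get_elements result line
  const [, index, type, label, x, y, width, height] = m;
  return {
    index: Number(index),
    type,
    label,
    x: Number(x),
    y: Number(y),
    width: Number(width),
    height: Number(height),
  };
}
```

Because the values come straight from the parsed line, they can be handed to `highlight_region` without adjustment, which is what the tool description asks for.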
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "chromeflow",
-   "version": "0.1.8",
+   "version": "0.1.10",
    "description": "Browser guidance MCP server for Claude Code — highlights, clicks, fills, and captures from the web so you don't have to.",
    "type": "module",
    "bin": {