chromeflow 0.1.8 → 0.1.10
- package/CLAUDE.md +12 -9
- package/dist/index.js +1 -1
- package/dist/tools/browser.js +25 -0
- package/package.json +1 -1
package/CLAUDE.md
CHANGED
@@ -22,11 +22,13 @@ Do NOT ask "should I open the browser?" — just do it. The user expects seamles
 `scroll_page` then retry, or use `highlight_region` to show the user. Never use
 `osascript`, `applescript`, or any shell command to control the browser.
 
-2.
-
-
-
-
+2. **Use `get_elements` to get pixel coordinates, not `take_screenshot`.** `get_elements`
+returns exact DOM coordinates (always accurate). `take_screenshot` is a last resort only
+when you need to SEE the visual layout — not for finding positions. Correct order when
+`click_element` or `fill_input` fails: try `get_elements` first to get exact coords →
+then `highlight_region` using those coords. Only use `take_screenshot` if you genuinely
+need to see what the page looks like. Screenshots now include a red coordinate grid to
+help read positions — use the grid labels, not visual estimates.
 
 3. **`open_page` already waits for navigation.** Never call `wait_for_navigation`
 immediately after `open_page` — it will time out.
@@ -54,11 +56,12 @@ Do NOT ask "should I open the browser?" — just do it. The user expects seamles
 get_page_text() — read errors/status after actions
 wait_for_selector(".success") — wait for async changes (builds, modals)
 execute_script("document.title") — query DOM state programmatically
-c.
+c. When click_element or fill_input fails and you need pixel coords:
 click_element("Save") — try this first, ALWAYS
-[if fails]
-highlight_region(x,y,w,h,msg) —
+[if fails] get_elements() — get EXACT DOM coords, use these in highlight_region
+highlight_region(x,y,w,h,msg) — use exact coords from get_elements, not estimates
 [after wait_for_click] get_page_text() — confirm result, NOT take_screenshot
+[last resort only] take_screenshot() — only if you need to see the visual layout
 d. Pause for the user when needed:
 find_and_highlight(text, msg) — show the user what to do
 wait_for_click() — wait for user interaction
@@ -100,7 +103,7 @@ Use the absolute path for `envPath` — it's the Claude Code working directory +
 - After any action → `get_page_text()` to check for errors (not `take_screenshot`)
 - After `click_element("Save")` / form submission → use `get_page_text()` or `wait_for_selector` to confirm. Never use `wait_for_navigation` — most form saves don't navigate.
 - `click_element` not found → `scroll_page("down")` then retry
-- Still not found → `
+- Still not found → `get_elements()` to get exact coords, then `highlight_region(x,y,w,h,msg)` using those coords. Only use `take_screenshot()` if you need to visually inspect the page.
 - `fill_input` not found → `click_element(hint)` to focus the field, then retry `fill_input`. If still failing, use `find_and_highlight(hint, "Click here — I'll fill it in")` (NO `valueToType`) then `wait_for_click()` then retry `fill_input` — after the user focuses the field by clicking, the active-element fallback fills it automatically. `find_and_highlight` uses DOM positioning (pixel-perfect) — only fall back to `take_screenshot` + `highlight_region` if `find_and_highlight` returns false. After `fill_input` succeeds, immediately call `clear_overlays()` to remove the highlight. Only use `valueToType` when the user genuinely must type the value themselves (e.g. password, personal data).
 - Waiting for async result (build, save, deploy) → `wait_for_selector(selector, timeout)`
 - Never use Bash to work around a stuck browser interaction
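The fallback order that the CLAUDE.md changes above describe (click, scroll and retry, `get_elements`, then `highlight_region`) can be sketched roughly as below. This is an illustration only: the `tools` wrapper, the `{ ok }` result shape, the array returned by `get_elements`, and the function name `clickWithFallback` are all assumptions made for the example and are not taken from chromeflow itself.

```javascript
// Hypothetical sketch: `tools` is an assumed wrapper whose async methods are
// named after the chromeflow tools. All return shapes here are invented.
async function clickWithFallback(tools, label) {
  // 1. Always try the text-based click first.
  if ((await tools.click_element(label)).ok) return "clicked";

  // 2. Not found: scroll down and retry once.
  await tools.scroll_page("down");
  if ((await tools.click_element(label)).ok) return "clicked";

  // 3. Still not found: fetch exact DOM coordinates (no screenshot).
  const elements = await tools.get_elements();
  const target = elements.find((e) => e.label === label);
  if (!target) return "not_found";

  // 4. Highlight with the coordinates exactly as returned, then wait for the user.
  const { x, y, width, height } = target;
  await tools.highlight_region(x, y, width, height, `Click "${label}"`);
  await tools.wait_for_click();
  return "highlighted";
}
```

Note that `take_screenshot` does not appear anywhere in the sketch, which matches the "last resort only" wording in the updated instructions.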
package/dist/index.js
CHANGED
package/dist/tools/browser.js
CHANGED
@@ -46,6 +46,31 @@ function registerBrowserTools(server, bridge) {
 };
 }
 );
+server.tool(
+"get_elements",
+`Get the exact pixel positions of all visible interactive elements on the page (inputs, buttons, links, selects).
+Use this INSTEAD OF take_screenshot when you need coordinates for highlight_region \u2014 the coordinates are exact DOM values, not estimates.
+Returns a numbered list with element type, label, and precise x/y/width/height in CSS pixels.
+After calling this, use those exact coordinates in highlight_region \u2014 do NOT adjust them.`,
+{},
+async () => {
+const response = await bridge.request({ type: "get_elements" });
+if (response.type !== "elements_response") throw new Error("Unexpected response");
+const els = response.elements;
+if (els.length === 0) {
+return { content: [{ type: "text", text: "No visible interactive elements found on page." }] };
+}
+const lines = els.map(
+(e) => `${e.index}. ${e.type} "${e.label}" \u2014 x:${e.x} y:${e.y} w:${e.width} h:${e.height}`
+);
+return {
+content: [{ type: "text", text: `Visible interactive elements:
+${lines.join("\n")}
+
+Use these exact x/y values in highlight_region.` }]
+};
+}
+);
 server.tool(
 "execute_script",
 `Execute JavaScript in the current page's context and return the result as a string.
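To make the text format returned by the new `get_elements` tool concrete, here is a small sketch that reproduces the per-element line template from the hunk above and parses such a line back into numbers. `formatElement` copies the template string from the diff. `parseElementLine` and its regular expression are hypothetical helpers written for this example; they assume the element type contains no whitespace and that coordinates are plain decimal numbers, neither of which the diff confirms.

```javascript
// Same template as the added handler uses for each element line.
function formatElement(e) {
  return `${e.index}. ${e.type} "${e.label}" \u2014 x:${e.x} y:${e.y} w:${e.width} h:${e.height}`;
}

// Hypothetical parser (not part of chromeflow). The greedy label group lets
// labels contain double quotes; \u2014 is the em dash used in the template.
const NUM = "(-?\\d+(?:\\.\\d+)?)";
const ELEMENT_LINE = new RegExp(
  `^(\\d+)\\. (\\S+) "(.*)" \\u2014 x:${NUM} y:${NUM} w:${NUM} h:${NUM}$`
);

function parseElementLine(line) {
  const m = ELEMENT_LINE.exec(line);
  if (!m) return null; // line does not follow the expected format
  const [, index, type, label, x, y, width, height] = m;
  return {
    index: Number(index),
    type,
    label,
    x: Number(x),
    y: Number(y),
    width: Number(width),
    height: Number(height),
  };
}
```

A caller could pass the parsed `x`, `y`, `width`, and `height` to `highlight_region` unchanged, in line with the tool description's instruction not to adjust the coordinates.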
package/package.json
CHANGED