npm - chromeflow - Versions diffs - 0.1.36 → 0.1.38 - Mend

chromeflow 0.1.36 → 0.1.38

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5) hide show

package/CLAUDE.md CHANGED Viewed

@@ -140,6 +140,9 @@ screenshot to check what happened.
 2. `get_elements()` to get exact coords → `highlight_region(x,y,w,h,msg)`
 3. `take_screenshot()` only if you still can't identify the element from DOM queries
+**Multiple elements with the same label** (e.g. many "Remove" buttons):
+`click_element("Remove", nth=3)` — use `nth` (1-based) to target the specific one by order top-to-bottom. Check `get_form_fields` or `get_page_text` first to determine which index corresponds to the right section.
 **`fill_input` not found:**
 1. `click_element(hint)` to focus the field, then retry `fill_input`
 2. `find_and_highlight(hint, "Click here — I'll fill it in")` (no `valueToType`) then
@@ -173,6 +176,13 @@ for (var i = 0; i < allEls.length; i++) {
 controls[N].textContent.trim(); // should show selected value
 ```
+**Page text with large embedded content** (e.g. uploaded log files previewed inline): full-page `get_page_text()` pagination becomes unwieldy. Scope to a specific section instead:
+```
+get_page_text(selector=".section-3")   — scope to a CSS selector
+get_page_text(selector="#upload-form") — scope to an id
+```
+Use `execute_script("document.querySelectorAll('section').length")` to find structural selectors first.
 **Page content rendered as images** (e.g. qualification "Examples" tabs that show PNG screenshots
 instead of DOM text): `get_page_text()` returns nothing useful. Zoom out and screenshot instead:

package/README.md CHANGED Viewed

@@ -1,13 +1,13 @@
 # Chromeflow
-Browser guidance for Claude Code. When Claude needs you to set up Stripe, grab API keys, configure a third-party service, or do anything in a browser — Chromeflow takes over. It highlights what to click, fills in fields it knows, clicks buttons automatically, and writes captured values straight to your `.env`.
+Browser guidance for Claude Code. When Claude needs you to set up Stripe, grab API keys, configure a third-party service, or do anything in a browser — Chromeflow takes over. It highlights what to click, fills in fields it knows, clicks buttons automatically, uploads files, and writes captured values straight to your `.env`.
 ## How it works
 Chromeflow is two things that work together:
-- **MCP server** — gives Claude Code a set of browser tools (`open_page`, `click_element`, `fill_input`, `read_element`, `write_to_env`, etc.)
-- **Chrome extension** — receives those commands and acts on the active tab (highlights, clicks, fills, captures screenshots)
+- **MCP server** — gives Claude Code a set of browser tools (`open_page`, `click_element`, `fill_form`, `set_file_input`, `read_element`, `write_to_env`, etc.)
+- **Chrome extension** — receives those commands and acts on the active tab (highlights, clicks, fills, uploads files, captures screenshots)
 Claude drives the flow. You only touch the browser for things that genuinely need you — login, passwords, payment details, personal choices.
@@ -49,6 +49,40 @@ Just ask Claude normally:
 Claude will navigate, highlight steps, click what it can, pause for anything sensitive, and write values to your `.env` automatically.
+## What Claude can do
+| Capability | Tools |
+|------------|-------|
+| Navigate pages, open new tabs | `open_page`, `list_tabs`, `switch_to_tab` |
+| Click buttons and links | `click_element` |
+| Fill single fields | `fill_input` |
+| Fill multiple fields in one call | `fill_form` |
+| Upload files (even hidden inputs) | `set_file_input` |
+| Read page content as text | `get_page_text` |
+| Inspect all form fields | `get_form_fields` |
+| Scroll to a known element | `scroll_to_element` |
+| Highlight elements for the user | `highlight_region`, `find_and_highlight` |
+| Wait for the user to click | `wait_for_click` |
+| Wait for async changes | `wait_for_selector` |
+| Run arbitrary JS | `execute_script` |
+| Capture credentials to `.env` | `read_element`, `write_to_env` |
+| Screenshot (element location only) | `take_screenshot` |
+| Screenshot + save + copy to clipboard | `take_and_copy_screenshot` |
+| Save/restore form state across tabs | `save_page_state`, `restore_page_state` |
+| Show a step-by-step guide panel | `show_guide_panel`, `mark_step_done` |
+### File uploads
+`set_file_input` uses Chrome DevTools Protocol to bypass the browser's file-input script restriction — the same mechanism used by Playwright and Puppeteer. It works even when the `<input type=file>` is hidden behind a custom drag-and-drop zone.
+```
+set_file_input("Upload", "/Users/you/Downloads/task.zip")
+```
+### Dedicated Claude window
+Click the Chromeflow extension icon and use **"Use this window for Claude"** to lock Claude's browser operations to a specific Chrome window. This lets you freely use other Chrome windows without Claude interfering.
 ## Adding to another project
 Run setup from the new project's directory — the MCP server is already registered globally, this just drops `CLAUDE.md` and tool permissions into the project:

package/dist/tools/browser.js CHANGED Viewed

@@ -122,7 +122,8 @@ save_to controls where the PNG is saved: "downloads" (default) saves to ~/Downlo
     `Get the exact pixel positions of all visible interactive elements on the page (inputs, buttons, links, selects).
 Use this INSTEAD OF take_screenshot when you need coordinates for highlight_region \u2014 the coordinates are exact DOM values, not estimates.
 Returns a numbered list with element type, label, and precise x/y/width/height in CSS pixels.
-After calling this, use those exact coordinates in highlight_region \u2014 do NOT adjust them.`,
+IMPORTANT: x/y are VIEWPORT-relative (0,0 = top-left of the visible area). Use these exact values directly in highlight_region \u2014 do not add window.scrollY.
+Use get_form_fields instead if you need document y positions or fields below the fold.`,
     {},
     async () => {
       const response = await bridge.request({ type: "get_elements" });
@@ -175,7 +176,7 @@ ${lines.join("\n")}${r.warning ?? ""}` }]
 Uses Chrome DevTools Protocol to set the file \u2014 the only way to bypass the browser's file-input script restriction.
 hint: label text or name of the file input (or empty string to target the first file input on the page).
 file_path: absolute path to the file on the local filesystem (e.g. /Users/you/Downloads/task.zip).
-After calling this, call get_page_text to confirm the file was accepted.`,
+After calling this, verify the upload was accepted: use execute_script to check that the input's files.length > 0, or use get_page_text to look for a success indicator (e.g. a Remove button appearing). If not accepted, call set_file_input again \u2014 occasional React timing issues may require a retry.`,
     {
       hint: z.string().describe("Label text, name, or surrounding text of the file input. Use empty string to target the first file input on the page."),
       file_path: z.string().describe("Absolute path to the file to upload (e.g. /Users/you/Downloads/task.zip)")

package/dist/tools/flow.js CHANGED Viewed

@@ -17,14 +17,16 @@ function registerFlowTools(server, bridge) {
     `Click a button, link, or interactive element on the page by its visible text or aria-label.
 Use this whenever Claude can press a button without needing user input \u2014 e.g. "Save", "Continue", "Create product", "Add pricing", "Confirm", "Next".
 After clicking, use get_page_text to check the result \u2014 only use take_screenshot if you need pixel positions.
-Do NOT use for: elements that require the user to make a personal choice, consent to terms, or enter sensitive data.`,
+Do NOT use for: elements that require the user to make a personal choice, consent to terms, or enter sensitive data.
+When multiple elements share the same label (e.g. many "Remove" buttons), use nth to target a specific one (1 = first/topmost, 2 = second, etc.).`,
     {
       textHint: z.string().describe(
         "The visible label of the button or link (e.g. 'Save product', 'Continue', 'Add a product', 'Create')"
-      )
+      ),
+      nth: z.number().int().min(1).optional().describe("Which match to click when multiple elements share the same label (1 = first/topmost, default 1)")
     },
-    async ({ textHint }) => {
-      const response = await bridge.request({ type: "click_element", textHint });
+    async ({ textHint, nth }) => {
+      const response = await bridge.request({ type: "click_element", textHint, nth });
       const r = response;
       if (!r.success) {
         return {

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "chromeflow",
-  "version": "0.1.36",
+  "version": "0.1.38",
   "description": "Browser guidance MCP server for Claude Code — highlights, clicks, fills, and captures from the web so you don't have to.",
   "type": "module",
   "bin": {