npm - chromeflow - Versions diffs - 0.1.40 → 0.1.42 - Mend

chromeflow 0.1.40 → 0.1.42

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7) hide show

package/CLAUDE.md CHANGED Viewed

@@ -24,7 +24,7 @@ Do NOT ask "should I open the browser?" — just do it. The user expects seamles
 2. **Never use `take_screenshot` to read page content.** After `scroll_page`, after
    `click_element`, after navigation — always call `get_page_text`, not `take_screenshot`.
-   `get_page_text` returns up to 20,000 characters; if truncated it tells you the next
+   `get_page_text` returns up to 10,000 characters; if truncated it tells you the next
    `startIndex` to paginate. Screenshots are only for locating an element's pixel position
    when DOM queries have already failed. Never take more than 1–2 screenshots in a row.
@@ -111,6 +111,18 @@ use `take_and_copy_screenshot()` — it saves a PNG to ~/Downloads and copies it
 - `fill_input` and `fill_form` work on React-controlled inputs, contenteditable (Stripe,
   Notion), and **CodeMirror 6 editors** — auto-detected. After filling, the value is read
   back and a warning is shown if React did not accept it.
+- **Monaco editors** (VS Code-style code editors on DataAnnotation, etc.) appear in
+  `get_form_fields()` as type "monaco". They cannot be filled via `fill_input` — use
+  `execute_script` with the Monaco API instead:
+  ```js
+  // Read content from the first Monaco model
+  monaco.editor.getModels()[0].getValue()
+  // Write content to the first Monaco model
+  monaco.editor.getModels()[0].setValue('new content here')
+  ```
+- `set_file_input` accepts CSS selectors as the hint (e.g. `#import-problem-file`,
+  `.upload-input`) in addition to label text. Use selectors when file inputs are hidden
+  behind custom UIs and have no visible label.
 - After any radio/checkbox click that reveals new fields, call `get_form_fields()` again —
   the inventory will include the new fields and warn if more hidden ones still exist.
 - If a form has collapsible sections, expand them all before calling `get_form_fields()` so
@@ -153,6 +165,14 @@ screenshot to check what happened.
 **Waiting for async results** (build, save, deploy): `wait_for_selector(selector, timeout)` — never poll with screenshots.
+**Pre-filling `prompt()` and `confirm()` dialogs**: When a page action will trigger a JS
+dialog (e.g. "Save As" calling `prompt()`), call `set_dialog_response` BEFORE the action:
+```
+set_dialog_response(type="prompt", value="my-filename")   — next prompt() returns "my-filename"
+set_dialog_response(type="confirm", value="true")          — next confirm() returns true
+```
+Then trigger the action (e.g. `click_element("Save As")`). The response is consumed once.
 **React Select / custom styled dropdowns** (e.g. "Select..." components on DataAnnotation):
 `click_element` and `fill_input` do NOT work on these — they intercept native events. Use
 `execute_script` with the hidden combobox input approach (most reliable):
@@ -208,4 +228,9 @@ document.body.style.zoom = '0.4';
 document.body.style.zoom = '1';
 ```
+**Downloads via `execute_script`**: Creating a Blob URL and clicking an anchor via
+`execute_script` sometimes fails due to CSP or timing. If a download doesn't trigger:
+1. Retry the exact same `execute_script` call
+2. If still failing, use `find_and_highlight` to show the user a download button to click manually
 **Never use Bash to work around a stuck browser interaction.**

package/README.md CHANGED Viewed

@@ -1,6 +1,25 @@
-# Chromeflow
+<p align="center">
+  <img src="assets/icon.png" width="120" alt="Chromeflow" />
+</p>
-Browser guidance for Claude Code. When Claude needs you to set up Stripe, grab API keys, configure a third-party service, or do anything in a browser — Chromeflow takes over. It highlights what to click, fills in fields it knows, clicks buttons automatically, uploads files, and writes captured values straight to your `.env`.
+<h1 align="center">Chromeflow</h1>
+When Claude needs you to set up Stripe, grab API keys, configure a third-party service, or do anything in a browser — Chromeflow takes over. It highlights what to click, fills in fields it knows, clicks buttons automatically, uploads files, and writes captured values straight to your `.env`.
+## Why Chromeflow?
+Existing browser automation tools (Playwright, Browser Use, Puppeteer) launch a **fresh, empty browser** — no cookies, no sessions, no extensions. Every time they start, you're logged out of everything and can't handle 2FA.
+Chromeflow works in **your actual Chrome browser**, where you're already logged into Stripe, AWS, Supabase, and everything else. Claude automates what it can (clicking buttons, filling forms, uploading files) and pauses for anything that needs you (passwords, 2FA, payment details).
+| | Chromeflow | Playwright / Browser Use |
+|---|---|---|
+| **Browser** | Your real Chrome (sessions intact) | Fresh instance, logged out of everything |
+| **Auth / 2FA** | Already handled — pauses when needed | Can't handle — blocks completely |
+| **Page understanding** | DOM queries (fast, cheap, reliable) | Screenshots + vision model (slow, expensive) |
+| **Human-in-the-loop** | Built-in guide panel, highlights, pauses | Fully autonomous, no interaction |
+| **Integration** | MCP server for Claude Code | Standalone, not Claude Code aware |
+| **Credential capture** | Reads API keys → writes to `.env` | Not designed for this |
 ## How it works
@@ -54,20 +73,22 @@ Claude will navigate, highlight steps, click what it can, pause for anything sen
 | Capability | Tools |
 |------------|-------|
 | Navigate pages, open new tabs | `open_page`, `list_tabs`, `switch_to_tab` |
-| Click buttons and links | `click_element` |
-| Fill single fields | `fill_input` |
+| Click buttons and links | `click_element` (with `nth` for duplicates) |
+| Fill single fields | `fill_input` (with `nth` for duplicates) |
 | Fill multiple fields in one call | `fill_form` |
 | Upload files (even hidden inputs) | `set_file_input` |
-| Read page content as text | `get_page_text` |
+| Read page content as text | `get_page_text` (with `selector` scoping) |
 | Inspect all form fields | `get_form_fields` |
 | Scroll to a known element | `scroll_to_element` |
 | Highlight elements for the user | `highlight_region`, `find_and_highlight` |
 | Wait for the user to click | `wait_for_click` |
 | Wait for async changes | `wait_for_selector` |
 | Run arbitrary JS | `execute_script` |
+| Read browser console output | `get_console_logs` |
 | Capture credentials to `.env` | `read_element`, `write_to_env` |
 | Screenshot (element location only) | `take_screenshot` |
 | Screenshot + save + copy to clipboard | `take_and_copy_screenshot` |
+| Screenshot the terminal window | `capture_terminal` |
 | Save/restore form state across tabs | `save_page_state`, `restore_page_state` |
 | Show a step-by-step guide panel | `show_guide_panel`, `mark_step_done` |
@@ -79,6 +100,10 @@ Claude will navigate, highlight steps, click what it can, pause for anything sen
 set_file_input("Upload", "/Users/you/Downloads/task.zip")
 ```
+### Terminal screenshots
+`capture_terminal` screenshots the terminal window (Terminal, iTerm2, Warp, VS Code, Ghostty, etc.) and saves it as a PNG. Use this with `set_file_input` to upload terminal output to a web form.
 ### Dedicated Claude window
 Click the Chromeflow extension icon and use **"Use this window for Claude"** to lock Claude's browser operations to a specific Chrome window. This lets you freely use other Chrome windows without Claude interfering.

package/dist/index.js CHANGED Viewed

@@ -38,6 +38,54 @@ async function main() {
   registerHighlightTools(server, bridge);
   registerCaptureTools(server, bridge);
   registerFlowTools(server, bridge);
+  server.prompt(
+    "chromeflow-status",
+    "Check if the chromeflow Chrome extension is connected and which tab is active",
+    async () => {
+      const connected = bridge.isConnected();
+      if (!connected) {
+        return {
+          messages: [{
+            role: "user",
+            content: {
+              type: "text",
+              text: "Check chromeflow status. The Chrome extension is NOT connected. Tell the user to reload the chromeflow extension in chrome://extensions."
+            }
+          }]
+        };
+      }
+      try {
+        const response = await bridge.request({ type: "list_tabs" }, 3e3);
+        const tabs = response.tabs;
+        const active = tabs.find((t) => t.active);
+        const tabList = tabs.map((t) => `${t.index}. ${t.active ? "[active] " : ""}${t.title} \u2014 ${t.url}`).join("\n");
+        return {
+          messages: [{
+            role: "user",
+            content: {
+              type: "text",
+              text: `Check chromeflow status.
+Extension: Connected
+Active tab: ${active?.title ?? "none"} \u2014 ${active?.url ?? ""}
+All tabs:
+${tabList}`
+            }
+          }]
+        };
+      } catch {
+        return {
+          messages: [{
+            role: "user",
+            content: {
+              type: "text",
+              text: "Check chromeflow status. Extension is connected but not responding. The user may need to reload it."
+            }
+          }]
+        };
+      }
+    }
+  );
   const transport = new StdioServerTransport();
   await server.connect(transport);
   console.error("[chromeflow] MCP server running. Waiting for Claude...");

package/dist/setup.js CHANGED Viewed

@@ -173,7 +173,9 @@ const CHROMEFLOW_TOOLS = [
   // v0.1.39+
   "get_console_logs",
   // v0.1.40+
-  "capture_terminal"
+  "capture_terminal",
+  // v0.1.42+
+  "set_dialog_response"
 ].map((t) => `mcp__chromeflow__${t}`);
 function patchSettingsLocalJson(cwd) {
   const claudeDir = join(cwd, ".claude");

package/dist/tools/browser.js CHANGED Viewed

@@ -169,6 +169,25 @@ The saved file path can be passed directly to set_file_input(hint, file_path) to
       };
     }
   );
+  server.tool(
+    "set_dialog_response",
+    `Pre-set the return value for the next window.prompt() or window.confirm() dialog.
+Call this BEFORE triggering an action that will show a dialog (e.g. a "Save As" button that calls prompt()).
+The response is consumed once \u2014 after the dialog fires, it resets to default behavior.
+For prompt: the value string is returned to the page. For confirm: true/false is returned.`,
+    {
+      type: z.enum(["prompt", "confirm"]).describe('Which dialog type to pre-fill: "prompt" or "confirm"'),
+      value: z.string().describe('For prompt: the string to return. For confirm: "true" or "false"')
+    },
+    async ({ type, value }) => {
+      const jsValue = type === "confirm" ? value === "true" : value;
+      const code = `window._chromeflowDialogResponse = window._chromeflowDialogResponse || {}; window._chromeflowDialogResponse.${type} = ${JSON.stringify(jsValue)}; "set"`;
+      await bridge.request({ type: "execute_script", code });
+      return {
+        content: [{ type: "text", text: `Next ${type}() will return ${JSON.stringify(jsValue)}. Now trigger the action that shows the dialog.` }]
+      };
+    }
+  );
   server.tool(
     "clear_overlays",
     "Remove all highlights and callout annotations from the current page. Does NOT remove the guide panel \u2014 the guide panel persists until the next flow starts.",

package/dist/tools/capture.js CHANGED Viewed

@@ -61,7 +61,8 @@ After filling, call wait_for_click only if the user needs to review/confirm; oth
     "get_page_text",
     `Get the visible text content of the current page without taking a screenshot.
 Use this instead of take_screenshot whenever you need to read what's on the page \u2014 errors, build status, form labels, confirmation messages, etc.
-Returns up to 20,000 characters at a time. If the response ends with "... (N more characters)", call again with startIndex to read the next chunk.
+Returns up to 10,000 characters per call (~3k tokens). If the response ends with "... (N more characters)", call again with startIndex to read the next chunk.
+Use the selector parameter to scope extraction to a specific section and avoid pulling unnecessary content.
 Never use take_screenshot just to read page content \u2014 paginate with startIndex instead.`,
     {
       selector: z.string().optional().describe(

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "chromeflow",
-  "version": "0.1.40",
+  "version": "0.1.42",
   "description": "Browser guidance MCP server for Claude Code — highlights, clicks, fills, and captures from the web so you don't have to.",
   "type": "module",
   "bin": {