chromeflow 0.1.40 → 0.1.42
- package/CLAUDE.md +26 -1
- package/README.md +30 -5
- package/dist/index.js +48 -0
- package/dist/setup.js +3 -1
- package/dist/tools/browser.js +19 -0
- package/dist/tools/capture.js +2 -1
- package/package.json +1 -1
package/CLAUDE.md
CHANGED
@@ -24,7 +24,7 @@ Do NOT ask "should I open the browser?" — just do it. The user expects seamles
 
 2. **Never use `take_screenshot` to read page content.** After `scroll_page`, after
 `click_element`, after navigation — always call `get_page_text`, not `take_screenshot`.
-`get_page_text` returns up to
+`get_page_text` returns up to 10,000 characters; if truncated it tells you the next
 `startIndex` to paginate. Screenshots are only for locating an element's pixel position
 when DOM queries have already failed. Never take more than 1–2 screenshots in a row.
 
@@ -111,6 +111,18 @@ use `take_and_copy_screenshot()` — it saves a PNG to ~/Downloads and copies it
 - `fill_input` and `fill_form` work on React-controlled inputs, contenteditable (Stripe,
 Notion), and **CodeMirror 6 editors** — auto-detected. After filling, the value is read
 back and a warning is shown if React did not accept it.
+- **Monaco editors** (VS Code-style code editors on DataAnnotation, etc.) appear in
+`get_form_fields()` as type "monaco". They cannot be filled via `fill_input` — use
+`execute_script` with the Monaco API instead:
+```js
+// Read content from the first Monaco model
+monaco.editor.getModels()[0].getValue()
+// Write content to the first Monaco model
+monaco.editor.getModels()[0].setValue('new content here')
+```
+- `set_file_input` accepts CSS selectors as the hint (e.g. `#import-problem-file`,
+`.upload-input`) in addition to label text. Use selectors when file inputs are hidden
+behind custom UIs and have no visible label.
 - After any radio/checkbox click that reveals new fields, call `get_form_fields()` again —
 the inventory will include the new fields and warn if more hidden ones still exist.
 - If a form has collapsible sections, expand them all before calling `get_form_fields()` so
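Note on the Monaco guidance in this hunk: the documented snippet assumes a global `monaco` and at least one model, and it throws if either is missing. A more defensive payload for `execute_script` might look like the sketch below. The helper names `firstModel`, `readMonaco`, and `writeMonaco` are hypothetical and not part of chromeflow. The `monacoApi` parameter exists only so the helpers can be exercised outside a browser. The Monaco calls themselves (`editor.getModels()`, `getValue()`, `setValue()`) are the ones already used in the diff.

```javascript
// Hypothetical helpers, not part of chromeflow.
// In a real page, pass the global `monaco` object as `monacoApi`.
function firstModel(monacoApi) {
  // Fall back to an empty list when Monaco is absent or not yet loaded.
  const models = monacoApi?.editor?.getModels?.() ?? [];
  return models.length > 0 ? models[0] : null;
}

// Returns the first model's text, or null when no model exists.
function readMonaco(monacoApi) {
  const model = firstModel(monacoApi);
  return model ? model.getValue() : null;
}

// Returns true when the text was written, false when no model exists.
function writeMonaco(monacoApi, text) {
  const model = firstModel(monacoApi);
  if (!model) return false;
  model.setValue(text);
  return true;
}
```

Returning `null` or `false` instead of throwing gives the caller a signal it can act on, for example by waiting for the editor to finish loading before retrying.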
@@ -153,6 +165,14 @@ screenshot to check what happened.
 
 **Waiting for async results** (build, save, deploy): `wait_for_selector(selector, timeout)` — never poll with screenshots.
 
+**Pre-filling `prompt()` and `confirm()` dialogs**: When a page action will trigger a JS
+dialog (e.g. "Save As" calling `prompt()`), call `set_dialog_response` BEFORE the action:
+```
+set_dialog_response(type="prompt", value="my-filename") — next prompt() returns "my-filename"
+set_dialog_response(type="confirm", value="true") — next confirm() returns true
+```
+Then trigger the action (e.g. `click_element("Save As")`). The response is consumed once.
+
 **React Select / custom styled dropdowns** (e.g. "Select..." components on DataAnnotation):
 `click_element` and `fill_input` do NOT work on these — they intercept native events. Use
 `execute_script` with the hidden combobox input approach (most reliable):
@@ -208,4 +228,9 @@ document.body.style.zoom = '0.4';
 document.body.style.zoom = '1';
 ```
 
+**Downloads via `execute_script`**: Creating a Blob URL and clicking an anchor via
+`execute_script` sometimes fails due to CSP or timing. If a download doesn't trigger:
+1. Retry the exact same `execute_script` call
+2. If still failing, use `find_and_highlight` to show the user a download button to click manually
+
 **Never use Bash to work around a stuck browser interaction.**
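Note on the download guidance in the last hunk: the two numbered steps amount to "retry once, then hand over to the user". A minimal sketch of that control flow follows. It assumes the caller supplies `runScript` (standing in for the `execute_script` call, resolving to a truthy value when the download triggered) and `askUser` (standing in for `find_and_highlight`). Both parameters and the name `downloadWithFallback` are hypothetical, and the diff does not say how a triggered download is detected.

```javascript
// Hypothetical orchestration, not part of chromeflow.
// `runScript` and `askUser` are injected stand-ins for the real tool calls.
async function downloadWithFallback(runScript, askUser, maxAttempts = 2) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Same call every time: the guidance says to retry the exact script.
      if (await runScript()) return { ok: true, attempt };
    } catch {
      // Treated like a CSP or timing failure: fall through and retry.
    }
  }
  // All attempts failed, so point the user at the button instead.
  await askUser();
  return { ok: false, attempt: maxAttempts };
}
```

With the default `maxAttempts = 2` the script runs at most twice, which matches one initial attempt plus the single retry described in step 1.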
package/README.md
CHANGED
@@ -1,6 +1,25 @@
-
+<p align="center">
+<img src="assets/icon.png" width="120" alt="Chromeflow" />
+</p>
 
-
+<h1 align="center">Chromeflow</h1>
+
+When Claude needs you to set up Stripe, grab API keys, configure a third-party service, or do anything in a browser — Chromeflow takes over. It highlights what to click, fills in fields it knows, clicks buttons automatically, uploads files, and writes captured values straight to your `.env`.
+
+## Why Chromeflow?
+
+Existing browser automation tools (Playwright, Browser Use, Puppeteer) launch a **fresh, empty browser** — no cookies, no sessions, no extensions. Every time they start, you're logged out of everything and can't handle 2FA.
+
+Chromeflow works in **your actual Chrome browser**, where you're already logged into Stripe, AWS, Supabase, and everything else. Claude automates what it can (clicking buttons, filling forms, uploading files) and pauses for anything that needs you (passwords, 2FA, payment details).
+
+| | Chromeflow | Playwright / Browser Use |
+|---|---|---|
+| **Browser** | Your real Chrome (sessions intact) | Fresh instance, logged out of everything |
+| **Auth / 2FA** | Already handled — pauses when needed | Can't handle — blocks completely |
+| **Page understanding** | DOM queries (fast, cheap, reliable) | Screenshots + vision model (slow, expensive) |
+| **Human-in-the-loop** | Built-in guide panel, highlights, pauses | Fully autonomous, no interaction |
+| **Integration** | MCP server for Claude Code | Standalone, not Claude Code aware |
+| **Credential capture** | Reads API keys → writes to `.env` | Not designed for this |
 
 ## How it works
 
@@ -54,20 +73,22 @@ Claude will navigate, highlight steps, click what it can, pause for anything sen
 | Capability | Tools |
 |------------|-------|
 | Navigate pages, open new tabs | `open_page`, `list_tabs`, `switch_to_tab` |
-| Click buttons and links | `click_element` |
-| Fill single fields | `fill_input` |
+| Click buttons and links | `click_element` (with `nth` for duplicates) |
+| Fill single fields | `fill_input` (with `nth` for duplicates) |
 | Fill multiple fields in one call | `fill_form` |
 | Upload files (even hidden inputs) | `set_file_input` |
-| Read page content as text | `get_page_text` |
+| Read page content as text | `get_page_text` (with `selector` scoping) |
 | Inspect all form fields | `get_form_fields` |
 | Scroll to a known element | `scroll_to_element` |
 | Highlight elements for the user | `highlight_region`, `find_and_highlight` |
 | Wait for the user to click | `wait_for_click` |
 | Wait for async changes | `wait_for_selector` |
 | Run arbitrary JS | `execute_script` |
+| Read browser console output | `get_console_logs` |
 | Capture credentials to `.env` | `read_element`, `write_to_env` |
 | Screenshot (element location only) | `take_screenshot` |
 | Screenshot + save + copy to clipboard | `take_and_copy_screenshot` |
+| Screenshot the terminal window | `capture_terminal` |
 | Save/restore form state across tabs | `save_page_state`, `restore_page_state` |
 | Show a step-by-step guide panel | `show_guide_panel`, `mark_step_done` |
 
@@ -79,6 +100,10 @@ Claude will navigate, highlight steps, click what it can, pause for anything sen
 set_file_input("Upload", "/Users/you/Downloads/task.zip")
 ```
 
+### Terminal screenshots
+
+`capture_terminal` screenshots the terminal window (Terminal, iTerm2, Warp, VS Code, Ghostty, etc.) and saves it as a PNG. Use this with `set_file_input` to upload terminal output to a web form.
+
 ### Dedicated Claude window
 
 Click the Chromeflow extension icon and use **"Use this window for Claude"** to lock Claude's browser operations to a specific Chrome window. This lets you freely use other Chrome windows without Claude interfering.
package/dist/index.js
CHANGED
@@ -38,6 +38,54 @@ async function main() {
   registerHighlightTools(server, bridge);
   registerCaptureTools(server, bridge);
   registerFlowTools(server, bridge);
+  server.prompt(
+    "chromeflow-status",
+    "Check if the chromeflow Chrome extension is connected and which tab is active",
+    async () => {
+      const connected = bridge.isConnected();
+      if (!connected) {
+        return {
+          messages: [{
+            role: "user",
+            content: {
+              type: "text",
+              text: "Check chromeflow status. The Chrome extension is NOT connected. Tell the user to reload the chromeflow extension in chrome://extensions."
+            }
+          }]
+        };
+      }
+      try {
+        const response = await bridge.request({ type: "list_tabs" }, 3e3);
+        const tabs = response.tabs;
+        const active = tabs.find((t) => t.active);
+        const tabList = tabs.map((t) => `${t.index}. ${t.active ? "[active] " : ""}${t.title} \u2014 ${t.url}`).join("\n");
+        return {
+          messages: [{
+            role: "user",
+            content: {
+              type: "text",
+              text: `Check chromeflow status.
+
+Extension: Connected
+Active tab: ${active?.title ?? "none"} \u2014 ${active?.url ?? ""}
+All tabs:
+${tabList}`
+            }
+          }]
+        };
+      } catch {
+        return {
+          messages: [{
+            role: "user",
+            content: {
+              type: "text",
+              text: "Check chromeflow status. Extension is connected but not responding. The user may need to reload it."
+            }
+          }]
+        };
+      }
+    }
+  );
   const transport = new StdioServerTransport();
   await server.connect(transport);
   console.error("[chromeflow] MCP server running. Waiting for Claude...");
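Note on the `chromeflow-status` prompt: the status text is built inline inside the handler. The formatting can be read in isolation as the sketch below, which lifts the two template expressions from the diff into standalone functions. The names `formatTabList` and `describeActive` are hypothetical, and the tab shape (`index`, `active`, `title`, `url`) is inferred from how the diff uses each tab object rather than from a documented type.

```javascript
// Hypothetical extraction of the formatting logic shown in the diff.
// Tab shape { index, active, title, url } is inferred, not documented.
function formatTabList(tabs) {
  return tabs
    .map((t) => `${t.index}. ${t.active ? "[active] " : ""}${t.title} \u2014 ${t.url}`)
    .join("\n");
}

// Mirrors the "Active tab:" line, including the fallbacks for no active tab.
function describeActive(tabs) {
  const active = tabs.find((t) => t.active);
  return `${active?.title ?? "none"} \u2014 ${active?.url ?? ""}`;
}
```

One behavior worth noting in the handler itself: if the `list_tabs` response has no `tabs` array, `tabs.find` throws inside the `try`, so the user sees the "connected but not responding" message rather than an unhandled error.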
package/dist/setup.js
CHANGED
@@ -173,7 +173,9 @@ const CHROMEFLOW_TOOLS = [
   // v0.1.39+
   "get_console_logs",
   // v0.1.40+
-  "capture_terminal"
+  "capture_terminal",
+  // v0.1.42+
+  "set_dialog_response"
 ].map((t) => `mcp__chromeflow__${t}`);
 function patchSettingsLocalJson(cwd) {
   const claudeDir = join(cwd, ".claude");
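Note on the tool list: each entry is prefixed with `mcp__chromeflow__` before `patchSettingsLocalJson` uses it, but the body of that function is outside this hunk. The sketch below shows the prefixing from the diff plus one plausible way a setup step could add new names to an existing list without duplicating entries on re-run. `mergeAllowList` and `TOOL_NAMES` are hypothetical, and the merge behavior is an assumption, not the package's confirmed logic.

```javascript
// The three most recent tool names from the diff, prefixed the same way.
const TOOL_NAMES = ["get_console_logs", "capture_terminal", "set_dialog_response"];
const PREFIXED = TOOL_NAMES.map((t) => `mcp__chromeflow__${t}`);

// Hypothetical merge step (assumption): append only names that are missing,
// preserving the order of whatever the user already had.
function mergeAllowList(existing, additions) {
  const seen = new Set(existing);
  const merged = [...existing];
  for (const name of additions) {
    if (!seen.has(name)) {
      seen.add(name);
      merged.push(name);
    }
  }
  return merged;
}
```

An idempotent merge like this would let the setup command run again after an upgrade so that only `set_dialog_response` is appended for a 0.1.40 user.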
package/dist/tools/browser.js
CHANGED
@@ -169,6 +169,25 @@ The saved file path can be passed directly to set_file_input(hint, file_path) to
       };
     }
   );
+  server.tool(
+    "set_dialog_response",
+    `Pre-set the return value for the next window.prompt() or window.confirm() dialog.
+Call this BEFORE triggering an action that will show a dialog (e.g. a "Save As" button that calls prompt()).
+The response is consumed once \u2014 after the dialog fires, it resets to default behavior.
+For prompt: the value string is returned to the page. For confirm: true/false is returned.`,
+    {
+      type: z.enum(["prompt", "confirm"]).describe('Which dialog type to pre-fill: "prompt" or "confirm"'),
+      value: z.string().describe('For prompt: the string to return. For confirm: "true" or "false"')
+    },
+    async ({ type, value }) => {
+      const jsValue = type === "confirm" ? value === "true" : value;
+      const code = `window._chromeflowDialogResponse = window._chromeflowDialogResponse || {}; window._chromeflowDialogResponse.${type} = ${JSON.stringify(jsValue)}; "set"`;
+      await bridge.request({ type: "execute_script", code });
+      return {
+        content: [{ type: "text", text: `Next ${type}() will return ${JSON.stringify(jsValue)}. Now trigger the action that shows the dialog.` }]
+      };
+    }
+  );
   server.tool(
     "clear_overlays",
     "Remove all highlights and callout annotations from the current page. Does NOT remove the guide panel \u2014 the guide panel persists until the next flow starts.",
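Note on `set_dialog_response`: this hunk only shows the server side, which writes the preset into `window._chromeflowDialogResponse`. The code that reads it back when `prompt()` or `confirm()` fires is not part of the diff. The sketch below is one way a page-side hook could implement the "consumed once" behavior the tool description promises. `installDialogOverrides` and its `win` parameter are hypothetical, and the real extension may work differently.

```javascript
// Hypothetical page-side hook; the real chromeflow override is not in this diff.
// `win` is the window object (or a stand-in with prompt/confirm for testing).
function installDialogOverrides(win) {
  const nativePrompt = win.prompt;
  const nativeConfirm = win.confirm;

  // Read a preset value and delete it so it applies to one dialog only.
  const take = (type) => {
    const store = win._chromeflowDialogResponse;
    if (!store || !(type in store)) return undefined;
    const value = store[type];
    delete store[type];
    return value;
  };

  win.prompt = (...args) => {
    const preset = take("prompt");
    return preset !== undefined ? preset : nativePrompt.apply(win, args);
  };

  win.confirm = (...args) => {
    const preset = take("confirm");
    return preset !== undefined ? preset : nativeConfirm.apply(win, args);
  };
}
```

On the server side, interpolating `type` into the script is safe because `z.enum` limits it to `"prompt"` or `"confirm"`, and the value is serialized with `JSON.stringify` rather than spliced in raw.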
package/dist/tools/capture.js
CHANGED
@@ -61,7 +61,8 @@ After filling, call wait_for_click only if the user needs to review/confirm; oth
     "get_page_text",
     `Get the visible text content of the current page without taking a screenshot.
 Use this instead of take_screenshot whenever you need to read what's on the page \u2014 errors, build status, form labels, confirmation messages, etc.
-Returns up to
+Returns up to 10,000 characters per call (~3k tokens). If the response ends with "... (N more characters)", call again with startIndex to read the next chunk.
+Use the selector parameter to scope extraction to a specific section and avoid pulling unnecessary content.
 Never use take_screenshot just to read page content \u2014 paginate with startIndex instead.`,
     {
       selector: z.string().optional().describe(
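Note on `get_page_text` pagination: the new description says a truncated response ends with "... (N more characters)" and that the caller should call again with `startIndex`. A client-side loop for that could look like the sketch below. `getPageText` is a hypothetical stand-in for the tool call, and `splitChunk` and `readAllText` are hypothetical names. Two details are assumptions: the exact whitespace around the trailer, and computing the next `startIndex` by adding the characters received (the CLAUDE.md text says the tool reports the next `startIndex` itself, in which case that value should be preferred).

```javascript
// Hypothetical helpers; only the trailer wording, `selector`, and `startIndex`
// come from the diff. The trailer's surrounding whitespace is an assumption.
const TRAILER = /\n?\.\.\. \((\d+) more characters\)\s*$/;

// Separate the page text from the truncation trailer, if one is present.
function splitChunk(chunk) {
  const match = chunk.match(TRAILER);
  if (!match) return { body: chunk, remaining: 0 };
  return { body: chunk.slice(0, match.index), remaining: Number(match[1]) };
}

// Keep requesting chunks until no trailer is present or the cap is reached.
async function readAllText(getPageText, selector, maxChunks = 20) {
  let startIndex = 0;
  let text = "";
  for (let i = 0; i < maxChunks; i++) {
    const { body, remaining } = splitChunk(await getPageText({ selector, startIndex }));
    text += body;
    if (remaining === 0 || body.length === 0) break;
    startIndex += body.length; // assumption: offsets count characters received
  }
  return text;
}
```

The `maxChunks` cap keeps a very long page from consuming unbounded context. Passing a `selector` to narrow the region, as the description recommends, is usually the cheaper fix.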
package/package.json
CHANGED