browse-agent-cli 0.0.1

package/package.json ADDED
@@ -0,0 +1,40 @@
+ {
+   "name": "browse-agent-cli",
+   "version": "0.0.1",
+   "type": "module",
+   "description": "TypeScript CLI for browse-agent",
+   "main": "./dist/cli.js",
+   "types": "./dist/cli.d.ts",
+   "exports": {
+     ".": {
+       "types": "./dist/cli.d.ts",
+       "import": "./dist/cli.js"
+     },
+     "./script": {
+       "types": "./dist/script.d.ts",
+       "import": "./dist/script.js"
+     }
+   },
+   "bin": {
+     "browse-agent": "./dist/cli.js"
+   },
+   "files": [
+     "dist/**/*.js",
+     "dist/**/*.d.ts",
+     "skill/**"
+   ],
+   "scripts": {
+     "build": "tsdown",
+     "clean": "rm -rf dist",
+     "prepublishOnly": "npm run clean && npm run build"
+   },
+   "dependencies": {
+     "browse-agent-sdk": "^0.0.2"
+   },
+   "devDependencies": {
+     "@types/node": "^22.15.3",
+     "tsdown": "^0.13.3",
+     "typescript": "^5.4.0"
+   },
+   "license": "MIT"
+ }
package/skill/SKILL.md ADDED
@@ -0,0 +1,126 @@
+ ---
+ name: browse-agent
+ description: "Browse web pages and extract data using a real Chrome browser controlled via browse-agent-sdk. Use when: AI needs to visit a URL, read webpage content, scrape data, take page screenshots, query DOM elements, run JavaScript on live pages, get page text or HTML, or interact with web applications. Triggers: 'browse website', 'visit URL', 'scrape page', 'get web content', 'screenshot page', 'extract from website', 'read webpage', 'open URL', 'web data'."
+ argument-hint: "URL to visit, or describe what data to extract from the web"
+ ---
+
+ # Browse Agent — Web Browsing & Data Extraction
+
+ Control a real Chrome browser to visit web pages, extract content, take screenshots, query DOM elements, and run JavaScript — all via `browse-agent-sdk`.
+
+ ## Prerequisites
+
+ - Node.js 18+
+ - One of: Google Chrome, Chromium, Microsoft Edge, or Brave Browser
+
+ ## Setup (One-Time)
+
+ ```bash
+ browse-agent setup
+ ```
+
+ Auto-detects local vs global mode. Use `--global` to force global installation in `~/.browse-agent/`. The setup checks existing state — if already installed, it exits immediately.
+
+ ## Usage Procedure — Step-by-Step (Recommended)
+
+ Browse **interactively** — one command per step. Observe each result before deciding the next action, just like a human browsing.
+
+ ### Step 1: Launch Browser
+
+ ```bash
+ browse-agent launch 2>/dev/null
+ ```
+
+ Returns session info as JSON. The browser stays open until you close it.
+
+ ### Step 2: Navigate and Explore
+
+ Run each command separately. Read the output, then decide the next action:
+
+ ```bash
+ # Open a page — returns { tabId, url, title }
+ browse-agent navigate "https://example.com" 2>/dev/null
+
+ # Read the page content — returns { content, url, title }
+ browse-agent get-content --format text 2>/dev/null
+
+ # Specify a tab by ID (from navigate output) — use --tabId with any feature command
+ browse-agent get-content --format text --tabId 123 2>/dev/null
+
+ # Query specific elements — returns { result }
+ browse-agent get-dom "h1" --property innerText 2>/dev/null
+
+ # Run JavaScript — returns { result }
+ browse-agent evaluate "document.title" 2>/dev/null
+
+ # Take a screenshot — returns { data (base64), format, width, height }
+ browse-agent screenshot visible 2>/dev/null
+ ```
+
+ **Key principle**: Each command outputs JSON to stdout. Use `--tabId <id>` to target a specific tab (ID comes from `navigate` or `tabs list` output). Parse the output, reason about it, then choose the next command. Don't pre-plan the entire interaction.
+
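The key principle above (run one command, parse its JSON from stdout, then decide) could be wrapped in a small helper. The following is a minimal TypeScript sketch, not part of the package: `parseCommandOutput`, `runBrowseAgent`, `openPage`, and `NavigateResult` are hypothetical names, and the field types in `NavigateResult` are assumptions, since the source lists only the field names `{ tabId, url, title }`.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Output of `navigate` as listed in Step 2: { tabId, url, title }.
// The field types are assumptions; the source gives only the names.
interface NavigateResult {
  tabId: number;
  url: string;
  title: string;
}

// Parse the JSON a command printed to stdout.
// Throws a descriptive error when stdout is empty or is not valid JSON.
function parseCommandOutput<T>(stdout: string): T {
  const text = stdout.trim();
  if (text === "") {
    throw new Error("browse-agent printed nothing to stdout");
  }
  try {
    return JSON.parse(text) as T;
  } catch {
    throw new Error(`stdout was not valid JSON: ${text.slice(0, 80)}`);
  }
}

// Run one browse-agent command and return its parsed JSON output.
// stderr is not inspected, mirroring the `2>/dev/null` in the examples.
async function runBrowseAgent<T>(args: string[]): Promise<T> {
  const { stdout } = await execFileAsync("browse-agent", args);
  return parseCommandOutput<T>(stdout);
}

// Example wrapper: open a page and return the parsed navigate output.
async function openPage(url: string): Promise<NavigateResult> {
  return runBrowseAgent<NavigateResult>(["navigate", url]);
}
```

A caller would await `openPage("https://example.com")`, inspect the returned `title` and `tabId`, and only then choose the next command, which keeps each step observable as the skill recommends.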
+ ### Step 3: Close Browser
+
+ ```bash
+ browse-agent close 2>/dev/null
+ ```
+
+ ### Typical Interaction Flow
+
+ ```
+ launch → navigate URL → get-content (read page) → get-dom (extract specific data)
+ → navigate another URL → get-content → ...
+ → close
+ ```
+
+ Each step is independent. If the page content isn't what you expected, you can navigate elsewhere, try different selectors, or run JavaScript to interact with the page — all based on what you see.
+
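The simplest path through the flow above (launch, navigate, get-content, close) might be driven from TypeScript as sketched below. This is an illustration under assumptions, not the package's API: `Runner`, `PageContent`, and `readPage` are hypothetical names, the runner is injected so the sequence can be exercised without a browser, and putting `close` in a `finally` block is a design choice made here so the browser is shut down even if a step fails.

```typescript
// A runner executes one CLI command and resolves with its parsed JSON output.
// Injecting it lets the flow be exercised with a fake in tests.
type Runner = (args: string[]) => Promise<unknown>;

// Output of `get-content` as listed in Step 2: { content, url, title }.
// The string types are assumptions; the source gives only the field names.
interface PageContent {
  content: string;
  url: string;
  title: string;
}

// launch, then navigate, then get-content, then close.
// `close` runs in `finally`, so it is attempted even when navigate
// or get-content rejects.
async function readPage(run: Runner, url: string): Promise<PageContent> {
  await run(["launch"]);
  try {
    await run(["navigate", url]);
    return (await run(["get-content", "--format", "text"])) as PageContent;
  } finally {
    await run(["close"]);
  }
}
```

A fixed sequence like this only suits cases where the steps are known in advance; for exploratory browsing, the one-command-at-a-time approach described above still applies, with each result inspected before the next call.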
+ ### CLI Options
+
+ | Option | Applies to | Description |
+ |---|---|---|
+ | `--global` | setup, clear | Use global installation (`~/.browse-agent/`) |
+ | `--browser <name>` | launch | `chrome` \| `chromium` \| `edge` \| `brave` (default: chrome) |
+ | `--headless` | launch | Run without visible window |
+ | `--port <number>` | launch, connect, feature cmds | WebSocket port (default: 9315) |
+ | `--tabId <id>` | all feature cmds | Target a specific tab (ID from `navigate` or `tabs list`) |
+ | `--format <type>` | get-content, screenshot | `text` \| `html` (content) or `png` \| `jpeg` (screenshot) |
+ | `--property <prop>` | get-dom | `outerHTML` \| `innerHTML` \| `innerText` |
+ | `--all` | get-dom | Return all matches instead of first only |
+ | `--quality <num>` | screenshot | JPEG quality 1-100 |
+ | `--timeout <ms>` | connect, feature cmds | Connection timeout in ms |
+
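When commands are assembled programmatically, the table above can be mirrored in a typed options object so that invalid values are caught before the CLI is invoked. The sketch below covers only `screenshot visible`; `ScreenshotOptions` and `screenshotArgs` are hypothetical names, and treating `--quality` as an integer is an assumption (the table states only the range 1-100).

```typescript
// Options for `screenshot`, following the CLI Options table.
interface ScreenshotOptions {
  format?: "png" | "jpeg";
  quality?: number; // JPEG quality, 1-100 (integer is an assumption)
  tabId?: number;
  port?: number;
}

// Build the argument list for `browse-agent screenshot visible`.
// Only options that were actually provided are emitted.
function screenshotArgs(options: ScreenshotOptions = {}): string[] {
  const args = ["screenshot", "visible"];
  if (options.format !== undefined) {
    args.push("--format", options.format);
  }
  if (options.quality !== undefined) {
    const q = options.quality;
    if (!Number.isInteger(q) || q < 1 || q > 100) {
      throw new RangeError("quality must be an integer from 1 to 100");
    }
    args.push("--quality", String(q));
  }
  if (options.tabId !== undefined) {
    args.push("--tabId", String(options.tabId));
  }
  if (options.port !== undefined) {
    args.push("--port", String(options.port));
  }
  return args;
}
```

For example, `screenshotArgs({ format: "jpeg", quality: 80, tabId: 123 })` yields the arguments `screenshot visible --format jpeg --quality 80 --tabId 123`, which can then be passed to whatever process runner is in use.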
+ Run `browse-agent --help` to see all commands and options.
+
+ For full CLI commands, see [CLI Reference](./references/cli.md).
+
+ ## One-Shot Script (Alternative)
+
+ When you already know the exact steps needed, use the [browse launcher](./scripts/browse.mjs) to run everything in one script:
+
+ ```javascript
+ import { browse } from 'browse-agent-cli/script';
+
+ await browse(async (agent) => {
+   await agent.navigate('https://example.com');
+   const content = await agent.getContent({ format: 'text' });
+   return { url: content.url, title: content.title, content: content.content };
+ });
+ ```
+
+ Run with: `node _browse_task.mjs 2>/dev/null`
+
+ For full API and script examples, see [API Reference](./references/api.md) and [Examples](./references/examples.md).
+
+ ## Cleanup
+
+ ```bash
+ browse-agent clear # remove local installation
+ browse-agent clear --global # remove global installation
+ ```
+
+ ## References
+
+ - [API Reference](./references/api.md) — Full method signatures, options, modular scripts
+ - [CLI Reference](./references/cli.md) — All CLI commands and options
+ - [Examples & Troubleshooting](./references/examples.md) — Code examples and common fixes