arise-browser 0.2.3 → 0.3.0
package/package.json
CHANGED
@@ -13,15 +13,46 @@ metadata:
 
 # AriseBrowser
 
-
+You are controlling a **real Chrome browser**, like a human sitting in front of a screen. You see the page through snapshots, and you interact by clicking, typing, and selecting — not by writing JavaScript or constructing URLs.
 
 ## MANDATORY RULES
 
 **You MUST follow these rules. No exceptions.**
 
-1. **Do NOT call any
-2. **
-3. **
+1. **Wait for ready.** Do NOT call any endpoint until `/health` returns `{"connected":true}`.
+2. **Snapshot is your eyes.** After every navigate or significant action, call `/snapshot` to see what's on the page. Read the snapshot to find element refs (e0, e5, e12...) and understand the page structure.
+3. **Act through refs.** To click a button, select a dropdown, or type in a field — use `/action` with the ref from your snapshot. Do NOT construct URLs with query parameters to change page state. Use `select`, `click`, and `type` actions instead.
+4. **NEVER use `/evaluate` to extract data.** The snapshot already contains all visible text, links, buttons, and form elements in a structured format. `/evaluate` is only for rare edge cases where data is hidden from the accessibility tree.
+5. **NEVER use `/text` as your primary data source.** `/text` returns unstructured plain text that is hard to parse. Use `/snapshot` — it gives you structured elements with refs, roles, names, and links.
+6. **Refs are persistent.** Do NOT re-snapshot just to reuse a ref. Only re-snapshot when the page content changes.
+
+## How to Think
+
+You are a person using a browser. Snapshot is your eyes, action is your hands.
+
+- **To sort results** → find the sort dropdown in the snapshot → use `select` action on its ref
+- **To search** → find the search box ref → `type` your query → `press_key` Enter
+- **To go to next page** → find the "Next" button ref → `click` it
+- **To read product info** → it's already in the snapshot (names, prices, ratings are all there as text)
+
+### Example: Sort Amazon results by Best Sellers
+
+```bash
+# 1. Navigate
+curl -X POST /navigate -d '{"url": "https://amazon.com/s?k=laptop"}'
+
+# 2. Snapshot — see the page
+curl /snapshot
+# → combobox "Sort by:" [ref=e187] with options including "Best Sellers"
+# → link "Product Name" [ref=e226], generic "4.4" [ref=e231], link "$599" [ref=e246]
+
+# 3. Select from dropdown using ref
+curl -X POST /action -d '{"type": "select", "ref": "e187", "value": "exact-aware-popularity-rank"}'
+
+# 4. Snapshot again — results are now sorted
+curl /snapshot
+# → Read product names, prices, ratings directly from snapshot text
+```
 
 ## Step 1: Start the Server
 
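Rule 1 in the hunk above forbids calling any endpoint until `/health` returns `{"connected":true}`. A minimal sketch of that gate follows, assuming the base URL `http://localhost:9867` stated later in the same file. The helper names `is_connected` and `wait_for_ready`, the retry count, and the one-second delay are illustrative assumptions, not part of arise-browser.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: only the /health endpoint and the
# {"connected":true} body come from the diffed document.
BASE_URL="${BASE_URL:-http://localhost:9867}"

# Succeeds when a /health response body reports connected: true.
# Optional whitespace around the colon is tolerated.
is_connected() {
  printf '%s' "$1" | grep -Eq '"connected"[[:space:]]*:[[:space:]]*true'
}

# Polls /health up to $1 times (default 30), sleeping 1s between tries.
# Returns 0 once the browser is connected, 1 if it never is.
wait_for_ready() {
  local tries="${1:-30}" body
  while [ "$tries" -gt 0 ]; do
    body="$(curl -s "$BASE_URL/health" || true)"
    if is_connected "$body"; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}
```

A caller might run `wait_for_ready 60 && curl -s "$BASE_URL/snapshot"`, so that no other endpoint is touched before the health check passes.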
@@ -51,20 +82,14 @@ After installation succeeds, inform the user:
 - `6090/tcp` — Neko WebRTC UI (browser live view)
 - `52000-52100/udp` — WebRTC media data
 2. **Watch the browser**: Open `http://<server-ip>:6090` in your browser, password: `neko`
-3. **HTTPS (optional)**: For production, put
-4. **Passwords**: Default Neko passwords are `neko` (viewer) and `admin` (admin). Change via
+3. **HTTPS (optional)**: For production, put Caddy in front with `tls internal` (self-signed, no domain needed).
+4. **Passwords**: Default Neko passwords are `neko` (viewer) and `admin` (admin). Change via CLI flags.
 
-## Step 2:
+## Step 2: Core Loop
 
 Base URL: `http://localhost:9867`
 
-
-
-```
-Navigate → Snapshot → Act → Snapshot → Act → ... → Done
-```
-
-### Navigate
+### Navigate to a URL
 
 ```bash
 curl -X POST http://localhost:9867/navigate \
@@ -72,61 +97,55 @@ curl -X POST http://localhost:9867/navigate \
 -d '{"url": "https://example.com"}'
 ```
 
-### Snapshot
+### Snapshot — see the page
 
-Returns a YAML accessibility tree
+Returns a YAML accessibility tree. Every interactive element has a ref you can act on.
 
 ```bash
-# Full snapshot
 curl http://localhost:9867/snapshot
+```
 
-
-
+What you'll see in a snapshot:
+```yaml
+- combobox "Sort by:" [ref=e187] ← dropdown, use select action
+- link "Product Name" [ref=e226] ← clickable link
+- textbox "Search" [ref=e14] ← input field, use type action
+- button "Add to cart" [ref=e281] ← button, use click action
+- generic "4.4" [ref=e231] ← text content (rating)
+- generic "$599.99" [ref=e246] ← text content (price)
 ```
 
-
+Use `?diff=true` after the first snapshot to only see changes (saves tokens).
 
-
+### Act — interact with elements
+
+Use the ref from your snapshot:
 
 ```bash
-# Click
+# Click a link or button
 curl -X POST http://localhost:9867/action -H "Content-Type: application/json" \
--d '{"type": "click", "ref": "
+-d '{"type": "click", "ref": "e226"}'
 
-# Type text
+# Type in a text field
 curl -X POST http://localhost:9867/action -H "Content-Type: application/json" \
--d '{"type": "type", "ref": "
+-d '{"type": "type", "ref": "e14", "text": "search query"}'
 
-# Press key
+# Press a key (Enter, Tab, Escape, etc.)
 curl -X POST http://localhost:9867/action -H "Content-Type: application/json" \
 -d '{"type": "press_key", "key": "Enter"}'
 
-#
+# Select from a dropdown
 curl -X POST http://localhost:9867/action -H "Content-Type: application/json" \
--d '{"type": "
+-d '{"type": "select", "ref": "e187", "value": "option-value"}'
 
-#
+# Scroll down
 curl -X POST http://localhost:9867/action -H "Content-Type: application/json" \
--d '{"type": "
-
-# Select dropdown
-curl -X POST http://localhost:9867/action -H "Content-Type: application/json" \
--d '{"type": "select", "ref": "e3", "value": "option1"}'
+-d '{"type": "scroll", "direction": "down", "amount": 500}'
 ```
 
-###
+### Repeat
 
-
-# Page text
-curl http://localhost:9867/text
-
-# Screenshot (JPEG)
-curl http://localhost:9867/screenshot > screenshot.jpg
-
-# Execute JavaScript
-curl -X POST http://localhost:9867/evaluate -H "Content-Type: application/json" \
--d '{"expression": "document.title"}'
-```
+After each action that changes the page, snapshot again to see the result. Then act on the new refs.
 
 ## Step 3: Stop
 
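The hunk above describes the core loop: read a ref out of the snapshot, then pass it to `/action`. The sketch below shows one way to do the lookup, assuming snapshot lines have the flat shape shown in the added YAML sample (`- role "name" [ref=eN]`). Real snapshots may be nested or carry extra attributes, so treat the matching as an approximation. `find_ref`, `click_payload`, and `click_ref` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Hypothetical helpers built around the snapshot line shape and the
# /action click body shown in the diff; not part of arise-browser itself.
BASE_URL="${BASE_URL:-http://localhost:9867}"

# Reads snapshot text on stdin and prints the ref of the first line that
# contains: <role> "<name>" [ref=...]. Prints nothing when no line matches.
find_ref() {
  local role="$1" name="$2"
  grep -F -- "$role \"$name\" [ref=" \
    | head -n 1 \
    | sed -E 's/.*\[ref=(e[0-9]+)\].*/\1/'
}

# Prints the JSON body for a click on the given ref.
click_payload() {
  printf '{"type": "click", "ref": "%s"}' "$1"
}

# Sends the click to /action (network call; not exercised at load time).
click_ref() {
  curl -s -X POST "$BASE_URL/action" -H "Content-Type: application/json" \
    -d "$(click_payload "$1")"
}
```

For example, `ref="$(curl -s "$BASE_URL/snapshot" | find_ref button "Add to cart")"` followed by `click_ref "$ref"` performs one snapshot-then-act step. An empty `ref` means the element is not in the current snapshot.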
@@ -138,11 +157,11 @@ The Docker container is automatically stopped and cleaned up.
 
 ## Tips
 
+- **Read the snapshot carefully.** Product names, prices, ratings, links — they're all there. No need for JavaScript or regex.
 - Use `?diff=true` after the first snapshot to save tokens.
-- Refs persist across snapshots — don't re-snapshot just to reuse a ref.
 - Batch actions: `POST /actions` with `{"actions": [...], "stopOnError": true}`.
 - Tabs: `GET /tabs`, `POST /tab` with `{"action": "create|switch|close"}`.
--
+- Screenshot (`GET /screenshot`) is useful to show the user what you see, but do NOT use it as your primary data source.
 
 ## Troubleshooting
 
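The Tips hunk above mentions `POST /actions` with `{"actions": [...], "stopOnError": true}` but does not show what goes inside the array. The sketch below assumes each element has the same shape as a single `/action` body. The diff does not confirm that assumption, and `batch_payload` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper. Assumption: each batch element is a plain /action
# body; only the outer {"actions": [...], "stopOnError": true} shape is
# taken from the diffed document.

# Joins the given action JSON objects into one /actions request body.
batch_payload() {
  local joined="" action
  for action in "$@"; do
    # Append with ", " only when something is already collected.
    joined="${joined:+$joined, }$action"
  done
  printf '{"actions": [%s], "stopOnError": true}' "$joined"
}
```

A search could then be sent in one request, for instance with `curl -s -X POST http://localhost:9867/actions -H "Content-Type: application/json" -d "$(batch_payload '{"type": "type", "ref": "e14", "text": "laptop"}' '{"type": "press_key", "key": "Enter"}')"`.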
@@ -151,9 +170,5 @@ The Docker container is automatically stopped and cleaned up.
 | First run slow | Docker pulling Neko image (~700MB), wait ~2 min |
 | Health returns `connected: false` | Chrome crashed — restart arise-browser |
 | Neko UI loads but no video | Open UDP 52000-52100 in firewall/security group |
-|
-|
-
-## Full API Reference
-
-See [references/api.md](references/api.md) for all endpoints, parameters, and advanced features (recording, PDF export, batch actions).
+| Action returns error | Snapshot first to get valid refs, then act on them |
+| Can't find an element | Scroll down and snapshot again — element may be below the fold |
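The last troubleshooting row added above ("Scroll down and snapshot again") amounts to a small retry loop. A minimal sketch follows, assuming the scroll action body shown earlier in the diff. `find_with_scroll`, `snapshot`, and `scroll_down` are hypothetical names, and the default of five attempts is arbitrary.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; the endpoints and the scroll body are taken from
# the diffed document, everything else is illustrative.
BASE_URL="${BASE_URL:-http://localhost:9867}"

snapshot() { curl -s "$BASE_URL/snapshot"; }

scroll_down() {
  curl -s -X POST "$BASE_URL/action" -H "Content-Type: application/json" \
    -d '{"type": "scroll", "direction": "down", "amount": 500}'
}

# Runs snapshot_cmd up to max_tries times, running scroll_cmd after each
# miss, until a snapshot line contains the literal text "needle".
# Prints the first matching line and returns 0, or returns 1 if not found.
find_with_scroll() {
  local needle="$1" snapshot_cmd="$2" scroll_cmd="$3" max_tries="${4:-5}"
  local attempt=1 line
  while [ "$attempt" -le "$max_tries" ]; do
    line="$("$snapshot_cmd" | grep -F -m 1 -- "$needle")" && {
      printf '%s\n' "$line"
      return 0
    }
    "$scroll_cmd" >/dev/null
    attempt=$((attempt + 1))
  done
  return 1
}
```

Against a live server the call would look like `find_with_scroll 'button "Add to cart"' snapshot scroll_down`. Passing the two commands as arguments keeps the loop testable with stand-ins that make no network calls.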