barebrowse 0.5.1 → 0.5.2
- package/CHANGELOG.md +6 -6
- package/barebrowse.context.md +1 -1
- package/mcp-server.js +1 -1
- package/package.json +1 -1
- package/.barebrowse/page-2026-02-23T15-39-32-011Z.yml +0 -1219
- package/.barebrowse/page-2026-02-23T15-40-19-874Z.yml +0 -663
- package/.mcp.json +0 -8
- package/CLAUDE.md +0 -24
- package/baremobile.md +0 -105
- package/docs/00-context/assumptions.md +0 -38
- package/docs/00-context/system-state.md +0 -402
- package/docs/00-context/vision.md +0 -52
- package/docs/01-product/prd.md +0 -308
- package/docs/03-logs/bug-log.md +0 -16
- package/docs/03-logs/decisions-log.md +0 -32
- package/docs/03-logs/implementation-log.md +0 -54
- package/docs/03-logs/insights.md +0 -35
- package/docs/03-logs/validation-log.md +0 -269
- package/docs/04-process/definition-of-done.md +0 -31
- package/docs/04-process/dev-workflow.md +0 -68
- package/docs/04-process/testing.md +0 -242
- package/docs/README.md +0 -56
- package/docs/archive/poc-plan.md +0 -230
- package/docs/skill-template.md +0 -106
@@ -1,269 +0,0 @@
-# Validation Log
-
-What's been tested against the real world. Updated when new sites or features are validated.
-
----
-
-## Test suite (64 tests, 6 files)
-
-| File | Tests | Type | What it covers |
-|------|-------|------|----------------|
-| `test/unit/prune.test.js` | 16 | Unit | 9-step pruning pipeline in isolation |
-| `test/unit/auth.test.js` | 7 | Unit | Cookie extraction from Firefox/Chromium |
-| `test/unit/cdp.test.js` | 5 | Unit | Browser discovery, launch, CDP client, sessions |
-| `test/integration/browse.test.js` | 11 | Integration | Full `browse()` and `connect()` pipeline |
-| `test/integration/cli.test.js` | 10 | Integration | CLI session lifecycle: open/snapshot/goto/click/eval/console/network/close |
-| `test/integration/interact.test.js` | 15 | E2E | Real interactions on data: fixtures + live sites |
-
-Run all: `node --test test/unit/*.test.js test/integration/*.test.js`
-
-## Site validation matrix
-
-Tested across 16+ sites, 8 countries, 7 languages.
-
-| Site | Consent | Cookies | Interactions | Notes |
-|------|---------|---------|-------------|-------|
-| google.com | NL dialog dismissed | Firefox injection | Search (combobox + Enter) | Bot-blocks headless |
-| youtube.com | Bypassed via cookies | Firefox injection | Search + video playback | Full e2e demo, SPA nav |
-| bbc.com | SourcePoint dismissed | -- | -- | Button outside dialog |
-| wikipedia.org | -- | -- | Link click + navigation | Clean, no consent |
-| github.com | -- | -- | SPA navigation | Needs settle time |
-| duckduckgo.com | -- | -- | Search + results | Headless-friendly |
-| news.ycombinator.com | -- | -- | Story link click | Clean, simple DOM |
-| amazon.de | Banner dismissed | -- | -- | |
-| theguardian.com | CMP dismissed | -- | -- | |
-| spiegel.de | CMP dismissed | -- | -- | German |
-| lemonde.fr | CMP dismissed | -- | -- | French |
-| elpais.com | CMP dismissed | -- | -- | Spanish |
-| corriere.it | CMP dismissed | -- | -- | Italian |
-| nos.nl | CMP dismissed | -- | -- | Dutch |
-| bild.de | CMP dismissed | -- | -- | German |
-| nu.nl | CMP dismissed | -- | -- | Dutch |
-| booking.com | Banner dismissed | -- | -- | |
-| nytimes.com | -- | -- | -- | No consent wall |
-| stackoverflow.com | Footer link only | -- | -- | Not blocking |
-| cnn.com | -- | -- | -- | No consent wall |
-| reddit.com | -- | -- | Fallback to old.reddit | Bot-blocks headless |
-
-## Token reduction measurements
-
-| Page | Raw ARIA | Pruned | Reduction |
-|------|----------|--------|-----------|
-| example.com | 377 chars | 45 chars | 88% |
-| Hacker News | 51,726 chars | 27,197 chars | 47% |
-| Wikipedia (article) | 109,479 chars | 40,566 chars | 63% |
-| DuckDuckGo | 42,254 chars | 5,407 chars | 87% |
-
----
-
-## CLI manual validation (v0.3.0)
-
-Full end-to-end validation of every CLI command against real websites.
-
-### Session lifecycle
-
-| Command | Result |
-|---------|--------|
-| `barebrowse open https://example.com` | Session started, pid+port printed, session.json created |
-| `barebrowse status` | Shows running pid, port, start time |
-| `barebrowse close` | "Session closed", session.json removed, daemon exited |
-| `status` after close | "No session found", exit code 1 |
-| `click 5` with no session | "No active session. Run `barebrowse open` first.", exit 1 |
-| double `open` | "Session already running. Use `barebrowse close` first.", exit 1 |
-
-### Navigation + snapshots (example.com, HN)
-
-| Command | Result |
-|---------|--------|
-| `snapshot` (example.com) | `.barebrowse/page-*.yml` created, clean formatting |
-| `snapshot --mode=read` | Read mode includes paragraphs, each node on own line |
-| `goto https://news.ycombinator.com` | "ok" |
-| `snapshot` (HN) | Clean ARIA tree with refs, proper newline separation |
-| `screenshot` | Valid 780x493 PNG file |
-
-### Interactions (DuckDuckGo search)
-
-| Command | Result |
-|---------|--------|
-| `type 12 barebrowse npm` | "ok", multi-word text correctly joined |
-| `press Enter` | "ok", search submitted |
-| `wait-idle` | "ok", waited for network settle |
-| `eval "document.title"` | `"barebrowse npm at DuckDuckGo"` |
-| `snapshot` | Search results page, clean formatting with refs |
-| `fill 2583 hello world` | "ok", cleared search box + typed new text |
-| `hover 2402` | "ok" |
-| `scroll 300` | "ok" |
-
-### Debugging commands
-
-| Command | Result |
-|---------|--------|
-| `eval "1 + 1"` | `2` |
-| `eval "document.location.href"` | `"https://news.ycombinator.com/news"` |
-| `eval "console.log('test'); console.error('err')"` | `ok` (undefined return) |
-| `console-logs` | `.json (2 entries)` — log + error captured with types and timestamps |
-| `network-log` | `.json (15 entries)` — all requests with URL, method, status |
-| `network-log --failed` | `.json (1 entries)` — filtered to failed/4xx+ only |
-
-### Legacy + install commands
-
-| Command | Result |
-|---------|--------|
-| `browse https://example.com` | One-shot snapshot to stdout |
-| `install` | "No MCP clients detected" + Claude Code hint |
-| `install --skill` | SKILL.md copied to `~/.config/claude/skills/barebrowse/` |
-| (no args) | Clean help output with all commands |
-
-### Bug found and fixed during validation
-
-**`src/aria.js` line 23**: ignored nodes joined children with `''` instead of `'\n'`, causing sibling subtrees to concatenate on one line (e.g. `[ref=15]- _promote`). Fixed to `.filter(Boolean).join('\n')`. All 64 tests pass with the fix.
-
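The join bug recorded above can be reproduced in isolation. The following is a minimal sketch, not the code from `src/aria.js`: the function names and the sample child strings are hypothetical, and only the difference between joining with `''` and using `.filter(Boolean).join('\n')` is taken from the log.

```javascript
// Hypothetical stand-ins for how an ignored node renders its children.
// Each child is already rendered to a string; pruned children render as ''.
function renderChildrenBuggy(children) {
  // Buggy variant: sibling subtrees are glued together on one line.
  return children.join('');
}

function renderChildrenFixed(children) {
  // Fixed variant: drop empty results, then put each sibling on its own line.
  return children.filter(Boolean).join('\n');
}

// Made-up rendered children, including one empty (pruned) entry.
const children = ['- link "new" [ref=15]', '', '- link "past" [ref=16]'];

console.log(renderChildrenBuggy(children));
console.log(renderChildrenFixed(children));
```

The `filter(Boolean)` step matters as much as the separator: without it, the empty entry would leave a blank line in the snapshot.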
----
-
-## New features manual validation (v0.4.0)
-
-All tested against live sites via CLI session from `/tmp`.
-
-### Navigation: back/forward
-
-| Command | Result |
-|---------|--------|
-| `open https://example.com` | Session started |
-| `goto https://wikipedia.org` | "ok" |
-| `back` | "ok" — returned to example.com |
-| `forward` | "ok" — returned to wikipedia.org |
-
-### File upload
-
-| Command | Result |
-|---------|--------|
-| `goto 'data:text/html,<input type="file" id="f"><script>...</script>'` | "ok" |
-| `snapshot` | `button "Choose File" [ref=7]` |
-| `upload 7 /tmp/test-upload.txt` | "ok" |
-| `eval 'document.title'` | `"uploaded"` — onchange fired, confirmed working |
-
-### PDF export
-
-| Command | Result |
-|---------|--------|
-| (on wikipedia.org) `pdf` | `.barebrowse/page-*.pdf` — 200,716 bytes |
-
-### Tabs
-
-| Command | Result |
-|---------|--------|
-| `tabs` | `[{"index":0,"url":"https://www.wikipedia.org/","title":"Wikipedia",...}, {"index":1,"url":"about:blank",...}]` |
-
-### Wait-for
-
-| Command | Result |
-|---------|--------|
-| `wait-for --text=Wikipedia` | "ok" — found text immediately |
-| `wait-for --selector=body` | "ok" — found selector immediately |
-
-### JS dialog auto-dismiss
-
-| Command | Result |
-|---------|--------|
-| `eval 'alert("hello from dialog"); "done"'` | `"done"` — alert auto-dismissed, eval continued |
-| `dialog-log` | `.barebrowse/dialogs-*.json (1 entries)` — dialog logged with type, message, timestamp |
-
-### Save state
-
-| Command | Result |
-|---------|--------|
-| `save-state` | `.barebrowse/state-*.json` — 2,836 bytes (cookies + localStorage) |
-
-### Viewport flag
-
-| Command | Result |
-|---------|--------|
-| `open https://example.com --viewport=800x600` | Session started |
-| `eval 'window.innerWidth + "x" + window.innerHeight'` | `"800x600"` — confirmed |
-
-### Drag (wired, needs drag-and-drop UI for visual test)
-
-Wired through interact.js → index.js → daemon.js → cli.js. Mouse event sequence: mousePressed at source → mouseMoved to midpoint → mouseMoved to target → mouseReleased at target. Requires a drag-and-drop UI to validate visually.
-
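The four-step mouse sequence described for drag can be sketched as a list of parameter objects for the CDP `Input.dispatchMouseEvent` method. This is an illustration under assumptions: `dragEvents` is a hypothetical helper, the source and target points are assumed to be element centres in CSS pixels, and how barebrowse's `interact.js` actually computes coordinates is not shown in the log.

```javascript
// Build parameters for four Input.dispatchMouseEvent calls:
// press at source, move to midpoint, move to target, release at target.
// `from` and `to` are { x, y } points (assumed inputs, not barebrowse API).
function dragEvents(from, to) {
  const mid = { x: (from.x + to.x) / 2, y: (from.y + to.y) / 2 };
  const base = { button: 'left', clickCount: 1 };
  return [
    { type: 'mousePressed', ...from, ...base },
    { type: 'mouseMoved', ...mid, ...base },
    { type: 'mouseMoved', ...to, ...base },
    { type: 'mouseReleased', ...to, ...base },
  ];
}

const events = dragEvents({ x: 100, y: 200 }, { x: 300, y: 400 });
console.log(events.map((e) => `${e.type}@${e.x},${e.y}`).join(' -> '));
```

Each object would be sent in order over a CDP session. The intermediate move gives the page a pointer position between the two endpoints, which matches the midpoint step named in the log.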
-### Proxy flag
-
-Wired through cli.js → daemon.js → chromium.js → `--proxy-server` Chromium launch arg. Requires a proxy server to validate.
-
-### Storage-state flag
-
-Wired through cli.js → daemon.js → connect() → `Network.setCookies` on startup. Loads from JSON file produced by `save-state`.
-
----
-
-## MCP server validation (v0.4.1)
-
-All 12 MCP tools tested live via Claude Code MCP integration. Stats line (`# X chars → Y chars (N% pruned)`) confirmed on every snapshot.
-
-### Tools tested successfully (10/12)
-
-| Tool | Test | Result |
-|------|------|--------|
-| `browse` | One-shot HN | `51,397 → 26,983 (48% pruned)` — stats line present |
-| `goto` | DDG, Wikipedia, data: URLs | All navigated successfully |
-| `snapshot` | Multiple pages | Stats line on every snapshot, pruning working |
-| `click` | Wikipedia "About Wikipedia" link | Navigated to target page |
-| `type` | DDG search box `barebrowse npm` | Text entered correctly |
-| `press` | Enter to submit DDG search | Search submitted (CAPTCHA returned — expected headless) |
-| `scroll` | 500px down on Wikipedia:About | Scrolled successfully |
-| `back` | After Wikipedia:About → CDP page | Returned to previous page |
-| `forward` | After back | Returned to Wikipedia:About |
-| `pdf` | Wikipedia:About | 380K base64 PDF generated |
-
-### Tools tested with known limitations (2/12)
-
-| Tool | Test | Result |
-|------|------|--------|
-| `upload` | data: page with file input | `ok` returned, file set via DOM.setFileInputFiles. onchange fires but result text pruned in act mode (non-interactive content). Works in integration tests. |
-| `drag` | data: page with draggable divs | Mouse events dispatched but HTML5 drag/drop dataTransfer not populated via CDP synthetic events. Known CDP limitation (same as Playwright). |
-
-### Observations
-
-- DDG returned CAPTCHA in headless ("Select all squares containing a duck") — expected, hybrid mode handles this
-- Stats line format: `# 42,367 chars → 5,453 chars (87% pruned)` — present on all pruned snapshots
-- Token reduction ranges observed: 37% (Wikipedia) to 88% (example.com)
-
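The stats line format quoted in the observations can be reproduced from the two character counts. This is a sketch only: `statsLine` is a hypothetical helper, and rounding to the nearest whole percent is an assumption, although it agrees with the figures reported in this log.

```javascript
// Hypothetical helper: format a stats line from raw and pruned lengths.
// Assumes the percentage is rounded to the nearest integer.
function statsLine(rawChars, prunedChars) {
  const pct = Math.round((1 - prunedChars / rawChars) * 100);
  const fmt = (n) => n.toLocaleString('en-US'); // thousands separators
  return `# ${fmt(rawChars)} chars → ${fmt(prunedChars)} chars (${pct}% pruned)`;
}

console.log(statsLine(42367, 5453));
// prints: # 42,367 chars → 5,453 chars (87% pruned)
```

Applied to the `browse` row above (51,397 raw, 26,983 pruned), the same arithmetic gives 48%, matching the logged value.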
----
-
-## MCP cookies + hybrid fallback validation (v0.4.2)
-
-Three changes tested: all-browser cookie merge in auth.js, hybrid mode for connect(), cookie injection + hybrid in MCP goto.
-
-### Cookie injection — login-walled sites via MCP goto
-
-| Site | Logged In? | Details |
-|------|-----------|---------|
-| **Gmail** | Yes | Full inbox visible: Compose, labels, 4 emails. Required domain-stripping fix (`mail.google.com` → `google.com`) to capture parent-domain cookies (SID, HSID, etc.). 47 cookies merged from Firefox + Chromium. |
-| **YouTube** | Yes | Personalized feed: tabs for Linux, AI, Electrical Engineering. Recommendations include Claude Code videos, KDE Plasma. Account buttons visible. |
-| **LinkedIn** | Yes | Full feed as Amr Hassan: Home, My Network, Jobs, Messaging, Notifications. Posts visible. Stealth patches + cookies bypassed LinkedIn's aggressive bot detection. |
-| **Amazon.nl** | No (expected) | Not logged in but consent dismissed, search + product pages worked. Cookie injection had no effect (no Amazon session in Firefox). |
-| **GitHub** | No | Shows generic homepage with "Sign in". No GitHub session cookies in Firefox. |
-
-### Bot detection — hybrid fallback
-
-| Site | Headless Result | Hybrid Fallback | Final Result |
-|------|----------------|-----------------|--------------|
-| **Google Search** | Full results, no CAPTCHA | Not triggered (stealth sufficient) | Pass — logged in as Amr Hassan |
-| **Reddit** | "Prove your humanity" + reCAPTCHA | Triggered → connected to headed Chromium on 9222 | Pass — full feed with posts, logged in |
-| **LinkedIn** | Loaded fine with stealth + cookies | Not triggered | Pass |
-
-### Bug fixes discovered during validation
-
-1. **Domain stripping in authenticate()**: `mail.google.com` extracted only 9 cookies (subdomain-specific). Fix: strip to registrable domain (`google.com`) → 47 cookies including all auth cookies (SID, HSID, SSID, APISID, SAPISID).
-2. **Reddit challenge detection**: Block page shows "Prove your humanity" and "File a ticket" — neither matched existing challenge phrases. Added both to `isChallengePage()`.
-
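The domain-stripping fix in item 1 can be illustrated with a small cookie jar. Everything here is an assumption made for illustration: `naiveParentDomain` and `filterByDomain` are hypothetical names, the jar is made up, and keeping the last two labels is a shortcut. A correct registrable-domain lookup would need the Public Suffix List, because the shortcut is wrong for hosts such as `example.co.uk`.

```javascript
// Naive parent-domain stripping: keep the last two labels of a hostname.
// Assumption: real code would consult the Public Suffix List instead.
function naiveParentDomain(hostname) {
  return hostname.toLowerCase().split('.').filter(Boolean).slice(-2).join('.');
}

// A stored cookie matches a query domain when its own domain (minus any
// leading dot) equals the query or is a subdomain of it.
function filterByDomain(cookies, queryDomain) {
  return cookies.filter((c) => {
    const d = c.domain.replace(/^\./, '').toLowerCase();
    return d === queryDomain || d.endsWith('.' + queryDomain);
  });
}

// Made-up jar: one parent-domain cookie and one subdomain-only cookie.
const jar = [
  { name: 'SID', domain: '.google.com' },
  { name: 'subdomain_only', domain: 'mail.google.com' },
];

const host = 'mail.google.com';
console.log(filterByDomain(jar, host).length);                    // 1
console.log(filterByDomain(jar, naiveParentDomain(host)).length); // 2
```

Querying with the full hostname misses the parent-domain cookie, which is the same shape of problem the log reports (9 cookies before the fix, 47 after).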
-### connect() hybrid mode
-
-Tested `connect({ mode: 'hybrid' })` with Reddit: headless detected challenge → killed browser → connected to headed Chromium → Reddit loaded with full content. Same code path as MCP session.
-
-### All-browser cookie merge
-
-`extractCookies({ domain: 'google.com' })` in auto mode: Chromium cookies merged first, then Firefox cookies (last-write-wins by `name@domain`). 47 cookies total for google.com. Previous behavior: stopped at first browser found (Chromium only, missed Firefox session).
-
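The last-write-wins merge keyed by `name@domain` can be sketched with a `Map`. This is not the code in `auth.js`: `mergeCookies` is a hypothetical helper and the cookie values are made up. Only the key format and the merge order (Chromium first, then Firefox) come from the log.

```javascript
// Merge cookie arrays in order; later sources overwrite earlier ones
// when they share the same `name@domain` key.
function mergeCookies(...sources) {
  const byKey = new Map();
  for (const cookies of sources) {
    for (const c of cookies) {
      byKey.set(`${c.name}@${c.domain}`, c);
    }
  }
  return [...byKey.values()];
}

// Made-up sample data for illustration.
const chromium = [
  { name: 'SID', domain: '.google.com', value: 'from-chromium' },
  { name: 'NID', domain: '.google.com', value: 'chromium-only' },
];
const firefox = [
  { name: 'SID', domain: '.google.com', value: 'from-firefox' },
];

const merged = mergeCookies(chromium, firefox);
console.log(merged.map((c) => `${c.name}=${c.value}`).join(', '));
```

Passing Firefox last means its session cookies replace any stale Chromium copies, while cookies that exist in only one browser are kept.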
----
-
-*Add new validation entries when testing against new sites or features.*
@@ -1,31 +0,0 @@
-# Definition of Done
-
-A feature or change is "done" when ALL of these are true.
-
-## Code
-
-- [ ] Works end-to-end (not just the happy path)
-- [ ] No heavy dependencies added (vanilla -> stdlib -> external hierarchy respected)
-- [ ] Under reasonable line count -- no bloat
-- [ ] Clean process management -- no orphan browser processes
-- [ ] No security vulnerabilities introduced (command injection, XSS, etc.)
-
-## Tests
-
-- [ ] Existing tests still pass: `node --test test/unit/*.test.js test/integration/*.test.js`
-- [ ] New behavior has test coverage (integration preferred over unit)
-- [ ] Bug fixes include a regression test that fails before the fix
-
-## Documentation
-
-- [ ] `docs/00-context/system-state.md` updated if architecture changed
-- [ ] `docs/03-logs/decisions-log.md` updated if a design decision was made
-- [ ] `barebrowse.context.md` updated if public API changed
-- [ ] `CHANGELOG.md` updated with what changed
-
-## Not required (avoid over-engineering)
-
-- 100% code coverage
-- TypeScript types
-- Cross-platform testing (Linux first, others later)
-- Performance benchmarks (unless performance is the feature)
@@ -1,68 +0,0 @@
-# Development Workflow
-
-## Dev rules
-
-**POC first.** Always validate logic with a ~15min proof-of-concept before building. Cover happy path + common edges. POC works -> design properly -> build with tests. Never ship the POC.
-
-**Build incrementally.** Break work into small independent modules. One piece at a time, each must work on its own before integrating.
-
-**Dependency hierarchy -- follow strictly:**
-1. Vanilla language -- write it yourself if <50 lines and not security-critical
-2. Standard library -- `node:test`, `node:fs`, `node:crypto`, `node:sqlite`
-3. External -- only when stdlib can't do it in <100 lines. Must be maintained, lightweight, widely adopted
-
-**Exception:** Always use vetted libraries for security-critical code (crypto, auth, sanitization).
-
-**Lightweight over complex.** Fewer moving parts, fewer deps, less config. Simple > clever. Readable > elegant.
-
-**Open-source only.** No vendor lock-in. Every line of code must have a purpose -- no speculative code, no premature abstractions.
-
-## Language and runtime
-
-- Vanilla JavaScript, ES modules, no build step
-- Node.js >= 22 (built-in WebSocket, built-in SQLite)
-- No TypeScript -- can add types later if needed
-
-## Running tests
-
-```bash
-# All 54 tests
-node --test test/unit/*.test.js test/integration/*.test.js
-
-# Unit only (fast, no network)
-node --test test/unit/prune.test.js
-node --test test/unit/auth.test.js
-node --test test/unit/cdp.test.js
-
-# Integration (needs Chromium + network)
-node --test test/integration/browse.test.js
-node --test test/integration/interact.test.js
-
-# Quick smoke test
-node -e "import { browse } from './src/index.js'; console.log(await browse('https://example.com'))"
-```
-
-## Testing standards
-
-- **Test behavior, not implementation.** Call the public API, assert on observable output.
-- **Integration tests are the sweet spot.** Real components working together.
-- **No test framework deps.** `node:test` and `node:assert/strict` only.
-- **Always `page.close()` in a `finally` block** to avoid leaked browser processes.
-- **Use `data:` URL fixtures** for deterministic tests (no network dependency).
-- **Real-site tests** go in `interact.test.js`, grouped by site.
-
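The `data:` URL fixture convention listed above can be sketched as follows. The `dataUrl` helper and the sample markup are hypothetical; the project's own fixtures are not shown in this diff, so this only illustrates the general technique of percent-encoding an HTML string into a self-contained URL.

```javascript
// Hypothetical helper: wrap an HTML string in a data: URL so a test
// page can be loaded without any network access.
function dataUrl(html) {
  return 'data:text/html,' + encodeURIComponent(html);
}

// Made-up fixture: a button that changes the document title when clicked.
const html = '<button onclick="document.title=\'clicked\'">Go</button>';
const fixture = dataUrl(html);

console.log(fixture);
```

In a test, a URL built this way would be handed to `page.goto()` in place of a live site, so the assertion depends only on markup the test itself controls.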
-See `docs/04-process/testing.md` for the full test guide.
-
-## Git workflow
-
-- Main branch: `main`
-- Commit messages: conventional (`fix:`, `feat:`, `chore:`, `docs:`, `release:`)
-- No force pushes to main
-
-## Environment
-
-- OS: Fedora Linux, KDE Plasma, Wayland
-- Node: 22.22.0
-- Browser: `/usr/bin/chromium-browser`
-- Default browser: Firefox (cookies extracted from `~/.mozilla/firefox/*.default-release/cookies.sqlite`)
-- KWallet has Chromium Safe Storage key
@@ -1,242 +0,0 @@
-# barebrowse -- Testing Guide
-
-## Run all tests
-
-```bash
-node --test test/unit/*.test.js test/integration/*.test.js
-```
-
-64 tests, 6 files, ~60s on a typical machine. No test framework -- uses Node's built-in `node:test` runner.
-
-## Test pyramid
-
-```
-      /  E2E   \          15 tests — real websites (Google, Wikipedia, GitHub, etc.)
-     /----------\
-   / Integration  \       21 tests — browse/connect pipeline + CLI session lifecycle
-  /----------------\
- /       Unit       \     28 tests — pruning, cookie extraction, CDP client, browser launch
-/--------------------\
-```
-
-Unit tests are fast and isolated. Integration tests launch a real headless Chromium. E2E tests (part of interact.test.js) hit live websites — they require internet and may be slower or flaky on CI.
-
----
-
-## Unit tests (28 tests)
-
-### `test/unit/prune.test.js` -- 16 tests
-
-Tests the 9-step ARIA pruning pipeline in isolation. No browser, no network.
-
-| # | Test | What it validates |
-|---|------|-------------------|
-| 1 | returns null for empty tree | prune(null) returns null |
-| 2 | unwraps RootWebArea | Root container node stripped from output |
-| 3 | keeps interactive elements in act mode | Buttons, links, textboxes survive pruning |
-| 4 | drops paragraphs in act mode | Non-interactive text removed in act mode |
-| 5 | keeps paragraphs in browse mode | Text content preserved in browse/read mode |
-| 6 | drops InlineTextBox noise | Low-level rendering nodes always filtered |
-| 7 | keeps headings | h1/h2 headings preserved in browse mode |
-| 8 | drops description headings in act mode | Only primary h1 kept, secondary headings removed |
-| 9 | collapses unnamed structural wrappers | Nested generic divs flattened, children promoted |
-| 10 | keeps named groups | Radiogroup/radio elements preserved |
-| 11 | drops separators | Separator/hr nodes always removed |
-| 12 | drops images in act mode, keeps named in browse | Act strips all images, browse keeps named ones |
-| 13 | trims combobox to just name + selected value | Combobox children (options list) stripped |
-| 14 | uses context keywords to condense non-matching cards | Context filtering collapses irrelevant list items |
-| 15 | extracts main landmark when present | Act mode keeps only main content area |
-| 16 | handles pages without landmarks (HN-style) | Pruning works on flat, landmark-less pages |
-
-### `test/unit/auth.test.js` -- 7 tests
-
-Tests cookie extraction from the local filesystem. Reads real browser cookie databases.
-
-| # | Test | What it validates |
-|---|------|-------------------|
-| 1 | auto-detects a browser and returns cookies | extractCookies() finds Firefox or Chromium and returns array |
-| 2 | returns cookies with correct shape | Each cookie has name, value, domain, path, secure, httpOnly, sameSite, expires |
-| 3 | filters by domain | Domain filter parameter restricts results |
-| 4 | extracts from firefox explicitly | `{ browser: 'firefox' }` parameter works |
-| 5 | throws for non-existent browser | Error thrown for unknown browser string |
-| 6 | cookies have non-empty values | All returned cookies have non-empty value strings |
-| 7 | sameSite is a valid value | sameSite is one of 'None', 'Lax', or 'Strict' |
-
-Note: 2 tests may skip when Chromium profile is locked by a running instance (AES decryption needs keyring access).
-
-### `test/unit/cdp.test.js` -- 5 tests
-
-Tests browser discovery, launch, CDP WebSocket client, and session handling.
-
-| # | Test | What it validates |
-|---|------|-------------------|
-| 1 | finds a Chromium-based browser | findBrowser() returns path to chromium/chrome/brave/edge |
-| 2 | launches headless Chromium and returns WebSocket URL | launch() returns valid ws:// URL, port, and live process |
-| 3 | connects to browser and sends commands | createCDP() connects, Browser.getVersion responds |
-| 4 | creates session-scoped handles | Target.createTarget + session() dispatches to correct target |
-| 5 | gets accessibility tree from a page | Accessibility.getFullAXTree returns nodes with role/name |
-
----
-
-## Integration tests (11 tests)
-
-### `test/integration/browse.test.js` -- 11 tests
-
-Tests the full `browse()` and `connect()` pipeline end-to-end against real pages.
-
-| # | Suite | Test | What it validates |
-|---|-------|------|-------------------|
-| 1 | browse() | returns ARIA snapshot for a public page | browse('example.com') returns non-empty snapshot with title |
-| 2 | browse() | includes heading and ref markers | Snapshot contains roles and [ref=N] markers |
-| 3 | browse() | prunes by default (act mode) | Pruned output smaller than raw ARIA tree |
-| 4 | browse() | browse mode preserves paragraphs | pruneMode: 'browse' keeps text content |
-| 5 | browse() | act mode drops paragraphs | pruneMode: 'act' removes non-interactive text |
-| 6 | browse() | handles complex pages with significant reduction | Hacker News pruned by at least 20% |
-| 7 | browse() | can disable cookies | cookies: false works without error |
-| 8 | browse() | can disable pruning | prune: false keeps raw RootWebArea |
-| 9 | connect() | creates a long-lived session and navigates | connect() + goto() + snapshot() works |
-| 10 | connect() | supports multiple navigations in one session | Multiple goto() calls on same page |
-| 11 | connect() | snapshot accepts prune: false for raw output | snapshot(false) preserves full tree |
-
-### `test/integration/cli.test.js` -- 10 tests
-
-Tests the full CLI session lifecycle: daemon spawn, command dispatch over HTTP, and cleanup. Uses a temp directory so tests don't pollute the project.
-
-| # | Test | What it validates |
-|---|------|-------------------|
-| 1 | open starts a daemon and creates session.json | `barebrowse open about:blank` spawns daemon, writes session.json with port+pid |
-| 2 | status shows running session | `barebrowse status` reports pid, port, start time |
-| 3 | snapshot creates a .yml file | `barebrowse snapshot` writes .barebrowse/page-*.yml |
-| 4 | goto navigates and snapshot shows new page content | `barebrowse goto example.com` + snapshot contains "Example Domain" + refs |
-| 5 | click sends click command | `barebrowse click <ref>` returns "ok" |
-| 6 | eval executes JS and returns result | `barebrowse eval 1+1` returns "2" |
-| 7 | console-logs creates a .json file | After eval with console.log, `console-logs` writes JSON |
-| 8 | network-log creates a .json file | `network-log` writes JSON with request entries |
-| 9 | close shuts down the daemon | `barebrowse close` removes session.json, daemon exits |
-| 10 | status after close shows no session | `barebrowse status` exits non-zero when no session |
-
-Note: Tests run sequentially within the suite (each depends on the session opened in test 1). The `after()` hook ensures daemon cleanup even if tests fail.
-
----
-
-## E2E tests (15 tests)
-
-### `test/integration/interact.test.js` -- 15 tests
-
-Tests real interactions: clicking, typing, scrolling, form submission, and navigation. Uses a local `data:` URL fixture for deterministic tests, plus live websites for real-world coverage.
-
-#### Data URL fixture tests (7 tests)
-
-| # | Test | What it validates |
-|---|------|-------------------|
-| 1 | click sets button result text | page.click(ref) triggers onclick handler |
-| 2 | type fills an empty input | page.type(ref, text) fills empty textbox |
-| 3 | type with clear replaces existing text | { clear: true } replaces prefilled input |
-| 4 | click on offscreen element scrolls into view first | Auto-scroll before click on element at 3000px |
-| 5 | press Enter submits a form | page.press('Enter') triggers form onsubmit |
-| 6 | press throws on unknown key | Error thrown for unrecognized key names |
-| 7 | link click + waitForNavigation navigates | Cross-page navigation via click + waitForNavigation |
-
-#### Live website tests (8 tests)
-
-| # | Site | Test | What it validates |
-|---|------|------|-------------------|
-| 1 | Google | search and navigate results | type() + press('Enter') + waitForNavigation() on Google |
-| 2 | Wikipedia | navigate article links | click() + waitForNavigation() on Wikipedia article links |
-| 3 | GitHub | navigate SPA repo links | click() works for SPA navigation (no loadEventFired) |
-| 4 | DuckDuckGo | search query and verify results | type() + press('Enter') + navigation on DDG |
-| 5 | Hacker News | load homepage and navigate to a story | click() + waitForNavigation() on HN story links |
-| 6 | Reddit (old) | load and navigate to a post | Page navigation with fallback to www.reddit.com |
-| 7 | Firefox cookies | extract and inject into CDP session | extractCookies() + injectCookies() workflow |
-| 8 | Firefox cookies | extractCookies with firefox returns array | Explicit browser parameter returns proper array |
-
----
|
|
154
|
-
|
|
-## Manual validation (v0.4.0 features)
-
-Features added in v0.4.0 are manually validated but not yet in the automated test suite. See `docs/03-logs/validation-log.md` for full results.
-
-| Feature | Validation method | Result |
-|---------|-------------------|--------|
-| `back` / `forward` | example.com → wikipedia → back → forward | ok |
-| `upload <ref> <files..>` | data: URL with file input, verified onchange fired | ok |
-| `pdf` | Wikipedia export, 200KB PDF | ok |
-| `tabs` | Listed 2 tabs with urls/titles | ok |
-| `wait-for --text` | Found "Wikipedia" text | ok |
-| `wait-for --selector` | Found `body` selector | ok |
-| `dialog-log` | alert() auto-dismissed, 1 entry logged | ok |
-| `save-state` | 2.8KB cookies + localStorage JSON | ok |
-| `--viewport=WxH` | 800x600, confirmed via innerWidth/innerHeight | ok |
-| `drag` | Wired through all layers, needs drag UI to visually test |
-| `--proxy` | Wired to Chromium launch arg, needs proxy to test |
-| `--storage-state` | Wired to Network.setCookies, loads from save-state output |
-
----
-
-## Writing new tests
-
-Follow the existing pattern:
-
-```javascript
-import { describe, it } from 'node:test';
-import assert from 'node:assert/strict';
-import { connect } from '../../src/index.js';
-
-describe('my feature', () => {
-  it('does the thing', async () => {
-    const page = await connect();
-    try {
-      await page.goto('https://example.com');
-      const snap = await page.snapshot();
-      assert.ok(snap.includes('Example Domain'));
-    } finally {
-      await page.close();
-    }
-  });
-});
-```
-
-Key conventions:
-- Always `page.close()` in a `finally` block to avoid leaked browser processes
-- Use `data:` URL fixtures for deterministic tests (no network dependency)
-- Real-site tests go in interact.test.js, grouped by site in `describe()` blocks
-- Use `assert.ok()` and `assert.strictEqual()` from `node:assert/strict`
-- No test framework dependencies -- `node:test` only
-
-### Data URL fixture pattern
-
-For testing interactions without network:
-
-```javascript
-const FIXTURE = `data:text/html,${encodeURIComponent(`
-  <html><body>
-    <button onclick="document.getElementById('r').textContent='clicked'">Click Me</button>
-    <div id="r"></div>
-  </body></html>
-`)}`;
-
-it('clicks the button', async () => {
-  const page = await connect();
-  try {
-    await page.goto(FIXTURE);
-    const snap = await page.snapshot({ mode: 'browse' });
-    const ref = findRef(snap, 'button', 'Click Me');
-    await page.click(ref);
-    const snap2 = await page.snapshot({ mode: 'browse' });
-    assert.ok(snap2.includes('clicked'));
-  } finally {
-    await page.close();
-  }
-});
-```
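The fixture test above calls `findRef`, which this excerpt never defines or imports. The following is a minimal sketch of what such a helper might look like. It assumes, without confirmation from the source, that a snapshot is plain text with one node per line in a shape like `- button "Click Me" [ref=2]`; both the line format and the helper body are hypothetical and would need adjusting to the real snapshot output.

```javascript
// Hypothetical helper: the snapshot line format matched here is an assumption,
// not barebrowse's documented output.
function findRef(snapshot, role, name) {
  for (const line of snapshot.split('\n')) {
    // Assumed shape: optional "- " bullet, a role word, a quoted name, then [ref=ID].
    const m = line.match(/^\s*(?:-\s*)?(\w+)\s+"([^"]*)".*\[ref=([^\]]+)\]/);
    if (m && m[1] === role && m[2] === name) return m[3];
  }
  return null; // no node with that role and name
}

// Usage against a hand-written sample in the assumed format.
const sample = [
  '- heading "Demo" [ref=1]',
  '- button "Click Me" [ref=2]',
  '- button "Cancel" [ref=3]',
].join('\n');

console.log(findRef(sample, 'button', 'Click Me'));
```

Returning `null` rather than throwing lets a test decide how to fail, for example with `assert.ok(ref)` before calling `page.click(ref)`.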
-
----
-
-## CI considerations
-
-- Unit tests: fast, no network, always safe to run
-- Integration tests: need Chromium installed, no network (uses example.com/HN but tolerates failures)
-- E2E tests: need internet, may be flaky (sites change, rate limits, geo-blocks)
-- Recommended CI split: run unit + integration always, E2E on manual trigger or nightly
-- Each test launches/kills its own browser instance -- no shared state between tests
-- Auth tests may skip when Chromium profile is locked by a running instance
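The recommended CI split in the list above (unit and integration on every run, E2E only on a manual or nightly trigger) can be sketched as a small selection function. The suite names and trigger labels below are hypothetical placeholders, not values from the package's actual CI configuration.

```javascript
// Hypothetical trigger labels and suite names; adjust to the real CI setup.
function suitesFor(trigger) {
  // Unit and integration suites run on every trigger.
  const suites = ['unit', 'integration'];
  // E2E needs internet and may be flaky, so it is opt-in.
  if (trigger === 'manual' || trigger === 'nightly') {
    suites.push('e2e');
  }
  return suites;
}

console.log(suitesFor('push'));
console.log(suitesFor('nightly'));
```

A CI script could map each returned name to a set of test files and pass them to `node --test`.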
package/docs/README.md
DELETED
@@ -1,56 +0,0 @@
-# barebrowse -- Documentation
-
-## Navigation
-
-### 00-context/ -- Why and what exists
-
-| File | What's in it |
-|------|-------------|
-| [vision.md](00-context/vision.md) | What barebrowse is, what it's not, the core insight, success criteria |
-| [assumptions.md](00-context/assumptions.md) | Hard constraints, assumptions, known limitations, risks |
-| [system-state.md](00-context/system-state.md) | Current architecture, full pipeline, module table, capabilities, tested sites |
-
-### 01-product/ -- What the product must do
-
-| File | What's in it |
-|------|-------------|
-| [prd.md](01-product/prd.md) | Product requirements, API design, three modes, pruning strategy, future features |
-
-### 02-features/ -- How features are designed
-
-*Feature-specific docs go here as the project grows.*
-
-### 03-logs/ -- What changed over time
-
-| File | What's in it |
-|------|-------------|
-| [decisions-log.md](03-logs/decisions-log.md) | Settled design decisions with rationale (don't re-debate these) |
-| [implementation-log.md](03-logs/implementation-log.md) | What changed per version (summary of CHANGELOG) |
-| [bug-log.md](03-logs/bug-log.md) | Bugs: symptom, root cause, fix, regression test |
-| [validation-log.md](03-logs/validation-log.md) | Test suite results, site validation matrix, token reduction measurements |
-| [insights.md](03-logs/insights.md) | Lessons learned, repos studied, technical patterns |
-
-### 04-process/ -- How to work with this system
-
-| File | What's in it |
-|------|-------------|
-| [dev-workflow.md](04-process/dev-workflow.md) | Dev rules, dependency hierarchy, running tests, environment setup |
-| [definition-of-done.md](04-process/definition-of-done.md) | Checklist: when is a feature/fix actually done |
-| [testing.md](04-process/testing.md) | Test pyramid, all 64 tests documented, writing new tests, CI strategy |
-
-### archive/ -- Historical docs
-
-| File | Why archived |
-|------|-------------|
-| [poc-plan.md](archive/poc-plan.md) | All 4 POC phases completed. Useful bits migrated to system-state.md and testing.md. |
-
-## Also at project root
-
-| File | Purpose |
-|------|---------|
-| `README.md` | Public-facing project overview |
-| `barebrowse.context.md` | LLM-consumable integration guide (full API, gotchas, wiring) |
-| `commands/barebrowse.md` | CLI command reference for any agent (same as SKILL.md without frontmatter) |
-| `commands/barebrowse/SKILL.md` | CLI command reference for Claude Code (copy to `.claude/skills/`) |
-| `CHANGELOG.md` | Detailed version-by-version changelog |
-| `CLAUDE.md` | AI agent instructions for this project |