onecrawl 4.0.0-alpha.37
- package/README.md +77 -0
- package/assets/.github/copilot-instructions.md +44 -0
- package/assets/AGENTS.copilot-cli.MD +113 -0
- package/assets/AGENTS.md +66 -0
- package/assets/AGENTS.vscode.MD +113 -0
- package/assets/skills/README.md +35 -0
- package/assets/skills/breaking-change-paths/SKILL.md +31 -0
- package/assets/skills/completion-gate/SKILL.md +35 -0
- package/assets/skills/e2e-testing/SKILL.md +30 -0
- package/assets/skills/github-sync/SKILL.md +39 -0
- package/assets/skills/interaction-loop/SKILL.md +34 -0
- package/assets/skills/onecrawl-commands/SKILL.md +447 -0
- package/assets/skills/planning-tracking/SKILL.md +37 -0
- package/assets/skills/policy-coherence-audit/SKILL.md +29 -0
- package/assets/skills/programmatic-tool-calling/SKILL.md +41 -0
- package/assets/skills/rollback-rca/SKILL.md +36 -0
- package/assets/skills/session-logging/SKILL.md +39 -0
- package/assets/skills/systematic-debugging/SKILL.md +114 -0
- package/assets/skills/testing-policy/SKILL.md +37 -0
- package/bin/cli.js +81 -0
- package/lib/index.js +2 -0
- package/lib/skills.js +167 -0
- package/package.json +44 -0
- package/scripts/postinstall.js +115 -0
@@ -0,0 +1,447 @@
---
name: onecrawl-commands
description: "Complete guide to OneCrawl CLI commands. Use primitives first, eval only as fallback."
---
# OneCrawl Commands Skill

## Core Principle

**Always use OneCrawl primitives first. Use `eval` only as a last resort.**

OneCrawl provides 500+ typed, validated CLI commands and 546 MCP tool actions
across 18 tools. Each one is faster, safer, and more debuggable than raw JavaScript
evaluation. Reserve `eval` for cases where no primitive exists.

## Decision Flowchart

```
Need to interact with a page?
├─ Navigation?        → navigate, back, forward, reload
├─ Read content?      → get text/html/url/title/value/attr/count
├─ Click something?   → click, find text/role/label <target> click
├─ Fill a form?       → fill, type, select-option, check/uncheck
├─ Wait for state?    → wait-for-selector, wait-for-text, wait-for-load
├─ Take screenshot?   → screenshot --full, screenshot --element <sel>
├─ Extract data?      → extract content json, extract metadata
├─ Cookie ops?        → cookie get/set/delete/export/import
├─ Auth state?        → auth-state save/load, account export/import
├─ Passkeys?          → auth passkey-enable/register, auth vault-list
├─ Network control?   → network block, throttle, route, intercept
├─ HAR/traffic?       → har start/drain/export, network-log start/drain
├─ Domain blocking?   → domain block/unblock/stats/list/categories
├─ Proxy?             → proxy create-pool, proxy-health check/rank
├─ Coverage?          → coverage js-start/stop, css-start/report
├─ Performance?       → perf trace-start/stop, metrics, timing
├─ Accessibility?     → a11y tree/element/audit
├─ Workers/iframes?   → worker list, iframe list/eval/content
├─ Page changes?      → page-watcher start/drain/stop/state
├─ Visual diff?       → diff screenshot/url, screenshot-diff
├─ HTTP requests?     → request execute/batch
├─ Stealth?           → stealth inject, stealth detection-audit
├─ None of the above? → eval "<js expression>" ← LAST RESORT
```

## Command Categories

### 1. Session Management (start here)
```bash
session start                        # Default: daemon mode, normal Chrome
session start --engine lightpanda    # Lightpanda browser engine
session start --session work         # Named session (multi-session)
session start -H                     # Headless mode
session start --shared-browser       # Incognito context in parent session
session start --import-passkey FILE  # Auto-inject passkeys on connect
session info                         # Show active session
session close                        # Close session
```

**Multi-agent isolation** (enabled by default via `session_auto_isolate`):
- Each agent auto-gets a unique session name (e.g. `default-2`, `default-3`)
- Session files use advisory locks (fcntl) to prevent corruption
- Session writes are atomic (tmp + rename pattern)
- `--shared-browser` creates an isolated BrowserContext in a parent session
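
The auto-naming behaviour can be pictured as picking the first free suffix. The sketch below is illustrative only: `nextSessionName` is a hypothetical helper, and it assumes names follow the `default-2`, `default-3` pattern from the example above. OneCrawl's actual allocation logic may differ.

```typescript
// Hypothetical helper (not part of OneCrawl): choose the first unused session
// name for a new agent, assuming the "<base>", "<base>-2", "<base>-3" pattern.
function nextSessionName(base: string, taken: string[]): string {
  if (taken.indexOf(base) === -1) {
    return base; // the plain base name is still free
  }
  let suffix = 2;
  while (taken.indexOf(`${base}-${suffix}`) !== -1) {
    suffix += 1; // skip names already held by other agents
  }
  return `${base}-${suffix}`;
}
```

For example, with `default` and `default-2` already taken, the helper returns `default-3`.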

### 2. Navigation (most common)
```bash
navigate <url>              # Go to URL
navigate <url> --wait 2000  # Wait 2s after load
navigate <url> --wait-cf    # Wait for Cloudflare challenge
back                        # Browser back
forward                     # Browser forward
reload                      # Reload page
```

### 3. Content Reading (prefer over eval)
```bash
get text <selector>         # Get text content ← USE THIS, not eval
get html <selector>         # Get innerHTML
get url                     # Current URL
get title                   # Page title
get value <selector>        # Input value
get attr <selector> <name>  # Element attribute
get count <selector>        # Count matching elements
get styles <selector>       # Computed styles
```

**Why not eval?** `get text h1` is typed, validated, and returns clean output.
`eval "document.querySelector('h1').textContent"` can throw, returns raw JSON,
and is harder to debug.

### 4. Element Interaction (typed and safe)
```bash
click <selector>                # Click
dblclick <selector>             # Double-click
fill <selector> <text>          # Fill input (clears first)
type <selector> <text>          # Type into element (appends)
hover <selector>                # Mouse hover
focus <selector>                # Focus element
scroll-into-view <selector>     # Scroll to element
check <selector>                # Check checkbox
uncheck <selector>              # Uncheck checkbox
select-option <selector> <val>  # Select dropdown
upload <selector> <file>        # Upload file
drag <from_sel> <to_sel>        # Drag and drop
```

### 5. Smart Finders (no CSS selectors needed)
```bash
find text "Submit" click                    # Click by visible text
find role button click --name "Login"       # Click by ARIA role
find label "Email" fill "user@example.com"  # Fill by label
find placeholder "Search..." fill "query"   # Fill by placeholder
find test-id "submit-btn" click             # Click by data-testid
find first ".item" click                    # First matching element
find nth 3 ".item" click                    # Nth element (0-based)
```

**Why finders?** They mirror how users think about the page — by visible text,
labels, and roles — not by CSS internals that break on refactors.

### 6. Waiting (explicit, not sleep)
```bash
wait-for-selector <selector>             # Wait for element
wait-for-selector <sel> --timeout 10000  # Custom timeout
wait-for-text <text>                     # Wait for text to appear
wait-for-url <url_part>                  # Wait for navigation
wait-for-load networkidle                # Wait for network idle
wait-for-function "window.ready"         # Wait for JS condition
wait 1000                                # Raw delay (avoid when possible)
```

### 7. Screenshots & Visual
```bash
screenshot                       # Viewport screenshot
screenshot --full                # Full page
screenshot --element <selector>  # Element only
screenshot --output /path.png    # Save to file
screenshot --annotate            # With element annotations
pdf --output page.pdf            # Save as PDF
```

### 8. Cookie Management
```bash
cookie get                    # List all cookies (CDP, includes httpOnly)
cookie get --name session_id  # Specific cookie
cookie set name value --domain .example.com
cookie delete name .example.com
cookie clear                  # Clear all
cookie export --output cookies.json
cookie import cookies.json
```

### 9. Auth State Persistence
```bash
auth-state save <name>   # Save cookies + localStorage + sessionStorage + URL (v2)
auth-state load <name>   # Restore full auth state via CDP
auth-state list          # Show all saved states
auth-state show <name>   # Display JSON content
auth-state rename <old> <new>
auth-state clear <name>  # Delete specific
auth-state clean         # Delete all
```

**Auth state v2 format** stores full CDP cookies (including httpOnly, SameSite, secure)
plus localStorage, sessionStorage, and the page URL. Auto-detects and handles v1 files.

### 10. Account Management (portable bundles)
```bash
account export <name>                   # Bundle auth state + passkeys
account export <name> --auth-state alt  # Use different auth state name
account export <name> --rp-id x.com     # Export only specific site passkeys
account import /path/to/bundle.json     # Restore auth state + merge passkeys
account list                            # Show all account bundles
account show <name>                     # Display bundle details
account delete <name>                   # Remove a bundle
```

**Account bundles** combine auth state (cookies + localStorage + sessionStorage + URL)
with passkey credentials into a single portable JSON file at `~/.onecrawl/accounts/`.

### 11. Passkey & WebAuthn (CDP-only, real ECDSA signatures)
```bash
auth passkey-enable                                # Enable CDP virtual authenticator
auth passkey-register --output /tmp/passkeys.json  # Watch + export credentials
auth passkey-add --credential-id <b64> --rp-id <domain>
auth passkey-list                                  # List active credentials
auth passkey-log                                   # Show WebAuthn operation log
auth passkey-disable                               # Disable virtual authenticator
auth passkey-remove --credential-id <b64>
auth passkey-set-file --file <path>                # Auto-inject on reconnect

# Vault (persistent multi-site storage)
auth vault-list                                    # List all sites + credential counts
auth vault-save --input <json>                     # Save passkeys to vault
auth vault-export --rp-id x.com --output /tmp/out.json
auth vault-remove --credential-id <b64>
auth vault-clear-site --rp-id x.com

# Import from password managers
auth import-bitwarden --input export.json          # Bitwarden JSON export
auth import-one-password --input export.data       # 1Password .1pux extract
auth import-cxf --input cxf.json                   # FIDO Alliance CXF format
```

**Architecture**: All passkey operations use real Chrome CDP WebAuthn domain
(not JS injection). ECDSA P-256 signatures are valid for server verification.
Vault stored at `~/.onecrawl/passkeys/vault.json` with atomic writes.

### 12. Stealth & Anti-Detection
```bash
stealth inject               # Full stealth patch suite (auto on session start)
stealth detection-audit      # Run comprehensive bot detection check
stealth tls-apply <profile>  # TLS fingerprint: chrome, firefox, safari, edge
stealth webrtc-block         # Prevent IP leaks via WebRTC
stealth battery-spoof        # Spoof BatteryManager API
stealth sensor-block         # Block motion/orientation sensors
stealth canvas-advanced      # Gaussian canvas noise (not 1-bit XOR)
stealth timezone-sync <tz>   # Sync Date + Intl + navigator timezone
stealth font-protect         # Block font fingerprinting
stealth behavior-sim         # Start continuous human behavior simulation
stealth behavior-stop        # Stop simulation
stealth stealth-rotate       # Auto-rotate fingerprint on domain change
```

**Stealth is injected automatically** on every new page/tab. 12 patches cover:
navigator.webdriver, chrome.runtime, plugins, WebGL, permissions, toString,
iframe isolation, dimensions, hardware concurrency, language, platform, codecs.

### 13. Network Control
```bash
network block image,font              # Block resource types
throttle set 3g                       # Throttle to 3G
route "*.analytics.com" --block       # Block specific domains
intercept set '<rules_json>'          # Custom interception
har start                             # Record HAR
har drain                             # Drain HAR entries
har export --output traffic.har       # Export HAR
network-log start                     # Start structured network logging
network-log drain                     # Get captured entries
network-log summary                   # Traffic summary (by type, domain, size)
network-log stop                      # Stop logging
network-log export --output log.json  # Export to file
ws start                              # Start WebSocket capture
ws drain                              # Get captured frames
ws export --output ws.json            # Export frames
ws connections                        # Count active WS connections
```

### 14. Domain Blocking
```bash
domain block example.com analytics.io  # Block specific domains
domain block-category ads              # Block by category (ads, analytics, social, trackers)
domain unblock                         # Clear all blocks
domain stats                           # Blocked request statistics
domain list                            # List all blocked domains
domain categories                      # Show available categories + domain counts
```

### 15. Proxy Management
```bash
proxy create-pool '<json>'       # Create proxy rotation pool
proxy chrome-args '<json>'       # Get Chrome args for proxy
proxy next '<json>'              # Get next proxy from pool
proxy-health check <url>         # Check single proxy health
proxy-health check-all '<json>'  # Batch health check
proxy-health rank '<json>'       # Rank by score
proxy-health filter '<json>'     # Filter healthy proxies
```

### 16. Accessibility & Debugging
```bash
a11y tree                       # Full accessibility tree
a11y element <selector>         # Element accessibility info
a11y audit                      # Accessibility audit (issues + recommendations)
errors                          # Page errors
console start && console drain  # Console messages
highlight <selector>            # Visual highlight (3s red outline)
snapshot agent --compact        # Agent-readable page snapshot
```

### 17. Code Coverage
```bash
coverage js-start    # Start JS coverage collection
coverage js-stop     # Stop + get coverage report
coverage css-start   # Start CSS coverage collection
coverage css-report  # Get CSS coverage data
```

### 18. Performance Profiling
```bash
perf trace-start  # Start CDP performance tracing
perf trace-stop   # Stop + get trace data
perf metrics      # Get real-time performance metrics
perf timing       # Navigation timing breakdown
perf resources    # Resource timing for all loaded assets
```

### 19. Service Workers & Iframes
```bash
worker list        # List active service workers
worker unregister  # Unregister all service workers
worker info        # Detailed worker information

iframe list                   # List all iframes with URLs
iframe eval <index> '<expr>'  # Evaluate JS in specific iframe
iframe content <index>        # Get iframe HTML content
```

### 20. Page Monitoring
```bash
page-watcher start  # Start DOM change tracking
page-watcher drain  # Get captured changes since last drain
page-watcher stop   # Stop watching
page-watcher state  # Get current page state snapshot
```

### 21. Print & PDF Export
```bash
print pdf --output doc.pdf         # Export page as PDF
print pdf --landscape --scale 0.8  # Landscape, custom scale
print metrics                      # Get print layout metrics
```

### 22. Benchmarks & Rate Limiting
```bash
bench run --iterations 20  # Run CDP benchmarks
bench report               # Format benchmark results

rate-limit set --preset cautious  # Apply rate limit preset
rate-limit stats                  # Current rate limit state
rate-limit reset                  # Reset counters
```

### 23. Visual Diff & Screenshots
```bash
diff snapshot                              # Take accessibility snapshot for diffing
diff screenshot <baseline>                 # Compare current page vs baseline screenshot
diff url <url1> <url2>                     # Compare two URLs (DOM diff)
screenshot-diff compare <base> <current>   # Pixel diff two images
screenshot-diff regression <baseline_dir>  # Visual regression test
```

### 24. HTTP Requests (in-browser)
```bash
request execute '<json>'     # Execute HTTP request via browser fetch
request batch '<json>'       # Batch multiple requests
fingerprint apply <profile>  # Apply fingerprint profile
fingerprint detect           # Detect current browser fingerprint
```

### 25. Multi-Session & Daemon
```bash
daemon start   # Start daemon (auto on session start)
daemon exec goto url=https://example.com --session linkedin
daemon exec evaluate expression="1+1" --session work
daemon status  # Daemon health + session list
daemon stop    # Stop daemon
```

### 26. Agent Automation
```bash
agent auto "<goal>"         # Autonomous goal execution
agent auto "find the login button and click it" --max-steps 5
agent loop                  # Observation monitor with goal verification
agent think                 # Analyze page state, recommend actions
agent chain "<js actions>"  # Execute pre-written action sequences
agent observe               # Get annotated page state with coordinates
```

## Anti-Patterns (Don't Do This)

| ❌ Bad (eval) | ✅ Good (primitive) |
|---|---|
| `eval "document.title"` | `get title` |
| `eval "document.querySelector('#btn').click()"` | `click #btn` |
| `eval "document.querySelector('input').value = 'text'"` | `fill input text` |
| `eval "document.querySelectorAll('li').length"` | `get count li` |
| `eval "window.location.href"` | `get url` |
| `eval "document.cookie"` | `cookie get` |
| `eval "window.scrollTo(0, 999)"` | `scroll down 999` |
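
The table can also be read as a lookup: given an `eval` expression, check whether a primitive already covers it. The sketch below is illustrative and not part of OneCrawl; `suggestPrimitive` and `EVAL_TO_PRIMITIVE` are hypothetical names, and only a few rows of the table are encoded.

```typescript
// Illustrative lookup built from rows of the anti-pattern table (hypothetical helper).
const EVAL_TO_PRIMITIVE: Record<string, string> = {
  "document.title": "get title",
  "window.location.href": "get url",
  "document.cookie": "cookie get",
};

// Return the primitive to use instead of `eval`, or null when no primitive is known.
function suggestPrimitive(expression: string): string | null {
  const trimmed = expression.trim();
  const direct = EVAL_TO_PRIMITIVE[trimmed];
  if (typeof direct === "string") {
    return direct;
  }

  // document.querySelector('<sel>').click()  ->  click <sel>
  const click = /^document\.querySelector\('([^']+)'\)\.click\(\)$/.exec(trimmed);
  if (click) {
    return `click ${click[1]}`;
  }

  // document.querySelectorAll('<sel>').length  ->  get count <sel>
  const count = /^document\.querySelectorAll\('([^']+)'\)\.length$/.exec(trimmed);
  if (count) {
    return `get count ${count[1]}`;
  }

  return null; // nothing matched: eval remains the last resort
}
```

A `null` result corresponds to the cases where `eval` is appropriate, such as custom business logic.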

## When eval IS appropriate

- Custom business logic: `eval "calculateTotal(items)"`
- Complex DOM traversal not covered by selectors
- Reading `window.*` globals not exposed by primitives
- Injecting scripts for testing (mocking APIs, etc.)

## Configuration

OneCrawl loads defaults from `~/.onecrawl/config.toml`:

```toml
engine = "chrome"              # or "lightpanda"
headless = false
daemon = true                  # daemon mode by default
daemon_headless = true
session_name = "default"
session_auto_isolate = true    # auto-unique session names per agent
persist_cookies = ""           # auto-persist path, empty = disabled
chrome_profile = ""            # empty = auto (~/.onecrawl/chrome-profile/)
user_agent = ""                # empty = auto
daemon_idle_timeout = 1800     # 30 minutes
daemon_max_sessions = 8
daemon_pool_size = 0           # pre-warmed sessions (0 = disabled)
daemon_rate_limit = 0          # per-second (0 = unlimited)
```

CLI flags always override config values.
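
The override rule amounts to a layered merge. A minimal sketch, assuming three layers (built-in defaults, `config.toml`, CLI flags); the `Settings` type covers only a subset of the keys, and `resolveSettings` is a hypothetical name rather than OneCrawl's own resolver.

```typescript
// Subset of the config keys, for illustration only.
interface Settings {
  engine: "chrome" | "lightpanda";
  headless: boolean;
  session_name: string;
}

// Hypothetical resolver: built-in defaults < config.toml < CLI flags.
function resolveSettings(
  defaults: Settings,
  fromConfig: Partial<Settings>,
  fromFlags: Partial<Settings>,
): Settings {
  // Later spreads win, so flags override config, which overrides defaults.
  return { ...defaults, ...fromConfig, ...fromFlags };
}
```

With `headless = true` in the config file and a flag that sets it back to `false`, the resolved value is `false`, while keys the flags do not mention keep their config values.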

## MCP Tool Reference (546 actions, 18 tools)

When using OneCrawl via MCP (Model Context Protocol), all actions are available
through `onecrawl run <tool> <action> --json`. The tools are:

| Tool | Actions | Description |
|------|---------|-------------|
| `browser` | 196 | Navigation, interaction, extraction, network, coverage, performance, accessibility |
| `agent` | 111 | Autonomous agent, skills, memory, planning, auto-login |
| `secure` | 40 | Auth state, account bundles, passkey vault, imports, stealth |
| `data` | 27 | Pipeline processing, export, transform |
| `automate` | 27 | Scheduler, retry queues, session pool |
| `stealth` | 25 | Bot detection, CAPTCHA, fingerprint, stealth patches |
| `computer` | 24 | Computer-use API, screen coordinates, mouse/keyboard |
| `daemon` | 22 | Daemon control, session management, pool config |
| `crawl` | 5 | Spider, robots.txt, sitemap, DOM snapshots |
| `vault` | 9 | Encrypted KV storage, PKCE, TOTP |
| `plugin` | 9 | Plugin management |
| `memory` | 6 | Long-term agent memory |
| `perf` | 8 | Performance monitoring, budgets, regressions |
| `durable` | 8 | Crash-safe sessions |
| `reactor` | 8 | Event-driven automation rules |
| `events` | 8 | Event bus, webhooks |
| `studio` | 8 | Recording, playback |
| `orchestrator` | 5 | Multi-agent coordination |
| `vision` | — | Computer vision (experimental) |
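
As a quick consistency check, the per-tool counts can be summed against the heading. The sketch below copies the counts from the table; `vision` lists no count, so it is left out, which is why 18 tools are tallied. The constant names are illustrative.

```typescript
// Action counts copied from the table (`vision` has no count and is omitted).
const TOOL_ACTIONS: [string, number][] = [
  ["browser", 196], ["agent", 111], ["secure", 40], ["data", 27],
  ["automate", 27], ["stealth", 25], ["computer", 24], ["daemon", 22],
  ["crawl", 5], ["vault", 9], ["plugin", 9], ["memory", 6],
  ["perf", 8], ["durable", 8], ["reactor", 8], ["events", 8],
  ["studio", 8], ["orchestrator", 5],
];

// Number of tools with a published count, and the sum of their actions.
const toolCount = TOOL_ACTIONS.length;
const totalActions = TOOL_ACTIONS.reduce((sum, [, actions]) => sum + actions, 0);
```

Here `toolCount` is 18 and `totalActions` is 546, matching the section heading.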

## File Layout

```
~/.onecrawl/
├── config.toml              # Global config
├── auth-states/{name}.json  # Auth state snapshots (v2: CDP cookies + storage)
├── accounts/{name}.json     # Account bundles (auth state + passkeys)
├── passkeys/vault.json      # Multi-site passkey vault (rp_id → credentials)
├── chrome-profile/          # Persistent Chrome profile
└── profiles/{name}/         # Named browser profiles
```
@@ -0,0 +1,37 @@
---
name: planning-tracking
description: "Execution plan with milestone/issue hierarchy, explicit dependencies, and safe parallelism."
---
# Planning & Tracking Skill

## Purpose
Create and maintain an execution plan with milestone/issue hierarchy, explicit dependencies, and safe parallelism.

## Use when
- Starting any non-trivial task
- Scope changes during execution
- Multiple files/workstreams must be coordinated

## Mandatory Schema
```typescript
interface Plan { PRD: string; context: string; milestones: Record<string, Milestone>; }
interface Milestone { id: string; description: string; priority: "critical"|"high"|"medium"|"low"; status: "todo"|"in_progress"|"review"|"done"; depends_on: string[]; issues: Record<string, Issue>; }
interface Issue { id: string; task: string; priority: "critical"|"high"|"medium"|"low"; status: "todo"|"in_progress"|"review"|"done"|"blocked"; depends_on: string[]; children: Record<string, Issue>; }
```

## Procedure
1. Build plan before implementation.
2. Assign unique IDs to milestones/issues.
3. Declare dependencies for every issue (`depends_on`).
4. Execute by dependency order, then priority (`critical` → `high` → `medium` → `low`).
5. Run independent same-priority milestones in parallel when safe.
6. Update statuses continuously and append concise progress summaries.
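
Step 4 (dependency order first, then priority) can be sketched as a small scheduler. This is one possible reading, not a prescribed algorithm: `Item` keeps only the schema fields that ordering needs, and `executionOrder` is a hypothetical helper.

```typescript
type Priority = "critical" | "high" | "medium" | "low";

// Minimal view of a milestone or issue: only the fields the ordering needs.
interface Item {
  id: string;
  priority: Priority;
  depends_on: string[];
}

const PRIORITY_RANK: Record<Priority, number> = { critical: 0, high: 1, medium: 2, low: 3 };

// Order items so dependencies come first; among ready items, higher priority wins.
function executionOrder(items: Item[]): string[] {
  const done: string[] = [];
  let pending = items.slice();
  while (pending.length > 0) {
    // An item is ready once every dependency has already been scheduled.
    const ready = pending.filter((item) =>
      item.depends_on.every((dep) => done.indexOf(dep) !== -1),
    );
    if (ready.length === 0) {
      throw new Error("Dependency cycle or unknown dependency in plan");
    }
    ready.sort((a, b) => PRIORITY_RANK[a.priority] - PRIORITY_RANK[b.priority]);
    const next = ready[0];
    done.push(next.id);
    pending = pending.filter((item) => item.id !== next.id);
  }
  return done;
}
```

The `ready` set in each round is also the set of items with no unmet dependencies, which makes it the natural candidate pool for the parallel execution mentioned in step 5.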

## Done Criteria
- Plan exists, is up-to-date, and reflects actual execution state.
- Dependencies are respected and parallel work is safe.

## Anti-patterns
- Starting implementation without a plan
- Missing dependency declarations
- Running blocked items in parallel
@@ -0,0 +1,29 @@
---
name: policy-coherence-audit
description: "Detect and remove contradictions across agent policies before execution."
---
# Policy Coherence Audit Skill

## Purpose
Detect and remove contradictions across agent policies before execution.

## Use when
- Updating `AGENTS.MD`
- Merging new workflow rules
- Noticing behavioral ambiguity during execution

## Checklist
- Language coherence: English-only wording.
- Interaction coherence: one question + 5-option model is consistently respected.
- Gate coherence: completion gates apply to both Non-Breaking and Breaking paths.
- Scope coherence: avoid wording that causes uncontrolled scope creep.
- Reference coherence: every mentioned skill path exists.
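
The reference-coherence item lends itself to an automated check. A minimal sketch, assuming skills are referenced as `skills/<name>/SKILL.md` (the layout in this package's file list); `missingSkillPaths` is a hypothetical helper, and the list of existing paths is supplied by the caller.

```typescript
// Find `skills/<name>/SKILL.md` mentions in a policy text and report the ones
// that are not in the list of files that actually exist (hypothetical helper).
function missingSkillPaths(policyText: string, existingPaths: string[]): string[] {
  const pattern = /skills\/[a-z0-9-]+\/SKILL\.md/g;
  const mentioned: string[] = policyText.match(pattern) || [];
  const missing: string[] = [];
  for (const path of mentioned) {
    // Report each missing path once, even if it is mentioned several times.
    if (existingPaths.indexOf(path) === -1 && missing.indexOf(path) === -1) {
      missing.push(path);
    }
  }
  return missing;
}
```

An empty result means every mentioned skill path resolves to an existing file.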

## Short examples
- Fix mixed language term: "TASSATIVO" -> "MANDATORY".
- Fix model mismatch: "propose one option" -> explicit 5-option decision set.

## Anti-patterns
- Leaving ambiguous precedence between structural and surgical strategies
- Contradictory clauses in different sections
- Referencing non-existent skill files
@@ -0,0 +1,41 @@
---
name: programmatic-tool-calling
description: "Multi-step tool workflows via code orchestration to reduce latency, context pollution, and token overhead."
---
# Programmatic Tool Calling Skill (Model-Agnostic)

## Purpose
Execute multi-step tool workflows via code orchestration to reduce latency, context pollution, and token overhead.

## Use when
- 3+ dependent tool calls
- Large intermediate outputs (logs, tables, files)
- Branching logic, retries, or fan-out/fan-in workflows

## Core Idea
Treat tools as callable functions inside an orchestration runtime (script/runner), not as one-turn-at-a-time chat actions.

## Procedure
1. Generate/execute orchestration code for loops, conditionals, parallel calls, retries, and early termination.
2. Process intermediate data in runtime (filter/aggregate/transform) instead of returning raw data to model context.
3. Return only high-signal outputs to the model (summary, decision, artifact references).
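
The three steps can be sketched as follows. This is a simplified illustration: tools are modeled as synchronous functions (real tools would usually be async), and `callWithRetry`, `summarizeErrors`, and the log-fetching tool are hypothetical names.

```typescript
// A tool is modeled as a plain function; real tools would usually be async.
type Tool<I, O> = (input: I) => O;

// Call a tool with a bounded number of attempts (never an unbounded loop).
function callWithRetry<I, O>(tool: Tool<I, O>, input: I, maxAttempts: number): O {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      return tool(input);
    } catch (error) {
      lastError = error; // remember the failure and try again
    }
  }
  throw lastError;
}

interface Summary {
  pagesChecked: number;
  pagesWithErrors: string[];
}

// Fan out over inputs, keep raw results inside the runtime, return only a summary.
function summarizeErrors(urls: string[], fetchLog: Tool<string, string[]>): Summary {
  const pagesWithErrors: string[] = [];
  for (const url of urls) {
    const lines = callWithRetry(fetchLog, url, 3);
    // Filter in the runtime: only the fact that errors exist leaves this loop.
    if (lines.some((line) => line.indexOf("ERROR") !== -1)) {
      pagesWithErrors.push(url);
    }
  }
  return { pagesChecked: urls.length, pagesWithErrors };
}
```

The raw log lines never reach the model; only the `Summary` object does, which is the point of step 3.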

## Why It Works (provider/model independent)
- Fewer model round-trips for multi-call workflows.
- Intermediate data stays out of context unless needed.
- Explicit code control flow is easier to test, monitor, and debug.

## Guardrails
- Strict input/output schemas.
- Validate tool results before use.
- Idempotent/retry-safe tool design when possible.
- Timeout/cancellation/expiry handling.
- Sandbox execution for untrusted code; never blindly execute external payloads.

## Done Criteria
- Workflow completes with reduced context load and deterministic control flow.

## Anti-patterns
- Returning raw intermediate payloads to the model by default
- Unbounded loops without stop conditions
- Executing unvalidated tool output
@@ -0,0 +1,36 @@
---
name: rollback-rca
description: "Stop ineffective iteration loops and choose a controlled recovery path after repeated gate failures."
---
# Rollback & RCA Skill

## Purpose
Stop ineffective iteration loops and choose a controlled recovery path after repeated gate failures.

## Use when
- An issue fails completion gate 3 consecutive times
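
The trigger condition is easy to state in code. A minimal sketch, assuming gate results are recorded oldest-first as booleans; `needsRca` is a hypothetical helper.

```typescript
// True when the most recent `threshold` gate runs all failed.
// `history` holds gate results oldest-first: true = passed, false = failed.
function needsRca(history: boolean[], threshold: number = 3): boolean {
  if (history.length < threshold) {
    return false; // not enough runs yet to have `threshold` consecutive failures
  }
  const recent = history.slice(-threshold);
  return recent.every((passed) => !passed);
}
```

A single pass inside the last three runs resets the streak, so `[false, true, false, false]` does not trigger the skill.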

## Procedure
1. Stop work on the issue immediately.
2. Run root cause analysis:
   - Architecture mismatch?
   - Dependency/environment problem?
   - Scope too broad?
3. Present options via `ask_user`:
   - Rescope (split into smaller sub-issues)
   - Rollback (revert to last known good commit)
   - Redesign (change architecture)
4. Execute selected path and document rationale in session log.

## Rollback Rules
- Keep history clean and revertible.
- Roll back only to the last commit that passed gate.
- Record rollback cause and follow-up plan.

## Done Criteria
- Issue has an approved recovery path and documented rationale.

## Anti-patterns
- Blindly retrying same failing strategy
- Silent rollback without traceability
- Continuing without user alignment
@@ -0,0 +1,39 @@
---
name: session-logging
description: "Accurate, auditable execution journal for each working session."
---
# Session Logging Skill

## Purpose
Maintain an accurate, auditable execution journal for each working session.

## Use when
- Starting/ending a session
- Completing issues/milestones
- Syncing GitHub status

## Required File
`sessions-<ISO-date>.md`

## Required Sections
- Status (milestone states)
- Work Completed (`[mX/iY]` references)
- Completion Gate Passed (include ✅ and consecutive-pass evidence)
- Decisions Made
- Blockers
- GitHub Sync (created/closed/updated issue IDs)
- Branch
- Date (ISO timestamp)
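
The file name and section requirements can be checked mechanically. The sketch below makes two assumptions that the skill does not state: `<ISO-date>` means a UTC `YYYY-MM-DD` date, and each section appears as a Markdown heading. `sessionFileName` and `missingSections` are hypothetical helpers.

```typescript
// Section names taken from the "Required Sections" list.
const REQUIRED_SECTIONS: string[] = [
  "Status", "Work Completed", "Completion Gate Passed", "Decisions Made",
  "Blockers", "GitHub Sync", "Branch", "Date",
];

// Build the `sessions-<ISO-date>.md` file name (assumes a UTC YYYY-MM-DD date).
function sessionFileName(now: Date): string {
  return `sessions-${now.toISOString().slice(0, 10)}.md`;
}

// Return the required sections that do not appear as a heading in the log text.
// Assumes each section is written as a Markdown heading such as "## Blockers".
function missingSections(logText: string): string[] {
  const headings = logText
    .split("\n")
    .filter((line) => /^#+\s/.test(line))
    .map((line) => line.replace(/^#+\s+/, "").trim());
  return REQUIRED_SECTIONS.filter((name) => headings.indexOf(name) === -1);
}
```

Running `missingSections` before closing a session gives a quick signal that gate evidence or sync details were left out.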

## Procedure
1. Create/update the session file at session start and after meaningful milestones.
2. Keep entries factual and aligned with plan/GitHub state.
3. Record gate-pass evidence per completed issue.

## Done Criteria
- Session file reflects real progress and traceable references.

## Anti-patterns
- Retroactive guesswork
- Missing gate evidence
- Inconsistent milestone/issue IDs