onecrawl 4.0.0-beta.4 → 4.0.0-beta.6

package/assets/AGENTS.md CHANGED
@@ -56,6 +56,12 @@ Read the corresponding `SKILL.md` when the trigger condition for that skill is m
56
56
 
57
57
  - **[`onecrawl-commands`](.github/skills/onecrawl-commands/SKILL.md)** — **Invoke when**: writing automation scripts, choosing between OneCrawl primitives and eval, or building agent workflows | **Provides**: complete command reference (200+ commands), decision flowchart (primitives-first, eval-as-fallback), anti-patterns, configuration guide
58
58
 
59
+ - **[`runtime-surface-parity`](.github/skills/runtime-surface-parity/SKILL.md)** — **Invoke when**: adding or changing behavior exposed through both CLI and MCP; or refactoring shared runtime flows such as session/auth/captcha/data handling | **Provides**: single-core implementation rule, parity checklist, and live cross-surface verification procedure
60
+
61
+ - **[`live-runtime-validation`](.github/skills/live-runtime-validation/SKILL.md)** — **Invoke when**: verifying browser, session, auth, agent loop, CLI, or MCP runtime behavior | **Provides**: real-process/no-mock validation procedure, artifact expectations, and cleanup checklist
62
+
63
+ - **[`programmatic-tool-calling`](.github/skills/programmatic-tool-calling/SKILL.md)** — **Invoke when**: 3+ dependent tool calls, large intermediate outputs, branching/retry orchestration, or fan-out/fan-in workflows are present | **Provides**: generalized orchestration pattern to batch tool use in code and return only decision-relevant summaries
64
+
59
65
  ## Maintenance Rule
60
66
 
61
67
  When updating policy logic:
@@ -16,6 +16,7 @@ Use this index to quickly select the right skill file.
16
16
  ## Testing
17
17
  - `testing-policy/SKILL.md` — unit/integration/E2E/non-regression baseline and enforcement.
18
18
  - `e2e-testing/SKILL.md` — focused E2E execution checklist and anti-flakiness rules.
19
+ - `live-runtime-validation/SKILL.md` — real-process, real-network runtime verification with artifact capture and cleanup.
19
20
 
20
21
  ## Policy Maintenance
21
22
  - `policy-coherence-audit/SKILL.md` — detect and fix contradictions across AGENTS policies.
@@ -23,13 +24,19 @@ Use this index to quickly select the right skill file.
23
24
  ## Debugging
24
25
  - `systematic-debugging/SKILL.md` — evidence-based debugging via structured logging, `debug-data.log` analysis, and MCP tool integration.
25
26
 
27
+ ## Runtime & Tooling
28
+ - `runtime-surface-parity/SKILL.md` — one-core parity rule for CLI and MCP surfaces.
29
+ - `programmatic-tool-calling/SKILL.md` — code-level orchestration pattern for multi-step tool workflows.
30
+
26
31
  ## Suggested invocation order (quick)
27
32
  1. `interaction-loop` (select path)
28
33
  2. `planning-tracking`
29
34
  3. `breaking-change-paths` (if relevant)
30
35
  4. `testing-policy` + `e2e-testing` (if relevant)
31
36
  5. `systematic-debugging` (if bug/unexpected behavior)
32
- 6. `completion-gate`
33
- 7. `session-logging` + `github-sync`
34
- 8. `rollback-rca` (only if blocked/failing repeatedly)
35
- 9. `policy-coherence-audit` (when editing policy)
37
+ 6. `runtime-surface-parity` (if CLI/MCP or multiple runtime surfaces are involved)
38
+ 7. `live-runtime-validation` (if runtime behavior must be proven live)
39
+ 8. `completion-gate`
40
+ 9. `session-logging` + `github-sync`
41
+ 10. `rollback-rca` (only if blocked/failing repeatedly)
42
+ 11. `policy-coherence-audit` (when editing policy)
@@ -33,7 +33,7 @@ onecrawl session close # Cleanup
33
33
  3. **Multi-agent isolation**: session start -s a1 → session start -s a2 → verify isolation
34
34
  4. **Profile management**: profile create → session start --profile → profile delete
35
35
  5. **Config management**: config set → config show → verify change persists
36
- 6. **Auth persistence**: auth-state save → session close → session start → auth-state load
36
+ 6. **Identity persistence**: profile save → session close → session start → profile load
37
37
  7. **Stealth**: session start → stealth detection-audit → verify 0% headless detection
38
38
  8. **MCP server**: onecrawl mcp → tool discovery → action execution → clean shutdown
39
39
 
@@ -0,0 +1,48 @@
1
+ ---
2
+ name: live-runtime-validation
3
+ description: "Validate browser, agent, CLI, or MCP behavior against real processes, real network targets, and real artifacts. No mocks."
4
+ ---
5
+ # Live Runtime Validation
6
+
7
+ ## Purpose
8
+ Force validation on real infrastructure when the task is about runtime correctness, browser behavior, identity profiles, anti-bot handling, or tool execution.
9
+
10
+ ## Use when
11
+ - Verifying MCP or CLI behavior
12
+ - Testing browser/session/auth/profile flows
13
+ - Testing agent loops, harness flows, or anti-bot behavior
14
+ - Confirming a bug fix that could be masked by mocks
15
+
16
+ ## Hard Rule
17
+ Do not mark runtime work done from unit tests alone.
18
+
19
+ ## Required Validation Surface
20
+ - Real binary
21
+ - Real browser process
22
+ - Real session/profile state
23
+ - Real network target when the feature depends on network behavior
24
+
25
+ ## Procedure
26
+ 1. Build the exact binary you are validating.
27
+ 2. Run the flow end-to-end from the user-facing surface:
28
+ - CLI command
29
+ - MCP `tools/call`
30
+ - daemon session
31
+ 3. Save artifacts under `packages/onecrawl-rust/target/onecrawl/live/`.
32
+ 4. Report:
33
+ - target URL or target system
34
+ - final status
35
+ - artifact path
36
+ - cleanup result
37
+
38
+ ## Cleanup Checklist
39
+ - Close sessions
40
+ - Remove temp harness/test state if created only for validation
41
+ - Kill stray test browsers
42
+ - Avoid leaving `/tmp/onecrawl-session*.json` or test profiles behind
43
+
44
+ ## Anti-patterns
45
+ - “Test passed” based only on unit/integration output
46
+ - Using `data:` URLs to validate network-dependent features
47
+ - Calling internal Rust functions and calling that E2E
48
+ - Leaving live artifacts in repo root or random temp paths without recording them
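The report items and cleanup checklist in this skill could be captured by a small helper like the one below. This is only a sketch under stated assumptions: `LiveReport`, `find_stray_session_files`, and `write_report` are hypothetical names that do not exist in the package, and the `report.json` file name is invented for illustration. Only the `/tmp/onecrawl-session*.json` pattern and the idea of saving artifacts under a dedicated directory come from the skill text.

```python
import glob
import json
import os
from dataclasses import asdict, dataclass


@dataclass
class LiveReport:
    """The four items the Procedure asks to report (hypothetical structure)."""
    target: str          # target URL or target system
    final_status: str    # outcome of the live run
    artifact_path: str   # where artifacts were saved
    cleanup_result: str  # result of the cleanup checklist


def find_stray_session_files(tmp_dir="/tmp"):
    """Return leftover session files matching the pattern named in the checklist."""
    return sorted(glob.glob(os.path.join(tmp_dir, "onecrawl-session*.json")))


def write_report(report, artifact_dir):
    """Save the report as JSON inside artifact_dir and return the file path."""
    os.makedirs(artifact_dir, exist_ok=True)
    path = os.path.join(artifact_dir, "report.json")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(asdict(report), fh, indent=2)
    return path
```

A validation script might call `find_stray_session_files()` after closing sessions and record `"clean"` or the list of leftovers in `cleanup_result` before writing the report.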
@@ -24,7 +24,7 @@ Need to interact with a page?
24
24
  ├─ Take screenshot? → screenshot --full, screenshot --element <sel>
25
25
  ├─ Extract data? → extract content json, extract metadata
26
26
  ├─ Cookie ops? → cookie get/set/delete/export/import
27
- ├─ Auth state? auth-state save/load, account export/import
27
+ ├─ Identity? profile save/load, profile export/import
28
28
  ├─ Passkeys? → auth passkey-enable/register, auth vault-list
29
29
  ├─ Network control? → network block, throttle, route, intercept
30
30
  ├─ HAR/traffic? → har start/drain/export, network-log start/drain
@@ -149,33 +149,29 @@ cookie export --output cookies.json
149
149
  cookie import cookies.json
150
150
  ```
151
151
 
152
- ### 9. Auth State Persistence
152
+ ### 9. Identity Profile Persistence
153
153
  ```bash
154
- auth-state save <name> # Save cookies + localStorage + sessionStorage + URL (v2)
155
- auth-state load <name> # Restore full auth state via CDP
156
- auth-state list # Show all saved states
157
- auth-state show <name> # Display JSON content
158
- auth-state rename <old> <new>
159
- auth-state clear <name> # Delete specific
160
- auth-state clean # Delete all
154
+ profile save <name> # Save cookies + localStorage + sessionStorage + URL
155
+ profile load <name> # Restore full identity via CDP
156
+ profile list # Show all saved profiles
157
+ profile show <name> # Display JSON content
158
+ profile rename <old> <new>
159
+ profile clear <name> # Delete specific
160
+ profile clean # Delete all
161
161
  ```
162
162
 
163
- **Auth state v2 format** stores full CDP cookies (including httpOnly, SameSite, secure)
164
- plus localStorage, sessionStorage, and the page URL. Auto-detects and handles v1 files.
163
+ **Identity profiles** store full CDP cookies (including httpOnly, SameSite, secure)
164
+ plus localStorage, sessionStorage, and the page URL.
165
165
 
166
- ### 10. Account Management (portable bundles)
166
+ ### 10. Profile Management (portable bundles)
167
167
  ```bash
168
- account export <name> # Bundle auth state + passkeys
169
- account export <name> --auth-state alt # Use different auth state name
170
- account export <name> --rp-id x.com # Export only specific site passkeys
171
- account import /path/to/bundle.json # Restore auth state + merge passkeys
172
- account list # Show all account bundles
173
- account show <name> # Display bundle details
174
- account delete <name> # Remove a bundle
168
+ profile export <name> # Bundle identity + passkeys
169
+ profile export <name> --rp-id x.com # Export only specific site passkeys
170
+ profile import /path/to/bundle.json # Restore identity + merge passkeys
175
171
  ```
176
172
 
177
- **Account bundles** combine auth state (cookies + localStorage + sessionStorage + URL)
178
- with passkey credentials into a single portable JSON file at `~/.onecrawl/accounts/`.
173
+ **Profile bundles** combine identity data (cookies + localStorage + sessionStorage + URL)
174
+ with passkey credentials into a single portable JSON file at `~/.onecrawl/profiles/`.
179
175
 
180
176
  ### 11. Passkey & WebAuthn (CDP-only, real ECDSA signatures)
181
177
  ```bash
@@ -432,7 +428,6 @@ daemon_headless = true
432
428
  session_name = "default"
433
429
  session_auto_isolate = true # auto-unique session names per agent
434
430
  auto_connect = false # auto-discover running Chrome with CDP
435
- persist_cookies = "" # auto-persist path, empty = disabled
436
431
  chrome_profile = "" # empty = auto (~/.onecrawl/chrome-profile/)
437
432
  user_agent = "" # empty = auto
438
433
  daemon_idle_timeout = 1800 # 30 minutes
@@ -481,8 +476,7 @@ through `onecrawl run <tool> <action> --json`. The tools are:
481
476
  ```
482
477
  ~/.onecrawl/
483
478
  ├── config.toml # Global config
484
- ├── auth-states/{name}.json # Auth state snapshots (v2: CDP cookies + storage)
485
- ├── accounts/{name}.json # Account bundles (auth state + passkeys)
479
+ ├── profiles/{name}/ # Identity profiles (CDP cookies + storage + passkeys)
486
480
  ├── passkeys/vault.json # Multi-site passkey vault (rp_id → credentials)
487
481
  ├── chrome-profile/ # Persistent Chrome profile
488
482
  └── profiles/{name}/ # Named browser profiles
@@ -1,21 +1,37 @@
1
1
  ---
2
2
  name: programmatic-tool-calling
3
- description: "Multi-step tool workflows via code orchestration to reduce latency, context pollution, and token overhead."
3
+ description: "Generalized orchestration for multi-step tool workflows. Use code to batch, branch, retry, and filter tool calls before returning only high-signal results."
4
4
  ---
5
- # Programmatic Tool Calling Skill (Model-Agnostic)
5
+ # Programmatic Tool Calling
6
6
 
7
7
  ## Purpose
8
- Execute multi-step tool workflows via code orchestration to reduce latency, context pollution, and token overhead.
8
+ Execute multi-step tool workflows through code orchestration to reduce latency, context pollution, and token overhead.
9
9
 
10
10
  ## Use when
11
11
  - 3+ dependent tool calls in sequence
12
12
  - Large intermediate outputs (logs, tables, file listings)
13
13
  - Branching logic, retries, or fan-out/fan-in workflows
14
14
  - Multi-crate operations that follow the dependency graph
15
+ - The model only needs the final decision-ready summary, not every intermediate payload
16
+
17
+ ## Avoid when
18
+ - A single primitive already solves the task
19
+ - The workflow is interactive and depends on user judgment at each step
20
+ - The script would hide important state transitions the model still needs to inspect
15
21
 
16
22
  ## Core Idea
17
23
  Treat tools as callable functions inside an orchestration runtime (bash script, Python), not as one-turn-at-a-time chat actions. Return only high-signal summaries to the model context.
18
24
 
25
+ This is a generalized workflow pattern, not a vendor-specific feature. The value is not the API shape; it is deterministic orchestration, filtered outputs, and lower context churn.
26
+
27
+ ## Preferred Runtime Choice
28
+ - Bash for CLI-heavy flows and simple gating
29
+ - Python when you need structured parsing, branching, or JSON manipulation
30
+ - Shared Rust/core implementation when the behavior itself belongs in the product rather than in the agent workflow
31
+
32
+ ## OneCrawl-Specific Rule
33
+ If a behavior exists in both CLI and MCP, orchestrate the same logical action on both surfaces instead of inventing two separate scripts with different semantics.
34
+
19
35
  ## OneCrawl Patterns
20
36
 
21
37
  ### Pattern 1: Sequential Gate Check (reduce 4 tool calls to 1)
@@ -97,6 +113,8 @@ done
97
113
  - Never blindly execute tool output as code.
98
114
  - Use `set -e` in bash scripts to fail fast on errors.
99
115
  - Cap parallel jobs to avoid resource exhaustion: `wait` after fan-out.
116
+ - Keep intermediate artifacts in `target/onecrawl/` when they matter for debugging.
117
+ - Prefer one orchestrated script that filters output over many chat-level tool calls that dump raw logs into context.
100
118
 
101
119
  ## Done Criteria
102
120
  - Workflow completes with reduced context load and deterministic control flow.
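The pattern of one orchestrated script that retries, gates, and returns only a filtered summary might look like the sketch below. It is an illustration, not code from the package: `run_with_retry` and `orchestrate` are hypothetical helpers, and each step is assumed to be a plain Python callable (in practice it would probably wrap a CLI or MCP invocation).

```python
import time


def run_with_retry(step, attempts=3, delay=0.0):
    """Call step() until it succeeds or attempts run out; return (ok, detail)."""
    last_error = ""
    for attempt in range(attempts):
        try:
            return True, step()
        except Exception as exc:  # keep only the message, not the full trace
            last_error = str(exc)
            if attempt < attempts - 1:
                time.sleep(delay)
    return False, last_error


def orchestrate(steps, attempts=3):
    """Run (name, callable) steps in order; stop at the first failure.

    Returns a short summary instead of every intermediate payload.
    """
    summary = {"completed": [], "failed": None, "error": None}
    for name, step in steps:
        ok, detail = run_with_retry(step, attempts=attempts)
        if not ok:
            summary["failed"] = name
            summary["error"] = detail[:200]  # cap what reaches the model context
            break
        summary["completed"].append(name)
    return summary
```

Only the returned dictionary would be handed back to the model; raw step output stays inside the script or in an artifact file.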
@@ -0,0 +1,50 @@
1
+ ---
2
+ name: runtime-surface-parity
3
+ description: "Keep CLI and MCP behavior symmetric by enforcing one shared core implementation, schema parity, and cross-surface verification."
4
+ ---
5
+ # Runtime Surface Parity
6
+
7
+ ## Purpose
8
+ Prevent CLI and MCP from drifting apart semantically. One surface may parse or render differently, but behavior must come from the same core implementation.
9
+
10
+ ## Use when
11
+ - Adding or changing a CLI command that overlaps an MCP action
12
+ - Adding or changing an MCP action that overlaps a CLI command
13
+ - Refactoring browser/session/auth/captcha/data flows that exist on both surfaces
14
+
15
+ ## Core Rule
16
+ Implement behavior once in shared Rust code, then keep CLI and MCP as thin adapters.
17
+
18
+ ## Required Procedure
19
+ 1. Identify the shared behavior and move it into the deepest common crate that can own it.
20
+ 2. Make CLI and MCP call that same implementation.
21
+ 3. Keep request/response schemas structurally equivalent:
22
+ - same required fields
23
+ - same enums/status values
24
+ - same success/failure semantics
25
+ 4. Verify the same scenario from both surfaces against the same live target.
26
+
27
+ ## Parity Checklist
28
+ - One source of truth for runtime behavior
29
+ - No CLI-only fallback that MCP does not have
30
+ - No MCP-only fallback that CLI does not have
31
+ - Same auth/session/profile semantics
32
+ - Same challenge/detect/solve status model
33
+ - Same cleanup expectations
34
+
35
+ ## Good Patterns
36
+ - Shared handler in `onecrawl-cdp` or another common crate
37
+ - CLI `run <tool> <action>` delegating to MCP/shared core
38
+ - MCP returning structured JSON that CLI can print without inventing new semantics
39
+
40
+ ## Bad Patterns
41
+ - Fixing only MCP because the bug was observed there
42
+ - Fixing only CLI because it is easier to patch
43
+ - Two separate fallback chains for the same feature
44
+ - Same action name with different parameter aliases but different behavior
45
+
46
+ ## Verification
47
+ - Run one live MCP probe
48
+ - Run the equivalent live CLI command
49
+ - Compare outcome, not just exit code
50
+ - Check cleanup: no orphan browser/session/process left behind
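The "compare outcome, not just exit code" step could be sketched as a field-by-field comparison of the two results. This is a hypothetical helper, not part of the package: `compare_outcomes` and the `status` and `exit_code` field names are assumptions chosen for illustration, and both results are assumed to have been parsed into dictionaries already.

```python
def compare_outcomes(cli_result, mcp_result, required_fields):
    """Return a list of parity violations; an empty list means parity holds."""
    problems = []
    for field in required_fields:
        in_cli = field in cli_result
        in_mcp = field in mcp_result
        if not in_cli or not in_mcp:
            # A required field absent on either surface breaks schema parity.
            missing = [name for name, present in (("CLI", in_cli), ("MCP", in_mcp))
                       if not present]
            problems.append(f"{field}: missing on {' and '.join(missing)}")
        elif cli_result[field] != mcp_result[field]:
            # Same field, different value: the surfaces disagree on the outcome.
            problems.append(
                f"{field}: CLI={cli_result[field]!r} MCP={mcp_result[field]!r}"
            )
    return problems


# Both surfaces exit with 0, yet they disagree on the status value.
cli = {"status": "solved", "exit_code": 0}
mcp = {"status": "detected", "exit_code": 0}
print(compare_outcomes(cli, mcp, ["status", "exit_code"]))
# prints ["status: CLI='solved' MCP='detected'"]
```

The example shows why an exit-code check alone is insufficient: the exit codes match while the reported status does not.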
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "onecrawl",
3
- "version": "4.0.0-beta.4",
3
+ "version": "4.0.0-beta.6",
4
4
  "description": "Browser automation engine — CLI, MCP server, and agent skills installer",
5
5
  "license": "BUSL-1.1",
6
6
  "author": "Giulio Leone <giulio@onecrawl.dev>",