onecrawl 4.0.0-beta.4 → 4.0.0-beta.6
- package/assets/AGENTS.md +6 -0
- package/assets/skills/README.md +11 -4
- package/assets/skills/e2e-testing/SKILL.md +1 -1
- package/assets/skills/live-runtime-validation/SKILL.md +48 -0
- package/assets/skills/onecrawl-commands/SKILL.md +18 -24
- package/assets/skills/programmatic-tool-calling/SKILL.md +21 -3
- package/assets/skills/runtime-surface-parity/SKILL.md +50 -0
- package/package.json +1 -1
package/assets/AGENTS.md
CHANGED
@@ -56,6 +56,12 @@ Read the corresponding `SKILL.md` when the trigger condition for that skill is m

 - **[`onecrawl-commands`](.github/skills/onecrawl-commands/SKILL.md)** — **Invoke when**: writing automation scripts, choosing between OneCrawl primitives and eval, or building agent workflows | **Provides**: complete command reference (200+ commands), decision flowchart (primitives-first, eval-as-fallback), anti-patterns, configuration guide

+- **[`runtime-surface-parity`](.github/skills/runtime-surface-parity/SKILL.md)** — **Invoke when**: adding or changing behavior exposed through both CLI and MCP; or refactoring shared runtime flows such as session/auth/captcha/data handling | **Provides**: single-core implementation rule, parity checklist, and live cross-surface verification procedure
+
+- **[`live-runtime-validation`](.github/skills/live-runtime-validation/SKILL.md)** — **Invoke when**: verifying browser, session, auth, agent loop, CLI, or MCP runtime behavior | **Provides**: real-process/no-mock validation procedure, artifact expectations, and cleanup checklist
+
+- **[`programmatic-tool-calling`](.github/skills/programmatic-tool-calling/SKILL.md)** — **Invoke when**: 3+ dependent tool calls, large intermediate outputs, branching/retry orchestration, or fan-out/fan-in workflows are present | **Provides**: generalized orchestration pattern to batch tool use in code and return only decision-relevant summaries
+
 ## Maintenance Rule

 When updating policy logic:
package/assets/skills/README.md
CHANGED
@@ -16,6 +16,7 @@ Use this index to quickly select the right skill file.
 ## Testing
 - `testing-policy/SKILL.md` — unit/integration/E2E/non-regression baseline and enforcement.
 - `e2e-testing/SKILL.md` — focused E2E execution checklist and anti-flakiness rules.
+- `live-runtime-validation/SKILL.md` — real-process, real-network runtime verification with artifact capture and cleanup.

 ## Policy Maintenance
 - `policy-coherence-audit/SKILL.md` — detect and fix contradictions across AGENTS policies.
@@ -23,13 +24,19 @@ Use this index to quickly select the right skill file.
 ## Debugging
 - `systematic-debugging/SKILL.md` — evidence-based debugging via structured logging, `debug-data.log` analysis, and MCP tool integration.

+## Runtime & Tooling
+- `runtime-surface-parity/SKILL.md` — one-core parity rule for CLI and MCP surfaces.
+- `programmatic-tool-calling/SKILL.md` — code-level orchestration pattern for multi-step tool workflows.
+
 ## Suggested invocation order (quick)
 1. `interaction-loop` (select path)
 2. `planning-tracking`
 3. `breaking-change-paths` (if relevant)
 4. `testing-policy` + `e2e-testing` (if relevant)
 5. `systematic-debugging` (if bug/unexpected behavior)
-6. `
-7. `
-8. `
-9. `
+6. `runtime-surface-parity` (if CLI/MCP or multiple runtime surfaces are involved)
+7. `live-runtime-validation` (if runtime behavior must be proven live)
+8. `completion-gate`
+9. `session-logging` + `github-sync`
+10. `rollback-rca` (only if blocked/failing repeatedly)
+11. `policy-coherence-audit` (when editing policy)
package/assets/skills/e2e-testing/SKILL.md
CHANGED
@@ -33,7 +33,7 @@ onecrawl session close # Cleanup
 3. **Multi-agent isolation**: session start -s a1 → session start -s a2 → verify isolation
 4. **Profile management**: profile create → session start --profile → profile delete
 5. **Config management**: config set → config show → verify change persists
-6. **
+6. **Identity persistence**: profile save → session close → session start → profile load
 7. **Stealth**: session start → stealth detection-audit → verify 0% headless detection
 8. **MCP server**: onecrawl mcp → tool discovery → action execution → clean shutdown

package/assets/skills/live-runtime-validation/SKILL.md
ADDED
@@ -0,0 +1,48 @@
+---
+name: live-runtime-validation
+description: "Validate browser, agent, CLI, or MCP behavior against real processes, real network targets, and real artifacts. No mocks."
+---
+# Live Runtime Validation
+
+## Purpose
+Force validation on real infrastructure when the task is about runtime correctness, browser behavior, identity profiles, anti-bot handling, or tool execution.
+
+## Use when
+- Verifying MCP or CLI behavior
+- Testing browser/session/auth/profile flows
+- Testing agent loops, harness flows, or anti-bot behavior
+- Confirming a bug fix that could be masked by mocks
+
+## Hard Rule
+Do not mark runtime work done from unit tests alone.
+
+## Required Validation Surface
+- Real binary
+- Real browser process
+- Real session/profile state
+- Real network target when the feature depends on network behavior
+
+## Procedure
+1. Build the exact binary you are validating.
+2. Run the flow end-to-end from the user-facing surface:
+- CLI command
+- MCP `tools/call`
+- daemon session
+3. Save artifacts under `packages/onecrawl-rust/target/onecrawl/live/`.
+4. Report:
+- target URL or target system
+- final status
+- artifact path
+- cleanup result
+
+## Cleanup Checklist
+- Close sessions
+- Remove temp harness/test state if created only for validation
+- Kill stray test browsers
+- Avoid leaving `/tmp/onecrawl-session*.json` or test profiles behind
+
+## Anti-patterns
+- “Test passed” based only on unit/integration output
+- Using `data:` URLs to validate network-dependent features
+- Calling internal Rust functions and calling that E2E
+- Leaving live artifacts in repo root or random temp paths without recording them
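Note on `live-runtime-validation/SKILL.md`: the "Report" step and the cleanup checklist lend themselves to a mechanical check. The following is a minimal Python sketch, not part of the package. The function names and the report keys (`target`, `final_status`, `artifact_path`, `cleanup_result`) are assumptions chosen to mirror the four report bullets; only the `/tmp/onecrawl-session*.json` pattern comes from the skill text.

```python
from __future__ import annotations

import glob
import os


def find_leftover_session_files(tmp_dir: str = "/tmp") -> list[str]:
    """List files matching the session-file pattern named in the cleanup checklist."""
    return sorted(glob.glob(os.path.join(tmp_dir, "onecrawl-session*.json")))


def build_report(target: str, status: str, artifact_path: str,
                 tmp_dir: str = "/tmp") -> dict:
    """Assemble the four report fields; key names are hypothetical."""
    leftovers = find_leftover_session_files(tmp_dir)
    return {
        "target": target,
        "final_status": status,
        "artifact_path": artifact_path,
        # "clean" only when no session files were left behind
        "cleanup_result": "clean" if not leftovers else f"leftover: {leftovers}",
    }
```

A run would call `build_report(...)` after closing sessions, so that a forgotten session file shows up in the report instead of being silently left in `/tmp`.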
package/assets/skills/onecrawl-commands/SKILL.md
CHANGED
@@ -24,7 +24,7 @@ Need to interact with a page?
 ├─ Take screenshot? → screenshot --full, screenshot --element <sel>
 ├─ Extract data? → extract content json, extract metadata
 ├─ Cookie ops? → cookie get/set/delete/export/import
-├─
+├─ Identity? → profile save/load, profile export/import
 ├─ Passkeys? → auth passkey-enable/register, auth vault-list
 ├─ Network control? → network block, throttle, route, intercept
 ├─ HAR/traffic? → har start/drain/export, network-log start/drain
@@ -149,33 +149,29 @@ cookie export --output cookies.json
 cookie import cookies.json
 ```

-### 9.
+### 9. Identity Profile Persistence
 ```bash
-
-
-
-
-
-
-
+profile save <name> # Save cookies + localStorage + sessionStorage + URL
+profile load <name> # Restore full identity via CDP
+profile list # Show all saved profiles
+profile show <name> # Display JSON content
+profile rename <old> <new>
+profile clear <name> # Delete specific
+profile clean # Delete all
 ```

-**
-plus localStorage, sessionStorage, and the page URL.
+**Identity profiles** store full CDP cookies (including httpOnly, SameSite, secure)
+plus localStorage, sessionStorage, and the page URL.

-### 10.
+### 10. Profile Management (portable bundles)
 ```bash
-
-
-
-account import /path/to/bundle.json # Restore auth state + merge passkeys
-account list # Show all account bundles
-account show <name> # Display bundle details
-account delete <name> # Remove a bundle
+profile export <name> # Bundle identity + passkeys
+profile export <name> --rp-id x.com # Export only specific site passkeys
+profile import /path/to/bundle.json # Restore identity + merge passkeys
 ```

-**
-with passkey credentials into a single portable JSON file at `~/.onecrawl/
+**Profile bundles** combine identity data (cookies + localStorage + sessionStorage + URL)
+with passkey credentials into a single portable JSON file at `~/.onecrawl/profiles/`.

 ### 11. Passkey & WebAuthn (CDP-only, real ECDSA signatures)
 ```bash
@@ -432,7 +428,6 @@ daemon_headless = true
 session_name = "default"
 session_auto_isolate = true # auto-unique session names per agent
 auto_connect = false # auto-discover running Chrome with CDP
-persist_cookies = "" # auto-persist path, empty = disabled
 chrome_profile = "" # empty = auto (~/.onecrawl/chrome-profile/)
 user_agent = "" # empty = auto
 daemon_idle_timeout = 1800 # 30 minutes
@@ -481,8 +476,7 @@ through `onecrawl run <tool> <action> --json`. The tools are:
 ```
 ~/.onecrawl/
 ├── config.toml # Global config
-├──
-├── accounts/{name}.json # Account bundles (auth state + passkeys)
+├── profiles/{name}/ # Identity profiles (CDP cookies + storage + passkeys)
 ├── passkeys/vault.json # Multi-site passkey vault (rp_id → credentials)
 ├── chrome-profile/ # Persistent Chrome profile
 └── profiles/{name}/ # Named browser profiles
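Note on the `~/.onecrawl/` layout after this change: a script that cleans up or inspects state could resolve the listed locations in one place. The sketch below is illustrative only. The helper name `onecrawl_paths` and its dictionary keys are assumptions; the relative paths are copied from the directory listing in the hunk above.

```python
from __future__ import annotations

from pathlib import Path
from typing import Optional


def onecrawl_paths(profile: str, home: Optional[Path] = None) -> dict[str, Path]:
    """Resolve the locations shown in the ~/.onecrawl/ listing for one profile name."""
    root = (home or Path.home()) / ".onecrawl"
    return {
        "config": root / "config.toml",                     # global config
        "profile": root / "profiles" / profile,             # profiles/{name}/
        "passkey_vault": root / "passkeys" / "vault.json",  # multi-site passkey vault
        "chrome_profile": root / "chrome-profile",          # persistent Chrome profile
    }
```

Passing `home` explicitly keeps the helper usable against a throwaway directory during validation runs.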
package/assets/skills/programmatic-tool-calling/SKILL.md
CHANGED
@@ -1,21 +1,37 @@
 ---
 name: programmatic-tool-calling
-description: "
+description: "Generalized orchestration for multi-step tool workflows. Use code to batch, branch, retry, and filter tool calls before returning only high-signal results."
 ---
-# Programmatic Tool Calling
+# Programmatic Tool Calling

 ## Purpose
-Execute multi-step tool workflows
+Execute multi-step tool workflows through code orchestration to reduce latency, context pollution, and token overhead.

 ## Use when
 - 3+ dependent tool calls in sequence
 - Large intermediate outputs (logs, tables, file listings)
 - Branching logic, retries, or fan-out/fan-in workflows
 - Multi-crate operations that follow the dependency graph
+- The model only needs the final decision-ready summary, not every intermediate payload
+
+## Avoid when
+- A single primitive already solves the task
+- The workflow is interactive and depends on user judgment at each step
+- The script would hide important state transitions the model still needs to inspect

 ## Core Idea
 Treat tools as callable functions inside an orchestration runtime (bash script, Python), not as one-turn-at-a-time chat actions. Return only high-signal summaries to the model context.

+This is a generalized workflow pattern, not a vendor-specific feature. The value is not the API shape; it is deterministic orchestration, filtered outputs, and lower context churn.
+
+## Preferred Runtime Choice
+- Bash for CLI-heavy flows and simple gating
+- Python when you need structured parsing, branching, or JSON manipulation
+- Shared Rust/core implementation when the behavior itself belongs in the product rather than in the agent workflow
+
+## OneCrawl-Specific Rule
+If a behavior exists in both CLI and MCP, orchestrate the same logical action on both surfaces instead of inventing two separate scripts with different semantics.
+
 ## OneCrawl Patterns

 ### Pattern 1: Sequential Gate Check (reduce 4 tool calls to 1)
@@ -97,6 +113,8 @@ done
 - Never blindly execute tool output as code.
 - Use `set -e` in bash scripts to fail fast on errors.
 - Cap parallel jobs to avoid resource exhaustion: `wait` after fan-out.
+- Keep intermediate artifacts in `target/onecrawl/` when they matter for debugging.
+- Prefer one orchestrated script that filters output over many chat-level tool calls that dump raw logs into context.

 ## Done Criteria
 - Workflow completes with reduced context load and deterministic control flow.
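Note on `programmatic-tool-calling/SKILL.md`: the "Core Idea" (run the steps in code, return only a high-signal summary) can be illustrated with a small gate runner. This is a hedged Python sketch rather than the package's own Pattern 1, whose body is not shown in this diff. The function name `run_gate` and the summary shape are assumptions; the commands to run are supplied by the caller.

```python
from __future__ import annotations

import subprocess


def run_gate(steps: list[tuple[str, list[str]]], tail_lines: int = 5) -> dict:
    """Run named commands in order, stop at the first failure, return a compact summary."""
    results = []
    for name, cmd in steps:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        ok = proc.returncode == 0
        entry = {"step": name, "ok": ok}
        if not ok:
            # Keep only the last few output lines so raw logs stay out of model context.
            lines = (proc.stdout + proc.stderr).strip().splitlines()
            entry["tail"] = lines[-tail_lines:]
        results.append(entry)
        if not ok:
            break  # fail fast, comparable to `set -e` in the bash guidance
    return {"passed": all(r["ok"] for r in results), "steps": results}
```

Each step is a `(name, argv)` pair, so the same runner can gate any sequence of CLI commands. Successful steps contribute one short entry each, and only a failing step carries output.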
package/assets/skills/runtime-surface-parity/SKILL.md
ADDED
@@ -0,0 +1,50 @@
+---
+name: runtime-surface-parity
+description: "Keep CLI and MCP behavior symmetric by enforcing one shared core implementation, schema parity, and cross-surface verification."
+---
+# Runtime Surface Parity
+
+## Purpose
+Prevent CLI and MCP from drifting apart semantically. One surface may parse or render differently, but behavior must come from the same core implementation.
+
+## Use when
+- Adding or changing a CLI command that overlaps an MCP action
+- Adding or changing an MCP action that overlaps a CLI command
+- Refactoring browser/session/auth/captcha/data flows that exist on both surfaces
+
+## Core Rule
+Implement behavior once in shared Rust code, then keep CLI and MCP as thin adapters.
+
+## Required Procedure
+1. Identify the shared behavior and move it into the deepest common crate that can own it.
+2. Make CLI and MCP call that same implementation.
+3. Keep request/response schemas structurally equivalent:
+- same required fields
+- same enums/status values
+- same success/failure semantics
+4. Verify the same scenario from both surfaces against the same live target.
+
+## Parity Checklist
+- One source of truth for runtime behavior
+- No CLI-only fallback that MCP does not have
+- No MCP-only fallback that CLI does not have
+- Same auth/session/profile semantics
+- Same challenge/detect/solve status model
+- Same cleanup expectations
+
+## Good Patterns
+- Shared handler in `onecrawl-cdp` or another common crate
+- CLI `run <tool> <action>` delegating to MCP/shared core
+- MCP returning structured JSON that CLI can print without inventing new semantics
+
+## Bad Patterns
+- Fixing only MCP because the bug was observed there
+- Fixing only CLI because it is easier to patch
+- Two separate fallback chains for the same feature
+- Same action name with different parameter aliases but different behavior
+
+## Verification
+- Run one live MCP probe
+- Run the equivalent live CLI command
+- Compare outcome, not just exit code
+- Check cleanup: no orphan browser/session/process left behind
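Note on `runtime-surface-parity/SKILL.md`: "Compare outcome, not just exit code" can be approximated by normalizing the JSON each surface returns and diffing the result. The Python sketch below is illustrative and assumes both surfaces emit a top-level JSON object. The volatile key names (`timestamp`, `duration_ms`, `session_id`) are hypothetical placeholders, not fields confirmed by the package.

```python
from __future__ import annotations

import json

# Hypothetical keys that may legitimately differ between two runs.
VOLATILE_KEYS = frozenset({"timestamp", "duration_ms", "session_id"})
_MISSING = object()


def normalize(value, volatile=VOLATILE_KEYS):
    """Recursively drop volatile keys so payloads can be compared by outcome."""
    if isinstance(value, dict):
        return {k: normalize(v, volatile) for k, v in value.items() if k not in volatile}
    if isinstance(value, list):
        return [normalize(v, volatile) for v in value]
    return value


def parity_diff(cli_json: str, mcp_json: str) -> list[str]:
    """Return the top-level keys whose normalized values differ between surfaces."""
    cli = normalize(json.loads(cli_json))
    mcp = normalize(json.loads(mcp_json))
    keys = sorted(set(cli) | set(mcp))
    # A key present on only one surface counts as a difference.
    return [k for k in keys if cli.get(k, _MISSING) != mcp.get(k, _MISSING)]
```

An empty list means the two surfaces agree on every non-volatile field. A non-empty list names where the status model or payload has drifted.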
package/package.json
CHANGED