npm - onecrawl - Versions diffs - 4.0.0-beta.4 → 4.0.0-beta.6 - Mend

onecrawl 4.0.0-beta.4 → 4.0.0-beta.6


Files changed (8)

package/assets/AGENTS.md +6 -0
package/assets/skills/README.md +11 -4
package/assets/skills/e2e-testing/SKILL.md +1 -1
package/assets/skills/live-runtime-validation/SKILL.md +48 -0
package/assets/skills/onecrawl-commands/SKILL.md +18 -24
package/assets/skills/programmatic-tool-calling/SKILL.md +21 -3
package/assets/skills/runtime-surface-parity/SKILL.md +50 -0
package/package.json +1 -1

package/assets/AGENTS.md CHANGED

@@ -56,6 +56,12 @@ Read the corresponding `SKILL.md` when the trigger condition for that skill is m
 - **[`onecrawl-commands`](.github/skills/onecrawl-commands/SKILL.md)** — **Invoke when**: writing automation scripts, choosing between OneCrawl primitives and eval, or building agent workflows | **Provides**: complete command reference (200+ commands), decision flowchart (primitives-first, eval-as-fallback), anti-patterns, configuration guide
+- **[`runtime-surface-parity`](.github/skills/runtime-surface-parity/SKILL.md)** — **Invoke when**: adding or changing behavior exposed through both CLI and MCP; or refactoring shared runtime flows such as session/auth/captcha/data handling | **Provides**: single-core implementation rule, parity checklist, and live cross-surface verification procedure
+- **[`live-runtime-validation`](.github/skills/live-runtime-validation/SKILL.md)** — **Invoke when**: verifying browser, session, auth, agent loop, CLI, or MCP runtime behavior | **Provides**: real-process/no-mock validation procedure, artifact expectations, and cleanup checklist
+- **[`programmatic-tool-calling`](.github/skills/programmatic-tool-calling/SKILL.md)** — **Invoke when**: 3+ dependent tool calls, large intermediate outputs, branching/retry orchestration, or fan-out/fan-in workflows are present | **Provides**: generalized orchestration pattern to batch tool use in code and return only decision-relevant summaries
 ## Maintenance Rule
 When updating policy logic:

package/assets/skills/README.md CHANGED

@@ -16,6 +16,7 @@ Use this index to quickly select the right skill file.
 ## Testing
 - `testing-policy/SKILL.md` — unit/integration/E2E/non-regression baseline and enforcement.
 - `e2e-testing/SKILL.md` — focused E2E execution checklist and anti-flakiness rules.
+- `live-runtime-validation/SKILL.md` — real-process, real-network runtime verification with artifact capture and cleanup.
 ## Policy Maintenance
 - `policy-coherence-audit/SKILL.md` — detect and fix contradictions across AGENTS policies.
@@ -23,13 +24,19 @@ Use this index to quickly select the right skill file.
 ## Debugging
 - `systematic-debugging/SKILL.md` — evidence-based debugging via structured logging, `debug-data.log` analysis, and MCP tool integration.
+## Runtime & Tooling
+- `runtime-surface-parity/SKILL.md` — one-core parity rule for CLI and MCP surfaces.
+- `programmatic-tool-calling/SKILL.md` — code-level orchestration pattern for multi-step tool workflows.
 ## Suggested invocation order (quick)
 1. `interaction-loop` (select path)
 2. `planning-tracking`
 3. `breaking-change-paths` (if relevant)
 4. `testing-policy` + `e2e-testing` (if relevant)
 5. `systematic-debugging` (if bug/unexpected behavior)
-6. `completion-gate`
-7. `session-logging` + `github-sync`
-8. `rollback-rca` (only if blocked/failing repeatedly)
-9. `policy-coherence-audit` (when editing policy)
+6. `runtime-surface-parity` (if CLI/MCP or multiple runtime surfaces are involved)
+7. `live-runtime-validation` (if runtime behavior must be proven live)
+8. `completion-gate`
+9. `session-logging` + `github-sync`
+10. `rollback-rca` (only if blocked/failing repeatedly)
+11. `policy-coherence-audit` (when editing policy)

package/assets/skills/e2e-testing/SKILL.md CHANGED

@@ -33,7 +33,7 @@ onecrawl session close                       # Cleanup
 3. **Multi-agent isolation**: session start -s a1 → session start -s a2 → verify isolation
 4. **Profile management**: profile create → session start --profile → profile delete
 5. **Config management**: config set → config show → verify change persists
-6. **Auth persistence**: auth-state save → session close → session start → auth-state load
+6. **Identity persistence**: profile save → session close → session start → profile load
 7. **Stealth**: session start → stealth detection-audit → verify 0% headless detection
 8. **MCP server**: onecrawl mcp → tool discovery → action execution → clean shutdown

package/assets/skills/live-runtime-validation/SKILL.md ADDED

@@ -0,0 +1,48 @@
+---
+name: live-runtime-validation
+description: "Validate browser, agent, CLI, or MCP behavior against real processes, real network targets, and real artifacts. No mocks."
+---
+# Live Runtime Validation
+## Purpose
+Force validation on real infrastructure when the task is about runtime correctness, browser behavior, identity profiles, anti-bot handling, or tool execution.
+## Use when
+- Verifying MCP or CLI behavior
+- Testing browser/session/auth/profile flows
+- Testing agent loops, harness flows, or anti-bot behavior
+- Confirming a bug fix that could be masked by mocks
+## Hard Rule
+Do not mark runtime work done from unit tests alone.
+## Required Validation Surface
+- Real binary
+- Real browser process
+- Real session/profile state
+- Real network target when the feature depends on network behavior
+## Procedure
+1. Build the exact binary you are validating.
+2. Run the flow end-to-end from the user-facing surface:
+   - CLI command
+   - MCP `tools/call`
+   - daemon session
+3. Save artifacts under `packages/onecrawl-rust/target/onecrawl/live/`.
+4. Report:
+   - target URL or target system
+   - final status
+   - artifact path
+   - cleanup result
+## Cleanup Checklist
+- Close sessions
+- Remove temp harness/test state if created only for validation
+- Kill stray test browsers
+- Avoid leaving `/tmp/onecrawl-session*.json` or test profiles behind
+## Anti-patterns
+- “Test passed” based only on unit/integration output
+- Using `data:` URLs to validate network-dependent features
+- Calling internal Rust functions and calling that E2E
+- Leaving live artifacts in repo root or random temp paths without recording them
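
Note on the added skill: the report fields in step 4 of the procedure and the last item of the cleanup checklist can be checked mechanically. The following is a minimal Python sketch, not part of the package. The function names `find_stale_session_files` and `build_report` and the dictionary keys are hypothetical; only the `/tmp/onecrawl-session*.json` pattern and the artifact directory come from the skill text.

```python
import glob
import os

# Artifact directory named in step 3 of the procedure.
DEFAULT_ARTIFACT_DIR = "packages/onecrawl-rust/target/onecrawl/live/"


def find_stale_session_files(tmp_dir="/tmp"):
    """Return leftover files matching onecrawl-session*.json in tmp_dir."""
    pattern = os.path.join(tmp_dir, "onecrawl-session*.json")
    return sorted(glob.glob(pattern))


def build_report(target, final_status, artifact_path=DEFAULT_ARTIFACT_DIR, tmp_dir="/tmp"):
    """Assemble the four report fields listed in the procedure (hypothetical keys)."""
    stale = find_stale_session_files(tmp_dir)
    return {
        "target": target,
        "final_status": final_status,
        "artifact_path": artifact_path,
        # Cleanup counts as clean only when no session files were left behind.
        "cleanup_result": "clean" if not stale else "stale: " + ", ".join(stale),
    }
```

This sketch covers only the session-file check. Closing sessions and killing stray test browsers still have to happen through the real CLI or MCP surface.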

package/assets/skills/onecrawl-commands/SKILL.md CHANGED

@@ -24,7 +24,7 @@ Need to interact with a page?
   ├─ Take screenshot?   → screenshot --full, screenshot --element <sel>
   ├─ Extract data?      → extract content json, extract metadata
   ├─ Cookie ops?        → cookie get/set/delete/export/import
-  ├─ Auth state?        → auth-state save/load, account export/import
+  ├─ Identity?          → profile save/load, profile export/import
   ├─ Passkeys?          → auth passkey-enable/register, auth vault-list
   ├─ Network control?   → network block, throttle, route, intercept
   ├─ HAR/traffic?       → har start/drain/export, network-log start/drain
@@ -149,33 +149,29 @@ cookie export --output cookies.json
 cookie import cookies.json
 ```
-### 9. Auth State Persistence
+### 9. Identity Profile Persistence
 ```bash
-auth-state save <name>           # Save cookies + localStorage + sessionStorage + URL (v2)
-auth-state load <name>           # Restore full auth state via CDP
-auth-state list                  # Show all saved states
-auth-state show <name>           # Display JSON content
-auth-state rename <old> <new>
-auth-state clear <name>          # Delete specific
-auth-state clean                 # Delete all
+profile save <name>              # Save cookies + localStorage + sessionStorage + URL
+profile load <name>              # Restore full identity via CDP
+profile list                     # Show all saved profiles
+profile show <name>              # Display JSON content
+profile rename <old> <new>
+profile clear <name>             # Delete specific
+profile clean                    # Delete all
 ```
-**Auth state v2 format** stores full CDP cookies (including httpOnly, SameSite, secure)
-plus localStorage, sessionStorage, and the page URL. Auto-detects and handles v1 files.
+**Identity profiles** store full CDP cookies (including httpOnly, SameSite, secure)
+plus localStorage, sessionStorage, and the page URL.
-### 10. Account Management (portable bundles)
+### 10. Profile Management (portable bundles)
 ```bash
-account export <name>                    # Bundle auth state + passkeys
-account export <name> --auth-state alt   # Use different auth state name
-account export <name> --rp-id x.com      # Export only specific site passkeys
-account import /path/to/bundle.json      # Restore auth state + merge passkeys
-account list                             # Show all account bundles
-account show <name>                      # Display bundle details
-account delete <name>                    # Remove a bundle
+profile export <name>                    # Bundle identity + passkeys
+profile export <name> --rp-id x.com      # Export only specific site passkeys
+profile import /path/to/bundle.json      # Restore identity + merge passkeys
 ```
-**Account bundles** combine auth state (cookies + localStorage + sessionStorage + URL)
-with passkey credentials into a single portable JSON file at `~/.onecrawl/accounts/`.
+**Profile bundles** combine identity data (cookies + localStorage + sessionStorage + URL)
+with passkey credentials into a single portable JSON file at `~/.onecrawl/profiles/`.
 ### 11. Passkey & WebAuthn (CDP-only, real ECDSA signatures)
 ```bash
@@ -432,7 +428,6 @@ daemon_headless = true
 session_name = "default"
 session_auto_isolate = true     # auto-unique session names per agent
 auto_connect = false            # auto-discover running Chrome with CDP
-persist_cookies = ""            # auto-persist path, empty = disabled
 chrome_profile = ""             # empty = auto (~/.onecrawl/chrome-profile/)
 user_agent = ""                 # empty = auto
 daemon_idle_timeout = 1800      # 30 minutes
@@ -481,8 +476,7 @@ through `onecrawl run <tool> <action> --json`. The tools are:
 ```
 ~/.onecrawl/
 ├── config.toml                  # Global config
-├── auth-states/{name}.json      # Auth state snapshots (v2: CDP cookies + storage)
-├── accounts/{name}.json         # Account bundles (auth state + passkeys)
+├── profiles/{name}/             # Identity profiles (CDP cookies + storage + passkeys)
 ├── passkeys/vault.json          # Multi-site passkey vault (rp_id → credentials)
 ├── chrome-profile/              # Persistent Chrome profile
 └── profiles/{name}/             # Named browser profiles
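
Note on the renamed commands: scripts written against the beta.4 documentation use `auth-state` and `account` commands that the beta.6 documentation replaces with `profile`. The Python sketch below rewrites old command lines using only the replacements visible in this diff. It is an assumption that the documented renames reflect actual CLI behavior, and `migrate_command` is a hypothetical helper, not something shipped by the package. Commands with no documented replacement (`account list`, `account show`, `account delete`, and the `--auth-state` flag) return `None` so they can be reviewed by hand.

```python
import shlex

# Subcommands whose documentation changed from `auth-state <sub>` to `profile <sub>`.
AUTH_STATE_SUBCOMMANDS = {"save", "load", "list", "show", "rename", "clear", "clean"}
# Subcommands whose documentation changed from `account <sub>` to `profile <sub>`.
ACCOUNT_SUBCOMMANDS = {"export", "import"}


def migrate_command(line):
    """Rewrite a beta.4-style command line to the beta.6 documented form.

    Returns the rewritten line, the unchanged line if it is unaffected,
    or None when this diff documents no replacement.
    """
    tokens = shlex.split(line)
    if len(tokens) < 2:
        return line
    head, sub = tokens[0], tokens[1]
    if head == "auth-state":
        if sub in AUTH_STATE_SUBCOMMANDS:
            return shlex.join(["profile"] + tokens[1:])
        return None
    if head == "account":
        # The --auth-state flag was dropped from the documented export form.
        if sub in ACCOUNT_SUBCOMMANDS and "--auth-state" not in tokens:
            return shlex.join(["profile"] + tokens[1:])
        return None
    return line
```

For example, `migrate_command("auth-state save work")` yields `profile save work`, while `migrate_command("account list")` yields `None`.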

package/assets/skills/programmatic-tool-calling/SKILL.md CHANGED

@@ -1,21 +1,37 @@
 ---
 name: programmatic-tool-calling
-description: "Multi-step tool workflows via code orchestration to reduce latency, context pollution, and token overhead."
+description: "Generalized orchestration for multi-step tool workflows. Use code to batch, branch, retry, and filter tool calls before returning only high-signal results."
 ---
-# Programmatic Tool Calling Skill (Model-Agnostic)
+# Programmatic Tool Calling
 ## Purpose
-Execute multi-step tool workflows via code orchestration to reduce latency, context pollution, and token overhead.
+Execute multi-step tool workflows through code orchestration to reduce latency, context pollution, and token overhead.
 ## Use when
 - 3+ dependent tool calls in sequence
 - Large intermediate outputs (logs, tables, file listings)
 - Branching logic, retries, or fan-out/fan-in workflows
 - Multi-crate operations that follow the dependency graph
+- The model only needs the final decision-ready summary, not every intermediate payload
+## Avoid when
+- A single primitive already solves the task
+- The workflow is interactive and depends on user judgment at each step
+- The script would hide important state transitions the model still needs to inspect
 ## Core Idea
 Treat tools as callable functions inside an orchestration runtime (bash script, Python), not as one-turn-at-a-time chat actions. Return only high-signal summaries to the model context.
+This is a generalized workflow pattern, not a vendor-specific feature. The value is not the API shape; it is deterministic orchestration, filtered outputs, and lower context churn.
+## Preferred Runtime Choice
+- Bash for CLI-heavy flows and simple gating
+- Python when you need structured parsing, branching, or JSON manipulation
+- Shared Rust/core implementation when the behavior itself belongs in the product rather than in the agent workflow
+## OneCrawl-Specific Rule
+If a behavior exists in both CLI and MCP, orchestrate the same logical action on both surfaces instead of inventing two separate scripts with different semantics.
 ## OneCrawl Patterns
 ### Pattern 1: Sequential Gate Check (reduce 4 tool calls to 1)
@@ -97,6 +113,8 @@ done
 - Never blindly execute tool output as code.
 - Use `set -e` in bash scripts to fail fast on errors.
 - Cap parallel jobs to avoid resource exhaustion: `wait` after fan-out.
+- Keep intermediate artifacts in `target/onecrawl/` when they matter for debugging.
+- Prefer one orchestrated script that filters output over many chat-level tool calls that dump raw logs into context.
 ## Done Criteria
 - Workflow completes with reduced context load and deterministic control flow.
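
Note on the orchestration pattern: the skill describes running several dependent tool calls inside one script and returning only a decision-ready summary. The following is a minimal Python sketch of a sequential gate check under that description. The function name `run_gates`, the summary keys, and the example gate are hypothetical; the gate commands here are placeholders, not OneCrawl commands.

```python
import subprocess
import sys


def run_gates(gates, tail_lines=5):
    """Run (name, argv) gates in order and stop at the first failure.

    Only a short summary is returned; full command output is dropped
    unless a gate fails, in which case the last few lines are kept.
    """
    passed = []
    for name, argv in gates:
        proc = subprocess.run(argv, capture_output=True, text=True)
        if proc.returncode != 0:
            output = (proc.stdout + proc.stderr).strip().splitlines()
            return {
                "ok": False,
                "passed": passed,
                "failed": name,
                "exit_code": proc.returncode,
                "tail": output[-tail_lines:],
            }
        passed.append(name)
    return {"ok": True, "passed": passed, "failed": None, "exit_code": 0, "tail": []}


if __name__ == "__main__":
    # Placeholder gate: any argv list works, for example a build or test command.
    summary = run_gates([("python-version", [sys.executable, "--version"])])
    print(summary["ok"], summary["passed"])
```

Passing argument lists rather than shell strings keeps tool output from being interpreted as code, in line with the safety rule listed in the skill.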

package/assets/skills/runtime-surface-parity/SKILL.md ADDED

@@ -0,0 +1,50 @@
+---
+name: runtime-surface-parity
+description: "Keep CLI and MCP behavior symmetric by enforcing one shared core implementation, schema parity, and cross-surface verification."
+---
+# Runtime Surface Parity
+## Purpose
+Prevent CLI and MCP from drifting apart semantically. One surface may parse or render differently, but behavior must come from the same core implementation.
+## Use when
+- Adding or changing a CLI command that overlaps an MCP action
+- Adding or changing an MCP action that overlaps a CLI command
+- Refactoring browser/session/auth/captcha/data flows that exist on both surfaces
+## Core Rule
+Implement behavior once in shared Rust code, then keep CLI and MCP as thin adapters.
+## Required Procedure
+1. Identify the shared behavior and move it into the deepest common crate that can own it.
+2. Make CLI and MCP call that same implementation.
+3. Keep request/response schemas structurally equivalent:
+   - same required fields
+   - same enums/status values
+   - same success/failure semantics
+4. Verify the same scenario from both surfaces against the same live target.
+## Parity Checklist
+- One source of truth for runtime behavior
+- No CLI-only fallback that MCP does not have
+- No MCP-only fallback that CLI does not have
+- Same auth/session/profile semantics
+- Same challenge/detect/solve status model
+- Same cleanup expectations
+## Good Patterns
+- Shared handler in `onecrawl-cdp` or another common crate
+- CLI `run <tool> <action>` delegating to MCP/shared core
+- MCP returning structured JSON that CLI can print without inventing new semantics
+## Bad Patterns
+- Fixing only MCP because the bug was observed there
+- Fixing only CLI because it is easier to patch
+- Two separate fallback chains for the same feature
+- Same action name with different parameter aliases but different behavior
+## Verification
+- Run one live MCP probe
+- Run the equivalent live CLI command
+- Compare outcome, not just exit code
+- Check cleanup: no orphan browser/session/process left behind
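
Note on the verification step: "compare outcome, not just exit code" can be made concrete by diffing the structured results of the two surfaces. The Python sketch below assumes both the CLI and the MCP call have already been parsed into dictionaries. The function name `parity_diff` and the field names in the usage example (`status`, `ok`, `elapsed_ms`) are hypothetical and do not come from the package.

```python
def parity_diff(cli_result, mcp_result, required_fields):
    """List parity problems between two parsed results.

    A field must be present on both surfaces and carry the same value.
    Fields outside required_fields (timings, for instance) are ignored.
    """
    problems = []
    for field in required_fields:
        in_cli = field in cli_result
        in_mcp = field in mcp_result
        if not (in_cli and in_mcp):
            problems.append(
                "missing %r: cli=%s mcp=%s" % (field, in_cli, in_mcp)
            )
        elif cli_result[field] != mcp_result[field]:
            problems.append(
                "mismatch on %r: cli=%r mcp=%r"
                % (field, cli_result[field], mcp_result[field])
            )
    return problems


# Usage with hypothetical payloads: timing differs, behavior does not.
cli = {"status": "solved", "ok": True, "elapsed_ms": 10}
mcp = {"status": "solved", "ok": True, "elapsed_ms": 25}
assert parity_diff(cli, mcp, ("status", "ok")) == []
```

An empty list means the two surfaces agree on every required field. Any entry points at a field where the adapters have drifted from the shared core.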

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "onecrawl",
-  "version": "4.0.0-beta.4",
+  "version": "4.0.0-beta.6",
   "description": "Browser automation engine — CLI, MCP server, and agent skills installer",
   "license": "BUSL-1.1",
   "author": "Giulio Leone <giulio@onecrawl.dev>",