onecrawl 4.0.0-alpha.64 → 4.0.0-beta.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -17,43 +17,70 @@ Run a structured, future-proof decision process when a task may affect public co
  | Surface | Breaking Impact | Examples |
  |---------|----------------|----------|
  | **CLI flags/args** | HIGH | Renaming `--headless` → `--head`, removing `--native` |
- | **MCP tool names/schemas** | HIGH | Renaming tool actions, changing JSON schemas |
+ | **MCP tool names/schemas** | HIGH | Renaming tool actions, changing JSON input schemas |
  | **NAPI/PyO3 method signatures** | HIGH | Changing function names, param types, return types |
- | **Config keys** | MEDIUM | Renaming `config.toml` keys, changing defaults |
+ | **Config keys** (`config.toml`) | MEDIUM | Renaming keys, changing defaults, removing options |
  | **Session file format** | MEDIUM | Changing `/tmp/onecrawl-session-*.json` structure |
- | **Daemon protocol** | MEDIUM | Changing HTTP/WS API between CLI and daemon |
- | **Internal crate APIs** | LOW | Changing `pub` functions in workspace crates |
- | **CDP layer** | LOW | Internal CDP wrapper changes |
+ | **Daemon HTTP/WS protocol** | MEDIUM | Changing the API for CLI ↔ daemon communication |
+ | **Profile file format** | MEDIUM | Changing profile storage schema |
+ | **Internal crate `pub` APIs** | LOW | Changing `pub fn` in workspace crates (internal consumers only) |
+ | **CDP layer wrappers** | LOW | Internal CDP abstractions (no external consumers) |
+
+ ## Concrete OneCrawl Examples
+
+ ### Non-Breaking Change Examples
+ ```
+ ✅ Adding new CLI flag: `onecrawl session start --timeout 30`
+ ✅ Adding new MCP action: `onecrawl run browser get_accessibility_snapshot`
+ ✅ New config key with default: config.toml gains `stealth.level = "standard"`
+ ✅ Adding optional field to session JSON: `{"pid": 1234, "created_at": "..."}`
+ ✅ New NAPI export function (additive)
+ ```
+
+ ### Breaking Change Examples
+ ```
+ ⛔ Renaming `onecrawl run browser` → `onecrawl browser exec`
+ ⛔ Removing `--native` flag from session start
+ ⛔ Changing MCP tool "browser" schema (existing clients break)
+ ⛔ Changing session JSON key: "session_name" → "name"
+ ⛔ Changing PyO3 class method signature
+ ```
 
  ## Decision Framework
 
  ### Non-Breaking Path
  - Add new flags/options alongside existing ones
- - Deprecation warnings before removal (minimum 2 alpha releases)
+ - Deprecation warnings before removal (minimum 2 beta releases)
  - Backward-compatible config: new keys get defaults, old keys still work
  - Additive MCP actions (new actions, not renamed ones)
+ - Session JSON: add optional fields, never remove/rename existing
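
The deprecation rule above can be smoke-tested with a small wrapper. This is an editorial sketch, not part of the published skill: the flag name and warning text in the usage comment are illustrative assumptions about the CLI, not its real contract.

```bash
# A deprecated invocation must keep working AND emit a deprecation warning
# until the grace period ends. Pass the full command as arguments.
check_deprecated_flag() {
  out=$("$@" 2>&1) || { echo "FAIL: deprecated invocation errored"; return 1; }
  case "$out" in
    *deprecat*) echo "PASS: still works and warns" ;;
    *) echo "FAIL: no deprecation warning"; return 1 ;;
  esac
}

# Hypothetical usage (flag names are illustrative):
# check_deprecated_flag onecrawl session start --headless
```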
 
  ### Breaking Path (requires justification)
- - Must document: what breaks, who is affected, migration steps
- - Must provide: migration guide or automated migration tool
- - Must bump: version increment that signals the break
- - Quality gates unchanged: unit/integration/E2E/non-regression must still pass
+ - [ ] Document: what breaks, who is affected, migration steps
+ - [ ] Migration: provide guide or automated migration in code
+ - [ ] Version: bump that signals the break (beta → beta.N+1 minimum)
+ - [ ] CHANGELOG: explicit "BREAKING" section with before/after examples
+ - [ ] Quality gates unchanged: completion-gate must still pass
 
  ## Procedure
- 1. Identify all contract surfaces affected.
- 2. For each, classify as non-breaking or breaking.
- 3. Present options via `ask_user` with the compatibility triad:
-    - Non-Breaking Path (additive/compatible)
-    - Breaking Path (with migration plan)
-    - Alternative Structural Path (redesign to avoid the break)
+ 1. Identify all contract surfaces affected (use table above).
+ 2. For each surface, classify impact as non-breaking or breaking.
+ 3. If any HIGH-impact surface is affected, present options to user:
+    - **Non-Breaking Path** (additive/compatible — deprecate first)
+    - **Breaking Path** (with migration plan)
+    - **Alternative Structural Path** (redesign to avoid the break)
  4. Document decision in commit message and CHANGELOG.
+ 5. If breaking: add `BREAKING:` prefix to commit message.
 
  ## Done Criteria
  - Decision documented with rationale.
  - Migration steps provided for any breaking change.
- - All quality gates pass.
+ - All quality gates pass (completion-gate skill).
+ - CHANGELOG updated with breaking change notice if applicable.
 
  ## Anti-patterns
- - Silent breaking changes (no documentation)
+ - Silent breaking changes (no documentation, no CHANGELOG entry)
  - Breaking changes without version bump
- - Assuming internal changes are safe (NAPI/PyO3 are public)
+ - Assuming internal crate changes are safe (NAPI/PyO3 expose them)
+ - Removing deprecated items before 2-release grace period
+ - Breaking MCP schemas without client-side migration path
@@ -21,37 +21,74 @@ Enforce a mandatory quality gate for every Issue, Milestone, and PR with zero-er
 
  ## OneCrawl Gate Commands
 
- Run these in order. Both must pass **twice consecutively** with zero output:
+ Run in order. All must pass **twice consecutively** with zero output:
 
  ```bash
- # 1. Clippy (lint + static analysis) - 0 warnings required
+ # 1. Clippy (lint + static analysis) — 0 warnings required
  cargo clippy --workspace --exclude onecrawl-e2e --exclude onecrawl-python -- -W clippy::all
 
  # 2. Build check (fast compilation verification)
- cargo check --workspace
+ cargo check -p onecrawl-cli-rs -p onecrawl-mcp-rs
 
- # 3. Test suite (all tests, single-threaded for determinism)
+ # 3. Test suite (573 tests, single-threaded for determinism)
  cargo test --workspace --exclude onecrawl-e2e --exclude onecrawl-python -- --test-threads=1
 
- # 4. Health check (binary runs correctly)
+ # 4. Release build (binary artifact)
+ cargo build --release -p onecrawl-cli-rs
+
+ # 5. Health check (binary runs correctly)
  ./target/release/onecrawl health
  ./target/release/onecrawl --version
  ```
 
- Acceptable warnings: `onecrawl-browser` vendor crate (1 warning, do not touch).
+ Acceptable warnings: `onecrawl-browser` vendor crate (1 pre-existing clippy warning — do not touch).
+
+ ## Gate Execution Checklist
+
+ - [ ] **Pass 1 — Clippy**: 0 warnings in owned crates
+ - [ ] **Pass 1 — Build**: `cargo check` exits 0
+ - [ ] **Pass 1 — Tests**: 573 tests passed, 0 failed
+ - [ ] **Pass 1 — Binary**: `onecrawl --version` prints `v4.0.0-beta.1`
+ - [ ] **Pass 2 — Clippy**: identical to pass 1
+ - [ ] **Pass 2 — Tests**: identical to pass 1
+ - [ ] **No regressions**: test count ≥ baseline recorded before changes
+ - [ ] **Commit**: use `git commit -F /tmp/commit-msg.txt` with Co-authored-by trailer
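
The two-consecutive-clean-passes rule above is mechanical enough to automate. A minimal sketch, assuming `run_gate` is defined elsewhere to execute the full gate command sequence from this skill and return nonzero on any failure:

```bash
# Re-run the gate until two consecutive clean passes are recorded.
# run_gate is an assumed helper wrapping the gate commands listed above.
gate_until_stable() {
  clean=0
  attempts=0
  while [ "$clean" -lt 2 ] && [ "$attempts" -lt 10 ]; do
    attempts=$((attempts + 1))
    if run_gate; then
      clean=$((clean + 1))
    else
      clean=0   # any failure resets the consecutive-pass counter
    fi
  done
  if [ "$clean" -ge 2 ]; then
    echo "GATE PASSED (2 consecutive clean runs)"
  else
    echo "GATE FAILED after $attempts attempts"
    return 1
  fi
}
```

The reset-to-zero on failure is the important design choice: a fix applied between runs invalidates the previous clean pass, so the counter starts over.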
+
+ ## CI Verification (release.yml + rust-ci.yml)
+
+ After push, verify CI on GitHub:
+
+ ```bash
+ # Check CI status for current branch
+ gh run list --repo giulio-leone/onecrawl --branch "$(git branch --show-current)" --limit 5
+
+ # Watch a specific run
+ gh run watch <run-id> --repo giulio-leone/onecrawl
+
+ # Get failed job logs
+ gh run view <run-id> --repo giulio-leone/onecrawl --log-failed
+ ```
+
+ **rust-ci.yml** gates: clippy, test (--test-threads=1), build check.
+ **release.yml** gates: 5-platform matrix (linux-x64, linux-arm64, macos-x64, macos-arm64, windows-x64).
 
  ## Procedure
- 1. Run all gate commands above.
- 2. If any error/warning exists in owned crates, fix and restart from step 1.
- 3. Repeat until two clean consecutive passes are recorded.
- 4. Only then mark status `done`/merge.
+ 1. Record baseline: `cargo test ... 2>&1 | tail -1` (note test count).
+ 2. Run all gate commands above.
+ 3. If any error/warning exists in owned crates, fix and restart from step 2.
+ 4. Repeat until two clean consecutive passes are recorded.
+ 5. Verify CI is green after push (`gh run list`).
+ 6. Only then mark status `done`/merge.
 
  ## Done Criteria
  - Two consecutive clean review passes are documented.
  - No unresolved pre-existing issues in touched scope.
+ - CI (rust-ci.yml) is green on the pushed branch.
 
  ## Anti-patterns
  - Single-pass approval
- - Ignoring warnings
+ - Ignoring warnings ("it's just one warning")
  - Deferring known issues to "later"
  - Excluding tests with `#[ignore]` to pass the gate
+ - Pushing without verifying CI status
+ - Using `cargo check --workspace` when `cargo check -p <crate>` suffices (slower)
@@ -13,12 +13,13 @@ Ensure critical user flows work end-to-end with deterministic, CI-compatible exe
  ## OneCrawl E2E Architecture
 
  E2E tests live in `packages/onecrawl-rust/crates/onecrawl-e2e/` and require a running Chrome instance.
+ Session state files: `/tmp/onecrawl-session*.json` (per-PID markers for multi-agent isolation).
 
  ```bash
  # Run E2E tests (requires Chrome)
  cargo test -p onecrawl-e2e -- --test-threads=1
 
- # Manual E2E verification (quick smoke test)
+ # Manual E2E smoke test
  onecrawl session start -H # Headless Chrome
  onecrawl navigate https://example.com # Navigate
  onecrawl get title # Verify page loaded
@@ -34,20 +35,59 @@ onecrawl session close # Cleanup
  5. **Config management**: config set → config show → verify change persists
  6. **Auth persistence**: auth-state save → session close → session start → auth-state load
  7. **Stealth**: session start → stealth detection-audit → verify 0% headless detection
+ 8. **MCP server**: onecrawl mcp → tool discovery → action execution → clean shutdown
 
- ## Checklist
+ ## Multi-Agent E2E Test Template
+
+ Tests verifying process-level isolation between concurrent agents:
+
+ ```bash
+ # Agent A (PID-based session isolation)
+ onecrawl session start -H -s "agent-$$-a"
+ onecrawl navigate https://example.com
+ TITLE_A=$(onecrawl get title)
+
+ # Agent B (separate session, same daemon)
+ onecrawl session start -H -s "agent-$$-b"
+ onecrawl navigate https://httpbin.org
+ TITLE_B=$(onecrawl get title)
+
+ # Verify isolation — each session sees its own page
+ [ "$TITLE_A" != "$TITLE_B" ] && echo "PASS: isolation OK"
+
+ # Cleanup both
+ onecrawl session close -s "agent-$$-a"
+ onecrawl session close -s "agent-$$-b"
+ ```
+
+ ## Pre-Test Checklist
+ - [ ] Chrome/Chromium installed and accessible
+ - [ ] No stale sessions: `ls /tmp/onecrawl-session*.json` (should be empty)
+ - [ ] No orphan Chrome processes: `pgrep -f "chrome.*remote-debugging" | head`
+ - [ ] Release binary built: `cargo build --release -p onecrawl-cli-rs`
+
+ ## Test Execution Checklist
  - [ ] Happy path covered for each critical flow
  - [ ] Edge cases: invalid input, missing Chrome, concurrent sessions
- - [ ] Cleanup: sessions closed, temp files removed, profiles deleted
- - [ ] Deterministic: no timing-dependent assertions
+ - [ ] Deterministic: no timing-dependent assertions (use wait-for, not sleep)
  - [ ] Stable selectors: use `data-testid` or semantic selectors
  - [ ] CI-compatible: headless mode, no GUI dependencies
 
+ ## Post-Test Cleanup Checklist
+ - [ ] All sessions closed: `onecrawl session close` or `onecrawl session close -s <name>`
+ - [ ] Temp files removed: `rm -f /tmp/onecrawl-session*.json`
+ - [ ] No orphan Chrome: verify `pgrep -f "chrome.*remote-debugging"` returns empty
+ - [ ] Profiles cleaned: `onecrawl profile delete <test-profile>` if created
+ - [ ] Daemon stopped (if started for test): `onecrawl daemon stop`
+
46
83
  ## Done Criteria
47
84
  - All critical flows pass in headless mode.
48
85
  - No leaked browser processes after test completion.
86
+ - No residual session files in `/tmp/`.
49
87
 
50
88
  ## Anti-patterns
51
89
  - Hard-coded sleep/timeouts instead of wait-for conditions
52
- - Tests that depend on network state
90
+ - Tests that depend on network state or external services
53
91
  - Shared mutable state between test cases
92
+ - Forgetting to close sessions (leaks Chrome processes)
93
+ - Using `--test-threads=N` where N > 1 (session conflicts)
@@ -11,43 +11,113 @@ Keep local plan and GitHub project artifacts perfectly aligned.
  - Creating or updating a plan
  - Changing issue status
  - Completing milestones
+ - Preparing a release
 
  ## OneCrawl Repository
  - Owner: `giulio-leone`
  - Repo: `onecrawl`
- - Branch: `main`
- - Tags: `v4.0.0-alpha.XX`
+ - Default branch: `main`
+ - Current tag: `v4.0.0-beta.1`
+ - CI workflows: `release.yml` (5-platform matrix), `rust-ci.yml`
 
  ## Naming Rules
  - Milestones: `MX: <Title>` (e.g., `M14: Deep Audit`)
  - Issues: `<type>: <description>` (e.g., `fix: session name path traversal`)
  - Branches: `<type>/<short-description>` (e.g., `fix/session-path-traversal`)
- - Tags: `v4.0.0-alpha.XX` (SemVer pre-release)
+ - Tags: SemVer pre-release (e.g., `v4.0.0-beta.2`)
+ - Commits: use `git commit -F /tmp/commit-msg.txt` (not `-m`), always include Co-authored-by trailer
 
  ## Required Labels
  - Priority: `P0-critical`, `P1-high`, `P2-medium`, `P3-low`
  - Type: `bug`, `feature`, `refactor`, `docs`, `chore`
  - Status: `todo`, `in-progress`, `review`, `done`, `blocked`
 
+ ## gh CLI Commands
+
+ ```bash
+ # Create milestone
+ gh api repos/giulio-leone/onecrawl/milestones -f title="M15: Title" -f state="open"
+
+ # Create issue with labels and milestone
+ gh issue create --repo giulio-leone/onecrawl \
+   --title "fix: description" \
+   --body "Details..." \
+   --label "bug,P1-high,in-progress" \
+   --milestone "M15: Title"
+
+ # Update issue status
+ gh issue edit <number> --repo giulio-leone/onecrawl --remove-label "in-progress" --add-label "done"
+
+ # Close issue
+ gh issue close <number> --repo giulio-leone/onecrawl
+
+ # List open issues for a milestone
+ gh issue list --repo giulio-leone/onecrawl --milestone "M15: Title" --state open
+
+ # Check CI status
+ gh run list --repo giulio-leone/onecrawl --branch main --limit 5
+ ```
+
  ## Procedure
  1. Create GitHub milestone matching local plan milestone.
  2. Create issues for each task, linked to milestone.
  3. Apply labels (priority + type + status).
- 4. Update issue status as work progresses.
- 5. Close milestone when all issues are done.
+ 4. Update issue status labels as work progresses.
+ 5. Close issues when completion gate passes.
+ 6. Close milestone when all issues are done.
+
+ ## OneCrawl Release Workflow
+
+ ### Pre-Release Checklist
+ - [ ] All milestone issues closed
+ - [ ] Completion gate passed (2 consecutive clean runs)
+ - [ ] CHANGELOG.md updated with release notes
+ - [ ] Version bumped in `packages/onecrawl-rust/Cargo.toml`
+ - [ ] `cargo check --workspace` (propagates version to all crates)
+
+ ### Release Steps
+ ```bash
+ # 1. Version bump
+ # Edit packages/onecrawl-rust/Cargo.toml — update version field
+ cargo check --workspace
+
+ # 2. Update CHANGELOG.md — add release section
+
+ # 3. Commit
+ echo "chore: release v4.0.0-beta.2
+
+ - <summary of changes>
+
+ Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>" > /tmp/release-msg.txt
+ git add -A && git commit -F /tmp/release-msg.txt
+
+ # 4. Tag and push
+ git tag v4.0.0-beta.2
+ git push origin main --tags
+
+ # 5. npm publish (from packages/onecrawl/)
+ cd packages/onecrawl
+ npm version 4.0.0-beta.2 --no-git-tag-version
+ npm publish --tag beta --access public
+
+ # 6. Verify release.yml CI (5-platform matrix)
+ gh run list --repo giulio-leone/onecrawl --workflow release.yml --limit 1
+ ```
 
- ## OneCrawl Release Sync
- After each alpha release:
- 1. Tag: `git tag v4.0.0-alpha.XX && git push origin main --tags`
- 2. npm: `npm publish --tag alpha --access public`
- 3. CHANGELOG: update with release notes
+ ### Post-Release Verification
+ - [ ] `gh run view <id>` — all 5 platforms green
+ - [ ] npm: `npm view onecrawl versions --json | tail` shows the new version
+ - [ ] GitHub tag visible: `gh release list --repo giulio-leone/onecrawl`
 
  ## Done Criteria
  - All plan items have matching GitHub artifacts.
  - Labels are applied consistently.
  - Milestones reflect actual progress.
+ - Releases are tagged, CI-verified, and npm-published.
 
  ## Anti-patterns
  - Local plan diverges from GitHub state
  - Missing labels on issues
  - Stale milestones with no issues
+ - Pushing tags before CI verification
+ - Using `git commit -m` instead of `git commit -F`
@@ -13,14 +13,14 @@ Enforce a strict iterative decision loop with consistent user checkpoints and ex
  - Completing an autonomous run
 
  ## Default 5-Option Structure
- 1. **Recommended Development Path** (mark with star as most future-proof)
+ 1. **Recommended Development Path** (mark with ⭐ as most future-proof)
  2. **Alternative Development Path A**
  3. **Alternative Development Path B**
  4. **Freeform** — set enum value to `custom` (allows free-text input)
  5. **Autonomous Mode**
 
  ## Escalation (Compatibility Triad)
- Use ONLY when there is a concrete compatibility/contract impact:
+ Use ONLY when there is a concrete compatibility/contract impact (see breaking-change-paths skill):
  1. Non-Breaking Path
  2. Breaking Path
  3. Alternative Structural Path
@@ -31,17 +31,62 @@ Use ONLY when there is a concrete compatibility/contract impact:
  Each option must include:
  `<Title> — Why: <reason> | Leads to: <next step> | Risk: <low|medium|high>`
 
+ ### OneCrawl Example Cards
+ ```
+ ⭐ 1. Add new --timeout flag (non-breaking)
+ Why: Users need session timeouts | Leads to: CLI + daemon changes | Risk: low
+
+ 2. Replace --headless with --head (breaking)
+ Why: Simpler flag name | Leads to: CLI + docs + migration | Risk: high
+
+ 3. Config-only timeout (no CLI flag)
+ Why: Minimal surface change | Leads to: config.toml + docs | Risk: low
+
+ 4. Freeform — describe your preferred approach
+
+ 5. Autonomous Mode — I'll implement the recommended path and report back
+ ```
+
+ ## Copilot CLI Implementation
+
+ In Copilot CLI runtime, the interaction loop uses natural language responses (not tool calls).
+ The agent presents options as a numbered list in the response and the user replies with their choice.
+
+ ### Autonomous Mode Behavior
+ When the user selects Autonomous Mode (option 5):
+ 1. Agent implements the recommended path (option 1) without further prompts.
+ 2. Follows all applicable skills (planning-tracking, testing-policy, completion-gate).
+ 3. On completion, presents a summary with:
+    - What was done
+    - Test results (exact counts)
+    - Files changed
+    - Asks: "Rate this result (1-5), suggest next action, or say 'I am satisfied'."
+ 4. Loop continues until the exact phrase **"I am satisfied"** is received.
+
+ ### Decision Tracking
+ Record each decision point in the session database:
+ ```sql
+ INSERT INTO session_state (key, value) VALUES
+ ('decision_1', 'Option 1: Add --timeout flag (non-breaking) — user chose at 2024-01-15T10:30Z');
+ ```
+
  ## Rules
  - One clear question per iteration
  - Exactly 5 options every time
- - Option 4 is ALWAYS Freeform (hard invariant)
+ - Option 4 is ALWAYS Freeform (hard invariant — label "Freeform", enum value `custom`)
  - Options 1-3 carry forward from previous iteration unless explicitly replaced
- - Mark recommendation with star; explain changes
+ - Mark recommendation with ⭐; explain if recommendation changes between iterations
  - At autonomous completion: ask for rating, next action, satisfaction
- - Loop stops ONLY on exact phrase: "I am satisfied"
+ - Loop stops ONLY on exact phrase: **"I am satisfied"**
+
+ ## Done Criteria
+ - User has explicitly ended the loop with "I am satisfied".
+ - All decisions are documented in session log.
 
  ## Anti-patterns
  - Multi-question prompts in one iteration
- - Missing or renaming Freeform option
+ - Missing or renaming Freeform option (option 4)
  - Stopping without explicit "I am satisfied"
- - Using compatibility triad as default (it's an escalation)
+ - Using compatibility triad as default (it's an escalation for breaking changes)
+ - Implementing without presenting options first
+ - Changing recommendation without explaining why
@@ -19,42 +19,85 @@ interface Milestone { id: string; description: string; priority: "critical"|"hig
  interface Issue { id: string; task: string; priority: "critical"|"high"|"medium"|"low"; status: "todo"|"in_progress"|"review"|"done"|"blocked"; depends_on: string[]; children: Record<string, Issue>; }
  ```
 
+ ## SQL-Based Tracking
+
+ Use the session database for operational tracking:
+
+ ```sql
+ -- Create plan items
+ INSERT INTO todos (id, title, description, status) VALUES
+ ('m1-cdp-refactor', 'Refactor CDP error handling', 'Update onecrawl-cdp error types to use thiserror, propagate to onecrawl-mcp-rs', 'pending'),
+ ('m1-cli-flags', 'Add --timeout flag', 'Add session timeout to onecrawl-cli-rs session start', 'pending'),
+ ('m1-tests', 'Add timeout tests', 'Unit tests for session timeout in cli-rs and cdp', 'pending');
+
+ -- Declare dependencies
+ INSERT INTO todo_deps (todo_id, depends_on) VALUES
+ ('m1-cli-flags', 'm1-cdp-refactor'), -- CLI depends on CDP changes
+ ('m1-tests', 'm1-cli-flags'); -- Tests depend on implementation
+
+ -- Find ready items (no pending dependencies)
+ SELECT t.id, t.title FROM todos t
+ WHERE t.status = 'pending'
+ AND NOT EXISTS (
+   SELECT 1 FROM todo_deps td
+   JOIN todos dep ON td.depends_on = dep.id
+   WHERE td.todo_id = t.id AND dep.status != 'done'
+ );
+
+ -- Update status as work progresses
+ UPDATE todos SET status = 'in_progress' WHERE id = 'm1-cdp-refactor';
+ UPDATE todos SET status = 'done' WHERE id = 'm1-cdp-refactor';
+ ```
+
  ## OneCrawl Workspace Parallelism Rules
 
  Safe to parallelize:
- - Changes in different crates (e.g., `onecrawl-cdp` + `onecrawl-parser`)
+ - Changes in independent crates (e.g., `onecrawl-crypto` + `onecrawl-parser`)
  - NAPI + PyO3 bindings (independent build targets)
  - Documentation + code changes
+ - Tests in different crates (with `--test-threads=1` per crate)
 
  NOT safe to parallelize:
  - Changes in a crate + its dependents (e.g., `onecrawl-cdp` + `onecrawl-cli-rs`)
  - Multiple changes to the same file
- - Version bumps (must be sequential: Cargo.toml then package.json)
+ - Version bumps (must be sequential: Cargo.toml → package.json → tag)
+ - Any two changes that touch session state or `/tmp/onecrawl-session*.json`
 
  ## OneCrawl Crate Dependency Graph
  ```
- onecrawl-cli-rs
-   -> onecrawl-mcp-rs -> onecrawl-cdp -> onecrawl-browser
-   -> onecrawl-server -> onecrawl-cdp
-   -> onecrawl-core
-   -> onecrawl-crypto
-   -> onecrawl-parser
-   -> onecrawl-storage
+ onecrawl-cli-rs (binary)
+ ├── onecrawl-mcp-rs ──→ onecrawl-cdp ──→ onecrawl-browser (vendor)
+ ├── onecrawl-server ──→ onecrawl-cdp
+ ├── onecrawl-core
+ ├── onecrawl-crypto
+ ├── onecrawl-parser
+ └── onecrawl-storage
+
+ Independent leaf crates (safe to change in parallel):
+   onecrawl-core, onecrawl-crypto, onecrawl-parser, onecrawl-storage
+
+ High-impact crates (changes cascade):
+   onecrawl-cdp (affects: mcp-rs, server, cli-rs)
+   onecrawl-browser (affects: cdp, mcp-rs, server, cli-rs)
  ```
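
The cascade sets in the graph above can also be derived mechanically rather than maintained by hand. A sketch: `cargo tree -i` (invert) is standard Cargo and prints the reverse dependency tree; the small parsing helper is ours and assumes the `onecrawl-` crate-name prefix:

```bash
# Given `cargo tree -i <crate>` output on stdin, list the unique crate names
# a change to that crate cascades to.
cascade_targets() {
  grep -oE 'onecrawl-[a-z0-9-]+' | sort -u
}

# Real usage (requires the workspace checked out):
# cargo tree -i onecrawl-cdp --workspace | cascade_targets
```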
 
  ## Procedure
- 1. Build plan before implementation.
- 2. Assign unique IDs to milestones/issues.
- 3. Declare dependencies for every issue.
- 4. Execute by dependency order, then priority (critical then high then medium then low).
- 5. Run independent same-priority milestones in parallel when safe.
- 6. Update statuses continuously.
+ 1. Build plan before implementation (use SQL todos).
+ 2. Assign unique IDs to milestones/issues (kebab-case: `m1-feature-name`).
+ 3. Declare dependencies using `todo_deps` table.
+ 4. Execute by dependency order, then priority (critical → high → medium → low).
+ 5. Run independent same-priority items in parallel when safe per rules above.
+ 6. Update statuses continuously (`pending` → `in_progress` → `done`).
+ 7. Sync to GitHub when milestones complete (github-sync skill).
 
  ## Done Criteria
- - Plan exists, is up-to-date, and reflects actual execution state.
- - Dependencies are respected and parallel work is safe.
+ - Plan exists in SQL todos, is up-to-date, and reflects actual execution state.
+ - Dependencies are respected and parallel work follows safety rules.
+ - GitHub artifacts are synced (github-sync skill).
 
  ## Anti-patterns
  - Starting implementation without a plan
- - Missing dependency declarations
+ - Missing dependency declarations (causes parallel conflicts)
  - Running blocked items in parallel
+ - Changing `onecrawl-cdp` and `onecrawl-cli-rs` simultaneously
+ - Using free-form notes instead of structured SQL tracking
@@ -11,42 +11,77 @@ Detect and remove contradictions across agent policies before execution.
  - Updating AGENTS.md or runtime adapters
  - Merging new workflow rules
  - Noticing behavioral ambiguity during execution
+ - Adding or modifying a SKILL.md file
 
  ## OneCrawl Policy Files
 
- | File | Purpose |
- |------|---------|
- | `AGENTS.md` | Root dispatcher, skill catalog |
- | `AGENTS.vscode.MD` | VS Code runtime adapter |
- | `AGENTS.copilot-cli.MD` | Copilot CLI runtime adapter |
- | `.github/copilot-instructions.md` | Build, test, architecture reference |
- | `.github/skills/*/SKILL.md` | 14 operational skills |
+ | File | Purpose | Owner |
+ |------|---------|-------|
+ | `AGENTS.md` | Root dispatcher, skill catalog | Runtime-critical contract |
+ | `AGENTS.vscode.MD` | VS Code runtime adapter | `vscode_askQuestions` binding |
+ | `AGENTS.copilot-cli.MD` | Copilot CLI runtime adapter | `ask_user` binding |
+ | `.github/copilot-instructions.md` | Build, test, architecture reference | Project-level context |
+ | `.github/skills/*/SKILL.md` | 14 operational skills | Procedural ownership |
 
  ## Audit Checklist
 
- 1. Language: Are all policies in English?
- 2. Interaction model: Is ask_user (CLI) vs vscode_askQuestions (VS Code) correctly bound?
- 3. Freeform invariant: Is option 4 always Freeform in all 5-option prompts?
- 4. Completion gates: Do all skills reference the same gate procedure?
- 5. Scope: Are skills non-overlapping? No duplicate procedures?
- 6. Skill references: Do AGENTS.md catalog entries match actual SKILL.md files?
- 7. Tool names: Are MCP tool references correct (onecrawl run, not chrome-devtools)?
- 8. Build commands: Are cargo/npm commands consistent across all skills?
- 9. Version: Are version references up-to-date?
- 10. Stop condition: Is "I am satisfied" the only stop phrase in all loop references?
+ ### Language & Format
+ - [ ] All policies are in English (no mixed languages)
+ - [ ] Markdown formatting is consistent (headers, lists, code blocks)
+ - [ ] No prose duplication between AGENTS.md and SKILL.md files
+
+ ### Runtime Binding
+ - [ ] `AGENTS.copilot-cli.MD` uses only `ask_user` (never `vscode_askQuestions`)
+ - [ ] `AGENTS.vscode.MD` uses only `vscode_askQuestions` (never `ask_user`)
+ - [ ] Both adapters have semantic parity (same options, different tool names)
+
+ ### Interaction Model
+ - [ ] Option 4 is ALWAYS "Freeform" with enum value `custom` in all 5-option prompts
+ - [ ] Stop condition is always exact phrase "I am satisfied"
+ - [ ] Autonomous mode is always option 5
+ - [ ] Compatibility triad is escalation-only, not default
+
+ ### Completion Gate Consistency
+ - [ ] All skills reference same gate commands:
+    - `cargo clippy --workspace --exclude onecrawl-e2e --exclude onecrawl-python -- -W clippy::all`
+    - `cargo test --workspace --exclude onecrawl-e2e --exclude onecrawl-python -- --test-threads=1`
+ - [ ] All skills agree: `onecrawl-browser` vendor crate 1 warning is acceptable
+ - [ ] Test count reference is consistent (573 tests as of v4.0.0-beta.1)
+
+ ### Tool & Command Coherence
+ - [ ] MCP tool references use `onecrawl run <tool>` format (not `chrome-devtools`)
+ - [ ] Build commands are consistent: `cargo check -p onecrawl-cli-rs -p onecrawl-mcp-rs`
+ - [ ] Commit convention: `git commit -F /tmp/file.txt` (not `-m`), Co-authored-by trailer
+ - [ ] Version references: `v4.0.0-beta.1` (not stale alpha references)
+
+ ### Cross-Skill Coherence
+ - [ ] No two skills define the same procedure (ownership is exclusive)
+ - [ ] Skill catalog in `AGENTS.md` matches actual `.github/skills/*/SKILL.md` files (count = 14)
+ - [ ] Each skill trigger description in catalog matches the skill's "Use when" section
+ - [ ] No circular dependencies between skills
+
+ ### OneCrawl-Specific
+ - [ ] Crate exclusions consistent: `--exclude onecrawl-e2e --exclude onecrawl-python`
+ - [ ] VecDeque convention: `.back()` not `.last()` in all skills
+ - [ ] `rand::rng()` warning present where async code is discussed
+ - [ ] Session files: `/tmp/onecrawl-session*.json` format consistent
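
The catalog-count item in the checklist above is easy to verify mechanically. An editorial sketch, assuming the skill layout from the policy-files table (one `SKILL.md` per skill directory):

```bash
# Compare the number of SKILL.md files on disk with the catalog's expected count.
check_skill_count() {
  dir=$1
  expected=$2
  actual=$(find "$dir" -name SKILL.md 2>/dev/null | wc -l | tr -d ' ')
  if [ "$actual" -eq "$expected" ]; then
    echo "PASS: $actual skills match catalog"
  else
    echo "FAIL: catalog says $expected, found $actual"
  fi
}

# Usage against the layout described above:
# check_skill_count .github/skills 14
```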
 
  ## Procedure
  1. Read all policy files listed above.
- 2. Run each checklist item.
- 3. Document any contradictions found.
- 4. Fix contradictions (update the owning file).
- 5. Verify fix does not introduce new contradictions.
+ 2. Run each checklist item — mark pass/fail.
+ 3. Document any contradictions found with file:line references.
+ 4. Fix contradictions (update the **owning** file, not duplicates).
+ 5. Verify fix does not introduce new contradictions (re-run checklist).
+ 6. If skill content was changed, verify AGENTS.md catalog entry still matches.
 
  ## Done Criteria
  - All checklist items pass.
  - No contradictions between any two policy files.
+ - Skill catalog count matches actual skill directory count.
 
  ## Anti-patterns
  - Duplicating procedures across skills and AGENTS files
  - Mixing tool names across runtimes
  - Updating one file without checking cross-references
+ - Adding a skill without updating AGENTS.md catalog
+ - Fixing a contradiction by duplicating content (fix at source instead)
@@ -8,34 +8,103 @@ description: "Multi-step tool workflows via code orchestration to reduce latency
8
8
  Execute multi-step tool workflows via code orchestration to reduce latency, context pollution, and token overhead.
9
9
 
10
10
  ## Use when
11
- - 3+ dependent tool calls
12
- - Large intermediate outputs (logs, tables, files)
11
+ - 3+ dependent tool calls in sequence
12
+ - Large intermediate outputs (logs, tables, file listings)
13
13
  - Branching logic, retries, or fan-out/fan-in workflows
14
+ - Multi-crate operations that follow the dependency graph
14
15
 
15
16
  ## Core Idea
16
- Treat tools as callable functions inside an orchestration runtime (script/runner), not as one-turn-at-a-time chat actions.
17
+ Treat tools as callable functions inside an orchestration runtime (bash script, Python), not as one-turn-at-a-time chat actions. Return only high-signal summaries to the model context.
17
18
 
18
- ## Procedure
19
- 1. Generate/execute orchestration code for loops, conditionals, parallel calls, retries, and early termination.
20
- 2. Process intermediate data in runtime (filter/aggregate/transform) instead of returning raw data to model context.
21
- 3. Return only high-signal outputs to the model (summary, decision, artifact references).
19
+ ## OneCrawl Patterns
20
+
21
+ ### Pattern 1: Sequential Gate Check (reduce 4 tool calls to 1)
22
+ ```bash
23
+ # Instead of 4 separate tool calls, run gate as single orchestrated script
+ set -o pipefail   # so the tail pipe doesn't mask a cargo test failure
+ GATE_RESULT=$(
+ cargo clippy --workspace --exclude onecrawl-e2e --exclude onecrawl-python -- -W clippy::all 2>&1 && \
+ cargo test --workspace --exclude onecrawl-e2e --exclude onecrawl-python -- --test-threads=1 2>&1 | tail -5
+ )
+ GATE_STATUS=$?
+ echo "GATE (exit=$GATE_STATUS): $GATE_RESULT"
29
+ ```
30
+
31
+ ### Pattern 2: Multi-Crate Parallel Build Check
32
+ ```bash
33
+ # Fan-out: check independent crates in parallel, collecting PIDs
+ pids=()
+ cargo check -p onecrawl-crypto & pids+=($!)
+ cargo check -p onecrawl-parser & pids+=($!)
+ cargo check -p onecrawl-storage & pids+=($!)
+ cargo check -p onecrawl-core & pids+=($!)
+ # Fan-in: plain `wait` ignores child exit codes, so wait on each PID
+ for pid in "${pids[@]}"; do wait "$pid" || exit 1; done
+ echo "All independent crates OK"
41
+
42
+ # Then check dependent crates sequentially
43
+ cargo check -p onecrawl-cdp && cargo check -p onecrawl-mcp-rs && cargo check -p onecrawl-cli-rs
44
+ ```
45
+
46
+ ### Pattern 3: Batch File Analysis (filter before returning to model)
47
+ ```bash
48
+ # Instead of reading 12 Cargo.toml files individually:
49
+ grep "^version" packages/onecrawl-rust/crates/*/Cargo.toml | \
+ grep -v "workspace" | head -20
51
+ # Returns only the version lines, not full file contents
52
+ ```
22
53
 
23
- ## Why It Works (provider/model independent)
24
- - Fewer model round-trips for multi-call workflows.
25
- - Intermediate data stays out of context unless needed.
26
- - Explicit code control flow is easier to test, monitor, and debug.
54
+ ### Pattern 4: Multi-Agent Session Orchestration
55
+ ```bash
56
+ # Orchestrate multiple onecrawl sessions for testing
57
+ for agent in agent-a agent-b agent-c; do
58
+ onecrawl session start -H -s "$agent" &
59
+ done
60
+ wait
61
+
62
+ # Verify all started
63
+ for agent in agent-a agent-b agent-c; do
64
+ onecrawl get title -s "$agent"
65
+ done
66
+
67
+ # Cleanup
68
+ for agent in agent-a agent-b agent-c; do
69
+ onecrawl session close -s "$agent"
70
+ done
71
+ ```
72
+
73
+ ### Pattern 5: Conditional Retry with Backoff
74
+ ```bash
75
+ MAX_RETRIES=3
76
+ for i in $(seq 1 $MAX_RETRIES); do
77
+ if cargo test -p onecrawl-cdp -- --test-threads=1 2>&1 | tail -1 | grep -q "ok"; then
78
+ echo "PASS on attempt $i"
79
+ break
80
+ fi
81
+ echo "FAIL attempt $i/$MAX_RETRIES — waiting ${i}s"
82
+ sleep "$i"
83
+ done
84
+ ```
85
+
86
+ ## Procedure
87
+ 1. Identify workflows with 3+ sequential tool calls.
88
+ 2. Write orchestration code (bash preferred for CLI operations).
89
+ 3. Process intermediate data in the script (filter/aggregate/transform).
90
+ 4. Return only high-signal output to model context (summary, pass/fail, counts).
91
+ 5. Validate tool results before using them in subsequent steps.
27
92
 
28
93
  ## Guardrails
29
- - Strict input/output schemas.
30
- - Validate tool results before use.
31
- - Idempotent/retry-safe tool design when possible.
32
- - Timeout/cancellation/expiry handling.
33
- - Sandbox execution for untrusted code; never blindly execute external payloads.
94
+ - Strict input/output schemas — validate before use.
95
+ - Idempotent/retry-safe operations when possible.
96
+ - Timeout handling: `timeout 60 cargo test ...` for long operations.
97
+ - Never blindly execute tool output as code.
98
+ - Use `set -e` in bash scripts to fail fast on errors.
99
+ - Cap parallel jobs to avoid resource exhaustion: `wait` after fan-out.
34
100
 
35
101
  ## Done Criteria
36
102
  - Workflow completes with reduced context load and deterministic control flow.
103
+ - Intermediate data stays out of model context unless decision-relevant.
37
104
 
38
105
  ## Anti-patterns
39
- - Returning raw intermediate payloads to the model by default
40
- - Unbounded loops without stop conditions
41
- - Executing unvalidated tool output
106
+ - Returning raw `cargo test` output (500+ lines) to model context
107
+ - Unbounded loops without stop conditions or max retries
108
+ - Executing unvalidated tool output as shell commands
109
+ - Sequential tool calls for independent operations (use parallel)
110
+ - Hardcoding paths instead of using workspace-relative references
@@ -14,42 +14,77 @@ Stop ineffective iteration loops after repeated completion gate failures and cho
14
14
 
15
15
  | Failure | Root Cause | Recovery |
16
16
  |---------|-----------|----------|
17
- | Clippy warnings persist | Vendor crate or structural pattern | Add targeted allow with justification |
18
- | Test timeout | Browser not running or port conflict | Check daemon status |
19
- | Linker error PyO3 | PyO3 test mode linker conflict | Exclude onecrawl-python from test runs |
20
- | CDP connection refused | Daemon crashed or wrong port | Restart daemon |
21
- | Memory growth in tests | Unbounded Vec or HashMap | Cap with VecDeque |
22
- | Send compile error | rand rng across await | Scope RNG in sync block before await |
17
+ | Clippy warnings persist | Vendor crate or structural pattern | Add targeted `#[allow(...)]` with comment justification |
18
+ | Test timeout | Browser not running or port conflict | `onecrawl health` → restart daemon if needed |
19
+ | Linker error (PyO3) | PyO3 test mode linker conflict | Exclude `onecrawl-python` from test runs |
20
+ | CDP connection refused | Daemon crashed or wrong port | `onecrawl daemon stop && onecrawl daemon start` |
21
+ | Memory growth in tests | Unbounded Vec or HashMap | Cap with VecDeque, use `.back()` not `.last()` |
22
+ | Send compile error | `rand::rng()` across `.await` | Scope RNG in sync block: `let val = { rand::rng().random() };` |
23
+ | Serde parse failure | `unwrap_or_default()` on bad input | Use `match` with proper error propagation |
24
+ | Session file collision | Multi-agent PID race | Use `session start -s "unique-name-$$"` |
25
+ | Build cache stale | Incremental compilation artifacts | `cargo clean && cargo check --workspace` |
23
26
 
24
27
  ## RCA Procedure
25
- 1. Reproduce: Run the exact failing command and capture full output.
26
- 2. Classify: Is it architecture, dependency, scope, or environment?
27
- 3. Analyze:
28
- - Architecture: Does the fix require structural changes across crates?
29
- - Dependency: Is a third-party crate causing the issue?
30
- - Scope: Was the original issue too broadly defined?
31
- - Environment: Is it platform or toolchain specific?
32
-
33
- ## Recovery Options (present via ask_user)
34
- 1. Rescope: Break the issue into smaller, independently-completable pieces.
35
- 2. Rollback: git stash or git checkout to last known-good state, then retry with different approach.
36
- 3. Redesign: Rethink the approach, maybe the breaking change path is needed.
28
+
29
+ ### Step 1 — Reproduce
30
+ ```bash
31
+ # Capture full failing output
32
+ cargo clippy --workspace --exclude onecrawl-e2e --exclude onecrawl-python -- -W clippy::all 2>&1 | tee /tmp/rca-clippy.log
33
+ cargo test --workspace --exclude onecrawl-e2e --exclude onecrawl-python -- --test-threads=1 2>&1 | tee /tmp/rca-test.log
34
+ ```
35
+
36
+ ### Step 2 — Classify
37
+ - [ ] **Architecture**: Fix requires structural changes across multiple crates?
38
+ - [ ] **Dependency**: Third-party crate causing the issue? (`cargo tree -p <crate>`)
39
+ - [ ] **Scope**: Original issue too broadly defined? Can it be split?
40
+ - [ ] **Environment**: Platform or toolchain specific? (check rust-ci.yml matrix)
41
+
42
+ ### Step 3 — Analyze
43
+ Examine the crate dependency graph to understand blast radius:
44
+ ```
45
+ onecrawl-cli-rs → onecrawl-mcp-rs → onecrawl-cdp → onecrawl-browser
46
+ onecrawl-cli-rs → onecrawl-server → onecrawl-cdp
47
+ onecrawl-cli-rs → onecrawl-core, onecrawl-crypto, onecrawl-parser, onecrawl-storage
48
+ ```
49
+ A failure in `onecrawl-cdp` affects CLI, MCP, and server. A failure in `onecrawl-crypto` is isolated.
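As a quick lookup, the graph above can be mirrored in a small function (keep it in sync with the graph; for the authoritative answer, run `cargo tree -i -p <crate>` from the workspace root):

```bash
# Downstream consumers per the dependency graph above.
consumers_of() {
  case "$1" in
    onecrawl-cdp)     echo "onecrawl-mcp-rs onecrawl-server onecrawl-cli-rs" ;;
    onecrawl-browser) echo "onecrawl-cdp onecrawl-mcp-rs onecrawl-server onecrawl-cli-rs" ;;
    onecrawl-crypto|onecrawl-core|onecrawl-parser|onecrawl-storage)
                      echo "onecrawl-cli-rs" ;;
    *)                echo "" ;;
  esac
}
consumers_of onecrawl-cdp   # → onecrawl-mcp-rs onecrawl-server onecrawl-cli-rs
```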
50
+
51
+ ## Recovery Options (present to user)
52
+ 1. **Rescope**: Break the issue into smaller, independently-completable pieces.
53
+ 2. **Rollback**: Return to last known-good state, retry with different approach.
54
+ 3. **Redesign**: Rethink the approach — the breaking-change path may be needed.
37
55
 
38
56
  ## OneCrawl Rollback Commands
39
57
 
40
58
  ```bash
41
- git stash push -m "rollback: issue-id"
42
- git checkout v4.0.0-alpha.XX
43
- cargo check --workspace
59
+ # Option A: Stash current work
60
+ git stash push -m "rollback: issue-id — gate failure #3"
61
+
62
+ # Option B: Check out the last known-good tag (detached HEAD)
63
+ git checkout v4.0.0-beta.1
64
+
65
+ # Option C: Reset specific files to main
66
+ git checkout main -- packages/onecrawl-rust/crates/<crate>/src/
67
+
68
+ # Verify clean state after rollback
69
+ cargo check -p onecrawl-cli-rs -p onecrawl-mcp-rs
44
70
  cargo test --workspace --exclude onecrawl-e2e --exclude onecrawl-python -- --test-threads=1
45
71
  ```
46
72
 
73
+ ## Post-Recovery Verification
74
+ - [ ] Gate commands pass after recovery (completion-gate skill)
75
+ - [ ] Root cause documented in session log
76
+ - [ ] Recovery option chosen is recorded
77
+ - [ ] If rescoped: new sub-issues created on GitHub (github-sync skill)
78
+ - [ ] Cleanup: `rm -f /tmp/rca-*.log`
79
+
47
80
  ## Done Criteria
48
81
  - Root cause identified and documented.
49
82
  - Recovery option selected and executed.
50
- - Gate passes after recovery.
83
+ - Completion gate passes after recovery.
51
84
 
52
85
  ## Anti-patterns
53
86
  - Continuing to iterate without analyzing root cause
54
87
  - Ignoring test failures to move forward
55
88
  - Rolling back without understanding what went wrong
89
+ - Applying the same fix strategy after 3 failures (definition of insanity)
90
+ - Not cleaning up RCA artifacts (`/tmp/rca-*.log`)
@@ -15,55 +15,97 @@ Maintain an accurate, auditable execution journal for each working session.
15
15
  ## OneCrawl Session Template
16
16
 
17
17
  ```markdown
18
- # Session: YYYY-MM-DD
18
+ # Session: YYYY-MM-DD-HHmm
19
19
 
20
20
  ## Status
21
- - Branch: main
22
- - Version: v4.0.0-alpha.XX
21
+ - Branch: <current branch>
22
+ - Version: v4.0.0-beta.1
23
23
  - Mode: Autonomous / Interactive
24
+ - Started: <ISO 8601 timestamp>
25
+ - Ended: <ISO 8601 timestamp>
24
26
 
25
27
  ## Work Completed
26
- - [ ] Issue/feature description
27
- - [ ] Tests added/updated
28
+ - [ ] Issue/feature description (link to GitHub issue)
29
+ - [ ] Tests added/updated (count: +N new, M modified)
28
30
 
29
31
  ## Completion Gate Evidence
30
- - Clippy: 0 warnings (1 vendor)
31
- - Tests: XX passed, 0 failed
32
- - Build: cargo check clean
32
+ - Clippy: 0 warnings (1 vendor — onecrawl-browser)
33
+ - Tests: 573 passed, 0 failed, 0 ignored
34
+ - Build: cargo check -p onecrawl-cli-rs -p onecrawl-mcp-rs clean
35
+ - Binary: ./target/release/onecrawl --version → v4.0.0-beta.1
36
+ - CI: rust-ci.yml run #<id> — all green
33
37
 
34
38
  ## Decisions
35
- - Decision 1: rationale
39
+ - Decision 1: rationale (link to interaction-loop option chosen)
36
40
 
37
41
  ## Blockers
38
42
  - None / description
39
43
 
40
44
  ## GitHub Sync
41
- - Commits pushed: X
42
- - Tags: vX.X.X-alpha.XX
43
- - npm: published vX.X.X-alpha.XX
45
+ - Commits pushed: X (SHAs: abc1234, def5678)
46
+ - Issues closed: #N, #M
47
+ - Tags: v4.0.0-beta.X
48
+ - npm: published v4.0.0-beta.X
49
+ - CI runs: <run-id> (green/red)
44
50
 
45
51
  ## OneCrawl Health
46
- - CLI: vX.X.X-alpha.XX
47
- - Binary: XXM
52
+ - CLI version: v4.0.0-beta.1
53
+ - Binary size: XXM (./target/release/onecrawl)
48
54
  - Daemon: running/stopped
49
- - Sessions: X active
55
+ - Active sessions: X (/tmp/onecrawl-session*.json)
56
+ - Crate count: 12 workspace members
57
+ ```
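The template above can be bootstrapped with real timestamps; the output path is an assumption, so adjust it to wherever session journals live:

```bash
# Create a timestamped session journal skeleton matching the template above.
STAMP=$(date -u +%Y-%m-%d-%H%M)
LOG="/tmp/session-$STAMP.md"   # assumed location
{
  echo "# Session: $STAMP"
  echo ""
  echo "## Status"
  echo "- Branch: $(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo unknown)"
  echo "- Started: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
  echo "- Ended: <fill on close>"
} > "$LOG"
echo "journal created: $LOG"
```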
58
+
59
+ ## SQL Checkpoint Integration
60
+
61
+ Use the session database to track session state across tool calls:
62
+
63
+ ```sql
64
+ -- Record session start
65
+ INSERT OR REPLACE INTO session_state (key, value) VALUES
66
+ ('session_start', datetime('now')),
67
+ ('session_branch', '<branch>'),
68
+ ('session_version', 'v4.0.0-beta.1'),
69
+ ('baseline_test_count', '573');
70
+
71
+ -- Record completion gate pass
72
+ INSERT OR REPLACE INTO session_state (key, value) VALUES
73
+ ('gate_pass_1', datetime('now')),
74
+ ('gate_clippy', '0 warnings'),
75
+ ('gate_tests', '573 passed, 0 failed');
76
+
77
+ -- Query session state
78
+ SELECT key, value FROM session_state WHERE key LIKE 'gate_%' OR key LIKE 'session_%';
50
79
  ```
51
80
 
52
81
  ## OneCrawl Version Bump Checklist
53
- 1. Update version in `packages/onecrawl-rust/Cargo.toml`
54
- 2. `cargo check --workspace` (propagates to all crates)
55
- 3. Commit with `chore: bump version to vX.X.X-alpha.XX`
56
- 4. `git tag vX.X.X-alpha.XX`
57
- 5. `git push origin main --tags`
58
- 6. Update `packages/onecrawl/package.json` version
59
- 7. `node scripts/sync-assets.js`
60
- 8. `npm publish --tag alpha --access public`
82
+ - [ ] Update version in `packages/onecrawl-rust/Cargo.toml`
83
+ - [ ] `cargo check --workspace` (propagates to all crates)
84
+ - [ ] Commit: `git commit -F /tmp/bump-msg.txt` with Co-authored-by trailer
85
+ - [ ] Tag: `git tag v4.0.0-beta.X`
86
+ - [ ] Push: `git push origin main --tags`
87
+ - [ ] Update `packages/onecrawl/package.json` version
88
+ - [ ] `node scripts/sync-assets.js`
89
+ - [ ] `npm publish --tag beta --access public`
90
+
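Before tagging, it helps to confirm the two version sources agree. A hedged sketch: the parsing assumes standard `version = "..."` and `"version": "..."` lines, and the helper name is illustrative:

```bash
# Compare the Cargo.toml and package.json versions before tagging/publishing.
check_versions() {
  local cargo_ver npm_ver
  cargo_ver=$(grep -m1 '^version' "$1" | sed 's/.*"\(.*\)".*/\1/')
  npm_ver=$(grep -m1 '"version"' "$2" | sed 's/.*: *"\(.*\)".*/\1/')
  if [ "$cargo_ver" = "$npm_ver" ]; then
    echo "versions OK: $cargo_ver"
  else
    echo "version MISMATCH: cargo=$cargo_ver npm=$npm_ver"
    return 1
  fi
}
# Usage from the repo root:
# check_versions packages/onecrawl-rust/Cargo.toml packages/onecrawl/package.json
```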
91
+ ## Session Health Snapshot Commands
92
+
93
+ ```bash
94
+ # Capture health snapshot for session log
95
+ ./target/release/onecrawl --version
96
+ ls -la ./target/release/onecrawl | awk '{print $5}' # Binary size
97
+ ls /tmp/onecrawl-session*.json 2>/dev/null | wc -l # Active sessions
98
+ git --no-pager log --oneline -5 # Recent commits
99
+ ```
61
100
 
62
101
  ## Done Criteria
63
- - Session journal file exists with all required sections.
64
- - All status fields are accurate.
102
+ - Session journal exists with all required sections filled.
103
+ - All status fields are accurate and timestamped.
104
+ - Completion gate evidence includes exact counts (not "all passed").
65
105
 
66
106
  ## Anti-patterns
67
107
  - Missing completion gate evidence
68
108
  - Undocumented decisions
69
109
  - Stale version numbers
110
+ - Using approximate test counts ("~500 passed")
111
+ - Missing GitHub sync SHAs and CI run IDs
@@ -20,6 +20,11 @@ Replace blind trial-and-error debugging with a structured, evidence-based proces
20
20
  2. Capture the **initial error output** (stack trace, console error, failing test output).
21
21
  3. If the bug is intermittent, note frequency and conditions.
22
22
 
23
+ ```bash
24
+ # OneCrawl: reproduce a failing test with verbose output
25
+ RUST_LOG=debug cargo test -p onecrawl-cdp -- failing_test_name --test-threads=1 --nocapture 2>&1 | tee debug-data.log
26
+ ```
27
+
23
28
  ### Step 2 — Hypothesize
24
29
  1. Based on the error and context, form **max 3 ranked hypotheses**.
25
30
  2. For each hypothesis state:
@@ -31,26 +36,37 @@ Replace blind trial-and-error debugging with a structured, evidence-based proces
31
36
  ### Step 3 — Instrument
32
37
  1. Add **targeted logging** at the locations identified in hypotheses.
33
38
  2. Use structured prefixes so output is parseable:
39
+ ```rust
40
+ // Rust: use eprintln! for debug (goes to stderr)
41
+ eprintln!("[DEBUG:H1:session_start] session_name={}, pid={}", name, pid);
42
+ ```
34
43
  ```
35
44
  [DEBUG:H1:functionName] variable = value
36
45
  [DEBUG:H2:moduleName] state = value
37
46
  ```
38
47
  3. Log **inputs, outputs, and intermediate state** — not just "reached here".
39
- 4. For async code, include timestamps:
40
- ```
41
- [DEBUG:H1:fetch] ${new Date().toISOString()} response.status = ${res.status}
48
+ 4. For async Rust code, include context about the await point:
49
+ ```rust
50
+ eprintln!("[DEBUG:H1:cdp_send] before await, cmd={}", cmd);
51
+ let result = cdp.send(cmd).await;
52
+ eprintln!("[DEBUG:H1:cdp_send] after await, result={:?}", result);
42
53
  ```
43
54
 
44
55
  ### Step 4 — Collect
45
56
  1. Run the code / test that triggers the bug.
46
57
  2. Pipe **all debug output** to `debug-data.log`:
47
- - Terminal: `node app.js 2>&1 | tee debug-data.log`
48
- - Test runner: `npm test 2>&1 | tee debug-data.log`
49
- - Browser: copy console output into `debug-data.log`
58
+ ```bash
59
+ # Rust test with full output
60
+ RUST_LOG=debug cargo test -p onecrawl-cdp -- test_name --test-threads=1 --nocapture 2>&1 | tee debug-data.log
61
+
62
+ # OneCrawl CLI operation
63
+ RUST_LOG=debug onecrawl session start -H 2>&1 | tee debug-data.log
64
+ ```
50
65
  3. If using OneCrawl MCP tools, also collect:
51
66
  - `onecrawl run browser get_console_messages` → append to `debug-data.log`
52
67
  - `onecrawl har drain` → append failed/relevant requests
53
68
  - `onecrawl health` → append daemon/session state
69
+ - `onecrawl config show` → append configuration
54
70
 
55
71
  ### Step 5 — Analyze
56
72
  1. Read `debug-data.log` and correlate with hypotheses.
@@ -64,12 +80,16 @@ Replace blind trial-and-error debugging with a structured, evidence-based proces
64
80
  ### Step 6 — Fix
65
81
  1. Apply the **minimal fix** that addresses the confirmed root cause.
66
82
  2. Re-run the failing test / reproduction steps to verify the fix.
67
- 3. Ensure no regressions: run the full test suite if available.
83
+ 3. Run full suite to ensure no regressions:
84
+ ```bash
85
+ cargo test --workspace --exclude onecrawl-e2e --exclude onecrawl-python -- --test-threads=1
86
+ ```
68
87
 
69
88
  ### Step 7 — Clean Up
70
- 1. Remove **all** `[DEBUG:...]` logging added in Step 3.
89
+ 1. Remove **all** `[DEBUG:...]` logging / `eprintln!` added in Step 3.
71
90
  2. Delete `debug-data.log`.
72
91
  3. Commit only the fix, not the debug instrumentation.
92
+ 4. Use `git diff --cached` to verify no debug code is staged.
73
93
 
74
94
  ## MCP Tool Integration
75
95
 
@@ -87,17 +107,28 @@ When available, prefer OneCrawl MCP tools over manual instrumentation:
87
107
  | `onecrawl config show` | Verify configuration values |
88
108
  | `onecrawl page-watcher drain` | Monitor DOM changes |
89
109
  | `onecrawl network-log drain` | Live network traffic |
90
- | `cargo test -p <crate> -- test_name` | Run specific failing test |
91
- | `RUST_LOG=debug cargo test` | Verbose test output |
110
+ | `cargo test -p <crate> -- test_name --nocapture` | Run specific failing test with output |
111
+ | `RUST_LOG=debug cargo test` | Verbose Rust test output |
112
+
113
+ ## OneCrawl-Specific Debug Checklist
114
+ - [ ] Check daemon status: `onecrawl health`
115
+ - [ ] Check session files: `ls /tmp/onecrawl-session*.json`
116
+ - [ ] Check Chrome processes: `pgrep -la chrome | grep remote-debugging`
117
+ - [ ] Check port conflicts: `lsof -i :9222` (default CDP port)
118
+ - [ ] Check Rust toolchain: `rustc --version && cargo --version`
119
+ - [ ] VecDeque: using `.back()` not `.last()`?
120
+ - [ ] `rand::rng()`: scoped in sync block before `.await`?
121
+ - [ ] Serde: using `match` not `unwrap_or_default()` on parse?
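The checklist above can be collapsed into a single probe script; `lsof` and `pgrep` availability varies by platform, hence the guards:

```bash
# Environment probe for the debug checklist: prints one line per item.
for cmd in rustc cargo lsof pgrep; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "have: $cmd"
  else
    echo "missing: $cmd"
  fi
done
sessions=$(ls /tmp/onecrawl-session*.json 2>/dev/null | wc -l | tr -d ' ')
echo "session files: $sessions"
```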
92
122
 
93
123
  ## Escalation
94
124
  If after **2 instrumentation → analysis cycles** the root cause is still unclear:
95
125
  1. Document all collected evidence in `debug-data.log`
96
126
  2. Present findings to the user with the data
97
- 3. Ask whether to:
127
+ 3. Suggest options:
98
128
  - Broaden the investigation scope
99
129
  - Involve additional expertise (e.g., review architecture)
100
130
  - Accept a workaround while root cause is investigated
131
+ 4. If 3+ fix attempts fail, invoke rollback-rca skill.
101
132
 
102
133
  ## Done Criteria
103
134
  - Root cause identified and documented
@@ -110,7 +141,7 @@ If after **2 instrumentation → analysis cycles** the root cause is still uncle
110
141
  - **Blind changes** — modifying code without evidence of what is wrong
111
142
  - **Skipping log collection** — "fixing" based on guesses without data
112
143
  - **Shotgun debugging** — changing multiple things simultaneously
113
- - **Leaving debug code** — forgetting to remove `console.log` instrumentation
144
+ - **Leaving debug code** — forgetting to remove `eprintln!` instrumentation
114
145
  - **Symptom fixing** — addressing the visible error instead of the root cause
115
146
  - **Infinite retry loops** — more than 3 fix attempts without re-analyzing from data
116
147
  - **No reproduction** — attempting to fix without confirming the bug exists
@@ -20,17 +20,21 @@ Guarantee deterministic, CI-ready quality verification with no regressions and n
20
20
  ## OneCrawl Test Commands
21
21
 
22
22
  ```bash
23
- # Full test suite (excludes E2E and PyO3 — known linker issue in test mode)
23
+ # Full test suite — 573 tests (excludes E2E and PyO3)
24
24
  cargo test --workspace --exclude onecrawl-e2e --exclude onecrawl-python -- --test-threads=1
25
25
 
26
- # Single crate (fast iteration)
26
+ # Single crate (fast iteration during development)
27
27
  cargo test -p onecrawl-cli-rs -- --test-threads=1
28
28
  cargo test -p onecrawl-cdp -- --test-threads=1
29
29
  cargo test -p onecrawl-mcp-rs -- --test-threads=1
30
+ cargo test -p onecrawl-crypto -- --test-threads=1
30
31
 
31
- # Run specific test
32
+ # Run specific test by name
32
33
  cargo test -p onecrawl-cli-rs -- test_name --test-threads=1
33
34
 
35
+ # Verbose output for debugging failures
36
+ RUST_LOG=debug cargo test -p onecrawl-cdp -- test_name --test-threads=1 --nocapture
37
+
34
38
  # Clippy as static analysis layer
35
39
  cargo clippy --workspace --exclude onecrawl-e2e --exclude onecrawl-python -- -W clippy::all
36
40
  ```
@@ -39,33 +43,57 @@ cargo clippy --workspace --exclude onecrawl-e2e --exclude onecrawl-python -- -W
39
43
 
40
44
  | Crate | Tests | Scope |
41
45
  |-------|-------|-------|
42
- | `onecrawl-cli-rs` | Unit | CLI dispatch, config, parsing |
43
- | `onecrawl-cdp` | Unit | CDP protocol, stealth, JS evaluation |
44
- | `onecrawl-mcp-rs` | Unit | MCP server, tool dispatch |
45
- | `onecrawl-core` | Unit | Shared types, utilities |
46
- | `onecrawl-crypto` | Unit | Encryption, TOTP, PKCE |
47
- | `onecrawl-parser` | Unit + Doc | HTML parsing, accessibility tree |
48
- | `onecrawl-storage` | Unit | KV storage, encrypted store |
49
- | `onecrawl-server` | Unit | HTTP API server |
46
+ | `onecrawl-cli-rs` | Unit | CLI dispatch, config, session mgmt, daemon control |
47
+ | `onecrawl-cdp` | Unit | CDP protocol, stealth patches, JS evaluation |
48
+ | `onecrawl-mcp-rs` | Unit | MCP server, 18 tools / 546+ actions dispatch |
49
+ | `onecrawl-core` | Unit | Shared types, utilities, error types |
50
+ | `onecrawl-crypto` | Unit | AES-256-GCM, TOTP, PKCE, PBKDF2 |
51
+ | `onecrawl-parser` | Unit + Doc | HTML parsing, accessibility tree, selectors |
52
+ | `onecrawl-storage` | Unit | KV storage, encrypted store, profile mgmt |
53
+ | `onecrawl-server` | Unit | axum HTTP API server, tab lifecycle |
54
+ | `onecrawl-browser` | None | **Vendor** crate — do not add tests, 1 clippy warning OK |
50
55
  | `onecrawl-e2e` | E2E | **Excluded** — requires running Chrome |
51
- | `onecrawl-python` | Unit | **Excluded** — PyO3 linker issue |
56
+ | `onecrawl-python` | Unit | **Excluded** — PyO3 linker issue in test mode |
57
+ | `onecrawl-napi` | None | Node.js bindings — tested via npm |
58
+
59
+ ## Baseline Capture & Regression Check
60
+
61
+ ```bash
62
+ # Before changes — capture baseline test count
63
+ cargo test --workspace --exclude onecrawl-e2e --exclude onecrawl-python -- --test-threads=1 2>&1 | grep "test result"
64
+ # Example output: test result: ok. 573 passed; 0 failed; 0 ignored
65
+
66
+ # After changes — compare
67
+ cargo test --workspace --exclude onecrawl-e2e --exclude onecrawl-python -- --test-threads=1 2>&1 | grep "test result"
68
+ # Must show: passed ≥ baseline, 0 failed
69
+ ```
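Parsing and comparing those counts can be scripted; a sketch assuming the standard `test result:` summary line format (the sample lines below are illustrative):

```bash
# Extract the passed-count from a cargo test summary line and compare to baseline.
passed_count() {
  printf '%s\n' "$1" | sed -n 's/.*test result: ok\. \([0-9][0-9]*\) passed.*/\1/p'
}
baseline=$(passed_count "test result: ok. 573 passed; 0 failed; 0 ignored")
current=$(passed_count "test result: ok. 580 passed; 0 failed; 0 ignored")
if [ "$current" -ge "$baseline" ]; then
  echo "no regression: $current >= $baseline"
else
  echo "REGRESSION: $current < $baseline"
fi
```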
52
70
 
53
71
  ## Procedure
54
- 1. **Before changes**: run full suite and record baseline.
55
- 2. Implement changes and add/update tests.
56
- 3. **After changes**: rerun full suite and verify no regressions.
72
+ 1. **Before changes**: run full suite and record baseline test count.
73
+ 2. Implement changes and add/update tests for new/changed behavior.
74
+ 3. **After changes**: rerun full suite and verify:
75
+ - [ ] No regressions (0 failed)
76
+ - [ ] Test count ≥ baseline (no tests removed without justification)
77
+ - [ ] New tests added for new code paths
57
78
  4. If any previously passing test breaks, fix before completion.
58
79
  5. Ensure tests are deterministic, isolated, non-interactive, and CI-compatible.
59
80
 
60
81
  ## Key Rules
61
- - Use `--test-threads=1` — some tests share browser state.
82
+ - **Always** use `--test-threads=1` — tests share browser/session state.
62
83
  - Never use `#[ignore]` to skip failing tests.
63
- - VecDeque: use `.back()` not `.last()`, `.iter().skip(n)` not `[n..]`.
84
+ - `VecDeque`: use `.back()` not `.last()`, `.iter().skip(n)` not `[n..]`.
85
+ - `rand::rng()` is `!Send` — scope all RNG in sync blocks before `.await`.
86
+ - Don't use `unwrap_or_default()` on serde parse — use `match` with proper error.
87
+ - Prefer CDP-native operations over JS evaluation in test assertions.
64
88
 
65
89
  ## Done Criteria
66
- - Full suite green, no regressions.
90
+ - [ ] Full suite green (573+ tests), 0 failed, 0 regressions.
91
+ - [ ] Clippy: 0 warnings in owned crates.
92
+ - [ ] New/changed code has corresponding test coverage.
67
93
 
68
94
  ## Anti-patterns
69
- - Manual-only validation
70
- - Flaky timeout-driven E2E
95
+ - Manual-only validation ("I checked it works")
96
+ - Flaky timeout-driven tests (use wait-for conditions)
71
97
  - Merging with failing tests
98
+ - Using `--test-threads=N` where N > 1 (causes flaky failures)
99
+ - Adding `#[allow(unused)]` to silence test compilation warnings
package/lib/chrome.js CHANGED
@@ -171,7 +171,11 @@ async function downloadChromium({ silent = false } = {}) {
171
171
  if (!silent) console.log(` Extracting...`);
172
172
  try {
173
173
  if (platform() === "win32") {
174
- execSync(`powershell -NoProfile -Command "Expand-Archive -Path '${zipPath}' -DestinationPath '${CHROME_DIR}' -Force"`, { stdio: "pipe" });
174
+ // Escape embedded single quotes and use -LiteralPath so paths with quotes or spaces extract safely
175
+ execSync(
176
+ `powershell -NoProfile -Command "Expand-Archive -LiteralPath '${zipPath.replace(/'/g, "''")}' -DestinationPath '${CHROME_DIR.replace(/'/g, "''")}' -Force"`,
177
+ { stdio: "pipe" },
178
+ );
175
179
  } else {
176
180
  execSync(`unzip -o "${zipPath}" -d "${CHROME_DIR}"`, { stdio: "pipe" });
177
181
  }
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "onecrawl",
3
- "version": "4.0.0-alpha.64",
3
+ "version": "4.0.0-beta.3",
4
4
  "description": "Browser automation engine — CLI, MCP server, and agent skills installer",
5
5
  "license": "BUSL-1.1",
6
6
  "author": "Giulio Leone <giulio@onecrawl.dev>",
@@ -23,6 +23,7 @@ const PLATFORM_MAP = {
23
23
  "linux-x64": "onecrawl-x86_64-unknown-linux-gnu",
24
24
  "linux-arm64": "onecrawl-aarch64-unknown-linux-gnu",
25
25
  "win32-x64": "onecrawl-x86_64-pc-windows-msvc.exe",
26
+ "win32-arm64": "onecrawl-aarch64-pc-windows-msvc.exe",
26
27
  };
27
28
 
28
29
  const platformKey = `${process.platform}-${process.arch}`;
@@ -31,6 +32,7 @@ const binaryName = PLATFORM_MAP[platformKey];
31
32
  if (!binaryName) {
32
33
  console.log(`⚠️ No pre-built binary for ${platformKey}.`);
33
34
  console.log(` Build from source: cargo build --release -p onecrawl-cli-rs`);
35
+ // Exit 0 so npm install doesn't fail — binary can be built manually
34
36
  process.exit(0);
35
37
  }
36
38
 
@@ -105,6 +107,14 @@ async function installBinary() {
105
107
  }
106
108
 
107
109
  console.log(`✅ OneCrawl v${VERSION} installed`);
110
+
111
+ // Verify binary actually runs
112
+ try {
113
+ const ver = execSync(`"${destPath}" --version`, { encoding: "utf8", timeout: 5000 }).trim();
114
+ console.log(` Verified: ${ver}`);
115
+ } catch {
116
+ console.log(`⚠️ Binary installed but could not verify execution`);
117
+ }
108
118
  } catch (e) {
109
119
  try { unlinkSync(tmpPath); } catch {}
110
120
  console.log(`⚠️ Failed to install binary: ${e.message}`);