crawlio-browser 1.6.2 → 1.6.4

This diff shows the changes between two publicly released versions of the package, as published to a supported public registry. It is provided for informational purposes only.
@@ -1,64 +1,103 @@
  ---
  name: monitor
- description: "Monitor a URL for changes — baseline capture, re-capture, diff report"
- allowed-tools: Agent, Read, Write, Bash, Glob, Grep
- argument-hint: <url>
+ description: "Monitor a page for changes — capture baseline, recapture, diff, report what changed"
+ allowed-tools: mcp__crawlio-browser__search, mcp__crawlio-browser__execute, mcp__crawlio-browser__connect_tab
  ---

- # Monitor Investigation
-
- You are running a **monitor** investigation. Your goal is to capture a baseline snapshot of the target URL, re-capture it, and produce a `DiffReport` showing what changed between the two captures.
-
- ## Loop Definition
-
- Read `loops/monitor.json` to understand the phase sequence. The monitor loop has 3 phases:
-
- 1. **baseline** — Spawn `crawlio-crawler` to capture the target URL. Record the `EVIDENCE_ID`.
- 2. **recapture** — Spawn `crawlio-crawler` again with the same URL. Record the second `EVIDENCE_ID`.
- 3. **diff** — Spawn `crawlio-differ` with both evidence IDs. It reads both `EvidenceEnvelope<PageEvidence>` files, compares them field by field across 10 dimensions, and writes an `EvidenceEnvelope<DiffReport>`.
-
- ## Execution
-
- 1. Read `loops/monitor.json` to confirm phase order.
- 2. Spawn `crawlio-crawler` for the baseline capture:
-    ```
-    Crawl <url> and write PageEvidence to .crawlio/evidence/.
-    ```
-    Record `EVIDENCE_ID=<baselineId>`.
-
- 3. Spawn `crawlio-crawler` for the recapture:
-    ```
-    Crawl <url> and write PageEvidence to .crawlio/evidence/.
-    ```
-    Record `EVIDENCE_ID=<currentId>`.
-
- 4. Spawn `crawlio-differ` with both evidence IDs:
-    ```
-    Compare baseline evidence <baselineId> against current evidence <currentId> for <url>.
-    Read both from .crawlio/evidence/. Write EvidenceEnvelope<DiffReport> to .crawlio/evidence/.
-    ```
-    Record `EVIDENCE_ID=<diffId>`.
-
- 5. Read the DiffReport evidence and summarize changes for the user.
-
- ## Output Format
-
+ # Monitor — Change Detection
+
+ Capture a baseline snapshot, recapture later, diff the results. Report structural, content, technology, and performance changes as structured findings.
+
+ ## When to Use
+
+ - Tracking what changed on a page between two points in time
+ - Detecting content additions/removals, tech stack changes, performance regressions
+ - Verifying a deploy changed what was expected (and nothing else)
+
+ ## Protocol
+
+ 1. **search** for diff and snapshot commands: `search("diff snapshot")`
+ 2. **connect_tab** to the target URL
+ 3. **execute** Code Mode: baseline with `smart.snapshot()` + `smart.extractPage()`
+ 4. On recapture, diff and emit one `smart.finding()` per change dimension
+ 5. Return `smart.findings()` as the final output
+
+ ## Code Example
+
+ ```js
+ // 1. Baseline
+ const baselineSnap = await smart.snapshot();
+ const baseline = await smart.extractPage();
+ const baselineTech = await smart.detectTechnologies();
+
+ // ... user triggers recapture ...
+
+ // 2. Recapture + diff
+ const current = await smart.extractPage();
+ const currentTech = await smart.detectTechnologies();
+ const diff = await smart.diffSnapshots(baselineSnap.snapshot);
+
+ smart.finding({
+   claim: `${diff.additions} elements added, ${diff.removals} removed since baseline`,
+   evidence: [`additions: ${diff.additions}`, `removals: ${diff.removals}`, `unchanged: ${diff.unchanged}`],
+   sourceUrl: current.capture.url,
+   confidence: "high",
+   method: "diffSnapshots",
+   dimension: "structure"
+ });
+
+ if (diff.additions > 0 || diff.removals > 0) {
+   smart.finding({
+     claim: `Content changed: ${diff.additions + diff.removals} total mutations`,
+     evidence: diff.sample || [`${diff.additions} added`, `${diff.removals} removed`],
+     sourceUrl: current.capture.url, confidence: "high",
+     method: "diffSnapshots", dimension: "content"
+   });
+ }
+
+ // Technology changes
+ const oldTech = new Set(baselineTech.technologies?.map(t => t.name) || []);
+ const newTech = new Set(currentTech.technologies?.map(t => t.name) || []);
+ const added = [...newTech].filter(t => !oldTech.has(t));
+ const removed = [...oldTech].filter(t => !newTech.has(t));
+ if (added.length || removed.length) {
+   smart.finding({
+     claim: `Tech stack changed: +${added.length} -${removed.length}`,
+     evidence: [...added.map(t => `added: ${t}`), ...removed.map(t => `removed: ${t}`)],
+     sourceUrl: current.capture.url, confidence: "high",
+     method: "detectTechnologies", dimension: "technology"
+   });
+ }
+
+ // Performance regression check
+ const oldLcp = baseline.performance?.webVitals?.lcp;
+ const newLcp = current.performance?.webVitals?.lcp;
+ if (oldLcp && newLcp) {
+   const delta = newLcp - oldLcp;
+   smart.finding({
+     claim: delta > 100 ? `LCP regressed by ${delta}ms` : `LCP stable (delta: ${delta}ms)`,
+     evidence: [`baseline: ${oldLcp}ms`, `current: ${newLcp}ms`],
+     sourceUrl: current.capture.url, confidence: "high",
+     method: "extractPage", dimension: "performance"
+   });
+ }
+
+ await smart.scrollCapture();
+ return { findings: smart.findings(), diff: { added: diff.additions, removed: diff.removals } };
  ```
- ## Monitor: <url>

- ### Changes Detected
- - [List of DiffChange entries grouped by dimension]
+ ## Anti-Patterns

- ### Summary
- - Total changes: N
- - Breaking changes: N
- - Dimensions affected: [list]
+ - Do NOT use `sleep()` loops to poll for changes — capture baseline, wait for user signal, recapture
+ - Do NOT compare screenshots visually — use `smart.diffSnapshots()` for structural diff
+ - Do NOT use `location.href` — use `current.capture.url` from extractPage
+ - Always `search()` first to confirm diff commands exist

- ### Evidence Chain
- - Baseline: <baselineId> (quality: ...)
- - Current: <currentId> (quality: ...)
- - Diff: <diffId> (quality: ...)
+ ## Output

- ### Confidence
- - Overall: high/medium/low
- ```
+ The skill produces `Finding[]` via `smart.findings()`. Dimension tags:
+
+ - **structure** — DOM elements added or removed
+ - **content** — text and media mutations
+ - **technology** — frameworks added or removed
+ - **performance** — LCP, CLS, timing regressions or improvements
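The technology set-difference step in the new monitor example is plain JavaScript and can be sanity-checked outside a browser session. A standalone sketch with illustrative inputs (in the skill, the `technologies` arrays come from `smart.detectTechnologies()`):

```js
// Illustrative payloads; real ones come from smart.detectTechnologies().
const baselineTech = { technologies: [{ name: "React" }, { name: "jQuery" }] };
const currentTech = { technologies: [{ name: "React" }, { name: "Vite" }] };

// Same set-difference logic as the monitor example above.
const oldTech = new Set(baselineTech.technologies?.map(t => t.name) || []);
const newTech = new Set(currentTech.technologies?.map(t => t.name) || []);
const added = [...newTech].filter(t => !oldTech.has(t));
const removed = [...oldTech].filter(t => !newTech.has(t));

console.log(added, removed); // prints [ 'Vite' ] [ 'jQuery' ]
```

Because membership checks go through `Set`, duplicate technology entries collapse and lookups stay O(1).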
@@ -1,101 +1,117 @@
  ---
  name: test
- description: "Automated testing — accessibility, performance, security, SEO, and best-practices audits with testable flow discovery"
- allowed-tools: Agent, Read, Write, Bash, Glob, Grep
- argument-hint: <url>
+ description: "Test a page against quality assertions — accessibility, performance, security, SEO, mobile readiness"
+ allowed-tools: mcp__crawlio-browser__search, mcp__crawlio-browser__execute, mcp__crawlio-browser__connect_tab
  ---

- # Test Investigation
-
- You are running a **test** investigation. Your goal is to audit a target URL across accessibility, performance, security, SEO, and best-practices categories, discover testable user flows, and produce a `TestSuite` evidence report.
-
- ## Loop Definition
-
- Read `loops/test.json` to understand the phase sequence. The test loop has 5 phases:
-
- 1. **crawl** — Spawn `crawlio-crawler` to capture the target URL. Record the `EVIDENCE_ID`.
- 2. **analyze** — Spawn `crawlio-analyzer` with the crawl evidence ID. Identifies framework and rendering mode to inform test strategy.
- 3. **audit** — Spawn `crawlio-auditor` with the crawl evidence ID. Runs accessibility, performance, security, SEO, and best-practices checks.
- 4. **discover-flows** (optional) — Spawn `crawlio-auditor` with the crawl evidence ID and flow discovery focus. Discovers testable user flows from navigation/forms.
- 5. **synthesize** — Spawn `crawlio-synthesizer` with all phase evidence to produce the final `TestSuite`.
-
- ## Execution
-
- 1. Read `loops/test.json` to confirm phase order.
- 2. Parse the user's argument: `<url>`.
- 3. Spawn `crawlio-crawler` to capture the page:
-    ```
-    Crawl <url> and write PageEvidence to .crawlio/evidence/.
-    ```
-    Record `EVIDENCE_ID=<crawlId>`.
-
- 4. Spawn `crawlio-analyzer` with the crawl evidence:
-    ```
-    Read PageEvidence from .crawlio/evidence/<crawlId>.json.
-    Analyze framework, rendering mode, and component patterns.
-    Write FrameworkEvidence to .crawlio/evidence/.
-    Target URL: <url>
-    ```
-    Record `EVIDENCE_ID=<analyzeId>`.
-
- 5. Spawn `crawlio-auditor` to run audits:
-    ```
-    Read PageEvidence from .crawlio/evidence/<crawlId>.json.
-    Run accessibility, performance, security, SEO, and best-practices audits.
-    Discover testable user flows from navigation and forms.
-    Write TestSuite evidence to .crawlio/evidence/.
-    Target URL: <url>
-    ```
-    Record `EVIDENCE_ID=<auditId>`.
-
- 6. Spawn `crawlio-synthesizer` to produce the final TestSuite:
-    ```
-    Read all evidence: <crawlId>, <analyzeId>, <auditId>.
-    Produce a TestSuite with audit results, discovered flows, and overall score.
-    Write to .crawlio/evidence/.
-    Target URL: <url>
-    ```
-    Record `EVIDENCE_ID=<suiteId>`.
-
- 7. Read the TestSuite evidence and summarize results for the user.
-
- ## Output Format
-
+ # Test — Quality Assertions
+
+ Run pass/fail assertions across accessibility, performance, security, SEO, and mobile readiness. Every assertion becomes a finding. Confidence auto-caps when data is missing.
+
+ ## When to Use
+
+ - Auditing accessibility, performance, security, SEO, or mobile readiness
+ - Running pass/fail assertions against quality thresholds
+
+ ## Protocol
+
+ 1. **search** for extraction commands: `search("extract page accessibility")`
+ 2. **connect_tab** to the target URL
+ 3. **execute** Code Mode: `smart.extractPage()` gathers all dimensions in one call
+ 4. Emit one `smart.finding()` per assertion — claim states pass or fail
+ 5. Return `smart.findings()` + `page.gaps`
+
+ ## Code Example
+
+ ```js
+ const page = await smart.extractPage();
+
+ // Accessibility
+ if (page.accessibility) {
+   smart.finding({
+     claim: page.accessibility.imagesWithoutAlt === 0
+       ? "All images have alt text"
+       : `${page.accessibility.imagesWithoutAlt} images missing alt text`,
+     evidence: [`imagesWithoutAlt: ${page.accessibility.imagesWithoutAlt}`, `nodeCount: ${page.accessibility.nodeCount}`],
+     sourceUrl: page.capture.url, confidence: "high",
+     method: "extractPage", dimension: "accessibility"
+   });
+   smart.finding({
+     claim: page.accessibility.landmarkCount > 0
+       ? `${page.accessibility.landmarkCount} ARIA landmarks found`
+       : "No ARIA landmarks — add banner, main, contentinfo",
+     evidence: [`landmarkCount: ${page.accessibility.landmarkCount}`],
+     sourceUrl: page.capture.url, confidence: "high",
+     method: "extractPage", dimension: "accessibility"
+   });
+ }
+
+ // Performance
+ if (page.performance) {
+   const lcp = page.performance.webVitals?.lcp;
+   const cls = page.performance.webVitals?.cls;
+   smart.finding({
+     claim: lcp && lcp < 2500 ? `LCP good (${lcp}ms)` : `LCP needs work (${lcp || "unknown"}ms)`,
+     evidence: [`LCP: ${lcp}ms`, `CLS: ${cls}`, `thresholds: LCP<2500, CLS<0.1`],
+     sourceUrl: page.capture.url, confidence: lcp ? "high" : "low",
+     method: "extractPage", dimension: "performance"
+   });
+ }
+
+ // Security
+ if (page.security) {
+   smart.finding({
+     claim: page.security.securityState === "secure"
+       ? "TLS connection is secure" : `Security state: ${page.security.securityState || "unknown"}`,
+     evidence: [`protocol: ${page.security.protocol || "unknown"}`],
+     sourceUrl: page.capture.url, confidence: "high",
+     method: "extractPage", dimension: "security"
+   });
+ }
+
+ // SEO
+ if (page.capture?.meta) {
+   const m = page.capture.meta;
+   smart.finding({
+     claim: m.title && m.description ? "Title + meta description present" : "SEO meta tags incomplete",
+     evidence: [`title: ${m.title || "missing"} (${m.title?.length || 0} chars)`,
+       `description: ${m.description || "missing"} (${m.description?.length || 0} chars)`],
+     sourceUrl: page.capture.url, confidence: "high",
+     method: "extractPage", dimension: "seo"
+   });
+ }
+
+ // Mobile readiness
+ if (page.mobileReadiness) {
+   smart.finding({
+     claim: page.mobileReadiness.hasViewportMeta ? "Viewport meta tag present" : "Missing viewport meta",
+     evidence: [`viewport: ${page.mobileReadiness.viewportContent || "none"}`],
+     sourceUrl: page.capture.url, confidence: "high",
+     method: "extractPage", dimension: "mobile-readiness"
+   });
+ }
+
+ // Tech stack
+ const tech = await smart.detectTechnologies();
+ if (tech.technologies?.length) {
+   smart.finding({
+     claim: `${tech.technologies.length} technologies detected`,
+     evidence: tech.technologies.map(t => t.name),
+     sourceUrl: page.capture.url, confidence: "high",
+     method: "detectTechnologies", dimension: "technology"
+   });
+ }
+
+ return { findings: smart.findings(), gaps: page.gaps };
  ```
- ## Test: <url>
-
- ### Summary
- - Score: [0-100]/100
- - Tests: [passed] passed, [failed] failed, [warnings] warnings
- - Flows: [count] testable flows discovered
-
- ### Accessibility
- - [audit results with pass/fail/warning status]
-
- ### Performance
- - [audit results with scores]

- ### Security
- - [audit results with pass/fail status]
+ ## Anti-Patterns

- ### SEO
- - [audit results]
+ - Do NOT use `smart.screenshot()` — extractPage captures everything needed
+ - Do NOT use `sleep()` to wait for metrics — extractPage handles page load
+ - Do NOT use `location.href` — use `page.capture.url`
+ - Always `search()` first if unsure which fields extractPage returns

- ### Best Practices
- - [audit results]
+ ## Output

- ### Discovered Flows
- - [list of testable flows with steps]
-
- ### Evidence Chain
- - Crawler: <crawlId> (quality: ...)
- - Analyzer: <analyzeId> (quality: ...)
- - Auditor: <auditId> (quality: ...)
- - Suite: <suiteId> (quality: ...)
-
- ### Coverage Gaps
- - [Any gaps from the investigation]
-
- ### Confidence
- - Overall: high/medium/low
- ```
+ The skill produces `Finding[]` via `smart.findings()`. Dimensions: **accessibility**, **performance**, **security**, **seo**, **mobile-readiness**, **technology**. When data is missing, confidence auto-caps to "low" and the gap appears in `page.gaps`.
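The accumulate-then-return pattern in the test skill (`smart.finding()` per assertion, `smart.findings()` at the end) behaves like a session-scoped buffer. A minimal sketch of that behavior, illustrative only and not the package's implementation:

```js
// Hypothetical stand-in for the smart.* findings buffer.
const buffer = [];

function finding(f) {
  buffer.push(f);     // record one assertion result
  return f;
}

function findings() {
  return [...buffer]; // snapshot of everything recorded this session
}

finding({ claim: "All images have alt text", confidence: "high" });
finding({ claim: "Missing viewport meta", confidence: "high" });

console.log(findings().length); // prints 2
```

Returning a copy from `findings()` keeps callers from mutating the session record by accident.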
@@ -29,7 +29,7 @@ await connect_tab({ url: "https://example.com" })

  // Extract everything in one call
  const page = await smart.extractPage()
- // Returns: { capture, performance, security, fonts, meta }
+ // Returns: { capture, performance, security, fonts, meta, accessibility, mobileReadiness, gaps }
  ```

  For visual evidence, add `smart.scrollCapture()`:
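The `gaps` field added to the `extractPage()` return is what separates "measured as absent" from "failed to measure". A sketch of that guard with an illustrative gap shape (the real element format may differ):

```js
// Illustrative result; a real page object comes from smart.extractPage().
const page = {
  performance: null,
  gaps: [{ dimension: "performance", reason: "metrics collection failed" }],
};

// Check gaps before trusting a null field.
const perfGap = (page.gaps || []).some(g => g.dimension === "performance");
const verdict = page.performance == null && perfGap
  ? "no data: cap confidence instead of reporting a 0ms metric"
  : "data present or genuinely absent";

console.log(verdict); // prints the "no data" branch for this input
```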
@@ -63,12 +63,34 @@ const record = {
  security: page.security, // TLS, cert, mixed content
  fonts: page.fonts, // declared + computed
  meta: page.meta, // OG tags, structured data, headings, nav links
+ accessibility: page.accessibility, // node count, landmarks, images without alt, heading structure
+ mobileReadiness: page.mobileReadiness, // viewport meta, media queries, overflow
+ gaps: page.gaps, // what data failed — check before trusting null fields
  }
  ```

  ### Phase 3: Analyze

- Compare against a fixed rubric. Produce structured findings, not prose.
+ Compare against a fixed rubric. Produce validated findings using `smart.finding()`:
+
+ ```js
+ smart.finding({
+   claim: "Site loads 47 network requests with 2 failures",
+   evidence: ["network.total: 47", "network.failed: 2"],
+   sourceUrl: page.capture.url,
+   confidence: "high",
+   method: "extractPage",
+   dimension: "performance" // auto-caps confidence if perf data had gaps
+ })
+
+ // Retrieve all findings from this session
+ const allFindings = smart.findings()
+
+ // Reset for next research task
+ smart.clearFindings()
+ ```
+
+ If `extractPage()` reported gaps (e.g., performance metrics failed), findings with matching `dimension` get confidence automatically capped one level (high → medium, medium → low). Check `page.gaps` to understand what data is missing.

  ## Use Existing Tools — Not Manual Equivalents

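The one-level capping rule described in this hunk (high → medium, medium → low) is simple to express directly. A hypothetical sketch; the package applies the rule internally when a finding's `dimension` matches a reported gap:

```js
// Hypothetical stand-in for the internal cap; levels ordered low to high.
const LEVELS = ["low", "medium", "high"];

function capConfidence(confidence, dimension, gapDimensions) {
  if (!gapDimensions.includes(dimension)) return confidence;
  const i = LEVELS.indexOf(confidence);
  return LEVELS[Math.max(0, i - 1)]; // high -> medium, medium -> low, low stays low
}

console.log(capConfidence("high", "performance", ["performance"])); // prints medium
console.log(capConfidence("high", "seo", ["performance"]));         // prints high
```

Clamping the index at 0 keeps an already-low finding at "low" rather than wrapping or erroring.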
@@ -128,9 +150,12 @@ When comparing sites, evaluate these dimensions:

  ### Prefer:

- - **One `extractPage()` per page** — it runs capture_page + perf + security + fonts + meta in parallel
- - **`comparePages()` for 2-site comparisons** — handles navigation + extraction for both sites
- - **Structured findings** — each with URL, extracted data, and confidence level
+ - **One `extractPage()` per page** — it runs 7 ops in parallel (capture + perf + security + fonts + meta + accessibility + mobile-readiness) with typed gaps
+ - **`comparePages()` for 2-site comparisons** — returns `{ siteA, siteB, scaffold }` with 11 fixed comparison dimensions
+ - **`smart.finding()` for validated findings** — enforces claim + evidence + sourceUrl + confidence + method schema
+ - **No `smart.screenshot()`** — it doesn't exist. Use `bridge.send({ type: 'take_screenshot' })` or `smart.scrollCapture()`
+ - **No `smart.snapshot({ compact: true })`** — compact option doesn't exist. Use `smart.snapshot()` or `{ interactive: true }`
+ - **No `location.href = "..."` for navigation** — use `smart.navigate(url)`. Direct assignment breaks CDP

  ## Example: Competitive Audit
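The claim + evidence + sourceUrl + confidence + method schema named in the Prefer list can be checked in a few lines. A hypothetical validator, for illustration only; the real `smart.finding()` enforcement lives in the MCP server and may differ:

```js
// Required keys per the schema named in the Prefer list above.
const REQUIRED = ["claim", "evidence", "sourceUrl", "confidence", "method"];

function validateFinding(f) {
  const missing = REQUIRED.filter(k => f[k] == null);
  if (missing.length) {
    throw new Error(`finding missing fields: ${missing.join(", ")}`);
  }
  return f;
}

validateFinding({
  claim: "Title + meta description present",
  evidence: ["title: Example (7 chars)"],
  sourceUrl: "https://example.com",
  confidence: "high",
  method: "extractPage",
}); // passes

// validateFinding({ claim: "incomplete" }); // would throw: missing 4 fields
```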