crawlio-browser 1.6.2 → 1.6.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,7 +1,7 @@
  // src/shared/constants.ts
  import { homedir } from "os";
  import { join } from "path";
- var PKG_VERSION = "1.6.2";
+ var PKG_VERSION = "1.6.4";
  var WS_PORT = 9333;
  var WS_PORT_MAX = 9342;
  var WS_HOST = "127.0.0.1";
@@ -9,7 +9,7 @@ import {
  WS_PORT_MAX,
  WS_RECONNECT_GRACE,
  WS_STALE_THRESHOLD
- } from "./chunk-T4GKS2PG.js";
+ } from "./chunk-LOOYHD6I.js";

  // src/mcp-server/index.ts
  import { randomBytes as randomBytes3 } from "crypto";
@@ -8758,7 +8758,7 @@ function getMaxOutput() {
  process.title = "Crawlio Agent";
  var initMode = process.argv.includes("init") || process.argv.includes("--setup") || process.argv.includes("setup");
  if (initMode) {
- const { runInit } = await import("./init-PFND5ZFY.js");
+ const { runInit } = await import("./init-UJD7YE3X.js");
  await runInit(process.argv.slice(2));
  process.exit(0);
  }
@@ -1,6 +1,6 @@
  import {
  PKG_VERSION
- } from "./chunk-T4GKS2PG.js";
+ } from "./chunk-LOOYHD6I.js";

  // src/mcp-server/init.ts
  import { execFileSync, spawn } from "child_process";
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "crawlio-browser",
- "version": "1.6.2",
+ "version": "1.6.4",
  "description": "MCP server with 114 CDP-backed tools for browser automation — screenshots, DOM, network capture, framework detection, cookies, storage, session recording, structured data extraction, tracking analysis, SEO auditing, technographic fingerprinting, performance metrics via Chrome",
  "type": "module",
  "main": "dist/mcp-server/index.js",
@@ -1,101 +1,111 @@
  ---
  name: clone
- description: "Clone a site capture design tokens, component tree, assets, and compile a replayable skill"
- allowed-tools: Agent, Read, Write, Bash, Glob, Grep
- argument-hint: <url>
+ description: "Capture a site's design system colors, typography, spacing, layout, components as structured findings"
+ allowed-tools: mcp__crawlio-browser__search, mcp__crawlio-browser__execute, mcp__crawlio-browser__connect_tab
  ---

- # Clone Investigation
-
- You are running a **clone** investigation. Your goal is to capture the design system, component structure, and assets of a target URL, then compile the investigation into a replayable skill.
-
- ## Loop Definition
-
- Read `loops/clone.json` to understand the phase sequence. The clone loop has 5 phases:
-
- 1. **crawl** — Spawn `crawlio-crawler` to capture the target URL. Record the `EVIDENCE_ID`.
- 2. **analyze** — Spawn `crawlio-analyzer` with the crawl evidence ID. Identifies framework, rendering mode, component patterns.
- 3. **extract-design** — Spawn `crawlio-extractor` with the crawl evidence ID and `what: "design"`. Extracts design tokens (colors, typography, spacing, breakpoints).
- 4. **compile** (optional) — Spawn `crawlio-recorder` to compile the investigation into a replayable SKILL.md.
- 5. **synthesize** — Spawn `crawlio-synthesizer` with all phase evidence to produce the final `CloneBlueprint`.
-
- ## Execution
-
- 1. Read `loops/clone.json` to confirm phase order.
- 2. Parse the user's argument: `<url>`.
- 3. Spawn `crawlio-crawler` to capture the page:
- ```
- Crawl <url> and write PageEvidence to .crawlio/evidence/.
- ```
- Record `EVIDENCE_ID=<crawlId>`.
-
- 4. Spawn `crawlio-analyzer` with the crawl evidence:
- ```
- Read PageEvidence from .crawlio/evidence/<crawlId>.json.
- Analyze framework, rendering mode, and component patterns.
- Write FrameworkEvidence to .crawlio/evidence/.
- Target URL: <url>
- ```
- Record `EVIDENCE_ID=<analyzeId>`.
-
- 5. Spawn `crawlio-extractor` for design token extraction:
- ```
- Read PageEvidence from .crawlio/evidence/<crawlId>.json.
- Extract "design" data — colors, typography, spacing, breakpoints.
- Write DesignTokens evidence to .crawlio/evidence/.
- Target URL: <url>
- ```
- Record `EVIDENCE_ID=<designId>`.
-
- 6. Spawn `crawlio-recorder` to compile the investigation:
- ```
- Read evidence chain: <crawlId>, <analyzeId>, <designId>.
- Compile into a replayable SKILL.md.
- ```
- Record the skill path.
-
- 7. Spawn `crawlio-synthesizer` to produce the CloneBlueprint:
- ```
- Read all evidence: <crawlId>, <analyzeId>, <designId>.
- Produce a CloneBlueprint with design tokens, component tree, assets, and compiled skill path.
- Write to .crawlio/evidence/.
- Target URL: <url>
- ```
- Record `EVIDENCE_ID=<blueprintId>`.
-
- 8. Read the CloneBlueprint evidence and summarize results for the user.
-
- ## Output Format
-
+ # Clone — Design System Extraction
+
+ Capture the visual DNA of a page: design tokens, typography scale, spacing system, component patterns, and CSS framework.
+
+ ## When to Use
+
+ - Reproducing or referencing another site's design system
+ - Extracting CSS custom properties and design tokens
+ - Identifying typography, color palette, spacing conventions
+ - Determining what CSS framework or UI library is in use
+
+ ## Protocol
+
+ 1. **search** for the right commands: `search("design tokens extract CSS")` or `search("detect technologies")`
+ 2. **connect_tab** to the target URL (or use an already-connected tab)
+ 3. **execute** Code Mode with smart.* methods to extract evidence
+ 4. Emit one `smart.finding()` per design dimension discovered
+ 5. Return `smart.findings()` as the final output
+
+ ## Code Example
+
+ ```js
+ const page = await smart.extractPage();
+ const tech = await smart.detectTechnologies();
+
+ // Extract CSS custom properties (design tokens)
+ const tokens = await smart.evaluate(`(() => {
+ const styles = getComputedStyle(document.documentElement);
+ const props = {};
+ for (const name of [...document.styleSheets].flatMap(s => {
+ try { return [...s.cssRules] } catch { return [] }
+ }).filter(r => r.style).flatMap(r => [...r.style]).filter(p => p.startsWith('--'))) {
+ props[name] = styles.getPropertyValue(name).trim();
+ }
+ return { count: Object.keys(props).length, sample: Object.entries(props).slice(0, 20) };
+ })()`);
+
+ smart.finding({
+ claim: `Site uses ${tokens.result.count} CSS custom properties`,
+ evidence: tokens.result.sample.map(([k, v]) => `${k}: ${v}`),
+ sourceUrl: page.capture.url,
+ confidence: "high",
+ method: "evaluate + extractPage",
+ dimension: "design-system"
+ });
+
+ // Typography
+ if (page.fonts?.length) {
+ smart.finding({
+ claim: `${page.fonts.length} font families loaded`,
+ evidence: page.fonts.map(f => f.name || f),
+ sourceUrl: page.capture.url,
+ confidence: "high",
+ method: "extractPage",
+ dimension: "typography"
+ });
+ }
+
+ // Framework / UI library
+ const frameworks = tech.technologies?.map(t => t.name) || [];
+ if (frameworks.length) {
+ smart.finding({
+ claim: `Detected CSS/UI frameworks: ${frameworks.join(", ")}`,
+ evidence: frameworks,
+ sourceUrl: page.capture.url,
+ confidence: "high",
+ method: "detectTechnologies",
+ dimension: "technology"
+ });
+ }
+
+ // Repeating UI patterns (tables / grids)
+ const tables = await smart.detectTables();
+ if (tables.length) {
+ smart.finding({
+ claim: `${tables.length} repeating UI patterns detected`,
+ evidence: tables.map(t => `${t.selector}: ${t.rowCount} rows, ${t.columns.length} cols`),
+ sourceUrl: page.capture.url,
+ confidence: "high",
+ method: "detectTables",
+ dimension: "layout"
+ });
+ }
+
+ // Visual reference
+ await smart.scrollCapture();
+
+ return { findings: smart.findings(), fonts: page.fonts, tech: frameworks };
  ```
- ## Clone: <url>
-
- ### Design Tokens
- - Colors: [count] tokens extracted
- - Typography: [count] font stacks
- - Spacing: [count] spacing values
- - Breakpoints: [count] responsive breakpoints

- ### Component Tree
- - Root: <root component>
- - Components: [count] total
- - Types: [breakdown by type]
+ ## Anti-Patterns

- ### Assets
- - [count] total assets ([breakdown by type])
+ - Do NOT use `smart.screenshot()` — use `smart.scrollCapture()` for full-page visual reference
+ - Do NOT use `sleep()` loops to wait for styles — styles are available immediately after page load
+ - Do NOT use `location.href` — use `page.capture.url` from extractPage
+ - Always `search()` first if unsure which command extracts what you need

- ### Compiled Skill
- - Path: <skill path or "not compiled">
+ ## Output

- ### Evidence Chain
- - Crawler: <crawlId> (quality: ...)
- - Analyzer: <analyzeId> (quality: ...)
- - Design: <designId> (quality: ...)
- - Blueprint: <blueprintId> (quality: ...)
+ The skill produces `Finding[]` via `smart.findings()`. Each finding is tagged with a dimension:

- ### Coverage Gaps
- - [Any gaps from the investigation]
-
- ### Confidence
- - Overall: high/medium/low
- ```
+ - **design-system** — CSS custom properties, design tokens
+ - **typography** — font families, type scale, weights
+ - **layout** — grid systems, repeating patterns, spacing
+ - **technology** — CSS framework, UI library, build tooling
@@ -1,102 +1,106 @@
  ---
  name: compare
- description: "Compare two URLs side-by-side across 10 typed dimensions"
- allowed-tools: Agent, Read, Write, Bash, Glob, Grep
- argument-hint: <urlA> <urlB>
+ description: "Side-by-side comparison of two websites across 11 dimensions. Produces Finding[] evidence per dimension."
+ allowed-tools: mcp__crawlio-browser__search, mcp__crawlio-browser__execute, mcp__crawlio-browser__connect_tab
  ---

- # Compare Investigation
-
- You are running a **compare** investigation. Your goal is to capture two URLs, analyze their frameworks, and produce a `ComparisonReport` with typed findings across 10 dimensions.
-
- ## The 10 Dimensions
-
- | # | Dimension | What It Measures |
- |---|-----------|------------------|
- | 1 | Framework | Technology stack, versions, SSR mode |
- | 2 | Performance | Web Vitals, load metrics, bottlenecks |
- | 3 | Security | TLS, headers, cookies, mixed content |
- | 4 | SEO | Meta tags, structured data, heading hierarchy |
- | 5 | Accessibility | ARIA, semantic HTML, keyboard nav, contrast |
- | 6 | Error Surface | Console errors, network failures, JS exceptions |
- | 7 | Third-Party Load | External scripts, tracking, CDN, SDK risk |
- | 8 | Architecture | SSR vs CSR, routing, data fetching, state management |
- | 9 | Content Delivery | Caching, compression, asset optimization |
- | 10 | Mobile Readiness | Viewport, responsive signals, device emulation |
-
- ## Loop Definition
-
- Read `loops/compare.json` to understand the phase sequence. The compare loop has 6 phases:
-
- 1. **crawl-a** — Spawn `crawlio-crawler` to capture URL A. Record the `EVIDENCE_ID`.
- 2. **crawl-b** — Spawn `crawlio-crawler` to capture URL B. Record the `EVIDENCE_ID`.
- 3. **analyze-a** (optional) — Spawn `crawlio-analyzer` with crawl-a evidence to identify frameworks.
- 4. **analyze-b** (optional) — Spawn `crawlio-analyzer` with crawl-b evidence to identify frameworks.
- 5. **compare** — Spawn `crawlio-comparator` with all evidence IDs. It reads both URLs' evidence, compares across 10 dimensions, and writes an `EvidenceEnvelope<ComparisonReport>`.
- 6. **synthesize** (optional) — Spawn `crawlio-synthesizer` if a full blueprint is useful.
-
- ## Execution
-
- 1. Read `loops/compare.json` to confirm phase order.
- 2. Parse the user's arguments: `<urlA>` and `<urlB>`.
- 3. Spawn `crawlio-crawler` for URL A:
- ```
- Crawl <urlA> and write PageEvidence to .crawlio/evidence/.
- ```
- Record `EVIDENCE_ID=<crawlAId>`.
-
- 4. Spawn `crawlio-crawler` for URL B:
- ```
- Crawl <urlB> and write PageEvidence to .crawlio/evidence/.
- ```
- Record `EVIDENCE_ID=<crawlBId>`.
-
- 5. Spawn `crawlio-analyzer` for URL A (optional):
- ```
- Analyze page evidence <crawlAId> for <urlA>. Read from .crawlio/evidence/. Write FrameworkEvidence to .crawlio/evidence/.
- ```
- Record `EVIDENCE_ID=<analyzeAId>`.
-
- 6. Spawn `crawlio-analyzer` for URL B (optional):
- ```
- Analyze page evidence <crawlBId> for <urlB>. Read from .crawlio/evidence/. Write FrameworkEvidence to .crawlio/evidence/.
- ```
- Record `EVIDENCE_ID=<analyzeBId>`.
-
- 7. Spawn `crawlio-comparator` with all evidence:
- ```
- Compare URL A (<urlA>) against URL B (<urlB>).
- Evidence IDs — crawl-a: <crawlAId>, crawl-b: <crawlBId>, analyze-a: <analyzeAId>, analyze-b: <analyzeBId>.
- Read all evidence from .crawlio/evidence/. Write EvidenceEnvelope<ComparisonReport> to .crawlio/evidence/.
- ```
- Record `EVIDENCE_ID=<compareId>`.
-
- 8. Read the ComparisonReport evidence and summarize for the user.
-
- ## Output Format
+ # Compare
+
+ Side-by-side comparison of two websites across 11 typed dimensions. One call captures both sites and returns a scaffold with per-dimension comparability. Produces one finding per dimension.
+
+ ## When to Use
+
+ - Compare two competing sites (framework, performance, security)
+ - Audit staging vs production
+ - Benchmark a site against a competitor across all 11 dimensions
+ - Identify gaps where one site excels and the other falls short
+
+ ## The 11 Dimensions
+
+ framework, performance, security, seo, accessibility, error-surface, third-party-load, architecture, content-delivery, mobile-readiness, data-structure.
+
+ ## Protocol
+
+ **Acquire -> Normalize -> Analyze** with Evidence Mode.
+
+ ### 1. Connect

  ```
- ## Compare: <urlA> vs <urlB>
+ connect_tab({ url: "https://site-a.com" })
+ ```

- ### Winner: <A|B|Tie|Inconclusive>
- <winnerReason>
+ `comparePages` handles navigation to both sites internally.

- ### Dimension Results
- | Dimension | Verdict | Confidence | Key Differences |
- |-----------|---------|------------|-----------------|
- | [per-dimension rows] |
+ ### 2. Acquire + Normalize

- ### Summary
- - Total differences: N
- - Critical differences: N
+ ```js
+ const comparison = await smart.comparePages(
+ "https://site-a.com",
+ "https://site-b.com"
+ );
+ // comparison.siteA / siteB — full PageEvidence (capture, performance, security, etc.)
+ // comparison.scaffold.dimensions[] — 11 objects: { name, comparable, siteA.status, siteB.status }
+ // comparison.scaffold.sharedFields / missingFields
+ // comparison.siteA.gaps[] / siteB.gaps[] — what failed per site
+ ```

- ### Evidence Chain
- - Crawl A: <crawlAId> (quality: ...)
- - Crawl B: <crawlBId> (quality: ...)
- - Analyze A: <analyzeAId> (quality: ...)
- - Analyze B: <analyzeBId> (quality: ...)
- - Compare: <compareId> (quality: ...)
+ ### 3. Analyze — produce findings
+
+ Walk the scaffold. One finding per dimension.
+
+ ```js
+ for (const dim of comparison.scaffold.dimensions) {
+ if (!dim.comparable) {
+ smart.finding({
+ claim: `${dim.name}: not comparable — data missing`,
+ evidence: [`siteA: ${dim.siteA.status}`, `siteB: ${dim.siteB.status}`],
+ sourceUrl: comparison.siteA.capture?.url || "unknown",
+ confidence: "low", method: "comparePages", dimension: dim.name
+ });
+ continue;
+ }
+ smart.finding({
+ claim: `${dim.name}: both sites present — ready for comparison`,
+ evidence: [`siteA: ${dim.siteA.status}`, `siteB: ${dim.siteB.status}`],
+ sourceUrl: comparison.siteA.capture?.url || "unknown",
+ confidence: "high", method: "comparePages", dimension: dim.name
+ });
+ }
+ ```

- ### Confidence
- - Overall: high/medium/low
+ Drill into specific dimensions using raw PageEvidence:
+
+ ```js
+ // Performance drill-down
+ if (comparison.siteA.performance && comparison.siteB.performance) {
+ const lcpA = comparison.siteA.performance.webVitals?.lcp;
+ const lcpB = comparison.siteB.performance.webVitals?.lcp;
+ if (lcpA && lcpB) {
+ const faster = lcpA < lcpB ? "A" : "B";
+ smart.finding({
+ claim: `Site ${faster} loads ${Math.abs(lcpA - lcpB)}ms faster (LCP)`,
+ evidence: [`siteA LCP: ${lcpA}ms`, `siteB LCP: ${lcpB}ms`],
+ sourceUrl: comparison.siteA.capture.url, confidence: "high",
+ method: "comparePages", dimension: "performance"
+ });
+ }
+ }
+
+ return {
+ findings: smart.findings(),
+ scaffold: comparison.scaffold,
+ gaps: { siteA: comparison.siteA.gaps, siteB: comparison.siteB.gaps }
+ };
  ```
+
+ ## Anti-Patterns
+
+ - No `smart.screenshot()` -- use `bridge.send({ type: 'take_screenshot' })`
+ - No `sleep()` loops -- use `smart.waitForIdle()`
+ - No `location.href` -- use `smart.navigate()`
+ - Always `search()` before guessing command names
+ - No manual "extract A, then extract B" -- `smart.comparePages()` does both. See **browser-automation** for full list.
+
+ ## Output
+
+ Produces `Finding[]` via `smart.findings()`. Each finding has: `claim`, `evidence[]`, `sourceUrl`, `confidence`, `method`, `dimension`. Dimension tags match the 11 scaffold dimensions.