opencodekit 0.19.1 → 0.19.2

package/dist/index.js CHANGED
@@ -20,7 +20,7 @@ var __require = /* @__PURE__ */ createRequire(import.meta.url);
 
 //#endregion
 //#region package.json
-var version = "0.19.1";
+var version = "0.19.2";
 
 //#endregion
 //#region src/utils/license.ts
@@ -46,6 +46,15 @@ You have access to the TodoWrite tools to help you manage and plan tasks. Use th
 
 When referencing specific functions or pieces of code include the pattern `file_path:line_number` to allow the user to easily navigate to the source code location.
 
+# Web Research Tool Priority
+
+When fetching content from URLs (docs, READMEs, web pages):
+
+1. **`webclaw` MCP tools** (primary) — `scrape`, `crawl`, `batch`, `brand`. Handles 403s, bot protection, 67% fewer tokens.
+2. **`webfetch`** (fallback) — only if webclaw is unavailable or returns an error.
+
+Never use `webfetch` as first choice when webclaw MCP is connected.
+
 # Build Agent
 
 **Purpose**: Primary execution coordinator — you ship working code, not promises.
@@ -81,7 +81,13 @@ If lower-ranked sources conflict with higher-ranked sources, follow higher-ranke
 |------|------|
 | docs/API | `context7`, `codesearch` |
 | production examples | `grepsearch`, `codesearch` |
-| latest ecosystem/release info | `websearch`, `webfetch` |
+| latest ecosystem/release info | `websearch` (search), then `webclaw` (`scrape`) for content |
+| URL content extraction | `webclaw` MCP (`scrape`) — primary; `webfetch` only as fallback |
+| crawl a doc site | `webclaw` MCP (`crawl`) |
+| batch multi-URL extraction | `webclaw` MCP (`batch`) |
+| brand identity from a site | `webclaw` MCP (`brand`) |
+
+**Web content priority:** Always try `webclaw` tools first for URL extraction. They handle 403s, bot protection, and produce 67% fewer tokens than raw HTML. Fall back to `webfetch` only if webclaw is unavailable.
 
 3. Run independent calls in parallel
 4. Return concise recommendations with sources
@@ -72,6 +72,7 @@ Route by need:
 | Distinctive UI direction / anti-slop guidance | `frontend-design` |
 | Figma design data (read/write via MCP) | `figma-go` |
 | Pencil design-as-code workflow | `pencil` |
+| Brand identity extraction from URLs | `webclaw` |
 
 ### Taste-Skill Variants (installed)
 
@@ -115,6 +116,14 @@ If design must be created or iterated quickly, use Pencil:
 2. Export PNGs for review
 3. Provide audit with node-level critique where possible
 
+## Brand Extraction Workflow (when auditing existing sites)
+
+Use `webclaw` MCP to extract brand identity from live sites:
+
+1. `brand(url)` → get colors, fonts, logos
+2. Cross-reference with visual analysis findings
+3. Flag inconsistencies between declared brand and actual UI
+
 ## Design QA Checklist (strict)
 
 - **Hierarchy**: clear H1/H2/body scale and weight separation
Binary file
@@ -115,6 +115,12 @@
 "environment": {},
 "timeout": 120000,
 "type": "local"
+},
+"webclaw": {
+"command": ["webclaw-mcp"],
+"enabled": true,
+"timeout": 120000,
+"type": "local"
 }
 },
 "model": "opencode/minimax-m2.5-free",
@@ -171,7 +177,7 @@
 "claude-haiku-4.5": {
 "attachment": true,
 "limit": {
-"context": 144000,
+"context": 216000,
 "output": 32000
 },
 "options": {
@@ -199,7 +205,7 @@
 "claude-opus-4.5": {
 "attachment": true,
 "limit": {
-"context": 160000,
+"context": 216000,
 "output": 32000
 },
 "options": {
@@ -224,7 +230,7 @@
 "claude-opus-4.6": {
 "attachment": true,
 "limit": {
-"context": 144000,
+"context": 216000,
 "output": 64000
 },
 "options": {
@@ -294,7 +300,7 @@
 "claude-sonnet-4.5": {
 "attachment": true,
 "limit": {
-"context": 144000,
+"context": 216000,
 "output": 32000
 },
 "options": {
@@ -319,7 +325,7 @@
 "claude-sonnet-4.6": {
 "attachment": true,
 "limit": {
-"context": 200000,
+"context": 216000,
 "output": 32000
 },
 "options": {
@@ -0,0 +1,155 @@
+---
+name: webclaw
+description: Web content extraction, crawling, and scraping via webclaw MCP server. Use when fetching URLs fails (403), when crawling doc sites, batch-extracting pages, tracking content changes, or extracting brand identity from websites.
+---
+
+# Webclaw Skill
+
+Fast, local-first web content extraction for LLMs. Rust-based scraper with TLS fingerprinting, 67% token reduction, and native MCP integration.
+
+## Prerequisites
+
+- `webclaw-mcp` binary installed at `~/.webclaw/webclaw-mcp`
+- Install: `brew tap 0xMassi/webclaw && brew install webclaw`
+- Or: download from https://github.com/0xMassi/webclaw/releases
+- MCP server must be enabled in `.opencode/opencode.json` (`"enabled": true`)
+
+## When to Use
+
+| Scenario | Tool | Why |
+| ---------------------- | ---------------------- | ------------------------------------------ |
+| `webfetch` got 403 | `scrape` | TLS fingerprinting bypasses bot protection |
+| Research a doc site | `crawl` | BFS recursive extraction, same-origin |
+| Extract multiple URLs | `batch` | Parallel multi-URL, faster than sequential |
+| Discover sitemap URLs | `map` | Find all pages without crawling |
+| Track doc changes | `diff` | Snapshot + compare workflow |
+| Extract brand identity | `brand` | Colors, fonts, logos from any site |
+| LLM-optimized output | `scrape` with `-f llm` | 67% fewer tokens than raw HTML |
+
+## MCP Tools (8 local, no API key needed)
+
+### scrape
+
+Extract clean content from a URL. Returns markdown, text, JSON, or LLM-optimized format.
+
+```
+scrape(url: "https://example.com", format: "llm")
+```
+
+Options: `format` (markdown|text|json|llm|html), `include` (CSS selectors), `exclude` (CSS selectors), `only_main_content` (boolean)
+
+### crawl
+
+Recursive BFS crawl of a site. Same-origin only.
+
+```
+crawl(url: "https://docs.example.com", depth: 2, max_pages: 50)
+```
+
+Options: `depth` (1-10), `max_pages`, `sitemap` (seed from sitemap.xml)
+
+### map
+
+Discover URLs from a site's sitemap without fetching content.
+
+```
+map(url: "https://docs.example.com")
+```
+
+### batch
+
+Parallel extraction from multiple URLs.
+
+```
+batch(urls: ["https://a.com", "https://b.com"], format: "llm")
+```
+
+### diff
+
+Compare current page content against a previous snapshot.
+
+```
+diff(url: "https://example.com", snapshot: previous_json)
+```
+
+### brand
+
+Extract visual identity: colors, fonts, logos, OG image.
+
+```
+brand(url: "https://stripe.com")
+```
+
+Returns: `{ name, colors: [{hex, usage}], fonts: [], logos: [{url, kind}] }`
+
+### extract (needs Ollama)
+
+LLM-powered structured extraction. Requires local Ollama instance.
+
+### summarize (needs Ollama)
+
+Page summarization via local LLM.
+
+## Workflow Patterns
+
+### Fallback when webfetch fails
+
+1. Try `webfetch` first (built-in, no setup)
+2. If 403 or empty → use `scrape` via webclaw MCP
+3. If JS-rendered SPA → note limitation (webclaw doesn't execute JS without cloud API)
+
+### Research a documentation site
+
+1. `map(url)` to discover all pages
+2. `crawl(url, depth: 2, max_pages: 50)` to extract content
+3. Feed results to analysis agent
+
+### Brand/design audit
+
+1. `brand(url)` to extract colors, fonts, logos
+2. Pass to vision agent for design system analysis
+
+### Track changes over time
+
+1. `scrape(url, format: "json")` → save as snapshot
+2. Later: `diff(url, snapshot: saved)` → see what changed
+
+## Installation
+
+```bash
+# Homebrew (recommended)
+brew tap 0xMassi/webclaw
+brew install webclaw
+
+# Prebuilt binary
+# Download from https://github.com/0xMassi/webclaw/releases
+# Place webclaw-mcp in ~/.webclaw/
+
+# Verify
+webclaw-mcp --version
+```
+
+## Configuration
+
+After install, enable in `.opencode/opencode.json`:
+
+```json
+{
+"mcp": {
+"webclaw": {
+"enabled": true
+}
+}
+}
+```
+
+Optional env vars:
+
+- `WEBCLAW_API_KEY` — cloud API for bot-protected sites (optional)
+- `OLLAMA_HOST` — for extract/summarize tools (default: `http://localhost:11434`)
+
+## Limitations
+
+- **No JS rendering** (local mode) — SPAs that render entirely client-side won't extract fully. Use `--cloud` with API key, or use `playwright` skill instead.
+- **Same-origin crawl only** — won't follow external links during crawl.
+- **Early version** — v0.3.2, MIT license. Report issues to https://github.com/0xMassi/webclaw/issues
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "opencodekit",
-"version": "0.19.1",
+"version": "0.19.2",
 "description": "CLI tool for bootstrapping and managing OpenCodeKit projects",
 "keywords": [
 "agents",