opencodekit 0.19.1 → 0.19.2
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/dist/index.js +1 -1
- package/dist/template/.opencode/agent/build.md +9 -0
- package/dist/template/.opencode/agent/scout.md +7 -1
- package/dist/template/.opencode/agent/vision.md +9 -0
- package/dist/template/.opencode/memory.db +0 -0
- package/dist/template/.opencode/memory.db-shm +0 -0
- package/dist/template/.opencode/memory.db-wal +0 -0
- package/dist/template/.opencode/opencode.json +11 -5
- package/dist/template/.opencode/skill/webclaw/SKILL.md +155 -0
- package/package.json +1 -1
package/dist/index.js
CHANGED
package/dist/template/.opencode/agent/build.md
CHANGED

@@ -46,6 +46,15 @@ You have access to the TodoWrite tools to help you manage and plan tasks. Use th
 
 When referencing specific functions or pieces of code include the pattern `file_path:line_number` to allow the user to easily navigate to the source code location.
 
+# Web Research Tool Priority
+
+When fetching content from URLs (docs, READMEs, web pages):
+
+1. **`webclaw` MCP tools** (primary) — `scrape`, `crawl`, `batch`, `brand`. Handles 403s, bot protection, 67% fewer tokens.
+2. **`webfetch`** (fallback) — only if webclaw is unavailable or returns an error.
+
+Never use `webfetch` as first choice when webclaw MCP is connected.
+
 # Build Agent
 
 **Purpose**: Primary execution coordinator — you ship working code, not promises.
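The webclaw-first rule added in this hunk can be sketched as a small routing helper; the fetcher callables below are hypothetical stand-ins for the real tools, not part of the package:

```python
def fetch_url(url, webclaw_scrape=None, webfetch=None):
    """Route a URL fetch: prefer webclaw's scrape, fall back to webfetch.

    Both arguments are caller-supplied callables (illustrative stand-ins
    for the actual MCP tools); each should return extracted text or raise.
    """
    if webclaw_scrape is not None:
        try:
            # Primary: webclaw handles 403s and bot protection.
            return webclaw_scrape(url)
        except Exception:
            # Webclaw unavailable or errored: fall through to webfetch.
            pass
    if webfetch is not None:
        return webfetch(url)
    raise RuntimeError("no fetcher available for " + url)
```

With a priority rule like this, `webfetch` is only reached when the primary callable is missing or raises.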
package/dist/template/.opencode/agent/scout.md
CHANGED

@@ -81,7 +81,13 @@ If lower-ranked sources conflict with higher-ranked sources, follow higher-ranke
 |------|------|
 | docs/API | `context7`, `codesearch` |
 | production examples | `grepsearch`, `codesearch` |
-| latest ecosystem/release info | `websearch
+| latest ecosystem/release info | `websearch` (search), then `webclaw` (`scrape`) for content |
+| URL content extraction | `webclaw` MCP (`scrape`) — primary; `webfetch` only as fallback |
+| crawl a doc site | `webclaw` MCP (`crawl`) |
+| batch multi-URL extraction | `webclaw` MCP (`batch`) |
+| brand identity from a site | `webclaw` MCP (`brand`) |
+
+**Web content priority:** Always try `webclaw` tools first for URL extraction. They handle 403s, bot protection, and produce 67% fewer tokens than raw HTML. Fall back to `webfetch` only if webclaw is unavailable.
 
 3. Run independent calls in parallel
 4. Return concise recommendations with sources
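The "batch multi-URL extraction" row added above describes parallel fetching of several URLs; a generic sketch of that pattern, using a stub `fetch` callable rather than the real MCP tool:

```python
from concurrent.futures import ThreadPoolExecutor

def batch_fetch(urls, fetch, max_workers=8):
    """Fetch several URLs in parallel, preserving input order.

    `fetch` is a caller-supplied callable (an illustrative stand-in for a
    real scrape tool). A failure on one URL is recorded as None instead
    of aborting the whole batch.
    """
    def safe(url):
        try:
            return fetch(url)
        except Exception:
            return None

    # pool.map yields results in the same order as `urls`.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe, urls))
```

The point of batching is that total latency approaches the slowest single fetch rather than the sum of all fetches.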
package/dist/template/.opencode/agent/vision.md
CHANGED

@@ -72,6 +72,7 @@ Route by need:
 | Distinctive UI direction / anti-slop guidance | `frontend-design` |
 | Figma design data (read/write via MCP) | `figma-go` |
 | Pencil design-as-code workflow | `pencil` |
+| Brand identity extraction from URLs | `webclaw` |
 
 ### Taste-Skill Variants (installed)
 

@@ -115,6 +116,14 @@ If design must be created or iterated quickly, use Pencil:
 2. Export PNGs for review
 3. Provide audit with node-level critique where possible
 
+## Brand Extraction Workflow (when auditing existing sites)
+
+Use `webclaw` MCP to extract brand identity from live sites:
+
+1. `brand(url)` → get colors, fonts, logos
+2. Cross-reference with visual analysis findings
+3. Flag inconsistencies between declared brand and actual UI
+
 ## Design QA Checklist (strict)
 
 - **Hierarchy**: clear H1/H2/body scale and weight separation
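Step 3 of the added workflow (flagging inconsistencies between the declared brand and the actual UI) reduces to a set comparison over palettes; a sketch with made-up color values:

```python
def palette_drift(declared_colors, observed_colors):
    """Compare a declared brand palette against colors observed in the UI.

    Colors are hex strings; comparison is case-insensitive. Returns the
    observed colors that do not belong to the declared palette, sorted.
    """
    declared = {c.lower() for c in declared_colors}
    observed = {c.lower() for c in observed_colors}
    return sorted(c for c in observed if c not in declared)
```

Anything this returns is a candidate inconsistency worth flagging in the audit; an empty result means the observed colors all come from the declared palette.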
package/dist/template/.opencode/memory.db
Binary file

package/dist/template/.opencode/memory.db-shm
Binary file

package/dist/template/.opencode/memory.db-wal
Binary file
package/dist/template/.opencode/opencode.json
CHANGED

@@ -115,6 +115,12 @@
       "environment": {},
       "timeout": 120000,
       "type": "local"
+    },
+    "webclaw": {
+      "command": ["webclaw-mcp"],
+      "enabled": true,
+      "timeout": 120000,
+      "type": "local"
     }
   },
   "model": "opencode/minimax-m2.5-free",
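For reference, the server entry this hunk adds parses to the following shape; a minimal sanity check using only values that appear in the diff:

```python
import json

# The "webclaw" MCP server entry as added to opencode.json in this hunk.
entry = json.loads("""
{
  "webclaw": {
    "command": ["webclaw-mcp"],
    "enabled": true,
    "timeout": 120000,
    "type": "local"
  }
}
""")

# A local MCP server entry names a command to spawn and must be enabled.
cfg = entry["webclaw"]
assert cfg["type"] == "local" and cfg["enabled"]
assert cfg["command"][0] == "webclaw-mcp"
```

The binary name in `command` must be resolvable on the host (see the skill's prerequisites), otherwise the server fails to start at the configured 120000 ms timeout.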
@@ -171,7 +177,7 @@
 "claude-haiku-4.5": {
   "attachment": true,
   "limit": {
-    "context":
+    "context": 216000,
     "output": 32000
   },
   "options": {

@@ -199,7 +205,7 @@
 "claude-opus-4.5": {
   "attachment": true,
   "limit": {
-    "context":
+    "context": 216000,
     "output": 32000
   },
   "options": {

@@ -224,7 +230,7 @@
 "claude-opus-4.6": {
   "attachment": true,
   "limit": {
-    "context":
+    "context": 216000,
     "output": 64000
   },
   "options": {

@@ -294,7 +300,7 @@
 "claude-sonnet-4.5": {
   "attachment": true,
   "limit": {
-    "context":
+    "context": 216000,
     "output": 32000
   },
   "options": {

@@ -319,7 +325,7 @@
 "claude-sonnet-4.6": {
   "attachment": true,
   "limit": {
-    "context":
+    "context": 216000,
     "output": 32000
   },
   "options": {
package/dist/template/.opencode/skill/webclaw/SKILL.md
ADDED

@@ -0,0 +1,155 @@ (new file; shown without diff markers)

---
name: webclaw
description: Web content extraction, crawling, and scraping via webclaw MCP server. Use when fetching URLs fails (403), when crawling doc sites, batch-extracting pages, tracking content changes, or extracting brand identity from websites.
---

# Webclaw Skill

Fast, local-first web content extraction for LLMs. Rust-based scraper with TLS fingerprinting, 67% token reduction, and native MCP integration.

## Prerequisites

- `webclaw-mcp` binary installed at `~/.webclaw/webclaw-mcp`
- Install: `brew tap 0xMassi/webclaw && brew install webclaw`
- Or: download from https://github.com/0xMassi/webclaw/releases
- MCP server must be enabled in `.opencode/opencode.json` (`"enabled": true`)

## When to Use

| Scenario | Tool | Why |
| --- | --- | --- |
| `webfetch` got 403 | `scrape` | TLS fingerprinting bypasses bot protection |
| Research a doc site | `crawl` | BFS recursive extraction, same-origin |
| Extract multiple URLs | `batch` | Parallel multi-URL, faster than sequential |
| Discover sitemap URLs | `map` | Find all pages without crawling |
| Track doc changes | `diff` | Snapshot + compare workflow |
| Extract brand identity | `brand` | Colors, fonts, logos from any site |
| LLM-optimized output | `scrape` with `-f llm` | 67% fewer tokens than raw HTML |

## MCP Tools (8 local, no API key needed)

### scrape

Extract clean content from a URL. Returns markdown, text, JSON, or LLM-optimized format.

```
scrape(url: "https://example.com", format: "llm")
```

Options: `format` (markdown|text|json|llm|html), `include` (CSS selectors), `exclude` (CSS selectors), `only_main_content` (boolean)

### crawl

Recursive BFS crawl of a site. Same-origin only.

```
crawl(url: "https://docs.example.com", depth: 2, max_pages: 50)
```

Options: `depth` (1-10), `max_pages`, `sitemap` (seed from sitemap.xml)

### map

Discover URLs from a site's sitemap without fetching content.

```
map(url: "https://docs.example.com")
```

### batch

Parallel extraction from multiple URLs.

```
batch(urls: ["https://a.com", "https://b.com"], format: "llm")
```

### diff

Compare current page content against a previous snapshot.

```
diff(url: "https://example.com", snapshot: previous_json)
```

### brand

Extract visual identity: colors, fonts, logos, OG image.

```
brand(url: "https://stripe.com")
```

Returns: `{ name, colors: [{hex, usage}], fonts: [], logos: [{url, kind}] }`

### extract (needs Ollama)

LLM-powered structured extraction. Requires local Ollama instance.

### summarize (needs Ollama)

Page summarization via local LLM.

## Workflow Patterns

### Fallback when webfetch fails

1. Try `webfetch` first (built-in, no setup)
2. If 403 or empty → use `scrape` via webclaw MCP
3. If JS-rendered SPA → note limitation (webclaw doesn't execute JS without cloud API)

### Research a documentation site

1. `map(url)` to discover all pages
2. `crawl(url, depth: 2, max_pages: 50)` to extract content
3. Feed results to analysis agent

### Brand/design audit

1. `brand(url)` to extract colors, fonts, logos
2. Pass to vision agent for design system analysis

### Track changes over time

1. `scrape(url, format: "json")` → save as snapshot
2. Later: `diff(url, snapshot: saved)` → see what changed

## Installation

```bash
# Homebrew (recommended)
brew tap 0xMassi/webclaw
brew install webclaw

# Prebuilt binary
# Download from https://github.com/0xMassi/webclaw/releases
# Place webclaw-mcp in ~/.webclaw/

# Verify
webclaw-mcp --version
```

## Configuration

After install, enable in `.opencode/opencode.json`:

```json
{
  "mcp": {
    "webclaw": {
      "enabled": true
    }
  }
}
```

Optional env vars:

- `WEBCLAW_API_KEY` — cloud API for bot-protected sites (optional)
- `OLLAMA_HOST` — for extract/summarize tools (default: `http://localhost:11434`)

## Limitations

- **No JS rendering** (local mode) — SPAs that render entirely client-side won't extract fully. Use `--cloud` with API key, or use `playwright` skill instead.
- **Same-origin crawl only** — won't follow external links during crawl.
- **Early version** — v0.3.2, MIT license. Report issues to https://github.com/0xMassi/webclaw/issues