openclaw-crawleo-skill 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +236 -0
- package/SKILL.md +70 -0
- package/contracts/coverage-checklist.md +22 -0
- package/contracts/crawleo-endpoint-evidence.md +137 -0
- package/contracts/crawleo-endpoints.json +237 -0
- package/contracts/crawleo-endpoints.md +268 -0
- package/contracts/final-assembly-report.md +84 -0
- package/examples/README.md +37 -0
- package/examples/live-usage-template.js +34 -0
- package/examples/offline-fake-fetch.js +41 -0
- package/package.json +42 -0
- package/scripts/verify-contracts.js +97 -0
- package/scripts/verify-final.js +162 -0
- package/scripts/verify-scaffold.js +166 -0
- package/skill.json +47 -0
- package/src/client.js +155 -0
- package/src/contract.js +50 -0
- package/src/endpoints.js +78 -0
- package/src/errors.js +89 -0
- package/src/index.js +37 -0
- package/test/client.test.js +104 -0
- package/test/endpoints.test.js +130 -0
- package/test/error-fixtures.test.js +151 -0
- package/test/errors.test.js +116 -0
- package/test/live.test.js +28 -0
- package/test/scaffold.test.js +47 -0
- package/test/wrapper-fixtures.test.js +227 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 Crawleo OpenClaw Skill contributors
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,236 @@
+# Crawleo OpenClaw Skill
+
+A self-contained OpenClaw skill package for Crawleo web intelligence capabilities: Bing-powered web search, Google Search, Google Maps place search, URL crawling/content extraction, and Headful Browser crawling for protected or heavily dynamic sites.
+
+## Status
+
+This package is contract-backed and includes offline-tested REST wrapper helpers for every documented Crawleo endpoint in this milestone:
+
+- `/search`
+- `/google-search`
+- `/google-maps`
+- `/crawl`
+- `/headful-browser`
+
+Live Crawleo calls require `CRAWLEO_API_KEY`. Default verification is offline and must not call Crawleo, require credentials, or consume credits.
+
+Current source of truth:
+
+- Machine-readable endpoint contract: `contracts/crawleo-endpoints.json`
+- Human-readable endpoint contract: `contracts/crawleo-endpoints.md`
+- Endpoint coverage checklist: `contracts/coverage-checklist.md`
+- Final assembly report: `contracts/final-assembly-report.md`
+- Skill metadata: `skill.json`
+- Skill instructions: `SKILL.md`
+
+## Installation and Setup
+
+Use Node.js 18 or newer.
+
+```bash
+npm install
+```
+
+This package currently has no runtime dependencies.
+
+For live Crawleo calls, configure `CRAWLEO_API_KEY` in the environment where OpenClaw or Node runs. The key is sent with Crawleo's documented `x-api-key` header by default.
+
+Never print, echo, log, serialize, commit, or include the API key in examples, errors, test output, or debug output. The wrapper errors are designed to redact configured secrets before serialization.
+
+## Basic Usage
+
+```js
+import { createCrawleoClient } from 'openclaw-crawleo-skill';
+
+const client = createCrawleoClient({
+  apiKey: process.env.CRAWLEO_API_KEY
+});
+
+const results = await client.search({
+  query: 'ai agents',
+  max_pages: 1
+});
+```
+
+For tests and offline examples, inject `fetch` so no network request is made:
+
+```js
+import { createCrawleoClient } from 'openclaw-crawleo-skill';
+
+const client = createCrawleoClient({
+  apiKey: 'test-key',
+  fetch: async (url, init) => ({
+    ok: true,
+    status: 200,
+    headers: new Map([['content-type', 'application/json']]),
+    async text() {
+      return JSON.stringify({ url: url.toString(), method: init.method });
+    }
+  })
+});
+
+await client.crawl({ urls: ['https://example.com'], markdown: true });
+```
+
+## Wrapper API
+
+Create one client and call endpoint-specific methods with endpoint parameters.
+
+### Bing-powered web search
+
+```js
+await client.search({
+  query: 'machine learning',
+  max_pages: 1,
+  device: 'desktop',
+  markdown: true
+});
+```
+
+### Google Search
+
+```js
+await client.googleSearch({
+  q: 'best CRM software',
+  gl: 'us',
+  hl: 'en',
+  type: 'search',
+  num: 10
+});
+```
+
+### Google Maps
+
+```js
+await client.googleMaps({
+  q: 'restaurants in Paris',
+  hl: 'fr'
+});
+```
+
+### URL crawling and extraction
+
+```js
+await client.crawl({
+  urls: ['https://example.com'],
+  markdown: true,
+  render_js: false
+});
+```
+
+### Headful Browser crawling
+
+```js
+await client.headfulBrowser({
+  urls: 'https://example.com',
+  country: 'us',
+  output_format: 'markdown',
+  screenshot: false
+});
+```
+
+Top-level wrapper functions are also exported for advanced composition. They accept a configured client object plus endpoint parameters, not raw credentials:
+
+```js
+import { createCrawleoClient, googleSearch } from 'openclaw-crawleo-skill';
+
+const client = createCrawleoClient({ apiKey: process.env.CRAWLEO_API_KEY });
+await googleSearch(client, { q: 'AI agents', type: 'news' });
+```
+
+## Covered Crawleo Capabilities
+
+| REST Endpoint | MCP Tool | Wrapper Method | Required Params | Documented Cost / Notes |
+|---|---|---|---|---|
+| `/search` | `search_web` | `client.search` | `query` | 10 credits per page via `max_pages`; example-only `count` behavior is not specified in Crawleo docs. |
+| `/google-search` | `google_search` | `client.googleSearch` | `q` | 10 credits per request; documented endpoint is absent from the local OpenAPI snapshot. |
+| `/google-maps` | `google_maps` | `client.googleMaps` | `q` | Endpoint docs say 30 credits per request; MCP overview says 10 credits per request. Preserve this as a source conflict. |
+| `/crawl` | `crawl_web` | `client.crawl` | `urls` | 1 credit per URL when `render_js=false`; 10 credits per URL when `render_js=true`. |
+| `/headful-browser` | `headful_browser` | `client.headfulBrowser` | `urls` | 50 credits per URL; failed requests cost 0 credits; documented endpoint is absent from the local OpenAPI snapshot. |
+
+Validation is intentionally bounded to the contract. The wrapper checks required parameters and explicit documented enums such as `device`, Google Search `type` and `tbs`, and Headful Browser `output_format`. It does not invent undocumented limits.
+
+## Error Handling
+
+The package exports `CrawleoError` and `CRAWLEO_ERROR_CODES` for stable diagnostics.
+
+```js
+import { CrawleoError, CRAWLEO_ERROR_CODES } from 'openclaw-crawleo-skill';
+
+try {
+  await client.googleSearch({ q: 'AI agents', type: 'videos' });
+} catch (error) {
+  if (error instanceof CrawleoError) {
+    console.error(error.toJSON());
+  }
+}
+```
+
+Stable error codes include:
+
+| Code | Meaning |
+|---|---|
+| `CRAWLEO_CONFIG_MISSING_API_KEY` | Live call attempted without an API key. |
+| `CRAWLEO_CONFIG_MISSING_FETCH` | No `fetch` implementation is available. |
+| `CRAWLEO_VALIDATION_ERROR` | Wrapper input failed documented required-field or enum validation. |
+| `CRAWLEO_HTTP_BAD_REQUEST` | Crawleo returned HTTP 400. |
+| `CRAWLEO_HTTP_AUTH` | Crawleo returned HTTP 401. |
+| `CRAWLEO_HTTP_PAYMENT_REQUIRED` | Crawleo returned HTTP 402, typically insufficient credits. |
+| `CRAWLEO_HTTP_FORBIDDEN` | Crawleo returned HTTP 403, such as inactive account or expired subscription. |
+| `CRAWLEO_HTTP_RATE_LIMIT` | Crawleo returned HTTP 429. |
+| `CRAWLEO_HTTP_UPSTREAM` | Crawleo returned a 5xx or otherwise unmapped HTTP failure. |
+| `CRAWLEO_RESPONSE_MALFORMED_JSON` | Crawleo returned a successful response that could not be parsed as JSON. |
+| `CRAWLEO_TRANSPORT_ERROR` | Fetch failed before an HTTP response was received. |
+
+`CrawleoError.toJSON()` includes structured fields such as `code`, `endpoint`, `status`, `field`, and redacted `details`. It must not include raw API key values.
+
+## Offline Verification
+
+Run the default offline checks:
+
+```bash
+npm test
+npm run verify:contracts
+npm run verify:examples
+npm run verify:scaffold
+```
+
+These commands must not call `https://api.crawleo.dev`, require `CRAWLEO_API_KEY`, or consume Crawleo credits.
+
+Optional live smoke tests are available but are disabled unless both variables are present:
+
+```bash
+CRAWLEO_API_KEY=... CRAWLEO_ENABLE_LIVE_TESTS=1 npm run test:live
+```
+
+Without both variables, `npm run test:live` skips the live test and exits successfully.
+
+## Optional Crawleo MCP Companion
+
+Crawleo documents an MCP endpoint at:
+
+```text
+https://api.crawleo.dev/mcp
+```
+
+This package uses REST wrappers as the primary execution path. MCP setup is optional companion context, not the primary implementation path for this milestone.
+
+## Contract and Ambiguity Policy
+
+Use `contracts/crawleo-endpoints.json` as the implementation source of truth. Crawleo endpoint-specific docs take precedence over the local OpenAPI snapshot when the OpenAPI snapshot omits a documented endpoint.
+
+When a default, limit, parameter, error table, or response field is unclear, write `not specified in Crawleo docs` rather than inventing behavior.
+
+Known source issues preserved from the contract inventory:
+
+- `/google-search` and `/headful-browser` are documented by endpoint docs but absent from the local OpenAPI snapshot.
+- Google Maps cost differs by source: endpoint docs say 30 credits per request; MCP overview says 10 credits per request.
+- `/search` examples include `count`, but the visible parameter table does not document it.
+
+## Development Roadmap
+
+- S02: package scaffold, metadata, README, and offline scaffold verification.
+- S03: Crawleo REST client and endpoint wrappers with offline tests.
+- S04: complete user documentation, examples, and endpoint coverage checklist.
+- S05: offline wrapper tests and optional live-test gating.
+- S06: final package integration proof.
package/SKILL.md
ADDED
@@ -0,0 +1,70 @@
+---
+name: crawleo
+description: Use when OpenClaw needs Crawleo-powered web search, Google Search SERP data, Google Maps place data, URL crawling/content extraction, or headful browser crawling through the Crawleo REST API.
+---
+
+# Crawleo OpenClaw Skill
+
+Use this skill when OpenClaw needs Crawleo-powered live web search, Google Search SERP data, Google Maps place data, URL crawling/content extraction, or headful browser crawling for protected and highly dynamic sites.
+
+## Current Implementation Status
+
+This repository now includes offline-tested Crawleo REST wrapper helpers for all five documented endpoints. Live Crawleo calls require `CRAWLEO_API_KEY`; user-facing examples and optional live verification are expanded in later slices. Do not claim live behavior was verified unless an explicitly enabled live test has run.
+
+## Source of Truth
+
+Use `contracts/crawleo-endpoints.json` as the machine-readable endpoint contract, `contracts/crawleo-endpoints.md` as the human-readable contract, and `contracts/coverage-checklist.md` as the endpoint-to-wrapper/test/example coverage checklist. These files cover:
+
+- `/search` mapped to MCP tool `search_web`
+- `/google-search` mapped to MCP tool `google_search`
+- `/google-maps` mapped to MCP tool `google_maps`
+- `/crawl` mapped to MCP tool `crawl_web`
+- `/headful-browser` mapped to MCP tool `headful_browser`
+
+Crawleo endpoint-specific docs take precedence over the local OpenAPI snapshot when sources conflict. If a default, limit, response field, error table, or parameter is unclear, write `not specified in Crawleo docs` rather than inventing behavior.
+
+## Authentication and Secret Handling
+
+Live Crawleo REST calls require `CRAWLEO_API_KEY`. Send it with Crawleo's documented `x-api-key` header by default. Crawleo also documents `Authorization: Bearer YOUR_API_KEY` as an alternate authentication style.
+
+Create a client with `createCrawleoClient({ apiKey: process.env.CRAWLEO_API_KEY })`, then call `client.search`, `client.googleSearch`, `client.googleMaps`, `client.crawl`, or `client.headfulBrowser` with endpoint parameters from `contracts/crawleo-endpoints.json`.
+
+Never print, echo, log, persist, or include API key values in errors, examples, test output, or debug output.
+
+## Offline-First Behavior
+
+Default commands, examples, and tests must be offline-safe. They must not call `https://api.crawleo.dev`, require `CRAWLEO_API_KEY`, or consume Crawleo credits unless explicitly marked as live tests.
+
+Live tests must require both:
+
+1. `CRAWLEO_API_KEY`
+2. `CRAWLEO_ENABLE_LIVE_TESTS=1`
+
+Run optional live verification with:
+
+```bash
+CRAWLEO_API_KEY=... CRAWLEO_ENABLE_LIVE_TESTS=1 npm run test:live
+```
+
+Without both variables, the live test skips safely and exits 0.
+
+## Endpoint Use Guidance
+
+- Use `/search` / `search_web` for Bing-powered web search with optional auto-crawling and content extraction for LLM/RAG workflows.
+- Use `/google-search` / `google_search` for Google SERP data, including web, news, images, places, shopping, knowledge graph, People Also Ask, related searches, and answer boxes.
+- Use `/google-maps` / `google_maps` for structured place/business/landmark data from Google Maps.
+- Use `/crawl` / `crawl_web` for direct URL crawling and content extraction. Try this before headful browser to reduce credit usage.
+- Use `/headful-browser` / `headful_browser` only when standard crawling is blocked or a headed browser/screenshot path is required. Crawleo docs say this costs 50 credits per URL and failed requests cost 0 credits.
+
+## Verification
+
+Run offline verification with:
+
+```bash
+npm test
+npm run verify:contracts
+npm run verify:examples
+npm run verify:scaffold
+```
+
+At this stage, `npm run verify:scaffold` proves the self-contained package files exist, point to the Crawleo contract inventory, and export the runtime wrapper surface. Later slices add richer examples, documentation, and optional live-test gating.
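The two-variable live-test gate that SKILL.md describes can be sketched as a small predicate. This is a hypothetical helper for illustration; the package's actual gating lives in its test files:

```javascript
// Hypothetical sketch of the documented gate: live tests are enabled only
// when BOTH CRAWLEO_API_KEY and CRAWLEO_ENABLE_LIVE_TESTS=1 are present.
function liveTestsEnabled(env) {
  return Boolean(env.CRAWLEO_API_KEY) && env.CRAWLEO_ENABLE_LIVE_TESTS === '1';
}

console.log(liveTestsEnabled({ CRAWLEO_API_KEY: 'k' })); // → false
console.log(liveTestsEnabled({ CRAWLEO_API_KEY: 'k', CRAWLEO_ENABLE_LIVE_TESTS: '1' })); // → true
```

Requiring the second variable to equal the exact string `'1'` keeps an exported-but-stale API key from silently triggering credit-consuming calls.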
package/contracts/coverage-checklist.md
ADDED
@@ -0,0 +1,22 @@
+# Crawleo Endpoint Coverage Checklist
+
+This checklist maps the documented Crawleo surface to this OpenClaw skill package. It is intended for downstream documentation, test, and final assembly verification.
+
+| REST Endpoint | MCP Tool | Contract Entry | Wrapper Method | README Coverage | Example Coverage | Test Coverage | Notes |
+|---|---|---|---|---|---|---|---|
+| `/search` | `search_web` | `contracts/crawleo-endpoints.json` id `search` | `client.search` / `search(client, params)` | Basic usage, wrapper API, covered capabilities table | `examples/offline-fake-fetch.js` | `test/wrapper-fixtures.test.js`, `test/endpoints.test.js`, `test/client.test.js`, `test/errors.test.js` | `count` appears in examples but is not in the visible parameter table; keep as not specified in Crawleo docs. |
+| `/google-search` | `google_search` | `contracts/crawleo-endpoints.json` id `google_search` | `client.googleSearch` / `googleSearch(client, params)` | Wrapper API and covered capabilities table | `examples/offline-fake-fetch.js` | `test/wrapper-fixtures.test.js`, `test/endpoints.test.js`, `test/errors.test.js` | Documented by endpoint docs but absent from the local OpenAPI snapshot. |
+| `/google-maps` | `google_maps` | `contracts/crawleo-endpoints.json` id `google_maps` | `client.googleMaps` / `googleMaps(client, params)` | Wrapper API and covered capabilities table | `examples/offline-fake-fetch.js` | `test/wrapper-fixtures.test.js`, `test/endpoints.test.js`, `test/errors.test.js` | Cost differs by source: endpoint docs say 30 credits per request; MCP overview says 10 credits per request. |
+| `/crawl` | `crawl_web` | `contracts/crawleo-endpoints.json` id `crawl` | `client.crawl` / `crawl(client, params)` | Wrapper API and covered capabilities table | `examples/offline-fake-fetch.js` | `test/wrapper-fixtures.test.js`, `test/endpoints.test.js`, `test/client.test.js`, `test/errors.test.js` | Error response table is not specified in Crawleo docs. |
+| `/headful-browser` | `headful_browser` | `contracts/crawleo-endpoints.json` id `headful_browser` | `client.headfulBrowser` / `headfulBrowser(client, params)` | Wrapper API and covered capabilities table | `examples/offline-fake-fetch.js` | `test/wrapper-fixtures.test.js`, `test/endpoints.test.js`, `test/errors.test.js` | Documented by endpoint docs but absent from the local OpenAPI snapshot; error response table is not specified in Crawleo docs. |
+
+## Verification Surfaces
+
+- `npm run verify:contracts` verifies endpoint and MCP tool coverage in the contract inventory.
+- `npm test` verifies wrapper request construction, validation, response parsing, error normalization, live-test skip behavior, and secret redaction offline.
+- `npm run test:live` runs the optional live smoke test file; it skips unless both `CRAWLEO_API_KEY` and `CRAWLEO_ENABLE_LIVE_TESTS=1` are set.
+- `npm run verify:scaffold` verifies package files, public exports, README/SKILL references, live-test gate documentation, and this coverage checklist.
+
+## Live Verification Status
+
+Live Crawleo calls are intentionally not part of default verification. Optional live tests must require both `CRAWLEO_API_KEY` and `CRAWLEO_ENABLE_LIVE_TESTS=1` before making any request to Crawleo. Without both variables, `npm run test:live` exits 0 with the live test skipped.
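Elsewhere in this package, the `/crawl` cost is documented as 1 credit per URL when `render_js=false` and 10 credits per URL when `render_js=true`; unlike the Google Maps cost, this figure is consistent across sources. That rule reduces to a one-line estimate (the helper name is hypothetical, for illustration only):

```javascript
// Hypothetical helper expressing the documented /crawl pricing:
// 1 credit per URL for a plain HTTP request, 10 per URL with JS rendering.
function estimateCrawlCredits(urlCount, renderJs) {
  return urlCount * (renderJs ? 10 : 1);
}

console.log(estimateCrawlCredits(5, false)); // → 5
console.log(estimateCrawlCredits(5, true)); // → 50
```

This sketch deliberately covers only `/crawl`; the `/google-maps` cost remains an unresolved source conflict and should not be hard-coded.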
package/contracts/crawleo-endpoint-evidence.md
ADDED
@@ -0,0 +1,137 @@
+# Crawleo Endpoint Contract Evidence
+
+This file records the source evidence for S01 before the structured contract inventory is authored. Endpoint-specific Crawleo docs are treated as authoritative where the local OpenAPI snapshot is narrower.
+
+## Source Set
+
+- Crawleo docs index: `.gsd/research/crawleo-docs/llms.txt.txt` / `https://docs.crawleo.dev/llms.txt`
+- REST introduction: `.gsd/research/crawleo-docs/md/introduction.md` / `https://docs.crawleo.dev/api-reference/introduction.md`
+- Authentication: `.gsd/research/crawleo-docs/md/authentication.md` / `https://docs.crawleo.dev/authentication.md`
+- MCP overview: `.gsd/research/crawleo-docs/md/overview.md` / `https://docs.crawleo.dev/mcp/overview.md`
+- OpenAPI snapshot: `.gsd/research/crawleo-docs/openapi.json.json` / `https://docs.crawleo.dev/openapi.json`
+
+## Cross-Source Conflict Notes
+
+- The endpoint-specific docs and docs index list five REST capabilities: `/search`, `/google-search`, `/google-maps`, `/crawl`, and `/headful-browser`.
+- The local OpenAPI snapshot currently lists only `/search`, `/google-maps`, and `/crawl`.
+- S01 therefore keeps `/google-search` and `/headful-browser` because they are documented on endpoint-specific Crawleo docs pages and in the docs index.
+- Where fields are not explicitly documented, downstream contract files must use the phrase `not specified in Crawleo docs` rather than inventing behavior.
+
+## REST Endpoint Evidence
+
+### `/search` — Bing Search API
+
+- Source doc: `.gsd/research/crawleo-docs/md/search.md`
+- Source URL: `https://docs.crawleo.dev/api-reference/endpoint/search.md`
+- Endpoint evidence: `GET https://api.crawleo.dev/search`
+- Authentication evidence: required `x-api-key` header; docs also allow `Authorization: Bearer YOUR_API_KEY`.
+- Required query parameter: `query` string.
+- Documented optional parameters include:
+  - `max_pages` integer, default `1`; each page costs 10 credits.
+  - `setLang` string language code examples: `en`, `es`, `fr`, `de`.
+  - `cc` string country code examples: `US`, `GB`, `DE`.
+  - `geolocation` string; docs mention `random` for randomized geolocation.
+  - `device` string, default `desktop`; documented options: `desktop`, `mobile`, `tablet`.
+  - SERP toggles: `copilot_answer`, `questions_answers`, `related_queries`, `sidebar`, `direct_answer`; each default `true`.
+  - Page-content toggles: `raw_html`, `enhanced_html`, `page_text`, `markdown`; each default `false` in the endpoint doc.
+  - `auto_crawling` boolean, default `false`.
+- Example evidence: basic search request uses `https://api.crawleo.dev/search?query=machine%20learning&count=10` even though `count` is not listed in the visible parameter table; this must be marked as an ambiguity before implementation.
+- Response evidence fields include `query`, `pages_fetched`, `pages`, `total_results`, `search_results`, result `title`, `link`, `date`, `snippet`, `source`, `related_queries`, `page_content`, `enhanced_html`, `page_markdown`, and `credits`.
+- Error response table: not specified in Crawleo docs.
+- MCP mapping: `search_web` from MCP overview.
+
+### `/google-search` — Google Search API
+
+- Source doc: `.gsd/research/crawleo-docs/md/google-search.md`
+- Source URL: `https://docs.crawleo.dev/api-reference/endpoint/google-search.md`
+- Endpoint evidence: `GET https://api.crawleo.dev/google-search`
+- Cost evidence: 10 credits per request.
+- Authentication evidence: required `x-api-key` header; docs also allow `Authorization: Bearer YOUR_API_KEY`.
+- Required query parameter: `q` string.
+- Documented optional parameters include:
+  - `gl` string, default `us`; ISO 3166-1 alpha-2 examples include `us`, `gb`, `eg`, `de`, `fr`.
+  - `hl` string, default `en`; IETF language tag examples include `en`, `ar`, `fr`, `de`.
+  - `tbs` string; documented values include `qdr:h`, `qdr:d`, `qdr:w`, `qdr:m`, `qdr:y`.
+  - `page` integer, default `1`; 1-indexed.
+  - `num` integer, default `10`; documented range `1–100`.
+  - `type` string, default `search`; documented values: `search`, `news`, `images`, `places`, `shopping`.
+- Response evidence fields include `parameters`, `google_search_results`, result `title`, `link`, `snippet`, `position`, `knowledgeGraph`, `peopleAlsoAsk`, `relatedSearches`, and `answerBox`.
+- Error evidence: `400` missing `q`, `401` invalid/missing API key, `402` insufficient credits, `429` rate limit exceeded, `500` internal server error.
+- MCP mapping: endpoint doc states tool name `google_search`; MCP overview also lists `google_search`.
+
+### `/google-maps` — Google Maps API
+
+- Source doc: `.gsd/research/crawleo-docs/md/google-maps.md`
+- Source URL: `https://docs.crawleo.dev/api-reference/endpoint/google-maps.md`
+- Endpoint evidence: `GET https://api.crawleo.dev/google-maps`
+- Cost evidence: endpoint doc says 30 credits per request; MCP overview says `google_maps` costs 10 credits per request. Preserve this as a source conflict until resolved.
+- Authentication evidence: required `x-api-key` header; docs also allow `Authorization: Bearer YOUR_API_KEY`.
+- Required query parameter: `q` string; accepts business names, landmarks, addresses, keywords, and category + location queries.
+- Documented optional parameters include:
+  - `hl` string ISO 639-1 language code; examples include `en`, `ar`, `fr`, `de`.
+  - `ll` string location bias in format `@latitude,longitude,zoomz`; zoom range documented as `1z` to `21z`.
+  - `placeId` string Google Place ID for direct lookup.
+  - `cid` string Google numeric business/customer ID for direct business lookup.
+- Parameter combination evidence includes `q` only, `q + hl`, `q + ll`, `q + ll + hl`, `q + placeId`, `q + placeId + hl`, `q + cid`, and `q + cid + hl`.
+- Response evidence fields include `parameters`, `google_maps_results`, result `position`, `title`, `address`, `rating`, `ratingCount`, `phoneNumber`, `website`, `type`, `types`, `priceLevel`, `placeId`, `cid`, `latitude`, `longitude`, `openingHours`, `thumbnailUrl`, and `credits`.
+- Error evidence: `400` missing `q`, `401` invalid/missing API key, `403` inactive account or expired subscription, `429` credits exhausted or concurrent request limit reached, `500` internal server error.
+- MCP mapping: `google_maps` from MCP overview.
+
+### `/crawl` — Crawler API
+
+- Source doc: `.gsd/research/crawleo-docs/md/crawler.md`
+- Source URL: `https://docs.crawleo.dev/api-reference/endpoint/crawler.md`
+- Endpoint evidence: `GET https://api.crawleo.dev/crawl`
+- Authentication evidence: required `x-api-key` header; docs also allow `Authorization: Bearer YOUR_API_KEY`.
+- Required query parameter: `urls` string; accepts a single URL or comma-separated list.
+- Documented optional parameters include:
+  - `render_js` boolean, default `false`; `true` browser rendering costs 10 credits per URL, `false` HTTP request costs 1 credit per URL.
+  - `geolocation` string ISO 3166-1 alpha-2 country code.
+  - Output toggles: `raw_html` default `false`, `enhanced_html` default `true`, `page_text` default `false`, `markdown` default `true`.
+  - Screenshot toggles: `screenshot` default `false`, `screenshot_full_page` default `false`; screenshots are only available when `render_js=true`, and `screenshot=true` without JavaScript rendering is ignored.
+- Response evidence fields include `results`, per-result `url`, `status_code`, `raw_html`, `enhanced_html`, `markdown`, `page_text`, `screenshot`, `error`, plus top-level `credits` and `successful_pages`.
+- Error response table: not specified in Crawleo docs.
+- MCP mapping: `crawl_web` from MCP overview.
+
+### `/headful-browser` — Headful Browser API
+
+- Source doc: `.gsd/research/crawleo-docs/md/headful-browser.md`
+- Source URL: `https://docs.crawleo.dev/api-reference/endpoint/headful-browser.md`
+- Endpoint evidence: `GET https://api.crawleo.dev/headful-browser`
+- Cost evidence: 50 credits per URL; failed requests cost 0 credits.
+- Authentication evidence: required `x-api-key` header; docs also allow `Authorization: Bearer YOUR_API_KEY`.
+- Required query parameter: `urls` string; accepts one or more URLs as a single URL or comma-separated list.
+- Documented optional parameters include:
+  - `country` string, default `us`; supported examples include `us`, `gb`, `de`, `fr`, `jp`, `in`, `br`, `ca`, `au`, and more.
+  - `output_format` string, default `markdown`; documented values: `markdown`, `enhanced_html`, `raw_html`, `page_text`.
+  - `screenshot` boolean, default `false`; screenshot is returned as a URL.
+- Response evidence fields include `status`, `data`, per-item `url`, `markdown`, `raw_html`, `enhanced_html`, `page_text`, `screenshot`, `blocked`, `credits_used`, and `credits_remaining`.
+- Error response table: not specified in Crawleo docs.
+- MCP mapping: endpoint doc states tool name `headful_browser`; MCP overview also lists `headful_browser`.
+
+## MCP Tool Evidence
+
+- Source doc: `.gsd/research/crawleo-docs/md/overview.md`
+- Source URL: `https://docs.crawleo.dev/mcp/overview.md`
+- MCP endpoint: `https://api.crawleo.dev/mcp`
+- Documented tools:
+  - `search_web` — Bing-powered web search with auto-crawling; cost 10 credits per page of results.
+  - `google_search` — Google SERP data; cost 10 credits per request.
+  - `google_maps` — Google Maps places/business search; MCP overview says 10 credits per request, conflicting with endpoint doc's 30 credits per request.
+  - `crawl_web` — URL crawling/content extraction; cost 1 credit per URL for HTTP request and 10 credits per URL for browser rendering.
+  - `headful_browser` — premium headed browser crawling; cost 50 credits per URL.
+
+## OpenAPI Evidence
+
+- Source file: `.gsd/research/crawleo-docs/openapi.json.json`
+- Source URL: `https://docs.crawleo.dev/openapi.json`
+- Paths present in the local snapshot: `/search`, `/google-maps`, `/crawl`.
+- Paths absent from the local snapshot but present in endpoint docs and docs index: `/google-search`, `/headful-browser`.
+
+## Downstream Contract Rules
+
+- Use Crawleo-only naming and branding.
+- Use endpoint-specific docs over OpenAPI when OpenAPI omits an endpoint.
+- Preserve source conflicts, especially Google Maps cost discrepancy, instead of choosing silently.
+- Use `not specified in Crawleo docs` for missing error tables, undocumented defaults, unclear ranges, or example-only parameters not listed in parameter tables.
+- Do not include any live Crawleo calls in default verification.
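The evidence above documents all five REST endpoints as authenticated GET requests against `https://api.crawleo.dev`. A minimal sketch of how such a request URL can be assembled from the documented query parameters (illustrative only; the package's actual client may serialize parameters differently):

```javascript
// Illustrative sketch: build a GET URL for the documented /crawl endpoint.
// The parameter names (urls, render_js, markdown) come from the evidence
// above; the buildCrawleoUrl helper itself is hypothetical.
function buildCrawleoUrl(endpoint, params) {
  const url = new URL(endpoint, 'https://api.crawleo.dev');
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

const crawlUrl = buildCrawleoUrl('/crawl', {
  urls: 'https://example.com',
  render_js: false,
  markdown: true
});
// The credential travels in the x-api-key header, never in the query string.
console.log(crawlUrl);
```

Using the WHATWG `URL` and `URLSearchParams` APIs keeps percent-encoding of values such as the `urls` parameter correct without hand-rolled escaping.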