@j0hanz/fetch-url-mcp 1.6.1 → 1.7.0
- package/README.md +171 -181
- package/dist/cli.d.ts +1 -0
- package/dist/cli.d.ts.map +1 -0
- package/dist/http/auth.d.ts +12 -0
- package/dist/http/auth.d.ts.map +1 -0
- package/dist/http/auth.js +84 -5
- package/dist/http/health.d.ts +1 -0
- package/dist/http/health.d.ts.map +1 -0
- package/dist/http/helpers.d.ts +1 -0
- package/dist/http/helpers.d.ts.map +1 -0
- package/dist/http/native.d.ts +1 -0
- package/dist/http/native.d.ts.map +1 -0
- package/dist/http/native.js +80 -63
- package/dist/http/rate-limit.d.ts +1 -0
- package/dist/http/rate-limit.d.ts.map +1 -0
- package/dist/index.d.ts +1 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/lib/content.d.ts +3 -0
- package/dist/lib/content.d.ts.map +1 -0
- package/dist/lib/content.js +16 -11
- package/dist/lib/core.d.ts +5 -8
- package/dist/lib/core.d.ts.map +1 -0
- package/dist/lib/core.js +110 -97
- package/dist/lib/fetch-pipeline.d.ts +1 -1
- package/dist/lib/fetch-pipeline.d.ts.map +1 -0
- package/dist/lib/fetch-pipeline.js +54 -44
- package/dist/lib/http.d.ts +1 -0
- package/dist/lib/http.d.ts.map +1 -0
- package/dist/lib/mcp-tools.d.ts +45 -7
- package/dist/lib/mcp-tools.d.ts.map +1 -0
- package/dist/lib/mcp-tools.js +37 -6
- package/dist/lib/net-utils.d.ts +1 -0
- package/dist/lib/net-utils.d.ts.map +1 -0
- package/dist/lib/progress.d.ts +1 -0
- package/dist/lib/progress.d.ts.map +1 -0
- package/dist/lib/task-handlers.d.ts +3 -2
- package/dist/lib/task-handlers.d.ts.map +1 -0
- package/dist/lib/task-handlers.js +19 -36
- package/dist/lib/types.d.ts +4 -0
- package/dist/lib/types.d.ts.map +1 -0
- package/dist/lib/types.js +12 -1
- package/dist/lib/url.d.ts +3 -0
- package/dist/lib/url.d.ts.map +1 -0
- package/dist/lib/url.js +78 -151
- package/dist/lib/utils.d.ts +2 -2
- package/dist/lib/utils.d.ts.map +1 -0
- package/dist/lib/utils.js +60 -94
- package/dist/lib/zod.d.ts +3 -0
- package/dist/lib/zod.d.ts.map +1 -0
- package/dist/lib/zod.js +33 -0
- package/dist/prompts/index.d.ts +2 -1
- package/dist/prompts/index.d.ts.map +1 -0
- package/dist/prompts/index.js +2 -13
- package/dist/resources/index.d.ts +2 -1
- package/dist/resources/index.d.ts.map +1 -0
- package/dist/resources/index.js +5 -19
- package/dist/resources/instructions.d.ts +1 -0
- package/dist/resources/instructions.d.ts.map +1 -0
- package/dist/resources/instructions.js +2 -0
- package/dist/schemas/cache.d.ts +18 -0
- package/dist/schemas/cache.d.ts.map +1 -0
- package/dist/schemas/cache.js +19 -0
- package/dist/schemas/inputs.d.ts +1 -0
- package/dist/schemas/inputs.d.ts.map +1 -0
- package/dist/schemas/outputs.d.ts +6 -5
- package/dist/schemas/outputs.d.ts.map +1 -0
- package/dist/schemas/outputs.js +5 -9
- package/dist/server.d.ts +1 -0
- package/dist/server.d.ts.map +1 -0
- package/dist/server.js +6 -6
- package/dist/tasks/execution.d.ts +1 -0
- package/dist/tasks/execution.d.ts.map +1 -0
- package/dist/tasks/execution.js +3 -21
- package/dist/tasks/manager.d.ts +2 -6
- package/dist/tasks/manager.d.ts.map +1 -0
- package/dist/tasks/manager.js +2 -4
- package/dist/tasks/owner.d.ts +1 -0
- package/dist/tasks/owner.d.ts.map +1 -0
- package/dist/tasks/tool-registry.d.ts +1 -0
- package/dist/tasks/tool-registry.d.ts.map +1 -0
- package/dist/tools/fetch-url.d.ts +4 -6
- package/dist/tools/fetch-url.d.ts.map +1 -0
- package/dist/tools/fetch-url.js +46 -44
- package/dist/tools/index.d.ts +1 -0
- package/dist/tools/index.d.ts.map +1 -0
- package/dist/transform/html-translators.d.ts +1 -0
- package/dist/transform/html-translators.d.ts.map +1 -0
- package/dist/transform/html-translators.js +5 -2
- package/dist/transform/metadata.d.ts +1 -0
- package/dist/transform/metadata.d.ts.map +1 -0
- package/dist/transform/metadata.js +1 -0
- package/dist/transform/{workers/shared.d.ts → shared.d.ts} +2 -1
- package/dist/transform/shared.d.ts.map +1 -0
- package/dist/transform/{workers/shared.js → shared.js} +1 -1
- package/dist/transform/transform.d.ts +1 -0
- package/dist/transform/transform.d.ts.map +1 -0
- package/dist/transform/transform.js +21 -14
- package/dist/transform/types.d.ts +1 -4
- package/dist/transform/types.d.ts.map +1 -0
- package/dist/transform/worker-pool.d.ts +3 -18
- package/dist/transform/worker-pool.d.ts.map +1 -0
- package/dist/transform/worker-pool.js +51 -167
- package/package.json +9 -6
- package/dist/transform/workers/transform-child.d.ts +0 -1
- package/dist/transform/workers/transform-child.js +0 -15
- package/dist/transform/workers/transform-worker.d.ts +0 -1
- package/dist/transform/workers/transform-worker.js +0 -13
package/README.md
CHANGED
|
@@ -6,19 +6,19 @@
|
|
|
6
6
|
|
|
7
7
|
[](https://lmstudio.ai/install-mcp?name=fetch-url-mcp&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBqMGhhbnovZmV0Y2gtdXJsLW1jcEBsYXRlc3QiXX0%3D) [](https://cursor.com/en/install-mcp?name=fetch-url-mcp&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBqMGhhbnovZmV0Y2gtdXJsLW1jcEBsYXRlc3QiXX0%3D) [](https://block.github.io/goose/extension?cmd=npx&arg=-y&arg=%40j0hanz%2Ffetch-url-mcp%40latest&id=%40j0hanz%2Ffetch-url-mcp&name=fetch-url-mcp&description=fetch-url-mcp%20MCP%20server)
|
|
8
8
|
|
|
9
|
-
|
|
9
|
+
A web content fetcher MCP server that converts HTML to clean, AI and human readable markdown.
|
|
10
10
|
|
|
11
11
|
## Overview
|
|
12
12
|
|
|
13
|
-
|
|
13
|
+
The Fetch URL MCP Server provides a standardized interface for fetching public web content and transforming it into Markdown enriched with structured metadata. It validates URLs, applies noise removal heuristics, and caches results for reuse. The server supports both inline and task-based execution modes, making it suitable for a wide range of client applications and LLM interactions.
|
|
14
14
|
|
|
15
15
|
## Key Features
|
|
16
16
|
|
|
17
|
-
- `fetch-url`
|
|
18
|
-
- The tool
|
|
19
|
-
- GitHub, GitLab, and
|
|
20
|
-
- `
|
|
21
|
-
- HTTP mode
|
|
17
|
+
- `fetch-url` validates public HTTP(S) URLs, fetches the page, and returns cleaned Markdown plus structured metadata.
|
|
18
|
+
- The tool advertises optional task support and emits progress updates while fetching and transforming larger pages.
|
|
19
|
+
- GitHub, GitLab, Bitbucket, and Gist page URLs are rewritten to raw-content endpoints when possible before fetch.
|
|
20
|
+
- `internal://instructions` and `internal://cache/{namespace}/{hash}` expose built-in guidance and cached Markdown as MCP resources.
|
|
21
|
+
- HTTP mode adds host/origin validation, auth, rate limiting, health checks, OAuth protected-resource metadata, and cached-download URLs.
|
|
22
22
|
|
|
23
23
|
## Requirements
|
|
24
24
|
|
|
@@ -53,6 +53,7 @@ Add to `.vscode/mcp.json`:
|
|
|
53
53
|
{
|
|
54
54
|
"servers": {
|
|
55
55
|
"fetch-url-mcp": {
|
|
56
|
+
"type": "stdio",
|
|
56
57
|
"command": "npx",
|
|
57
58
|
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
|
|
58
59
|
}
|
|
@@ -81,6 +82,7 @@ Add to `.vscode/mcp.json`:
|
|
|
81
82
|
{
|
|
82
83
|
"servers": {
|
|
83
84
|
"fetch-url-mcp": {
|
|
85
|
+
"type": "stdio",
|
|
84
86
|
"command": "npx",
|
|
85
87
|
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
|
|
86
88
|
}
|
|
@@ -125,12 +127,13 @@ For more info, see [Cursor MCP docs](https://docs.cursor.com/context/model-conte
|
|
|
125
127
|
|
|
126
128
|
[](https://vs-open.link/mcp-install?%7B%22fetch-url-mcp%22%3A%7B%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%40j0hanz%2Ffetch-url-mcp%40latest%22%5D%7D%7D)
|
|
127
129
|
|
|
128
|
-
|
|
130
|
+
For solution-scoped setup, add this to `.mcp.json` at the solution root:
|
|
129
131
|
|
|
130
132
|
```json
|
|
131
133
|
{
|
|
132
|
-
"
|
|
134
|
+
"servers": {
|
|
133
135
|
"fetch-url-mcp": {
|
|
136
|
+
"type": "stdio",
|
|
134
137
|
"command": "npx",
|
|
135
138
|
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
|
|
136
139
|
}
|
|
@@ -145,22 +148,22 @@ For more info, see [Visual Studio MCP docs](https://learn.microsoft.com/en-us/vi
|
|
|
145
148
|
<details>
|
|
146
149
|
<summary><b>Install in Goose</b></summary>
|
|
147
150
|
|
|
148
|
-
[](https://block.github.io/goose/extension?cmd=npx&arg=-y&arg=%40j0hanz%2Ffetch-url-mcp%40latest&id=%40j0hanz%2Ffetch-url-mcp&name=fetch-url-mcp&description=
|
|
151
|
+
[](https://block.github.io/goose/extension?cmd=npx&arg=-y&arg=%40j0hanz%2Ffetch-url-mcp%40latest&id=%40j0hanz%2Ffetch-url-mcp&name=fetch-url-mcp&description=A%20web%20content%20fetcher%20MCP%20server%20that%20converts%20HTML%20to%20clean%2C%20AI%20and%20human%20readable%20markdown.)
|
|
149
152
|
|
|
150
|
-
Add to `
|
|
153
|
+
Add to `~/.config/goose/config.yaml` on macOS/Linux or `%APPDATA%\Block\goose\config\config.yaml` on Windows:
|
|
151
154
|
|
|
152
|
-
```
|
|
153
|
-
|
|
154
|
-
|
|
155
|
-
|
|
156
|
-
|
|
157
|
-
|
|
158
|
-
|
|
159
|
-
|
|
160
|
-
|
|
155
|
+
```yaml
|
|
156
|
+
extensions:
|
|
157
|
+
fetch-url-mcp:
|
|
158
|
+
name: fetch-url-mcp
|
|
159
|
+
cmd: npx
|
|
160
|
+
args: ['-y', '@j0hanz/fetch-url-mcp@latest']
|
|
161
|
+
enabled: true
|
|
162
|
+
type: stdio
|
|
163
|
+
timeout: 300
|
|
161
164
|
```
|
|
162
165
|
|
|
163
|
-
For more info, see [Goose
|
|
166
|
+
For more info, see [Goose extension docs](https://block.github.io/goose/docs/getting-started/using-extensions/).
|
|
164
167
|
|
|
165
168
|
</details>
|
|
166
169
|
|
|
@@ -169,7 +172,7 @@ For more info, see [Goose MCP docs](https://block.github.io/goose/docs/getting-s
|
|
|
169
172
|
|
|
170
173
|
[](https://lmstudio.ai/install-mcp?name=fetch-url-mcp&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBqMGhhbnovZmV0Y2gtdXJsLW1jcEBsYXRlc3QiXX0%3D)
|
|
171
174
|
|
|
172
|
-
Add to `
|
|
175
|
+
Add to `~/.lmstudio/mcp.json` on macOS/Linux or `%USERPROFILE%/.lmstudio/mcp.json` on Windows:
|
|
173
176
|
|
|
174
177
|
```json
|
|
175
178
|
{
|
|
@@ -209,26 +212,27 @@ For more info, see [Claude Desktop MCP docs](https://modelcontextprotocol.io/qui
|
|
|
209
212
|
<details>
|
|
210
213
|
<summary><b>Install in Claude Code</b></summary>
|
|
211
214
|
|
|
212
|
-
|
|
215
|
+
Use the CLI:
|
|
216
|
+
|
|
217
|
+
```sh
|
|
218
|
+
claude mcp add fetch-url-mcp -- npx -y @j0hanz/fetch-url-mcp@latest
|
|
219
|
+
```
|
|
220
|
+
|
|
221
|
+
For project-scoped config, Claude Code writes `.mcp.json` with:
|
|
213
222
|
|
|
214
223
|
```json
|
|
215
224
|
{
|
|
216
225
|
"mcpServers": {
|
|
217
226
|
"fetch-url-mcp": {
|
|
218
227
|
"command": "npx",
|
|
219
|
-
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
|
|
228
|
+
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"],
|
|
229
|
+
"env": {}
|
|
220
230
|
}
|
|
221
231
|
}
|
|
222
232
|
}
|
|
223
233
|
```
|
|
224
234
|
|
|
225
|
-
|
|
226
|
-
|
|
227
|
-
```sh
|
|
228
|
-
claude mcp add fetch-url-mcp -- npx -y @j0hanz/fetch-url-mcp@latest
|
|
229
|
-
```
|
|
230
|
-
|
|
231
|
-
For more info, see [Claude Code MCP docs](https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/tutorials#set-up-model-context-protocol-mcp).
|
|
235
|
+
For more info, see [Claude Code MCP docs](https://docs.anthropic.com/en/docs/claude-code/mcp).
|
|
232
236
|
|
|
233
237
|
</details>
|
|
234
238
|
|
|
@@ -248,18 +252,18 @@ Add to `~/.codeium/windsurf/mcp_config.json`:
|
|
|
248
252
|
}
|
|
249
253
|
```
|
|
250
254
|
|
|
251
|
-
For more info, see [Windsurf MCP docs](https://docs.windsurf.com/windsurf/mcp).
|
|
255
|
+
For more info, see [Windsurf MCP docs](https://docs.windsurf.com/windsurf/cascade/mcp).
|
|
252
256
|
|
|
253
257
|
</details>
|
|
254
258
|
|
|
255
259
|
<details>
|
|
256
260
|
<summary><b>Install in Amp</b></summary>
|
|
257
261
|
|
|
258
|
-
Add to `
|
|
262
|
+
Add to `~/.config/amp/settings.json` on macOS/Linux, `%USERPROFILE%\.config\amp\settings.json` on Windows, or `.amp/settings.json` for workspace-scoped config:
|
|
259
263
|
|
|
260
264
|
```json
|
|
261
265
|
{
|
|
262
|
-
"mcpServers": {
|
|
266
|
+
"amp.mcpServers": {
|
|
263
267
|
"fetch-url-mcp": {
|
|
264
268
|
"command": "npx",
|
|
265
269
|
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
|
|
@@ -274,14 +278,14 @@ Or install via CLI:
|
|
|
274
278
|
amp mcp add fetch-url-mcp -- npx -y @j0hanz/fetch-url-mcp@latest
|
|
275
279
|
```
|
|
276
280
|
|
|
277
|
-
For more info, see [Amp
|
|
281
|
+
For more info, see [Amp docs](https://ampcode.com/manual).
|
|
278
282
|
|
|
279
283
|
</details>
|
|
280
284
|
|
|
281
285
|
<details>
|
|
282
286
|
<summary><b>Install in Cline</b></summary>
|
|
283
287
|
|
|
284
|
-
|
|
288
|
+
Open the MCP Servers panel, choose `Configure MCP Servers`, and add this to `cline_mcp_settings.json`:
|
|
285
289
|
|
|
286
290
|
```json
|
|
287
291
|
{
|
|
@@ -294,27 +298,28 @@ Add to `cline_mcp_settings.json`:
|
|
|
294
298
|
}
|
|
295
299
|
```
|
|
296
300
|
|
|
297
|
-
For more info, see [Cline MCP docs](https://docs.cline.bot/mcp
|
|
301
|
+
For more info, see [Cline MCP docs](https://docs.cline.bot/mcp/configuring-mcp-servers).
|
|
298
302
|
|
|
299
303
|
</details>
|
|
300
304
|
|
|
301
305
|
<details>
|
|
302
306
|
<summary><b>Install in Codex CLI</b></summary>
|
|
303
307
|
|
|
304
|
-
|
|
308
|
+
Use the CLI:
|
|
305
309
|
|
|
306
|
-
```
|
|
307
|
-
|
|
308
|
-
|
|
309
|
-
|
|
310
|
-
|
|
311
|
-
|
|
312
|
-
|
|
313
|
-
|
|
314
|
-
|
|
310
|
+
```sh
|
|
311
|
+
codex mcp add fetch-url-mcp -- npx -y @j0hanz/fetch-url-mcp@latest
|
|
312
|
+
```
|
|
313
|
+
|
|
314
|
+
Or add this to `~/.codex/config.toml` or project-scoped `.codex/config.toml`:
|
|
315
|
+
|
|
316
|
+
```toml
|
|
317
|
+
[mcp_servers.fetch-url-mcp]
|
|
318
|
+
command = "npx"
|
|
319
|
+
args = ["-y", "@j0hanz/fetch-url-mcp@latest"]
|
|
315
320
|
```
|
|
316
321
|
|
|
317
|
-
For more info, see [Codex
|
|
322
|
+
For more info, see [Codex MCP docs](https://developers.openai.com/codex/mcp/).
|
|
318
323
|
|
|
319
324
|
</details>
|
|
320
325
|
|
|
@@ -327,6 +332,7 @@ Add to `.vscode/mcp.json`:
|
|
|
327
332
|
{
|
|
328
333
|
"servers": {
|
|
329
334
|
"fetch-url-mcp": {
|
|
335
|
+
"type": "stdio",
|
|
330
336
|
"command": "npx",
|
|
331
337
|
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
|
|
332
338
|
}
|
|
@@ -341,7 +347,12 @@ For more info, see [GitHub Copilot MCP docs](https://code.visualstudio.com/docs/
|
|
|
341
347
|
<details>
|
|
342
348
|
<summary><b>Install in Warp</b></summary>
|
|
343
349
|
|
|
344
|
-
|
|
350
|
+
Open `Personal > MCP Servers` in Warp, choose `+ Add`, and either add a CLI server with:
|
|
351
|
+
|
|
352
|
+
- `command`: `npx`
|
|
353
|
+
- `args`: `["-y", "@j0hanz/fetch-url-mcp@latest"]`
|
|
354
|
+
|
|
355
|
+
Or paste this JSON snippet when using Warp's multi-server import flow:
|
|
345
356
|
|
|
346
357
|
```json
|
|
347
358
|
{
|
|
@@ -354,27 +365,21 @@ Add to `Warp MCP config`:
|
|
|
354
365
|
}
|
|
355
366
|
```
|
|
356
367
|
|
|
357
|
-
For more info, see [Warp MCP docs](https://docs.warp.dev/features/mcp
|
|
368
|
+
For more info, see [Warp MCP docs](https://docs.warp.dev/features/warp-ai/mcp).
|
|
358
369
|
|
|
359
370
|
</details>
|
|
360
371
|
|
|
361
372
|
<details>
|
|
362
373
|
<summary><b>Install in Kiro</b></summary>
|
|
363
374
|
|
|
364
|
-
Add to `.kiro/settings/mcp.json
|
|
375
|
+
Use Kiro's MCP Servers panel or the `Add to Kiro` install flow. Kiro stores workspace-scoped MCP config in `.kiro/settings/mcp.json` and user-scoped config in `~/.kiro/settings/mcp.json`.
|
|
365
376
|
|
|
366
|
-
|
|
367
|
-
|
|
368
|
-
|
|
369
|
-
|
|
370
|
-
"command": "npx",
|
|
371
|
-
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
|
|
372
|
-
}
|
|
373
|
-
}
|
|
374
|
-
}
|
|
375
|
-
```
|
|
377
|
+
For this server, use:
|
|
378
|
+
|
|
379
|
+
- `command`: `npx`
|
|
380
|
+
- `args`: `["-y", "@j0hanz/fetch-url-mcp@latest"]`
|
|
376
381
|
|
|
377
|
-
For more info, see [Kiro MCP docs](https://kiro.dev/
|
|
382
|
+
For more info, see [Kiro MCP docs](https://kiro.dev/blog/unlock-your-development-productivity-with-kiro-and-mcp/).
|
|
378
383
|
|
|
379
384
|
</details>
|
|
380
385
|
|
|
@@ -394,7 +399,7 @@ Add to `~/.gemini/settings.json`:
|
|
|
394
399
|
}
|
|
395
400
|
```
|
|
396
401
|
|
|
397
|
-
For more info, see [Gemini CLI MCP docs](https://
|
|
402
|
+
For more info, see [Gemini CLI MCP docs](https://google-gemini.github.io/gemini-cli/docs/tools/mcp-server.html).
|
|
398
403
|
|
|
399
404
|
</details>
|
|
400
405
|
|
|
@@ -407,118 +412,101 @@ Add to `~/.config/zed/settings.json`:
|
|
|
407
412
|
{
|
|
408
413
|
"context_servers": {
|
|
409
414
|
"fetch-url-mcp": {
|
|
410
|
-
"
|
|
411
|
-
|
|
412
|
-
|
|
413
|
-
}
|
|
415
|
+
"command": "npx",
|
|
416
|
+
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"],
|
|
417
|
+
"env": {}
|
|
414
418
|
}
|
|
415
419
|
}
|
|
416
420
|
}
|
|
417
421
|
```
|
|
418
422
|
|
|
419
|
-
For more info, see [Zed MCP docs](https://zed.dev/docs/
|
|
423
|
+
For more info, see [Zed MCP docs](https://zed.dev/docs/ai/mcp).
|
|
420
424
|
|
|
421
425
|
</details>
|
|
422
426
|
|
|
423
427
|
<details>
|
|
424
428
|
<summary><b>Install in Augment</b></summary>
|
|
425
429
|
|
|
426
|
-
|
|
427
|
-
|
|
428
|
-
> Add to your VS Code `settings.json` under `augment.advanced`.
|
|
430
|
+
Use the Augment Settings panel and either add the server manually or choose `Import from JSON`:
|
|
429
431
|
|
|
430
432
|
```json
|
|
431
433
|
{
|
|
432
|
-
"
|
|
433
|
-
"
|
|
434
|
-
|
|
435
|
-
|
|
436
|
-
|
|
437
|
-
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
|
|
438
|
-
}
|
|
439
|
-
]
|
|
434
|
+
"mcpServers": {
|
|
435
|
+
"fetch-url-mcp": {
|
|
436
|
+
"command": "npx",
|
|
437
|
+
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
|
|
438
|
+
}
|
|
440
439
|
}
|
|
441
440
|
}
|
|
442
441
|
```
|
|
443
442
|
|
|
444
|
-
For more info, see [Augment MCP docs](https://docs.augmentcode.com/setup-mcp
|
|
443
|
+
For more info, see [Augment MCP docs](https://docs.augmentcode.com/setup-augment/mcp).
|
|
445
444
|
|
|
446
445
|
</details>
|
|
447
446
|
|
|
448
447
|
<details>
|
|
449
448
|
<summary><b>Install in Roo Code</b></summary>
|
|
450
449
|
|
|
451
|
-
|
|
450
|
+
Use Roo Code's MCP Servers UI or marketplace flow.
|
|
452
451
|
|
|
453
|
-
|
|
454
|
-
{
|
|
455
|
-
"mcpServers": {
|
|
456
|
-
"fetch-url-mcp": {
|
|
457
|
-
"command": "npx",
|
|
458
|
-
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
|
|
459
|
-
}
|
|
460
|
-
}
|
|
461
|
-
}
|
|
462
|
-
```
|
|
452
|
+
For this server, use:
|
|
463
453
|
|
|
464
|
-
|
|
454
|
+
- `command`: `npx`
|
|
455
|
+
- `args`: `["-y", "@j0hanz/fetch-url-mcp@latest"]`
|
|
456
|
+
|
|
457
|
+
For more info, see [Roo Code docs](https://docs.roocode.com/).
|
|
465
458
|
|
|
466
459
|
</details>
|
|
467
460
|
|
|
468
461
|
<details>
|
|
469
462
|
<summary><b>Install in Kilo Code</b></summary>
|
|
470
463
|
|
|
471
|
-
|
|
464
|
+
Use Kilo Code's MCP Servers UI or marketplace flow.
|
|
472
465
|
|
|
473
|
-
|
|
474
|
-
|
|
475
|
-
|
|
476
|
-
|
|
477
|
-
"command": "npx",
|
|
478
|
-
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
|
|
479
|
-
}
|
|
480
|
-
}
|
|
481
|
-
}
|
|
482
|
-
```
|
|
466
|
+
For this server, use:
|
|
467
|
+
|
|
468
|
+
- `command`: `npx`
|
|
469
|
+
- `args`: `["-y", "@j0hanz/fetch-url-mcp@latest"]`
|
|
483
470
|
|
|
484
|
-
For more info, see [Kilo Code
|
|
471
|
+
For more info, see [Kilo Code docs](https://kilocode.ai/docs).
|
|
485
472
|
|
|
486
473
|
</details>
|
|
487
474
|
|
|
488
475
|
## Use Cases
|
|
489
476
|
|
|
490
477
|
- Fetch documentation pages, blog posts, or reference material into Markdown before sending them to an LLM.
|
|
491
|
-
- Retrieve repository-hosted content from GitHub, GitLab, or
|
|
492
|
-
-
|
|
493
|
-
- Use
|
|
478
|
+
- Retrieve repository-hosted content from GitHub, GitLab, Bitbucket, or Gists and let the server rewrite page URLs to raw endpoints when possible.
|
|
479
|
+
- Reuse cached Markdown through `internal://cache/{namespace}/{hash}` or bypass the cache with `forceRefresh` for time-sensitive pages.
|
|
480
|
+
- Use task mode for large pages or slower sites when the inline response would otherwise be truncated or delayed.
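
The raw-endpoint rewrite mentioned in the list above can be pictured with a small sketch. This is not the package's implementation: `toGitHubRawUrl` is a hypothetical helper, it covers only GitHub `blob` URLs, and it assumes the first path segment after `blob` is the ref, so branch names containing `/` would need extra handling.

```typescript
// Hypothetical helper, not taken from @j0hanz/fetch-url-mcp.
// Rewrites https://github.com/{owner}/{repo}/blob/{ref}/{path}
// to https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{path}.
function toGitHubRawUrl(input: string): string | null {
  let url: URL;
  try {
    url = new URL(input);
  } catch {
    return null; // not a parseable URL
  }
  if (url.protocol !== "https:" || url.hostname !== "github.com") return null;

  // Drop empty segments produced by the leading slash.
  const [owner, repo, kind, ref, ...path] = url.pathname.split("/").filter(Boolean);
  if (!owner || !repo || kind !== "blob" || !ref || path.length === 0) return null;

  return `https://raw.githubusercontent.com/${owner}/${repo}/${ref}/${path.join("/")}`;
}

console.log(toGitHubRawUrl("https://github.com/octocat/Hello-World/blob/master/README"));
console.log(toGitHubRawUrl("https://example.com/page")); // null: left for the normal fetch path
```

Returning `null` for anything that does not match keeps the rewrite strictly optional, which fits the "when possible" wording in the list.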
|
|
494
481
|
|
|
495
482
|
## Architecture
|
|
496
483
|
|
|
497
484
|
```text
|
|
498
485
|
[MCP Client]
|
|
499
|
-
|
|
500
|
-
|
|
486
|
+
├─ stdio -> `src/index.ts` -> `startStdioServer()` -> `createMcpServer()`
|
|
487
|
+
└─ HTTP (`--http`) -> `src/index.ts` -> `startHttpServer()` -> HTTP dispatcher
|
|
488
|
+
├─ `GET /health`
|
|
489
|
+
├─ `GET /.well-known/oauth-protected-resource`
|
|
490
|
+
├─ `GET /.well-known/oauth-protected-resource/mcp`
|
|
491
|
+
├─ `GET /mcp/downloads/{namespace}/{hash}`
|
|
492
|
+
└─ `POST|GET|DELETE /mcp`
|
|
501
493
|
|
|
502
494
|
`createMcpServer()`
|
|
503
|
-
|
|
504
|
-
|
|
505
|
-
|
|
506
|
-
|
|
507
|
-
|
|
508
|
-
|
|
509
|
-
|
|
510
|
-
|
|
511
|
-
|
|
512
|
-
|
|
513
|
-
|
|
514
|
-
|
|
515
|
-
|
|
516
|
-
|
|
517
|
-
|
|
518
|
-
-> fetch via shared pipeline
|
|
519
|
-
-> transform HTML to Markdown
|
|
520
|
-
-> validate structured output with `fetchUrlOutputSchema`
|
|
521
|
-
-> return text content plus `structuredContent`
|
|
495
|
+
├─ registers tool: `fetch-url`
|
|
496
|
+
├─ registers prompt: `get-help`
|
|
497
|
+
├─ registers resources:
|
|
498
|
+
│ - `internal://instructions`
|
|
499
|
+
│ - `internal://cache/{namespace}/{hash}`
|
|
500
|
+
├─ enables capabilities: completions, logging, resources, prompts, tasks
|
|
501
|
+
└─ installs task handlers, log-level handling, and shutdown cleanup
|
|
502
|
+
|
|
503
|
+
`fetch-url` execution
|
|
504
|
+
├─ validate input with `fetchUrlInputSchema`
|
|
505
|
+
├─ normalize URL and block local/private targets unless allowed
|
|
506
|
+
├─ rewrite supported code-host URLs to raw endpoints when possible
|
|
507
|
+
├─ fetch and cache content via the shared pipeline
|
|
508
|
+
├─ transform HTML into Markdown in the transform worker path
|
|
509
|
+
└─ validate `structuredContent` with `fetchUrlOutputSchema`
|
|
522
510
|
```
|
|
523
511
|
|
|
524
512
|
### Request Lifecycle
|
|
@@ -528,7 +516,7 @@ Tool execution flow
|
|
|
528
516
|
[Server] -- {protocolVersion, capabilities, serverInfo} --> [Client]
|
|
529
517
|
[Client] -- notifications/initialized --> [Server]
|
|
530
518
|
[Client] -- tools/call {name, arguments} --> [Server]
|
|
531
|
-
[Server] -- {content: [{type, text}], isError?} --> [Client]
|
|
519
|
+
[Server] -- {content: [{type, text}], structuredContent?, isError?} --> [Client]
|
|
532
520
|
```
|
|
533
521
|
|
|
534
522
|
## MCP Surface
|
|
@@ -537,34 +525,31 @@ Tool execution flow
|
|
|
537
525
|
|
|
538
526
|
#### `fetch-url`
|
|
539
527
|
|
|
540
|
-
Fetch public webpages and convert HTML into AI-readable Markdown. The tool is read-only, does not execute page JavaScript, can bypass cache with `forceRefresh`, and supports task mode for larger or slower fetches.
|
|
528
|
+
Fetch public webpages and convert HTML into AI-readable Markdown. The tool is read-only, does not execute page JavaScript, can bypass the cache with `forceRefresh`, and supports optional task mode for larger or slower fetches.
|
|
541
529
|
|
|
542
|
-
| Parameter | Type | Required | Description
|
|
543
|
-
| ------------------ | --------- | -------- |
|
|
544
|
-
| `url` | `string` | yes | Target URL. Max 2048 chars.
|
|
545
|
-
| `skipNoiseRemoval` | `boolean` | no | Preserve navigation/footers (disable noise filtering).
|
|
546
|
-
| `forceRefresh` | `boolean` | no | Bypass cache and fetch fresh content.
|
|
547
|
-
| `maxInlineChars` | `integer` | no | Inline markdown limit (0-10485760, 0=unlimited). Lower of this or global limit applies. |
|
|
530
|
+
| Parameter | Type | Required | Description |
|
|
531
|
+
| ------------------ | --------- | -------- | ------------------------------------------------------------------------------------------- |
|
|
532
|
+
| `url` | `string` | yes | Target URL. Max 2048 chars. |
|
|
533
|
+
| `skipNoiseRemoval` | `boolean` | no | Preserve navigation/footers (disable noise filtering). |
|
|
534
|
+
| `forceRefresh` | `boolean` | no | Bypass cache and fetch fresh content. |
|
|
535
|
+
| `maxInlineChars` | `integer` | no | Inline markdown limit (0-10485760, 0=unlimited). Lower of this or the global limit applies. |
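
One way to read the `maxInlineChars` row is sketched below. The helper name is hypothetical, and the treatment of `0` as "no limit from that side" for both the per-call and the global value is an assumption drawn from the table wording, not from the package source.

```typescript
// Hypothetical helper illustrating "lower of this or the global limit applies".
// Assumption: 0 means "unlimited" for both the per-call and the global value.
const MAX_INLINE_CHARS = 10_485_760; // upper bound from the parameter table

function effectiveInlineLimit(requested: number | undefined, globalLimit: number): number {
  const perCall = requested ?? 0;
  if (!Number.isInteger(perCall) || perCall < 0 || perCall > MAX_INLINE_CHARS) {
    throw new RangeError(`maxInlineChars must be an integer in 0-${MAX_INLINE_CHARS}`);
  }
  if (perCall === 0) return globalLimit; // only the global limit (if any) applies
  if (globalLimit === 0) return perCall; // only the per-call limit applies
  return Math.min(perCall, globalLimit);
}

console.log(effectiveInlineLimit(5000, 20000)); // 5000
console.log(effectiveInlineLimit(0, 20000)); // 20000
console.log(effectiveInlineLimit(5000, 0)); // 5000
```

A return value of `0` here still means "no inline truncation", matching the `0=unlimited` convention in the table.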
|
|
548
536
|
|
|
549
|
-
|
|
550
|
-
<summary>Data Flow</summary>
|
|
537
|
+
The response is returned as MCP text content and, when validation succeeds, as `structuredContent` containing `url`, `resolvedUrl`, `finalUrl`, `title`, `metadata`, `markdown`, `fromCache`, `fetchedAt`, `contentSize`, and `truncated`.
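
As a rough client-side illustration of the call and response described above, the sketch below builds a `tools/call` request and reads the result. The helper names and the `ToolResult` typing are hypothetical; what the text content blocks contain is not specified here, so the fallback simply joins them.

```typescript
// Shapes follow the MCP tools/call exchange; helper names are hypothetical.
interface FetchUrlArgs {
  url: string;
  skipNoiseRemoval?: boolean;
  forceRefresh?: boolean;
  maxInlineChars?: number;
}

interface ToolResult {
  content: { type: string; text?: string }[];
  structuredContent?: { markdown?: string; truncated?: boolean; [key: string]: unknown };
  isError?: boolean;
}

// Build the JSON-RPC 2.0 request a client would send for the fetch-url tool.
function buildFetchUrlCall(id: number, args: FetchUrlArgs) {
  if (args.url.length > 2048) throw new RangeError("url exceeds 2048 characters");
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "fetch-url", arguments: args },
  };
}

// Prefer structuredContent.markdown; fall back to the text content blocks.
function readMarkdown(result: ToolResult): string {
  if (result.isError) throw new Error("fetch-url reported a tool error");
  const structured = result.structuredContent?.markdown;
  if (typeof structured === "string") return structured;
  return result.content
    .filter((block) => block.type === "text" && typeof block.text === "string")
    .map((block) => block.text)
    .join("\n");
}

const request = buildFetchUrlCall(1, { url: "https://example.com/docs", forceRefresh: true });
console.log(JSON.stringify(request));
console.log(readMarkdown({ content: [{ type: "text", text: "# Title" }] }));
```

Checking `structuredContent` first and treating it as optional mirrors the "when validation succeeds" condition: a client should still work when only text content is present.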
|
|
551
538
|
|
|
552
539
|
```text
|
|
553
|
-
1. Client
|
|
554
|
-
2.
|
|
555
|
-
3.
|
|
556
|
-
4.
|
|
557
|
-
5.
|
|
558
|
-
6. `fetchUrlOutputSchema` validates the structured payload before it is returned.
|
|
540
|
+
1. [Client] -- tools/call {name: "fetch-url", arguments} --> [Server]
|
|
541
|
+
2. [Server] -- dispatch("fetch-url") --> [src/tools/fetch-url.ts]
|
|
542
|
+
3. [Handler] -- validate(fetchUrlInputSchema) --> normalize / fetch / transform
|
|
543
|
+
4. [Handler] -- validate(fetchUrlOutputSchema) --> assemble content + structuredContent
|
|
544
|
+
5. [Server] -- result or tool error --> [Client]
|
|
559
545
|
```
|
|
560
546
|
|
|
561
|
-
</details>
|
|
562
|
-
|
|
563
547
|
### Resources
|
|
564
548
|
|
|
565
|
-
| Resource | URI
|
|
566
|
-
| ---------------------------- |
|
|
567
|
-
| `fetch-url-mcp-instructions` | `internal://instructions`
|
|
549
|
+
| Resource | URI | MIME Type | Description |
|
|
550
|
+
| ---------------------------- | ------------------------------------- | --------------- | ------------------------------------------------------------- |
|
|
551
|
+
| `fetch-url-mcp-instructions` | `internal://instructions` | `text/markdown` | Guidance for using the Fetch URL MCP server. |
|
|
552
|
+
| `fetch-url-mcp-cache-entry` | `internal://cache/{namespace}/{hash}` | `text/markdown` | Read cached markdown generated by previous `fetch-url` calls. |
|
|
568
553
|
|
|
569
554
|
### Prompts
|
|
570
555
|
|
|
@@ -574,26 +559,27 @@ Fetch public webpages and convert HTML into AI-readable Markdown. The tool is re
|
|
|
574
559
|
|
|
575
560
|
## MCP Capabilities
|
|
576
561
|
|
|
577
|
-
| Capability | Status |
|
|
578
|
-
| ------------------------------- | --------- |
|
|
579
|
-
|
|
|
580
|
-
|
|
|
581
|
-
|
|
|
582
|
-
|
|
|
583
|
-
|
|
|
562
|
+
| Capability | Status | Notes |
|
|
563
|
+
| ------------------------------- | --------- | ------------------------------------------------------------------------------------------------------------------------- |
|
|
564
|
+
| completions | confirmed | Advertised in `createServerCapabilities()` and used by the cache resource template for `namespace` and `hash` completion. |
|
|
565
|
+
| logging | confirmed | Advertised in `createServerCapabilities()` and handled through `SetLevelRequestSchema`. |
|
|
566
|
+
| resources subscribe/listChanged | confirmed | Advertised in `createServerCapabilities()` and implemented for cache resource subscriptions and list changes. |
|
|
567
|
+
| prompts | confirmed | `get-help` is registered during server startup. |
|
|
568
|
+
| tasks | confirmed | Advertised in `createServerCapabilities()` and backed by registered task handlers plus optional tool task support. |
|
|
569
|
+
| progress notifications | confirmed | Tool execution reports `notifications/progress` updates during fetch and transform stages. |
|
|
584
570
|
|
|
585
571
|
### Tool Annotations
|
|
586
572
|
|
|
587
|
-
| Annotation |
|
|
588
|
-
| ----------------- |
|
|
589
|
-
| `readOnlyHint` |
|
|
590
|
-
| `destructiveHint` |
|
|
591
|
-
| `
|
|
592
|
-
| `
|
|
573
|
+
| Annotation | Value |
|
|
574
|
+
| ----------------- | ------- |
|
|
575
|
+
| `readOnlyHint` | `true` |
|
|
576
|
+
| `destructiveHint` | `false` |
|
|
577
|
+
| `idempotentHint` | `true` |
|
|
578
|
+
| `openWorldHint` | `true` |
|
|
593
579
|
|
|
594
580
|
### Structured Output
|
|
595
581
|
|
|
596
|
-
- `fetch-url` publishes an explicit `outputSchema` and returns `structuredContent` when the
|
|
582
|
+
- `fetch-url` publishes an explicit `outputSchema` and returns `structuredContent` when the assembled response passes validation.
|
|
597
583
|
|
|
598
584
|
## Configuration
|
|
599
585
|
|
|
@@ -602,11 +588,13 @@ Fetch public webpages and convert HTML into AI-readable Markdown. The tool is re
|
|
|
602
588
|
| `HOST` | `127.0.0.1` | HTTP mode | Bind address. Non-loopback bindings also require `ALLOW_REMOTE=true`. |
|
|
603
589
|
| `PORT` | `3000` | HTTP mode | Listening port for `--http`. |
|
|
604
590
|
| `ALLOW_REMOTE` | `false` | HTTP mode | Must be enabled to bind to a non-loopback interface. |
|
|
605
|
-
| `ACCESS_TOKENS` | unset | HTTP mode | Comma
|
|
591
|
+
| `ACCESS_TOKENS` | unset | HTTP mode | Comma- or space-separated static bearer tokens. |
|
|
606
592
|
| `API_KEY` | unset | HTTP mode | Alternate static token source for header auth. |
|
|
607
593
|
| `OAUTH_ISSUER_URL` | unset | HTTP mode | Enables OAuth mode when combined with the other OAuth URLs. |
|
|
608
594
|
| `OAUTH_AUTHORIZATION_URL` | unset | HTTP mode | Optional explicit authorization endpoint. |
|
|
609
595
|
| `OAUTH_TOKEN_URL` | unset | HTTP mode | Optional explicit token endpoint. |
|
|
596
|
+
| `OAUTH_REVOCATION_URL` | unset | HTTP mode | Optional OAuth revocation endpoint. |
|
|
597
|
+
| `OAUTH_REGISTRATION_URL` | unset | HTTP mode | Optional OAuth dynamic client registration endpoint. |
|
|
610
598
|
| `OAUTH_INTROSPECTION_URL` | unset | HTTP mode | Required for OAuth token introspection. |
|
|
611
599
|
| `OAUTH_REQUIRED_SCOPES` | empty | HTTP mode | Required scopes enforced after auth. |
|
|
612
600
|
| `OAUTH_CLIENT_ID` | unset | HTTP mode | Optional introspection client ID. |
|
|
@@ -623,13 +611,15 @@ Fetch public webpages and convert HTML into AI-readable Markdown. The tool is re
|
|
|
623
611
|
| `SERVER_BLOCK_PRIVATE_CONNECTIONS` | `false` | HTTP mode | Enables inbound private-network protections. |
|
|
624
612
|
| `MCP_STRICT_PROTOCOL_VERSION_HEADER` | `true` | HTTP mode | Requires `MCP-Protocol-Version` on session init. |
|
|
625
613
|
| `ALLOWED_HOSTS` | empty | HTTP mode | Additional allowed `Host` and `Origin` values. |
|
|
626
|
-
| `ALLOW_LOCAL_FETCH` | `false` | Fetching | Allows
|
|
614
|
+
| `ALLOW_LOCAL_FETCH` | `false` | Fetching | Allows loopback and private-network fetch targets. |
|
|
627
615
|
| `FETCH_TIMEOUT_MS` | `15000` | Fetching | Network fetch timeout in milliseconds. |
|
|
616
|
+
| `USER_AGENT` | `fetch-url-mcp/<version>` | Fetching | Override the outbound user agent string. |
|
|
628
617
|
| `MAX_INLINE_CONTENT_CHARS` | `0` | Tool output | `0` means no explicit inline truncation limit. |
|
|
629
618
|
| `CACHE_ENABLED` | `true` | Caching | Enables in-memory fetch result caching. |
|
|
630
619
|
| `TASKS_MAX_TOTAL` | `5000` | Tasks | Total task capacity. |
|
|
631
620
|
| `TASKS_MAX_PER_OWNER` | `1000` | Tasks | Per-owner task cap, clamped to the total cap. |
|
|
632
621
|
| `TASKS_STATUS_NOTIFICATIONS` | `false` | Tasks | Enables status notifications for tasks. |
|
|
622
|
+
| `TASKS_REQUIRE_INTERCEPTION` | `true` | Tasks | Requires task interception for task-capable tool execution. |
|
|
633
623
|
| `TRANSFORM_CANCEL_ACK_TIMEOUT_MS` | `200` | Transform workers | Cancellation acknowledgement timeout. |
|
|
634
624
|
| `TRANSFORM_WORKER_MODE` | `threads` | Transform workers | Worker execution mode. |
|
|
635
625
|
| `TRANSFORM_WORKER_MAX_OLD_GENERATION_MB` | unset | Transform workers | Optional worker memory limit. |
|
|
@@ -640,11 +630,10 @@ Fetch public webpages and convert HTML into AI-readable Markdown. The tool is re
|
|
|
640
630
|
| `FETCH_URL_MCP_EXTRA_NOISE_SELECTORS` | empty | Content cleanup | Extra DOM selectors for noise removal. |
|
|
641
631
|
| `FETCH_URL_MCP_LOCALE` | system default | Content cleanup | Locale override for extraction heuristics. |
|
|
642
632
|
| `MARKDOWN_HEADING_KEYWORDS` | built-in list | Markdown cleanup | Override heading keywords used by cleanup. |
|
|
643
|
-
| `USER_AGENT` | `fetch-url-mcp/<version>` | Fetching | Override outbound user agent string. |
|
|
644
633
|
| `LOG_LEVEL` | `info` | Logging | `debug`, `info`, `warn`, or `error`. |
|
|
645
|
-
| `LOG_FORMAT` | `text` | Logging | `json`
|
|
634
|
+
| `LOG_FORMAT` | `text` | Logging | Set to `json` for structured logs. |
|
|
646
635
|
|
|
647
|
-
## HTTP
|
|
636
|
+
## HTTP Endpoints
|
|
648
637
|
|
|
649
638
|
| Method | Path | Auth | Purpose |
|
|
650
639
|
| -------- | ------------------------------------------- | ------------------------------------------ | ------------------------------------------------------- |
|
|
@@ -654,18 +643,19 @@ Fetch public webpages and convert HTML into AI-readable Markdown. The tool is re
|
|
|
654
643
|
| `POST` | `/mcp` | yes | Session initialization and JSON-RPC requests. |
|
|
655
644
|
| `GET` | `/mcp` | yes | Session-bound server-to-client stream handling. |
|
|
656
645
|
| `DELETE` | `/mcp` | yes | Session shutdown. |
|
|
657
|
-
| `GET` | `/mcp/downloads/{namespace}/{hash}` | yes | Download route used by HTTP-mode fetch results.
|
|
646
|
+
| `GET` | `/mcp/downloads/{namespace}/{hash}` | yes | Download route used by HTTP-mode cached fetch results. |
|
|
658
647
|
|
|
659
648
|
## Security
|
|
660
649
|
|
|
661
|
-
| Control | Status | Notes
|
|
662
|
-
| -------------------------- | ----------- |
|
|
663
|
-
| Host and origin validation | implemented | HTTP requests are
|
|
664
|
-
| Authentication | implemented | HTTP mode supports static bearer tokens or OAuth introspection.
|
|
665
|
-
| Protocol version checks | implemented |
|
|
666
|
-
| Rate limiting | implemented | Requests pass through the HTTP rate limiter before route dispatch.
|
|
667
|
-
|
|
|
668
|
-
|
|
|
650
|
+
| Control | Status | Notes |
|
|
651
|
+
| -------------------------- | ----------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
|
|
652
|
+
| Host and origin validation | implemented | HTTP requests are rejected unless `Host` and `Origin` match the allowlist built from loopback, the configured host, and `ALLOWED_HOSTS`. |
|
|
653
|
+
| Authentication | implemented | HTTP mode supports static bearer tokens locally or OAuth token introspection; remote bindings require OAuth. |
|
|
654
|
+
| Protocol version checks | implemented | HTTP sessions validate `MCP-Protocol-Version` and pin it to the negotiated session version. |
|
|
655
|
+
| Rate limiting | implemented | Requests pass through the HTTP rate limiter before route dispatch. |
|
|
656
|
+
| Outbound SSRF protections | implemented | Local/private IPs, metadata endpoints, and `.local`/`.internal` hosts are blocked unless `ALLOW_LOCAL_FETCH=true`. |
|
|
657
|
+
| TLS | optional | HTTPS is enabled when both TLS key and certificate files are configured. |
|
|
658
|
+
| Stdio logging safety | implemented | Server logs are written to stderr, not stdout, so stdio MCP traffic stays clean. |
|
|
669
659
|
|
|
670
660
|
## Development
|
|
671
661
|
|
|
@@ -697,15 +687,15 @@ Fetch public webpages and convert HTML into AI-readable Markdown. The tool is re
|
|
|
697
687
|
|
|
698
688
|
## Build and Release
|
|
699
689
|
|
|
700
|
-
-
|
|
701
|
-
-
|
|
702
|
-
-
|
|
690
|
+
- The repository includes release automation under `.github/workflows/`.
|
|
691
|
+
- `Dockerfile` and `docker-compose.yml` are available for container-based packaging and local runs.
|
|
692
|
+
- `npm run prepublishOnly` runs the release gate: lint, type-check, and build.
|
|
703
693
|
|
|
704
694
|
## Troubleshooting
|
|
705
695
|
|
|
706
696
|
- For stdio mode, avoid writing logs to stdout; keep logs on stderr.
|
|
707
697
|
- For HTTP mode, verify MCP protocol headers and endpoint routing.
|
|
708
|
-
-
|
|
698
|
+
- Update client snippets when client MCP configuration formats change.
|
|
709
699
|
|
|
710
700
|
## Credits
|
|
711
701
|
|