@j0hanz/fetch-url-mcp 1.6.0 → 1.6.1

package/README.md CHANGED
@@ -1,610 +1,724 @@
1
1
  # Fetch URL MCP Server
2
2
 
3
- [![npm version](https://img.shields.io/npm/v/%40j0hanz%2Ffetch-url-mcp)](https://www.npmjs.com/package/@j0hanz/fetch-url-mcp) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![Node.js](https://img.shields.io/badge/node-%3E%3D24-3c873a)](https://nodejs.org) [![TypeScript](https://img.shields.io/badge/TypeScript-5.9-3178c6?logo=typescript&logoColor=white)](https://www.typescriptlang.org) [![MCP SDK](https://img.shields.io/badge/MCP%20SDK-1.26-7c3aed)](https://modelcontextprotocol.io)
3
+ [![npm version](https://img.shields.io/npm/v/%40j0hanz%2Ffetch-url-mcp?style=flat-square&logo=npm)](https://www.npmjs.com/package/%40j0hanz%2Ffetch-url-mcp) [![License](https://img.shields.io/badge/license-MIT-blue?style=flat-square)](#contributing-and-license)
4
4
 
5
- [![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0078d7?logo=visual-studio-code&logoColor=white)](https://insiders.vscode.dev/redirect?url=vscode%3Amcp%2Finstall%3F%7B%22name%22%3A%22fetch-url-mcp%22%2C%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%40j0hanz%2Ffetch-url-mcp%40latest%22%2C%22--stdio%22%5D%7D) [![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?logo=visual-studio-code&logoColor=white)](https://insiders.vscode.dev/redirect?url=vscode-insiders%3Amcp%2Finstall%3F%7B%22name%22%3A%22fetch-url-mcp%22%2C%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%40j0hanz%2Ffetch-url-mcp%40latest%22%2C%22--stdio%22%5D%7D) [![Install in Cursor](https://img.shields.io/badge/Cursor-Install-f97316?logo=cursor&logoColor=white)](https://cursor.com/install-mcp?name=fetch-url-mcp&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBqMGhhbnovZmV0Y2gtdXJsLW1jcEBsYXRlc3QiLCItLXN0ZGlvIl19)
5
+ [![Install in VS Code](https://img.shields.io/badge/VS_Code-Install_Server-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://insiders.vscode.dev/redirect/mcp/install?name=fetch-url-mcp&config=%7B%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%40j0hanz%2Ffetch-url-mcp%40latest%22%5D%7D) [![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install_Server-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://insiders.vscode.dev/redirect/mcp/install?name=fetch-url-mcp&config=%7B%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%40j0hanz%2Ffetch-url-mcp%40latest%22%5D%7D&quality=insiders) [![Install in Visual Studio](https://img.shields.io/badge/Visual_Studio-Install_Server-C16FDE?logo=visualstudio&logoColor=white)](https://vs-open.link/mcp-install?%7B%22fetch-url-mcp%22%3A%7B%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%40j0hanz%2Ffetch-url-mcp%40latest%22%5D%7D%7D)
6
6
 
7
- Fetch public web pages and convert them into clean, AI-readable Markdown.
7
+ [![Add to LM Studio](https://files.lmstudio.ai/deeplink/mcp-install-light.svg)](https://lmstudio.ai/install-mcp?name=fetch-url-mcp&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBqMGhhbnovZmV0Y2gtdXJsLW1jcEBsYXRlc3QiXX0%3D) [![Install in Cursor](https://cursor.com/deeplink/mcp-install-dark.svg)](https://cursor.com/en/install-mcp?name=fetch-url-mcp&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBqMGhhbnovZmV0Y2gtdXJsLW1jcEBsYXRlc3QiXX0%3D) [![Install in Goose](https://block.github.io/goose/img/extension-install-dark.svg)](https://block.github.io/goose/extension?cmd=npx&arg=-y&arg=%40j0hanz%2Ffetch-url-mcp%40latest&id=%40j0hanz%2Ffetch-url-mcp&name=fetch-url-mcp&description=fetch-url-mcp%20MCP%20server)
8
8
 
9
- ## Overview
9
+ Intelligent web content fetcher MCP server that converts HTML to clean, AI-readable Markdown
10
10
 
11
- Fetch URL is a [Model Context Protocol](https://modelcontextprotocol.io) (MCP) server that fetches public web pages, extracts meaningful content using Mozilla's Readability algorithm, and converts the result into clean Markdown optimized for LLM context windows. It handles noise removal, caching, SSRF protection, async task execution, and supports both **stdio** and **Streamable HTTP** transports.
11
+ ## Overview
12
12
 
13
- > [!NOTE]
14
- > Content extraction quality varies depending on the HTML structure and complexity of the source page. Fetch URL works best with standard article and documentation layouts. Pages relying on client-side JavaScript rendering may yield incomplete results.
13
+ `@j0hanz/fetch-url-mcp` is an MCP server for fetching public web pages and converting them into cleaned Markdown. It exposes one read-only tool, one built-in help prompt, and one internal instructions resource. The default transport is stdio, and `--http` enables Streamable HTTP mode.
15
14
 
16
15
  ## Key Features
17
16
 
18
- - **HTML to Markdown**: Content extraction via Mozilla Readability + node-html-markdown
19
- - **Noise removal**: Strips navigation, ads, cookie banners, and other non-content elements
20
- - **In-memory LRU cache**: Faster repeat fetches with configurable TTL (24 h default)
21
- - **Raw URL rewriting**: Auto-converts GitHub, GitLab, Bitbucket, and Gist URLs to raw content endpoints
17
+ - `fetch-url` returns cleaned Markdown, metadata, redirect information, cache status, and structured output.
18
+ - The tool is explicitly annotated as read-only, idempotent, and open-world, with optional task support for large fetches.
19
+ - GitHub, GitLab, and Bitbucket page URLs are normalized to raw-content endpoints when appropriate.
20
+ - `get-help` exposes the server instructions, and `internal://instructions` makes the same guidance available as a resource.
21
+ - HTTP mode includes auth, host/origin validation, rate limiting, health checks, and OAuth protected-resource metadata routes.
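
The raw-content normalization mentioned in the feature list can be pictured with a small sketch. The function below is hypothetical (`toRawGitHubUrl` is not a name from the package) and assumes the common GitHub `/blob/<ref>/<path>` page layout; the server's actual rules, including its GitLab and Bitbucket handling, are not shown in this diff.

```typescript
// Hypothetical helper: rewrite a GitHub "blob" page URL to its raw-content URL.
// Any URL that does not match the assumed shape is returned unchanged.
function toRawGitHubUrl(input: string): string {
  const url = new URL(input);
  if (url.hostname !== "github.com") return input;

  // Assumed path shape: /<owner>/<repo>/blob/<ref>/<path...>
  const parts = url.pathname.split("/").filter(Boolean);
  if (parts.length < 5 || parts[2] !== "blob") return input;

  const [owner, repo, , ref, ...filePath] = parts;
  return `https://raw.githubusercontent.com/${owner}/${repo}/${ref}/${filePath.join("/")}`;
}

const rawUrl = toRawGitHubUrl(
  "https://github.com/j0hanz/fetch-url-mcp/blob/main/README.md",
);
console.log(rawUrl);
```

This sketch treats the first path segment after `blob` as the ref, so branch names that contain a slash would need extra handling.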
22
22
 
23
- ## Tech Stack
23
+ ## Requirements
24
24
 
25
- | Component | Technology |
26
- | ------------------- | ----------------------------------- |
27
- | Runtime | Node.js >= 24 |
28
- | Language | TypeScript 5.9 |
29
- | MCP SDK | `@modelcontextprotocol/sdk` ^1.26.0 |
30
- | Content Extraction | `@mozilla/readability` ^0.6.0 |
31
- | DOM Parsing | `linkedom` ^0.18.12 |
32
- | Markdown Conversion | `node-html-markdown` ^2.0.0 |
33
- | Schema Validation | `zod` ^4.3.6 |
34
- | Package Manager | npm |
25
+ - Node.js >=24 (from `package.json`)
26
+ - Docker is optional if you want to run the published container image.
35
27
 
36
- ## Architecture
28
+ ## Quick Start
37
29
 
38
- ```text
39
- URL → Validate → DNS Preflight → HTTP Fetch → Decompress
40
- → Truncate HTML → Readability Extract → Noise Removal
41
- → Markdown Convert → Cleanup Pipeline → Cache → Response
30
+ Use this standard MCP client configuration:
31
+
32
+ ```json
33
+ {
34
+ "mcpServers": {
35
+ "fetch-url-mcp": {
36
+ "command": "npx",
37
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
38
+ }
39
+ }
40
+ }
42
41
  ```
43
42
 
44
- 1. **URL Validation** — Normalize, block private hosts, transform raw-content URLs (GitHub, GitLab, Bitbucket)
45
- 2. **Fetch** — HTTP request with redirect following, DNS preflight SSRF checks, and size limits (10 MB)
46
- 3. **Transform** — Offloaded to worker threads: parse HTML with `linkedom`, extract with Readability, remove DOM noise, convert to Markdown
47
- 4. **Cleanup** Multi-pass Markdown normalization (heading promotion, spacing, skip-link removal)
48
- 5. **Cache + Respond** — Store result in LRU cache, apply inline content limits, return structured content
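
Step 1 of the removed 1.6.0 pipeline says private hosts are blocked. As a rough illustration only, the check below rejects IPv4 literals in well-known private, loopback, and link-local ranges. The name `isPrivateIPv4` and the exact range list are assumptions; the package's real SSRF logic (which per the text also includes a DNS preflight) is not part of this diff.

```typescript
// Illustrative sketch: classify an IPv4 literal as private/loopback/link-local.
// Hostnames return false here; they would need a DNS lookup before checking.
function isPrivateIPv4(host: string): boolean {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!match) return false;

  const octets = match.slice(1).map(Number);
  if (octets.some((octet) => octet > 255)) return false;

  const [a = -1, b = -1] = octets;
  return (
    a === 0 || // 0.0.0.0/8
    a === 10 || // 10.0.0.0/8
    a === 127 || // 127.0.0.0/8 (loopback)
    (a === 169 && b === 254) || // 169.254.0.0/16 (link-local)
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) // 192.168.0.0/16
  );
}
```

A literal-only check like this is not sufficient on its own, because a public hostname can resolve to a private address; that is presumably why the original text pairs URL validation with DNS preflight checks.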
43
+ ## Client Configuration
44
+
45
+ <details>
46
+ <summary><b>Install in VS Code</b></summary>
49
47
 
50
- ## Repository Structure
48
+ [![Install in VS Code](https://img.shields.io/badge/VS_Code-Install_Server-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://insiders.vscode.dev/redirect/mcp/install?name=fetch-url-mcp&config=%7B%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%40j0hanz%2Ffetch-url-mcp%40latest%22%5D%7D)
51
49
 
52
- ```text
53
- fetch-url-mcp/
54
- ├── assets/ # Server icon (logo.svg)
55
- ├── examples/ # Client examples
56
- ├── scripts/ # Build & test orchestration
57
- ├── src/
58
- │ ├── workers/ # Worker-thread child for HTML transforms
59
- │ ├── index.ts # CLI entrypoint, transport wiring, shutdown
60
- │ ├── server.ts # McpServer lifecycle and registration
61
- │ ├── tools.ts # fetch-url tool definition and pipeline
62
- │ ├── fetch.ts # URL normalization, SSRF, HTTP fetch
63
- │ ├── transform.ts # HTML-to-Markdown pipeline, worker pool
64
- │ ├── config.ts # Env-driven configuration
65
- │ ├── resources.ts # MCP resource/template registration
66
- │ ├── prompts.ts # MCP prompt registration (get-help)
67
- │ ├── mcp.ts # Task execution management
68
- │ ├── http-native.ts # Streamable HTTP server, auth, sessions
69
- │ └── instructions.md # Server instructions embedded at runtime
70
- ├── tests/ # Unit/integration tests (Node.js test runner)
71
- ├── package.json
72
- ├── tsconfig.json
73
- └── AGENTS.md
50
+ Add to `.vscode/mcp.json`:
51
+
52
+ ```json
53
+ {
54
+ "servers": {
55
+ "fetch-url-mcp": {
56
+ "command": "npx",
57
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
58
+ }
59
+ }
60
+ }
74
61
  ```
75
62
 
76
- ## Requirements
63
+ Or install via CLI:
77
64
 
78
- - **Node.js** >= 24
65
+ ```sh
66
+ code --add-mcp '{"name":"fetch-url-mcp","command":"npx","args":["-y","@j0hanz/fetch-url-mcp@latest"]}'
67
+ ```
79
68
 
80
- ## Quickstart
69
+ For more info, see [VS Code MCP docs](https://code.visualstudio.com/docs/copilot/chat/mcp-servers).
81
70
 
82
- ```bash
83
- npx -y @j0hanz/fetch-url-mcp@latest --stdio
84
- ```
71
+ </details>
85
72
 
86
- Add to your MCP client configuration:
73
+ <details>
74
+ <summary><b>Install in VS Code Insiders</b></summary>
75
+
76
+ [![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install_Server-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://insiders.vscode.dev/redirect/mcp/install?name=fetch-url-mcp&config=%7B%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%40j0hanz%2Ffetch-url-mcp%40latest%22%5D%7D&quality=insiders)
77
+
78
+ Add to `.vscode/mcp.json`:
87
79
 
88
80
  ```json
89
81
  {
90
- "mcpServers": {
82
+ "servers": {
91
83
  "fetch-url-mcp": {
92
84
  "command": "npx",
93
- "args": ["-y", "@j0hanz/fetch-url-mcp@latest", "--stdio"]
85
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
94
86
  }
95
87
  }
96
88
  }
97
89
  ```
98
90
 
99
- ## Client Example (CLI)
100
-
101
- Build the server and examples, then run the client:
91
+ Or install via CLI:
102
92
 
103
- ```bash
104
- npm run build
105
- node dist/examples/mcp-fetch-url-client.js https://example.com
93
+ ```sh
94
+ code-insiders --add-mcp '{"name":"fetch-url-mcp","command":"npx","args":["-y","@j0hanz/fetch-url-mcp@latest"]}'
106
95
  ```
107
96
 
108
- Optional flags:
97
+ For more info, see [VS Code Insiders MCP docs](https://code.visualstudio.com/docs/copilot/chat/mcp-servers).
109
98
 
110
- - `--full` reads the cached markdown resource to avoid inline truncation.
111
- - `--task` enables task-based execution with streamed status updates.
112
- - `--task-ttl <ms>` sets task TTL; `--task-poll <ms>` sets poll interval.
113
- - `--http http://localhost:3000/mcp` connects to the Streamable HTTP server.
114
- - Progress updates (when emitted) are printed to stderr.
99
+ </details>
115
100
 
116
- ## Installation
101
+ <details>
102
+ <summary><b>Install in Cursor</b></summary>
117
103
 
118
- ### NPX (Recommended)
104
+ [![Install in Cursor](https://cursor.com/deeplink/mcp-install-dark.svg)](https://cursor.com/en/install-mcp?name=fetch-url-mcp&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBqMGhhbnovZmV0Y2gtdXJsLW1jcEBsYXRlc3QiXX0%3D)
119
105
 
120
- No installation required — runs directly:
106
+ Add to `~/.cursor/mcp.json`:
121
107
 
122
- ```bash
123
- npx -y @j0hanz/fetch-url-mcp@latest --stdio
108
+ ```json
109
+ {
110
+ "mcpServers": {
111
+ "fetch-url-mcp": {
112
+ "command": "npx",
113
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
114
+ }
115
+ }
116
+ }
124
117
  ```
125
118
 
126
- ### Global Install
119
+ For more info, see [Cursor MCP docs](https://docs.cursor.com/context/model-context-protocol).
127
120
 
128
- ```bash
129
- npm install -g @j0hanz/fetch-url-mcp
130
- fetch-url-mcp --stdio
131
- ```
121
+ </details>
132
122
 
133
- ### From Source
123
+ <details>
124
+ <summary><b>Install in Visual Studio</b></summary>
134
125
 
135
- ```bash
136
- git clone https://github.com/j0hanz/fetch-url-mcp.git
137
- cd fetch-url-mcp
138
- npm install
139
- npm run build
140
- node dist/index.js --stdio
141
- ```
126
+ [![Install in Visual Studio](https://img.shields.io/badge/Visual_Studio-Install_Server-C16FDE?logo=visualstudio&logoColor=white)](https://vs-open.link/mcp-install?%7B%22fetch-url-mcp%22%3A%7B%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%40j0hanz%2Ffetch-url-mcp%40latest%22%5D%7D%7D)
142
127
 
143
- ### Docker
128
+ Add to `mcp.json` (Visual Studio integrated):
144
129
 
145
- ```bash
146
- docker compose up --build
130
+ ```json
131
+ {
132
+ "mcpServers": {
133
+ "fetch-url-mcp": {
134
+ "command": "npx",
135
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
136
+ }
137
+ }
138
+ }
147
139
  ```
148
140
 
149
- ## Configuration
141
+ For more info, see [Visual Studio MCP docs](https://learn.microsoft.com/en-us/visualstudio/ide/mcp-servers).
150
142
 
151
- ### Runtime Modes
152
-
153
- | Flag | Description |
154
- | ----------------- | ---------------------------------------------------------- |
155
- | `--stdio`, `-s` | Run in stdio mode (for desktop MCP clients; default) |
156
- | `--http` | Run in HTTP mode (Streamable HTTP on port 3000 by default) |
157
- | `--help`, `-h` | Show usage help |
158
- | `--version`, `-v` | Print server version |
159
-
160
- When no transport flag is passed, the server starts in **stdio mode**.
161
-
162
- ### Environment Variables
163
-
164
- #### Core Settings
165
-
166
- | Variable | Default | Description |
167
- | ------------------------------------ | ------------------------- | ------------------------------------------------------------------------------------------------------- |
168
- | `HOST` | `127.0.0.1` | HTTP server bind address |
169
- | `PORT` | `3000` | HTTP server port (1024–65535) |
170
- | `LOG_LEVEL` | `info` | Log level: `debug`, `info`, `warn`, `error` |
171
- | `FETCH_TIMEOUT_MS` | `15000` | HTTP fetch timeout in ms (1000–60000) |
172
- | `CACHE_ENABLED` | `true` | Enable/disable in-memory content cache |
173
- | `USER_AGENT` | `fetch-url-mcp/{version}` | Custom User-Agent header |
174
- | `ALLOW_REMOTE` | `false` | Allow remote connections in HTTP mode |
175
- | `ALLOWED_HOSTS` | _(empty)_ | Comma-separated host/origin allowlist for HTTP mode |
176
- | `MCP_STRICT_PROTOCOL_VERSION_HEADER` | `true` | Require `MCP-Protocol-Version` on HTTP session initialize (`false` allows legacy headerless initialize) |
177
-
178
- #### Task Management
179
-
180
- | Variable | Default | Description |
181
- | ---------------------------- | ------- | -------------------------------------------------------- |
182
- | `TASKS_MAX_TOTAL` | `5000` | Maximum retained task records across all owners |
183
- | `TASKS_MAX_PER_OWNER` | `1000` | Maximum retained task records per session/client |
184
- | `TASKS_STATUS_NOTIFICATIONS` | `false` | Emit experimental `notifications/tasks/status` extension |
185
-
186
- #### Authentication (HTTP Mode)
187
-
188
- | Variable | Default | Description |
189
- | ------------------------- | --------- | --------------------------------------- |
190
- | `ACCESS_TOKENS` | _(empty)_ | Comma-separated static bearer tokens |
191
- | `API_KEY` | _(empty)_ | Single API key (added to static tokens) |
192
- | `OAUTH_ISSUER_URL` | _(empty)_ | OAuth issuer URL (enables OAuth mode) |
193
- | `OAUTH_AUTHORIZATION_URL` | _(empty)_ | OAuth authorization endpoint |
194
- | `OAUTH_TOKEN_URL` | _(empty)_ | OAuth token endpoint |
195
- | `OAUTH_INTROSPECTION_URL` | _(empty)_ | OAuth token introspection endpoint |
196
- | `OAUTH_REVOCATION_URL` | _(empty)_ | OAuth token revocation endpoint |
197
- | `OAUTH_REGISTRATION_URL` | _(empty)_ | OAuth dynamic client registration |
198
- | `OAUTH_REQUIRED_SCOPES` | _(empty)_ | Required OAuth scopes |
199
- | `OAUTH_CLIENT_ID` | _(empty)_ | OAuth client ID |
200
- | `OAUTH_CLIENT_SECRET` | _(empty)_ | OAuth client secret |
201
-
202
- #### Transform & Workers
203
-
204
- | Variable | Default | Description |
205
- | ------------------------------------------ | --------- | ----------------------------------------- |
206
- | `TRANSFORM_WORKER_MODE` | `threads` | Worker mode: `threads` or `process` |
207
- | `TRANSFORM_WORKER_MAX_OLD_GENERATION_MB` | _(unset)_ | V8 old generation heap limit per worker |
208
- | `TRANSFORM_WORKER_MAX_YOUNG_GENERATION_MB` | _(unset)_ | V8 young generation heap limit per worker |
209
- | `TRANSFORM_WORKER_CODE_RANGE_MB` | _(unset)_ | V8 code range limit per worker |
210
- | `TRANSFORM_WORKER_STACK_MB` | _(unset)_ | Stack size limit per worker |
211
-
212
- #### Content Tuning
213
-
214
- | Variable | Default | Description |
215
- | ------------------------------------- | ----------------- | ------------------------------------------------ |
216
- | `MAX_INLINE_CONTENT_CHARS` | `0` | Global inline markdown limit (`0` = unlimited) |
217
- | `FETCH_URL_MCP_EXTRA_NOISE_TOKENS` | _(empty)_ | Additional CSS class/id tokens for noise removal |
218
- | `FETCH_URL_MCP_EXTRA_NOISE_SELECTORS` | _(empty)_ | Additional CSS selectors for noise removal |
219
- | `MARKDOWN_HEADING_KEYWORDS` | _(built-in list)_ | Keywords triggering heading promotion |
220
- | `FETCH_URL_MCP_LOCALE` | _(system)_ | Locale for content processing |
221
-
222
- #### Server Tuning
223
-
224
- | Variable | Default | Description |
225
- | ---------------------------------- | --------------- | ---------------------------------------- |
226
- | `SERVER_MAX_CONNECTIONS` | `0` (unlimited) | Maximum concurrent HTTP connections |
227
- | `SERVER_BLOCK_PRIVATE_CONNECTIONS` | `false` | Block connections from private IP ranges |
228
-
229
- ### Hardcoded Defaults
230
-
231
- | Setting | Value |
232
- | ------------------------ | ------------------------------- |
233
- | Max HTML size | 10 MB |
234
- | Max inline content chars | 0 (unlimited, configurable) |
235
- | Fetch timeout | 15 s |
236
- | Transform timeout | 30 s |
237
- | Tool timeout | Fetch + Transform + 5 s padding |
238
- | Max redirects | 5 |
239
- | Cache TTL | 86400 s (24 h) |
240
- | Cache max keys | 100 |
241
- | Rate limit | 100 requests / 60 s |
242
- | Max sessions | 200 |
243
- | Session TTL | 30 min |
244
- | Max URL length | 2048 chars |
245
- | Worker pool max scale | 4 |
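
The cache defaults in the removed table (100 keys, 86400 s TTL) describe an LRU cache with expiry. The class below is a minimal sketch of that idea, not the package's implementation; `TtlLruCache` and its injectable clock are hypothetical names introduced for illustration.

```typescript
// Minimal LRU cache with per-entry TTL, built on Map insertion order.
class TtlLruCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly maxKeys: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  // Defaults mirror the table above: 100 keys, 24 h TTL.
  constructor(maxKeys = 100, ttlMs = 86_400_000, now: () => number = () => Date.now()) {
    this.maxKeys = maxKeys;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop and report a miss
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxKeys) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}
```

Injecting the clock keeps expiry testable without waiting for real time to pass.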
246
-
247
- ## Usage
248
-
249
- ### Stdio Mode
250
-
251
- ```bash
252
- fetch-url-mcp --stdio
253
- ```
143
+ </details>
144
+
145
+ <details>
146
+ <summary><b>Install in Goose</b></summary>
254
147
 
255
- The server communicates via JSON-RPC over stdin/stdout. All MCP clients that support stdio transport can connect directly.
148
+ [![Install in Goose](https://block.github.io/goose/img/extension-install-dark.svg)](https://block.github.io/goose/extension?cmd=npx&arg=-y&arg=%40j0hanz%2Ffetch-url-mcp%40latest&id=%40j0hanz%2Ffetch-url-mcp&name=fetch-url-mcp&description=Intelligent%20web%20content%20fetcher%20MCP%20server%20that%20converts%20HTML%20to%20clean%2C%20AI-readable%20Markdown)
256
149
 
257
- ### HTTP Mode
150
+ Add to the Goose extension registry:
258
151
 
259
- ```bash
260
- fetch-url-mcp
261
- # or
262
- PORT=8080 HOST=0.0.0.0 ALLOW_REMOTE=true fetch-url-mcp
152
+ ```json
153
+ {
154
+ "mcpServers": {
155
+ "fetch-url-mcp": {
156
+ "command": "npx",
157
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
158
+ }
159
+ }
160
+ }
263
161
  ```
264
162
 
265
- The server starts a Streamable HTTP endpoint at `/mcp`. Authenticate with bearer tokens via the `ACCESS_TOKENS` or `API_KEY` environment variables.
163
+ For more info, see [Goose MCP docs](https://block.github.io/goose/docs/getting-started/using-extensions).
266
164
 
267
- For `POST /mcp`, clients should send:
165
+ </details>
268
166
 
269
- - `Accept: application/json, text/event-stream`
270
- - `MCP-Protocol-Version: 2025-11-25` (or `2025-03-26` for legacy clients)
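
The headers listed in the removed 1.6.0 text can be assembled as follows. This is a sketch under assumptions: `buildInitializeRequest` and `HttpRequestSpec` are hypothetical names, the base URL reflects the `127.0.0.1:3000` defaults from the old configuration table, and the body follows the general MCP `initialize` request shape rather than anything confirmed by this diff.

```typescript
// Hypothetical request description; pass its fields to any HTTP client.
interface HttpRequestSpec {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildInitializeRequest(
  token: string,
  baseUrl = "http://127.0.0.1:3000", // assumed default host and port
  protocolVersion = "2025-11-25",
): HttpRequestSpec {
  return {
    url: new URL("/mcp", baseUrl).toString(),
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Accept: "application/json, text/event-stream",
      "MCP-Protocol-Version": protocolVersion,
      Authorization: `Bearer ${token}`, // static token from ACCESS_TOKENS or API_KEY
    },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion,
        capabilities: {},
        clientInfo: { name: "example-client", version: "0.0.0" }, // placeholder values
      },
    }),
  };
}
```

The result can be sent with `fetch(spec.url, { method: spec.method, headers: spec.headers, body: spec.body })`. According to the old README text, the session ID then comes back in the `mcp-session-id` response header.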
167
+ <details>
168
+ <summary><b>Install in LM Studio</b></summary>
271
169
 
272
- ## MCP Surface
170
+ [![Add to LM Studio](https://files.lmstudio.ai/deeplink/mcp-install-light.svg)](https://lmstudio.ai/install-mcp?name=fetch-url-mcp&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBqMGhhbnovZmV0Y2gtdXJsLW1jcEBsYXRlc3QiXX0%3D)
273
171
 
274
- ### Tools
172
+ Add to the LM Studio MCP config:
275
173
 
276
- #### `fetch-url`
174
+ ```json
175
+ {
176
+ "mcpServers": {
177
+ "fetch-url-mcp": {
178
+ "command": "npx",
179
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
180
+ }
181
+ }
182
+ }
183
+ ```
277
184
 
278
- Fetches a webpage and converts it to clean Markdown format optimized for LLM context.
185
+ For more info, see [LM Studio MCP docs](https://lmstudio.ai/docs/basics/mcp).
279
186
 
280
- **Useful for:**
187
+ </details>
281
188
 
282
- - Reading documentation, blog posts, or articles
283
- - Extracting main content while removing navigation and ads
284
- - Caching content to speed up repeated queries
189
+ <details>
190
+ <summary><b>Install in Claude Desktop</b></summary>
285
191
 
286
- **Limitations:**
192
+ Add to `claude_desktop_config.json`:
193
+
194
+ ```json
195
+ {
196
+ "mcpServers": {
197
+ "fetch-url-mcp": {
198
+ "command": "npx",
199
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
200
+ }
201
+ }
202
+ }
203
+ ```
287
204
 
288
- - Does not execute complex client-side JavaScript interactions
289
- - Inline output may be truncated when `MAX_INLINE_CONTENT_CHARS` is set
205
+ For more info, see [Claude Desktop MCP docs](https://modelcontextprotocol.io/quickstart/user).
290
206
 
291
- ##### Parameters
207
+ </details>
292
208
 
293
- | Parameter | Type | Required | Default | Description |
294
- | ------------------ | -------------- | -------- | ------- | -------------------------------------------------------------------------- |
295
- | `url` | `string` (URL) | Yes | — | The URL of the webpage to fetch (http/https, max 2048 chars) |
296
- | `skipNoiseRemoval` | `boolean` | No | `false` | Preserve navigation, footers, and other elements normally filtered |
297
- | `forceRefresh` | `boolean` | No | `false` | Bypass cache and fetch fresh content |
298
- | `maxInlineChars` | `number` | No | `0` | Per-call inline markdown limit (`0` = unlimited; global cap still applies) |
209
+ <details>
210
+ <summary><b>Install in Claude Code</b></summary>
299
211
 
300
- ##### Returns
212
+ Add to the Claude Code MCP configuration:
301
213
 
302
214
  ```json
303
215
  {
304
- "url": "https://example.com",
305
- "inputUrl": "https://example.com",
306
- "resolvedUrl": "https://example.com",
307
- "finalUrl": "https://example.com",
308
- "title": "Example Domain",
309
- "metadata": {
310
- "title": "Example Domain",
311
- "description": "...",
312
- "author": "...",
313
- "image": "...",
314
- "favicon": "...",
315
- "publishedAt": "...",
316
- "modifiedAt": "..."
317
- },
318
- "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples...",
319
- "fromCache": false,
320
- "fetchedAt": "2026-02-11T12:00:00.000Z",
321
- "contentSize": 1234,
322
- "truncated": false
216
+ "mcpServers": {
217
+ "fetch-url-mcp": {
218
+ "command": "npx",
219
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
220
+ }
221
+ }
323
222
  }
324
223
  ```
325
224
 
326
- | Field | Type | Description |
327
- | ------------- | ---------- | ---------------------------------------------------------------------------------------- |
328
- | `url` | `string` | The canonical URL (pre-raw-transform) |
329
- | `inputUrl` | `string?` | The original URL provided by the caller |
330
- | `resolvedUrl` | `string?` | The normalized/transformed URL that was fetched |
331
- | `finalUrl` | `string?` | Final response URL after redirects |
332
- | `title` | `string?` | Extracted page title |
333
- | `metadata` | `object?` | Extracted metadata (title, description, author, image, favicon, publishedAt, modifiedAt) |
334
- | `markdown` | `string?` | Extracted content in Markdown format |
335
- | `fromCache` | `boolean?` | Whether the response was served from cache |
336
- | `fetchedAt` | `string?` | ISO timestamp for fetch/cache retrieval |
337
- | `contentSize` | `number?` | Full markdown size before inline truncation |
338
- | `truncated` | `boolean?` | Whether inline markdown was truncated |
339
- | `error` | `string?` | Error message if the request failed |
340
- | `statusCode` | `number?` | HTTP status code for failed requests |
341
- | `details` | `object?` | Additional error details |
342
-
343
- ##### Annotations
344
-
345
- | Annotation | Value |
346
- | ----------------- | ------- |
347
- | `readOnlyHint` | `true` |
348
- | `destructiveHint` | `false` |
349
- | `idempotentHint` | `true` |
350
- | `openWorldHint` | `true` |
351
-
352
- ##### Async Task Execution
353
-
354
- The `fetch-url` tool supports optional async task execution (`execution.taskSupport: "optional"`). Include a `task` field in the tool call to run the fetch in the background:
225
+ Or install via CLI:
226
+
227
+ ```sh
228
+ claude mcp add fetch-url-mcp -- npx -y @j0hanz/fetch-url-mcp@latest
229
+ ```
230
+
231
+ For more info, see [Claude Code MCP docs](https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/tutorials#set-up-model-context-protocol-mcp).
232
+
233
+ </details>
234
+
235
+ <details>
236
+ <summary><b>Install in Windsurf</b></summary>
237
+
238
+ Add to `~/.codeium/windsurf/mcp_config.json`:
355
239
 
356
240
  ```json
357
241
  {
358
- "method": "tools/call",
359
- "params": {
360
- "name": "fetch-url",
361
- "arguments": { "url": "https://example.com" },
362
- "task": { "ttl": 30000 }
242
+ "mcpServers": {
243
+ "fetch-url-mcp": {
244
+ "command": "npx",
245
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
246
+ }
363
247
  }
364
248
  }
365
249
  ```
366
250
 
367
- Then poll `tasks/get` until the task status is `completed` or `failed`, and retrieve the result via `tasks/result`.
251
+ For more info, see [Windsurf MCP docs](https://docs.windsurf.com/windsurf/mcp).
368
252
 
369
- ### Prompts
253
+ </details>
370
254
 
371
- | Name | Description |
372
- | ---------- | --------------------------------- |
373
- | `get-help` | Returns server usage instructions |
255
+ <details>
256
+ <summary><b>Install in Amp</b></summary>
374
257
 
375
- ### Resources
258
+ Add to the Amp MCP config:
376
259
 
377
- | URI Pattern | MIME Type | Description |
378
- | ------------------------------------- | --------------- | ---------------------------------------------------- |
379
- | `internal://instructions` | `text/markdown` | Server instructions and usage guidance |
380
- | `internal://cache/{namespace}/{hash}` | `text/markdown` | Cached markdown entries from prior `fetch-url` calls |
260
+ ```json
261
+ {
262
+ "mcpServers": {
263
+ "fetch-url-mcp": {
264
+ "command": "npx",
265
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
266
+ }
267
+ }
268
+ }
269
+ ```
381
270
 
382
- ### Tasks
271
+ Or install via CLI:
383
272
 
384
- The server declares full MCP task support:
273
+ ```sh
274
+ amp mcp add fetch-url-mcp -- npx -y @j0hanz/fetch-url-mcp@latest
275
+ ```
385
276
 
386
- | Endpoint | Description |
387
- | -------------- | ------------------------------------ |
388
- | `tasks/list` | List tasks (scoped to session/owner) |
389
- | `tasks/get` | Get task status by ID |
390
- | `tasks/result` | Retrieve completed task result |
391
- | `tasks/cancel` | Cancel an in-flight task |
277
+ For more info, see [Amp MCP docs](https://docs.amp.dev).
392
278
 
393
- ## HTTP Mode Endpoints
279
+ </details>
280
+
281
+ <details>
282
+ <summary><b>Install in Cline</b></summary>
394
283
 
395
- | Method | Path | Auth | Description |
396
- | -------- | ----------------------------------- | ----- | ---------------------------------------- |
397
- | `GET` | `/health` | No | Health check (minimal payload) |
398
- | `GET` | `/health?verbose=true` | Yes\* | Detailed diagnostics and runtime metrics |
399
- | `POST` | `/mcp` | Yes | MCP JSON-RPC (Streamable HTTP) |
400
- | `GET` | `/mcp` | Yes | SSE stream for server-initiated messages |
401
- | `DELETE` | `/mcp` | Yes | Terminate MCP session |
402
- | `GET` | `/mcp/downloads/{namespace}/{hash}` | Yes | Download cached content |
284
+ Add to `cline_mcp_settings.json`:
403
285
 
404
- \* `verbose=true` can be read without auth only for local-only deployments (`ALLOW_REMOTE=false`).
286
+ ```json
287
+ {
288
+ "mcpServers": {
289
+ "fetch-url-mcp": {
290
+ "command": "npx",
291
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
292
+ }
293
+ }
294
+ }
295
+ ```
296
+
297
+ For more info, see [Cline MCP docs](https://docs.cline.bot/mcp-servers/configuring-mcp-servers).
405
298
 
406
- ### Session Behavior
299
+ </details>
407
300
 
408
- - Sessions are created on the first `POST /mcp` request with an `initialize` message
409
- - Session ID is returned in the `mcp-session-id` response header
410
- - Sessions expire after 30 minutes of inactivity (max 200 concurrent)
301
+ <details>
302
+ <summary><b>Install in Codex CLI</b></summary>
411
303
 
412
- ### Authentication
304
+ Add to `~/.codex/config.yaml` or configure via the Codex CLI:
305
+
306
+ ```json
307
+ {
308
+ "mcpServers": {
309
+ "fetch-url-mcp": {
310
+ "command": "npx",
311
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
312
+ }
313
+ }
314
+ }
315
+ ```
413
316
 
414
- - **Static tokens**: Set `ACCESS_TOKENS` or `API_KEY` environment variables; pass as `Authorization: Bearer <token>`
415
- - **OAuth**: Configure `OAUTH_*` environment variables to enable OAuth 2.0 token introspection
317
+ For more info, see [Codex CLI MCP docs](https://github.com/openai/codex).
416
318
 
417
- ## Client Configuration Examples
319
+ </details>
418
320
 
419
321
  <details>
420
- <summary>VS Code / VS Code Insiders</summary>
322
+ <summary><b>Install in GitHub Copilot</b></summary>
421
323
 
422
- Add to your VS Code settings (`.vscode/mcp.json` or User Settings):
324
+ Add to `.vscode/mcp.json`:
423
325
 
424
326
  ```json
425
327
  {
426
328
  "servers": {
427
329
  "fetch-url-mcp": {
428
330
  "command": "npx",
429
- "args": ["-y", "@j0hanz/fetch-url-mcp@latest", "--stdio"]
331
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
430
332
  }
431
333
  }
432
334
  }
433
335
  ```
434
336
 
337
+ For more info, see [GitHub Copilot MCP docs](https://code.visualstudio.com/docs/copilot/chat/mcp-servers).
338
+
435
339
  </details>
436
340
 
437
341
  <details>
438
- <summary>Claude Desktop</summary>
342
+ <summary><b>Install in Warp</b></summary>
439
343
 
440
- Add to `claude_desktop_config.json`:
344
+ Add to the Warp MCP config:
441
345
 
442
346
  ```json
443
347
  {
444
348
  "mcpServers": {
445
349
  "fetch-url-mcp": {
446
350
  "command": "npx",
447
- "args": ["-y", "@j0hanz/fetch-url-mcp@latest", "--stdio"]
351
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
448
352
  }
449
353
  }
450
354
  }
451
355
  ```
452
356
 
357
+ For more info, see [Warp MCP docs](https://docs.warp.dev/features/mcp-model-context-protocol).
358
+
453
359
  </details>
454
360
 
455
361
  <details>
456
- <summary>Cursor</summary>
457
-
458
- [![Install in Cursor](https://img.shields.io/badge/Cursor-Install-f97316?logo=cursor&logoColor=white)](https://cursor.com/install-mcp?name=fetch-url-mcp&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBqMGhhbnovZmV0Y2gtdXJsLW1jcEBsYXRlc3QiLCItLXN0ZGlvIl19)
362
+ <summary><b>Install in Kiro</b></summary>
459
363
 
460
- Or manually add to Cursor MCP settings:
364
+ Add to `.kiro/settings/mcp.json`:
461
365
 
462
366
  ```json
463
367
  {
464
368
  "mcpServers": {
465
369
  "fetch-url-mcp": {
466
370
  "command": "npx",
467
- "args": ["-y", "@j0hanz/fetch-url-mcp@latest", "--stdio"]
371
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
468
372
  }
469
373
  }
470
374
  }
471
375
  ```
472
376
 
377
+ For more info, see [Kiro MCP docs](https://kiro.dev/docs/mcp/overview/).
378
+
473
379
  </details>
474
380
 
475
381
  <details>
476
- <summary>Windsurf</summary>
382
+ <summary><b>Install in Gemini CLI</b></summary>
477
383
 
478
- Add to your Windsurf MCP configuration:
384
+ Add to `~/.gemini/settings.json`:
479
385
 
480
386
  ```json
481
387
  {
482
388
  "mcpServers": {
483
389
  "fetch-url-mcp": {
484
390
  "command": "npx",
485
- "args": ["-y", "@j0hanz/fetch-url-mcp@latest", "--stdio"]
391
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
486
392
  }
487
393
  }
488
394
  }
489
395
  ```
490
396
 
397
+ For more info, see [Gemini CLI MCP docs](https://github.com/google-gemini/gemini-cli).
398
+
491
399
  </details>
492
400
 
493
401
  <details>
494
- <summary>Docker</summary>
402
+ <summary><b>Install in Zed</b></summary>
495
403
 
496
- Use the published image from GitHub Container Registry:
404
+ Add to `~/.config/zed/settings.json`:
497
405
 
498
406
  ```json
499
407
  {
500
- "mcpServers": {
408
+ "context_servers": {
501
409
  "fetch-url-mcp": {
502
- "command": "docker",
503
- "args": [
504
- "run",
505
- "-i",
506
- "--rm",
507
- "ghcr.io/j0hanz/fetch-url-mcp:latest",
508
- "--stdio"
509
- ]
410
+ "settings": {
411
+ "command": "npx",
412
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
413
+ }
510
414
  }
511
415
  }
512
416
  }
513
417
  ```
514
418
 
515
- Or build and run locally:
419
+ For more info, see [Zed MCP docs](https://zed.dev/docs/assistant/model-context-protocol).
516
420
 
517
- ```bash
518
- docker build -t fetch-url-mcp .
519
- docker run -i --rm fetch-url-mcp --stdio
421
+ </details>
422
+
423
+ <details>
424
+ <summary><b>Install in Augment</b></summary>
425
+
426
+ Add to your VS Code `settings.json`:
427
+
428
+ > Add to your VS Code `settings.json` under `augment.advanced`.
429
+
430
+ ```json
431
+ {
432
+ "augment.advanced": {
433
+ "mcpServers": [
434
+ {
435
+ "id": "fetch-url-mcp",
436
+ "command": "npx",
437
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
438
+ }
439
+ ]
440
+ }
441
+ }
520
442
  ```
521
443
 
444
+ For more info, see [Augment MCP docs](https://docs.augmentcode.com/setup-mcp-servers).
445
+
522
446
  </details>
523
447
 
524
- ## Security
448
+ <details>
449
+ <summary><b>Install in Roo Code</b></summary>
525
450
 
526
- ### SSRF Protection
451
+ Add to your Roo Code MCP settings:
527
452
 
528
- Fetch URL blocks requests to private and internal network addresses:
453
+ ```json
454
+ {
455
+ "mcpServers": {
456
+ "fetch-url-mcp": {
457
+ "command": "npx",
458
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
459
+ }
460
+ }
461
+ }
462
+ ```
529
463
 
530
- - **Blocked hosts**: `localhost`, `127.0.0.0/8`, `10.0.0.0/8`, `172.16.0.0/12`, `192.168.0.0/16`, `169.254.0.0/16`, `100.64.0.0/10`
531
- - **Blocked IPv6**: `::1`, `fc00::/7`, `fe80::/10`, IPv4-mapped private addresses
532
- - **Cloud metadata**: `169.254.169.254` (AWS), `metadata.google.internal`, `metadata.azure.com`, `100.100.100.200` (Azure IMDS)
464
+ For more info, see [Roo Code MCP docs](https://docs.roocode.com/features/mcp/using-mcp-in-roo).
533
465
 
534
- DNS preflight checks run on every redirect hop to prevent DNS rebinding attacks.
466
+ </details>
535
467
 
536
- ### Stdio Transport Safety
468
+ <details>
469
+ <summary><b>Install in Kilo Code</b></summary>
537
470
 
538
- The server never writes non-protocol data to stdout. All logs and diagnostics go to stderr.
471
+ Add to your Kilo Code MCP settings:
539
472
 
540
- ### Rate Limiting
473
+ ```json
474
+ {
475
+ "mcpServers": {
476
+ "fetch-url-mcp": {
477
+ "command": "npx",
478
+ "args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
479
+ }
480
+ }
481
+ }
482
+ ```
541
483
 
542
- HTTP mode enforces a rate limit of 100 requests per 60-second window per client.
484
+ For more info, see [Kilo Code MCP docs](https://kilocode.ai/docs/features/mcp/using-mcp-servers).
543
485
 
544
- ### Content Safety
486
+ </details>
545
487
 
546
- - HTML downloads are capped at 10 MB
547
- - Worker threads run in isolation with configurable resource limits
548
- - Auth tokens are stored in-memory only and compared using timing-safe equality
488
+ ## Use Cases
549
489
 
550
- ## Development Workflow
490
+ - Fetch documentation pages, blog posts, or reference material into Markdown before sending them to an LLM.
491
+ - Retrieve repository-hosted content from GitHub, GitLab, or Bitbucket and let the server rewrite page URLs to raw endpoints when possible.
492
+ - Force a fresh fetch for time-sensitive pages with `forceRefresh`, or preserve navigation and boilerplate with `skipNoiseRemoval`.
493
+ - Use MCP task mode for large pages or slower sites when the inline response would otherwise be truncated or delayed.
551
494
 
552
- ### Install Dependencies
495
+ ## Architecture
553
496
 
554
- ```bash
555
- npm install
497
+ ```text
498
+ [MCP Client]
499
+ -> stdio -> `dist/index.js` -> `startStdioServer()` -> `createMcpServer()`
500
+ -> HTTP -> `dist/index.js --http` -> `startHttpServer()` -> `/mcp`
501
+
502
+ `createMcpServer()`
503
+ -> registers tool: `fetch-url`
504
+ -> registers prompt: `get-help`
505
+ -> registers resource: `internal://instructions`
506
+ -> enables logging, resources notifications, prompts, and task handlers
507
+
508
+ HTTP request flow
509
+ -> host/origin validation
510
+ -> CORS handling
511
+ -> rate limiting
512
+ -> authentication
513
+ -> health / OAuth metadata / download route dispatch
514
+ -> MCP session gateway for `POST /mcp`, `GET /mcp`, `DELETE /mcp`
515
+
516
+ Tool execution flow
517
+ -> validate input with `fetchUrlInputSchema`
518
+ -> fetch via shared pipeline
519
+ -> transform HTML to Markdown
520
+ -> validate structured output with `fetchUrlOutputSchema`
521
+ -> return text content plus `structuredContent`
556
522
  ```
557
523
 
558
- ### Scripts
559
-
560
- | Script | Command | Description |
561
- | --------------- | ----------------------- | -------------------------------------------- |
562
- | `dev` | `npm run dev` | TypeScript watch mode |
563
- | `dev:run` | `npm run dev:run` | Run compiled output with watch + `.env` |
564
- | `build` | `npm run build` | Clean, compile, copy assets, make executable |
565
- | `start` | `npm start` | Run compiled server |
566
- | `test` | `npm test` | Run test suite (Node.js native test runner) |
567
- | `test:coverage` | `npm run test:coverage` | Run tests with coverage |
568
- | `lint` | `npm run lint` | ESLint |
569
- | `lint:fix` | `npm run lint:fix` | ESLint with auto-fix |
570
- | `format` | `npm run format` | Prettier |
571
- | `type-check` | `npm run type-check` | TypeScript type checking |
572
- | `inspector` | `npm run inspector` | Build and launch MCP Inspector |
573
-
574
- ## Build and Release
524
+ ### Request Lifecycle
575
525
 
576
- ```bash
577
- npm run build # Clean → Compile → Copy Assets → chmod
578
- npm run prepublishOnly # Lint → Type-Check → Build
579
- npm publish # Publish to npm
526
+ ```text
527
+ [Client] -- initialize {protocolVersion, capabilities} --> [Server]
528
+ [Server] -- {protocolVersion, capabilities, serverInfo} --> [Client]
529
+ [Client] -- notifications/initialized --> [Server]
530
+ [Client] -- tools/call {name, arguments} --> [Server]
531
+ [Server] -- {content: [{type, text}], isError?} --> [Client]
580
532
  ```
581
533
 
582
- CI/CD is handled via a GitHub Actions workflow (`release.yml`) that runs lint, type-check, test, build, and publishes to npm with version bumping.
534
+ ## MCP Surface
583
535
 
584
- ## Troubleshooting
536
+ ### Tools
585
537
 
586
- ### MCP Inspector
538
+ #### `fetch-url`
539
+
540
+ Fetch public webpages and convert HTML into AI-readable Markdown. The tool is read-only, does not execute page JavaScript, can bypass cache with `forceRefresh`, and supports task mode for larger or slower fetches.
587
541
 
588
- Use the built-in inspector to test the server interactively:
542
+ | Parameter | Type | Required | Description |
543
+ | ------------------ | --------- | -------- | --------------------------------------------------------------------------------------- |
544
+ | `url` | `string` | yes | Target URL. Max 2048 chars. |
545
+ | `skipNoiseRemoval` | `boolean` | no | Preserve navigation/footers (disable noise filtering). |
546
+ | `forceRefresh` | `boolean` | no | Bypass cache and fetch fresh content. |
547
+ | `maxInlineChars` | `integer` | no | Inline markdown limit (0-10485760, 0=unlimited). Lower of this or global limit applies. |
589
548
 
590
- ```bash
591
- npm run inspector
549
+ <details>
550
+ <summary>Data Flow</summary>
551
+
552
+ ```text
553
+ 1. Client calls `fetch-url` with `url` and optional fetch flags.
554
+ 2. `fetchUrlInputSchema` validates the payload.
555
+ 3. `performSharedFetch()` downloads the page and applies cache policy.
556
+ 4. `markdownTransform()` converts the response body into Markdown and metadata.
557
+ 5. The result is assembled into `content` plus `structuredContent`.
558
+ 6. `fetchUrlOutputSchema` validates the structured payload before it is returned.
592
559
  ```
593
560
 
594
- ### Common Issues
561
+ </details>
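
The parameter table and data flow above can be sketched from the client side. The following is a minimal illustration, not the server's own validation: `buildFetchUrlCall` and the two interfaces are hypothetical names, and only the limits (2048 characters for `url`, 0 to 10485760 for `maxInlineChars`) are taken from the table.

```typescript
// Hypothetical client-side helper; field names and limits mirror the parameter table.
interface FetchUrlArgs {
  url: string;
  skipNoiseRemoval?: boolean;
  forceRefresh?: boolean;
  maxInlineChars?: number;
}

interface ToolsCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: "fetch-url"; arguments: FetchUrlArgs };
}

const MAX_URL_LENGTH = 2048;
const MAX_INLINE_CHARS = 10_485_760;

function buildFetchUrlCall(id: number, args: FetchUrlArgs): ToolsCallRequest {
  // Reject URLs longer than the documented maximum before sending anything.
  if (args.url.length > MAX_URL_LENGTH) {
    throw new RangeError(`url must be at most ${MAX_URL_LENGTH} characters`);
  }
  // maxInlineChars is documented as an integer in the range 0-10485760.
  const limit = args.maxInlineChars;
  if (
    limit !== undefined &&
    (!Number.isInteger(limit) || limit < 0 || limit > MAX_INLINE_CHARS)
  ) {
    throw new RangeError(`maxInlineChars must be an integer in 0-${MAX_INLINE_CHARS}`);
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "fetch-url", arguments: args },
  };
}
```

Checking the documented limits locally avoids a round trip for input the server is expected to reject; the server-side `fetchUrlInputSchema` stays the authority on what is valid.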
562
+
563
+ ### Resources
564
+
565
+ | Resource | URI | MIME Type | Description |
566
+ | ---------------------------- | ------------------------- | ------------- | -------------------------------------------- |
567
+ | `fetch-url-mcp-instructions` | `internal://instructions` | text/markdown | Guidance for using the Fetch URL MCP server. |
568
+
569
+ ### Prompts
570
+
571
+ | Prompt | Arguments | Description |
572
+ | ---------- | --------- | -------------------------------------------------------------------------------------------- |
573
+ | `get-help` | none | Return Fetch URL server instructions: workflows, cache usage, task mode, and error handling. |
574
+
575
+ ## MCP Capabilities
576
+
577
+ | Capability | Status | Evidence |
578
+ | ------------------------------- | --------- | ------------------------------------------------------------------------------------------------------------- |
579
+ | logging | confirmed | `createServerCapabilities()` advertises logging support and `SetLevelRequestSchema` is handled by the server. |
580
+ | resources subscribe/listChanged | confirmed | `createServerCapabilities()` enables resource subscriptions and list change notifications. |
581
+ | prompts | confirmed | `get-help` is registered during server startup. |
582
+ | tasks | confirmed | Task capabilities are advertised and task handlers are registered during startup. |
583
+ | progress notifications | confirmed | Tool execution reports progress through the task/progress helpers. |
584
+
585
+ ### Tool Annotations
586
+
587
+ | Annotation | Detected | Evidence |
588
+ | ----------------- | -------- | -------------------------- |
589
+ | `readOnlyHint` | yes | src/tools/fetch-url.ts:406 |
590
+ | `destructiveHint` | yes | src/tools/fetch-url.ts:407 |
591
+ | `openWorldHint` | yes | src/tools/fetch-url.ts:409 |
592
+ | `idempotentHint` | yes | src/tools/fetch-url.ts:408 |
593
+
594
+ ### Structured Output
595
+
596
+ - `fetch-url` publishes an explicit `outputSchema` and returns `structuredContent` when the output passes validation.
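
A client can prefer `structuredContent` and fall back to the text content when it is absent. The sketch below assumes only the result shape shown in the request lifecycle (`content`, `isError`) plus an optional `structuredContent` object; `readToolResult` and `ResultSummary` are hypothetical names, and the fields inside `structuredContent` are deliberately left untyped because the README does not list them.

```typescript
// Assumed result shape: text content items plus optional structuredContent.
interface ToolResult {
  content: Array<{ type: string; text?: string }>;
  structuredContent?: Record<string, unknown>;
  isError?: boolean;
}

interface ResultSummary {
  ok: boolean;
  text: string;
  structured?: Record<string, unknown>;
}

function readToolResult(result: ToolResult): ResultSummary {
  // Join all text items; non-text items are ignored in this sketch.
  const text = result.content
    .filter((item) => item.type === "text" && typeof item.text === "string")
    .map((item) => item.text as string)
    .join("\n");
  const summary: ResultSummary = { ok: result.isError !== true, text };
  // structuredContent is only present when the server's output passed validation.
  if (result.structuredContent !== undefined) {
    summary.structured = result.structuredContent;
  }
  return summary;
}
```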
597
+
598
+ ## Configuration
599
+
600
+ | Variable | Default | Applies To | Notes |
601
+ | ------------------------------------------ | ------------------------- | ----------------- | --------------------------------------------------------------------- |
602
+ | `HOST` | `127.0.0.1` | HTTP mode | Bind address. Non-loopback bindings also require `ALLOW_REMOTE=true`. |
603
+ | `PORT` | `3000` | HTTP mode | Listening port for `--http`. |
604
+ | `ALLOW_REMOTE` | `false` | HTTP mode | Must be enabled to bind to a non-loopback interface. |
605
+ | `ACCESS_TOKENS` | unset | HTTP mode | Comma/space separated static bearer tokens. |
606
+ | `API_KEY` | unset | HTTP mode | Alternate static token source for header auth. |
607
+ | `OAUTH_ISSUER_URL` | unset | HTTP mode | Enables OAuth mode when combined with the other OAuth URLs. |
608
+ | `OAUTH_AUTHORIZATION_URL` | unset | HTTP mode | Optional explicit authorization endpoint. |
609
+ | `OAUTH_TOKEN_URL` | unset | HTTP mode | Optional explicit token endpoint. |
610
+ | `OAUTH_INTROSPECTION_URL` | unset | HTTP mode | Required for OAuth token introspection. |
611
+ | `OAUTH_REQUIRED_SCOPES` | empty | HTTP mode | Required scopes enforced after auth. |
612
+ | `OAUTH_CLIENT_ID` | unset | HTTP mode | Optional introspection client ID. |
613
+ | `OAUTH_CLIENT_SECRET` | unset | HTTP mode | Optional introspection client secret. |
614
+ | `SERVER_TLS_KEY_FILE` | unset | HTTP mode | Enable HTTPS when set together with `SERVER_TLS_CERT_FILE`. |
615
+ | `SERVER_TLS_CERT_FILE` | unset | HTTP mode | TLS certificate path. |
616
+ | `SERVER_TLS_CA_FILE` | unset | HTTP mode | Optional custom CA bundle. |
617
+ | `SERVER_MAX_CONNECTIONS` | `0` | HTTP mode | Optional connection cap. |
618
+ | `SERVER_HEADERS_TIMEOUT_MS` | unset | HTTP mode | Optional Node server tuning. |
619
+ | `SERVER_REQUEST_TIMEOUT_MS` | unset | HTTP mode | Optional Node server tuning. |
620
+ | `SERVER_KEEP_ALIVE_TIMEOUT_MS` | unset | HTTP mode | Optional keep-alive tuning. |
621
+ | `SERVER_KEEP_ALIVE_TIMEOUT_BUFFER_MS` | unset | HTTP mode | Optional keep-alive tuning buffer. |
622
+ | `SERVER_MAX_HEADERS_COUNT` | unset | HTTP mode | Optional header count limit. |
623
+ | `SERVER_BLOCK_PRIVATE_CONNECTIONS` | `false` | HTTP mode | Enables inbound private-network protections. |
624
+ | `MCP_STRICT_PROTOCOL_VERSION_HEADER` | `true` | HTTP mode | Requires `MCP-Protocol-Version` on session init. |
625
+ | `ALLOWED_HOSTS` | empty | HTTP mode | Additional allowed `Host` and `Origin` values. |
626
+ | `ALLOW_LOCAL_FETCH` | `false` | Fetching | Allows local/loopback fetch targets. |
627
+ | `FETCH_TIMEOUT_MS` | `15000` | Fetching | Network fetch timeout in milliseconds. |
628
+ | `MAX_INLINE_CONTENT_CHARS` | `0` | Tool output | `0` means no explicit inline truncation limit. |
629
+ | `CACHE_ENABLED` | `true` | Caching | Enables in-memory fetch result caching. |
630
+ | `TASKS_MAX_TOTAL` | `5000` | Tasks | Total task capacity. |
631
+ | `TASKS_MAX_PER_OWNER` | `1000` | Tasks | Per-owner task cap, clamped to the total cap. |
632
+ | `TASKS_STATUS_NOTIFICATIONS` | `false` | Tasks | Enables status notifications for tasks. |
633
+ | `TRANSFORM_CANCEL_ACK_TIMEOUT_MS` | `200` | Transform workers | Cancellation acknowledgement timeout. |
634
+ | `TRANSFORM_WORKER_MODE` | `threads` | Transform workers | Worker execution mode. |
635
+ | `TRANSFORM_WORKER_MAX_OLD_GENERATION_MB` | unset | Transform workers | Optional worker memory limit. |
636
+ | `TRANSFORM_WORKER_MAX_YOUNG_GENERATION_MB` | unset | Transform workers | Optional worker memory limit. |
637
+ | `TRANSFORM_WORKER_CODE_RANGE_MB` | unset | Transform workers | Optional worker memory limit. |
638
+ | `TRANSFORM_WORKER_STACK_MB` | unset | Transform workers | Optional worker stack size. |
639
+ | `FETCH_URL_MCP_EXTRA_NOISE_TOKENS` | empty | Content cleanup | Extra noise-removal tokens. |
640
+ | `FETCH_URL_MCP_EXTRA_NOISE_SELECTORS` | empty | Content cleanup | Extra DOM selectors for noise removal. |
641
+ | `FETCH_URL_MCP_LOCALE` | system default | Content cleanup | Locale override for extraction heuristics. |
642
+ | `MARKDOWN_HEADING_KEYWORDS` | built-in list | Markdown cleanup | Override heading keywords used by cleanup. |
643
+ | `USER_AGENT` | `fetch-url-mcp/<version>` | Fetching | Override outbound user agent string. |
644
+ | `LOG_LEVEL` | `info` | Logging | `debug`, `info`, `warn`, or `error`. |
645
+ | `LOG_FORMAT` | `text` | Logging | `json` switches logger output format. |
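
Two rows in the table imply small parsing rules: `ACCESS_TOKENS` is comma/space separated, and the inline limit is the lower of the per-call `maxInlineChars` and the global `MAX_INLINE_CONTENT_CHARS`, with `0` meaning unlimited. The sketch below is one reading of those rules, not the server's code; both function names are hypothetical.

```typescript
// Split a comma/space separated token list, dropping empty entries.
function parseAccessTokens(raw: string | undefined): string[] {
  if (!raw) return [];
  return raw.split(/[\s,]+/).filter((token) => token.length > 0);
}

// Combine the per-call and global inline limits.
// Assumption: 0 means "no limit" on either side, so it never wins over a real limit.
function effectiveInlineLimit(requested: number, globalLimit: number): number {
  if (requested === 0) return globalLimit;
  if (globalLimit === 0) return requested;
  return Math.min(requested, globalLimit);
}
```

Under this reading, a call with `maxInlineChars: 2000` against a global limit of `500` is capped at 500, and a result of `0` only occurs when both limits are unset.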
646
+
647
+ ## HTTP Mode Endpoints
648
+
649
+ | Method | Path | Auth | Purpose |
650
+ | -------- | ------------------------------------------- | ------------------------------------------ | ------------------------------------------------------- |
651
+ | `GET` | `/health` | no, unless `?verbose=1` on a remote server | Basic health response, with optional diagnostics. |
652
+ | `GET` | `/.well-known/oauth-protected-resource` | no | OAuth protected-resource metadata. |
653
+ | `GET` | `/.well-known/oauth-protected-resource/mcp` | no | OAuth protected-resource metadata for the MCP endpoint. |
654
+ | `POST` | `/mcp` | yes | Session initialization and JSON-RPC requests. |
655
+ | `GET` | `/mcp` | yes | Session-bound server-to-client stream handling. |
656
+ | `DELETE` | `/mcp` | yes | Session shutdown. |
657
+ | `GET` | `/mcp/downloads/{namespace}/{hash}` | yes | Download route used by HTTP-mode fetch results. |
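
As an illustration of the authenticated `POST /mcp` row, the sketch below assembles a session-initialization request without sending it. It assumes a static bearer token and takes the protocol version as a parameter rather than guessing which versions the server accepts; `buildInitializeRequest`, `McpHttpRequest`, and the `example-client` name are placeholders.

```typescript
// Plain description of an HTTP request; pass it to an HTTP client of your choice.
interface McpHttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildInitializeRequest(
  baseUrl: string,
  token: string,
  protocolVersion: string,
): McpHttpRequest {
  return {
    // Strip trailing slashes so the path is always exactly /mcp.
    url: `${baseUrl.replace(/\/+$/, "")}/mcp`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
      Accept: "application/json, text/event-stream",
      // Required on session init while MCP_STRICT_PROTOCOL_VERSION_HEADER is true.
      "MCP-Protocol-Version": protocolVersion,
    },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion,
        capabilities: {},
        clientInfo: { name: "example-client", version: "0.0.0" }, // placeholder identity
      },
    }),
  };
}
```

With the default `HOST` and `PORT`, the base URL would be `http://127.0.0.1:3000`, and the returned object can be handed to `fetch(request.url, request)`.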
658
+
659
+ ## Security
660
+
661
+ | Control | Status | Notes |
662
+ | -------------------------- | ----------- | ------------------------------------------------------------------------ |
663
+ | Host and origin validation | implemented | HTTP requests are checked against an allowlist before dispatch. |
664
+ | Authentication | implemented | HTTP mode supports static bearer tokens or OAuth introspection. |
665
+ | Protocol version checks | implemented | Supported MCP protocol versions are validated on HTTP sessions. |
666
+ | Rate limiting | implemented | Requests pass through the HTTP rate limiter before route dispatch. |
667
+ | TLS | optional | HTTPS is enabled when both TLS key and certificate files are configured. |
668
+ | Stdio logging safety | implemented | Server logs are written to stderr, not stdout. |
669
+
670
+ ## Development
671
+
672
+ | Script | Command |
673
+ | ------------------------ | ------------------------------------------------------------------------------------------------------------------- |
674
+ | `clean` | `node scripts/tasks.mjs clean` |
675
+ | `build` | `node scripts/tasks.mjs build` |
676
+ | `copy:assets` | `node scripts/tasks.mjs copy:assets` |
677
+ | `prepare` | `npm run build` |
678
+ | `dev` | `tsc --watch --preserveWatchOutput` |
679
+ | `dev:run` | `node --env-file=.env --watch dist/index.js` |
680
+ | `start` | `node dist/index.js` |
681
+ | `format` | `prettier --write .` |
682
+ | `type-check` | `node scripts/tasks.mjs type-check` |
683
+ | `type-check:src` | `node node_modules/typescript/bin/tsc -p tsconfig.json --noEmit` |
684
+ | `type-check:tests` | `node node_modules/typescript/bin/tsc -p tsconfig.test.json --noEmit` |
685
+ | `type-check:diagnostics` | `tsc --noEmit --extendedDiagnostics` |
686
+ | `type-check:trace` | `node -e "require('fs').rmSync('.ts-trace',{recursive:true,force:true})" && tsc --noEmit --generateTrace .ts-trace` |
687
+ | `lint` | `eslint .` |
688
+ | `lint:tests` | `eslint src/__tests__` |
689
+ | `lint:fix` | `eslint . --fix` |
690
+ | `test` | `node scripts/tasks.mjs test` |
691
+ | `test:fast` | `node --test --import tsx/esm src/__tests__/**/*.test.ts node-tests/**/*.test.ts` |
692
+ | `test:coverage` | `node scripts/tasks.mjs test --coverage` |
693
+ | `knip` | `knip` |
694
+ | `knip:fix` | `knip --fix` |
695
+ | `inspector` | `npm run build && npx -y @modelcontextprotocol/inspector node dist/index.js --stdio` |
696
+ | `prepublishOnly` | `npm run lint && npm run type-check && npm run build` |
697
+
698
+ ## Build and Release
699
+
700
+ - CI workflows: `.github/workflows/docker-republish.yml` and `.github/workflows/release.yml`.
701
+ - A `Dockerfile` is present for Docker builds.
702
+ - `package.json` defines a publish/release script (`prepublishOnly`).
703
+
704
+ ## Troubleshooting
595
705
 
596
- | Issue | Solution |
597
- | ------------------------- | ------------------------------------------------------------------------------------- |
598
- | `VALIDATION_ERROR` on URL | URL is blocked (private IP/localhost) or malformed. Do not retry. |
599
- | `queue_full` error | Worker pool busy. Wait briefly, then retry or use async task mode. |
600
- | Garbled output | Binary content (images, PDFs) cannot be converted. Ensure the URL serves HTML. |
601
- | No output in stdio mode | If you intended HTTP mode, pass `--http`. Stdio is the default transport. |
602
- | Auth errors in HTTP mode | Set `ACCESS_TOKENS` or `API_KEY` env var and pass as `Authorization: Bearer <token>`. |
706
+ - For stdio mode, avoid writing logs to stdout; keep logs on stderr.
707
+ - For HTTP mode, verify MCP protocol headers and endpoint routing.
708
+ - Re-run discovery and fact extraction after surface changes to keep documentation aligned.
603
709
 
604
- ### Stdout / Stderr Guidance
710
+ ## Credits
605
711
 
606
- In stdio mode, **stdout** is reserved exclusively for MCP JSON-RPC messages. Logs and diagnostics are written to **stderr**. Never pipe stdout to a log file when using stdio transport.
712
+ | Dependency | Registry |
713
+ | ------------------------------------------------------------------------------------ | -------- |
714
+ | [@modelcontextprotocol/sdk](https://www.npmjs.com/package/@modelcontextprotocol/sdk) | npm |
715
+ | [@mozilla/readability](https://www.npmjs.com/package/@mozilla/readability) | npm |
716
+ | [linkedom](https://www.npmjs.com/package/linkedom) | npm |
717
+ | [node-html-markdown](https://www.npmjs.com/package/node-html-markdown) | npm |
718
+ | [undici](https://www.npmjs.com/package/undici) | npm |
719
+ | [zod](https://www.npmjs.com/package/zod) | npm |
607
720
 
608
- ## License
721
+ ## Contributing and License
609
722
 
610
- [MIT](https://opensource.org/licenses/MIT)
723
+ - License: MIT
724
+ - Contributions are welcome via pull requests.