@j0hanz/superfetch 2.2.2 → 2.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (85)
  1. package/README.md +363 -363
  2. package/dist/cache.d.ts +0 -1
  3. package/dist/cache.js +13 -25
  4. package/dist/config.d.ts +0 -1
  5. package/dist/config.js +9 -7
  6. package/dist/crypto.d.ts +0 -1
  7. package/dist/crypto.js +0 -1
  8. package/dist/dom-noise-removal.d.ts +0 -1
  9. package/dist/dom-noise-removal.js +35 -32
  10. package/dist/errors.d.ts +0 -1
  11. package/dist/errors.js +0 -1
  12. package/dist/fetch.d.ts +0 -1
  13. package/dist/fetch.js +45 -29
  14. package/dist/host-normalization.d.ts +1 -0
  15. package/dist/host-normalization.js +47 -0
  16. package/dist/http-native.d.ts +0 -1
  17. package/dist/http-native.js +73 -25
  18. package/dist/index.d.ts +0 -1
  19. package/dist/index.js +0 -1
  20. package/dist/instructions.md +41 -41
  21. package/dist/json.d.ts +0 -1
  22. package/dist/json.js +0 -1
  23. package/dist/language-detection.d.ts +0 -1
  24. package/dist/language-detection.js +10 -2
  25. package/dist/markdown-cleanup.d.ts +0 -1
  26. package/dist/markdown-cleanup.js +10 -10
  27. package/dist/mcp-validator.d.ts +14 -0
  28. package/dist/mcp-validator.js +22 -0
  29. package/dist/mcp.d.ts +0 -1
  30. package/dist/mcp.js +0 -1
  31. package/dist/observability.d.ts +0 -1
  32. package/dist/observability.js +5 -3
  33. package/dist/server-tuning.d.ts +9 -0
  34. package/dist/server-tuning.js +30 -0
  35. package/dist/{http-utils.d.ts → session.d.ts} +0 -25
  36. package/dist/{http-utils.js → session.js} +11 -104
  37. package/dist/tools.d.ts +0 -1
  38. package/dist/tools.js +19 -29
  39. package/dist/transform-types.d.ts +0 -1
  40. package/dist/transform-types.js +0 -1
  41. package/dist/transform.d.ts +0 -1
  42. package/dist/transform.js +85 -79
  43. package/dist/type-guards.d.ts +0 -1
  44. package/dist/type-guards.js +0 -1
  45. package/dist/workers/transform-worker.d.ts +0 -1
  46. package/dist/workers/transform-worker.js +29 -19
  47. package/package.json +85 -85
  48. package/dist/cache.d.ts.map +0 -1
  49. package/dist/cache.js.map +0 -1
  50. package/dist/config.d.ts.map +0 -1
  51. package/dist/config.js.map +0 -1
  52. package/dist/crypto.d.ts.map +0 -1
  53. package/dist/crypto.js.map +0 -1
  54. package/dist/dom-noise-removal.d.ts.map +0 -1
  55. package/dist/dom-noise-removal.js.map +0 -1
  56. package/dist/errors.d.ts.map +0 -1
  57. package/dist/errors.js.map +0 -1
  58. package/dist/fetch.d.ts.map +0 -1
  59. package/dist/fetch.js.map +0 -1
  60. package/dist/http-native.d.ts.map +0 -1
  61. package/dist/http-native.js.map +0 -1
  62. package/dist/http-utils.d.ts.map +0 -1
  63. package/dist/http-utils.js.map +0 -1
  64. package/dist/index.d.ts.map +0 -1
  65. package/dist/index.js.map +0 -1
  66. package/dist/json.d.ts.map +0 -1
  67. package/dist/json.js.map +0 -1
  68. package/dist/language-detection.d.ts.map +0 -1
  69. package/dist/language-detection.js.map +0 -1
  70. package/dist/markdown-cleanup.d.ts.map +0 -1
  71. package/dist/markdown-cleanup.js.map +0 -1
  72. package/dist/mcp.d.ts.map +0 -1
  73. package/dist/mcp.js.map +0 -1
  74. package/dist/observability.d.ts.map +0 -1
  75. package/dist/observability.js.map +0 -1
  76. package/dist/tools.d.ts.map +0 -1
  77. package/dist/tools.js.map +0 -1
  78. package/dist/transform-types.d.ts.map +0 -1
  79. package/dist/transform-types.js.map +0 -1
  80. package/dist/transform.d.ts.map +0 -1
  81. package/dist/transform.js.map +0 -1
  82. package/dist/type-guards.d.ts.map +0 -1
  83. package/dist/type-guards.js.map +0 -1
  84. package/dist/workers/transform-worker.d.ts.map +0 -1
  85. package/dist/workers/transform-worker.js.map +0 -1
package/README.md CHANGED
@@ -1,363 +1,363 @@
1
- <!-- markdownlint-disable MD033 -->
2
-
3
- # superFetch MCP Server
4
-
5
- Intelligent web content fetcher MCP server that converts HTML to clean, AI-readable Markdown.
6
-
7
- [![npm version](https://img.shields.io/npm/v/@j0hanz/superfetch.svg)](https://www.npmjs.com/package/@j0hanz/superfetch) [![license](https://img.shields.io/npm/l/@j0hanz/superfetch.svg)](https://www.npmjs.com/package/@j0hanz/superfetch) [![Node.js](https://img.shields.io/badge/Node.js-%3E=20.18.1-339933?logo=nodedotjs&logoColor=white)](https://nodejs.org/) [![TypeScript](https://img.shields.io/badge/TypeScript-5.9-3178C6?logo=typescript&logoColor=white)](https://www.typescriptlang.org/) [![MCP SDK](https://img.shields.io/badge/MCP%20SDK-1.25.x-6f42c1)](https://github.com/modelcontextprotocol/sdk)
8
-
9
- <img src="docs/logo.png" alt="SuperFetch MCP Logo" width="300">
10
-
11
- ## One-Click Install
12
-
13
- [![Install with NPX in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://insiders.vscode.dev/redirect/mcp/install?name=superfetch&inputs=%5B%5D&config=%7B%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%40j0hanz%2Fsuperfetch%40latest%22%2C%22--stdio%22%5D%7D)
14
- [![Install with NPX in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://insiders.vscode.dev/redirect/mcp/install?name=superfetch&inputs=%5B%5D&config=%7B%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%40j0hanz%2Fsuperfetch%40latest%22%2C%22--stdio%22%5D%7D&quality=insiders)
15
-
16
- [![Install in Cursor](https://cursor.com/deeplink/mcp-install-dark.svg)](https://cursor.com/install-mcp?name=superfetch&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBqMGhhbnovc3VwZXJmZXRjaEBsYXRlc3QiLCItLXN0ZGlvIl19)
17
-
18
- ## Overview
19
-
20
- | Feature | Details |
21
- | -------------------- | -------------------------------------------------------------------------- |
22
- | HTML → Markdown | Mozilla Readability + node-html-markdown pipeline with metadata injection. |
23
- | Raw content handling | Rewrites supported GitHub/GitLab/Bitbucket/Gist URLs to raw content. |
24
- | Caching + resources | LRU cache with resource listing and update notifications. |
25
- | Transport | Stdio (local clients) and Streamable HTTP (self-hosted). |
26
- | Safety | SSRF/IP blocklists, Host/Origin validation, auth for HTTP mode. |
27
-
28
- ### When to use
29
-
30
- - You need clean, AI-friendly Markdown from public http(s) URLs.
31
- - You want a single MCP tool that handles fetching, extraction, and caching.
32
- - You need self-hosted HTTP with auth and session management.
33
-
34
- ## Quick Start
35
-
36
- Recommended for MCP clients: stdio mode.
37
-
38
- ```bash
39
- npx -y @j0hanz/superfetch@latest --stdio
40
- ```
41
-
42
- Example MCP client configuration:
43
-
44
- ```json
45
- {
46
- "mcpServers": {
47
- "superFetch": {
48
- "command": "npx",
49
- "args": ["-y", "@j0hanz/superfetch@latest", "--stdio"]
50
- }
51
- }
52
- }
53
- ```
54
-
55
- ## Installation
56
-
57
- ### NPX (recommended)
58
-
59
- ```bash
60
- npx -y @j0hanz/superfetch@latest --stdio
61
- ```
62
-
63
- ### Global install
64
-
65
- ```bash
66
- npm install -g @j0hanz/superfetch
67
- superfetch --stdio
68
- ```
69
-
70
- ### From source
71
-
72
- ```bash
73
- git clone https://github.com/j0hanz/super-fetch-mcp-server.git
74
- cd super-fetch-mcp-server
75
- npm install
76
- npm run build
77
- node dist/index.js --stdio
78
- ```
79
-
80
- ## Configuration
81
-
82
- ### CLI arguments
83
-
84
- | Argument | Type | Default | Description |
85
- | --------- | ------- | ------- | ----------------------------------- |
86
- | `--stdio` | boolean | false | Run in stdio mode (no HTTP server). |
87
-
88
- ### Environment variables
89
-
90
- #### Core server settings
91
-
92
- | Variable | Default | Description |
93
- | ---------------------------------- | -------------------- | -------------------------------------------------------------- |
94
- | `HOST` | `127.0.0.1` | HTTP bind address. |
95
- | `PORT` | `3000` | HTTP server port (1024-65535, `0` for ephemeral). |
96
- | `USER_AGENT` | `superFetch-MCP/2.0` | User-Agent header for outgoing requests. |
97
- | `CACHE_ENABLED` | `true` | Enable response caching. |
98
- | `CACHE_TTL` | `3600` | Cache TTL in seconds (60-86400). |
99
- | `LOG_LEVEL` | `info` | Logging level (`debug` enables verbose logs). |
100
- | `ALLOW_REMOTE` | `false` | Allow non-loopback binds (OAuth required). |
101
- | `ALLOWED_HOSTS` | (empty) | Additional allowed Host/Origin values (comma/space separated). |
102
- | `TRANSFORM_TIMEOUT_MS` | `30000` | Worker transform timeout in ms (5000-120000). |
103
- | `TOOL_TIMEOUT_MS` | `50000` | Overall tool timeout in ms (1000-300000). |
104
- | `TRANSFORM_METADATA_FORMAT` | `markdown` | Metadata format: `markdown` or `frontmatter`. |
105
- | `SUPERFETCH_EXTRA_NOISE_TOKENS` | (empty) | Extra noise tokens for DOM noise removal. |
106
- | `SUPERFETCH_EXTRA_NOISE_SELECTORS` | (empty) | Extra CSS selectors for DOM noise removal. |
107
-
108
- #### HTTP server tuning (optional)
109
-
110
- | Variable | Default | Description |
111
- | ------------------------------ | ------- | --------------------------------------------- |
112
- | `SERVER_HEADERS_TIMEOUT_MS` | (unset) | Sets `server.headersTimeout` (1000-600000). |
113
- | `SERVER_REQUEST_TIMEOUT_MS` | (unset) | Sets `server.requestTimeout` (1000-600000). |
114
- | `SERVER_KEEP_ALIVE_TIMEOUT_MS` | (unset) | Sets `server.keepAliveTimeout` (1000-600000). |
115
- | `SERVER_SHUTDOWN_CLOSE_IDLE` | `false` | Close idle connections on shutdown. |
116
- | `SERVER_SHUTDOWN_CLOSE_ALL` | `false` | Close all connections on shutdown. |
117
-
118
- #### Auth (HTTP mode)
119
-
120
- | Variable | Default | Description |
121
- | --------------- | ------- | ---------------------------------------------------- |
122
- | `AUTH_MODE` | auto | `static` or `oauth` (auto-detected from OAuth URLs). |
123
- | `ACCESS_TOKENS` | (empty) | Comma/space-separated static bearer tokens. |
124
- | `API_KEY` | (empty) | Adds a static bearer token and enables `X-API-Key`. |
125
-
126
- Static mode requires at least one token (`ACCESS_TOKENS` or `API_KEY`).
127
-
128
- #### OAuth (HTTP mode)
129
-
130
- Required when `AUTH_MODE=oauth` (or auto-selected by OAuth URLs):
131
-
132
- | Variable | Default | Description |
133
- | ------------------------- | ------- | ----------------------- |
134
- | `OAUTH_ISSUER_URL` | - | OAuth issuer. |
135
- | `OAUTH_AUTHORIZATION_URL` | - | Authorization endpoint. |
136
- | `OAUTH_TOKEN_URL` | - | Token endpoint. |
137
- | `OAUTH_INTROSPECTION_URL` | - | Introspection endpoint. |
138
-
139
- Optional:
140
-
141
- | Variable | Default | Description |
142
- | -------------------------------- | -------------------------- | ---------------------------------------- |
143
- | `OAUTH_REVOCATION_URL` | - | Revocation endpoint. |
144
- | `OAUTH_REGISTRATION_URL` | - | Dynamic client registration endpoint. |
145
- | `OAUTH_RESOURCE_URL` | `http://<host>:<port>/mcp` | Protected resource URL. |
146
- | `OAUTH_REQUIRED_SCOPES` | (empty) | Required scopes (comma/space separated). |
147
- | `OAUTH_CLIENT_ID` | - | Client ID for introspection. |
148
- | `OAUTH_CLIENT_SECRET` | - | Client secret for introspection. |
149
- | `OAUTH_INTROSPECTION_TIMEOUT_MS` | `5000` | Introspection timeout (1000-30000). |
150
-
151
- ### HTTP mode endpoints
152
-
153
- | Method | Path | Auth | Notes |
154
- | ------ | --------------------------------- | ---- | -------------------------------------------------- |
155
- | GET | `/health` | No | Health check. |
156
- | POST | `/mcp` | Yes | Streamable HTTP JSON-RPC requests. |
157
- | GET | `/mcp` | Yes | SSE stream (requires `Accept: text/event-stream`). |
158
- | DELETE | `/mcp` | Yes | Close the session. |
159
- | GET | `/mcp/downloads/:namespace/:hash` | Yes | Download cached markdown. |
160
-
161
- Sessions are managed via the `mcp-session-id` header. A `POST /mcp` `initialize` request creates a session and returns the session id.
162
-
163
- ## API Reference
164
-
165
- ### Tools
166
-
167
- #### `fetch-url`
168
-
169
- Fetches a webpage and converts it to clean Markdown.
170
-
171
- ##### Parameters
172
-
173
- | Name | Type | Required | Default | Description |
174
- | ----- | ------ | -------- | ------- | ------------------------------------ |
175
- | `url` | string | Yes | - | Public http(s) URL, max length 2048. |
176
-
177
- ##### Returns
178
-
179
- `structuredContent` fields:
180
-
181
- - `url` (string): fetched URL
182
- - `inputUrl` (string, optional): original input URL
183
- - `resolvedUrl` (string, optional): normalized or raw-content URL
184
- - `title` (string, optional): page title
185
- - `markdown` (string, optional): markdown content (inline when available)
186
- - `error` (string, optional): error message on failure
187
-
188
- ##### Example success
189
-
190
- ```json
191
- {
192
- "url": "https://example.com/docs",
193
- "inputUrl": "https://example.com/docs",
194
- "resolvedUrl": "https://example.com/docs",
195
- "title": "Example Docs",
196
- "markdown": "# Getting Started\n\n..."
197
- }
198
- ```
199
-
200
- ##### Example error
201
-
202
- ```json
203
- {
204
- "url": "https://example.com/404",
205
- "error": "Failed to fetch URL: 404 Not Found"
206
- }
207
- ```
208
-
209
- ##### Large content handling
210
-
211
- - Inline markdown is capped at 20,000 characters.
212
- - When content exceeds the inline limit and cache is enabled, responses include a `resource_link` to `superfetch://cache/markdown/{urlHash}`.
213
- - If cache is disabled, inline content is truncated with `...[truncated]`.
214
-
215
- ### Resources
216
-
217
- | URI pattern | Description | MIME type |
218
- | --------------------------------------- | ------------------------------ | --------------- |
219
- | `superfetch://cache/markdown/{urlHash}` | Cached markdown content entry. | `text/markdown` |
220
- | `internal://instructions` | Server usage instructions. | `text/markdown` |
221
-
222
- ### Prompts
223
-
224
- No prompts are registered in this server.
225
-
226
- ## Client Configuration Examples
227
-
228
- <details>
229
- <summary><strong>VS Code</strong></summary>
230
-
231
- Add to .vscode/mcp.json:
232
-
233
- ```json
234
- {
235
- "servers": {
236
- "superFetch": {
237
- "command": "npx",
238
- "args": ["-y", "@j0hanz/superfetch@latest", "--stdio"]
239
- }
240
- }
241
- }
242
- ```
243
-
244
- </details>
245
-
246
- <details>
247
- <summary><strong>Claude Desktop</strong></summary>
248
-
249
- Add to claude_desktop_config.json:
250
-
251
- ```json
252
- {
253
- "mcpServers": {
254
- "superFetch": {
255
- "command": "npx",
256
- "args": ["-y", "@j0hanz/superfetch@latest", "--stdio"]
257
- }
258
- }
259
- }
260
- ```
261
-
262
- </details>
263
-
264
- <details>
265
- <summary><strong>Cursor</strong></summary>
266
-
267
- ```json
268
- {
269
- "mcpServers": {
270
- "superFetch": {
271
- "command": "npx",
272
- "args": ["-y", "@j0hanz/superfetch@latest", "--stdio"]
273
- }
274
- }
275
- }
276
- ```
277
-
278
- </details>
279
-
280
- <details>
281
- <summary><strong>Windsurf</strong></summary>
282
-
283
- ```json
284
- {
285
- "mcpServers": {
286
- "superFetch": {
287
- "command": "npx",
288
- "args": ["-y", "@j0hanz/superfetch@latest", "--stdio"]
289
- }
290
- }
291
- }
292
- ```
293
-
294
- </details>
295
-
296
- ## Security
297
-
298
- - Stdio logs are written to stderr (stdout is reserved for MCP traffic).
299
- - HTTP mode validates Host and Origin headers against allowed hosts.
300
- - HTTP mode requires `MCP-Protocol-Version: 2025-11-25`.
301
- - Auth is required for HTTP mode (static tokens or OAuth).
302
- - SSRF protections block private IP ranges and common metadata endpoints.
303
- - Rate limiting: 100 requests/minute per IP (60s window) for HTTP routes.
304
-
305
- ## Development
306
-
307
- ### Prerequisites
308
-
309
- - Node.js >= 20.18.1
310
- - npm
311
-
312
- ### Scripts
313
-
314
- | Script | Command | Purpose |
315
- | ---------------------- | ----------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------- |
316
- | clean | `node scripts/clean.mjs` | Remove build artifacts. |
317
- | validate:instructions | `node scripts/validate-instructions.mjs` | Validate embedded instructions. |
318
- | build | `npm run clean && tsc -p tsconfig.json && npm run validate:instructions && npm run copy:assets && node scripts/make-executable.mjs` | Build the server. |
319
- | copy:assets | `node scripts/copy-assets.mjs` | Copy static assets. |
320
- | prepare | `npm run build` | Prepare package for publishing. |
321
- | dev | `tsc --watch --preserveWatchOutput` | TypeScript watch mode. |
322
- | dev:run | `node --watch dist/index.js` | Run compiled server in watch mode. |
323
- | start | `node dist/index.js` | Start HTTP server (default). |
324
- | format | `prettier --write .` | Format codebase. |
325
- | type-check | `tsc --noEmit` | Type checking. |
326
- | type-check:diagnostics | `tsc --noEmit --extendedDiagnostics` | Type check diagnostics. |
327
- | type-check:trace | `tsc --noEmit --generateTrace .ts-trace` | Generate TS trace. |
328
- | lint | `eslint .` | Lint. |
329
- | lint:fix | `eslint . --fix` | Lint and fix. |
330
- | test | `npm run build --silent && node --test --experimental-transform-types` | Run tests (builds first). |
331
- | test:coverage | `npm run build --silent && node --test --experimental-transform-types --experimental-test-coverage` | Test with coverage. |
332
- | knip | `knip` | Dead code analysis. |
333
- | knip:fix | `knip --fix` | Fix knip issues. |
334
- | inspector | `npx @modelcontextprotocol/inspector` | MCP Inspector. |
335
- | prepublishOnly | `npm run lint && npm run type-check && npm run build` | Prepublish checks. |
336
-
337
- ### Project structure
338
-
339
- ```text
340
- superFetch
341
- ├── docs
342
- │ └── logo.png
343
- ├── src
344
- │ ├── workers
345
- │ ├── cache.ts
346
- │ ├── config.ts
347
- │ ├── fetch.ts
348
- │ ├── http-native.ts
349
- │ ├── http-utils.ts
350
- │ ├── index.ts
351
- │ ├── instructions.md
352
- │ ├── mcp.ts
353
- │ ├── tools.ts
354
- │ ├── transform.ts
355
- │ └── ...
356
- ├── tests
357
- │ └── *.test.ts
358
- ├── CONFIGURATION.md
359
- ├── package.json
360
- └── tsconfig.json
361
- ```
362
-
363
- <!-- markdownlint-enable MD033 -->
1
+ <!-- markdownlint-disable MD033 -->
2
+
3
+ # superFetch MCP Server
4
+
5
+ Intelligent web content fetcher MCP server that converts HTML to clean, AI-readable Markdown.
6
+
7
+ [![npm version](https://img.shields.io/npm/v/@j0hanz/superfetch.svg)](https://www.npmjs.com/package/@j0hanz/superfetch) [![license](https://img.shields.io/npm/l/@j0hanz/superfetch.svg)](https://www.npmjs.com/package/@j0hanz/superfetch) [![Node.js](https://img.shields.io/badge/Node.js-%3E=20.18.1-339933?logo=nodedotjs&logoColor=white)](https://nodejs.org/) [![TypeScript](https://img.shields.io/badge/TypeScript-5.9-3178C6?logo=typescript&logoColor=white)](https://www.typescriptlang.org/) [![MCP SDK](https://img.shields.io/badge/MCP%20SDK-1.25.x-6f42c1)](https://github.com/modelcontextprotocol/sdk)
8
+
9
+ <img src="docs/logo.png" alt="SuperFetch MCP Logo" width="300">
10
+
11
+ ## One-Click Install
12
+
13
+ [![Install with NPX in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://insiders.vscode.dev/redirect/mcp/install?name=superfetch&inputs=%5B%5D&config=%7B%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%40j0hanz%2Fsuperfetch%40latest%22%2C%22--stdio%22%5D%7D)
14
+ [![Install with NPX in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://insiders.vscode.dev/redirect/mcp/install?name=superfetch&inputs=%5B%5D&config=%7B%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%40j0hanz%2Fsuperfetch%40latest%22%2C%22--stdio%22%5D%7D&quality=insiders)
15
+
16
+ [![Install in Cursor](https://cursor.com/deeplink/mcp-install-dark.svg)](https://cursor.com/install-mcp?name=superfetch&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBqMGhhbnovc3VwZXJmZXRjaEBsYXRlc3QiLCItLXN0ZGlvIl19)
17
+
18
+ ## Overview
19
+
20
+ | Feature | Details |
21
+ | -------------------- | -------------------------------------------------------------------------- |
22
+ | HTML → Markdown | Mozilla Readability + node-html-markdown pipeline with metadata injection. |
23
+ | Raw content handling | Rewrites supported GitHub/GitLab/Bitbucket/Gist URLs to raw content. |
24
+ | Caching + resources | LRU cache with resource listing and update notifications. |
25
+ | Transport | Stdio (local clients) and Streamable HTTP (self-hosted). |
26
+ | Safety | SSRF/IP blocklists, Host/Origin validation, auth for HTTP mode. |
27
+
28
+ ### When to use
29
+
30
+ - You need clean, AI-friendly Markdown from public http(s) URLs.
31
+ - You want a single MCP tool that handles fetching, extraction, and caching.
32
+ - You need self-hosted HTTP with auth and session management.
33
+
34
+ ## Quick Start
35
+
36
+ Recommended for MCP clients: stdio mode.
37
+
38
+ ```bash
39
+ npx -y @j0hanz/superfetch@latest --stdio
40
+ ```
41
+
42
+ Example MCP client configuration:
43
+
44
+ ```json
45
+ {
46
+ "mcpServers": {
47
+ "superFetch": {
48
+ "command": "npx",
49
+ "args": ["-y", "@j0hanz/superfetch@latest", "--stdio"]
50
+ }
51
+ }
52
+ }
53
+ ```
54
+
55
+ ## Installation
56
+
57
+ ### NPX (recommended)
58
+
59
+ ```bash
60
+ npx -y @j0hanz/superfetch@latest --stdio
61
+ ```
62
+
63
+ ### Global install
64
+
65
+ ```bash
66
+ npm install -g @j0hanz/superfetch
67
+ superfetch --stdio
68
+ ```
69
+
70
+ ### From source
71
+
72
+ ```bash
73
+ git clone https://github.com/j0hanz/super-fetch-mcp-server.git
74
+ cd super-fetch-mcp-server
75
+ npm install
76
+ npm run build
77
+ node dist/index.js --stdio
78
+ ```
79
+
80
+ ## Configuration
81
+
82
+ ### CLI arguments
83
+
84
+ | Argument | Type | Default | Description |
85
+ | --------- | ------- | ------- | ----------------------------------- |
86
+ | `--stdio` | boolean | false | Run in stdio mode (no HTTP server). |
87
+
88
+ ### Environment variables
89
+
90
+ #### Core server settings
91
+
92
+ | Variable | Default | Description |
93
+ | ---------------------------------- | -------------------- | -------------------------------------------------------------- |
94
+ | `HOST` | `127.0.0.1` | HTTP bind address. |
95
+ | `PORT` | `3000` | HTTP server port (1024-65535, `0` for ephemeral). |
96
+ | `USER_AGENT` | `superFetch-MCP/2.0` | User-Agent header for outgoing requests. |
97
+ | `CACHE_ENABLED` | `true` | Enable response caching. |
98
+ | `CACHE_TTL` | `3600` | Cache TTL in seconds (60-86400). |
99
+ | `LOG_LEVEL` | `info` | Logging level (`debug` enables verbose logs). |
100
+ | `ALLOW_REMOTE` | `false` | Allow non-loopback binds (OAuth required). |
101
+ | `ALLOWED_HOSTS` | (empty) | Additional allowed Host/Origin values (comma/space separated). |
102
+ | `TRANSFORM_TIMEOUT_MS` | `30000` | Worker transform timeout in ms (5000-120000). |
103
+ | `TOOL_TIMEOUT_MS` | `50000` | Overall tool timeout in ms (1000-300000). |
104
+ | `TRANSFORM_METADATA_FORMAT` | `markdown` | Metadata format: `markdown` or `frontmatter`. |
105
+ | `SUPERFETCH_EXTRA_NOISE_TOKENS` | (empty) | Extra noise tokens for DOM noise removal. |
106
+ | `SUPERFETCH_EXTRA_NOISE_SELECTORS` | (empty) | Extra CSS selectors for DOM noise removal. |
107
+
108
+ #### HTTP server tuning (optional)
109
+
110
+ | Variable | Default | Description |
111
+ | ------------------------------ | ------- | --------------------------------------------- |
112
+ | `SERVER_HEADERS_TIMEOUT_MS` | (unset) | Sets `server.headersTimeout` (1000-600000). |
113
+ | `SERVER_REQUEST_TIMEOUT_MS` | (unset) | Sets `server.requestTimeout` (1000-600000). |
114
+ | `SERVER_KEEP_ALIVE_TIMEOUT_MS` | (unset) | Sets `server.keepAliveTimeout` (1000-600000). |
115
+ | `SERVER_SHUTDOWN_CLOSE_IDLE` | `false` | Close idle connections on shutdown. |
116
+ | `SERVER_SHUTDOWN_CLOSE_ALL` | `false` | Close all connections on shutdown. |
117
+
118
+ #### Auth (HTTP mode)
119
+
120
+ | Variable | Default | Description |
121
+ | --------------- | ------- | ---------------------------------------------------- |
122
+ | `AUTH_MODE` | auto | `static` or `oauth` (auto-detected from OAuth URLs). |
123
+ | `ACCESS_TOKENS` | (empty) | Comma/space-separated static bearer tokens. |
124
+ | `API_KEY` | (empty) | Adds a static bearer token and enables `X-API-Key`. |
125
+
126
+ Static mode requires at least one token (`ACCESS_TOKENS` or `API_KEY`).
127
+
128
+ #### OAuth (HTTP mode)
129
+
130
+ Required when `AUTH_MODE=oauth` (or auto-selected by OAuth URLs):
131
+
132
+ | Variable | Default | Description |
133
+ | ------------------------- | ------- | ----------------------- |
134
+ | `OAUTH_ISSUER_URL` | - | OAuth issuer. |
135
+ | `OAUTH_AUTHORIZATION_URL` | - | Authorization endpoint. |
136
+ | `OAUTH_TOKEN_URL` | - | Token endpoint. |
137
+ | `OAUTH_INTROSPECTION_URL` | - | Introspection endpoint. |
138
+
139
+ Optional:
140
+
141
+ | Variable | Default | Description |
142
+ | -------------------------------- | -------------------------- | ---------------------------------------- |
143
+ | `OAUTH_REVOCATION_URL` | - | Revocation endpoint. |
144
+ | `OAUTH_REGISTRATION_URL` | - | Dynamic client registration endpoint. |
145
+ | `OAUTH_RESOURCE_URL` | `http://<host>:<port>/mcp` | Protected resource URL. |
146
+ | `OAUTH_REQUIRED_SCOPES` | (empty) | Required scopes (comma/space separated). |
147
+ | `OAUTH_CLIENT_ID` | - | Client ID for introspection. |
148
+ | `OAUTH_CLIENT_SECRET` | - | Client secret for introspection. |
149
+ | `OAUTH_INTROSPECTION_TIMEOUT_MS` | `5000` | Introspection timeout (1000-30000). |
150
+
151
+ ### HTTP mode endpoints
152
+
153
+ | Method | Path | Auth | Notes |
154
+ | ------ | --------------------------------- | ---- | -------------------------------------------------- |
155
+ | GET | `/health` | No | Health check. |
156
+ | POST | `/mcp` | Yes | Streamable HTTP JSON-RPC requests. |
157
+ | GET | `/mcp` | Yes | SSE stream (requires `Accept: text/event-stream`). |
158
+ | DELETE | `/mcp` | Yes | Close the session. |
159
+ | GET | `/mcp/downloads/:namespace/:hash` | Yes | Download cached markdown. |
160
+
161
+ Sessions are managed via the `mcp-session-id` header. A `POST /mcp` `initialize` request creates a session and returns the session id.
162
+
163
+ ## API Reference
164
+
165
+ ### Tools
166
+
167
+ #### `fetch-url`
168
+
169
+ Fetches a webpage and converts it to clean Markdown.
170
+
171
+ ##### Parameters
172
+
173
+ | Name | Type | Required | Default | Description |
174
+ | ----- | ------ | -------- | ------- | ------------------------------------ |
175
+ | `url` | string | Yes | - | Public http(s) URL, max length 2048. |
176
+
177
+ ##### Returns
178
+
179
+ `structuredContent` fields:
180
+
181
+ - `url` (string): fetched URL
182
+ - `inputUrl` (string, optional): original input URL
183
+ - `resolvedUrl` (string, optional): normalized or raw-content URL
184
+ - `title` (string, optional): page title
185
+ - `markdown` (string, optional): markdown content (inline when available)
186
+ - `error` (string, optional): error message on failure
187
+
188
+ ##### Example success
189
+
190
+ ```json
191
+ {
192
+ "url": "https://example.com/docs",
193
+ "inputUrl": "https://example.com/docs",
194
+ "resolvedUrl": "https://example.com/docs",
195
+ "title": "Example Docs",
196
+ "markdown": "# Getting Started\n\n..."
197
+ }
198
+ ```
199
+
200
+ ##### Example error
201
+
202
+ ```json
203
+ {
204
+ "url": "https://example.com/404",
205
+ "error": "Failed to fetch URL: 404 Not Found"
206
+ }
207
+ ```
208
+
209
+ ##### Large content handling
210
+
211
+ - Inline markdown is capped at 20,000 characters.
212
+ - When content exceeds the inline limit and cache is enabled, responses include a `resource_link` to `superfetch://cache/markdown/{urlHash}`.
213
+ - If cache is disabled, inline content is truncated with `...[truncated]`.
214
+
215
+ ### Resources
216
+
217
+ | URI pattern | Description | MIME type |
218
+ | --------------------------------------- | ------------------------------ | --------------- |
219
+ | `superfetch://cache/markdown/{urlHash}` | Cached markdown content entry. | `text/markdown` |
220
+ | `internal://instructions` | Server usage instructions. | `text/markdown` |
221
+
222
+ ### Prompts
223
+
224
+ No prompts are registered in this server.
225
+
226
+ ## Client Configuration Examples
227
+
228
+ <details>
229
+ <summary><strong>VS Code</strong></summary>
230
+
231
+ Add to .vscode/mcp.json:
232
+
233
+ ```json
234
+ {
235
+ "servers": {
236
+ "superFetch": {
237
+ "command": "npx",
238
+ "args": ["-y", "@j0hanz/superfetch@latest", "--stdio"]
239
+ }
240
+ }
241
+ }
242
+ ```
243
+
244
+ </details>
245
+
246
+ <details>
247
+ <summary><strong>Claude Desktop</strong></summary>
248
+
249
+ Add to claude_desktop_config.json:
250
+
251
+ ```json
252
+ {
253
+ "mcpServers": {
254
+ "superFetch": {
255
+ "command": "npx",
256
+ "args": ["-y", "@j0hanz/superfetch@latest", "--stdio"]
257
+ }
258
+ }
259
+ }
260
+ ```
261
+
262
+ </details>
263
+
264
+ <details>
265
+ <summary><strong>Cursor</strong></summary>
266
+
267
+ ```json
268
+ {
269
+ "mcpServers": {
270
+ "superFetch": {
271
+ "command": "npx",
272
+ "args": ["-y", "@j0hanz/superfetch@latest", "--stdio"]
273
+ }
274
+ }
275
+ }
276
+ ```
277
+
278
+ </details>
279
+
280
+ <details>
281
+ <summary><strong>Windsurf</strong></summary>
282
+
283
+ ```json
284
+ {
285
+ "mcpServers": {
286
+ "superFetch": {
287
+ "command": "npx",
288
+ "args": ["-y", "@j0hanz/superfetch@latest", "--stdio"]
289
+ }
290
+ }
291
+ }
292
+ ```
293
+
294
+ </details>
295
+
296
+ ## Security
297
+
298
+ - Stdio logs are written to stderr (stdout is reserved for MCP traffic).
299
+ - HTTP mode validates Host and Origin headers against allowed hosts.
300
+ - HTTP mode requires `MCP-Protocol-Version: 2025-11-25`.
301
+ - Auth is required for HTTP mode (static tokens or OAuth).
302
+ - SSRF protections block private IP ranges and common metadata endpoints.
303
+ - Rate limiting: 100 requests/minute per IP (60s window) for HTTP routes.
304
+
305
+ ## Development
306
+
307
+ ### Prerequisites
308
+
309
+ - Node.js >= 20.18.1
310
+ - npm
311
+
312
+ ### Scripts
313
+
314
+ | Script | Command | Purpose |
315
+ | ---------------------- | ----------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------- |
316
+ | clean | `node scripts/clean.mjs` | Remove build artifacts. |
317
+ | validate:instructions | `node scripts/validate-instructions.mjs` | Validate embedded instructions. |
318
+ | build | `npm run clean && tsc -p tsconfig.json && npm run validate:instructions && npm run copy:assets && node scripts/make-executable.mjs` | Build the server. |
319
+ | copy:assets | `node scripts/copy-assets.mjs` | Copy static assets. |
320
+ | prepare | `npm run build` | Prepare package for publishing. |
321
+ | dev | `tsc --watch --preserveWatchOutput` | TypeScript watch mode. |
322
+ | dev:run | `node --watch dist/index.js` | Run compiled server in watch mode. |
323
+ | start | `node dist/index.js` | Start HTTP server (default). |
324
+ | format | `prettier --write .` | Format codebase. |
325
+ | type-check | `tsc --noEmit` | Type checking. |
326
+ | type-check:diagnostics | `tsc --noEmit --extendedDiagnostics` | Type check diagnostics. |
327
+ | type-check:trace | `tsc --noEmit --generateTrace .ts-trace` | Generate TS trace. |
328
+ | lint | `eslint .` | Lint. |
329
+ | lint:fix | `eslint . --fix` | Lint and fix. |
330
+ | test | `npm run build --silent && node --test --experimental-transform-types` | Run tests (builds first). |
331
+ | test:coverage | `npm run build --silent && node --test --experimental-transform-types --experimental-test-coverage` | Test with coverage. |
332
+ | knip | `knip` | Dead code analysis. |
333
+ | knip:fix | `knip --fix` | Fix knip issues. |
334
+ | inspector | `npx @modelcontextprotocol/inspector` | MCP Inspector. |
335
+ | prepublishOnly | `npm run lint && npm run type-check && npm run build` | Prepublish checks. |
336
+
337
+ ### Project structure
338
+
339
+ ```text
340
+ superFetch
341
+ ├── docs
342
+ │ └── logo.png
343
+ ├── src
344
+ │ ├── workers
345
+ │ ├── cache.ts
346
+ │ ├── config.ts
347
+ │ ├── fetch.ts
348
+ │ ├── http-native.ts
349
+ │ ├── http-utils.ts
350
+ │ ├── index.ts
351
+ │ ├── instructions.md
352
+ │ ├── mcp.ts
353
+ │ ├── tools.ts
354
+ │ ├── transform.ts
355
+ │ └── ...
356
+ ├── tests
357
+ │ └── *.test.ts
358
+ ├── CONFIGURATION.md
359
+ ├── package.json
360
+ └── tsconfig.json
361
+ ```
362
+
363
+ <!-- markdownlint-enable MD033 -->
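
The "Large content handling" rules in the README (unchanged between 2.2.2 and 2.3.0) can be illustrated with a minimal sketch. This is not the package's implementation: `planResponse`, `InlineResult`, and the constant names are hypothetical. The sketch also assumes the truncation marker is appended after the first 20,000 characters, which the README does not specify.

```typescript
// Hypothetical sketch of the README's "Large content handling" rules.
// None of these names come from the @j0hanz/superfetch source.

const INLINE_LIMIT = 20_000; // inline markdown cap stated in the README
const TRUNCATION_MARKER = "...[truncated]"; // marker stated in the README

type InlineResult =
  | { kind: "inline"; markdown: string }
  | { kind: "resource_link"; uri: string }
  | { kind: "truncated"; markdown: string };

function planResponse(
  markdown: string,
  cacheEnabled: boolean,
  urlHash: string,
): InlineResult {
  // Content within the cap is returned inline as-is.
  if (markdown.length <= INLINE_LIMIT) {
    return { kind: "inline", markdown };
  }
  // Over the cap with cache enabled: point at the cached resource URI.
  if (cacheEnabled) {
    return {
      kind: "resource_link",
      uri: `superfetch://cache/markdown/${urlHash}`,
    };
  }
  // Over the cap with cache disabled: truncate and append the marker.
  // Assumption: the marker is added on top of the first INLINE_LIMIT chars.
  return {
    kind: "truncated",
    markdown: markdown.slice(0, INLINE_LIMIT) + TRUNCATION_MARKER,
  };
}
```

Whether the real server also includes a shortened inline preview alongside the `resource_link` is not stated in the README, so the sketch returns only the link in that case.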