@cyanheads/mcp-ts-core 0.6.10 → 0.6.11

package/CLAUDE.md CHANGED
@@ -1,6 +1,6 @@
  # Agent Protocol
 
- **Package:** `@cyanheads/mcp-ts-core` · **Version:** 0.6.10
+ **Package:** `@cyanheads/mcp-ts-core` · **Version:** 0.6.11
  **npm:** [@cyanheads/mcp-ts-core](https://www.npmjs.com/package/@cyanheads/mcp-ts-core) · **Docker:** [ghcr.io/cyanheads/mcp-ts-core](https://ghcr.io/cyanheads/mcp-ts-core)
 
  > **Developer note:** Never assume. Read related files and docs before making changes. Read full file content for context. Never edit a file before reading it.
package/README.md CHANGED
@@ -5,7 +5,7 @@
 
  <div align="center">
 
- [![Version](https://img.shields.io/badge/Version-0.6.10-blue.svg?style=flat-square)](./CHANGELOG.md) [![MCP Spec](https://img.shields.io/badge/MCP%20Spec-2025--11--25-8A2BE2.svg?style=flat-square)](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/docs/specification/2025-11-25/changelog.mdx) [![MCP SDK](https://img.shields.io/badge/MCP%20SDK-^1.29.0-green.svg?style=flat-square)](https://modelcontextprotocol.io/) [![License](https://img.shields.io/badge/License-Apache%202.0-orange.svg?style=flat-square)](./LICENSE)
+ [![Version](https://img.shields.io/badge/Version-0.6.11-blue.svg?style=flat-square)](./CHANGELOG.md) [![MCP Spec](https://img.shields.io/badge/MCP%20Spec-2025--11--25-8A2BE2.svg?style=flat-square)](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/docs/specification/2025-11-25/changelog.mdx) [![MCP SDK](https://img.shields.io/badge/MCP%20SDK-^1.29.0-green.svg?style=flat-square)](https://modelcontextprotocol.io/) [![License](https://img.shields.io/badge/License-Apache%202.0-orange.svg?style=flat-square)](./LICENSE)
 
  [![TypeScript](https://img.shields.io/badge/TypeScript-^6.0.3-3178C6.svg?style=flat-square)](https://www.typescriptlang.org/) [![Bun](https://img.shields.io/badge/Bun-v1.3.2-blueviolet.svg?style=flat-square)](https://bun.sh/)
 
package/biome.json CHANGED
@@ -1,5 +1,5 @@
  {
- "$schema": "https://biomejs.dev/schemas/2.4.12/schema.json",
+ "$schema": "https://biomejs.dev/schemas/2.4.13/schema.json",
  "vcs": {
  "enabled": true,
  "clientKind": "git",
@@ -0,0 +1,23 @@
+ ---
+ summary: Add HtmlExtractor Tier 3 utility — wraps defuddle + linkedom for extracting main article content and metadata from raw HTML into Markdown or cleaned HTML
+ breaking: false
+ ---
+
+ # 0.6.11 — 2026-04-23
+
+ Adds `HtmlExtractor`, a new Tier 3 parsing utility for turning raw HTML into clean article content plus best-effort metadata. Built for MCP servers that wrap scholarly or article APIs and need to hand page content to an LLM without hand-rolling extraction.
+
+ ## Added
+
+ - **`HtmlExtractor` (`htmlExtractor` singleton)** in `src/utils/parsing/htmlExtractor.ts` — wraps [`defuddle`](https://github.com/kepano/defuddle) (modern Readability successor, powers Obsidian Web Clipper) together with [`linkedom`](https://github.com/WebReflection/linkedom) for DOM parsing. Exported from `@cyanheads/mcp-ts-core/utils`. One method:
+ - `extract(html, options?, context?)` — returns `{ title?, author?, description?, content, domain?, favicon?, image?, language?, metaTags?, parseTime?, published?, schemaOrgData?, site?, wordCount? }`. Only `content` is guaranteed; all other fields are best-effort based on what the source page exposes.
+ - **Options** on `ExtractArticleOptions`: `format` (`'markdown' | 'html'`, defaults to `'markdown'`), `url`, `contentSelector`, `removeImages`, `debug`, `language`, `useAsync` (off by default — keeps extraction local and deterministic; opt in to allow Defuddle's third-party API fallbacks for SPAs like Twitter).
+ - **Peer dependencies** (both optional): `defuddle ^0.18.1`, `linkedom ^0.18.12`. Install with `bun add defuddle linkedom`. Cloudflare Workers note: linkedom works in Workers but adds ~150KB minified plus entity tables — factor into your Worker size budget. JSDOM is also supported via defuddle's node entry but is not the default.
+ - **Tests** at `tests/unit/utils/parsing/htmlExtractor.test.ts` covering clean articles, metadata extraction, boilerplate removal (nav/sidebar/footer), markdown vs HTML output, `contentSelector` override, `removeImages`, empty input (→ `ValidationError`), SPA shells, and malformed HTML.
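Because only `content` is guaranteed in the `extract` result, calling code has to treat every other field as possibly absent. The following is a minimal sketch of that handling, not code from the package: `ExtractedArticle` and `toLlmContext` are hypothetical names, the interface mirrors only a subset of the fields listed above, and the field types (strings, a numeric `wordCount`) are assumptions, since the changelog names the fields without typing them.

```typescript
// Hypothetical mirror of a subset of the result fields named in the changelog.
// Field types are assumptions; the changelog lists names only.
interface ExtractedArticle {
  content: string; // the only field the changelog guarantees
  title?: string;
  author?: string;
  published?: string;
  site?: string;
  wordCount?: number;
}

// Build a compact text block for an LLM, skipping metadata the page did not expose.
function toLlmContext(article: ExtractedArticle): string {
  const header: string[] = [];
  if (article.title) header.push(`# ${article.title}`);

  // Join whichever byline parts are present; absent parts are dropped.
  const byline = [article.author, article.site, article.published]
    .filter((part): part is string => Boolean(part))
    .join(" · ");
  if (byline) header.push(byline);

  if (article.wordCount !== undefined) header.push(`Words: ${article.wordCount}`);

  // Content always comes last, separated from the header by blank lines.
  return [...header, article.content].join("\n\n");
}
```

With the optional peers installed, the input would presumably come from `htmlExtractor.extract(html, { format: 'markdown' })`; whether that call returns a promise is not stated in the changelog, so check the exported types before relying on it.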
+
+ ## Changed
+
+ - **`skills/field-test`** rewritten to v2.0 — pivots from "use the MCP tools already connected in your client" to "start the HTTP server locally and drive it with curl + JSON-RPC." Adds a reusable bash helper (`mcp_start`/`mcp_init`/`mcp_call`/`mcp_stop`) that persists PID/URL/session state across tool invocations, replaces the per-definition category matrix with a universal battery (happy path, parity, input error) plus trigger-gated situational categories so it scales to large servers, and tightens the report format (summary paragraph → grouped findings → numbered cherry-pick options).
+ - **`biome.json`** `$schema` URL bumped to `2.4.13` to match the `@biomejs/biome` version already in devDependencies.
+
+ Resolves [#46](https://github.com/cyanheads/mcp-ts-core/issues/46).
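The `skills/field-test` rewrite drives a locally started HTTP server with curl and JSON-RPC. The skill's own helper is bash and is not reproduced in this diff; the sketch below only illustrates the kind of payloads and headers such a session carries, written in TypeScript. The function names and the `field-test-sketch` client name are hypothetical. The protocol version string is taken from the README's MCP Spec badge, and `scoped_echo` with a `message` argument is borrowed from the test logs in this diff.

```typescript
// JSON-RPC 2.0 request envelope, as used by MCP over HTTP.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// First request of a session: the MCP `initialize` handshake.
function initializeRequest(id: number): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "initialize",
    params: {
      protocolVersion: "2025-11-25", // from the README badge
      capabilities: {},
      clientInfo: { name: "field-test-sketch", version: "0.0.0" }, // hypothetical client
    },
  };
}

// A `tools/call` request for one tool invocation.
function toolCallRequest(
  id: number,
  name: string,
  args: Record<string, unknown>,
): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Headers for a POST to the MCP endpoint; the session header is added once
// the server has issued a session ID in its initialize response.
function mcpHeaders(sessionId?: string): Record<string, string> {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
    Accept: "application/json, text/event-stream",
  };
  if (sessionId) headers["Mcp-Session-Id"] = sessionId;
  return headers;
}
```

Serializing `toolCallRequest(2, "scoped_echo", { message: "hi" })` with `JSON.stringify` gives a body that could be passed to `curl -d` against the `/mcp` path seen in the logs. A server with auth enabled would also need an `Authorization: Bearer` header, as the 0.6.11 log entries show.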
@@ -1,4 +1,4 @@
- {"level":50,"time":1776955661645,"env":"testing","version":"0.0.0-test","pid":87698,"requestId":"H425Q-YZMFD","timestamp":"2026-04-23T14:47:41.643Z","operation":"HandleToolRequest","input":{"message":"blocked"},"critical":false,"errorCode":-32005,"originalErrorType":"McpError","finalErrorType":"McpError","sessionId":"6d54688d1f715b37c579af1c10830fb526330c8eb39f929290a28bb4cebbc41e","toolName":"scoped_echo","tenantId":"authz-tenant","auth":{"sub":"authz-user","scopes":["tool:other:read"],"clientId":"authz-client","tenantId":"authz-tenant"},"errorData":{"sessionId":"6d54688d1f715b37c579af1c10830fb526330c8eb39f929290a28bb4cebbc41e","toolName":"scoped_echo","input":{"message":"blocked"},"requestId":"H425Q-YZMFD","timestamp":"2026-04-23T14:47:41.643Z","tenantId":"authz-tenant","operation":"HandleToolRequest","auth":{"sub":"authz-user","scopes":["tool:other:read"],"clientId":"authz-client","tenantId":"authz-tenant"},"originalErrorName":"McpError","originalMessage":"Insufficient permissions.","originalStack":"McpError: Insufficient permissions.\n at forbidden (/Users/casey/Developer/github/mcp-ts-core/dist/types-global/errors.js:84:58)\n at withRequiredScopes (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/lib/authUtils.js:61:15)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/tools/utils/toolHandlerFactory.js:68:17)\n at executeToolHandler (/Users/casey/Developer/github/mcp-ts-core/node_modules/@modelcontextprotocol/sdk/dist/esm/server/mcp.js:231:34)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/node_modules/@modelcontextprotocol/sdk/dist/esm/server/mcp.js:126:43)\n at processTicksAndRejections (native:7:39)"},"stack":"McpError: Insufficient permissions.\n at handleError (/Users/casey/Developer/github/mcp-ts-core/dist/utils/internal/error-handler/errorHandler.js:168:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/tools/utils/toolHandlerFactory.js:101:42)\n at executeToolHandler (/Users/casey/Developer/github/mcp-ts-core/node_modules/@modelcontextprotocol/sdk/dist/esm/server/mcp.js:231:34)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/node_modules/@modelcontextprotocol/sdk/dist/esm/server/mcp.js:126:43)\n at processTicksAndRejections (native:7:39)","msg":"Error in tool:scoped_echo: Insufficient permissions."}
- {"level":50,"time":1776955662508,"env":"testing","version":"0.6.10","pid":87732,"requestId":"RD7TQ-YRWS5","timestamp":"2026-04-23T14:47:42.507Z","operation":"httpErrorHandler","critical":false,"errorCode":-32006,"originalErrorType":"McpError","finalErrorType":"McpError","path":"/mcp","method":"POST","errorData":{"path":"/mcp","method":"POST","requestId":"RD7TQ-YRWS5","timestamp":"2026-04-23T14:47:42.507Z","operation":"httpErrorHandler","originalErrorName":"McpError","originalMessage":"Missing or invalid Authorization header. Bearer scheme required.","originalStack":"McpError: Missing or invalid Authorization header. Bearer scheme required.\n at unauthorized (/Users/casey/Developer/github/mcp-ts-core/dist/types-global/errors.js:86:61)\n at authMiddleware (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/authMiddleware.js:64:19)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpTransport.js:119:22)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at cors2 (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/middleware/cors/index.js:82:11)\n at processTicksAndRejections (native:7:39)"},"stack":"McpError: Missing or invalid Authorization header. Bearer scheme required.\n at handleError (/Users/casey/Developer/github/mcp-ts-core/dist/utils/internal/error-handler/errorHandler.js:168:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpErrorHandler.js:59:39)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:26:25)\n at processTicksAndRejections (native:7:39)","msg":"Error in httpTransport: Missing or invalid Authorization header. Bearer scheme required."}
- {"level":50,"time":1776955662525,"env":"testing","version":"0.6.10","pid":87732,"requestId":"UHB72-SFY34","timestamp":"2026-04-23T14:47:42.524Z","operation":"httpErrorHandler","critical":false,"errorCode":-32006,"originalErrorType":"McpError","finalErrorType":"McpError","path":"/mcp","method":"POST","errorData":{"path":"/mcp","method":"POST","requestId":"UHB72-SFY34","timestamp":"2026-04-23T14:47:42.524Z","operation":"httpErrorHandler","originalErrorName":"McpError","originalMessage":"Token has expired.","originalStack":"McpError: Token has expired.\n at unauthorized (/Users/casey/Developer/github/mcp-ts-core/dist/types-global/errors.js:86:61)\n at handleJoseVerifyError (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/lib/claimParser.js:56:11)\n at verify (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/strategies/jwtStrategy.js:91:13)\n at processTicksAndRejections (native:7:39)"},"stack":"McpError: Token has expired.\n at handleError (/Users/casey/Developer/github/mcp-ts-core/dist/utils/internal/error-handler/errorHandler.js:168:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpErrorHandler.js:59:39)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:26:25)\n at processTicksAndRejections (native:7:39)","msg":"Error in httpTransport: Token has expired."}
- {"level":50,"time":1776955662529,"env":"testing","version":"0.6.10","pid":87732,"requestId":"TCIU4-1PP9W","timestamp":"2026-04-23T14:47:42.529Z","operation":"httpErrorHandler","critical":false,"errorCode":-32006,"originalErrorType":"McpError","finalErrorType":"McpError","path":"/mcp","method":"GET","errorData":{"path":"/mcp","method":"GET","requestId":"TCIU4-1PP9W","timestamp":"2026-04-23T14:47:42.529Z","operation":"httpErrorHandler","originalErrorName":"McpError","originalMessage":"Missing or invalid Authorization header. Bearer scheme required.","originalStack":"McpError: Missing or invalid Authorization header. Bearer scheme required.\n at unauthorized (/Users/casey/Developer/github/mcp-ts-core/dist/types-global/errors.js:86:61)\n at authMiddleware (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/authMiddleware.js:64:19)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpTransport.js:119:22)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at cors2 (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/middleware/cors/index.js:82:11)\n at processTicksAndRejections (native:7:39)"},"stack":"McpError: Missing or invalid Authorization header. Bearer scheme required.\n at handleError (/Users/casey/Developer/github/mcp-ts-core/dist/utils/internal/error-handler/errorHandler.js:168:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpErrorHandler.js:59:39)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:26:25)\n at processTicksAndRejections (native:7:39)","msg":"Error in httpTransport: Missing or invalid Authorization header. Bearer scheme required."}
+ {"level":50,"time":1776964260716,"env":"testing","version":"0.6.11","pid":41913,"requestId":"39HTQ-Z1TY2","timestamp":"2026-04-23T17:11:00.716Z","operation":"httpErrorHandler","critical":false,"errorCode":-32006,"originalErrorType":"McpError","finalErrorType":"McpError","path":"/mcp","method":"POST","errorData":{"path":"/mcp","method":"POST","requestId":"39HTQ-Z1TY2","timestamp":"2026-04-23T17:11:00.716Z","operation":"httpErrorHandler","originalErrorName":"McpError","originalMessage":"Missing or invalid Authorization header. Bearer scheme required.","originalStack":"McpError: Missing or invalid Authorization header. Bearer scheme required.\n at unauthorized (/Users/casey/Developer/github/mcp-ts-core/dist/types-global/errors.js:86:61)\n at authMiddleware (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/authMiddleware.js:64:19)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpTransport.js:119:22)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at cors2 (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/middleware/cors/index.js:82:11)\n at processTicksAndRejections (native:7:39)"},"stack":"McpError: Missing or invalid Authorization header. Bearer scheme required.\n at handleError (/Users/casey/Developer/github/mcp-ts-core/dist/utils/internal/error-handler/errorHandler.js:168:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpErrorHandler.js:59:39)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:26:25)\n at processTicksAndRejections (native:7:39)","msg":"Error in httpTransport: Missing or invalid Authorization header. Bearer scheme required."}
+ {"level":50,"time":1776964260730,"env":"testing","version":"0.6.11","pid":41913,"requestId":"96UGI-8URT9","timestamp":"2026-04-23T17:11:00.730Z","operation":"httpErrorHandler","critical":false,"errorCode":-32006,"originalErrorType":"McpError","finalErrorType":"McpError","path":"/mcp","method":"POST","errorData":{"path":"/mcp","method":"POST","requestId":"96UGI-8URT9","timestamp":"2026-04-23T17:11:00.730Z","operation":"httpErrorHandler","originalErrorName":"McpError","originalMessage":"Token has expired.","originalStack":"McpError: Token has expired.\n at unauthorized (/Users/casey/Developer/github/mcp-ts-core/dist/types-global/errors.js:86:61)\n at handleJoseVerifyError (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/lib/claimParser.js:56:11)\n at verify (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/strategies/jwtStrategy.js:91:13)\n at processTicksAndRejections (native:7:39)"},"stack":"McpError: Token has expired.\n at handleError (/Users/casey/Developer/github/mcp-ts-core/dist/utils/internal/error-handler/errorHandler.js:168:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpErrorHandler.js:59:39)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:26:25)\n at processTicksAndRejections (native:7:39)","msg":"Error in httpTransport: Token has expired."}
+ {"level":50,"time":1776964260733,"env":"testing","version":"0.6.11","pid":41913,"requestId":"Y7DPA-0GN8D","timestamp":"2026-04-23T17:11:00.733Z","operation":"httpErrorHandler","critical":false,"errorCode":-32006,"originalErrorType":"McpError","finalErrorType":"McpError","path":"/mcp","method":"GET","errorData":{"path":"/mcp","method":"GET","requestId":"Y7DPA-0GN8D","timestamp":"2026-04-23T17:11:00.733Z","operation":"httpErrorHandler","originalErrorName":"McpError","originalMessage":"Missing or invalid Authorization header. Bearer scheme required.","originalStack":"McpError: Missing or invalid Authorization header. Bearer scheme required.\n at unauthorized (/Users/casey/Developer/github/mcp-ts-core/dist/types-global/errors.js:86:61)\n at authMiddleware (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/authMiddleware.js:64:19)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpTransport.js:119:22)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at cors2 (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/middleware/cors/index.js:82:11)\n at processTicksAndRejections (native:7:39)"},"stack":"McpError: Missing or invalid Authorization header. Bearer scheme required.\n at handleError (/Users/casey/Developer/github/mcp-ts-core/dist/utils/internal/error-handler/errorHandler.js:168:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpErrorHandler.js:59:39)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:26:25)\n at processTicksAndRejections (native:7:39)","msg":"Error in httpTransport: Missing or invalid Authorization header. Bearer scheme required."}
+ {"level":50,"time":1776964261975,"env":"testing","version":"0.0.0-test","pid":41951,"requestId":"I9RC9-PC6JL","timestamp":"2026-04-23T17:11:01.974Z","operation":"HandleToolRequest","input":{"message":"blocked"},"critical":false,"errorCode":-32005,"originalErrorType":"McpError","finalErrorType":"McpError","sessionId":"461886e20f7bb81bf0c81e4dc87a34cd9e982327ed1faaed8d7c15b148320462","toolName":"scoped_echo","tenantId":"authz-tenant","auth":{"sub":"authz-user","scopes":["tool:other:read"],"clientId":"authz-client","tenantId":"authz-tenant"},"errorData":{"sessionId":"461886e20f7bb81bf0c81e4dc87a34cd9e982327ed1faaed8d7c15b148320462","toolName":"scoped_echo","input":{"message":"blocked"},"requestId":"I9RC9-PC6JL","timestamp":"2026-04-23T17:11:01.974Z","tenantId":"authz-tenant","operation":"HandleToolRequest","auth":{"sub":"authz-user","scopes":["tool:other:read"],"clientId":"authz-client","tenantId":"authz-tenant"},"originalErrorName":"McpError","originalMessage":"Insufficient permissions.","originalStack":"McpError: Insufficient permissions.\n at forbidden (/Users/casey/Developer/github/mcp-ts-core/dist/types-global/errors.js:84:58)\n at withRequiredScopes (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/lib/authUtils.js:61:15)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/tools/utils/toolHandlerFactory.js:68:17)\n at executeToolHandler (/Users/casey/Developer/github/mcp-ts-core/node_modules/@modelcontextprotocol/sdk/dist/esm/server/mcp.js:231:34)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/node_modules/@modelcontextprotocol/sdk/dist/esm/server/mcp.js:126:43)\n at processTicksAndRejections (native:7:39)"},"stack":"McpError: Insufficient permissions.\n at handleError (/Users/casey/Developer/github/mcp-ts-core/dist/utils/internal/error-handler/errorHandler.js:168:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/tools/utils/toolHandlerFactory.js:101:42)\n at executeToolHandler (/Users/casey/Developer/github/mcp-ts-core/node_modules/@modelcontextprotocol/sdk/dist/esm/server/mcp.js:231:34)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/node_modules/@modelcontextprotocol/sdk/dist/esm/server/mcp.js:126:43)\n at processTicksAndRejections (native:7:39)","msg":"Error in tool:scoped_echo: Insufficient permissions."}
@@ -1,4 +1,4 @@
1
- {"level":50,"time":1776955661645,"env":"testing","version":"0.0.0-test","pid":87698,"requestId":"H425Q-YZMFD","timestamp":"2026-04-23T14:47:41.643Z","operation":"HandleToolRequest","input":{"message":"blocked"},"critical":false,"errorCode":-32005,"originalErrorType":"McpError","finalErrorType":"McpError","sessionId":"6d54688d1f715b37c579af1c10830fb526330c8eb39f929290a28bb4cebbc41e","toolName":"scoped_echo","tenantId":"authz-tenant","auth":{"sub":"authz-user","scopes":["tool:other:read"],"clientId":"authz-client","tenantId":"authz-tenant"},"errorData":{"sessionId":"6d54688d1f715b37c579af1c10830fb526330c8eb39f929290a28bb4cebbc41e","toolName":"scoped_echo","input":{"message":"blocked"},"requestId":"H425Q-YZMFD","timestamp":"2026-04-23T14:47:41.643Z","tenantId":"authz-tenant","operation":"HandleToolRequest","auth":{"sub":"authz-user","scopes":["tool:other:read"],"clientId":"authz-client","tenantId":"authz-tenant"},"originalErrorName":"McpError","originalMessage":"Insufficient permissions.","originalStack":"McpError: Insufficient permissions.\n at forbidden (/Users/casey/Developer/github/mcp-ts-core/dist/types-global/errors.js:84:58)\n at withRequiredScopes (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/lib/authUtils.js:61:15)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/tools/utils/toolHandlerFactory.js:68:17)\n at executeToolHandler (/Users/casey/Developer/github/mcp-ts-core/node_modules/@modelcontextprotocol/sdk/dist/esm/server/mcp.js:231:34)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/node_modules/@modelcontextprotocol/sdk/dist/esm/server/mcp.js:126:43)\n at processTicksAndRejections (native:7:39)"},"stack":"McpError: Insufficient permissions.\n at handleError (/Users/casey/Developer/github/mcp-ts-core/dist/utils/internal/error-handler/errorHandler.js:168:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/tools/utils/toolHandlerFactory.js:101:42)\n at 
executeToolHandler (/Users/casey/Developer/github/mcp-ts-core/node_modules/@modelcontextprotocol/sdk/dist/esm/server/mcp.js:231:34)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/node_modules/@modelcontextprotocol/sdk/dist/esm/server/mcp.js:126:43)\n at processTicksAndRejections (native:7:39)","msg":"Error in tool:scoped_echo: Insufficient permissions."}
2
- {"level":50,"time":1776955662508,"env":"testing","version":"0.6.10","pid":87732,"requestId":"RD7TQ-YRWS5","timestamp":"2026-04-23T14:47:42.507Z","operation":"httpErrorHandler","critical":false,"errorCode":-32006,"originalErrorType":"McpError","finalErrorType":"McpError","path":"/mcp","method":"POST","errorData":{"path":"/mcp","method":"POST","requestId":"RD7TQ-YRWS5","timestamp":"2026-04-23T14:47:42.507Z","operation":"httpErrorHandler","originalErrorName":"McpError","originalMessage":"Missing or invalid Authorization header. Bearer scheme required.","originalStack":"McpError: Missing or invalid Authorization header. Bearer scheme required.\n at unauthorized (/Users/casey/Developer/github/mcp-ts-core/dist/types-global/errors.js:86:61)\n at authMiddleware (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/authMiddleware.js:64:19)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpTransport.js:119:22)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at cors2 (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/middleware/cors/index.js:82:11)\n at processTicksAndRejections (native:7:39)"},"stack":"McpError: Missing or invalid Authorization header. Bearer scheme required.\n at handleError (/Users/casey/Developer/github/mcp-ts-core/dist/utils/internal/error-handler/errorHandler.js:168:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpErrorHandler.js:59:39)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:26:25)\n at processTicksAndRejections (native:7:39)","msg":"Error in httpTransport: Missing or invalid Authorization header. Bearer scheme required."}
3
- {"level":50,"time":1776955662525,"env":"testing","version":"0.6.10","pid":87732,"requestId":"UHB72-SFY34","timestamp":"2026-04-23T14:47:42.524Z","operation":"httpErrorHandler","critical":false,"errorCode":-32006,"originalErrorType":"McpError","finalErrorType":"McpError","path":"/mcp","method":"POST","errorData":{"path":"/mcp","method":"POST","requestId":"UHB72-SFY34","timestamp":"2026-04-23T14:47:42.524Z","operation":"httpErrorHandler","originalErrorName":"McpError","originalMessage":"Token has expired.","originalStack":"McpError: Token has expired.\n at unauthorized (/Users/casey/Developer/github/mcp-ts-core/dist/types-global/errors.js:86:61)\n at handleJoseVerifyError (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/lib/claimParser.js:56:11)\n at verify (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/strategies/jwtStrategy.js:91:13)\n at processTicksAndRejections (native:7:39)"},"stack":"McpError: Token has expired.\n at handleError (/Users/casey/Developer/github/mcp-ts-core/dist/utils/internal/error-handler/errorHandler.js:168:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpErrorHandler.js:59:39)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:26:25)\n at processTicksAndRejections (native:7:39)","msg":"Error in httpTransport: Token has expired."}
4
- {"level":50,"time":1776955662529,"env":"testing","version":"0.6.10","pid":87732,"requestId":"TCIU4-1PP9W","timestamp":"2026-04-23T14:47:42.529Z","operation":"httpErrorHandler","critical":false,"errorCode":-32006,"originalErrorType":"McpError","finalErrorType":"McpError","path":"/mcp","method":"GET","errorData":{"path":"/mcp","method":"GET","requestId":"TCIU4-1PP9W","timestamp":"2026-04-23T14:47:42.529Z","operation":"httpErrorHandler","originalErrorName":"McpError","originalMessage":"Missing or invalid Authorization header. Bearer scheme required.","originalStack":"McpError: Missing or invalid Authorization header. Bearer scheme required.\n at unauthorized (/Users/casey/Developer/github/mcp-ts-core/dist/types-global/errors.js:86:61)\n at authMiddleware (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/authMiddleware.js:64:19)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpTransport.js:119:22)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at cors2 (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/middleware/cors/index.js:82:11)\n at processTicksAndRejections (native:7:39)"},"stack":"McpError: Missing or invalid Authorization header. Bearer scheme required.\n at handleError (/Users/casey/Developer/github/mcp-ts-core/dist/utils/internal/error-handler/errorHandler.js:168:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpErrorHandler.js:59:39)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:26:25)\n at processTicksAndRejections (native:7:39)","msg":"Error in httpTransport: Missing or invalid Authorization header. Bearer scheme required."}
1
+ {"level":50,"time":1776964260716,"env":"testing","version":"0.6.11","pid":41913,"requestId":"39HTQ-Z1TY2","timestamp":"2026-04-23T17:11:00.716Z","operation":"httpErrorHandler","critical":false,"errorCode":-32006,"originalErrorType":"McpError","finalErrorType":"McpError","path":"/mcp","method":"POST","errorData":{"path":"/mcp","method":"POST","requestId":"39HTQ-Z1TY2","timestamp":"2026-04-23T17:11:00.716Z","operation":"httpErrorHandler","originalErrorName":"McpError","originalMessage":"Missing or invalid Authorization header. Bearer scheme required.","originalStack":"McpError: Missing or invalid Authorization header. Bearer scheme required.\n at unauthorized (/Users/casey/Developer/github/mcp-ts-core/dist/types-global/errors.js:86:61)\n at authMiddleware (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/authMiddleware.js:64:19)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpTransport.js:119:22)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at cors2 (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/middleware/cors/index.js:82:11)\n at processTicksAndRejections (native:7:39)"},"stack":"McpError: Missing or invalid Authorization header. Bearer scheme required.\n at handleError (/Users/casey/Developer/github/mcp-ts-core/dist/utils/internal/error-handler/errorHandler.js:168:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpErrorHandler.js:59:39)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:26:25)\n at processTicksAndRejections (native:7:39)","msg":"Error in httpTransport: Missing or invalid Authorization header. Bearer scheme required."}
2
+ {"level":50,"time":1776964260730,"env":"testing","version":"0.6.11","pid":41913,"requestId":"96UGI-8URT9","timestamp":"2026-04-23T17:11:00.730Z","operation":"httpErrorHandler","critical":false,"errorCode":-32006,"originalErrorType":"McpError","finalErrorType":"McpError","path":"/mcp","method":"POST","errorData":{"path":"/mcp","method":"POST","requestId":"96UGI-8URT9","timestamp":"2026-04-23T17:11:00.730Z","operation":"httpErrorHandler","originalErrorName":"McpError","originalMessage":"Token has expired.","originalStack":"McpError: Token has expired.\n at unauthorized (/Users/casey/Developer/github/mcp-ts-core/dist/types-global/errors.js:86:61)\n at handleJoseVerifyError (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/lib/claimParser.js:56:11)\n at verify (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/strategies/jwtStrategy.js:91:13)\n at processTicksAndRejections (native:7:39)"},"stack":"McpError: Token has expired.\n at handleError (/Users/casey/Developer/github/mcp-ts-core/dist/utils/internal/error-handler/errorHandler.js:168:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpErrorHandler.js:59:39)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:26:25)\n at processTicksAndRejections (native:7:39)","msg":"Error in httpTransport: Token has expired."}
3
+ {"level":50,"time":1776964260733,"env":"testing","version":"0.6.11","pid":41913,"requestId":"Y7DPA-0GN8D","timestamp":"2026-04-23T17:11:00.733Z","operation":"httpErrorHandler","critical":false,"errorCode":-32006,"originalErrorType":"McpError","finalErrorType":"McpError","path":"/mcp","method":"GET","errorData":{"path":"/mcp","method":"GET","requestId":"Y7DPA-0GN8D","timestamp":"2026-04-23T17:11:00.733Z","operation":"httpErrorHandler","originalErrorName":"McpError","originalMessage":"Missing or invalid Authorization header. Bearer scheme required.","originalStack":"McpError: Missing or invalid Authorization header. Bearer scheme required.\n at unauthorized (/Users/casey/Developer/github/mcp-ts-core/dist/types-global/errors.js:86:61)\n at authMiddleware (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/authMiddleware.js:64:19)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpTransport.js:119:22)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:22:23)\n at cors2 (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/middleware/cors/index.js:82:11)\n at processTicksAndRejections (native:7:39)"},"stack":"McpError: Missing or invalid Authorization header. Bearer scheme required.\n at handleError (/Users/casey/Developer/github/mcp-ts-core/dist/utils/internal/error-handler/errorHandler.js:168:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/http/httpErrorHandler.js:59:39)\n at dispatch (/Users/casey/Developer/github/mcp-ts-core/node_modules/hono/dist/compose.js:26:25)\n at processTicksAndRejections (native:7:39)","msg":"Error in httpTransport: Missing or invalid Authorization header. Bearer scheme required."}
+ {"level":50,"time":1776964261975,"env":"testing","version":"0.0.0-test","pid":41951,"requestId":"I9RC9-PC6JL","timestamp":"2026-04-23T17:11:01.974Z","operation":"HandleToolRequest","input":{"message":"blocked"},"critical":false,"errorCode":-32005,"originalErrorType":"McpError","finalErrorType":"McpError","sessionId":"461886e20f7bb81bf0c81e4dc87a34cd9e982327ed1faaed8d7c15b148320462","toolName":"scoped_echo","tenantId":"authz-tenant","auth":{"sub":"authz-user","scopes":["tool:other:read"],"clientId":"authz-client","tenantId":"authz-tenant"},"errorData":{"sessionId":"461886e20f7bb81bf0c81e4dc87a34cd9e982327ed1faaed8d7c15b148320462","toolName":"scoped_echo","input":{"message":"blocked"},"requestId":"I9RC9-PC6JL","timestamp":"2026-04-23T17:11:01.974Z","tenantId":"authz-tenant","operation":"HandleToolRequest","auth":{"sub":"authz-user","scopes":["tool:other:read"],"clientId":"authz-client","tenantId":"authz-tenant"},"originalErrorName":"McpError","originalMessage":"Insufficient permissions.","originalStack":"McpError: Insufficient permissions.\n at forbidden (/Users/casey/Developer/github/mcp-ts-core/dist/types-global/errors.js:84:58)\n at withRequiredScopes (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/transports/auth/lib/authUtils.js:61:15)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/tools/utils/toolHandlerFactory.js:68:17)\n at executeToolHandler (/Users/casey/Developer/github/mcp-ts-core/node_modules/@modelcontextprotocol/sdk/dist/esm/server/mcp.js:231:34)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/node_modules/@modelcontextprotocol/sdk/dist/esm/server/mcp.js:126:43)\n at processTicksAndRejections (native:7:39)"},"stack":"McpError: Insufficient permissions.\n at handleError (/Users/casey/Developer/github/mcp-ts-core/dist/utils/internal/error-handler/errorHandler.js:168:23)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/dist/mcp-server/tools/utils/toolHandlerFactory.js:101:42)\n at 
executeToolHandler (/Users/casey/Developer/github/mcp-ts-core/node_modules/@modelcontextprotocol/sdk/dist/esm/server/mcp.js:231:34)\n at <anonymous> (/Users/casey/Developer/github/mcp-ts-core/node_modules/@modelcontextprotocol/sdk/dist/esm/server/mcp.js:126:43)\n at processTicksAndRejections (native:7:39)","msg":"Error in tool:scoped_echo: Insufficient permissions."}
@@ -13,7 +13,7 @@ export { type ChatMessage, countChatTokens, countTokens, type ModelHeuristics, }
  export { type FetchWithTimeoutOptions, fetchWithTimeout } from './network/fetchWithTimeout.js';
  export { type RetryOptions, withRetry } from './network/retry.js';
  export { DEFAULT_PAGINATION_CONFIG, decodeCursor, encodeCursor, extractCursor, type PaginatedResult, type PaginationState, paginateArray, } from './pagination/pagination.js';
- export { type AddPageOptions, Allow, CsvParser, csvParser, type DrawImageOptions, type DrawTextOptions, dateParser, type EmbedImageOptions, type ExtractTextOptions, type ExtractTextResult, type FillFormOptions, FrontmatterParser, type FrontmatterResult, frontmatterParser, JsonParser, jsonParser, type PageRange, type PdfMetadata, PdfParser, parseDateString, parseDateStringDetailed, pdfParser, type SetMetadataOptions, thinkBlockRegex, XmlParser, xmlParser, YamlParser, yamlParser, } from './parsing/index.js';
+ export { type AddPageOptions, Allow, CsvParser, csvParser, type DrawImageOptions, type DrawTextOptions, dateParser, type EmbedImageOptions, type ExtractArticleOptions, type ExtractArticleResult, type ExtractTextOptions, type ExtractTextResult, type FillFormOptions, FrontmatterParser, type FrontmatterResult, frontmatterParser, HtmlExtractor, htmlExtractor, JsonParser, jsonParser, type PageRange, type PdfMetadata, PdfParser, parseDateString, parseDateStringDetailed, pdfParser, type SetMetadataOptions, thinkBlockRegex, XmlParser, xmlParser, YamlParser, yamlParser, } from './parsing/index.js';
  export { type Job, SchedulerService, schedulerService } from './scheduling/scheduler.js';
  export { type EntityPrefixConfig, generateRequestContextId, generateUUID, type HtmlSanitizeConfig, type IdGenerationOptions, IdGenerator, idGenerator, type PathSanitizeOptions, type RateLimitConfig, type RateLimitEntry, RateLimiter, Sanitization, type SanitizedPathInfo, type SanitizeStringOptions, sanitization, sanitizeInputForLogging, } from './security/index.js';
  export { ATTR_CODE_FUNCTION_NAME, ATTR_CODE_NAMESPACE, ATTR_GEN_AI_REQUEST_MAX_TOKENS, ATTR_GEN_AI_REQUEST_MODEL, ATTR_GEN_AI_REQUEST_STREAMING, ATTR_GEN_AI_REQUEST_TEMPERATURE, ATTR_GEN_AI_REQUEST_TOP_P, ATTR_GEN_AI_RESPONSE_MODEL, ATTR_GEN_AI_SYSTEM, ATTR_GEN_AI_TOKEN_TYPE, ATTR_GEN_AI_USAGE_INPUT_TOKENS, ATTR_GEN_AI_USAGE_OUTPUT_TOKENS, ATTR_GEN_AI_USAGE_TOTAL_TOKENS, ATTR_MCP_AUTH_FAILURE_REASON, ATTR_MCP_AUTH_METHOD, ATTR_MCP_AUTH_OUTCOME, ATTR_MCP_AUTH_SCOPES, ATTR_MCP_AUTH_SUBJECT, ATTR_MCP_CLIENT_ID, ATTR_MCP_ERROR_CLASSIFIED_CODE, ATTR_MCP_GRAPH_DURATION_MS, ATTR_MCP_GRAPH_OPERATION, ATTR_MCP_GRAPH_SUCCESS, ATTR_MCP_RESOURCE_DURATION_MS, ATTR_MCP_RESOURCE_ERROR_CODE, ATTR_MCP_RESOURCE_MIME_TYPE, ATTR_MCP_RESOURCE_SIZE_BYTES, ATTR_MCP_RESOURCE_SUCCESS, ATTR_MCP_RESOURCE_URI, ATTR_MCP_SESSION_EVENT, ATTR_MCP_SPEECH_DURATION_MS, ATTR_MCP_SPEECH_INPUT_BYTES, ATTR_MCP_SPEECH_OPERATION, ATTR_MCP_SPEECH_OUTPUT_BYTES, ATTR_MCP_SPEECH_PROVIDER, ATTR_MCP_SPEECH_SUCCESS, ATTR_MCP_STORAGE_DURATION_MS, ATTR_MCP_STORAGE_KEY_COUNT, ATTR_MCP_STORAGE_OPERATION, ATTR_MCP_STORAGE_SUCCESS, ATTR_MCP_TASK_STATUS, ATTR_MCP_TASK_STORE_TYPE, ATTR_MCP_TENANT_ID, ATTR_MCP_TOOL_BATCH_FAILED, ATTR_MCP_TOOL_BATCH_SUCCEEDED, ATTR_MCP_TOOL_DURATION_MS, ATTR_MCP_TOOL_ERROR_CATEGORY, ATTR_MCP_TOOL_ERROR_CODE, ATTR_MCP_TOOL_INPUT_BYTES, ATTR_MCP_TOOL_NAME, ATTR_MCP_TOOL_OUTPUT_BYTES, ATTR_MCP_TOOL_PARTIAL_SUCCESS, ATTR_MCP_TOOL_SUCCESS, } from './telemetry/attributes.js';
@@ -1 +1 @@
- {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../../src/utils/index.ts"],"names":[],"mappings":"AAAA;;;GAGG;AAGH,OAAO,EACL,KAAK,SAAS,EACd,KAAK,UAAU,EACf,aAAa,EACb,KAAK,oBAAoB,EACzB,aAAa,EACb,UAAU,EACV,KAAK,iBAAiB,EACtB,IAAI,EACJ,eAAe,EACf,QAAQ,EACR,QAAQ,EACR,cAAc,EACd,KAAK,qBAAqB,EAC1B,KAAK,UAAU,EACf,aAAa,EACb,KAAK,oBAAoB,EACzB,KAAK,QAAQ,EACb,KAAK,SAAS,EACd,cAAc,EACd,aAAa,EACb,SAAS,GACV,MAAM,uBAAuB,CAAC;AAE/B,OAAO,EAAE,mBAAmB,EAAE,cAAc,EAAE,cAAc,EAAE,MAAM,wBAAwB,CAAC;AAE7F,OAAO,EAAE,YAAY,EAAE,MAAM,0CAA0C,CAAC;AACxE,YAAY,EACV,gBAAgB,EAChB,YAAY,EACZ,mBAAmB,EACnB,YAAY,GACb,MAAM,mCAAmC,CAAC;AAE3C,OAAO,EAAE,MAAM,EAAE,MAAM,EAAE,KAAK,WAAW,EAAE,MAAM,sBAAsB,CAAC;AAExE,OAAO,EACL,KAAK,WAAW,EAChB,KAAK,0BAA0B,EAC/B,KAAK,cAAc,EACnB,qBAAqB,GACtB,MAAM,8BAA8B,CAAC;AAEtC,OAAO,EAAE,KAAK,mBAAmB,EAAE,WAAW,EAAE,MAAM,uBAAuB,CAAC;AAE9E,OAAO,EACL,KAAK,WAAW,EAChB,eAAe,EACf,WAAW,EACX,KAAK,eAAe,GACrB,MAAM,2BAA2B,CAAC;AAEnC,OAAO,EAAE,KAAK,uBAAuB,EAAE,gBAAgB,EAAE,MAAM,+BAA+B,CAAC;AAC/F,OAAO,EAAE,KAAK,YAAY,EAAE,SAAS,EAAE,MAAM,oBAAoB,CAAC;AAElE,OAAO,EACL,yBAAyB,EACzB,YAAY,EACZ,YAAY,EACZ,aAAa,EACb,KAAK,eAAe,EACpB,KAAK,eAAe,EACpB,aAAa,GACd,MAAM,4BAA4B,CAAC;AAEpC,OAAO,EACL,KAAK,cAAc,EACnB,KAAK,EACL,SAAS,EACT,SAAS,EACT,KAAK,gBAAgB,EACrB,KAAK,eAAe,EACpB,UAAU,EACV,KAAK,iBAAiB,EACtB,KAAK,kBAAkB,EACvB,KAAK,iBAAiB,EACtB,KAAK,eAAe,EACpB,iBAAiB,EACjB,KAAK,iBAAiB,EACtB,iBAAiB,EACjB,UAAU,EACV,UAAU,EACV,KAAK,SAAS,EACd,KAAK,WAAW,EAChB,SAAS,EACT,eAAe,EACf,uBAAuB,EACvB,SAAS,EACT,KAAK,kBAAkB,EACvB,eAAe,EACf,SAAS,EACT,SAAS,EACT,UAAU,EACV,UAAU,GACX,MAAM,oBAAoB,CAAC;AAE5B,OAAO,EAAE,KAAK,GAAG,EAAE,gBAAgB,EAAE,gBAAgB,EAAE,MAAM,2BAA2B,CAAC;AAEzF,OAAO,EACL,KAAK,kBAAkB,EACvB,wBAAwB,EACxB,YAAY,EACZ,KAAK,kBAAkB,EACvB,KAAK,mBAAmB,EACxB,WAAW,EACX,WAAW,EACX,KAAK,mBAAmB,EACxB,KAAK,eAAe,EACpB,KAAK,cAAc,EACnB,WAAW,EACX,YAAY,EACZ,KAAK,iBAAiB,EACtB,KAAK,qBAAqB,EAC1B,YAAY,EACZ,uBAAuB,GACxB,MAAM,qBAAqB,CAAC;AAE7B,OAAO,EACL,uBAAuB,EACvB,mBAAmB,EACnB,8BAA8B,EAC9B,yBAAyB,EACzB,6BAA6B,EAC7B,+BAA+B,EAC/B,yBAAyB,EACzB,0BAA0
B,EAC1B,kBAAkB,EAClB,sBAAsB,EACtB,8BAA8B,EAC9B,+BAA+B,EAC/B,8BAA8B,EAC9B,4BAA4B,EAC5B,oBAAoB,EACpB,qBAAqB,EACrB,oBAAoB,EACpB,qBAAqB,EACrB,kBAAkB,EAClB,8BAA8B,EAC9B,0BAA0B,EAC1B,wBAAwB,EACxB,sBAAsB,EACtB,6BAA6B,EAC7B,4BAA4B,EAC5B,2BAA2B,EAC3B,4BAA4B,EAC5B,yBAAyB,EACzB,qBAAqB,EACrB,sBAAsB,EACtB,2BAA2B,EAC3B,2BAA2B,EAC3B,yBAAyB,EACzB,4BAA4B,EAC5B,wBAAwB,EACxB,uBAAuB,EACvB,4BAA4B,EAC5B,0BAA0B,EAC1B,0BAA0B,EAC1B,wBAAwB,EACxB,oBAAoB,EACpB,wBAAwB,EACxB,kBAAkB,EAClB,0BAA0B,EAC1B,6BAA6B,EAC7B,yBAAyB,EACzB,4BAA4B,EAC5B,wBAAwB,EACxB,yBAAyB,EACzB,kBAAkB,EAClB,0BAA0B,EAC1B,6BAA6B,EAC7B,qBAAqB,GACtB,MAAM,2BAA2B,CAAC;AAEnC,OAAO,EACL,uBAAuB,EACvB,GAAG,EACH,qBAAqB,GACtB,MAAM,gCAAgC,CAAC;AAExC,OAAO,EACL,aAAa,EACb,eAAe,EACf,mBAAmB,EACnB,QAAQ,GACT,MAAM,wBAAwB,CAAC;AAEhC,OAAO,EACL,gBAAgB,EAChB,4BAA4B,EAC5B,kBAAkB,EAClB,wBAAwB,EACxB,YAAY,EACZ,KAAK,eAAe,EACpB,QAAQ,GACT,MAAM,sBAAsB,CAAC;AAE9B,OAAO,EAAE,eAAe,EAAE,QAAQ,EAAE,MAAM,kBAAkB,CAAC"}
+ {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../../src/utils/index.ts"],"names":[],"mappings":"AAAA;;;GAGG;AAGH,OAAO,EACL,KAAK,SAAS,EACd,KAAK,UAAU,EACf,aAAa,EACb,KAAK,oBAAoB,EACzB,aAAa,EACb,UAAU,EACV,KAAK,iBAAiB,EACtB,IAAI,EACJ,eAAe,EACf,QAAQ,EACR,QAAQ,EACR,cAAc,EACd,KAAK,qBAAqB,EAC1B,KAAK,UAAU,EACf,aAAa,EACb,KAAK,oBAAoB,EACzB,KAAK,QAAQ,EACb,KAAK,SAAS,EACd,cAAc,EACd,aAAa,EACb,SAAS,GACV,MAAM,uBAAuB,CAAC;AAE/B,OAAO,EAAE,mBAAmB,EAAE,cAAc,EAAE,cAAc,EAAE,MAAM,wBAAwB,CAAC;AAE7F,OAAO,EAAE,YAAY,EAAE,MAAM,0CAA0C,CAAC;AACxE,YAAY,EACV,gBAAgB,EAChB,YAAY,EACZ,mBAAmB,EACnB,YAAY,GACb,MAAM,mCAAmC,CAAC;AAE3C,OAAO,EAAE,MAAM,EAAE,MAAM,EAAE,KAAK,WAAW,EAAE,MAAM,sBAAsB,CAAC;AAExE,OAAO,EACL,KAAK,WAAW,EAChB,KAAK,0BAA0B,EAC/B,KAAK,cAAc,EACnB,qBAAqB,GACtB,MAAM,8BAA8B,CAAC;AAEtC,OAAO,EAAE,KAAK,mBAAmB,EAAE,WAAW,EAAE,MAAM,uBAAuB,CAAC;AAE9E,OAAO,EACL,KAAK,WAAW,EAChB,eAAe,EACf,WAAW,EACX,KAAK,eAAe,GACrB,MAAM,2BAA2B,CAAC;AAEnC,OAAO,EAAE,KAAK,uBAAuB,EAAE,gBAAgB,EAAE,MAAM,+BAA+B,CAAC;AAC/F,OAAO,EAAE,KAAK,YAAY,EAAE,SAAS,EAAE,MAAM,oBAAoB,CAAC;AAElE,OAAO,EACL,yBAAyB,EACzB,YAAY,EACZ,YAAY,EACZ,aAAa,EACb,KAAK,eAAe,EACpB,KAAK,eAAe,EACpB,aAAa,GACd,MAAM,4BAA4B,CAAC;AAEpC,OAAO,EACL,KAAK,cAAc,EACnB,KAAK,EACL,SAAS,EACT,SAAS,EACT,KAAK,gBAAgB,EACrB,KAAK,eAAe,EACpB,UAAU,EACV,KAAK,iBAAiB,EACtB,KAAK,qBAAqB,EAC1B,KAAK,oBAAoB,EACzB,KAAK,kBAAkB,EACvB,KAAK,iBAAiB,EACtB,KAAK,eAAe,EACpB,iBAAiB,EACjB,KAAK,iBAAiB,EACtB,iBAAiB,EACjB,aAAa,EACb,aAAa,EACb,UAAU,EACV,UAAU,EACV,KAAK,SAAS,EACd,KAAK,WAAW,EAChB,SAAS,EACT,eAAe,EACf,uBAAuB,EACvB,SAAS,EACT,KAAK,kBAAkB,EACvB,eAAe,EACf,SAAS,EACT,SAAS,EACT,UAAU,EACV,UAAU,GACX,MAAM,oBAAoB,CAAC;AAE5B,OAAO,EAAE,KAAK,GAAG,EAAE,gBAAgB,EAAE,gBAAgB,EAAE,MAAM,2BAA2B,CAAC;AAEzF,OAAO,EACL,KAAK,kBAAkB,EACvB,wBAAwB,EACxB,YAAY,EACZ,KAAK,kBAAkB,EACvB,KAAK,mBAAmB,EACxB,WAAW,EACX,WAAW,EACX,KAAK,mBAAmB,EACxB,KAAK,eAAe,EACpB,KAAK,cAAc,EACnB,WAAW,EACX,YAAY,EACZ,KAAK,iBAAiB,EACtB,KAAK,qBAAqB,EAC1B,YAAY,EACZ,uBAAuB,GACxB,MAAM,qBAAqB,CAAC;AAE7B,OAAO,EACL,uBAAuB,EACvB,mBAAmB,EACnB,8BAA8B,EAC9B,y
BAAyB,EACzB,6BAA6B,EAC7B,+BAA+B,EAC/B,yBAAyB,EACzB,0BAA0B,EAC1B,kBAAkB,EAClB,sBAAsB,EACtB,8BAA8B,EAC9B,+BAA+B,EAC/B,8BAA8B,EAC9B,4BAA4B,EAC5B,oBAAoB,EACpB,qBAAqB,EACrB,oBAAoB,EACpB,qBAAqB,EACrB,kBAAkB,EAClB,8BAA8B,EAC9B,0BAA0B,EAC1B,wBAAwB,EACxB,sBAAsB,EACtB,6BAA6B,EAC7B,4BAA4B,EAC5B,2BAA2B,EAC3B,4BAA4B,EAC5B,yBAAyB,EACzB,qBAAqB,EACrB,sBAAsB,EACtB,2BAA2B,EAC3B,2BAA2B,EAC3B,yBAAyB,EACzB,4BAA4B,EAC5B,wBAAwB,EACxB,uBAAuB,EACvB,4BAA4B,EAC5B,0BAA0B,EAC1B,0BAA0B,EAC1B,wBAAwB,EACxB,oBAAoB,EACpB,wBAAwB,EACxB,kBAAkB,EAClB,0BAA0B,EAC1B,6BAA6B,EAC7B,yBAAyB,EACzB,4BAA4B,EAC5B,wBAAwB,EACxB,yBAAyB,EACzB,kBAAkB,EAClB,0BAA0B,EAC1B,6BAA6B,EAC7B,qBAAqB,GACtB,MAAM,2BAA2B,CAAC;AAEnC,OAAO,EACL,uBAAuB,EACvB,GAAG,EACH,qBAAqB,GACtB,MAAM,gCAAgC,CAAC;AAExC,OAAO,EACL,aAAa,EACb,eAAe,EACf,mBAAmB,EACnB,QAAQ,GACT,MAAM,wBAAwB,CAAC;AAEhC,OAAO,EACL,gBAAgB,EAChB,4BAA4B,EAC5B,kBAAkB,EAClB,wBAAwB,EACxB,YAAY,EACZ,KAAK,eAAe,EACpB,QAAQ,GACT,MAAM,sBAAsB,CAAC;AAE9B,OAAO,EAAE,eAAe,EAAE,QAAQ,EAAE,MAAM,kBAAkB,CAAC"}
@@ -22,7 +22,7 @@ export { withRetry } from './network/retry.js';
  // Pagination
  export { DEFAULT_PAGINATION_CONFIG, decodeCursor, encodeCursor, extractCursor, paginateArray, } from './pagination/pagination.js';
  // Parsing
- export { Allow, CsvParser, csvParser, dateParser, FrontmatterParser, frontmatterParser, JsonParser, jsonParser, PdfParser, parseDateString, parseDateStringDetailed, pdfParser, thinkBlockRegex, XmlParser, xmlParser, YamlParser, yamlParser, } from './parsing/index.js';
+ export { Allow, CsvParser, csvParser, dateParser, FrontmatterParser, frontmatterParser, HtmlExtractor, htmlExtractor, JsonParser, jsonParser, PdfParser, parseDateString, parseDateStringDetailed, pdfParser, thinkBlockRegex, XmlParser, xmlParser, YamlParser, yamlParser, } from './parsing/index.js';
  // Scheduling
  export { SchedulerService, schedulerService } from './scheduling/scheduler.js';
  // Security
@@ -1 +1 @@
- {"version":3,"file":"index.js","sourceRoot":"","sources":["../../src/utils/index.ts"],"names":[],"mappings":"AAAA;;;GAGG;AAEH,aAAa;AACb,OAAO,EAGL,aAAa,EAEb,aAAa,EACb,UAAU,EAEV,IAAI,EACJ,eAAe,EACf,QAAQ,EACR,QAAQ,EACR,cAAc,EAGd,aAAa,EAIb,cAAc,EACd,aAAa,EACb,SAAS,GACV,MAAM,uBAAuB,CAAC;AAC/B,WAAW;AACX,OAAO,EAAE,mBAAmB,EAAE,cAAc,EAAE,cAAc,EAAE,MAAM,wBAAwB,CAAC;AAC7F,gBAAgB;AAChB,OAAO,EAAE,YAAY,EAAE,MAAM,0CAA0C,CAAC;AAOxE,SAAS;AACT,OAAO,EAAE,MAAM,EAAE,MAAM,EAAoB,MAAM,sBAAsB,CAAC;AACxE,kBAAkB;AAClB,OAAO,EAIL,qBAAqB,GACtB,MAAM,8BAA8B,CAAC;AACtC,UAAU;AACV,OAAO,EAA4B,WAAW,EAAE,MAAM,uBAAuB,CAAC;AAC9E,iBAAiB;AACjB,OAAO,EAEL,eAAe,EACf,WAAW,GAEZ,MAAM,2BAA2B,CAAC;AACnC,UAAU;AACV,OAAO,EAAgC,gBAAgB,EAAE,MAAM,+BAA+B,CAAC;AAC/F,OAAO,EAAqB,SAAS,EAAE,MAAM,oBAAoB,CAAC;AAClE,aAAa;AACb,OAAO,EACL,yBAAyB,EACzB,YAAY,EACZ,YAAY,EACZ,aAAa,EAGb,aAAa,GACd,MAAM,4BAA4B,CAAC;AACpC,UAAU;AACV,OAAO,EAEL,KAAK,EACL,SAAS,EACT,SAAS,EAGT,UAAU,EAKV,iBAAiB,EAEjB,iBAAiB,EACjB,UAAU,EACV,UAAU,EAGV,SAAS,EACT,eAAe,EACf,uBAAuB,EACvB,SAAS,EAET,eAAe,EACf,SAAS,EACT,SAAS,EACT,UAAU,EACV,UAAU,GACX,MAAM,oBAAoB,CAAC;AAC5B,aAAa;AACb,OAAO,EAAY,gBAAgB,EAAE,gBAAgB,EAAE,MAAM,2BAA2B,CAAC;AACzF,WAAW;AACX,OAAO,EAEL,wBAAwB,EACxB,YAAY,EAGZ,WAAW,EACX,WAAW,EAIX,WAAW,EACX,YAAY,EAGZ,YAAY,EACZ,uBAAuB,GACxB,MAAM,qBAAqB,CAAC;AAC7B,iCAAiC;AACjC,OAAO,EACL,uBAAuB,EACvB,mBAAmB,EACnB,8BAA8B,EAC9B,yBAAyB,EACzB,6BAA6B,EAC7B,+BAA+B,EAC/B,yBAAyB,EACzB,0BAA0B,EAC1B,kBAAkB,EAClB,sBAAsB,EACtB,8BAA8B,EAC9B,+BAA+B,EAC/B,8BAA8B,EAC9B,4BAA4B,EAC5B,oBAAoB,EACpB,qBAAqB,EACrB,oBAAoB,EACpB,qBAAqB,EACrB,kBAAkB,EAClB,8BAA8B,EAC9B,0BAA0B,EAC1B,wBAAwB,EACxB,sBAAsB,EACtB,6BAA6B,EAC7B,4BAA4B,EAC5B,2BAA2B,EAC3B,4BAA4B,EAC5B,yBAAyB,EACzB,qBAAqB,EACrB,sBAAsB,EACtB,2BAA2B,EAC3B,2BAA2B,EAC3B,yBAAyB,EACzB,4BAA4B,EAC5B,wBAAwB,EACxB,uBAAuB,EACvB,4BAA4B,EAC5B,0BAA0B,EAC1B,0BAA0B,EAC1B,wBAAwB,EACxB,oBAAoB,EACpB,wBAAwB,EACxB,kBAAkB,EAClB,0BAA0B,EAC1B,6BAA6B,EAC7B,yBAAyB,EACzB,4BAA4B,EAC5B,wBAAwB,EACxB,yBAAyB,EACzB,kBAAkB,EAClB,0BAA0B,EAC1B,6BAA6B,EAC7B,qBAAqB,GACtB,MAAM,2BAA
2B,CAAC;AACnC,8BAA8B;AAC9B,OAAO,EACL,uBAAuB,EACvB,GAAG,EACH,qBAAqB,GACtB,MAAM,gCAAgC,CAAC;AACxC,sBAAsB;AACtB,OAAO,EACL,aAAa,EACb,eAAe,EACf,mBAAmB,EACnB,QAAQ,GACT,MAAM,wBAAwB,CAAC;AAChC,oBAAoB;AACpB,OAAO,EACL,gBAAgB,EAChB,4BAA4B,EAC5B,kBAAkB,EAClB,wBAAwB,EACxB,YAAY,EAEZ,QAAQ,GACT,MAAM,sBAAsB,CAAC;AAC9B,cAAc;AACd,OAAO,EAAE,eAAe,EAAE,QAAQ,EAAE,MAAM,kBAAkB,CAAC"}
+ {"version":3,"file":"index.js","sourceRoot":"","sources":["../../src/utils/index.ts"],"names":[],"mappings":"AAAA;;;GAGG;AAEH,aAAa;AACb,OAAO,EAGL,aAAa,EAEb,aAAa,EACb,UAAU,EAEV,IAAI,EACJ,eAAe,EACf,QAAQ,EACR,QAAQ,EACR,cAAc,EAGd,aAAa,EAIb,cAAc,EACd,aAAa,EACb,SAAS,GACV,MAAM,uBAAuB,CAAC;AAC/B,WAAW;AACX,OAAO,EAAE,mBAAmB,EAAE,cAAc,EAAE,cAAc,EAAE,MAAM,wBAAwB,CAAC;AAC7F,gBAAgB;AAChB,OAAO,EAAE,YAAY,EAAE,MAAM,0CAA0C,CAAC;AAOxE,SAAS;AACT,OAAO,EAAE,MAAM,EAAE,MAAM,EAAoB,MAAM,sBAAsB,CAAC;AACxE,kBAAkB;AAClB,OAAO,EAIL,qBAAqB,GACtB,MAAM,8BAA8B,CAAC;AACtC,UAAU;AACV,OAAO,EAA4B,WAAW,EAAE,MAAM,uBAAuB,CAAC;AAC9E,iBAAiB;AACjB,OAAO,EAEL,eAAe,EACf,WAAW,GAEZ,MAAM,2BAA2B,CAAC;AACnC,UAAU;AACV,OAAO,EAAgC,gBAAgB,EAAE,MAAM,+BAA+B,CAAC;AAC/F,OAAO,EAAqB,SAAS,EAAE,MAAM,oBAAoB,CAAC;AAClE,aAAa;AACb,OAAO,EACL,yBAAyB,EACzB,YAAY,EACZ,YAAY,EACZ,aAAa,EAGb,aAAa,GACd,MAAM,4BAA4B,CAAC;AACpC,UAAU;AACV,OAAO,EAEL,KAAK,EACL,SAAS,EACT,SAAS,EAGT,UAAU,EAOV,iBAAiB,EAEjB,iBAAiB,EACjB,aAAa,EACb,aAAa,EACb,UAAU,EACV,UAAU,EAGV,SAAS,EACT,eAAe,EACf,uBAAuB,EACvB,SAAS,EAET,eAAe,EACf,SAAS,EACT,SAAS,EACT,UAAU,EACV,UAAU,GACX,MAAM,oBAAoB,CAAC;AAC5B,aAAa;AACb,OAAO,EAAY,gBAAgB,EAAE,gBAAgB,EAAE,MAAM,2BAA2B,CAAC;AACzF,WAAW;AACX,OAAO,EAEL,wBAAwB,EACxB,YAAY,EAGZ,WAAW,EACX,WAAW,EAIX,WAAW,EACX,YAAY,EAGZ,YAAY,EACZ,uBAAuB,GACxB,MAAM,qBAAqB,CAAC;AAC7B,iCAAiC;AACjC,OAAO,EACL,uBAAuB,EACvB,mBAAmB,EACnB,8BAA8B,EAC9B,yBAAyB,EACzB,6BAA6B,EAC7B,+BAA+B,EAC/B,yBAAyB,EACzB,0BAA0B,EAC1B,kBAAkB,EAClB,sBAAsB,EACtB,8BAA8B,EAC9B,+BAA+B,EAC/B,8BAA8B,EAC9B,4BAA4B,EAC5B,oBAAoB,EACpB,qBAAqB,EACrB,oBAAoB,EACpB,qBAAqB,EACrB,kBAAkB,EAClB,8BAA8B,EAC9B,0BAA0B,EAC1B,wBAAwB,EACxB,sBAAsB,EACtB,6BAA6B,EAC7B,4BAA4B,EAC5B,2BAA2B,EAC3B,4BAA4B,EAC5B,yBAAyB,EACzB,qBAAqB,EACrB,sBAAsB,EACtB,2BAA2B,EAC3B,2BAA2B,EAC3B,yBAAyB,EACzB,4BAA4B,EAC5B,wBAAwB,EACxB,uBAAuB,EACvB,4BAA4B,EAC5B,0BAA0B,EAC1B,0BAA0B,EAC1B,wBAAwB,EACxB,oBAAoB,EACpB,wBAAwB,EACxB,kBAAkB,EAClB,0BAA0B,EAC1B,6BAA6B,EAC7B,yBAAyB,EACzB,4BAA4B,EAC5B,wBAAwB,EACxB,yBAAyB,EACzB,kBAAkB,EAClB,0BAA0B,EAC1B,6BAA6B,EAC7B,qB
AAqB,GACtB,MAAM,2BAA2B,CAAC;AACnC,8BAA8B;AAC9B,OAAO,EACL,uBAAuB,EACvB,GAAG,EACH,qBAAqB,GACtB,MAAM,gCAAgC,CAAC;AACxC,sBAAsB;AACtB,OAAO,EACL,aAAa,EACb,eAAe,EACf,mBAAmB,EACnB,QAAQ,GACT,MAAM,wBAAwB,CAAC;AAChC,oBAAoB;AACpB,OAAO,EACL,gBAAgB,EAChB,4BAA4B,EAC5B,kBAAkB,EAClB,wBAAwB,EACxB,YAAY,EAEZ,QAAQ,GACT,MAAM,sBAAsB,CAAC;AAC9B,cAAc;AACd,OAAO,EAAE,eAAe,EAAE,QAAQ,EAAE,MAAM,kBAAkB,CAAC"}
@@ -0,0 +1,146 @@
+ import { type RequestContext } from '../../utils/internal/requestContext.js';
+ /**
+  * Options for HTML article extraction.
+  */
+ export interface ExtractArticleOptions {
+     /**
+      * CSS selector to use as the main content element, bypassing auto-detection.
+      * If the selector does not match any element, Defuddle falls back to
+      * auto-detection.
+      */
+     contentSelector?: string;
+     /**
+      * Enable Defuddle's debug logging and bypass div flattening. Useful when
+      * diagnosing why a specific page extracts poorly. Defaults to `false`.
+      */
+     debug?: boolean;
+     /**
+      * Output format for `content`. `'markdown'` converts to Markdown (the common
+      * case for LLM-bound text), `'html'` returns cleaned HTML. Defaults to
+      * `'markdown'`.
+      */
+     format?: 'html' | 'markdown';
+     /**
+      * Preferred language for extraction and transcript selection (BCP 47, e.g.
+      * `'en'`, `'fr'`, `'ja'`).
+      */
+     language?: string;
+     /**
+      * Strip all images from the extracted content. Defaults to `false`.
+      */
+     removeImages?: boolean;
+     /**
+      * URL of the page being parsed. Passed to Defuddle for site-specific
+      * extractors and resolved link rewriting.
+      */
+     url?: string;
+     /**
+      * Allow Defuddle's async extractors to fetch from third-party APIs
+      * (e.g. FxTwitter) when no local content is available in the HTML.
+      * Defaults to `false` to keep extraction fully local and deterministic.
+      */
+     useAsync?: boolean;
+ }
+ /**
+  * Result of HTML article extraction.
+  *
+  * All fields except `content` are best-effort — they may be undefined if the
+  * source page does not provide the corresponding metadata.
+  */
+ export interface ExtractArticleResult {
+     /** Article author, if detected. */
+     author?: string;
+     /** Cleaned main content, either as Markdown or HTML depending on `format`. */
+     content: string;
+     /** Description or summary of the article, if present in page metadata. */
+     description?: string;
+     /** Domain of the source page (e.g. `'example.com'`), if derivable. */
+     domain?: string;
+     /** URL of the source site's favicon, if detected. */
+     favicon?: string;
+     /** URL of the article's primary image, if detected. */
+     image?: string;
+     /** Page language in BCP 47 format (e.g. `'en'`, `'en-US'`), if detected. */
+     language?: string;
+     /** Meta tags extracted from the page head, keyed by name. */
+     metaTags?: Record<string, string>;
+     /** Time `defuddle` spent parsing, in milliseconds. */
+     parseTime?: number;
+     /** Publication date string, if detected. Format is source-dependent. */
+     published?: string;
+     /** Raw schema.org data extracted from the page, if present. */
+     schemaOrgData?: unknown;
+     /** Site name, if detected (e.g. from Open Graph `og:site_name`). */
+     site?: string;
+     /** Article title, if detected. */
+     title?: string;
+     /** Word count of the extracted content, as reported by `defuddle`. */
+     wordCount?: number;
+ }
+ /**
+  * Utility class for extracting main article content from raw HTML.
+  *
+  * Lazily loads `defuddle` and `linkedom` on first use — both are optional peer
+  * dependencies (`bun add defuddle linkedom`). Returns cleaned main content
+  * plus best-effort metadata: title, author, description, Open Graph fields,
+  * schema.org data, word count.
+  *
+  * Does not guarantee structure beyond "main content of the page." For quirky
+  * pages, malformed HTML, or SPA shells with minimal server-rendered content,
+  * the result may be sparse — callers should degrade gracefully.
+  */
+ export declare class HtmlExtractor {
+     /**
+      * Extracts the main article content from an HTML string.
+      *
+      * Async due to lazy loading of `defuddle` and `linkedom`, and because
+      * Defuddle's node entry is itself async (supports async fallback extractors
+      * gated by `useAsync`).
+      *
+      * @param html - Raw HTML string to extract from.
+      * @param options - Optional extraction options (format, URL, content selector, etc.).
+      * @param context - Optional `RequestContext` for correlated logging and error metadata.
+      * @returns Extracted content and metadata. Only `content` is guaranteed to
+      *   be present; all other fields are best-effort.
+      * @throws {McpError} With `ConfigurationError` if `defuddle` or `linkedom` is not installed.
+      * @throws {McpError} With `ValidationError` if the HTML string is empty after trimming,
+      *   or if `defuddle` fails to parse the page.
+      *
+      * @example
+      * ```typescript
+      * import { htmlExtractor } from '../../utils/parsing/htmlExtractor.js';
+      *
+      * const html = await fetch('https://example.com/article').then((r) => r.text());
+      * const result = await htmlExtractor.extract(html, {
+      *   url: 'https://example.com/article',
+      *   format: 'markdown',
+      * });
+      *
+      * console.log(result.title);
+      * console.log(result.content);
+      * ```
+      */
+     extract(html: string, options?: ExtractArticleOptions, context?: RequestContext): Promise<ExtractArticleResult>;
+ }
+ /**
+  * Singleton instance of {@link HtmlExtractor}.
+  *
+  * Prefer this over constructing a new `HtmlExtractor` directly. Lazily loads
+  * `defuddle` and `linkedom` on first call, so there is no startup cost if
+  * HTML extraction is never used.
+  *
+  * @example
+  * ```typescript
+  * import { htmlExtractor } from '../../utils/parsing/htmlExtractor.js';
+  *
+  * const article = await htmlExtractor.extract(rawHtml, {
+  *   url: 'https://example.com/post',
+  *   format: 'markdown',
+  * });
+  *
+  * // Hand the content + metadata to the LLM
+  * llm.prompt({ title: article.title, body: article.content });
+  * ```
+  */
+ export declare const htmlExtractor: HtmlExtractor;
+ //# sourceMappingURL=htmlExtractor.d.ts.map
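The declarations above state that only `content` is guaranteed and that callers should degrade gracefully on sparse results. A minimal sketch of one way a caller might do that, assuming a hypothetical `toPromptBlock` helper (not part of the package) and a locally re-declared subset of `ExtractArticleResult` so the block runs without the package installed:

```typescript
// Local subset of the ExtractArticleResult shape from the diff above,
// re-declared here only so this sketch is self-contained.
interface ArticleLike {
  content: string;
  title?: string;
  author?: string;
  published?: string;
  wordCount?: number;
}

// Hypothetical helper (an assumption, not an export of @cyanheads/mcp-ts-core):
// builds a text block for an LLM prompt and skips metadata that was not detected.
function toPromptBlock(article: ArticleLike, fallbackTitle = 'Untitled'): string {
  const header: string[] = [`# ${article.title ?? fallbackTitle}`];
  if (article.author) header.push(`Author: ${article.author}`);
  if (article.published) header.push(`Published: ${article.published}`);
  if (typeof article.wordCount === 'number') header.push(`Words: ${article.wordCount}`);
  // `content` is always a string but may be empty (e.g. SPA shells),
  // so an explicit placeholder is emitted instead of a blank body.
  const body = article.content.trim() || '(no extractable content)';
  return `${header.join('\n')}\n\n${body}`;
}

const sparse = toPromptBlock({ content: '' });
const full = toPromptBlock({
  content: 'Hello world.',
  title: 'Greeting',
  author: 'A. Writer',
  wordCount: 2,
});
console.log(sparse);
console.log(full);
```

Checking each optional field before use keeps the caller independent of which metadata a given page happens to expose.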
@@ -0,0 +1 @@
+ {"version":3,"file":"htmlExtractor.d.ts","sourceRoot":"","sources":["../../../src/utils/parsing/htmlExtractor.ts"],"names":[],"mappings":"AAgCA,OAAO,EAAE,KAAK,cAAc,EAAyB,MAAM,oCAAoC,CAAC;AAqChG;;GAEG;AACH,MAAM,WAAW,qBAAqB;IACpC;;;;OAIG;IACH,eAAe,CAAC,EAAE,MAAM,CAAC;IACzB;;;OAGG;IACH,KAAK,CAAC,EAAE,OAAO,CAAC;IAChB;;;;OAIG;IACH,MAAM,CAAC,EAAE,MAAM,GAAG,UAAU,CAAC;IAC7B;;;OAGG;IACH,QAAQ,CAAC,EAAE,MAAM,CAAC;IAClB;;OAEG;IACH,YAAY,CAAC,EAAE,OAAO,CAAC;IACvB;;;OAGG;IACH,GAAG,CAAC,EAAE,MAAM,CAAC;IACb;;;;OAIG;IACH,QAAQ,CAAC,EAAE,OAAO,CAAC;CACpB;AAED;;;;;GAKG;AACH,MAAM,WAAW,oBAAoB;IACnC,mCAAmC;IACnC,MAAM,CAAC,EAAE,MAAM,CAAC;IAChB,8EAA8E;IAC9E,OAAO,EAAE,MAAM,CAAC;IAChB,0EAA0E;IAC1E,WAAW,CAAC,EAAE,MAAM,CAAC;IACrB,sEAAsE;IACtE,MAAM,CAAC,EAAE,MAAM,CAAC;IAChB,qDAAqD;IACrD,OAAO,CAAC,EAAE,MAAM,CAAC;IACjB,uDAAuD;IACvD,KAAK,CAAC,EAAE,MAAM,CAAC;IACf,4EAA4E;IAC5E,QAAQ,CAAC,EAAE,MAAM,CAAC;IAClB,6DAA6D;IAC7D,QAAQ,CAAC,EAAE,MAAM,CAAC,MAAM,EAAE,MAAM,CAAC,CAAC;IAClC,sDAAsD;IACtD,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,wEAAwE;IACxE,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,+DAA+D;IAC/D,aAAa,CAAC,EAAE,OAAO,CAAC;IACxB,oEAAoE;IACpE,IAAI,CAAC,EAAE,MAAM,CAAC;IACd,kCAAkC;IAClC,KAAK,CAAC,EAAE,MAAM,CAAC;IACf,sEAAsE;IACtE,SAAS,CAAC,EAAE,MAAM,CAAC;CACpB;AAED;;;;;;;;;;;GAWG;AACH,qBAAa,aAAa;IACxB;;;;;;;;;;;;;;;;;;;;;;;;;;;;;OA6BG;IACG,OAAO,CACX,IAAI,EAAE,MAAM,EACZ,OAAO,CAAC,EAAE,qBAAqB,EAC/B,OAAO,CAAC,EAAE,cAAc,GACvB,OAAO,CAAC,oBAAoB,CAAC;CA4EjC;AAED;;;;;;;;;;;;;;;;;;;GAmBG;AACH,eAAO,MAAM,aAAa,eAAsB,CAAC"}
@@ -0,0 +1,171 @@
+ import { McpError, validationError } from '../../types-global/errors.js';
+ import { lazyImport } from '../../utils/internal/lazyImport.js';
+ import { logger } from '../../utils/internal/logger.js';
+ import { requestContextService } from '../../utils/internal/requestContext.js';
+ const getDefuddle = lazyImport(() => import('defuddle/node'), 'Install "defuddle" to use HTML article extraction: bun add defuddle linkedom');
+ const getLinkedom = lazyImport(() => import('linkedom'), 'Install "linkedom" to use HTML article extraction: bun add defuddle linkedom');
+ /** Flattens defuddle's `MetaTagItem[]` into a `Record<string, string>` keyed
+  *  by `name ?? property`. Items without a usable key or content are skipped;
+  *  returns `undefined` if nothing usable is left so callers can omit the
+  *  field from the result entirely. */
+ function flattenMetaTags(tags) {
+     if (!tags || tags.length === 0)
+         return;
+     const out = {};
+     for (const tag of tags) {
+         const key = tag.name ?? tag.property;
+         if (!key || tag.content == null)
+             continue;
+         out[key] = tag.content;
+     }
+     return Object.keys(out).length > 0 ? out : undefined;
+ }
+ /**
+  * Utility class for extracting main article content from raw HTML.
+  *
+  * Lazily loads `defuddle` and `linkedom` on first use — both are optional peer
+  * dependencies (`bun add defuddle linkedom`). Returns cleaned main content
+  * plus best-effort metadata: title, author, description, Open Graph fields,
+  * schema.org data, word count.
+  *
+  * Does not guarantee structure beyond "main content of the page." For quirky
+  * pages, malformed HTML, or SPA shells with minimal server-rendered content,
+  * the result may be sparse — callers should degrade gracefully.
+  */
+ export class HtmlExtractor {
+     /**
+      * Extracts the main article content from an HTML string.
+      *
+      * Async due to lazy loading of `defuddle` and `linkedom`, and because
+      * Defuddle's node entry is itself async (supports async fallback extractors
+      * gated by `useAsync`).
+      *
+      * @param html - Raw HTML string to extract from.
+      * @param options - Optional extraction options (format, URL, content selector, etc.).
+      * @param context - Optional `RequestContext` for correlated logging and error metadata.
+      * @returns Extracted content and metadata. Only `content` is guaranteed to
+      *   be present; all other fields are best-effort.
+      * @throws {McpError} With `ConfigurationError` if `defuddle` or `linkedom` is not installed.
+      * @throws {McpError} With `ValidationError` if the HTML string is empty after trimming,
+      *   or if `defuddle` fails to parse the page.
+      *
+      * @example
+      * ```typescript
+      * import { htmlExtractor } from '../../utils/parsing/htmlExtractor.js';
+      *
+      * const html = await fetch('https://example.com/article').then((r) => r.text());
+      * const result = await htmlExtractor.extract(html, {
+      *   url: 'https://example.com/article',
+      *   format: 'markdown',
+      * });
+      *
+      * console.log(result.title);
+      * console.log(result.content);
+      * ```
+      */
+     async extract(html, options, context) {
+         const logContext = context ??
+             requestContextService.createRequestContext({
+                 operation: 'HtmlExtractor.extract',
+             });
+         const trimmed = html.trim();
+         if (!trimmed) {
+             throw validationError('HTML string is empty.', context);
+         }
+         const [{ Defuddle }, { parseHTML }] = await Promise.all([getDefuddle(), getLinkedom()]);
+         const format = options?.format ?? 'markdown';
+         const defuddleOptions = {
+             markdown: format === 'markdown',
+             useAsync: options?.useAsync ?? false,
+             ...(options?.contentSelector !== undefined && {
+                 contentSelector: options.contentSelector,
+             }),
+             ...(options?.removeImages !== undefined && {
+                 removeImages: options.removeImages,
+             }),
+             ...(options?.debug !== undefined && { debug: options.debug }),
+             ...(options?.language !== undefined && { language: options.language }),
+         };
+         logger.debug('Extracting article content from HTML.', {
+             ...logContext,
+             byteLength: trimmed.length,
+             format,
+             hasUrl: Boolean(options?.url),
+             hasContentSelector: Boolean(options?.contentSelector),
+         });
+         try {
+             const { document } = parseHTML(trimmed);
+             const result = await Defuddle(document, options?.url, defuddleOptions);
+             logger.debug('Successfully extracted article.', {
+                 ...logContext,
+                 wordCount: result.wordCount,
+                 titlePresent: Boolean(result.title),
+                 parseTimeMs: result.parseTime,
+             });
+             const out = { content: result.content ?? '' };
+             if (result.title)
+                 out.title = result.title;
+             if (result.author)
+                 out.author = result.author;
+             if (result.description)
+                 out.description = result.description;
+             if (result.domain)
+                 out.domain = result.domain;
+             if (result.favicon)
+                 out.favicon = result.favicon;
+             if (result.image)
+                 out.image = result.image;
+             if (result.language)
+                 out.language = result.language;
+             if (result.published)
+                 out.published = result.published;
+             if (result.site)
+                 out.site = result.site;
+             if (typeof result.parseTime === 'number')
+                 out.parseTime = result.parseTime;
+             if (typeof result.wordCount === 'number')
+                 out.wordCount = result.wordCount;
+             if (result.schemaOrgData)
+                 out.schemaOrgData = result.schemaOrgData;
+             const metaTags = flattenMetaTags(result.metaTags);
+             if (metaTags)
+                 out.metaTags = metaTags;
+             return out;
+         }
+         catch (e) {
+             if (e instanceof McpError)
+                 throw e;
+             const error = e instanceof Error ? e : new Error(String(e));
+             logger.error('Failed to extract article from HTML.', {
+                 ...logContext,
+                 errorDetails: error.message,
+             });
+             throw validationError(`Failed to extract article from HTML: ${error.message}`, {
+                 ...context,
+                 rawError: error.stack ?? String(error),
+             });
+         }
+     }
+ }
+ /**
+  * Singleton instance of {@link HtmlExtractor}.
+  *
+  * Prefer this over constructing a new `HtmlExtractor` directly. Lazily loads
+  * `defuddle` and `linkedom` on first call, so there is no startup cost if
+  * HTML extraction is never used.
+  *
+  * @example
+  * ```typescript
+  * import { htmlExtractor } from '../../utils/parsing/htmlExtractor.js';
+  *
+  * const article = await htmlExtractor.extract(rawHtml, {
+  *   url: 'https://example.com/post',
+  *   format: 'markdown',
+  * });
+  *
+  * // Hand the content + metadata to the LLM
+  * llm.prompt({ title: article.title, body: article.content });
+  * ```
+  */
+ export const htmlExtractor = new HtmlExtractor();
+ //# sourceMappingURL=htmlExtractor.js.map
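In `extract`, `defuddleOptions` is assembled with `...(cond && { key })` spreads, so options the caller left unset are omitted from the object rather than forwarded as `undefined`. A small sketch of that pattern in isolation, assuming a hypothetical `buildOptions` function and an `InputOptions` type that borrows a few field names from the diff for illustration:

```typescript
// Hypothetical input shape; field names mirror ExtractArticleOptions in the
// diff, but this type is declared locally for the sketch.
interface InputOptions {
  contentSelector?: string;
  removeImages?: boolean;
  format?: 'html' | 'markdown';
}

// Hypothetical builder: spreading `false` adds no keys, so a key only appears
// in the result when the caller actually supplied a value for it.
function buildOptions(options?: InputOptions): Record<string, unknown> {
  return {
    markdown: (options?.format ?? 'markdown') === 'markdown',
    ...(options?.contentSelector !== undefined && {
      contentSelector: options.contentSelector,
    }),
    ...(options?.removeImages !== undefined && {
      removeImages: options.removeImages,
    }),
  };
}

const defaults = buildOptions();
const explicit = buildOptions({ format: 'html', removeImages: false });
console.log(JSON.stringify(defaults));
console.log(JSON.stringify(explicit));
```

Comparing against `undefined` instead of testing truthiness means an explicit `removeImages: false` is still forwarded, while an absent key lets the downstream library apply its own default.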
@@ -0,0 +1 @@
1
+ {"version":3,"file":"htmlExtractor.js","sourceRoot":"","sources":["../../../src/utils/parsing/htmlExtractor.ts"],"names":[],"mappings":"AA6BA,OAAO,EAAE,QAAQ,EAAE,eAAe,EAAE,MAAM,0BAA0B,CAAC;AACrE,OAAO,EAAE,UAAU,EAAE,MAAM,gCAAgC,CAAC;AAC5D,OAAO,EAAE,MAAM,EAAE,MAAM,4BAA4B,CAAC;AACpD,OAAO,EAAuB,qBAAqB,EAAE,MAAM,oCAAoC,CAAC;AAEhG,MAAM,WAAW,GAAG,UAAU,CAC5B,GAAG,EAAE,CAAC,MAAM,CAAC,eAAe,CAAC,EAC7B,8EAA8E,CAC/E,CAAC;AAEF,MAAM,WAAW,GAAG,UAAU,CAC5B,GAAG,EAAE,CAAC,MAAM,CAAC,UAAU,CAAC,EACxB,8EAA8E,CAC/E,CAAC;AAUF;;;sCAGsC;AACtC,SAAS,eAAe,CACtB,IAAuC;IAEvC,IAAI,CAAC,IAAI,IAAI,IAAI,CAAC,MAAM,KAAK,CAAC;QAAE,OAAO;IACvC,MAAM,GAAG,GAA2B,EAAE,CAAC;IACvC,KAAK,MAAM,GAAG,IAAI,IAAI,EAAE,CAAC;QACvB,MAAM,GAAG,GAAG,GAAG,CAAC,IAAI,IAAI,GAAG,CAAC,QAAQ,CAAC;QACrC,IAAI,CAAC,GAAG,IAAI,GAAG,CAAC,OAAO,IAAI,IAAI;YAAE,SAAS;QAC1C,GAAG,CAAC,GAAG,CAAC,GAAG,GAAG,CAAC,OAAO,CAAC;IACzB,CAAC;IACD,OAAO,MAAM,CAAC,IAAI,CAAC,GAAG,CAAC,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,GAAG,CAAC,CAAC,CAAC,SAAS,CAAC;AACvD,CAAC;AAkFD;;;;;;;;;;;GAWG;AACH,MAAM,OAAO,aAAa;IACxB;;;;;;;;;;;;;;;;;;;;;;;;;;;;;OA6BG;IACH,KAAK,CAAC,OAAO,CACX,IAAY,EACZ,OAA+B,EAC/B,OAAwB;QAExB,MAAM,UAAU,GACd,OAAO;YACP,qBAAqB,CAAC,oBAAoB,CAAC;gBACzC,SAAS,EAAE,uBAAuB;aACnC,CAAC,CAAC;QAEL,MAAM,OAAO,GAAG,IAAI,CAAC,IAAI,EAAE,CAAC;QAC5B,IAAI,CAAC,OAAO,EAAE,CAAC;YACb,MAAM,eAAe,CAAC,uBAAuB,EAAE,OAAO,CAAC,CAAC;QAC1D,CAAC;QAED,MAAM,CAAC,EAAE,QAAQ,EAAE,EAAE,EAAE,SAAS,EAAE,CAAC,GAAG,MAAM,OAAO,CAAC,GAAG,CAAC,CAAC,WAAW,EAAE,EAAE,WAAW,EAAE,CAAC,CAAC,CAAC;QAExF,MAAM,MAAM,GAAG,OAAO,EAAE,MAAM,IAAI,UAAU,CAAC;QAC7C,MAAM,eAAe,GAAoB;YACvC,QAAQ,EAAE,MAAM,KAAK,UAAU;YAC/B,QAAQ,EAAE,OAAO,EAAE,QAAQ,IAAI,KAAK;YACpC,GAAG,CAAC,OAAO,EAAE,eAAe,KAAK,SAAS,IAAI;gBAC5C,eAAe,EAAE,OAAO,CAAC,eAAe;aACzC,CAAC;YACF,GAAG,CAAC,OAAO,EAAE,YAAY,KAAK,SAAS,IAAI;gBACzC,YAAY,EAAE,OAAO,CAAC,YAAY;aACnC,CAAC;YACF,GAAG,CAAC,OAAO,EAAE,KAAK,KAAK,SAAS,IAAI,EAAE,KAAK,EAAE,OAAO,CAAC,KAAK,EAAE,CAAC;YAC7D,GAAG,CAAC,OAAO,EAAE,QAAQ,KAAK,SAAS,IAAI,EAAE,QAAQ,EAAE,OAAO,CAAC,QAAQ,EAAE,CAAC;SACvE,CAAC;QAEF,MAAM,CAAC,KAAK,CAA
C,uCAAuC,EAAE;YACpD,GAAG,UAAU;YACb,UAAU,EAAE,OAAO,CAAC,MAAM;YAC1B,MAAM;YACN,MAAM,EAAE,OAAO,CAAC,OAAO,EAAE,GAAG,CAAC;YAC7B,kBAAkB,EAAE,OAAO,CAAC,OAAO,EAAE,eAAe,CAAC;SACtD,CAAC,CAAC;QAEH,IAAI,CAAC;YACH,MAAM,EAAE,QAAQ,EAAE,GAAG,SAAS,CAAC,OAAO,CAAC,CAAC;YACxC,MAAM,MAAM,GAAG,MAAM,QAAQ,CAAC,QAAQ,EAAE,OAAO,EAAE,GAAG,EAAE,eAAe,CAAC,CAAC;YAEvE,MAAM,CAAC,KAAK,CAAC,iCAAiC,EAAE;gBAC9C,GAAG,UAAU;gBACb,SAAS,EAAE,MAAM,CAAC,SAAS;gBAC3B,YAAY,EAAE,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC;gBACnC,WAAW,EAAE,MAAM,CAAC,SAAS;aAC9B,CAAC,CAAC;YAEH,MAAM,GAAG,GAAyB,EAAE,OAAO,EAAE,MAAM,CAAC,OAAO,IAAI,EAAE,EAAE,CAAC;YACpE,IAAI,MAAM,CAAC,KAAK;gBAAE,GAAG,CAAC,KAAK,GAAG,MAAM,CAAC,KAAK,CAAC;YAC3C,IAAI,MAAM,CAAC,MAAM;gBAAE,GAAG,CAAC,MAAM,GAAG,MAAM,CAAC,MAAM,CAAC;YAC9C,IAAI,MAAM,CAAC,WAAW;gBAAE,GAAG,CAAC,WAAW,GAAG,MAAM,CAAC,WAAW,CAAC;YAC7D,IAAI,MAAM,CAAC,MAAM;gBAAE,GAAG,CAAC,MAAM,GAAG,MAAM,CAAC,MAAM,CAAC;YAC9C,IAAI,MAAM,CAAC,OAAO;gBAAE,GAAG,CAAC,OAAO,GAAG,MAAM,CAAC,OAAO,CAAC;YACjD,IAAI,MAAM,CAAC,KAAK;gBAAE,GAAG,CAAC,KAAK,GAAG,MAAM,CAAC,KAAK,CAAC;YAC3C,IAAI,MAAM,CAAC,QAAQ;gBAAE,GAAG,CAAC,QAAQ,GAAG,MAAM,CAAC,QAAQ,CAAC;YACpD,IAAI,MAAM,CAAC,SAAS;gBAAE,GAAG,CAAC,SAAS,GAAG,MAAM,CAAC,SAAS,CAAC;YACvD,IAAI,MAAM,CAAC,IAAI;gBAAE,GAAG,CAAC,IAAI,GAAG,MAAM,CAAC,IAAI,CAAC;YACxC,IAAI,OAAO,MAAM,CAAC,SAAS,KAAK,QAAQ;gBAAE,GAAG,CAAC,SAAS,GAAG,MAAM,CAAC,SAAS,CAAC;YAC3E,IAAI,OAAO,MAAM,CAAC,SAAS,KAAK,QAAQ;gBAAE,GAAG,CAAC,SAAS,GAAG,MAAM,CAAC,SAAS,CAAC;YAC3E,IAAI,MAAM,CAAC,aAAa;gBAAE,GAAG,CAAC,aAAa,GAAG,MAAM,CAAC,aAAa,CAAC;YACnE,MAAM,QAAQ,GAAG,eAAe,CAAC,MAAM,CAAC,QAAQ,CAAC,CAAC;YAClD,IAAI,QAAQ;gBAAE,GAAG,CAAC,QAAQ,GAAG,QAAQ,CAAC;YACtC,OAAO,GAAG,CAAC;QACb,CAAC;QAAC,OAAO,CAAU,EAAE,CAAC;YACpB,IAAI,CAAC,YAAY,QAAQ;gBAAE,MAAM,CAAC,CAAC;YACnC,MAAM,KAAK,GAAG,CAAC,YAAY,KAAK,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,IAAI,KAAK,CAAC,MAAM,CAAC,CAAC,CAAC,CAAC,CAAC;YAC5D,MAAM,CAAC,KAAK,CAAC,sCAAsC,EAAE;gBACnD,GAAG,UAAU;gBACb,YAAY,EAAE,KAAK,CAAC,OAAO;aAC5B,CAAC,CAAC;YACH,MAAM,eAAe,CAAC,wCAAwC,KAAK,CAAC,OAAO,EAAE,EAAE;gBAC7E,GAAG,OAAO;gBACV,QAAQ,EAAE,KAAK,CAAC
,KAAK,IAAI,MAAM,CAAC,KAAK,CAAC;aACvC,CAAC,CAAC;QACL,CAAC;IACH,CAAC;CACF;AAED;;;;;;;;;;;;;;;;;;;GAmBG;AACH,MAAM,CAAC,MAAM,aAAa,GAAG,IAAI,aAAa,EAAE,CAAC"}
@@ -7,6 +7,7 @@
7
7
  export { CsvParser, csvParser } from './csvParser.js';
8
8
  export { dateParser, parseDateString, parseDateStringDetailed } from './dateParser.js';
9
9
  export { FrontmatterParser, type FrontmatterResult, frontmatterParser, } from './frontmatterParser.js';
10
+ export { type ExtractArticleOptions, type ExtractArticleResult, HtmlExtractor, htmlExtractor, } from './htmlExtractor.js';
10
11
  export { Allow, JsonParser, jsonParser } from './jsonParser.js';
11
12
  export { type AddPageOptions, type DrawImageOptions, type DrawTextOptions, type EmbedImageOptions, type ExtractTextOptions, type ExtractTextResult, type FillFormOptions, type PageRange, type PdfMetadata, PdfParser, pdfParser, type SetMetadataOptions, } from './pdfParser.js';
12
13
  export { thinkBlockRegex } from './thinkBlock.js';
@@ -1 +1 @@
1
- {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../../../src/utils/parsing/index.ts"],"names":[],"mappings":"AAAA;;;;;GAKG;AAEH,OAAO,EAAE,SAAS,EAAE,SAAS,EAAE,MAAM,gBAAgB,CAAC;AACtD,OAAO,EAAE,UAAU,EAAE,eAAe,EAAE,uBAAuB,EAAE,MAAM,iBAAiB,CAAC;AACvF,OAAO,EACL,iBAAiB,EACjB,KAAK,iBAAiB,EACtB,iBAAiB,GAClB,MAAM,wBAAwB,CAAC;AAChC,OAAO,EAAE,KAAK,EAAE,UAAU,EAAE,UAAU,EAAE,MAAM,iBAAiB,CAAC;AAChE,OAAO,EACL,KAAK,cAAc,EACnB,KAAK,gBAAgB,EACrB,KAAK,eAAe,EACpB,KAAK,iBAAiB,EACtB,KAAK,kBAAkB,EACvB,KAAK,iBAAiB,EACtB,KAAK,eAAe,EACpB,KAAK,SAAS,EACd,KAAK,WAAW,EAChB,SAAS,EACT,SAAS,EACT,KAAK,kBAAkB,GACxB,MAAM,gBAAgB,CAAC;AACxB,OAAO,EAAE,eAAe,EAAE,MAAM,iBAAiB,CAAC;AAClD,OAAO,EAAE,SAAS,EAAE,SAAS,EAAE,MAAM,gBAAgB,CAAC;AACtD,OAAO,EAAE,UAAU,EAAE,UAAU,EAAE,MAAM,iBAAiB,CAAC"}
1
+ {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../../../src/utils/parsing/index.ts"],"names":[],"mappings":"AAAA;;;;;GAKG;AAEH,OAAO,EAAE,SAAS,EAAE,SAAS,EAAE,MAAM,gBAAgB,CAAC;AACtD,OAAO,EAAE,UAAU,EAAE,eAAe,EAAE,uBAAuB,EAAE,MAAM,iBAAiB,CAAC;AACvF,OAAO,EACL,iBAAiB,EACjB,KAAK,iBAAiB,EACtB,iBAAiB,GAClB,MAAM,wBAAwB,CAAC;AAChC,OAAO,EACL,KAAK,qBAAqB,EAC1B,KAAK,oBAAoB,EACzB,aAAa,EACb,aAAa,GACd,MAAM,oBAAoB,CAAC;AAC5B,OAAO,EAAE,KAAK,EAAE,UAAU,EAAE,UAAU,EAAE,MAAM,iBAAiB,CAAC;AAChE,OAAO,EACL,KAAK,cAAc,EACnB,KAAK,gBAAgB,EACrB,KAAK,eAAe,EACpB,KAAK,iBAAiB,EACtB,KAAK,kBAAkB,EACvB,KAAK,iBAAiB,EACtB,KAAK,eAAe,EACpB,KAAK,SAAS,EACd,KAAK,WAAW,EAChB,SAAS,EACT,SAAS,EACT,KAAK,kBAAkB,GACxB,MAAM,gBAAgB,CAAC;AACxB,OAAO,EAAE,eAAe,EAAE,MAAM,iBAAiB,CAAC;AAClD,OAAO,EAAE,SAAS,EAAE,SAAS,EAAE,MAAM,gBAAgB,CAAC;AACtD,OAAO,EAAE,UAAU,EAAE,UAAU,EAAE,MAAM,iBAAiB,CAAC"}
@@ -7,6 +7,7 @@
7
7
  export { CsvParser, csvParser } from './csvParser.js';
8
8
  export { dateParser, parseDateString, parseDateStringDetailed } from './dateParser.js';
9
9
  export { FrontmatterParser, frontmatterParser, } from './frontmatterParser.js';
10
+ export { HtmlExtractor, htmlExtractor, } from './htmlExtractor.js';
10
11
  export { Allow, JsonParser, jsonParser } from './jsonParser.js';
11
12
  export { PdfParser, pdfParser, } from './pdfParser.js';
12
13
  export { thinkBlockRegex } from './thinkBlock.js';
@@ -1 +1 @@
1
- {"version":3,"file":"index.js","sourceRoot":"","sources":["../../../src/utils/parsing/index.ts"],"names":[],"mappings":"AAAA;;;;;GAKG;AAEH,OAAO,EAAE,SAAS,EAAE,SAAS,EAAE,MAAM,gBAAgB,CAAC;AACtD,OAAO,EAAE,UAAU,EAAE,eAAe,EAAE,uBAAuB,EAAE,MAAM,iBAAiB,CAAC;AACvF,OAAO,EACL,iBAAiB,EAEjB,iBAAiB,GAClB,MAAM,wBAAwB,CAAC;AAChC,OAAO,EAAE,KAAK,EAAE,UAAU,EAAE,UAAU,EAAE,MAAM,iBAAiB,CAAC;AAChE,OAAO,EAUL,SAAS,EACT,SAAS,GAEV,MAAM,gBAAgB,CAAC;AACxB,OAAO,EAAE,eAAe,EAAE,MAAM,iBAAiB,CAAC;AAClD,OAAO,EAAE,SAAS,EAAE,SAAS,EAAE,MAAM,gBAAgB,CAAC;AACtD,OAAO,EAAE,UAAU,EAAE,UAAU,EAAE,MAAM,iBAAiB,CAAC"}
1
+ {"version":3,"file":"index.js","sourceRoot":"","sources":["../../../src/utils/parsing/index.ts"],"names":[],"mappings":"AAAA;;;;;GAKG;AAEH,OAAO,EAAE,SAAS,EAAE,SAAS,EAAE,MAAM,gBAAgB,CAAC;AACtD,OAAO,EAAE,UAAU,EAAE,eAAe,EAAE,uBAAuB,EAAE,MAAM,iBAAiB,CAAC;AACvF,OAAO,EACL,iBAAiB,EAEjB,iBAAiB,GAClB,MAAM,wBAAwB,CAAC;AAChC,OAAO,EAGL,aAAa,EACb,aAAa,GACd,MAAM,oBAAoB,CAAC;AAC5B,OAAO,EAAE,KAAK,EAAE,UAAU,EAAE,UAAU,EAAE,MAAM,iBAAiB,CAAC;AAChE,OAAO,EAUL,SAAS,EACT,SAAS,GAEV,MAAM,gBAAgB,CAAC;AACxB,OAAO,EAAE,eAAe,EAAE,MAAM,iBAAiB,CAAC;AAClD,OAAO,EAAE,SAAS,EAAE,SAAS,EAAE,MAAM,gBAAgB,CAAC;AACtD,OAAO,EAAE,UAAU,EAAE,UAAU,EAAE,MAAM,iBAAiB,CAAC"}
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@cyanheads/mcp-ts-core",
3
- "version": "0.6.10",
3
+ "version": "0.6.11",
4
4
  "mcpName": "io.github.cyanheads/mcp-ts-core",
5
5
  "description": "Agent-native TypeScript framework for building MCP servers. Declarative definitions with auth, multi-backend storage, OpenTelemetry, and first-class support for Bun/Node/Cloudflare Workers.",
6
6
  "main": "dist/core/index.js",
@@ -159,12 +159,12 @@
159
159
  "yaml": "1.10.3"
160
160
  },
161
161
  "devDependencies": {
162
- "@biomejs/biome": "2.4.12",
162
+ "@biomejs/biome": "2.4.13",
163
163
  "@cloudflare/workers-types": "^4.20260423.1",
164
164
  "@hono/otel": "^1.1.1",
165
- "@opentelemetry/instrumentation-http": "^0.215.0",
166
165
  "@opentelemetry/exporter-metrics-otlp-http": "^0.215.0",
167
166
  "@opentelemetry/exporter-trace-otlp-http": "^0.215.0",
167
+ "@opentelemetry/instrumentation-http": "^0.215.0",
168
168
  "@opentelemetry/instrumentation-pino": "^0.61.0",
169
169
  "@opentelemetry/resources": "^2.7.0",
170
170
  "@opentelemetry/sdk-metrics": "^2.7.0",
@@ -183,20 +183,22 @@
183
183
  "bun-types": "^1.3.13",
184
184
  "chrono-node": "^2.9.0",
185
185
  "clipboardy": "^5.3.1",
186
+ "defuddle": "^0.18.1",
186
187
  "depcheck": "^1.4.7",
187
188
  "diff": "^9.0.0",
188
189
  "execa": "^9.6.1",
189
190
  "fast-check": "^4.7.0",
190
- "js-yaml": "^4.1.1",
191
191
  "ignore": "^7.0.5",
192
+ "js-yaml": "^4.1.1",
193
+ "linkedom": "^0.18.12",
192
194
  "node-cron": "^4.2.1",
193
195
  "openai": "^6.34.0",
194
196
  "papaparse": "^5.5.3",
195
197
  "partial-json": "^0.1.7",
196
198
  "pdf-lib": "^1.17.1",
197
199
  "pino-pretty": "^13.1.3",
198
- "sanitize-html": "^2.17.3",
199
200
  "repomix": "^1.13.1",
201
+ "sanitize-html": "^2.17.3",
200
202
  "tsc-alias": "^1.8.16",
201
203
  "typedoc": "^0.28.19",
202
204
  "typescript": "^6.0.3",
@@ -278,9 +280,11 @@
278
280
  "@opentelemetry/semantic-conventions": "^1.40.0",
279
281
  "@supabase/supabase-js": "^2.103.3",
280
282
  "chrono-node": "^2.9.0",
283
+ "defuddle": "^0.18.1",
281
284
  "diff": "latest",
282
285
  "fast-xml-parser": "latest",
283
286
  "js-yaml": "^4.1.1",
287
+ "linkedom": "^0.18.12",
284
288
  "node-cron": "^4.2.1",
285
289
  "openai": "^6.34.0",
286
290
  "papaparse": "^5.5.3",
@@ -327,6 +331,9 @@
327
331
  "chrono-node": {
328
332
  "optional": true
329
333
  },
334
+ "defuddle": {
335
+ "optional": true
336
+ },
330
337
  "diff": {
331
338
  "optional": true
332
339
  },
@@ -336,6 +343,9 @@
336
343
  "js-yaml": {
337
344
  "optional": true
338
345
  },
346
+ "linkedom": {
347
+ "optional": true
348
+ },
339
349
  "node-cron": {
340
350
  "optional": true
341
351
  },
@@ -1,127 +1,250 @@
1
1
  ---
2
2
  name: field-test
3
3
  description: >
4
- Exercise tools, resources, and prompts with real-world inputs to verify behavior end-to-end. Use after adding or modifying definitions, or when the user asks to test, try out, or verify their MCP surface. Calls each definition with realistic and adversarial inputs and produces a report of issues, pain points, and recommendations.
4
+ Exercise tools, resources, and prompts against a live HTTP server via MCP JSON-RPC over curl. Starts the server, surfaces the catalog, runs real and adversarial inputs, and produces a tight report with concrete findings and numbered follow-up options. Use after adding or modifying definitions, or when the user asks to test, try out, or verify their MCP surface.
5
5
  metadata:
6
6
  author: cyanheads
7
- version: "1.3"
7
+ version: "2.0"
8
8
  audience: external
9
9
  type: debug
10
10
  ---
11
11
 
12
12
  ## Context
13
13
 
14
- Unit tests (`add-test` skill) verify handler logic with mocked context. Field testing verifies the full picture: real server, real transport, real inputs, real outputs. It catches issues that unit tests miss — bad descriptions, awkward input shapes, unhelpful error messages, missing format functions, schema mismatches, silent divergence between `structuredContent` and model-visible `content[]`, and surprising edge-case behavior.
14
+ Unit tests (`add-test` skill) verify handler logic with mocked context. Field testing exercises the real HTTP transport with real JSON-RPC: starts the server, calls `initialize`, surfaces the catalog, runs inputs, and checks what a client actually sees. It catches what unit tests miss — awkward input shapes, unhelpful errors, missing format output, drift between `structuredContent` and `content[]`, edge-case surprises.
15
15
 
16
- **Actively use** the tools — don't just read their code.
16
+ **Actively call the tools. Don't read code and guess.**
17
17
 
18
18
  ---
19
19
 
20
20
  ## Steps
21
21
 
22
- ### 1. Surface available definitions
22
+ ### 1. Start the server
23
+
24
+ Write the helper to `/tmp/mcp-field-test.sh` once, then source it in every subsequent Bash call. Helper keeps PID / URL / session id in `/tmp/mcp-field-test.env` so state survives across tool invocations.
25
+
26
+ ```bash
27
+ cat > /tmp/mcp-field-test.sh <<'HELPER_EOF'
28
+ #!/bin/bash
29
+ # Field-test helper: manage an MCP HTTP server + JSON-RPC session across shell calls.
30
+ STATE_FILE="/tmp/mcp-field-test.env"
31
+ [ -f "$STATE_FILE" ] && . "$STATE_FILE"
32
+
33
+ mcp_start() {
34
+ local dir="${1:-$PWD}"
35
+ echo "building $dir ..."
36
+ (cd "$dir" && bun run rebuild) >/tmp/mcp-build.log 2>&1 \
37
+ || { echo "BUILD FAILED — see /tmp/mcp-build.log"; return 1; }
38
+ echo "starting server ..."
39
+ (cd "$dir" && bun run start:http) >/tmp/mcp-server.log 2>&1 &
40
+ local pid=$!
41
+ local line=""
42
+ for _ in $(seq 1 40); do
43
+ line=$(grep -Eo 'listening at http://[^" ]+/mcp' /tmp/mcp-server.log | head -1)
44
+ [ -n "$line" ] && break
45
+ sleep 0.25
46
+ done
47
+ if [ -z "$line" ]; then
48
+ echo "server failed to start — see /tmp/mcp-server.log"
49
+ kill "$pid" 2>/dev/null
50
+ return 1
51
+ fi
52
+ local url="${line#listening at }"
53
+ local port; port=$(echo "$url" | sed -E 's|.*:([0-9]+)/.*|\1|')
54
+ cat > "$STATE_FILE" <<EOF
55
+ export MCP_PID=$pid
56
+ export MCP_URL=$url
57
+ export MCP_PORT=$port
58
+ EOF
59
+ . "$STATE_FILE"
60
+ echo "ready pid=$pid url=$url"
61
+ }
62
+
63
+ mcp_init() {
64
+ [ -z "$MCP_URL" ] && { echo "run mcp_start first"; return 1; }
65
+ local hdr="/tmp/mcp-init-headers.txt"
66
+ curl -sS -D "$hdr" -X POST "$MCP_URL" \
67
+ -H "Content-Type: application/json" \
68
+ -H "Accept: application/json, text/event-stream" \
69
+ -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"field-test","version":"2.0"}}}' >/dev/null
70
+ local sid; sid=$(grep -i '^mcp-session-id:' "$hdr" | awk '{print $2}' | tr -d '\r\n')
71
+ [ -z "$sid" ] && { echo "no session id returned"; return 1; }
72
+ cat > "$STATE_FILE" <<EOF
73
+ export MCP_PID=$MCP_PID
74
+ export MCP_URL=$MCP_URL
75
+ export MCP_PORT=$MCP_PORT
76
+ export MCP_SID=$sid
77
+ EOF
78
+ . "$STATE_FILE"
79
+ curl -sS -X POST "$MCP_URL" \
80
+ -H "Content-Type: application/json" \
81
+ -H "Accept: application/json, text/event-stream" \
82
+ -H "Mcp-Session-Id: $sid" \
83
+ -d '{"jsonrpc":"2.0","method":"notifications/initialized"}' >/dev/null
84
+ echo "session=$sid"
85
+ }
86
+
87
+ # Usage: mcp_call METHOD [JSON_PARAMS]
88
+ # Prints the JSON-RPC response (SSE framing stripped). Pipe to `jq`.
89
+ mcp_call() {
90
+ [ -z "$MCP_SID" ] && { echo "run mcp_init first"; return 1; }
91
+ local method="$1"; local params="${2:-}"
92
+ local body
93
+ if [ -z "$params" ]; then
94
+ body=$(printf '{"jsonrpc":"2.0","id":%d,"method":"%s"}' "$RANDOM" "$method")
95
+ else
96
+ body=$(printf '{"jsonrpc":"2.0","id":%d,"method":"%s","params":%s}' "$RANDOM" "$method" "$params")
97
+ fi
98
+ curl -sS -X POST "$MCP_URL" \
99
+ -H "Content-Type: application/json" \
100
+ -H "Accept: application/json, text/event-stream" \
101
+ -H "Mcp-Session-Id: $MCP_SID" \
102
+ -d "$body" | sed -n 's/^data: //p'
103
+ }
104
+
105
+ mcp_stop() {
106
+ [ -n "$MCP_PID" ] && kill "$MCP_PID" 2>/dev/null
107
+ rm -f "$STATE_FILE"
108
+ echo "stopped"
109
+ }
110
+ HELPER_EOF
111
+
112
+ . /tmp/mcp-field-test.sh
113
+ mcp_start /absolute/path/to/server # replace with the target server
114
+ ```
115
+
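A note on the `sed -n 's/^data: //p'` stage in `mcp_call`: it prints only lines that begin with `data: `, so a server that answered with a plain `application/json` body (which the `Accept` header permits) would produce empty output. Below is a minimal sketch of a more tolerant filter, assuming the body is either SSE-framed or raw JSON. `strip_sse` is a hypothetical name and is not part of the helper above.

```bash
#!/bin/bash
# strip_sse (hypothetical): reads a response body on stdin.
# If any line starts with "data: ", print only those payloads;
# otherwise pass the body through unchanged (plain JSON response).
strip_sse() {
  local body
  body=$(tr -d '\r')            # drop CRs so CRLF-framed events match too
  if printf '%s\n' "$body" | grep -q '^data: '; then
    printf '%s\n' "$body" | sed -n 's/^data: //p'
  else
    printf '%s\n' "$body"
  fi
}

# Both shapes yield the bare JSON payload.
printf 'event: message\ndata: {"id":1}\n\n' | strip_sse
printf '{"id":2}\n' | strip_sse
```

One option would be to use it in place of the final `sed` stage in `mcp_call`. Whether a given server ever replies without SSE framing is not something the diff states.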
116
+ **Notes**
117
+
118
+ - `MCP_HTTP_PORT` is a *starting* port — the server auto-increments if taken. Helper parses the real URL from the log (`HTTP transport listening at ...`).
119
+ - If `bun run rebuild` fails, stop. Don't field-test broken code — fix the build first.
120
+ - If a server is already listening on the project's port (`lsof -i :<port>`), confirm with the user before killing it; it may be their own session.
121
+
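The URL and port parsing described in the first note can be tried in isolation. The sketch below reuses the same `grep` and `sed` expressions as `mcp_start`, applied to a sample log line. The line's JSON wrapper and the port `3011` are invented for illustration; only the `HTTP transport listening at ...` wording comes from the skill text.

```bash
#!/bin/bash
# parse_listen_url: reads log text on stdin, prints the first advertised MCP URL.
parse_listen_url() {
  grep -Eo 'listening at http://[^" ]+/mcp' | head -1 | sed 's/^listening at //'
}

# port_of URL: prints the numeric port embedded in the URL.
port_of() {
  printf '%s\n' "$1" | sed -E 's|.*:([0-9]+)/.*|\1|'
}

# Hypothetical log line; real log formatting may differ.
sample='{"level":30,"msg":"HTTP transport listening at http://127.0.0.1:3011/mcp"}'
url=$(printf '%s\n' "$sample" | parse_listen_url)
echo "url=$url port=$(port_of "$url")"
```

Because the URL is read from the log rather than from `MCP_HTTP_PORT`, the parsed port stays correct when the server has auto-incremented past a busy port.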
122
+ ### 2. Initialize the session
123
+
124
+ ```bash
125
+ . /tmp/mcp-field-test.sh
126
+ mcp_init
127
+ ```
128
+
129
+ Runs `initialize`, captures the session id, sends `notifications/initialized`.
130
+
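The session-id capture can be checked without a running server. This sketch applies the same `grep | awk | tr` pipeline that `mcp_init` uses to a header dump of the kind `curl -D` writes. The header values (`abc123`, the content type) are made up for the example.

```bash
#!/bin/bash
# session_id_from_headers FILE: prints the Mcp-Session-Id value from a
# curl -D header dump, matching the header name case-insensitively and
# stripping the trailing CR/LF.
session_id_from_headers() {
  grep -i '^mcp-session-id:' "$1" | awk '{print $2}' | tr -d '\r\n'
}

# Hypothetical header dump with CRLF line endings, as HTTP uses.
hdr=$(mktemp)
printf 'HTTP/1.1 200 OK\r\nContent-Type: text/event-stream\r\nMcp-Session-Id: abc123\r\n\r\n' > "$hdr"
sid=$(session_id_from_headers "$hdr")
rm -f "$hdr"
echo "session=$sid"
```

The `tr -d '\r\n'` step matters: without it, the carriage return would travel into the `Mcp-Session-Id` request header on later calls.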
131
+ ### 3. Surface the catalog
132
+
133
+ ```bash
134
+ . /tmp/mcp-field-test.sh
135
+ mcp_call tools/list | jq '.result.tools[] | {name, description, inputSchema}'
136
+ mcp_call resources/list | jq '.result.resources[] | {uri, name, mimeType}'
137
+ mcp_call prompts/list | jq '.result.prompts[] | {name, description, arguments}'
138
+ ```
139
+
140
+ Present a compact catalog to the user: each definition's name + 1-line description. Flag vague or missing descriptions as you go — those feed into the report. Use this to build the test plan.
141
+
142
+ ### 4. Plan the test pass
143
+
144
+ **Budget.** Don't run every category against every definition — the cross-product is infeasible. Apply the **universal battery** to everything; apply **situational categories** only when the definition triggers them.
145
+
146
+ **Universal battery — run on every tool**
147
+
148
+ | Category | What to verify |
149
+ |:---------|:---------------|
150
+ | Happy path | One realistic input. Output shape matches schema. `content[]` text reads clearly to a human. |
151
+ | `structuredContent` ↔ `content[]` parity | Every field in `structuredContent` is surfaced in the text. Parity gap = client-specific blindness. |
152
+ | Input error | One invalid input (wrong type or missing required). Error text says *what*, *why*, *how to fix*. |
153
+
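The parity row can be roughly approximated in shell. The sketch below is a crude heuristic, not the framework's `format-parity` lint rule: it lists scalar values found in `structuredContent` that do not appear verbatim in the text blocks of `content[]`. It assumes `jq` is installed; `parity_gaps` is a hypothetical name. Short values such as `true` or `2` can match by coincidence, so treat the output as a prompt for manual review.

```bash
#!/bin/bash
# parity_gaps RESPONSE_JSON (hypothetical helper): prints every scalar value
# from .result.structuredContent that is missing from the concatenated
# text of .result.content[]. Substring matching only; no field-name awareness.
parity_gaps() {
  local text
  text=$(printf '%s' "$1" |
    jq -r '[.result.content[]? | select(.type == "text") | .text] | join("\n")')
  printf '%s' "$1" |
    jq -r '.result.structuredContent | .. | scalars | select(. != null) | tostring' |
    while IFS= read -r value; do
      case "$text" in
        *"$value"*) ;;                    # value is visible in the text
        *) printf '%s\n' "$value" ;;      # value only exists in structuredContent
      esac
    done
}
```

A typical use would be `parity_gaps "$(mcp_call tools/call '<params>')"`, where empty output means no gap was detected by this check.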
154
+ **Situational — add only when triggered**
155
+
156
+ | Trigger (look in input schema or `annotations`) | Add category |
157
+ |:------------------------------------------------|:-------------|
158
+ | `include` / `fields` / `expand` / `view` / `projection` parameter | Field selection: non-default value renders requested fields |
159
+ | Array return with `query` / `filter` inputs | Empty result: does response explain *why* (echo criteria, suggest broadening)? |
160
+ | Batch / bulk input (arrays of IDs, multi-item ops) | Partial success: mix valid + invalid items |
161
+ | `annotations.readOnlyHint: true` | Confirm no mutation happened |
162
+ | `annotations.idempotentHint: true` | Call twice with same input — safe? |
163
+ | Hits external API / live upstream | One call that exercises upstream; note rate-limit / timeout / transient-failure behavior |
164
+ | Chained with other tools (search → detail → act) | Run one representative chain end-to-end; does each step return the IDs/cursors the next needs? |
165
+ | `cursor` / `offset` / `limit` params | Pagination: second page, end-of-list |
23
166
 
24
- List the MCP tools, resources, and prompts available in your environment. This confirms the server is connected and gives you everything you need — names, descriptions, parameter schemas — to plan your tests.
167
+ **Resources.** Happy path, not-found URI, `list` if defined, pagination if used.
168
+ **Prompts.** Happy path, defaults omitted, skim message quality.
25
169
 
26
- If you don't see any MCP tools from this server, ask the user to connect it first (e.g. `claude mcp add` for Claude Code, or the equivalent for their client). Don't proceed until the tools are visible.
170
+ **Sampling for large servers.** If more than 15 tools, run the universal battery on all, but pick roughly 30–40% for situational testing. Weight toward: write-shaped tools, complex schemas, external deps. List which ones you skipped in the report.
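As a worked example of the sampling rule, the sketch below computes how many tools to pick for situational testing. It assumes 35% (the midpoint of the 30–40% guidance) rounded up, and treats servers with 15 or fewer tools as fully covered. Both choices are illustrative, and `situational_sample_size` is a hypothetical name.

```bash
#!/bin/bash
# situational_sample_size TOTAL: prints how many tools to sample for
# situational categories. Assumption: 35% rounded up once TOTAL exceeds 15.
situational_sample_size() {
  local total="$1"
  if [ "$total" -le 15 ]; then
    echo "$total"
  else
    echo $(( (total * 35 + 99) / 100 ))   # integer ceiling of total * 0.35
  fi
}

situational_sample_size 40   # prints 14
```

The count only sets the budget. Which tools fill it still follows the weighting above: write-shaped tools, complex schemas, external dependencies.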
27
171
 
28
- Present what you find: each definition's name, parameters (with types and descriptions), and any notable schema details (optional fields, enums, constraints). This is your test surface.
172
+ **Auth & external state.**
29
173
 
30
- ### 2. Test each definition
174
+ - If a tool needs real API keys and they're not set, note `skipped — requires $VAR` and move on. Don't fabricate inputs.
175
+ - Tools that write to real external systems (third-party APIs, shared DBs): confirm with the user before running, or use a dry-run input if one exists.
31
176
 
32
- For every tool, resource, and prompt, run through these categories:
177
+ ### 5. Execute
33
178
 
34
- #### Tools
179
+ Use `TaskCreate` — one task per definition. Mark complete as you go. Don't batch.
35
180
 
36
- | Category | What to test |
37
- |:---------|:-------------|
38
- | **Happy path** | Realistic input that should succeed. Verify output shape matches the output schema. Verify format function produces sensible content blocks. |
39
- | **`structuredContent` parity** | The `format-parity` lint rule already asserts every terminal field in the output schema appears in `format()`'s rendered text (via sentinel injection at startup). Field testing layers real-data checks on top: are values rendered accurately (not just their labels)? Do conditional-render branches in `format()` still render every field when specific values are present? Does the content look right to a human reading the LLM's view? |
40
- | **Variations** | Different valid input combinations — optional fields omitted, optional fields included, different enum values, min/max boundaries. |
41
- | **Field selection / projection** | For tools with `fields`, `include`, `expand`, `view`, or similar parameters, call the tool with non-default selections. Verify the handler returns the requested fields and `format()` renders each requested field rather than a hardcoded summary subset. |
42
- | **Edge cases** | Empty strings, zero values, very long inputs, special characters, Unicode. |
43
- | **Error paths** | Missing required fields, wrong types, nonexistent IDs, inputs that should trigger domain errors. Verify errors are clear and actionable — they should name what went wrong, why, and what to do next. |
44
- | **Empty results** | Inputs that match nothing. Verify the response explains *why* (echoes criteria, suggests broadening) rather than returning a bare empty array. |
45
- | **Partial success** | For tools that operate on multiple items, test cases where some succeed and some fail. Verify both outcomes are reported — not just the successes. |
46
- | **Annotations** | Review tool `annotations` (`readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint`) against actual behavior. If a tool is marked read-only, verify it does not mutate state. If it is marked idempotent, verify retries with the same input are safe. If it is marked open-world false, verify it is not silently depending on live external systems. |
47
- | **Workflow chaining** | For servers with multi-step workflows, execute 1-2 representative chains end-to-end. Example: search → detail → follow-up action. Verify each step returns the IDs, cursors, URIs, tokens, or state needed for the next step without guessing. |
48
- | **Response quality** | Inspect successful responses for: (1) chaining IDs needed for follow-up calls, (2) operational metadata (counts, applied filters, truncation notices), (3) filtering transparency (if anything was excluded, does the response say what and how to include it?), (4) reasonable response size (not dumping unbounded data into context). See the `add-tool` skill's **Tool Response Design** section for the full set of patterns. |
49
- | **Resilience** | For tools backed by external APIs or slow subsystems, test or explicitly note rate-limit, timeout, and transient-failure behavior. Verify retries/backoff happen where intended, or at minimum that the error message clearly tells the user whether to retry, wait, or change input. |
50
- | **Descriptions** | Read every field's `.describe()` — would a user/LLM understand what to provide? Flag vague or missing descriptions. |
181
+ For each call, capture: input sent, response (trim huge payloads to files), whether `isError: true` appeared, anything surprising (slow response, parity drift, unhelpful text, crash).
51
182
 
52
- #### Resources
183
+ **Interpreting responses**
53
184
 
54
- | Category | What to test |
55
- |:---------|:-------------|
56
- | **Happy path** | Valid URI with known params. Verify returned content and MIME type. |
57
- | **List** | Call `list` if defined. Verify returned resources have names and valid URIs. |
58
- | **Not found** | URI with nonexistent params. Verify a clear error, not a crash. |
59
- | **Pagination** | If the resource uses `extractCursor`/`paginateArray`, test with varying limits and cursors. |
185
+ - Tool domain errors return `{result: {content: [...], isError: true}}` — they live in `result`, not `error`. Check `isError`, not the JSON-RPC error field.
186
+ - JSON-RPC `error` only appears for protocol issues (bad session, malformed envelope, unknown method).
187
+ - `mcp_call` already strips SSE framing. Pipe to `jq` for readability.
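The distinction between protocol errors and tool errors can be made mechanical. The sketch below assumes `jq` is available and that its input is a single JSON-RPC response object, as printed by `mcp_call`. `classify_response` is a hypothetical name, not part of the helper.

```bash
#!/bin/bash
# classify_response RESPONSE_JSON (hypothetical helper): prints one of
#   protocol-error  top-level JSON-RPC "error" member present
#   tool-error      result.isError is true (domain error inside "result")
#   ok              anything else
classify_response() {
  printf '%s' "$1" | jq -r '
    if has("error") then "protocol-error"
    elif (.result.isError // false) then "tool-error"
    else "ok"
    end'
}
```

Recording the class next to each captured call (for example `classify_response "$(mcp_call tools/list)"`) makes it easier to spot a valid input that came back as `tool-error`.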
60
188
 
61
- #### Prompts
189
+ ### 6. Tear down
62
190
 
63
- | Category | What to test |
64
- |:---------|:-------------|
65
- | **Happy path** | Valid args. Verify generated messages are well-formed. |
66
- | **Defaults** | Omit optional args. Verify the output still makes sense. |
67
- | **Content quality** | Read the generated messages — are they clear, well-structured prompts? |
191
+ ```bash
192
+ . /tmp/mcp-field-test.sh
193
+ mcp_stop
194
+ ```
68
195
 
69
- ### 3. Track progress
196
+ Kills the background server, clears state. Do this *before* writing the report so nothing leaks into the next session.
70
197
 
71
- Use a todo list to track each definition and its test status. Mark each as you go — don't batch.
198
+ ### 7. Report
72
199
 
73
- ### 4. Produce the report
200
+ Three sections. Tight. The user should be able to skim the summary, read details only for what matters, and act on numbered options.
74
201
 
75
- After testing everything, present a structured report:
202
+ #### Summary (1 paragraph)
76
203
 
77
- #### Summary table
204
+ One paragraph. How many definitions exercised, how many passed clean, how many have issues, and the single most important finding. No tables, no lists.
78
205
 
79
- | Definition | Type | Status | Issues |
80
- |:-----------|:-----|:-------|:-------|
81
- | `acme_search_items` | tool | pass | — |
82
- | `acme_get_item` | tool | issues | Error message unhelpful for missing ID |
83
- | `item://` | resource | fail | Crashes on nonexistent ID |
206
+ #### Findings
84
207
 
85
- #### Detailed findings
208
+ Only include definitions with issues. Group by severity. Each finding is 2–4 lines unless it genuinely needs more.
86
209
 
87
- For each definition with issues, include:
210
+ | Severity | Meaning |
211
+ |:---------|:--------|
212
+ | **bug** | Broken: crash, wrong output, `isError: true` on valid input, data loss, schema violation |
213
+ | **ux** | Works but degrades the user/LLM experience: vague description, unhelpful error text, missing `format()`, parity drift, annotations that don't match behavior |
214
+ | **nit** | Polish: phrasing, inconsistent tone, minor doc gaps |
88
215
 
89
- - **What happened** — the input, the output or error, and what was expected
90
- - **Severity** — `bug` (broken behavior), `ux` (works but confusing/unhelpful), `nit` (minor polish)
91
- - **Recommendation** — specific fix suggestion
216
+ Format:
92
217
 
93
- #### Pain points
218
+ ```
219
+ **<tool_name> — <bug|ux|nit>**
220
+ Input: `<short input>` → <what happened>
221
+ Expected: <what should happen>
222
+ Fix: <one sentence>
223
+ ```
94
224
 
95
- Cross-cutting observations that aren't tied to a single definition:
225
+ #### Options
96
226
 
97
- - Inconsistent error message patterns across tools
98
- - Missing format functions (raw JSON returned to user)
99
- - `structuredContent` contains data that `content[]` silently drops
100
- - Requested projected fields are returned programmatically but not rendered for the model
101
- - Description quality issues (vague, missing, or misleading)
102
- - Schema design issues (required fields that should be optional, missing defaults, overly broad types, non-JSON-Schema-serializable types like `z.custom()` or `z.date()`)
103
- - Annotation hints that do not match real behavior (`readOnlyHint`, `idempotentHint`, `openWorldHint`)
104
- - Response quality issues (empty results with no context, silent filtering, missing chaining IDs, oversized payloads, no operational metadata)
105
- - Multi-step workflows that cannot be completed because intermediate outputs omit required IDs, cursors, or URIs
106
- - Error messages that don't guide recovery (generic "not found" instead of naming alternatives)
107
- - Resilience issues (rate limits, timeouts, transient upstream failures handled poorly or explained poorly)
108
- - Performance observations (unexpectedly slow responses)
227
+ Numbered, actionable, cherry-pickable. Each item maps to a concrete change.
228
+
229
+ ```
230
+ 1. Fix empty-result message in `pubmed_search_articles`: echo criteria (finding #2)
231
+ 2. Add `format()` to `pubmed_lookup_mesh`: currently returns raw JSON (finding #5)
232
+ 3. Tighten `ids` description in `pubmed_fetch_articles`: silent on PMID vs DOI (finding #8)
233
+ ```
234
+
235
+ End with:
236
+
237
+ > Pick by number (e.g. "do 1, 3, 5" or "expand on 2").
109
238
 
110
239
  ---
111
240
 
112
241
  ## Checklist
113
242
 
114
- - [ ] All registered tools tested (happy path + edge cases + empty results)
115
- - [ ] All registered resources tested (happy path + not found)
116
- - [ ] All registered prompts tested (happy path + defaults)
117
- - [ ] Error messages reviewed for clarity and recovery guidance
118
- - [ ] Empty-result responses reviewed for context (criteria echo, suggestions)
119
- - [ ] `structuredContent` and `content[]` reviewed for parity
120
- - [ ] Field-selection / projection behavior reviewed where applicable
121
- - [ ] Response quality reviewed (chaining IDs, metadata, filtering transparency, payload size)
122
- - [ ] Tool annotations reviewed against actual behavior
123
- - [ ] Representative multi-step workflows exercised where applicable
124
- - [ ] External API resilience reviewed where applicable (rate limits, timeouts, transient failures)
125
- - [ ] Descriptions reviewed for completeness and accuracy
126
- - [ ] Format functions verified (or absence noted)
127
- - [ ] Summary report presented to user
243
+ - [ ] Server built and started; real port parsed from log
244
+ - [ ] Session initialized; `notifications/initialized` sent
245
+ - [ ] Catalog surfaced and presented
246
+ - [ ] Universal battery run on every definition
247
+ - [ ] Situational categories applied only when triggered
248
+ - [ ] External-state / auth-gated tools handled explicitly (run, skip, or confirm)
249
+ - [ ] Server stopped; state file removed
250
+ - [ ] Report: summary paragraph, grouped findings, numbered options