npm - mcp-web-reader - Versions diffs - 2.0.0 → 2.1.0 - Mend

mcp-web-reader 2.0.0 → 2.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3) hide show

package/README.md CHANGED Viewed

@@ -1,255 +1,109 @@
 # MCP Web Reader
-一个强大的 MCP (Model Context Protocol) 服务器，让 Claude 和其他大语言模型能够读取和解析网页内容。支持突破访问限制，轻松获取微信文章、时代杂志等受保护内容。
+A powerful MCP (Model Context Protocol) server that enables Claude and other LLMs to read and parse web content. Bypasses access restrictions for WeChat articles, paywalled sites, and Cloudflare-protected pages.
-## 功能特点
+[简体中文](./README_CN.md)
-- 🚀 **三引擎支持**：集成 Jina Reader API、本地解析器和 Playwright 浏览器
-- 🔄 **智能降级**：Jina Reader → 本地解析 → Playwright 浏览器三层自动切换
-- 🌐 **突破限制**：使用 Playwright 处理 Cloudflare、验证码等访问限制
-- 📦 **批量处理**：支持同时获取多个 URL
-- 🎯 **灵活控制**：可选择强制使用特定解析方式
-- 📝 **Markdown 输出**：自动转换为清晰的 Markdown 格式
+## Features
-## 安装
+- 🚀 **Multi-engine**: Jina Reader API, local parser, and Playwright browser
+- 🔄 **Smart fallback**: Auto-switches Jina → Local → Playwright browser
+- 🌐 **Bypass restrictions**: Cloudflare, CAPTCHAs, access controls
+- 📦 **Batch processing**: Fetch multiple URLs simultaneously
+- 📝 **Markdown output**: Automatic conversion to clean Markdown
-### 方法 1：从源码安装
+## Installation
 ```bash
-# 克隆仓库
-git clone https://github.com/zacfire/mcp-web-reader.git
-cd mcp-web-reader
-# 安装依赖
-npm install
-# 构建项目
-npm run build
-# 安装 Playwright 浏览器（必需）
-npx playwright install chromium
+npm install -g mcp-web-reader
 ```
-### 方法 2：使用 npm 安装（推荐）
+> **Note**: Chromium browser (~100-200MB) will be automatically downloaded. This is required for:
+> - WeChat articles (need browser rendering)
+> - Cloudflare-protected sites
+> - JavaScript-heavy sites
+> - CAPTCHA/access restrictions
-发布后，您可以简单地通过 npm 安装：
+Download may take 1-5 minutes depending on network speed.
-```bash
-npm install -g mcp-web-reader
-```
-**首次发布步骤**：
-如果这是第一次发布，请运行提供的发布脚本：
+### From Source
 ```bash
-# 确保已登录 npm
-npm login
-# 运行发布脚本
-./publish.sh
+git clone https://github.com/Gracker/mcp-web-reader.git
+cd mcp-web-reader
+npm install
+npm run build
 ```
-## 配置
+## Configuration
-### 快速配置
+### Claude Desktop
-在 Claude Desktop 的配置文件中添加：
+Add to your config file:
 **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
 **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
 ```json
 {
   "mcpServers": {
     "web-reader": {
-      "command": "node",
-      "args": ["/absolute/path/to/mcp-web-reader/dist/index.js"]
+      "command": "mcp-web-reader"
     }
   }
 }
 ```
-**重要**: 将 `/absolute/path/to/mcp-web-reader/dist/index.js` 替换为你的实际路径。
-### 详细配置指南
-📖 **完整的使用指南**：请查看 [USAGE_GUIDE.md](./USAGE_GUIDE.md)，包含：
-- **命令行使用（推荐）** - 适合使用 CLI 的用户
-- Claude Desktop 配置
-- Claude Code (Cursor) 配置
-- 其他 MCP 客户端配置
-- 使用示例和故障排除
-📖 **命令行专用指南**：请查看 [CLI_USAGE.md](./CLI_USAGE.md)，包含：
-- 使用 MCP Inspector 测试
-- 创建 CLI 包装器
-- 集成到自定义脚本
-- 命令行工具使用示例
-## 使用方法
-### 命令行使用（推荐）
-如果你主要使用命令行的 Claude，可以使用提供的 CLI 工具：
-```bash
-# 智能获取（自动降级）
-node cli.js fetch https://example.com
-# 强制使用 Jina Reader
-node cli.js jina https://example.com
-# 强制使用本地解析
-node cli.js local https://example.com
-# 强制使用浏览器模式（适用于微信文章等受限网站）
-node cli.js browser https://mp.weixin.qq.com/...
-```
-或者使用 MCP Inspector 进行交互式测试：
+### Claude Code
 ```bash
-npx @modelcontextprotocol/inspector node dist/index.js
+claude mcp add web-reader -- mcp-web-reader
+claude mcp list
 ```
-📖 详细说明请查看 [CLI_USAGE.md](./CLI_USAGE.md)
-### 在 Claude 中使用
-配置完成后，在 Claude 中可以使用以下命令：
-1.  **智能获取（推荐）**
-    * "请获取 [https://example.com](https://example.com) 的内容"
-    * 自动三层降级：Jina Reader → 本地解析 → Playwright 浏览器
-2.  **批量获取**
-    * "请获取这些网页：[url1, url2, url3]"
-    * 每个URL都享受智能降级策略
+## Usage
-3.  **强制使用 Jina Reader**
-    * "使用 Jina Reader 获取 [https://example.com](https://example.com)"
+In Claude:
+- "Fetch content from https://example.com"
+- "Get content using browser for https://mp.weixin.qq.com/..."
+- "Fetch multiple URLs: [url1, url2, url3]"
-4.  **强制使用本地解析**
-    * "使用本地解析器获取 [https://example.com](https://example.com)"
+## Supported Sites
-5.  **强制使用浏览器模式**
-    * "使用浏览器获取 [https://example.com](https://example.com)"
-    * 直接跳过其他方式，适用于确定有访问限制的网站
+- WeChat articles (mp.weixin.qq.com)
+- Paywalled sites (NYT, Time Magazine, etc.)
+- Cloudflare-protected sites
+- JavaScript-heavy sites
+- CAPTCHA-protected sites
-## 支持的受限网站类型
+## Tools
-✅ **微信公众号文章** - 自动绕过访问限制
-✅ **时代杂志、纽约时报** - 突破付费墙和地区限制
-✅ **Cloudflare 保护网站** - 通过真实浏览器绕过检测
-✅ **需要 JavaScript 渲染的页面** - 完整执行页面脚本
-✅ **有验证码/人机验证的网站** - 模拟真实用户行为
+- `fetch_url` - Smart fetching with automatic fallback
+- `fetch_url_with_jina` - Force Jina Reader
+- `fetch_url_local` - Force local parsing
+- `fetch_url_with_browser` - Force browser mode (for restricted sites)
+- `fetch_multiple_urls` - Batch URL fetching
-## 工具列表
+## Architecture
--   `fetch_url` - 智能获取（三层降级：Jina → 本地 → Playwright）
--   `fetch_url_with_jina` - 强制使用 Jina Reader
--   `fetch_url_local` - 强制使用本地解析器
--   `fetch_url_with_browser` - 强制使用 Playwright 浏览器（突破访问限制）
--   `fetch_multiple_urls` - 批量获取多个 URL
-## 技术架构
-### 智能降级策略
-```
-用户请求 URL
-    ↓
-1. Jina Reader API (最快，成功率高)
-    ↓ 失败
-2. 本地解析器 (Node.js + JSDOM)
-    ↓ 检测到访问限制
-3. Playwright 浏览器 (真实浏览器，突破限制)
-```
-### 访问限制检测
-自动识别以下情况并启用浏览器模式：
-- HTTP 状态码：403, 429, 503, 520-524
-- 错误关键词：Cloudflare, CAPTCHA, Access Denied, Rate Limit
-- 内容关键词：Security Check, Human Verification
-## 开发
-```bash
-# 开发模式（自动重新编译）
-npm run dev
-# 构建
-npm run build
-# 测试运行
-npm start
-# 安装浏览器二进制文件（首次使用必需）
-npx playwright install chromium
-```
-## 性能优化
-- ⚡ **浏览器实例复用** - 避免重复启动开销
-- 🚫 **资源过滤** - 阻止图片、样式表等不必要加载
-- 🎯 **智能选择** - 优先使用快速方法，必要时才用浏览器
-- 💾 **优雅关闭** - 正确清理浏览器资源
-## 验证安装
-### 测试 MCP 服务器
-1. **使用 MCP Inspector 测试**：
-```bash
-npx @modelcontextprotocol/inspector node dist/index.js
+Intelligent fallback:
 ```
-2. **测试工具功能**：
-在 Inspector 中输入以下 JSON 测试各种工具：
-```json
-{"method": "tools/call", "params": {"name": "fetch_url", "arguments": {"url": "https://example.com"}}}
+URL Request → Jina Reader → Local Parser → Playwright Browser
 ```
-### 在 Claude Desktop 中验证
-配置完成后，重启 Claude Desktop，然后在对话中输入：
-- "请获取 https://httpbin.org/json 的内容"
-如果能成功返回内容，说明安装成功。
-## 故障排除
+Auto-detects restrictions and switches to browser for:
+- HTTP status codes: 403, 429, 503, 520-524
+- Keywords: Cloudflare, CAPTCHA, Access Denied
+- Content patterns: Security checks, human verification
-### 常见问题
+## Development
-1. **"找不到模块" 错误**
-   - 确保已运行 `npm install`
-   - 确保已运行 `npm run build`
-2. **Claude Desktop 无法连接到 MCP 服务器**
-   - 检查配置文件路径是否正确
-   - 检查 `dist/index.js` 路径是否正确
-   - 重启 Claude Desktop
-3. **Playwright 浏览器相关错误**
-   - 确保已运行 `npx playwright install chromium`
-   - 检查系统是否支持图形界面（某些服务器环境可能需要额外配置）
-4. **微信文章无法获取**
-   - 微信文章需要 Playwright 浏览器模式
-   - 使用 `fetch_url_with_browser` 工具强制使用浏览器
-### 调试模式
-启用详细日志：
 ```bash
-DEBUG=* node dist/index.js
+npm run dev    # Development with auto-rebuild
+npm run build  # Build production version
+npm start      # Test run
 ```
-## 贡献
-欢迎提交 Pull Request！
-## 许可证
+## License
 MIT License

package/dist/index.js CHANGED Viewed

@@ -5,7 +5,7 @@ import fetch from "node-fetch";
 import { JSDOM } from "jsdom";
 import TurndownService from "turndown";
 import { chromium } from "playwright";
-// 创建服务器实例
+// Create server instance
 const server = new Server({
     name: "web-reader",
     version: "2.0.0",
@@ -14,19 +14,19 @@ const server = new Server({
         tools: {},
     },
 });
-// 初始化Turndown服务（将HTML转换为Markdown）
+// Initialize Turndown service (convert HTML to Markdown)
 const turndownService = new TurndownService({
     headingStyle: "atx",
     codeBlockStyle: "fenced",
 });
-// 配置Turndown规则
+// Configure Turndown rules
 turndownService.addRule("skipScripts", {
     filter: ["script", "style", "noscript"],
     replacement: () => "",
 });
-// 浏览器实例管理
+// Browser instance management
 let browser = null;
-// 获取或创建浏览器实例
+// Get or create browser instance
 async function getBrowser() {
     if (!browser) {
         browser = await chromium.launch({
@@ -34,7 +34,7 @@ async function getBrowser() {
             args: [
                 '--no-sandbox',
                 '--disable-dev-shm-usage',
-                '--disable-blink-features=AutomationControlled', // 禁用自动化检测
+                '--disable-blink-features=AutomationControlled', // Disable automation detection
                 '--disable-infobars',
                 '--window-size=1920,1080',
                 '--start-maximized',
@@ -43,14 +43,14 @@ async function getBrowser() {
     }
     return browser;
 }
-// 清理浏览器实例
+// Clean up browser instance
 async function closeBrowser() {
     if (browser) {
         await browser.close();
         browser = null;
     }
 }
-// URL验证函数
+// URL validation function
 function isValidUrl(urlString) {
     try {
         const url = new URL(urlString);
@@ -60,18 +60,18 @@ function isValidUrl(urlString) {
         return false;
     }
 }
-// 检测是否是微信文章链接
+// Check if it's a WeChat article link
 function isWeixinUrl(url) {
     return url.includes('mp.weixin.qq.com') || url.includes('weixin.qq.com');
 }
-// 检测是否需要使用浏览器模式
+// Check if browser mode is needed
 function shouldUseBrowser(error, statusCode, content) {
     const errorMessage = error.message.toLowerCase();
-    // 基于HTTP状态码判断
+    // Based on HTTP status codes
     if (statusCode && [403, 429, 503, 520, 521, 522, 523, 524].includes(statusCode)) {
         return true;
     }
-    // 基于错误消息判断
+    // Based on error messages
     const browserTriggers = [
         'cloudflare',
         'access denied',
@@ -83,13 +83,13 @@ function shouldUseBrowser(error, statusCode, content) {
         'blocked',
         'protection',
         'verification required',
-        '环境异常',
-        '验证'
+        'environment anomaly',
+        'verify'
     ];
     if (browserTriggers.some(trigger => errorMessage.includes(trigger))) {
         return true;
     }
-    // 基于响应内容判断
+    // Based on response content
     if (content) {
         const contentLower = content.toLowerCase();
         const contentTriggers = [
@@ -99,10 +99,10 @@ function shouldUseBrowser(error, statusCode, content) {
             'security check',
             'human verification',
             'captcha',
-            // 微信特有的验证关键词
-            '环境异常',
-            '去验证',
-            '完成验证后即可继续访问',
+            // WeChat-specific verification keywords
+            'environment anomaly',
+            'verify',
+            'complete verification to continue',
             'verify'
         ];
         if (contentTriggers.some(trigger => contentLower.includes(trigger))) {
@@ -111,12 +111,12 @@ function shouldUseBrowser(error, statusCode, content) {
     }
     return false;
 }
-// 使用Jina Reader获取内容
+// Fetch content using Jina Reader
 async function fetchWithJinaReader(url) {
     try {
         // Jina Reader API URL
         const jinaUrl = `https://r.jina.ai/${url}`;
-        // 创建超时控制器
+        // Create timeout controller
         const controller = new AbortController();
         const timeoutId = setTimeout(() => controller.abort(), 30000);
         const response = await fetch(jinaUrl, {
@@ -131,9 +131,9 @@ async function fetchWithJinaReader(url) {
             throw new Error(`Jina Reader API error! status: ${response.status}`);
         }
         const markdown = await response.text();
-        // 从Markdown中提取标题（通常是第一个#标题）
+        // Extract title from Markdown (usually the first # heading)
         const titleMatch = markdown.match(/^#\s+(.+)$/m);
-        const title = titleMatch ? titleMatch[1] : "无标题";
+        const title = titleMatch ? titleMatch[1] : "No title";
         return {
             title,
             content: markdown,
@@ -148,21 +148,21 @@ async function fetchWithJinaReader(url) {
     catch (error) {
         if (error instanceof Error) {
             if (error.name === 'AbortError') {
-                throw new Error(`Jina Reader请求超时（30秒）`);
+                throw new Error(`Jina Reader request timeout (30s)`);
             }
-            throw new Error(`Jina Reader获取失败: ${error.message}`);
+            throw new Error(`Jina Reader fetch failed: ${error.message}`);
         }
-        throw new Error(`Jina Reader获取失败: ${String(error)}`);
+        throw new Error(`Jina Reader fetch failed: ${String(error)}`);
     }
 }
-// 使用Playwright获取网页内容
+// Fetch web content using Playwright
 async function fetchWithPlaywright(url) {
     let page = null;
     const isWeixin = isWeixinUrl(url);
     try {
         const browserInstance = await getBrowser();
         page = await browserInstance.newPage();
-        // 设置真实的 User-Agent（模拟 Chrome on Mac）
+        // Set real User-Agent (simulate Chrome on Mac)
         const userAgent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36';
         await page.setExtraHTTPHeaders({
             'User-Agent': userAgent,
@@ -174,7 +174,7 @@ async function fetchWithPlaywright(url) {
             ...(isWeixin ? { 'Referer': 'https://mp.weixin.qq.com/' } : {}),
         });
         await page.setViewportSize({ width: 1920, height: 1080 });
-        // 微信文章需要加载样式以正确渲染，其他网站可以过滤
+        // WeChat articles need to load styles for correct rendering, filter for other sites
         if (!isWeixin) {
             await page.route('**/*', (route) => {
                 const resourceType = route.request().resourceType();
@@ -186,30 +186,30 @@ async function fetchWithPlaywright(url) {
                 }
             });
         }
-        // 导航到页面，设置更长的超时时间
+        // Navigate to page with longer timeout
         await page.goto(url, {
             timeout: 45000,
-            waitUntil: 'networkidle' // 等待网络空闲，确保 JS 完全执行
+            waitUntil: 'networkidle' // Wait for network idle to ensure JS execution
         });
-        // 微信文章需要更长的等待时间
+        // WeChat articles need longer wait time
         const waitTime = isWeixin ? 5000 : 2000;
         await page.waitForTimeout(waitTime);
-        // 获取页面标题
-        const title = await page.title() || "无标题";
-        // 移除不需要的元素
+        // Get page title
+        const title = await page.title() || "No title";
+        // Remove unwanted elements
         await page.evaluate(() => {
             const elementsToRemove = document.querySelectorAll('script, style, nav, header, footer, aside, .advertisement, .ads, .sidebar, .comments, .social-share');
             elementsToRemove.forEach(el => el.remove());
         });
-        // 获取主要内容（微信文章有特定的 DOM 结构）
+        // Get main content (WeChat articles have specific DOM structure)
         const htmlContent = await page.evaluate(() => {
-            // 微信文章特定选择器
+            // WeChat article specific selectors
             const weixinContent = document.querySelector('#js_content') ||
                 document.querySelector('.rich_media_content');
             if (weixinContent) {
                 return weixinContent.innerHTML;
             }
-            // 通用选择器
+            // Common selectors
             const mainContent = document.querySelector('main') ||
                 document.querySelector('article') ||
                 document.querySelector('[role="main"]') ||
@@ -220,9 +220,9 @@ async function fetchWithPlaywright(url) {
                 document.body;
             return mainContent ? mainContent.innerHTML : document.body.innerHTML;
         });
-        // 转换为Markdown
+        // Convert to Markdown
         const markdown = turndownService.turndown(htmlContent);
-        // 清理内容
+        // Clean content
         const cleanedContent = markdown
             .replace(/\n{3,}/g, "\n\n")
             .replace(/^\s+$/gm, "")
@@ -240,9 +240,9 @@ async function fetchWithPlaywright(url) {
     }
     catch (error) {
         if (error instanceof Error) {
-            throw new Error(`Playwright获取失败: ${error.message}`);
+            throw new Error(`Playwright fetch failed: ${error.message}`);
         }
-        throw new Error(`Playwright获取失败: ${String(error)}`);
+        throw new Error(`Playwright fetch failed: ${String(error)}`);
     }
     finally {
         if (page) {
@@ -250,13 +250,13 @@ async function fetchWithPlaywright(url) {
         }
     }
 }
-// 本地提取网页内容的函数
+// Local web content extraction function
 async function fetchWithLocalParser(url) {
     try {
-        // 创建超时控制器
+        // Create timeout controller
         const controller = new AbortController();
         const timeoutId = setTimeout(() => controller.abort(), 30000);
-        // 发送HTTP请求
+        // Send HTTP request
         const response = await fetch(url, {
             headers: {
                 "User-Agent": "Mozilla/5.0 (compatible; MCP-URLFetcher/2.0)",
@@ -267,17 +267,17 @@ async function fetchWithLocalParser(url) {
         if (!response.ok) {
             throw new Error(`HTTP error! status: ${response.status}`);
         }
-        // 获取HTML内容
+        // Get HTML content
         const html = await response.text();
-        // 使用JSDOM解析HTML
+        // Parse HTML with JSDOM
         const dom = new JSDOM(html);
         const document = dom.window.document;
-        // 获取标题
-        const title = document.querySelector("title")?.textContent || "无标题";
-        // 移除不需要的元素
+        // Get title
+        const title = document.querySelector("title")?.textContent || "No title";
+        // Remove unwanted elements
         const elementsToRemove = document.querySelectorAll("script, style, nav, header, footer, aside, .advertisement, .ads, .sidebar, .comments");
         elementsToRemove.forEach(el => el.remove());
-        // 获取主要内容区域
+        // Get main content area
         const mainContent = document.querySelector("main") ||
             document.querySelector("article") ||
             document.querySelector('[role="main"]') ||
@@ -286,9 +286,9 @@ async function fetchWithLocalParser(url) {
             document.querySelector(".post") ||
             document.querySelector(".entry-content") ||
             document.body;
-        // 转换为Markdown
+        // Convert to Markdown
         const markdown = turndownService.turndown(mainContent.innerHTML);
-        // 清理多余的空行和空格
+        // Clean extra whitespace
         const cleanedContent = markdown
             .replace(/\n{3,}/g, "\n\n")
             .replace(/^\s+$/gm, "")
@@ -307,62 +307,62 @@ async function fetchWithLocalParser(url) {
     catch (error) {
         if (error instanceof Error) {
             if (error.name === 'AbortError') {
-                throw new Error(`本地解析请求超时（30秒）`);
+                throw new Error(`Local parser request timeout (30s)`);
             }
-            throw new Error(`本地解析失败: ${error.message}`);
+            throw new Error(`Local parser failed: ${error.message}`);
         }
-        throw new Error(`本地解析失败: ${String(error)}`);
+        throw new Error(`Local parser failed: ${String(error)}`);
     }
 }
-// 智能获取网页内容（三层降级策略：Jina → 本地 → Playwright）
-// 对于微信等已知需要浏览器的网站，直接使用浏览器模式
+// Smart web content fetching (three-tier fallback: Jina → Local → Playwright)
+// For known sites requiring browser (like WeChat), use browser mode directly
 async function fetchWebContent(url, preferJina = true) {
-    // 微信文章直接使用浏览器模式，因为其他方式无法绕过验证
+    // WeChat articles use browser mode directly as other methods cannot bypass verification
     if (isWeixinUrl(url)) {
-        console.error("检测到微信文章，直接使用Playwright浏览器模式");
+        console.error("Detected WeChat article, using Playwright browser mode");
         return await fetchWithPlaywright(url);
     }
     if (preferJina) {
-        // 第一层：尝试Jina Reader
+        // Tier 1: Try Jina Reader
         try {
             return await fetchWithJinaReader(url);
         }
         catch (jinaError) {
-            console.error("Jina Reader失败，尝试本地解析:", jinaError instanceof Error ? jinaError.message : String(jinaError));
-            // 第二层：尝试本地解析
+            console.error("Jina Reader failed, trying local parser:", jinaError instanceof Error ? jinaError.message : String(jinaError));
+            // Tier 2: Try local parser
             try {
                 return await fetchWithLocalParser(url);
             }
             catch (localError) {
-                console.error("本地解析失败，检查是否需要浏览器模式:", localError instanceof Error ? localError.message : String(localError));
-                // 判断是否需要使用浏览器模式
+                console.error("Local parser failed, checking if browser mode needed:", localError instanceof Error ? localError.message : String(localError));
+                // Check if browser mode is needed
                 const jinaErr = jinaError instanceof Error ? jinaError : new Error(String(jinaError));
                 const localErr = localError instanceof Error ? localError : new Error(String(localError));
                 if (shouldUseBrowser(jinaErr) || shouldUseBrowser(localErr)) {
-                    console.error("检测到访问限制，使用Playwright浏览器模式");
+                    console.error("Detected access restrictions, using Playwright browser mode");
                     try {
-                        // 第三层：使用Playwright浏览器
+                        // Tier 3: Use Playwright browser
                         return await fetchWithPlaywright(url);
                     }
                     catch (browserError) {
-                        throw new Error(`所有方法都失败了。Jina: ${jinaErr.message}, 本地: ${localErr.message}, 浏览器: ${browserError instanceof Error ? browserError.message : String(browserError)}`);
+                        throw new Error(`All methods failed. Jina: ${jinaErr.message}, Local: ${localErr.message}, Browser: ${browserError instanceof Error ? browserError.message : String(browserError)}`);
                     }
                 }
                 else {
-                    throw new Error(`Jina和本地解析都失败了。Jina: ${jinaErr.message}, 本地: ${localErr.message}`);
+                    throw new Error(`Jina and local parser both failed. Jina: ${jinaErr.message}, Local: ${localErr.message}`);
                 }
             }
         }
     }
     else {
-        // 如果不优先使用Jina，直接从本地解析开始
+        // If not prioritizing Jina, start with local parser
         try {
             return await fetchWithLocalParser(url);
         }
         catch (localError) {
             const localErr = localError instanceof Error ? localError : new Error(String(localError));
             if (shouldUseBrowser(localErr)) {
-                console.error("本地解析失败，检测到访问限制，使用Playwright浏览器模式");
+                console.error("Local parser failed, detected access restrictions, using Playwright browser mode");
                 return await fetchWithPlaywright(url);
             }
             else {
@@ -371,23 +371,23 @@ async function fetchWebContent(url, preferJina = true) {
         }
     }
 }
-// 处理工具列表请求
+// Handle tool list requests
 server.setRequestHandler(ListToolsRequestSchema, async () => {
     return {
         tools: [
             {
                 name: "fetch_url",
-                description: "获取指定URL的网页内容，并转换为Markdown格式。默认使用Jina Reader，失败时自动切换到本地解析",
+                description: "Fetch web content from specified URL and convert to Markdown format. Uses Jina Reader by default, automatically falls back to local parser on failure",
                 inputSchema: {
                     type: "object",
                     properties: {
                         url: {
                             type: "string",
-                            description: "要获取内容的网页URL（必须是http或https协议）",
+                            description: "Webpage URL to fetch (must be http or https protocol)",
                         },
                         preferJina: {
                             type: "boolean",
-                            description: "是否优先使用Jina Reader（默认为true）",
+                            description: "Whether to prioritize Jina Reader (default: true)",
                             default: true,
                         },
                     },
@@ -396,7 +396,7 @@ server.setRequestHandler(ListToolsRequestSchema, async () => {
             },
             {
                 name: "fetch_multiple_urls",
-                description: "批量获取多个URL的网页内容",
+                description: "Batch fetch web content from multiple URLs",
                 inputSchema: {
                     type: "object",
                     properties: {
@@ -405,12 +405,12 @@ server.setRequestHandler(ListToolsRequestSchema, async () => {
                             items: {
                                 type: "string",
                             },
-                            description: "要获取内容的网页URL列表",
-                            maxItems: 10, // 限制最多10个URL
+                            description: "List of webpage URLs to fetch",
+                            maxItems: 10, // Limit to 10 URLs
                         },
                         preferJina: {
                             type: "boolean",
-                            description: "是否优先使用Jina Reader（默认为true）",
+                            description: "Whether to prioritize Jina Reader (default: true)",
                             default: true,
                         },
                     },
@@ -419,13 +419,13 @@ server.setRequestHandler(ListToolsRequestSchema, async () => {
             },
             {
                 name: "fetch_url_with_jina",
-                description: "强制使用Jina Reader获取网页内容（适用于复杂网页）",
+                description: "Force fetch using Jina Reader (suitable for complex webpages)",
                 inputSchema: {
                     type: "object",
                     properties: {
                         url: {
                             type: "string",
-                            description: "要获取内容的网页URL",
+                            description: "Webpage URL to fetch",
                         },
                     },
                     required: ["url"],
@@ -433,13 +433,13 @@ server.setRequestHandler(ListToolsRequestSchema, async () => {
             },
             {
                 name: "fetch_url_local",
-                description: "强制使用本地解析器获取网页内容（适用于简单网页或Jina不可用时）",
+                description: "Force fetch using local parser (suitable for simple webpages or when Jina is unavailable)",
                 inputSchema: {
                     type: "object",
                     properties: {
                         url: {
                             type: "string",
-                            description: "要获取内容的网页URL",
+                            description: "Webpage URL to fetch",
                         },
                     },
                     required: ["url"],
@@ -447,13 +447,13 @@ server.setRequestHandler(ListToolsRequestSchema, async () => {
             },
             {
                 name: "fetch_url_with_browser",
-                description: "强制使用Playwright浏览器获取网页内容（适用于有访问限制的网站，如Cloudflare保护、验证码等）",
+                description: "Force fetch using Playwright browser (suitable for websites with access restrictions, such as Cloudflare protection, CAPTCHA, etc.)",
                 inputSchema: {
                     type: "object",
                     properties: {
                         url: {
                             type: "string",
-                            description: "要获取内容的网页URL",
+                            description: "Webpage URL to fetch",
                         },
                     },
                     required: ["url"],
@@ -462,23 +462,23 @@ server.setRequestHandler(ListToolsRequestSchema, async () => {
         ],
     };
 });
-// 处理工具调用请求
+// Handle tool call requests
 server.setRequestHandler(CallToolRequestSchema, async (request) => {
     const { name, arguments: args } = request.params;
     try {
         if (name === "fetch_url") {
             const { url, preferJina = true } = args;
-            // 验证URL
+            // Validate URL
             if (!isValidUrl(url)) {
-                throw new McpError(ErrorCode.InvalidParams, "无效的URL格式，请提供http或https协议的URL");
+                throw new McpError(ErrorCode.InvalidParams, "Invalid URL format, please provide http or https protocol URL");
             }
-            // 获取网页内容
+            // Fetch web content
             const result = await fetchWebContent(url, preferJina);
             return {
                 content: [
                     {
                         type: "text",
-                        text: `# ${result.title}\n\n**URL**: ${result.metadata.url}\n**获取时间**: ${result.metadata.fetchedAt}\n**内容长度**: ${result.metadata.contentLength} 字符\n**解析方法**: ${result.metadata.method}\n\n---\n\n${result.content}`,
+                        text: `# ${result.title}\n\n**URL**: ${result.metadata.url}\n**Fetched At**: ${result.metadata.fetchedAt}\n**Content Length**: ${result.metadata.contentLength} characters\n**Method**: ${result.metadata.method}\n\n---\n\n${result.content}`,
                     },
                 ],
             };
@@ -486,14 +486,14 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
         else if (name === "fetch_url_with_jina") {
             const { url } = args;
             if (!isValidUrl(url)) {
-                throw new McpError(ErrorCode.InvalidParams, "无效的URL格式");
+                throw new McpError(ErrorCode.InvalidParams, "Invalid URL format");
             }
             const result = await fetchWithJinaReader(url);
             return {
                 content: [
                     {
                         type: "text",
-                        text: `# ${result.title}\n\n**URL**: ${result.metadata.url}\n**获取时间**: ${result.metadata.fetchedAt}\n**内容长度**: ${result.metadata.contentLength} 字符\n**解析方法**: Jina Reader\n\n---\n\n${result.content}`,
+                        text: `# ${result.title}\n\n**URL**: ${result.metadata.url}\n**Fetched At**: ${result.metadata.fetchedAt}\n**Content Length**: ${result.metadata.contentLength} characters\n**Method**: Jina Reader\n\n---\n\n${result.content}`,
                     },
                 ],
             };
@@ -501,42 +501,42 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
         else if (name === "fetch_url_local") {
             const { url } = args;
             if (!isValidUrl(url)) {
-                throw new McpError(ErrorCode.InvalidParams, "无效的URL格式");
+                throw new McpError(ErrorCode.InvalidParams, "Invalid URL format");
             }
             const result = await fetchWithLocalParser(url);
             return {
                 content: [
                     {
                         type: "text",
-                        text: `# ${result.title}\n\n**URL**: ${result.metadata.url}\n**获取时间**: ${result.metadata.fetchedAt}\n**内容长度**: ${result.metadata.contentLength} 字符\n**解析方法**: 本地解析器\n\n---\n\n${result.content}`,
+                        text: `# ${result.title}\n\n**URL**: ${result.metadata.url}\n**Fetched At**: ${result.metadata.fetchedAt}\n**Content Length**: ${result.metadata.contentLength} characters\n**Method**: Local Parser\n\n---\n\n${result.content}`,
                     },
                 ],
             };
         }
         else if (name === "fetch_multiple_urls") {
             const { urls, preferJina = true } = args;
-            // 验证所有URL
+            // Validate all URLs
             const invalidUrls = urls.filter(url => !isValidUrl(url));
             if (invalidUrls.length > 0) {
-                throw new McpError(ErrorCode.InvalidParams, `以下URL格式无效: ${invalidUrls.join(", ")}`);
+                throw new McpError(ErrorCode.InvalidParams, `The following URLs have invalid format: ${invalidUrls.join(", ")}`);
             }
-            // 并发获取所有URL内容
+            // Fetch all URLs concurrently
             const results = await Promise.allSettled(urls.map(url => fetchWebContent(url, preferJina)));
-            // 整理结果
-            let combinedContent = "# 批量URL内容获取结果\n\n";
+            // Combine results
+            let combinedContent = "# Batch URL Content Fetch Results\n\n";
             results.forEach((result, index) => {
                 const url = urls[index];
                 combinedContent += `## ${index + 1}. ${url}\n\n`;
                 if (result.status === "fulfilled") {
                     const { title, content, metadata } = result.value;
-                    combinedContent += `**标题**: ${title}\n`;
-                    combinedContent += `**获取时间**: ${metadata.fetchedAt}\n`;
-                    combinedContent += `**内容长度**: ${metadata.contentLength} 字符\n`;
-                    combinedContent += `**解析方法**: ${metadata.method}\n\n`;
-                    combinedContent += `### 内容\n\n${content}\n\n`;
+                    combinedContent += `**Title**: ${title}\n`;
+                    combinedContent += `**Fetched At**: ${metadata.fetchedAt}\n`;
+                    combinedContent += `**Content Length**: ${metadata.contentLength} characters\n`;
+                    combinedContent += `**Method**: ${metadata.method}\n\n`;
+                    combinedContent += `### Content\n\n${content}\n\n`;
                 }
                 else {
-                    combinedContent += `**错误**: ${result.reason}\n\n`;
+                    combinedContent += `**Error**: ${result.reason}\n\n`;
                 }
                 combinedContent += "---\n\n";
             });
@@ -552,47 +552,47 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
         else if (name === "fetch_url_with_browser") {
             const { url } = args;
             if (!isValidUrl(url)) {
-                throw new McpError(ErrorCode.InvalidParams, "无效的URL格式");
+                throw new McpError(ErrorCode.InvalidParams, "Invalid URL format");
             }
             const result = await fetchWithPlaywright(url);
             return {
                 content: [
                     {
                         type: "text",
-                        text: `# ${result.title}\n\n**URL**: ${result.metadata.url}\n**获取时间**: ${result.metadata.fetchedAt}\n**内容长度**: ${result.metadata.contentLength} 字符\n**解析方法**: Playwright浏览器\n\n---\n\n${result.content}`,
+                        text: `# ${result.title}\n\n**URL**: ${result.metadata.url}\n**Fetched At**: ${result.metadata.fetchedAt}\n**Content Length**: ${result.metadata.contentLength} characters\n**Method**: Playwright Browser\n\n---\n\n${result.content}`,
                     },
                 ],
             };
         }
         else {
-            throw new McpError(ErrorCode.MethodNotFound, `未知的工具: ${name}`);
+            throw new McpError(ErrorCode.MethodNotFound, `Unknown tool: ${name}`);
         }
     }
     catch (error) {
         if (error instanceof McpError) {
             throw error;
         }
-        throw new McpError(ErrorCode.InternalError, `工具执行失败: ${error instanceof Error ? error.message : String(error)}`);
+        throw new McpError(ErrorCode.InternalError, `Tool execution failed: ${error instanceof Error ? error.message : String(error)}`);
     }
 });
-// 启动服务器
+// Start server
 async function main() {
     const transport = new StdioServerTransport();
     await server.connect(transport);
-    console.error("MCP Web Reader v2.0 已启动（支持Jina Reader + Playwright）");
+    console.error("MCP Web Reader v2.0 started (with Jina Reader + Playwright support)");
 }
-// 优雅关闭处理
+// Graceful shutdown handling
 process.on('SIGINT', async () => {
-    console.error("接收到SIGINT信号，正在关闭浏览器...");
+    console.error("Received SIGINT signal, closing browser...");
     await closeBrowser();
     process.exit(0);
 });
 process.on('SIGTERM', async () => {
-    console.error("接收到SIGTERM信号，正在关闭浏览器...");
+    console.error("Received SIGTERM signal, closing browser...");
     await closeBrowser();
     process.exit(0);
 });
 main().catch((error) => {
-    console.error("服务器启动失败:", error);
+    console.error("Server startup failed:", error);
     process.exit(1);
 });

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
     "name": "mcp-web-reader",
-    "version": "2.0.0",
+    "version": "2.1.0",
     "description": "MCP server for reading web content with Jina Reader and local parser support",
     "main": "dist/index.js",
     "bin": {
@@ -11,7 +11,8 @@
       "build": "tsc",
       "start": "node dist/index.js",
       "dev": "tsc --watch",
-      "claude-code": "node dist/index.js"
+      "claude-code": "node dist/index.js",
+      "postinstall": "npx playwright install chromium"
     },
     "repository": {
       "type": "git",