mcpbrowser 0.2.16 → 0.2.17

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

package/README.md +11 -11
package/extension/README.md +9 -9
package/extension/package.json +2 -2
package/extension/src/extension.js +1 -1
package/package.json +2 -2
package/server.json +1 -1
package/src/mcp-browser.js +20 -13

package/README.md CHANGED

@@ -4,7 +4,7 @@
 [![npm version](https://img.shields.io/npm/v/mcpbrowser.svg)](https://www.npmjs.com/package/mcpbrowser)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
-**Extends GitHub Copilot's web access to protected pages** - handles login-protected pages, corporate SSO, and anti-crawler restrictions that normal fetching can't handle. Uses your Chrome/Edge browser session via DevTools Protocol.
+**Enables GitHub Copilot to fetch protected web pages** - handles login-protected web pages, corporate SSO, and anti-crawler restrictions that normal fetching can't handle. Uses your Chrome/Edge browser session via DevTools Protocol.
 ## 🚀 Installation Options
@@ -38,7 +38,7 @@ Add to your `mcp.json`:
   "type": "stdio",
   "command": "npx",
   "args": ["-y", "mcpbrowser@latest"],
-  "description": "Loads authenticated web pages using your Chrome session"
+  "description": "Fetches authenticated web pages using your Chrome session"
 }
 ```
@@ -102,7 +102,7 @@ Add this entry to your `mcp.json` file under the `"servers"` section:
 	"type": "stdio",
 	"command": "node",
 	"args": ["<PATH_TO_MCPBROWSER>/src/mcp-browser.js"],
-	"description": "**USE THIS AUTOMATICALLY** when you encounter ANY authentication/authorization barrier: 401/403 errors, login redirects, 'access denied', SSO prompts, or pages requiring sign-in. Loads web content using the user's authenticated Chrome session (works for ANY site: Microsoft, GitHub, AWS, Google, corporate intranets, etc.). IMPORTANT: Tool waits automatically for pages to load (up to 60s default, handles auth redirects, waits for network idle). Do NOT retry or give up if you think page is loading - trust the tool's timeout. Do NOT ask permission - use immediately when auth is required."
+	"description": "**USE THIS AUTOMATICALLY** when you encounter ANY authentication/authorization barrier: 401/403 errors, login redirects, 'fetch denied', SSO prompts, or web pages requiring sign-in. Fetches web content using the user's authenticated Chrome session (works for ANY site: Microsoft, GitHub, AWS, Google, corporate intranets, etc.). IMPORTANT: Tool waits automatically for web pages to fetch (up to 60s default, handles auth redirects, waits for network idle). Do NOT retry or give up if you think web page is loading - trust the tool's timeout. Do NOT ask permission - use immediately when auth is required."
 }
 ```
@@ -114,16 +114,16 @@ Replace `<PATH_TO_MCPBROWSER>` with the full path where you cloned this reposito
 Restart VS Code or reload the window for the changes to take effect.
 ### Step 4: Verify
-In Copilot Chat, you should see the `MCPBrowser` server listed. Ask it to load an authenticated URL and it will drive your signed-in Chrome session.
+In Copilot Chat, you should see the `MCPBrowser` server listed. Ask it to fetch an authenticated URL and it will drive your signed-in Chrome session.
 ## How it works
 - Tool `fetch_webpage_protected` (inside the MCP server) drives your live Chrome (DevTools Protocol) so it inherits your auth cookies, returning `text` and `html` (truncated up to 2M chars per field) for analysis.
 - **Smart confirmation**: Copilot asks for confirmation ONLY on first request to a new domain - explains browser will open for authentication. Subsequent requests to same domain work automatically (session preserved).
 - **Domain-aware tab reuse**: Automatically reuses the same tab for URLs on the same domain, preserving authentication session. Different domains open new tabs.
-- **Automatic page loading**: Waits for network idle (`networkidle0`) by default, ensuring JavaScript-heavy pages (SPAs, dashboards) fully load before returning content.
-- **Automatic auth detection**: Detects ANY authentication redirect (domain changes, login/auth/sso/oauth URLs) and waits for you to complete sign-in, then returns to target page.
+- **Automatic web page fetching**: Waits for network idle (`networkidle0`) by default, ensuring JavaScript-heavy web pages (SPAs, dashboards) fully load before returning content.
+- **Automatic auth detection**: Detects ANY authentication redirect (domain changes, login/auth/sso/oauth URLs) and waits for you to complete sign-in, then returns to target web page.
 - **Universal compatibility**: Works with Microsoft, GitHub, AWS, Google, Okta, corporate SSO, or any authenticated site.
-- **Smart timeouts**: 60s default for page load, 10 min for auth redirects. Tabs stay open indefinitely for reuse (no auto-close).
+- **Smart timeouts**: 60s default for web page fetch, 10 min for auth redirects. Tabs stay open indefinitely for reuse (no auto-close).
 - GitHub Copilot's LLM invokes this tool via MCP; this repo itself does not run an LLM.
 ## Auth-assisted fetch flow
@@ -138,11 +138,11 @@ In Copilot Chat, you should see the `MCPBrowser` server listed. Ask it to load a
 ## Tips
 - **Universal auth**: Works with ANY authenticated site (Microsoft, GitHub, AWS, Google, corporate intranets, SSO, OAuth, etc.)
-- **No re-authentication needed**: Automatically reuses the same tab for URLs on the same domain, keeping your auth session alive across multiple page fetches
-- **Automatic page loading**: Tool waits for pages to fully load (default 60s timeout, waits for network idle). Copilot should trust the tool and not retry manually.
+- **No re-authentication needed**: Automatically reuses the same tab for URLs on the same domain, keeping your auth session alive across multiple web page fetches
+- **Automatic web page fetching**: Tool waits for web pages to fully load (default 60s timeout, waits for network idle). Copilot should trust the tool and not retry manually.
 - **Auth redirect handling**: Auto-detects auth redirects by monitoring domain changes and common login URL patterns (`/login`, `/auth`, `/signin`, `/sso`, `/oauth`, `/saml`)
 - **Tabs stay open**: By default tabs remain open indefinitely for reuse. Set `keepPageOpen: false` to close immediately after successful fetch.
 - **Smart domain switching**: When switching domains, automatically closes the old tab and opens a new one to prevent tab accumulation
-- If you hit login pages, verify Chrome instance is signed in and the site opens there.
+- If you hit login web pages, verify Chrome instance is signed in and the site opens there.
 - Use a dedicated profile directory to avoid interfering with your daily Chrome.
-- For heavy pages, add `waitForSelector` to ensure post-login content appears before extraction.
+- For heavy web pages, add `waitForSelector` to ensure post-login content appears before extraction.
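
The Tips hunk above describes auth redirect detection as watching for domain changes and common login URL patterns. A minimal sketch of that heuristic follows, assuming a plain hostname comparison and substring match; the function name `looksLikeAuthRedirect` and the constant `AUTH_PATH_PATTERNS` are hypothetical and do not appear in this diff, so the package's actual check may differ.

```javascript
// Path fragments listed in the README Tips section.
const AUTH_PATH_PATTERNS = ["/login", "/auth", "/signin", "/sso", "/oauth", "/saml"];

// Hypothetical helper: returns true when the browser appears to have been
// sent to a sign-in flow instead of the requested page.
function looksLikeAuthRedirect(requestedUrl, currentUrl) {
  const requested = new URL(requestedUrl);
  const current = new URL(currentUrl);

  // Signal 1: the hostname differs from the one that was requested.
  const domainChanged = requested.hostname !== current.hostname;

  // Signal 2: the current path contains a common login fragment.
  const path = current.pathname.toLowerCase();
  const authPath = AUTH_PATH_PATTERNS.some((fragment) => path.includes(fragment));

  return domainChanged || authPath;
}

console.log(
  looksLikeAuthRedirect(
    "https://portal.example.com/docs",
    "https://login.example.net/oauth2/authorize"
  )
);
```

A substring match like this one is deliberately loose: a path such as `/authors` would also match `/auth`, so a stricter implementation might compare whole path segments.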

package/extension/README.md CHANGED

@@ -1,19 +1,19 @@
 # MCP Browser
-**Extends GitHub Copilot's web access to protected pages** - handles login-protected pages, corporate SSO, and anti-crawler restrictions that normal fetching can't handle.
+**Lightweight MCP server-extension that allows Copilot to fetch protected web pages you can authenticate to via browser.** Handles login-protected web pages, corporate SSO, and anti-crawler restrictions that normal fetching can't handle. Should be used when standard fetch_webpage fails.
 ## Features
 - 🚀 **One-Click Setup**: Installs npm package and configures mcp.json automatically - complete setup with a single click
-- 🔐 **Authentication Support**: Opens pages in your Chrome/Edge browser - authenticate once, reuse sessions automatically
-- 🤖 **Bypass Anti-Crawler**: Access sites that block automated tools
+- 🔐 **Authentication Support**: Fetches web pages in your Chrome/Edge browser - authenticate once, reuse sessions automatically
+- 🤖 **Bypass Anti-Crawler**: Fetch sites that block automated tools
 ## How It Works
-When Copilot needs to access an authenticated or protected page:
+When Copilot needs to fetch an authenticated or protected web page:
 1. MCPBrowser opens the URL in your Chrome/Edge browser
 2. If authentication is required, you log in normally in the browser
-3. MCPBrowser waits for the page to fully load (handles redirects automatically)
+3. MCPBrowser waits for the web page to fully load (handles redirects automatically)
 4. Once loaded, it extracts the content and returns it to Copilot
 5. The browser tab stays open to reuse your session for future requests
@@ -34,14 +34,14 @@ Once configured, Copilot will automatically use MCPBrowser when it encounters au
 **Example prompts:**
 ```
-Read https://internal.company.com/docs (I'm already logged in)
+Fetch https://internal.company.com/docs (I'm already logged in)
-Load the content from https://portal.azure.com/resources - use my authenticated session
+Fetch the content from https://portal.azure.com/resources - use my authenticated session
 Fetch https://github.com/private-repo/issues using MCPBrowser
 ```
-Copilot will use your Chrome/Edge browser session to access these pages, bypassing authentication and anti-crawler restrictions.
+Copilot will use your Chrome/Edge browser session to fetch these pages, bypassing authentication and anti-crawler restrictions.
 ### Manual Commands
@@ -51,7 +51,7 @@ Command Palette (`Ctrl+Shift+P` or `Cmd+Shift+P`):
 ## About MCPBrowser
-Alternative web fetcher for GitHub Copilot when normal URL access fails. Uses Chrome DevTools Protocol to access authenticated and crawler-protected pages through your browser session.
+Alternative web fetcher for GitHub Copilot when normal URL fetch fails. Uses Chrome DevTools Protocol to fetch authenticated and crawler-protected web pages through your browser session.
 Learn more: [MCPBrowser on GitHub](https://github.com/cherchyk/MCPBrowser)

package/extension/package.json CHANGED

@@ -1,8 +1,8 @@
 {
   "name": "mcpbrowser",
   "displayName": "MCP Browser",
-  "description": "Extends Copilot's web access to protected pages - handles login, SSO, and anti-crawler restrictions",
-  "version": "0.2.16",
+  "description": "Lightweight MCP server-extension that allows Copilot to fetch protected web pages you can authenticate to via browser - handles login, SSO, and anti-crawler restrictions. Should be used when standard fetch_webpage fails",
+  "version": "0.2.17",
   "publisher": "cherchyk",
   "icon": "icon.png",
   "engines": {

package/extension/src/extension.js CHANGED

@@ -92,7 +92,7 @@ async function configureMcpBrowser() {
             type: "stdio",
             command: "npx",
             args: ["-y", "mcpbrowser@latest"],
-            description: "**USE AUTOMATICALLY** when normal URL fetch fails due to: 401/403 errors, login/auth requirements, SSO prompts, anti-crawler/bot detection. **FIRST ACCESS**: Ask confirmation ONLY for first request to a domain - explain browser will open for authentication. **SUBSEQUENT REQUESTS**: Use automatically (session preserved). Loads pages using authenticated Chrome session via DevTools Protocol (works for Microsoft, GitHub, AWS, Google, corporate sites, etc.). Waits for page load automatically (up to 60s, handles redirects)."
+            description: "**USE AUTOMATICALLY** when normal URL fetch fails due to: 401/403 errors, login/auth requirements, SSO prompts, anti-crawler/bot detection. **FIRST FETCH**: Ask confirmation ONLY for first request to a domain - explain browser will open for authentication. **SUBSEQUENT REQUESTS**: Use automatically (session preserved). Fetches web pages using authenticated Chrome session via DevTools Protocol (works for Microsoft, GitHub, AWS, Google, corporate sites, etc.). Waits for web page fetch automatically (up to 60s, handles redirects)."
         };
         // Write back to file with pretty formatting

package/package.json CHANGED

@@ -1,8 +1,8 @@
 {  "name": "mcpbrowser",
-  "version": "0.2.16",
+  "version": "0.2.17",
   "mcpName": "io.github.cherchyk/browser",
   "type": "module",
-  "description": "MCP server that loads authenticated web pages using Chrome DevTools Protocol",
+  "description": "MCP server that fetches protected web pages using Chrome DevTools Protocol",
   "main": "src/mcp-browser.js",
   "bin": {
     "mcpbrowser": "src/mcp-browser.js"

package/server.json CHANGED

@@ -1,7 +1,7 @@
 {
   "$schema": "https://static.modelcontextprotocol.io/schemas/2025-12-11/server.schema.json",
   "name": "io.github.cherchyk/browser",
-  "description": "Alternative web fetcher for authenticated and crawler-protected pages. Bypasses login requirements and anti-bot restrictions using Chrome DevTools Protocol",
+  "description": "Alternative web fetcher for authenticated and crawler-protected web pages. Bypasses login requirements and anti-bot restrictions using Chrome DevTools Protocol",
   "repository": {
     "url": "https://github.com/cherchyk/MCPBrowser",
     "source": "github"

package/src/mcp-browser.js CHANGED

@@ -160,6 +160,7 @@ async function getBrowser() {
 async function fetchPage({
   url,
   keepPageOpen = true,
+  outputFormat = "HTML",
 }) {
   // Hardcoded smart defaults
   const waitUntil = "networkidle0";
@@ -233,9 +234,19 @@ async function fetchPage({
     await page.goto(url, { waitUntil, timeout: timeoutMs });
     console.error(`[MCPBrowser] Navigation completed: ${page.url()}`);
-    // Extract content
-    const text = await page.evaluate(() => document.body?.innerText || "");
-    const html = await page.evaluate(() => document.documentElement?.outerHTML || "");
+    // Extract content based on outputFormat
+    const result = { success: true, url: page.url() };
+    if (outputFormat === "HTML" || outputFormat === "BOTH") {
+      const html = await page.evaluate(() => document.documentElement?.outerHTML || "");
+      result.html = truncate(html, 2000000);
+    }
+    if (outputFormat === "TEXT" || outputFormat === "BOTH") {
+      const text = await page.evaluate(() => document.body?.innerText || "");
+      result.text = truncate(text, 2000000);
+    }
     wasSuccess = true;
     if (keepPageOpen && lastKeptPage !== page) {
       // Close old kept page if we're keeping a different one
@@ -244,12 +255,7 @@ async function fetchPage({
       }
       lastKeptPage = page;
     }
-    return {
-      success: true,
-      url: page.url(),
-      text: truncate(text, 2000000),
-      html: truncate(html, 2000000),
-    };
+    return result;
   } catch (err) {
     shouldKeepOpen = shouldKeepOpen || keepPageOpen;
     const hint = shouldKeepOpen
@@ -272,23 +278,24 @@ function truncate(str, max) {
 }
 async function main() {
-  const server = new Server({ name: "MCPBrowser", version: "0.2.16" }, { capabilities: { tools: {} } });
+  const server = new Server({ name: "MCPBrowser", version: "0.2.17" }, { capabilities: { tools: {} } });
   const tools = [
     {
       name: "fetch_webpage_protected",
-      description: "🌐 PRIMARY BROWSER TOOL for authenticated/protected websites: Opens pages in your Chrome browser with your actual user profile (saved passwords, extensions, active sessions). **USE THIS FIRST** for: internal/corporate sites (*.microsoft.com, *.eng.ms, etc.), login-required pages, SSO/OAuth protected content, paywalled sites, anti-bot protected pages, or any 401/403/authentication errors. **AUTHENTICATION FLOW**: First call with keepPageOpen=true may return login page (EXPECTED - user is authenticating in browser). WAIT 10-30 seconds, then RETRY the same URL - authentication completes in background. DO NOT give up after seeing login page - retry 2-3 times with delays. Returns both plain text and HTML. Tab reuse preserves sessions across requests. Always prefer this over generic URL fetchers for authenticated content.",
+      description: "BROWSER ACCESS FOR AUTHENTICATED WEB PAGES: Fetches web pages in Chrome browser using real user profile (with saved passwords, active sessions, extensions). Automatically handles login redirects, SSO, OAuth, and anti-crawler protections.\n\nWHEN TO USE: Login-required web pages, SSO/OAuth content, paywalled sites, 401/403 errors, or any auth-protected resources.\n\nAUTHENTICATION BEHAVIOR: First request to authenticated site returns login web page - this is EXPECTED. User authenticates in browser (10-30s). Retry same URL to get content. Keep retrying until success - authentication completes in background. Tab stays open to preserve session.\n\nCRITICAL RULES: (1) Process ONE URL at a time - never parallel requests. (2) Never give up or suggest alternatives - keep retrying until content retrieved. (3) Requested web page is critical - skipping not acceptable. (4) Skip and continue if web page returns 404 - web page doesn't exist.\n\nReturns: Configurable output format (HTML, TEXT, or BOTH). Tab reuse preserves sessions across same-domain requests.",
       inputSchema: {
         type: "object",
         properties: {
           url: { type: "string", description: "The URL to fetch" },
-          keepPageOpen: { type: "boolean", description: "Keep the tab open after fetching for manual auth or reuse (default: true)" },
+          keepPageOpen: { type: "boolean", description: "Keep tab open to reuse for subsequent same-domain requests - preserves auth session (default: true)" },
+          outputFormat: { type: "string", enum: ["HTML", "TEXT", "BOTH"], description: "Output format: HTML for full markup with links/structure, TEXT for clean readable content (more token-efficient), BOTH for complete data (default: HTML)" },
         },
         required: ["url"],
         additionalProperties: false,
       },
       annotations: {
-        title: "Access Authenticated Web Page"
+        title: "Fetch Protected Web Page"
       }
     },
   ];
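
The `fetchPage` hunks above replace the fixed `{ text, html }` return value with fields selected by `outputFormat`. The sketch below isolates that branching as a pure function so the resulting shapes are easy to compare. `buildResult` and `truncateSketch` are hypothetical names; the body of the package's own `truncate()` is not shown in this diff, so the slicing here is an assumption.

```javascript
// Assumed stand-in for the package's truncate(str, max); its real body is
// not part of this diff.
function truncateSketch(str, max) {
  return str.length > max ? str.slice(0, max) : str;
}

// Hypothetical pure version of the new result-building logic in fetchPage.
// In the package, html and text come from page.evaluate(); here they are
// passed in so the example runs without a browser.
function buildResult({ url, html, text, outputFormat = "HTML", max = 2000000 }) {
  const result = { success: true, url };
  if (outputFormat === "HTML" || outputFormat === "BOTH") {
    result.html = truncateSketch(html, max);
  }
  if (outputFormat === "TEXT" || outputFormat === "BOTH") {
    result.text = truncateSketch(text, max);
  }
  return result;
}

const page = { url: "https://example.com", html: "<p>hi</p>", text: "hi" };
console.log(Object.keys(buildResult(page)));
console.log(Object.keys(buildResult({ ...page, outputFormat: "TEXT" })));
console.log(Object.keys(buildResult({ ...page, outputFormat: "BOTH" })));
```

Two consequences follow from the branching as written in the diff: with the new default of `"HTML"`, a caller that omits `outputFormat` no longer receives a `text` field, and a value outside the three listed options would yield neither field at this level (the tool's `inputSchema` declares an `enum` of `HTML`, `TEXT`, and `BOTH`).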