mcpbrowser 0.2.16 → 0.2.17

package/README.md CHANGED
@@ -4,7 +4,7 @@
  [![npm version](https://img.shields.io/npm/v/mcpbrowser.svg)](https://www.npmjs.com/package/mcpbrowser)
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 
- **Extends GitHub Copilot's web access to protected pages** - handles login-protected pages, corporate SSO, and anti-crawler restrictions that normal fetching can't handle. Uses your Chrome/Edge browser session via DevTools Protocol.
+ **Enables GitHub Copilot to fetch protected web pages** - handles login-protected web pages, corporate SSO, and anti-crawler restrictions that normal fetching can't handle. Uses your Chrome/Edge browser session via DevTools Protocol.
 
  ## 🚀 Installation Options
 
@@ -38,7 +38,7 @@ Add to your `mcp.json`:
  "type": "stdio",
  "command": "npx",
  "args": ["-y", "mcpbrowser@latest"],
- "description": "Loads authenticated web pages using your Chrome session"
+ "description": "Fetches authenticated web pages using your Chrome session"
  }
  ```
 
@@ -102,7 +102,7 @@ Add this entry to your `mcp.json` file under the `"servers"` section:
  "type": "stdio",
  "command": "node",
  "args": ["<PATH_TO_MCPBROWSER>/src/mcp-browser.js"],
- "description": "**USE THIS AUTOMATICALLY** when you encounter ANY authentication/authorization barrier: 401/403 errors, login redirects, 'access denied', SSO prompts, or pages requiring sign-in. Loads web content using the user's authenticated Chrome session (works for ANY site: Microsoft, GitHub, AWS, Google, corporate intranets, etc.). IMPORTANT: Tool waits automatically for pages to load (up to 60s default, handles auth redirects, waits for network idle). Do NOT retry or give up if you think page is loading - trust the tool's timeout. Do NOT ask permission - use immediately when auth is required."
+ "description": "**USE THIS AUTOMATICALLY** when you encounter ANY authentication/authorization barrier: 401/403 errors, login redirects, 'fetch denied', SSO prompts, or web pages requiring sign-in. Fetches web content using the user's authenticated Chrome session (works for ANY site: Microsoft, GitHub, AWS, Google, corporate intranets, etc.). IMPORTANT: Tool waits automatically for web pages to fetch (up to 60s default, handles auth redirects, waits for network idle). Do NOT retry or give up if you think web page is loading - trust the tool's timeout. Do NOT ask permission - use immediately when auth is required."
  }
  ```
 
@@ -114,16 +114,16 @@ Replace `<PATH_TO_MCPBROWSER>` with the full path where you cloned this reposito
  Restart VS Code or reload the window for the changes to take effect.
 
  ### Step 4: Verify
- In Copilot Chat, you should see the `MCPBrowser` server listed. Ask it to load an authenticated URL and it will drive your signed-in Chrome session.
+ In Copilot Chat, you should see the `MCPBrowser` server listed. Ask it to fetch an authenticated URL and it will drive your signed-in Chrome session.
 
  ## How it works
  - Tool `fetch_webpage_protected` (inside the MCP server) drives your live Chrome (DevTools Protocol) so it inherits your auth cookies, returning `text` and `html` (truncated up to 2M chars per field) for analysis.
  - **Smart confirmation**: Copilot asks for confirmation ONLY on first request to a new domain - explains browser will open for authentication. Subsequent requests to same domain work automatically (session preserved).
  - **Domain-aware tab reuse**: Automatically reuses the same tab for URLs on the same domain, preserving authentication session. Different domains open new tabs.
- - **Automatic page loading**: Waits for network idle (`networkidle0`) by default, ensuring JavaScript-heavy pages (SPAs, dashboards) fully load before returning content.
- - **Automatic auth detection**: Detects ANY authentication redirect (domain changes, login/auth/sso/oauth URLs) and waits for you to complete sign-in, then returns to target page.
+ - **Automatic web page fetching**: Waits for network idle (`networkidle0`) by default, ensuring JavaScript-heavy web pages (SPAs, dashboards) fully load before returning content.
+ - **Automatic auth detection**: Detects ANY authentication redirect (domain changes, login/auth/sso/oauth URLs) and waits for you to complete sign-in, then returns to target web page.
  - **Universal compatibility**: Works with Microsoft, GitHub, AWS, Google, Okta, corporate SSO, or any authenticated site.
- - **Smart timeouts**: 60s default for page load, 10 min for auth redirects. Tabs stay open indefinitely for reuse (no auto-close).
+ - **Smart timeouts**: 60s default for web page fetch, 10 min for auth redirects. Tabs stay open indefinitely for reuse (no auto-close).
  - GitHub Copilot's LLM invokes this tool via MCP; this repo itself does not run an LLM.
 
  ## Auth-assisted fetch flow
@@ -138,11 +138,11 @@ In Copilot Chat, you should see the `MCPBrowser` server listed. Ask it to load a
 
  ## Tips
  - **Universal auth**: Works with ANY authenticated site (Microsoft, GitHub, AWS, Google, corporate intranets, SSO, OAuth, etc.)
- - **No re-authentication needed**: Automatically reuses the same tab for URLs on the same domain, keeping your auth session alive across multiple page fetches
- - **Automatic page loading**: Tool waits for pages to fully load (default 60s timeout, waits for network idle). Copilot should trust the tool and not retry manually.
+ - **No re-authentication needed**: Automatically reuses the same tab for URLs on the same domain, keeping your auth session alive across multiple web page fetches
+ - **Automatic web page fetching**: Tool waits for web pages to fully load (default 60s timeout, waits for network idle). Copilot should trust the tool and not retry manually.
  - **Auth redirect handling**: Auto-detects auth redirects by monitoring domain changes and common login URL patterns (`/login`, `/auth`, `/signin`, `/sso`, `/oauth`, `/saml`)
  - **Tabs stay open**: By default tabs remain open indefinitely for reuse. Set `keepPageOpen: false` to close immediately after successful fetch.
  - **Smart domain switching**: When switching domains, automatically closes the old tab and opens a new one to prevent tab accumulation
- - If you hit login pages, verify Chrome instance is signed in and the site opens there.
+ - If you hit login web pages, verify Chrome instance is signed in and the site opens there.
  - Use a dedicated profile directory to avoid interfering with your daily Chrome.
- - For heavy pages, add `waitForSelector` to ensure post-login content appears before extraction.
+ - For heavy web pages, add `waitForSelector` to ensure post-login content appears before extraction.
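
Note on the "Auth redirect handling" tip in the README hunk above: it describes detection by domain change plus common login URL patterns, but the code that performs the check is not part of this diff. The following is only a minimal sketch of how such a heuristic might look. `looksLikeAuthRedirect` and `AUTH_PATH` are hypothetical names chosen for illustration, not mcpbrowser's implementation.

```javascript
// Hypothetical helper: NOT taken from mcpbrowser's source.
// Path segments the README lists as common login URL patterns.
const AUTH_PATH = /\/(login|auth|signin|sso|oauth|saml)(\/|$)/i;

function looksLikeAuthRedirect(targetUrl, currentUrl) {
  const target = new URL(targetUrl);
  const current = new URL(currentUrl);
  // A hostname change after navigation suggests a hop to an identity provider.
  if (target.hostname !== current.hostname) return true;
  // Same host: match whole path segments only, so "/authors" is not flagged.
  return AUTH_PATH.test(current.pathname);
}
```

Matching whole path segments rather than substrings is a design choice of this sketch; a real detector may well be looser or stricter.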
@@ -1,19 +1,19 @@
  # MCP Browser
 
- **Extends GitHub Copilot's web access to protected pages** - handles login-protected pages, corporate SSO, and anti-crawler restrictions that normal fetching can't handle.
+ **Lightweight MCP server-extension that allows Copilot to fetch protected web pages you can authenticate to via browser.** Handles login-protected web pages, corporate SSO, and anti-crawler restrictions that normal fetching can't handle. Should be used when standard fetch_webpage fails.
 
  ## Features
 
  - 🚀 **One-Click Setup**: Installs npm package and configures mcp.json automatically - complete setup with a single click
- - 🔐 **Authentication Support**: Opens pages in your Chrome/Edge browser - authenticate once, reuse sessions automatically
- - 🤖 **Bypass Anti-Crawler**: Access sites that block automated tools
+ - 🔐 **Authentication Support**: Fetches web pages in your Chrome/Edge browser - authenticate once, reuse sessions automatically
+ - 🤖 **Bypass Anti-Crawler**: Fetch sites that block automated tools
 
  ## How It Works
 
- When Copilot needs to access an authenticated or protected page:
+ When Copilot needs to fetch an authenticated or protected web page:
  1. MCPBrowser opens the URL in your Chrome/Edge browser
  2. If authentication is required, you log in normally in the browser
- 3. MCPBrowser waits for the page to fully load (handles redirects automatically)
+ 3. MCPBrowser waits for the web page to fully load (handles redirects automatically)
  4. Once loaded, it extracts the content and returns it to Copilot
  5. The browser tab stays open to reuse your session for future requests
 
@@ -34,14 +34,14 @@ Once configured, Copilot will automatically use MCPBrowser when it encounters au
 
  **Example prompts:**
  ```
- Read https://internal.company.com/docs (I'm already logged in)
+ Fetch https://internal.company.com/docs (I'm already logged in)
 
- Load the content from https://portal.azure.com/resources - use my authenticated session
+ Fetch the content from https://portal.azure.com/resources - use my authenticated session
 
  Fetch https://github.com/private-repo/issues using MCPBrowser
  ```
 
- Copilot will use your Chrome/Edge browser session to access these pages, bypassing authentication and anti-crawler restrictions.
+ Copilot will use your Chrome/Edge browser session to fetch these pages, bypassing authentication and anti-crawler restrictions.
 
  ### Manual Commands
 
@@ -51,7 +51,7 @@ Command Palette (`Ctrl+Shift+P` or `Cmd+Shift+P`):
 
  ## About MCPBrowser
 
- Alternative web fetcher for GitHub Copilot when normal URL access fails. Uses Chrome DevTools Protocol to access authenticated and crawler-protected pages through your browser session.
+ Alternative web fetcher for GitHub Copilot when normal URL fetch fails. Uses Chrome DevTools Protocol to fetch authenticated and crawler-protected web pages through your browser session.
 
  Learn more: [MCPBrowser on GitHub](https://github.com/cherchyk/MCPBrowser)
 
@@ -1,8 +1,8 @@
  {
  "name": "mcpbrowser",
  "displayName": "MCP Browser",
- "description": "Extends Copilot's web access to protected pages - handles login, SSO, and anti-crawler restrictions",
- "version": "0.2.16",
+ "description": "Lightweight MCP server-extension that allows Copilot to fetch protected web pages you can authenticate to via browser - handles login, SSO, and anti-crawler restrictions. Should be used when standard fetch_webpage fails",
+ "version": "0.2.17",
  "publisher": "cherchyk",
  "icon": "icon.png",
  "engines": {
@@ -92,7 +92,7 @@ async function configureMcpBrowser() {
  type: "stdio",
  command: "npx",
  args: ["-y", "mcpbrowser@latest"],
- description: "**USE AUTOMATICALLY** when normal URL fetch fails due to: 401/403 errors, login/auth requirements, SSO prompts, anti-crawler/bot detection. **FIRST ACCESS**: Ask confirmation ONLY for first request to a domain - explain browser will open for authentication. **SUBSEQUENT REQUESTS**: Use automatically (session preserved). Loads pages using authenticated Chrome session via DevTools Protocol (works for Microsoft, GitHub, AWS, Google, corporate sites, etc.). Waits for page load automatically (up to 60s, handles redirects)."
+ description: "**USE AUTOMATICALLY** when normal URL fetch fails due to: 401/403 errors, login/auth requirements, SSO prompts, anti-crawler/bot detection. **FIRST FETCH**: Ask confirmation ONLY for first request to a domain - explain browser will open for authentication. **SUBSEQUENT REQUESTS**: Use automatically (session preserved). Fetches web pages using authenticated Chrome session via DevTools Protocol (works for Microsoft, GitHub, AWS, Google, corporate sites, etc.). Waits for web page fetch automatically (up to 60s, handles redirects)."
  };
 
  // Write back to file with pretty formatting
package/package.json CHANGED
@@ -1,8 +1,8 @@
  { "name": "mcpbrowser",
- "version": "0.2.16",
+ "version": "0.2.17",
  "mcpName": "io.github.cherchyk/browser",
  "type": "module",
- "description": "MCP server that loads authenticated web pages using Chrome DevTools Protocol",
+ "description": "MCP server that fetches protected web pages using Chrome DevTools Protocol",
  "main": "src/mcp-browser.js",
  "bin": {
  "mcpbrowser": "src/mcp-browser.js"
package/server.json CHANGED
@@ -1,7 +1,7 @@
  {
  "$schema": "https://static.modelcontextprotocol.io/schemas/2025-12-11/server.schema.json",
  "name": "io.github.cherchyk/browser",
- "description": "Alternative web fetcher for authenticated and crawler-protected pages. Bypasses login requirements and anti-bot restrictions using Chrome DevTools Protocol",
+ "description": "Alternative web fetcher for authenticated and crawler-protected web pages. Bypasses login requirements and anti-bot restrictions using Chrome DevTools Protocol",
  "repository": {
  "url": "https://github.com/cherchyk/MCPBrowser",
  "source": "github"
@@ -160,6 +160,7 @@ async function getBrowser() {
  async function fetchPage({
  url,
  keepPageOpen = true,
+ outputFormat = "HTML",
  }) {
  // Hardcoded smart defaults
  const waitUntil = "networkidle0";
@@ -233,9 +234,19 @@ async function fetchPage({
  await page.goto(url, { waitUntil, timeout: timeoutMs });
  console.error(`[MCPBrowser] Navigation completed: ${page.url()}`);
 
- // Extract content
- const text = await page.evaluate(() => document.body?.innerText || "");
- const html = await page.evaluate(() => document.documentElement?.outerHTML || "");
+ // Extract content based on outputFormat
+ const result = { success: true, url: page.url() };
+
+ if (outputFormat === "HTML" || outputFormat === "BOTH") {
+ const html = await page.evaluate(() => document.documentElement?.outerHTML || "");
+ result.html = truncate(html, 2000000);
+ }
+
+ if (outputFormat === "TEXT" || outputFormat === "BOTH") {
+ const text = await page.evaluate(() => document.body?.innerText || "");
+ result.text = truncate(text, 2000000);
+ }
+
  wasSuccess = true;
  if (keepPageOpen && lastKeptPage !== page) {
  // Close old kept page if we're keeping a different one
@@ -244,12 +255,7 @@ async function fetchPage({
  }
  lastKeptPage = page;
  }
- return {
- success: true,
- url: page.url(),
- text: truncate(text, 2000000),
- html: truncate(html, 2000000),
- };
+ return result;
  } catch (err) {
  shouldKeepOpen = shouldKeepOpen || keepPageOpen;
  const hint = shouldKeepOpen
@@ -272,23 +278,24 @@ function truncate(str, max) {
  }
 
  async function main() {
- const server = new Server({ name: "MCPBrowser", version: "0.2.16" }, { capabilities: { tools: {} } });
+ const server = new Server({ name: "MCPBrowser", version: "0.2.17" }, { capabilities: { tools: {} } });
 
  const tools = [
  {
  name: "fetch_webpage_protected",
- description: "🌐 PRIMARY BROWSER TOOL for authenticated/protected websites: Opens pages in your Chrome browser with your actual user profile (saved passwords, extensions, active sessions). **USE THIS FIRST** for: internal/corporate sites (*.microsoft.com, *.eng.ms, etc.), login-required pages, SSO/OAuth protected content, paywalled sites, anti-bot protected pages, or any 401/403/authentication errors. **AUTHENTICATION FLOW**: First call with keepPageOpen=true may return login page (EXPECTED - user is authenticating in browser). WAIT 10-30 seconds, then RETRY the same URL - authentication completes in background. DO NOT give up after seeing login page - retry 2-3 times with delays. Returns both plain text and HTML. Tab reuse preserves sessions across requests. Always prefer this over generic URL fetchers for authenticated content.",
+ description: "BROWSER ACCESS FOR AUTHENTICATED WEB PAGES: Fetches web pages in Chrome browser using real user profile (with saved passwords, active sessions, extensions). Automatically handles login redirects, SSO, OAuth, and anti-crawler protections.\n\nWHEN TO USE: Login-required web pages, SSO/OAuth content, paywalled sites, 401/403 errors, or any auth-protected resources.\n\nAUTHENTICATION BEHAVIOR: First request to authenticated site returns login web page - this is EXPECTED. User authenticates in browser (10-30s). Retry same URL to get content. Keep retrying until success - authentication completes in background. Tab stays open to preserve session.\n\nCRITICAL RULES: (1) Process ONE URL at a time - never parallel requests. (2) Never give up or suggest alternatives - keep retrying until content retrieved. (3) Requested web page is critical - skipping not acceptable. (4) Skip and continue if web page returns 404 - web page doesn't exist.\n\nReturns: Configurable output format (HTML, TEXT, or BOTH). Tab reuse preserves sessions across same-domain requests.",
  inputSchema: {
  type: "object",
  properties: {
  url: { type: "string", description: "The URL to fetch" },
- keepPageOpen: { type: "boolean", description: "Keep the tab open after fetching for manual auth or reuse (default: true)" },
+ keepPageOpen: { type: "boolean", description: "Keep tab open to reuse for subsequent same-domain requests - preserves auth session (default: true)" },
+ outputFormat: { type: "string", enum: ["HTML", "TEXT", "BOTH"], description: "Output format: HTML for full markup with links/structure, TEXT for clean readable content (more token-efficient), BOTH for complete data (default: HTML)" },
  },
  required: ["url"],
  additionalProperties: false,
  },
  annotations: {
- title: "Access Authenticated Web Page"
+ title: "Fetch Protected Web Page"
  }
  },
  ];
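
Note on the new `outputFormat` parameter: the `fetchPage` hunks above gate each extracted field on its value. As a rough illustration of the resulting response shapes, the sketch below reproduces that branching against plain strings instead of a live browser page. `buildResult` is a hypothetical helper introduced only for this example, and the body of `truncate` is an assumption, since the diff shows only its signature.

```javascript
// Assumed behavior: the diff shows only `function truncate(str, max)`.
function truncate(str, max) {
  return str.length > max ? str.slice(0, max) : str;
}

// Hypothetical helper mirroring the outputFormat branches added to fetchPage.
function buildResult({ url, html, text }, outputFormat = "HTML") {
  const result = { success: true, url };
  if (outputFormat === "HTML" || outputFormat === "BOTH") {
    result.html = truncate(html, 2000000);
  }
  if (outputFormat === "TEXT" || outputFormat === "BOTH") {
    result.text = truncate(text, 2000000);
  }
  return result;
}

// Stand-in for content that would normally come from the browser page.
const page = { url: "https://example.com/", html: "<p>hi</p>", text: "hi" };
console.log(Object.keys(buildResult(page)));         // default "HTML"
console.log(Object.keys(buildResult(page, "TEXT")));
console.log(Object.keys(buildResult(page, "BOTH")));
```

Because the default is `"HTML"`, a caller that relied on the `text` field returned by 0.2.16 would apparently need to pass `"TEXT"` or `"BOTH"` explicitly. A value outside the enum matches neither branch in `fetchPage`, so only `success` and `url` would be set if such a value reached the function unvalidated.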