@natchs/browser-mcp 2.4.0 → 2.5.0

package/README.md CHANGED
@@ -1,470 +1,367 @@
1
- # @natchs/browser-mcp
2
-
3
- [![npm version](https://img.shields.io/npm/v/%40natchs%2Fbrowser-mcp?label=npm&logo=npm)](https://www.npmjs.com/package/@natchs/browser-mcp)
4
- [![License](https://img.shields.io/npm/l/%40natchs%2Fbrowser-mcp?color=blue&label=license)](LICENSE)
5
- [![Node Version](https://img.shields.io/node/v/%40natchs%2Fbrowser-mcp?logo=node.js)](package.json)
6
-
7
- ---
8
-
9
- ## 🇹🇷 Türkçe
10
-
11
- **Browser automation MCP server** — reverse engineering, web scraping ve data extraction için tasarlanmış, AI ajanların tarayıcıyı tam kontrol etmesini sağlayan bir Model Context Protocol sunucusu.
12
-
13
- ### Quick Start
14
-
15
- ```bash
16
- # 1. Install
17
- npm install @natchs/browser-mcp
18
-
19
- # 2. Run (Chromium browser otomatik kurulur, ilk çalıştırmada ~30sn)
20
- npx @natchs/browser-mcp
21
- ```
22
-
23
- #### MCP Client Config
24
-
25
- Claude Desktop, Cursor, VS Code veya herhangi bir MCP istemcisine eklemek için:
26
-
27
- ```json
28
- {
29
- "mcpServers": {
30
- "browser-mcp": {
31
- "command": "npx",
32
- "args": ["@natchs/browser-mcp"]
33
- }
34
- }
35
- }
36
- ```
37
-
38
- Alternatif: projeyi klonlayıp yerel build ile de kullanabilirsiniz:
39
-
40
- ```bash
41
- git clone https://github.com/natchs/browser-mcp.git
42
- cd browser-mcp
43
- npm ci
44
- npm run build
45
- npx @natchs/browser-mcp
46
- ```
47
-
48
- #### Browser Profili: 3 Mod
49
-
50
- Ajanın hangi tarayıcıyı kullanacağını `BROWSER_MODE` ortam değişkeni belirler:
51
-
52
- | Mod | Açıklama | Ne Zaman Kullanılır? |
53
- |-----|----------|---------------------|
54
- | `fresh` (default) | Playwright'ın kendi Chromium'u. Her seferinde sıfır profil, çerez/ext/login yok | Temiz ortam, iz bırakmamak |
55
- | `persistent` | **Senin Chrome profilin.** Gerçek çerezler, geçmiş, uzantılar, oturum açmış hesaplar — hepsi ajana kullanıma hazır | Ajanın sitelere "sen" olarak girmesi gereken işlemler |
56
- | `connect` | Halihazırda açık Chrome'una CDP ile bağlanır | Chrome'u elle kontrol ederken ajanın eşlik etmesi |
57
-
58
- **Örnek persistent (otomatik profil tespiti):**
59
-
60
- ```bash
61
- BROWSER_MODE=persistent npx @natchs/browser-mcp
62
- ```
63
-
64
- v2.3.0 ile Chrome profili otomatik tespit edilir. Hiçbir yol belirtmeniz gerekmez. Windows/macOS/Linux hepsinde çalışır. Manuel yol belirtmek için:
65
-
66
- ```bash
67
- BROWSER_MODE=persistent BROWSER_USER_DATA_DIR=C:\Users\...\User Data\Default npx @natchs/browser-mcp
68
- ```
69
-
70
- **Örnek — connect (mevcut Chrome'a bağlan):**
71
-
72
- ```bash
73
- # Önce Chrome'u debug port ile aç:
74
- "C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
75
-
76
- # Sonra ajanı bağla:
77
- BROWSER_MODE=connect npx @natchs/browser-mcp
78
- ```
79
-
80
- **MCP Client Config ile kullanım — her mod için ayrı profil:**
81
-
82
- ```json
83
- {
84
- "mcpServers": {
85
- "browser-mcp-persistent": {
86
- "command": "npx",
87
- "args": ["@natchs/browser-mcp"],
88
- "env": {
89
- "BROWSER_MODE": "persistent"
90
- }
91
- },
92
- "browser-mcp-fresh": {
93
- "command": "npx",
94
- "args": ["@natchs/browser-mcp"],
95
- "env": {
96
- "BROWSER_MODE": "fresh"
97
- }
98
- }
99
- }
100
- }
101
- ```
102
-
103
- ---
104
-
105
- ### Ortam Değişkenleri
106
-
107
- Tüm yapılandırma, ortam değişkenleri aracılığıyla yapılabilir. Aşağıdaki tablo, desteklenen tüm değişkenleri listeler.
108
-
109
- #### Browser
110
-
111
- | Variable | Default | Description | Required |
112
- |----------|---------|-------------|----------|
113
- | `BROWSER_MODE` | `fresh` | Browser modu: `fresh`, `persistent`, `connect` | No |
114
- | `BROWSER_HEADLESS` | `true` | Headless modda çalıştır (`true`/`false`/`1`/`0`/`yes`/`no`) | No |
115
- | `BROWSER_USER_DATA_DIR` | — | Chrome kullanıcı veri dizini (mutlak yol) | No |
116
- | `BROWSER_CHANNEL` | `""` | Browser kanalı (`chrome`, `msedge`, `chromium`, vb.) | No |
117
- | `BROWSER_DEBUG_PORT` | `9222` | CDP debug port numarası | No |
118
- | `BROWSER_AUTO_DETECT_PROFILE` | `true` | Chrome profilini otomatik tespit et | No |
119
- | `BROWSER_VIEWPORT_WIDTH` | `1280` | Görüntü alanı genişliği (px) | No |
120
- | `BROWSER_VIEWPORT_HEIGHT` | `720` | Görüntü alanı yüksekliği (px) | No |
121
- | `BROWSER_USER_AGENT` | — | Özel User-Agent string'i | No |
122
- | `BROWSER_LOCALE` | | Tarayıcı locale değeri (örn. `tr-TR`) | No |
123
- | `BROWSER_TIMEOUT` | `30000` | Browser işlemleri için timeout (ms) | No |
124
- | `BROWSER_AUTO_INSTALL` | `true` | Chromium'u otomatik kur (`false` ile devre dışı) | No |
125
-
126
- #### Network & Download
127
-
128
- | Variable | Default | Description | Required |
129
- |----------|---------|-------------|----------|
130
- | `NETWORK_HAR_ENABLED` | `1` | HAR yakalamayı etkinleştir (`0` ile devre dışı) | No |
131
- | `NETWORK_MAX_ENTRIES` | `5000` | Maksimum network kaydı sayısı | No |
132
- | `NETWORK_MAX_RESPONSE_BODY_SIZE` | `262144` | Maksimum response body boyutu (bytes) | No |
133
- | `NETWORK_STORE_RESPONSE_BODIES` | `false` | Response body'leri hafızada tut | No |
134
- | `NETWORK_EXCLUDE_BODY_TYPES` | `["image","media","font","stylesheet"]` | Body kaydı dışı bırakılacak tipler (JSON array) | No |
135
- | `NETWORK_EXPORT_DIR` | `./network-logs` | Network log dışa aktarma dizini | No |
136
- | `NETWORK_CAPTURE_FAILED` | `true` | Başarısız istekleri de yakala | No |
137
- | `NETWORK_CAPTURE_WS` | `true` | WebSocket frame'lerini yakala | No |
138
- | `NETWORK_WS_MAX_FRAMES` | `1000` | Maksimum WS frame sayısı | No |
139
- | `NETWORK_WS_MAX_FRAME_SIZE` | `65536` | Maksimum WS frame payload boyutu | No |
140
- | `NETWORK_MAX_MEMORY_MB` | `256` | Network capture için maksimum bellek (MB) | No |
141
- | `NETWORK_CAPTURE_REQUEST_BODY` | `false` | Request body'lerini de yakala | No |
142
- | `NETWORK_DOWNLOAD_DIR` | `./downloads` | İndirilen dosyalar için dizin | No |
143
- | `NETWORK_MAX_DOWNLOAD_SIZE` | `104857600` | Maksimum indirme boyutu (bytes, varsayılan 100MB) | No |
144
- | `NETWORK_DOWNLOAD_ENABLED` | `false` | Dosya indirme özelliğini etkinleştir | No |
145
- | `NETWORK_SAVE_ENABLED` | `false` | Network capture kaydetme özelliğini etkinleştir (`network_save` tool'u) | No |
146
-
147
- #### Rate Limiting
148
-
149
- | Variable | Default | Description | Required |
150
- |----------|---------|-------------|----------|
151
- | `RATE_LIMIT_ENABLED` | `true` | Rate limiting'i etkinleştir | No |
152
- | `RATE_LIMIT_GLOBAL_RPM` | `120` | Global dakikalık istek limiti | No |
153
- | `RATE_LIMIT_PER_TOOL_RPM` | `30` | Tool başına dakikalık istek limiti | No |
154
- | `RATE_LIMIT_BURST_SIZE` | `10` | Maksimum burst boyutu | No |
155
-
156
- #### Cache
157
-
158
- | Variable | Default | Description | Required |
159
- |----------|---------|-------------|----------|
160
- | `CACHE_MAX_ENTRIES` | `100` | Maksimum önbellek girişi sayısı | No |
161
- | `CACHE_TTL_SECONDS` | `300` | Önbellek TTL değeri (saniye) | No |
162
-
163
- #### Security
164
-
165
- | Variable | Default | Description | Required |
166
- |----------|---------|-------------|----------|
167
- | `SECURITY_API_KEY` | `""` | API anahtarı (boş = auth kapalı) | No |
168
- | `SECURITY_ALLOWED_DIRS` | `[]` | İzin verilen dizinler (JSON array) | No |
169
- | `SECURITY_MAX_MEMORY_MB` | `512` | Maksimum bellek kullanımı (MB) | No |
170
- | `SECURITY_DEFAULT_TIMEOUT` | `30000` | Varsayılan işlem timeout'u (ms) | No |
171
- | `SECURITY_MAX_TIMEOUT` | `120000` | Maksimum timeout (ms) | No |
172
- | `SECURITY_MAX_SESSIONS` | `10` | Maksimum eşzamanlı browser session sayısı | No |
173
-
174
- #### Config
175
-
176
- | Variable | Default | Description | Required |
177
- |----------|---------|-------------|----------|
178
- | `BROWSER_MCP_CONFIG` | — | Yapılandırma dosyası yolu (JSON) | No |
179
-
180
- ---
181
-
182
- ### Detaylı Özellikler
183
-
184
- #### Mevcut Yetenekler
185
-
186
- | Kategori | Tool Sayısı | Neler Yapabilir? |
187
- |----------|------------|-------------------|
188
- | **Navigation** | 5 | Sayfa açma, geri/ileri gitme, refresh, element/url bekleme |
189
- | **Interaction** | 8 | Tıklama, form doldurma, select kutusu, hover, drag-drop, klavye, dosya yükleme |
190
- | **Extraction** | 7 | HTML, Markdown, plain text, CSS selector, tablo, Schema.org, Open Graph |
191
- | **Network** | 10 | İstek/cevap takibi, HAR/JSON/CSV export, WebSocket frame yakalama, body kaydetme, console log |
192
- | **Browser Control** | 8 | Screenshot, PDF, çerez yönetimi, JavaScript çalıştırma, dosya indirme |
193
- | **Session** | 3 | Çoklu session açma/kapama/listeleme, otomatik TTL cleanup |
194
- | **Admin** | 3 | Server durumu, cache istatistikleri, cache temizleme |
195
- | **RE Tools** | 4 | JavaScript beautify/deobfuscate, API endpoint discovery, auth analizi |
196
- | **Stealth** | 2 | Fingerprint değiştirme, human behavior simülasyonu |
197
-
198
- **Toplam: 40+ tool, 9 kategori, production-grade mimari**
199
-
200
- #### Ne Yapabilir?
201
-
202
- - **Reverse Engineering**: JS deobfuscation, API auth analizi, network isteklerini HAR formatında export
203
- - **Web Scraping**: Cloudflare koruması olmayan sitelerden HTML/Markdown/text çekme, tablo ve Schema.org yapılandırılmış veri extraction
204
- - **Network Monitoring**: HTTP/HTTPS isteklerini gerçek zamanlı yakalama, WebSocket frame'lerini izleme, binary body'leri diske kaydetme
205
- - **Form Automation**: Login formları, search, multi-step form doldurma, dosya upload
206
- - **Session Yönetimi**: Her biri izole, aynı anda birden çok browser session'ı yönetme, otomatik TTL ile cleanup
207
- - **Güvenlik**: SSRF koruması (internal IP bloklama, DNS rebind engelleme, redirect bypass koruması), path traversal engelleme, MIME tabanlı extension blocklist, cookie leak koruması
208
- - **Data Export**: network capture'ları HAR/JSON/CSV export, body save, WS frame save (.jsonl)
209
- - **Observability**: Structured JSON logging, tool metrikleri (call count, süre, hata), LRU cache + TTL
210
- - **Rate Limiting**: Token bucket, global + per-tool, yapılandırılabilir RPM
211
-
212
- #### Browser Profil Yönetimi (v2.3.0)
213
-
214
- Ajan, 3 farklı browser modundan biriyle çalışır:
215
-
216
- - **`fresh` (default):** Playwright'ın kendi Chromium'unu kullanır, her seferinde sıfır profil. Çerez, geçmiş, uzantı — hiçbiri taşınmaz.
217
- - **`persistent`:** Gerçek Chrome profilinizi kullanır. Oturum açtığınız tüm sitelere (Gmail, GitHub, ChatGPT, kurumsal VPN) ajan da "sizmiş gibi" erişir. v2.3.0 ile profil otomatik tespit edilir — `BROWSER_MODE=persistent` yazmanız yeterli.
218
- - **`connect`:** Halihazırda `--remote-debugging-port` ile açılmış Chrome'unuza bağlanır. Mevcut sekmeleri ve oturumları ajanla paylaşırsınız.
219
-
220
- #### Ne Yapamaz? (Mevcut Sınırlamalar)
221
-
222
- - **Cloudflare/anti-bot korumalı siteler**: Challenge sayfalarını geçemez, manuel çözüm gerekir
223
- - **Captcha çözümü**: Dahili captcha çözücü yoktur (üçüncü parti servislerle entegre edilebilir)
224
- - **Görsel tanıma**: Screenshot alabilir ama içeriği yorumlayamaz
225
- - **Native dosya indirme**: `browser_download` URL bazlıdır, Playwright native download event'ini kapsamaz
226
- - **Canvas/WebGL fingerprinting**: Temel fingerprint değişir ama gelişmiş anti-bot sistemlerini geçemeyebilir
227
- - **Mobil browser simülasyonu**: Sadece Chromium desktop, mobile viewport taklidi yapabilir
228
-
229
- #### Güvenlik Özellikleri
230
-
231
- - `redirect: 'manual'` tüm URL fetch çağrılarında SSRF redirect bypass koruması
232
- - `validateUrlAsync` DNS lookup DNS rebind saldırılarına karşı
233
- - `DANGEROUS_EXTENSIONS` blocklist (.exe, .bat, .sh vb) → güvenli `.bin` fallback
234
- - Pseudo-FS path blocking (`/proc/`, `/sys/`, `/etc/` vb)
235
- - Cookie header filtresi — Stage 3 HTTP re-fetch'te cross-origin credential sızıntısı önlenir
236
- - `downloadEnabled` (download) ve `networkSaveEnabled` (network_save) varsayılan **false** — kullanıcı açıkça enable etmeden çalışmaz
237
- - Input sanitizasyonu (null byte, path traversal, fileName injection)
238
-
239
- #### MCP Client Uyumluluğu
240
-
241
- Claude Desktop, Cursor, VS Code (Cline, Roo Code), Continue.dev, özel MCP istemcileri — protocol uyumlu tüm platformlar.
242
-
243
- ---
244
-
245
- ### Daha Fazlası Yolda
246
-
247
- Bu proje aktif geliştirme aşamasındadır. Kısa süre içinde:
248
-
249
- - **Yeni tool'lar**: PDF extraction, screenshot annotation, form detection, cookie manager, session snapshot/restore
250
- - **Cloudflare bypass**: Playwright Stealth entegrasyonu, rotatable proxy desteği
251
- - **Batch scraping**: Çoklu sayfa scraping pipeline, queue sistemi
252
- - **Performance**: Ring buffer optimizasyonu, streaming response, lazy evaluation
253
- - **Developer Experience**: Interaktif CLI, playground UI, type-safe client SDK
254
-
255
- ---
256
-
257
- ### Geliştirme
258
-
259
- ```bash
260
- # Test
261
- npm test
262
- npm run test:coverage
263
-
264
- # Type check (tsc --noEmit)
265
- npm run lint
266
-
267
- # Build
268
- npm run build
269
- ```
270
-
271
- ### Lisans
272
-
273
- MIT
274
-
275
- ---
276
-
277
- ## 🇬🇧 English
278
-
279
- **Browser automation MCP server** — a Model Context Protocol server designed for reverse engineering, web scraping, and data extraction. It gives AI agents full control over a browser.
280
-
281
- ### Quick Start
282
-
283
- ```bash
284
- # 1. Install
285
- npm install @natchs/browser-mcp
286
-
287
- # 2. Run (Chromium installs automatically ~30s on first run)
288
- npx @natchs/browser-mcp
289
- ```
290
-
291
- #### MCP Client Config
292
-
293
- Add to Claude Desktop, Cursor, VS Code, or any MCP client:
294
-
295
- ```json
296
- {
297
- "mcpServers": {
298
- "browser-mcp": {
299
- "command": "npx",
300
- "args": ["@natchs/browser-mcp"]
301
- }
302
- }
303
- }
304
- ```
305
-
306
- You can also clone and build locally:
307
-
308
- ```bash
309
- git clone https://github.com/natchs/browser-mcp.git
310
- cd browser-mcp
311
- npm ci
312
- npm run build
313
- npx @natchs/browser-mcp
314
- ```
315
-
316
- #### Browser Profile: 3 Modes
317
-
318
- The `BROWSER_MODE` environment variable determines which browser the agent uses:
319
-
320
- | Mode | Description | When to Use |
321
- |------|-------------|-------------|
322
- | `fresh` (default) | Playwright's own Chromium. Fresh profile every time, no cookies/extensions/logins | Clean environment, leave no trace |
323
- | `persistent` | **Your Chrome profile.** Real cookies, history, extensions, logged-in accounts — all available to the agent | When the agent needs to act "as you" on sites |
324
- | `connect` | Connects to your already-open Chrome via CDP | When manually controlling Chrome alongside the agent |
325
-
326
- **Example — persistent (auto-detect profile):**
327
-
328
- ```bash
329
- BROWSER_MODE=persistent npx @natchs/browser-mcp
330
- ```
331
-
332
- Since v2.3.0 the Chrome profile is auto-detected. No path needed. Works on Windows/macOS/Linux. To specify a manual path:
333
-
334
- ```bash
335
- BROWSER_MODE=persistent BROWSER_USER_DATA_DIR=C:\Users\...\User Data\Default npx @natchs/browser-mcp
336
- ```
337
-
338
- **Example — connect (attach to existing Chrome):**
339
-
340
- ```bash
341
- # First open Chrome with debug port:
342
- "C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
343
-
344
- # Then start the agent:
345
- BROWSER_MODE=connect npx @natchs/browser-mcp
346
- ```
347
-
348
- **MCP Client Config — separate profiles per mode:**
349
-
350
- ```json
351
- {
352
- "mcpServers": {
353
- "browser-mcp-persistent": {
354
- "command": "npx",
355
- "args": ["@natchs/browser-mcp"],
356
- "env": {
357
- "BROWSER_MODE": "persistent"
358
- }
359
- },
360
- "browser-mcp-fresh": {
361
- "command": "npx",
362
- "args": ["@natchs/browser-mcp"],
363
- "env": {
364
- "BROWSER_MODE": "fresh"
365
- }
366
- }
367
- }
368
- }
369
- ```
370
-
371
- ---
372
-
373
- ### Environment Variables
374
-
375
- All configuration can be done via environment variables. The table below lists all supported variables. (Same as the Turkish table above — see the Turkish section for the full list.)
376
-
377
- ---
378
-
379
- ### Detailed Features
380
-
381
- #### Current Capabilities
382
-
383
- | Category | Tools | Capabilities |
384
- |----------|-------|--------------|
385
- | **Navigation** | 5 | Open page, back/forward, refresh, wait for element/URL |
386
- | **Interaction** | 8 | Click, fill form, select box, hover, drag-drop, keyboard, file upload |
387
- | **Extraction** | 7 | HTML, Markdown, plain text, CSS selector, tables, Schema.org, Open Graph |
388
- | **Network** | 10 | Request/response tracking, HAR/JSON/CSV export, WebSocket frame capture, body save, console log |
389
- | **Browser Control** | 8 | Screenshot, PDF, cookie management, JavaScript execution, file download |
390
- | **Session** | 3 | Multi-session open/close/list, automatic TTL cleanup |
391
- | **Admin** | 3 | Server status, cache statistics, cache clear |
392
- | **RE Tools** | 4 | JavaScript beautify/deobfuscate, API endpoint discovery, auth analysis |
393
- | **Stealth** | 2 | Fingerprint rotation, human behavior simulation |
394
-
395
- **Total: 40+ tools, 9 categories, production-grade architecture**
396
-
397
- #### What It Can Do
398
-
399
- - **Reverse Engineering**: JS deobfuscation, API auth analysis, network request export in HAR format
400
- - **Web Scraping**: HTML/Markdown/text extraction from non-Cloudflare sites, table and Schema.org structured data extraction
401
- - **Network Monitoring**: Real-time HTTP/HTTPS interception, WebSocket frame inspection, binary body save to disk
402
- - **Form Automation**: Login forms, search, multi-step form filling, file upload
403
- - **Session Management**: Each isolated, concurrent browser sessions with automatic TTL cleanup
404
- - **Security**: SSRF protection (internal IP blocking, DNS rebind prevention, redirect bypass guard), path traversal prevention, MIME-based extension blocklist, cookie leak protection
405
- - **Data Export**: Network captures as HAR/JSON/CSV, body save, WS frame save (.jsonl)
406
- - **Observability**: Structured JSON logging, tool metrics (call count, duration, errors), LRU cache + TTL
407
- - **Rate Limiting**: Token bucket, global + per-tool, configurable RPM
408
-
409
- #### Browser Profile Management (v2.3.0)
410
-
411
- The agent works in one of 3 browser modes:
412
-
413
- - **`fresh` (default):** Uses Playwright's own Chromium, fresh profile every time. No cookies, history, or extensions are carried over.
414
- - **`persistent`:** Uses your real Chrome profile. The agent accesses all your logged-in sites (Gmail, GitHub, ChatGPT, corporate VPN) as "you." Since v2.3.0, profiles are auto-detected — just set `BROWSER_MODE=persistent`.
415
- - **`connect`:** Connects to your already-running Chrome via `--remote-debugging-port`. Share existing tabs and sessions with the agent.
416
-
417
- #### Limitations
418
-
419
- - **Cloudflare/anti-bot sites**: Cannot bypass challenge pages, manual solving required
420
- - **Captcha solving**: No built-in captcha solver (third-party services can be integrated)
421
- - **Visual recognition**: Can take screenshots but cannot interpret content
422
- - **Native file download**: `browser_download` is URL-based, does not cover Playwright native download events
423
- - **Canvas/WebGL fingerprinting**: Basic fingerprint changes but may not bypass advanced anti-bot systems
424
- - **Mobile browser simulation**: Chromium desktop only, can emulate mobile viewport
425
-
426
- #### Security Features
427
-
428
- - `redirect: 'manual'` on all URL fetch calls — SSRF redirect bypass protection
429
- - `validateUrlAsync` DNS lookup — DNS rebind attack prevention
430
- - `DANGEROUS_EXTENSIONS` blocklist (.exe, .bat, .sh etc.) → safe `.bin` fallback
431
- - Pseudo-FS path blocking (`/proc/`, `/sys/`, `/etc/` etc.)
432
- - Cookie header filter — prevents cross-origin credential leakage in Stage 3 HTTP re-fetch
433
- - `downloadEnabled` and `networkSaveEnabled` default to **false** — must be explicitly enabled
434
- - Input sanitization (null byte, path traversal, fileName injection)
435
-
436
- #### MCP Client Compatibility
437
-
438
- Claude Desktop, Cursor, VS Code (Cline, Roo Code), Continue.dev, custom MCP clients — all protocol-compliant platforms.
439
-
440
- ---
441
-
442
- ### Roadmap
443
-
444
- This project is under active development. Coming soon:
445
-
446
- - **New tools**: PDF extraction, screenshot annotation, form detection, cookie manager, session snapshot/restore
447
- - **Cloudflare bypass**: Playwright Stealth integration, rotatable proxy support
448
- - **Batch scraping**: Multi-page scraping pipeline, queue system
449
- - **Performance**: Ring buffer optimization, streaming response, lazy evaluation
450
- - **Developer Experience**: Interactive CLI, playground UI, type-safe client SDK
451
-
452
- ---
453
-
454
- ### Development
455
-
456
- ```bash
457
- # Test
458
- npm test
459
- npm run test:coverage
460
-
461
- # Type check (tsc --noEmit)
462
- npm run lint
463
-
464
- # Build
465
- npm run build
466
- ```
467
-
468
- ### License
469
-
470
- MIT
1
+ # @natchs/browser-mcp
2
+
3
+ [![npm version](https://img.shields.io/npm/v/%40natchs%2Fbrowser-mcp?label=npm&logo=npm)](https://www.npmjs.com/package/@natchs/browser-mcp)
4
+ [![License](https://img.shields.io/npm/l/%40natchs%2Fbrowser-mcp?color=blue&label=license)](LICENSE)
5
+ [![Node Version](https://img.shields.io/node/v/%40natchs%2Fbrowser-mcp?logo=node.js)](package.json)
6
+
7
+ ---
8
+
9
+ # Your AI Agent's Browser Superpowers — Reverse Engineering, Network Interception & Full Browser Control
10
+
11
+ **`@natchs/browser-mcp`** is a Model Context Protocol server that gives AI agents **complete browser control**: navigate, click, scrape, intercept network traffic, export HAR files, capture WebSocket frames, deobfuscate JavaScript, and more, all through a single MCP interface.
12
+
13
+ Three browser modes (fresh / persistent / connect), production-grade security, 50+ tools, 9 categories. Built for reverse engineers, data extraction pipelines, and AI-powered automation.
14
+
15
+ ---
16
+
17
+ ## Features
18
+
19
+ - 🔍 **Reverse Engineering Toolkit**: JS beautify/deobfuscate, API endpoint discovery, auth flow analysis
20
+ - 🌐 **Web Scraping** — HTML, Markdown, text, CSS selectors, tables, Schema.org JSON-LD, Open Graph
21
+ - 📡 **Network Intelligence** — Real-time HTTP/HTTPS interception, WebSocket frame capture, HAR/JSON/CSV export
22
+ - 🕶️ **Stealth Mode** — Fingerprint rotation, human behavior simulation
23
+ - 🔐 **Enterprise Security** — SSRF protection, DNS rebind prevention, path traversal guards, cookie leak prevention
24
+ - 🧩 **3 Browser Profiles** — Fresh (isolated), Persistent (your Chrome profile, auto-detected), Connect (existing browser)
25
+ - 📦 **50+ Tools** — Navigation, interaction, extraction, network, browser control, sessions, admin, RE, stealth
26
+ - ⚡ **Production Ready** — Rate limiting, LRU cache, structured logging, metrics, plugin system, configurable timeouts
27
+
28
+ ---
29
+
30
+ ## Quick Start
31
+
32
+ ```bash
33
+ # Install
34
+ npm install @natchs/browser-mcp
35
+
36
+ # Run (Chromium installs automatically on first run, which takes ~30s)
37
+ npx @natchs/browser-mcp
38
+ ```
39
+
40
+ ### MCP Client Config
41
+
42
+ Add to Claude Desktop, Cursor, VS Code (Cline, Roo Code), Continue.dev, or any MCP-compatible client:
43
+
44
+ ```json
45
+ {
46
+ "mcpServers": {
47
+ "browser-mcp": {
48
+ "command": "npx",
49
+ "args": ["@natchs/browser-mcp"]
50
+ }
51
+ }
52
+ }
53
+ ```
54
+
55
+ ### Local Build
56
+
57
+ ```bash
58
+ git clone https://github.com/natchs/browser-mcp.git
59
+ cd browser-mcp
60
+ npm ci
61
+ npm run build
62
+ npx @natchs/browser-mcp
63
+ ```
64
+
65
+ ---
66
+
67
+ ## Browser Modes
68
+
69
+ The `BROWSER_MODE` environment variable controls which browser profile the agent uses:
70
+
71
+ | Mode | Description | Best For |
72
+ |------|-------------|----------|
73
+ | `fresh` (default) | Playwright's own Chromium. Fresh profile every time — no cookies, history, or extensions carried over | Clean room analysis, leave-no-trace operations |
74
+ | `persistent` | **Your real Chrome profile.** Auto-detected since v2.3.0. All cookies, sessions, extensions, and logged-in accounts available to the agent | When the agent needs to act "as you" — Gmail, GitHub, ChatGPT, corporate portals |
75
+ | `connect` | Attaches to your already-running Chrome via CDP debug port | Hybrid workflows — manually drive Chrome while the agent assists |
76
+
77
+ ### Persistent (auto-detect)
78
+
79
+ ```bash
80
+ BROWSER_MODE=persistent npx @natchs/browser-mcp
81
+ ```
82
+
83
+ Since v2.3.0, the Chrome profile is auto-detected on Windows, macOS, and Linux, so no path is needed. To set the path manually:
84
+
85
+ ```bash
86
+ BROWSER_MODE=persistent BROWSER_USER_DATA_DIR="C:\Users\...\User Data\Default" npx @natchs/browser-mcp
87
+ ```
88
+
89
+ ### Connect (attach to existing Chrome)
90
+
91
+ ```bash
92
+ # Start Chrome with debug port first:
93
+ "C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
94
+
95
+ # Then launch the agent:
96
+ BROWSER_MODE=connect npx @natchs/browser-mcp
97
+ ```
98
+
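Before starting the server in `connect` mode, it can help to confirm that Chrome is actually listening on the debug port. The sketch below probes the `/json/version` HTTP endpoint that Chrome serves when launched with `--remote-debugging-port`. The helper names `cdpVersionUrl` and `probeChrome` are hypothetical and not part of the package, and the global `fetch` assumes Node 18 or later.

```typescript
// Builds the DevTools HTTP endpoint that Chrome exposes on its remote debugging port.
function cdpVersionUrl(port: number, host: string = "127.0.0.1"): string {
  return `http://${host}:${port}/json/version`;
}

// Hypothetical helper: resolves to the browser's WebSocket debugger URL,
// or undefined when nothing answers on the port.
async function probeChrome(port: number): Promise<string | undefined> {
  try {
    const response = await fetch(cdpVersionUrl(port));
    if (!response.ok) return undefined;
    const info = (await response.json()) as { webSocketDebuggerUrl?: string };
    return info.webSocketDebuggerUrl;
  } catch {
    return undefined; // e.g. connection refused: Chrome is not listening here
  }
}
```

If `probeChrome(9222)` resolves to `undefined`, Chrome was most likely started without the debug flag or on a different port than `BROWSER_DEBUG_PORT`.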
99
+ ### Per-Mode Client Profiles
100
+
101
+ ```json
102
+ {
103
+ "mcpServers": {
104
+ "browser-mcp-persistent": {
105
+ "command": "npx",
106
+ "args": ["@natchs/browser-mcp"],
107
+ "env": { "BROWSER_MODE": "persistent" }
108
+ },
109
+ "browser-mcp-fresh": {
110
+ "command": "npx",
111
+ "args": ["@natchs/browser-mcp"],
112
+ "env": { "BROWSER_MODE": "fresh" }
113
+ }
114
+ }
115
+ }
116
+ ```
117
+
118
+ ---
119
+
120
+ ## Environment Variables
121
+
122
+ All configuration is managed through environment variables.
123
+
124
+ ### Browser
125
+
126
+ | Variable | Default | Description | Required |
127
+ |----------|---------|-------------|----------|
128
+ | `BROWSER_MODE` | `fresh` | Browser mode: `fresh`, `persistent`, `connect` | No |
129
+ | `BROWSER_HEADLESS` | `true` | Run in headless mode (`true`/`false`/`1`/`0`/`yes`/`no`) | No |
130
+ | `BROWSER_USER_DATA_DIR` | | Chrome user data directory (absolute path) | No |
131
+ | `BROWSER_CHANNEL` | `""` | Browser channel (`chrome`, `msedge`, `chromium`, etc.) | No |
132
+ | `BROWSER_DEBUG_PORT` | `9222` | CDP debug port number | No |
133
+ | `BROWSER_AUTO_DETECT_PROFILE` | `true` | Auto-detect Chrome profile location | No |
134
+ | `BROWSER_VIEWPORT_WIDTH` | `1280` | Viewport width (px) | No |
135
+ | `BROWSER_VIEWPORT_HEIGHT` | `720` | Viewport height (px) | No |
136
+ | `BROWSER_USER_AGENT` | | Custom User-Agent string | No |
137
+ | `BROWSER_LOCALE` | | Browser locale (e.g. `en-US`) | No |
138
+ | `BROWSER_TIMEOUT` | `30000` | Browser operation timeout (ms) | No |
139
+ | `BROWSER_AUTO_INSTALL` | `true` | Auto-install Chromium (`false` to disable) | No |
140
+
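The `BROWSER_HEADLESS` row lists six accepted spellings. The following is a minimal sketch of how such a value could be normalized, not the package's actual parser: `parseBoolEnv` is a hypothetical helper, and both the case-insensitive matching and the fallback for unrecognized input are assumptions.

```typescript
// Hypothetical helper: normalizes the boolean spellings listed for BROWSER_HEADLESS.
// Case-insensitive matching and the fallback rule are assumptions.
function parseBoolEnv(raw: string | undefined, fallback: boolean): boolean {
  if (raw === undefined) return fallback;
  const value = raw.trim().toLowerCase();
  if (value === "true" || value === "1" || value === "yes") return true;
  if (value === "false" || value === "0" || value === "no") return false;
  return fallback; // unrecognized input keeps the documented default
}
```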
141
+ ### Network & Download
142
+
143
+ | Variable | Default | Description | Required |
144
+ |----------|---------|-------------|----------|
145
+ | `NETWORK_HAR_ENABLED` | `1` | Enable HAR capture (`0` to disable) | No |
146
+ | `NETWORK_MAX_ENTRIES` | `5000` | Max network entries stored | No |
147
+ | `NETWORK_MAX_RESPONSE_BODY_SIZE` | `262144` | Max response body size (bytes) | No |
148
+ | `NETWORK_STORE_RESPONSE_BODIES` | `false` | Keep response bodies in memory | No |
149
+ | `NETWORK_EXCLUDE_BODY_TYPES` | `["image","media","font","stylesheet"]` | Body types excluded from storage (JSON array) | No |
150
+ | `NETWORK_EXPORT_DIR` | `./network-logs` | Network log export directory | No |
151
+ | `NETWORK_CAPTURE_FAILED` | `true` | Capture failed requests too | No |
152
+ | `NETWORK_CAPTURE_WS` | `true` | Capture WebSocket frames | No |
153
+ | `NETWORK_WS_MAX_FRAMES` | `1000` | Max WebSocket frames stored | No |
154
+ | `NETWORK_WS_MAX_FRAME_SIZE` | `65536` | Max WebSocket frame payload (bytes) | No |
155
+ | `NETWORK_MAX_MEMORY_MB` | `256` | Max memory for network capture (MB) | No |
156
+ | `NETWORK_CAPTURE_REQUEST_BODY` | `false` | Capture request bodies too | No |
157
+ | `NETWORK_DOWNLOAD_DIR` | `./downloads` | Download directory | No |
158
+ | `NETWORK_MAX_DOWNLOAD_SIZE` | `104857600` | Max download size in bytes (default is 100 MiB) | No |
159
+ | `NETWORK_DOWNLOAD_ENABLED` | `false` | Enable file downloads (`browser_download` tool) | No |
160
+ | `NETWORK_SAVE_ENABLED` | `false` | Enable network capture save (`network_save` tool) | No |
161
+
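Variables such as `NETWORK_EXCLUDE_BODY_TYPES` take a JSON array, so the value has to survive both shell quoting and JSON parsing. The sketch below shows one way such a value might be read. `parseJsonArrayEnv` is a hypothetical helper, and falling back to the default on malformed input is an assumption about behaviour the README does not specify.

```typescript
// Hypothetical helper: reads a JSON-array variable such as NETWORK_EXCLUDE_BODY_TYPES.
// Falling back to the default on malformed input is an assumption.
function parseJsonArrayEnv(raw: string | undefined, fallback: string[]): string[] {
  if (raw === undefined || raw.trim() === "") return fallback;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (Array.isArray(parsed) && parsed.every((item) => typeof item === "string")) {
      return parsed as string[];
    }
  } catch {
    // Malformed JSON falls through to the default below.
  }
  return fallback;
}
```

In a POSIX shell the array needs single quotes so the inner double quotes are preserved, for example `NETWORK_EXCLUDE_BODY_TYPES='["image","font"]'`.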
162
+ ### Rate Limiting
163
+
164
+ | Variable | Default | Description | Required |
165
+ |----------|---------|-------------|----------|
166
+ | `RATE_LIMIT_ENABLED` | `true` | Enable rate limiting | No |
167
+ | `RATE_LIMIT_GLOBAL_RPM` | `120` | Global requests per minute | No |
168
+ | `RATE_LIMIT_PER_TOOL_RPM` | `30` | Per-tool requests per minute | No |
169
+ | `RATE_LIMIT_BURST_SIZE` | `10` | Max burst size | No |
170
+
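The earlier README text describes the limiter as a token bucket with global and per-tool limits. The sketch below shows the general technique under stated assumptions: the bucket holds at most `RATE_LIMIT_BURST_SIZE` tokens and refills continuously at the configured requests per minute. The class name and the injected clock are illustrative; how the package actually maps these variables onto its limiter is not documented here.

```typescript
// Sketch of a token bucket. How the package maps RPM and burst size onto
// its own limiter is an assumption.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly ratePerMinute: number; // e.g. RATE_LIMIT_GLOBAL_RPM
  private readonly burstSize: number; // e.g. RATE_LIMIT_BURST_SIZE

  constructor(ratePerMinute: number, burstSize: number, nowMs: number) {
    this.ratePerMinute = ratePerMinute;
    this.burstSize = burstSize;
    this.tokens = burstSize; // start full so an initial burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Consumes one token and returns true if a call is allowed at time nowMs.
  tryAcquire(nowMs: number): boolean {
    const elapsedMs = Math.max(0, nowMs - this.lastRefillMs);
    const refill = (elapsedMs * this.ratePerMinute) / 60000;
    this.tokens = Math.min(this.burstSize, this.tokens + refill);
    this.lastRefillMs = Math.max(this.lastRefillMs, nowMs);
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

With the defaults (120 RPM, burst 10), ten back-to-back calls pass, the eleventh is rejected, and two more tokens become available after one second.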
171
+ ### Cache
172
+
173
+ | Variable | Default | Description | Required |
174
+ |----------|---------|-------------|----------|
175
+ | `CACHE_MAX_ENTRIES` | `100` | Max cache entries | No |
176
+ | `CACHE_TTL_SECONDS` | `300` | Cache TTL (seconds) | No |
177
+
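The two cache variables map naturally onto an LRU cache with a per-entry TTL. The following sketch illustrates that combination using the insertion order of a `Map`. `LruTtlCache` is a hypothetical class, and the package's real cache internals are not shown in the README.

```typescript
// Sketch of an LRU cache with per-entry TTL. The class is illustrative only.
class LruTtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAtMs: number }>();
  private readonly maxEntries: number; // e.g. CACHE_MAX_ENTRIES
  private readonly ttlMs: number; // e.g. CACHE_TTL_SECONDS * 1000

  constructor(maxEntries: number, ttlSeconds: number) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlSeconds * 1000;
  }

  get(key: string, nowMs: number): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    this.entries.delete(key);
    if (entry.expiresAtMs <= nowMs) return undefined; // expired entries are dropped
    this.entries.set(key, entry); // re-insert to mark as most recently used
    return entry.value;
  }

  set(key: string, value: V, nowMs: number): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // A Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next();
      if (!oldest.done) this.entries.delete(oldest.value);
    }
    this.entries.set(key, { value, expiresAtMs: nowMs + this.ttlMs });
  }
}
```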
178
+ ### Security
179
+
180
+ | Variable | Default | Description | Required |
181
+ |----------|---------|-------------|----------|
182
+ | `SECURITY_API_KEY` | `""` | API key for authentication (empty = auth disabled) | No |
183
+ | `SECURITY_ALLOWED_DIRS` | `[]` | Allowed directories for file access (JSON array) | No |
184
+ | `SECURITY_MAX_MEMORY_MB` | `512` | Max memory usage (MB) | No |
185
+ | `SECURITY_DEFAULT_TIMEOUT` | `30000` | Default operation timeout (ms) | No |
186
+ | `SECURITY_MAX_TIMEOUT` | `120000` | Max timeout (ms) | No |
187
+ | `SECURITY_MAX_SESSIONS` | `10` | Max concurrent browser sessions | No |
188
+
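`SECURITY_DEFAULT_TIMEOUT` and `SECURITY_MAX_TIMEOUT` together suggest that a per-call timeout is bounded. The sketch below shows one plausible reading, in which a missing or invalid request uses the default and anything larger is clamped to the maximum. `resolveTimeoutMs` is a hypothetical helper, and whether the package clamps or rejects oversized values is an assumption.

```typescript
// Hypothetical helper: picks the effective timeout for one operation.
// Clamping (rather than rejecting) oversized values is an assumption.
function resolveTimeoutMs(
  requestedMs: number | undefined,
  defaultMs: number, // e.g. SECURITY_DEFAULT_TIMEOUT
  maxMs: number, // e.g. SECURITY_MAX_TIMEOUT
): number {
  if (requestedMs === undefined || !isFinite(requestedMs) || requestedMs <= 0) {
    return defaultMs;
  }
  return Math.min(requestedMs, maxMs);
}
```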
189
+ ### Config
190
+
191
+ | Variable | Default | Description | Required |
192
+ |----------|---------|-------------|----------|
193
+ | `BROWSER_MCP_CONFIG` | | Config file path (JSON) | No |
194
+
195
+ ---
196
+
197
+ ## All Tools
198
+
199
+ ### Navigation (5)
200
+
201
+ | Tool | Description |
202
+ |------|-------------|
203
+ | `browser_navigate` | Navigate to a URL |
204
+ | `browser_go_back` | Go back in history |
205
+ | `browser_go_forward` | Go forward in history |
206
+ | `browser_refresh` | Refresh the current page |
207
+ | `browser_wait_for` | Wait for a specified timeout |
208
+
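MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `browser_navigate`. The envelope follows the Model Context Protocol, but the `url` argument name is an assumption, since the README does not list each tool's input schema (a client can retrieve it with `tools/list`). `buildToolCall` is a hypothetical helper.

```typescript
// Shape of a JSON-RPC 2.0 "tools/call" request as defined by the Model Context Protocol.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in the request envelope.
function buildToolCall(id: number, toolName: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name: toolName, arguments: args } };
}

// The `url` argument name is an assumption; check the tool's input schema before relying on it.
const navigateRequest = buildToolCall(1, "browser_navigate", { url: "https://example.com" });
const navigatePayload = JSON.stringify(navigateRequest);
```

Most users never write this by hand, because the MCP client generates it, but seeing the envelope helps when reading server logs or debugging a custom client.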
209
+ ### Interaction (8)
210
+
211
+ | Tool | Description |
212
+ |------|-------------|
213
+ | `browser_click` | Click an element by CSS selector |
214
+ | `browser_fill` | Fill text into an input field |
215
+ | `browser_select` | Select option(s) in a dropdown |
216
+ | `browser_hover` | Hover over an element |
217
+ | `browser_drag` | Drag and drop an element |
218
+ | `browser_type` | Type text character by character |
219
+ | `browser_press_key` | Press a keyboard key |
220
+ | `browser_file_upload` | Upload files via file input |
221
+
222
+ ### Extraction (7)
223
+
224
+ | Tool | Description |
225
+ |------|-------------|
226
+ | `browser_extract_html` | Extract full page HTML |
227
+ | `browser_extract_markdown` | Extract page as approximate markdown |
228
+ | `browser_extract_text` | Extract visible text |
229
+ | `browser_extract_with_css` | Extract data matching a CSS selector |
230
+ | `browser_extract_table` | Extract tables as structured JSON |
231
+ | `browser_extract_schema_org` | Extract Schema.org JSON-LD data |
232
+ | `browser_extract_open_graph` | Extract Open Graph meta tags |
233
+
234
+ ### Network (10)
235
+
236
+ | Tool | Description |
237
+ |------|-------------|
238
+ | `browser_network_requests` | List network requests made by the page |
239
+ | `browser_network_response` | Get full response details for a request |
240
+ | `browser_get_console` | Get console messages from the page |
241
+ | `browser_handle_dialog` | Accept or dismiss a browser dialog |
242
+ | `browser_wait_for_navigation` | Wait for the page to navigate |
243
+ | `browser_get_network_entries` | List detailed network entries with timing/sizes/headers |
244
+ | `browser_network_export` | Export network log to HAR, JSON, or CSV |
245
+ | `browser_websocket_frames` | List captured WebSocket frames |
246
+ | `browser_network_clear` | Clear captured network data |
247
+ | `browser_network_save` | Save network response or WS frames to disk |
248
+
249
+ ### Browser Control (9)
250
+
251
+ | Tool | Description |
252
+ |------|-------------|
253
+ | `browser_screenshot` | Take a screenshot (page or element) |
254
+ | `browser_page_info` | Get page title, URL, viewport info |
255
+ | `browser_get_cookies` | Get all cookies |
256
+ | `browser_set_cookie` | Set a cookie |
257
+ | `browser_delete_cookie` | Delete a cookie by name |
258
+ | `browser_evaluate` | Execute JavaScript in page context |
259
+ | `browser_pdf` | Generate a PDF of the current page |
260
+ | `browser_download` | Download a URL to disk |
261
+ | `browser_scroll` | Scroll the page or an element |
262
+
263
+ ### Session (3)
264
+
265
+ | Tool | Description |
266
+ |------|-------------|
267
+ | `browser_open_session` | Open a new browser session (tab) |
268
+ | `browser_close_session` | Close a session by ID |
269
+ | `browser_list_sessions` | List all active sessions |
270
+
271
+ ### Admin (3)
272
+
273
+ | Tool | Description |
274
+ |------|-------------|
275
+ | `browser_server_status` | Server status, version, session/cache stats |
276
+ | `browser_cache_stats` | Cache statistics (hits, misses, size) |
277
+ | `browser_clear_cache` | Clear the entire result cache |
278
+
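To make the hit, miss, and size figures concrete, here is a minimal sketch of an LRU cache with TTL that tracks the same counters. The class name `LruTtlCache` and its methods are hypothetical, and the package's real cache may be structured differently.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAt: number; // absolute time in ms after which the entry is stale
}

class LruTtlCache<V> {
  private readonly entries = new Map<string, CacheEntry<V>>();
  private readonly maxEntries: number;
  private readonly ttlMs: number;
  private hits = 0;
  private misses = 0;

  constructor(maxEntries: number, ttlMs: number) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined || entry.expiresAt <= now) {
      if (entry !== undefined) this.entries.delete(key); // drop the expired entry
      this.misses++;
      return undefined;
    }
    // Re-insert so the key becomes the most recently used (Map keeps insertion order).
    this.entries.delete(key);
    this.entries.set(key, entry);
    this.hits++;
    return entry.value;
  }

  set(key: string, value: V, now: number = Date.now()): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  clear(): void {
    this.entries.clear();
  }

  stats(): { hits: number; misses: number; size: number } {
    return { hits: this.hits, misses: this.misses, size: this.entries.size };
  }
}
```

Bounding both the entry count and the entry lifetime keeps memory usage predictable, which is the property the Security section attributes to the cache.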
279
+ ### Reverse Engineering (4)
280
+
281
+ | Tool | Description |
282
+ |------|-------------|
283
+ | `browser_js_beautify` | Beautify and format JavaScript code |
284
+ | `browser_js_deobfuscate` | Deobfuscate JS (hex, unicode, base64) |
285
+ | `browser_api_discover` | Discover API endpoints in JS source |
286
+ | `browser_auth_analyze` | Analyze network logs for auth flows |
287
+
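As an illustration of what hex and unicode deobfuscation involves, the sketch below decodes `\xNN` and `\uNNNN` escape sequences in JavaScript source text. It is not the implementation behind `browser_js_deobfuscate`; the function name `decodeEscapes` is hypothetical, base64 handling is omitted, and an escaped backslash such as `\\x41` is not treated specially.

```typescript
// Replace \xNN and \uNNNN escape sequences with the characters they encode.
function decodeEscapes(source: string): string {
  return source
    .replace(/\\x([0-9a-fA-F]{2})/g, (_match, hex: string) =>
      String.fromCharCode(parseInt(hex, 16)),
    )
    .replace(/\\u([0-9a-fA-F]{4})/g, (_match, hex: string) =>
      String.fromCharCode(parseInt(hex, 16)),
    );
}

// The input is the literal text \x66\x65\x74\x63\x68 as it would appear in obfuscated source.
const decodedName = decodeEscapes("\\x66\\x65\\x74\\x63\\x68");
```

A production deobfuscator would normally work on a parsed syntax tree rather than regular expressions, so that only string literals are rewritten.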
288
+ ### Stealth (2)
289
+
290
+ | Tool | Description |
291
+ |------|-------------|
292
+ | `browser_fingerprint` | Generate randomized browser fingerprint |
293
+ | `browser_human_behavior` | Generate human-like typing/mouse/scroll profiles |
294
+
295
+ **Total: 51 tools across 9 categories, production-grade architecture**
296
+
297
+ ---
298
+
299
+ ## Use Cases
300
+
301
+ - **Reverse Engineering** — Deobfuscate JavaScript, uncover API endpoints, map auth flows, export HAR for offline analysis
302
+ - **Web Scraping** — Extract HTML, Markdown, structured data (Schema.org, Open Graph, tables) from any browser-accessible page
303
+ - **Network Monitoring** — Intercept HTTP/HTTPS traffic in real time, inspect WebSocket frames, save binary bodies to disk
304
+ - **Form Automation** — Login flows, multi-step forms, file uploads, dropdown selections — all driven by AI
305
+ - **Session Management** — Isolated concurrent browser sessions with automatic TTL cleanup
306
+ - **Data Export**: Network captures in HAR/JSON/CSV, saved response bodies, WebSocket frame logs (`.jsonl`)
307
+
308
+ ---
309
+
310
+ ## Security
311
+
312
+ - `redirect: 'manual'` on all URL fetches — SSRF redirect bypass protection
313
+ - `validateUrlAsync` with DNS lookup — DNS rebind attack prevention
314
+ - `DANGEROUS_EXTENSIONS` blocklist (`.exe`, `.bat`, `.sh` etc.) → safe `.bin` fallback
315
+ - Pseudo-FS path blocking (`/proc/`, `/sys/`, `/etc/` etc.) — path traversal prevention
316
+ - Cookie header filter — cross-origin credential leakage prevention in HTTP re-fetch
317
+ - `NETWORK_DOWNLOAD_ENABLED` and `NETWORK_SAVE_ENABLED` default to **false** — must be explicitly enabled
318
+ - Input sanitization: guards against null bytes, path traversal, and `fileName` injection
319
+ - Rate limiting — token bucket algorithm, global + per-tool, configurable RPM
320
+ - LRU cache with TTL: bounded memory usage
321
+
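The rate limiting bullet above names a token bucket with global and per-tool limits. A minimal sketch of that technique follows, assuming one bucket per tool plus a shared global bucket, each sized by a requests-per-minute value. The class names and the check order are assumptions, not the package's code.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;

  constructor(requestsPerMinute: number, nowMs: number = Date.now()) {
    this.capacity = requestsPerMinute;
    this.tokens = requestsPerMinute; // start full
    this.lastRefillMs = nowMs;
  }

  // Refill in proportion to elapsed time, then try to spend one token.
  tryTake(nowMs: number = Date.now()): boolean {
    const elapsedMs = Math.max(0, nowMs - this.lastRefillMs);
    const refill = (elapsedMs * this.capacity) / 60000;
    this.tokens = Math.min(this.capacity, this.tokens + refill);
    this.lastRefillMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

class RateLimiter {
  private readonly globalBucket: TokenBucket;
  private readonly toolBuckets = new Map<string, TokenBucket>();
  private readonly toolRpm: number;

  constructor(globalRpm: number, toolRpm: number, nowMs: number = Date.now()) {
    this.globalBucket = new TokenBucket(globalRpm, nowMs);
    this.toolRpm = toolRpm;
  }

  allow(tool: string, nowMs: number = Date.now()): boolean {
    let bucket = this.toolBuckets.get(tool);
    if (bucket === undefined) {
      bucket = new TokenBucket(this.toolRpm, nowMs);
      this.toolBuckets.set(tool, bucket);
    }
    // Per-tool check first, so a throttled tool does not drain the global budget.
    // Simplification: a call denied by the global bucket still spends a tool token.
    return bucket.tryTake(nowMs) && this.globalBucket.tryTake(nowMs);
  }
}
```

Because buckets refill continuously, short bursts up to the configured RPM are allowed while the long-run rate stays bounded.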
322
+ ---
323
+
324
+ ## Development
325
+
326
+ ```bash
327
+ # Test
328
+ npm test
329
+ npm run test:coverage
330
+
331
+ # Type check (tsc --noEmit)
332
+ npm run lint
333
+
334
+ # Build
335
+ npm run build
336
+ ```
337
+
338
+ ---
339
+
340
+ ## Roadmap
341
+
342
+ - **New tools**: PDF extraction, screenshot annotation, form detection, cookie manager, session snapshot/restore
343
+ - **Cloudflare bypass**: Playwright Stealth integration, rotating proxy support
344
+ - **Batch scraping**: Multi-page scraping pipeline, queue system
345
+ - **Performance**: Ring buffer optimization, streaming responses, lazy evaluation
346
+ - **Developer Experience**: Interactive CLI, playground UI, type-safe client SDK
347
+
348
+ ---
349
+
350
+ ## 🇹🇷 Turkish Summary
351
+
352
+ **`@natchs/browser-mcp`** is a Model Context Protocol server designed for reverse engineering, web scraping, network interception, and HAR export; it gives AI agents full control of the browser.
353
+
354
+ ```bash
355
+ npm install @natchs/browser-mcp
356
+ npx @natchs/browser-mcp
357
+ ```
358
+
359
+ Three browser modes: `fresh` (isolated), `persistent` (your real Chrome profile, auto-detected), and `connect` (attach to an already running Chrome). 50+ tools, 9 categories, enterprise-grade security.
360
+
361
+ For details, see the English documentation above.
362
+
363
+ ---
364
+
365
+ ## License
366
+
367
+ MIT