npm - agentsys - Versions diffs - 5.3.7 → 5.4.0 - Mend

agentsys 5.3.7 → 5.4.0


Files changed (40)

package/.agnix.toml +17 -7
package/.claude-plugin/marketplace.json +13 -2
package/.claude-plugin/plugin.json +1 -1
package/.gitmodules +3 -0
package/AGENTS.md +4 -4
package/CHANGELOG.md +21 -0
package/README.md +46 -5
package/lib/adapter-transforms.js +3 -1
package/package.json +1 -1
package/site/assets/css/main.css +39 -1
package/site/assets/js/main.js +24 -0
package/site/content.json +4 -4
package/site/index.html +82 -7
package/site/ux-spec.md +5 -5
package/agent-knowledge/AGENTS.md +0 -231
package/agent-knowledge/acp-with-codex-gemini-copilot-claude.md +0 -504
package/agent-knowledge/ai-cli-advanced-integration-patterns.md +0 -670
package/agent-knowledge/ai-cli-non-interactive-programmatic-usage.md +0 -1394
package/agent-knowledge/all-in-one-plus-modular-packages.md +0 -576
package/agent-knowledge/cli-browser-automation-agents.md +0 -936
package/agent-knowledge/github-org-project-management.md +0 -319
package/agent-knowledge/github-org-structure-patterns.md +0 -268
package/agent-knowledge/kiro-supervised-autopilot.md +0 -400
package/agent-knowledge/multi-product-org-docs.md +0 -622
package/agent-knowledge/oss-org-naming-patterns.md +0 -368
package/agent-knowledge/resources/acp-with-codex-gemini-copilot-claude-sources.json +0 -408
package/agent-knowledge/resources/ai-cli-non-interactive-programmatic-usage-sources.json +0 -500
package/agent-knowledge/resources/all-in-one-plus-modular-packages-sources.json +0 -310
package/agent-knowledge/resources/cli-browser-automation-agents-sources.json +0 -428
package/agent-knowledge/resources/github-org-project-management-sources.json +0 -239
package/agent-knowledge/resources/github-org-structure-patterns-sources.json +0 -293
package/agent-knowledge/resources/kiro-supervised-autopilot-sources.json +0 -135
package/agent-knowledge/resources/multi-product-org-docs-sources.json +0 -514
package/agent-knowledge/resources/oss-org-naming-patterns-sources.json +0 -458
package/agent-knowledge/resources/skill-plugin-distribution-patterns-sources.json +0 -290
package/agent-knowledge/resources/terminal-browsers-agent-automation-sources.json +0 -758
package/agent-knowledge/resources/web-session-persistence-cli-agents-sources.json +0 -528
package/agent-knowledge/skill-plugin-distribution-patterns.md +0 -661
package/agent-knowledge/terminal-browsers-agent-automation.md +0 -776
package/agent-knowledge/web-session-persistence-cli-agents.md +0 -1352

package/agent-knowledge/cli-browser-automation-agents.md DELETED

@@ -1,936 +0,0 @@
-# Learning Guide: CLI-First Browser Automation for AI Agents
-**Generated**: 2026-02-20
-**Sources**: 32 resources analyzed
-**Depth**: deep
----
-## Prerequisites
-- Basic familiarity with Node.js `npx` and/or Python `pip`/`uv`
-- Understanding of what cookies and browser sessions are
-- Awareness of Chrome DevTools Protocol (CDP) at a conceptual level
-- A working Node.js 18+ or Python 3.11+ environment
----
-## TL;DR
-- **Playwright CLI** (`npx playwright`) is the lowest-friction entry point: `codegen`, `screenshot`, `pdf`, `open`, and `show-trace` commands require zero scripting. The `--save-storage` / `--load-storage` flags give you a complete auth-handoff pattern in two commands.
-- **Playwright MCP** exposes 30+ named tools (`browser_navigate`, `browser_click`, `browser_take_screenshot`, etc.) to any MCP-capable agent host without writing a single line of Playwright API code.
-- **browser-use** is the highest-level Python option: it ships a CLI (`browser-use open`, `browser-use click N`, `browser-use screenshot`) and a Python agent API. Designed specifically for LLM agents.
-- **Steel Browser** and **Browserless** expose REST APIs (for example Steel's `POST /v1/screenshot`, `/v1/scrape`, `/v1/pdf` and Browserless's `POST /screenshot`, `/pdf`) so agents can drive a browser with plain `curl` or `fetch`.
-- For the auth handoff, the canonical pattern is: headed `npx playwright codegen --save-storage=auth.json <login-url>` → user logs in → agent uses `--load-storage=auth.json` on all subsequent headless commands.
----
-## Core Concepts
-### 1. The Three Automation Layers
-Browser automation tools fall into three layers. Understanding which layer you are at tells you how much boilerplate you need.
-**Layer 1 - CLI/REST verbs (zero boilerplate)**
-You call a binary or HTTP endpoint. No session object, no page object, no awaiting. Examples: `npx playwright screenshot`, `curl http://localhost:3000/v1/screenshot`, `browser-use click 3`.
-**Layer 2 - MCP tools (agent-native, zero boilerplate)**
-The browser is a running server exposing named tools. Your agent calls `browser_navigate({url})` as a tool call, same as any other MCP tool. The session is persistent across calls. No JS or Python required in the agent.
-**Layer 3 - Library API (full power, more boilerplate)**
-You write Playwright/Puppeteer/Rod scripts. Full control of every event but you must manage the async lifecycle yourself.
-For AI agents, Layer 1 and Layer 2 are almost always preferable. Layer 3 is the implementation layer for building Layer 1/2 wrappers.
-### 2. Headed vs Headless
-- **Headed**: real browser window appears. Required for user-interactive auth flows.
-- **Headless**: no window. Faster, suitable for automated runs after auth is established.
-All major tools default to headless or can be switched with a flag.
-### 3. The Auth Handoff Problem
-Most interesting pages require login. The canonical pattern for CLI agents:
-1. **Trigger a headed browser** so the user can see and interact with a real login form.
-2. **Capture the resulting session** (cookies + localStorage) into a file.
-3. **Inject that file** into all subsequent headless requests.
-This is a one-time human action. The agent then operates fully autonomously from step 3 onward.
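The three steps above can be sketched as a small command builder. This is a minimal illustration, not part of the deleted file: the function names (`capture_cmd`, `screenshot_cmd`, `next_command`) are hypothetical, and only the `codegen`, `screenshot`, `--save-storage`, `--load-storage`, and `--full-page` pieces come from the guide itself. It only checks whether the auth file exists, so it does not detect an expired session.

```python
import shlex
from pathlib import Path


def capture_cmd(login_url: str, auth_file: Path) -> list:
    """Headed step: the user logs in; state is saved when the browser closes."""
    return ["npx", "playwright", "codegen", f"--save-storage={auth_file}", login_url]


def screenshot_cmd(url: str, out_file: Path, auth_file: Path) -> list:
    """Headless step: reuse the captured session for a full-page screenshot."""
    return [
        "npx", "playwright", "screenshot",
        f"--load-storage={auth_file}", "--full-page",
        url, str(out_file),
    ]


def next_command(auth_file: Path, login_url: str, url: str, out_file: Path) -> list:
    """Pick the human-auth command if no session file exists, else the agent command."""
    if not auth_file.exists():
        return capture_cmd(login_url, auth_file)
    return screenshot_cmd(url, out_file, auth_file)


if __name__ == "__main__":
    auth = Path.home() / ".agent" / "auth" / "example-auth.json"
    cmd = next_command(
        auth,
        "https://example.com/login",
        "https://example.com/dashboard",
        Path("dashboard.png"),
    )
    # Print the command; hand `cmd` to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(cmd))
```

Building an argument list rather than a shell string avoids quoting problems when paths contain spaces.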
-### 4. Playwright storageState JSON Format
-`npx playwright codegen --save-storage=auth.json` writes a JSON file with this structure:
-```json
-{
-  "cookies": [
-    {
-      "name": "session",
-      "value": "abc123...",
-      "domain": ".example.com",
-      "path": "/",
-      "expires": 1771234567.0,
-      "httpOnly": true,
-      "secure": true,
-      "sameSite": "Lax"
-    }
-  ],
-  "origins": [
-    {
-      "origin": "https://example.com",
-      "localStorage": [
-        { "name": "auth_token", "value": "eyJ..." }
-      ]
-    }
-  ]
-}
-```
-This file is read directly by `--load-storage` and by Playwright MCP's `--storage-state`, and it can be converted to Netscape cookies.txt for `curl`/`wget`/`yt-dlp`.
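Because the file is plain JSON, an agent can check it for stale cookies before relying on it. The sketch below is illustrative only: `expired_cookies` and `load_state` are hypothetical helper names, and it assumes session cookies are marked with an `expires` of `-1` (or no `expires` at all), matching the convention used by the conversion scripts later in the guide.

```python
import json
import time
from pathlib import Path
from typing import Optional


def load_state(path: str) -> dict:
    """Read a storageState file such as auth.json."""
    return json.loads(Path(path).read_text())


def expired_cookies(state: dict, now: Optional[float] = None) -> list:
    """Return the names of cookies whose `expires` timestamp is in the past.

    Cookies with a missing or non-positive `expires` are treated as
    session cookies and skipped.
    """
    now = time.time() if now is None else now
    return [
        c["name"]
        for c in state.get("cookies", [])
        if c.get("expires", -1) > 0 and c["expires"] <= now
    ]


if __name__ == "__main__":
    sample = {
        "cookies": [
            {"name": "session", "expires": 1771234567.0},
            {"name": "csrf_token", "expires": -1},
        ],
        "origins": [],
    }
    print(expired_cookies(sample))
```

If the returned list includes the cookie that carries the login, the headed capture step needs to be repeated.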
-### 5. Netscape cookies.txt Format
-The Netscape cookie file format is a 7-field tab-separated text file:
-```
-# Netscape HTTP Cookie File
-# Generated by browser-automation tool
-.example.com	TRUE	/	TRUE	1771234567	session	abc123...
-example.com	FALSE	/api	FALSE	0	csrf_token	xyz789
-```
-Fields: `domain`, `include_subdomains` (TRUE/FALSE), `path`, `https_only` (TRUE/FALSE), `expires_unix_epoch` (0 = session cookie), `name`, `value`.
-Lines starting with `#` are comments. Lines starting with `#HttpOnly_` indicate HttpOnly cookies.
-Used by: `curl -b cookies.txt`, `wget --load-cookies`, `yt-dlp --cookies`, `httpx`.
----
-## Tools Reference
-### Playwright CLI (`npx playwright`)
-**Installation**: `npm install -D playwright` or `npm install -g playwright`
-**Core commands**:
-| Command | What it does | Key flags |
-|---------|-------------|-----------|
-| `npx playwright codegen [url]` | Opens headed browser, records interactions to test script | `--save-storage=auth.json`, `-o out.js`, `--target python` |
-| `npx playwright screenshot [url] [file]` | Headless screenshot | `--full-page`, `--load-storage=auth.json`, `-b chromium\|firefox\|webkit` |
-| `npx playwright pdf [url] [file]` | Save page as PDF (Chromium only) | `--paper-format=A4`, `--load-storage=auth.json` |
-| `npx playwright open [url]` | Open headed browser interactively | `--load-storage=auth.json`, `--save-storage=auth.json` |
-| `npx playwright show-trace [file]` | View recorded trace | `--port 9323` |
-**The auth handoff in two commands**:
-```bash
-# Step 1: User logs in (headed browser opens, user sees real page)
-npx playwright codegen --save-storage=auth.json https://example.com/login
-# (user logs in manually, closes browser, auth.json now has cookies)
-# Step 2: Agent uses saved session for headless work
-npx playwright screenshot --load-storage=auth.json \
-  https://example.com/dashboard dashboard.png
-npx playwright pdf --load-storage=auth.json \
-  https://example.com/report report.pdf
-```
-**Note on interactivity**: `npx playwright codegen` opens a visible browser and a side panel with generated code. The user can navigate, log in, and then close the window. The `--save-storage` flag captures state at close. This is the cleanest agent-triggered human-auth pattern available.
-**Standard options** (shared across all commands):
-- `--browser` / `-b`: `cr` (chromium), `ff` (firefox), `wk` (webkit), `msedge`, `chrome`
-- `--device`: emulate device (`"iPhone 13"`, `"Pixel 5"`)
-- `--viewport-size`: `"1280,720"`
-- `--user-agent`, `--lang`, `--timezone`, `--geolocation`
-- `--proxy-server`
-- `--ignore-https-errors`
-- `--user-data-dir`: use persistent Chrome profile with existing logins
-- `--channel`: `chrome`, `msedge`, `chrome-beta`
-**Screenshot example with wait**:
-```bash
-npx playwright screenshot \
-  --full-page \
-  --wait-for-selector=".dashboard-loaded" \
-  --load-storage=auth.json \
-  https://example.com/dashboard \
-  out.png
-```
----
-### Playwright MCP (`@playwright/mcp`)
-**What it is**: A Model Context Protocol server that exposes browser automation as ~30 named tools. Any MCP-capable agent host (Claude Desktop, VS Code Copilot, Cursor, Cline, Windsurf) can call these tools without any Playwright code.
-**Installation** (add to MCP client config):
-```json
-{
-  "mcpServers": {
-    "playwright": {
-      "command": "npx",
-      "args": ["@playwright/mcp@latest"]
-    }
-  }
-}
-```
-**With options** (headed browser + persistent auth):
-```json
-{
-  "mcpServers": {
-    "playwright": {
-      "command": "npx",
-      "args": [
-        "@playwright/mcp@latest",
-        "--browser", "chrome",
-        "--user-data-dir", "/home/user/.playwright-agent-profile"
-      ]
-    }
-  }
-}
-```
-**Available MCP tools** (complete list):
-*Core navigation & interaction*:
-| Tool | Description |
-|------|-------------|
-| `browser_navigate` | Navigate to a URL |
-| `browser_navigate_back` | Go back in history |
-| `browser_click` | Click an element (by accessibility label/text/role) |
-| `browser_type` | Type text into a focused field |
-| `browser_fill_form` | Fill multiple form fields at once |
-| `browser_select_option` | Choose from a dropdown |
-| `browser_hover` | Hover over an element |
-| `browser_drag` | Drag and drop between elements |
-| `browser_press_key` | Send keyboard input |
-| `browser_handle_dialog` | Respond to alert/confirm/prompt dialogs |
-| `browser_file_upload` | Upload a file |
-*Page inspection*:
-| Tool | Description |
-|------|-------------|
-| `browser_snapshot` | Get accessibility tree of current page (preferred over screenshot for LLMs) |
-| `browser_take_screenshot` | Capture PNG screenshot |
-| `browser_evaluate` | Run JavaScript and return result |
-| `browser_console_messages` | Get browser console logs |
-| `browser_network_requests` | List all network requests since load |
-| `browser_wait_for` | Wait for text to appear/disappear or timeout |
-*Tab & session management*:
-| Tool | Description |
-|------|-------------|
-| `browser_tabs` | List, create, close, or switch tabs |
-| `browser_resize` | Resize browser window |
-| `browser_close` | Close current page |
-| `browser_install` | Install browser binaries |
-*Vision mode (requires `--caps vision`)*:
-| Tool | Description |
-|------|-------------|
-| `browser_mouse_click_xy` | Click at pixel coordinates |
-| `browser_mouse_move_xy` | Move mouse to coordinates |
-| `browser_mouse_drag_xy` | Drag using pixel coordinates |
-| `browser_mouse_wheel` | Scroll |
-*PDF (requires `--caps pdf`)*:
-| Tool | Description |
-|------|-------------|
-| `browser_pdf_save` | Save current page as PDF |
-*Testing assertions (requires `--caps testing`)*:
-| Tool | Description |
-|------|-------------|
-| `browser_verify_text_visible` | Assert text is present |
-| `browser_verify_element_visible` | Assert element exists |
-| `browser_generate_locator` | Generate stable CSS/Aria selector |
-**How agents use Playwright MCP**:
-The server runs a persistent headed or headless browser. The agent calls tools sequentially:
-```
-agent → browser_navigate({url: "https://example.com/login"})
-agent → browser_snapshot()  # read page structure
-agent → browser_type({element: "Email field", text: "user@example.com"})
-agent → browser_type({element: "Password field", text: "..."})
-agent → browser_click({element: "Sign in button"})
-agent → browser_snapshot()  # verify login succeeded
-agent → browser_navigate({url: "https://example.com/dashboard"})
-agent → browser_take_screenshot({filename: "dashboard.png"})
-```
-**Auth handoff with Playwright MCP**:
-Option A - Persistent profile (simplest):
-```json
-"args": ["@playwright/mcp@latest", "--user-data-dir", "/path/to/profile"]
-```
-User logs into a normal Chrome window using that profile once. The agent uses that profile forever.
-Option B - Storage state file:
-```json
-"args": ["@playwright/mcp@latest", "--storage-state", "/path/to/auth.json"]
-```
-Auth was captured separately (e.g., with `npx playwright codegen --save-storage`).
-Option C - Chrome extension bridge:
-```json
-"args": ["@playwright/mcp@latest", "--extension"]
-```
-The agent connects to your currently running Chrome browser tab. Uses whatever session is already active.
-Option D - CDP endpoint (connect to running Chrome):
-```json
-"args": [
-  "@playwright/mcp@latest",
-  "--cdp-endpoint", "http://localhost:9222"
-]
-```
-User launches Chrome with `--remote-debugging-port=9222`, logs in, agent connects to that live session.
-**Key insight**: `browser_snapshot` returns the accessibility tree as structured text, not a screenshot. This is far more token-efficient for LLM consumption and does not require a vision model.
----
-### browser-use (Python)
-**What it is**: A Python library with a CLI and agent API designed specifically for LLM-driven browser automation. The agent receives a high-level task description and plans/executes browser interactions autonomously.
-**Installation**:
-```bash
-pip install browser-use
-# or
-uv add browser-use
-uvx browser-use install  # downloads Chromium
-```
-**CLI interface** (stateful session persists between commands):
-```bash
-browser-use open https://example.com   # navigate
-browser-use state                       # list clickable elements by index
-browser-use click 5                     # click element #5
-browser-use type "search query"         # type text
-browser-use screenshot page.png         # capture screen
-browser-use close                       # end session
-```
-**Agent API** (LLM controls browser autonomously):
-```python
-from browser_use import Agent, Browser, ChatBrowserUse
-import asyncio
-async def run():
-    browser = Browser()
-    llm = ChatBrowserUse()  # or use OpenAI, Anthropic, etc.
-    agent = Agent(
-        task="Log into GitHub, go to my notifications, summarize the top 3",
-        llm=llm,
-        browser=browser,
-    )
-    result = await agent.run()
-    print(result)
-asyncio.run(run())
-```
-**Auth with real Chrome profile**:
-```python
-from browser_use import Browser, BrowserConfig
-browser = Browser(config=BrowserConfig(
-    chrome_instance_path="/usr/bin/google-chrome",
-    # Uses default Chrome profile with existing logins
-))
-```
-**Custom tools extension**:
-```python
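-from browser_use import Agent, Controller
-from browser_use.browser.context import BrowserContext
-# Assumes a browser-use release that exposes Controller; other releases may name this differently
-controller = Controller()
-@controller.action("Read the current page URL and return it")
-async def get_current_url(browser: BrowserContext) -> str:
-    page = await browser.get_current_page()
-    return page.url
-# Pass controller=controller when constructing the Agent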
-```
-**Comparison to Playwright MCP**: browser-use is more autonomous - you give it a task and it figures out the steps. Playwright MCP gives you individual tool calls (more control, less autonomy). browser-use requires Python; Playwright MCP is language-agnostic.
----
-### puppeteer-extra
-**What it is**: Puppeteer with a plugin system. The key plugin is `puppeteer-extra-plugin-stealth` which patches ~20 bot-detection signals.
-**Installation**:
-```bash
-npm install puppeteer-extra puppeteer-extra-plugin-stealth
-```
-**Basic usage** (still requires scripting, no CLI wrapper):
-```javascript
-const puppeteer = require('puppeteer-extra');
-const StealthPlugin = require('puppeteer-extra-plugin-stealth');
-puppeteer.use(StealthPlugin());
-const browser = await puppeteer.launch({ headless: false });
-const page = await browser.newPage();
-await page.goto('https://example.com');
-await page.screenshot({ path: 'screenshot.png' });
-await browser.close();
-```
-**Key note**: There is no standalone puppeteer CLI tool for agents. Puppeteer is a library only. For CLI-driven use, Playwright CLI is the better choice. Puppeteer-extra's main value is stealth for avoiding bot detection.
-**Comparison to Playwright**: Playwright is now generally preferred. Playwright has a built-in CLI, supports 3 browser engines natively, and has a richer ecosystem including MCP. Puppeteer supports Chrome/Firefox only and has no CLI.
----
-### Chrome DevTools Protocol (CDP) Direct
-**What it is**: CDP is the underlying wire protocol that Playwright, Puppeteer, and all Chromium-based automation tools use. You can drive Chrome directly via HTTP and WebSocket without any framework.
-**Launching Chrome with debugging port**:
-```bash
-# Headed (user-visible) - good for auth
-google-chrome \
-  --remote-debugging-port=9222 \
-  --user-data-dir=/tmp/chrome-agent \
-  https://example.com/login
-# Or headless
-google-chrome \
-  --headless=new \
-  --remote-debugging-port=9222 \
-  --user-data-dir=/tmp/chrome-agent
-```
-**HTTP API endpoints** (no WebSocket needed for these):
-```bash
-# List tabs
-curl http://localhost:9222/json/list
-# Create new tab (recent Chrome versions accept only PUT on this endpoint)
-curl -X PUT "http://localhost:9222/json/new?https://example.com"
-# Close tab
-curl "http://localhost:9222/json/close/{targetId}"
-# Get browser version
-curl http://localhost:9222/json/version
-```
-**WebSocket CDP commands** (for actual page control):
-```javascript
-const CDP = require('chrome-remote-interface');
-async function captureAuth() {
-  const client = await CDP();
-  const { Network, Page } = client;
-  await Network.enable();
-  await Page.enable();
-  await Page.navigate({ url: 'https://example.com/login' });
-  await Page.loadEventFired();
-  // After user logs in (poll or wait), capture cookies
-  const { cookies } = await Network.getAllCookies();
-  console.log(JSON.stringify(cookies));
-  await client.close();
-}
-```
-**CLI REPL with chrome-remote-interface**:
-```bash
-npm install -g chrome-remote-interface
-# List targets
-chrome-remote-interface list
-# Open a URL in new tab
-chrome-remote-interface new 'https://example.com'
-# Interactive REPL (send CDP commands interactively)
-chrome-remote-interface inspect
-# Then inside REPL:
-# > Page.navigate({url: 'https://example.com'})
-# > Network.getAllCookies()
-```
-**Getting cookies via CDP**:
-```bash
-# Using websocat + jq (pure CLI, no Node.js needed after browser launch)
-WS=$(curl -s http://localhost:9222/json/list | jq -r '.[0].webSocketDebuggerUrl')
-echo '{"id":1,"method":"Network.getAllCookies"}' \
-  | websocat "$WS" \
-  | jq '.result.cookies[]'
-```
-**CDP verdict for agents**: CDP is powerful but verbose. Best used as a foundation layer. The chrome-remote-interface REPL is useful for exploration. For production agent use, Playwright MCP or Playwright CLI are cleaner because they handle the WebSocket protocol, target management, and element selectors automatically.
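The target-selection step that the `jq` one-liner above performs can also be sketched in Python with only the standard library. The helper names are hypothetical; the sketch assumes Chrome is listening on port 9222 as in the examples above and that `/json/list` returns a JSON array of targets with `type` and `webSocketDebuggerUrl` fields.

```python
import json
import urllib.request
from typing import Optional


def first_page_ws_url(targets_json: str) -> Optional[str]:
    """Return the WebSocket debugger URL of the first page target, if any.

    Unlike indexing `.[0]`, this skips non-page targets (for example
    service workers) and targets that expose no debugger URL.
    """
    for target in json.loads(targets_json):
        if target.get("type") == "page" and "webSocketDebuggerUrl" in target:
            return target["webSocketDebuggerUrl"]
    return None


def fetch_targets(port: int = 9222) -> str:
    """Fetch the raw target list from a locally running Chrome."""
    with urllib.request.urlopen(f"http://localhost:{port}/json/list") as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    sample = json.dumps([
        {"type": "service_worker", "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/A"},
        {"type": "page", "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/B"},
    ])
    print(first_page_ws_url(sample))
```

With a live browser, `first_page_ws_url(fetch_targets())` gives the URL to hand to a WebSocket client.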
----
-### Browserless (Self-Hosted REST API)
-**What it is**: A Docker service that wraps headless Chrome and exposes a REST API. Agents call HTTP endpoints without managing any browser process.
-**Run locally**:
-```bash
-docker run -p 3000:3000 ghcr.io/browserless/chrome
-```
-**REST endpoints** (all `POST` with JSON body):
-```bash
-# Screenshot
-curl -X POST http://localhost:3000/screenshot \
-  -H "Content-Type: application/json" \
-  -d '{"url": "https://example.com", "fullPage": true}' \
-  --output out.png
-# PDF
-curl -X POST http://localhost:3000/pdf \
-  -H "Content-Type: application/json" \
-  -d '{"url": "https://example.com"}' \
-  --output out.pdf
-# HTML content
-curl -X POST http://localhost:3000/content \
-  -H "Content-Type: application/json" \
-  -d '{"url": "https://example.com"}'
-# Execute Puppeteer script
-curl -X POST http://localhost:3000/function \
-  -H "Content-Type: application/json" \
-  -d '{
-    "code": "module.exports = async ({page, context}) => { await page.goto(context.url); return await page.title(); }",
-    "context": {"url": "https://example.com"}
-  }'
-```
-**Passing cookies to Browserless**:
-```bash
-# Inject cookies in the request body
-curl -X POST http://localhost:3000/screenshot \
-  -H "Content-Type: application/json" \
-  -d '{
-    "url": "https://example.com/dashboard",
-    "cookies": [
-      {"name": "session", "value": "abc123", "domain": "example.com"}
-    ]
-  }' --output dashboard.png
-```
-**Trade-off**: Requires Docker. But once running, agents just need `curl`. No Node.js, no Python. Good for polyglot agents.
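For agents that already run Python, the same call can be made without shelling out to `curl`. This is a sketch under stated assumptions: `screenshot_request` and `save_screenshot` are hypothetical names, the base URL is the local Docker port from the example above, and the JSON body simply mirrors the fields used in the guide's `curl` commands, which may not match every Browserless version.

```python
import json
import urllib.request
from pathlib import Path
from typing import Optional


def screenshot_request(
    base_url: str,
    page_url: str,
    cookies: Optional[list] = None,
    full_page: bool = True,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /screenshot endpoint."""
    body = {"url": page_url, "fullPage": full_page}
    if cookies:
        body["cookies"] = cookies  # same shape as the curl example above
    return urllib.request.Request(
        f"{base_url}/screenshot",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_screenshot(req: urllib.request.Request, out_file: str) -> None:
    """Send the request and write the returned image bytes to disk."""
    with urllib.request.urlopen(req) as resp:
        Path(out_file).write_bytes(resp.read())


if __name__ == "__main__":
    req = screenshot_request(
        "http://localhost:3000",
        "https://example.com/dashboard",
        cookies=[{"name": "session", "value": "abc123", "domain": "example.com"}],
    )
    print(req.get_method(), req.full_url)
```

Separating request construction from sending keeps the payload easy to inspect or log before any browser work happens.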
----
-### Steel Browser (Self-Hosted REST API)
-**What it is**: An open-source browser API service similar to Browserless but with a session-oriented architecture. Good for multi-step authenticated workflows.
-**Run locally**:
-```bash
-# Via npm
-npx @steel-dev/steel start
-# Or Docker
-docker run -p 3000:3000 ghcr.io/steel-dev/steel
-```
-**REST endpoints**:
-```bash
-# Create a session (returns sessionId)
-SESSION=$(curl -s -X POST http://localhost:3000/v1/sessions \
-  -H "Content-Type: application/json" \
-  -d '{"blockAds": true}' | jq -r '.id')
-# Screenshot a URL (stateless quick action)
-curl -X POST http://localhost:3000/v1/screenshot \
-  -H "Content-Type: application/json" \
-  -d '{"url": "https://example.com", "fullPage": true}' \
-  --output out.png
-# Scrape page content
-curl -X POST http://localhost:3000/v1/scrape \
-  -H "Content-Type: application/json" \
-  -d '{"url": "https://example.com"}'
-# PDF
-curl -X POST http://localhost:3000/v1/pdf \
-  -H "Content-Type: application/json" \
-  -d '{"url": "https://example.com"}' \
-  --output out.pdf
-```
-**Sessions persist cookies** across requests - once you log into a page within a session, all subsequent requests in that session are authenticated.
-**Connect Playwright to Steel session**:
-```javascript
-const { chromium } = require('playwright');
-const browser = await chromium.connectOverCDP(
-  `ws://localhost:3000?sessionId=${sessionId}`
-);
-```
----
-## The Auth Handoff: Three Patterns
-### Pattern 1: Playwright CLI (Recommended for CLI Agents)
-**When to use**: Your agent runs from a shell, you want zero framework knowledge required.
-```bash
-# ---- Human does this once ----
-# Open headed browser for user login
-# ($HOME is used because the shell does not expand ~ after "=" in an option)
-npx playwright codegen \
-  --save-storage="$HOME/.agent/auth/example-auth.json" \
-  https://example.com/login
-# [Browser opens, user logs in, browser closes, example-auth.json written]
-# ---- Agent does this autonomously ----
-npx playwright screenshot \
-  --load-storage="$HOME/.agent/auth/example-auth.json" \
-  https://example.com/dashboard \
-  /tmp/dashboard.png
-# Agent can also save the page as a PDF
-npx playwright pdf \
-  --load-storage="$HOME/.agent/auth/example-auth.json" \
-  https://example.com/report \
-  /tmp/report.pdf
-```
-**No Playwright code written. No async/await. Just CLI commands.**
-### Pattern 2: Playwright MCP with Chrome Extension (Recommended for MCP Agents)
-**When to use**: Your agent is running inside an MCP host and you want to connect to the user's real logged-in browser.
-```json
-{
-  "mcpServers": {
-    "playwright": {
-      "command": "npx",
-      "args": ["@playwright/mcp@latest", "--extension"]
-    }
-  }
-}
-```
-1. User installs "Playwright MCP Bridge" Chrome extension.
-2. User is already logged into sites in their normal Chrome.
-3. Agent calls `browser_navigate` / `browser_snapshot` / `browser_click` directly on those tabs.
-4. No auth file needed - the user's live session is used.
-### Pattern 3: CDP + Chrome --remote-debugging-port
-**When to use**: You want the most direct control, or you're already running Chrome elsewhere.
-```bash
-# User launches Chrome with debugging enabled
-google-chrome \
-  --remote-debugging-port=9222 \
-  --user-data-dir=$HOME/.agent-chrome-profile \
-  https://example.com/login
-# User logs in normally.
-# Agent now connects and captures cookies
-node -e "
-const CDP = require('chrome-remote-interface');
-CDP(async (client) => {
-  await client.Network.enable();
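-  const {cookies} = await client.Network.getAllCookies();
-  const fs = require('fs');
-  // Keep only the fields Playwright's storageState uses; CDP may omit sameSite
-  const pw = cookies.map(({name, value, domain, path, expires, httpOnly, secure, sameSite}) =>
-    ({name, value, domain, path, expires, httpOnly, secure, sameSite: sameSite || 'Lax'}));
-  fs.writeFileSync('auth.json', JSON.stringify({cookies: pw, origins: []}, null, 2));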
-  await client.close();
-});
-"
-```
-Then use `auth.json` with `npx playwright screenshot --load-storage=auth.json ...`.
----
-## Converting Between Cookie Formats
-### Playwright storageState → Netscape cookies.txt
-Useful when you want to use the captured session with `curl`, `wget`, or `yt-dlp`.
-```python
-import json
-with open('auth.json') as f:
-    auth = json.load(f)
-print("# Netscape HTTP Cookie File")
-for c in auth.get('cookies', []):
-    domain = c['domain']
-    include_subdomains = 'TRUE' if domain.startswith('.') else 'FALSE'
-    path = c.get('path', '/')
-    https_only = 'TRUE' if c.get('secure', False) else 'FALSE'
-    expires = int(c.get('expires', 0)) if c.get('expires', -1) != -1 else 0
-    name = c['name']
-    value = c['value']
-    print(f"{domain}\t{include_subdomains}\t{path}\t{https_only}\t{expires}\t{name}\t{value}")
-```
-```bash
-python3 convert.py > cookies.txt
-curl -b cookies.txt https://example.com/api/data
-wget --load-cookies=cookies.txt https://example.com/api/data
-yt-dlp --cookies cookies.txt https://example.com/video
-```
-### Netscape cookies.txt → Playwright storageState
-```python
-import json
-def netscape_to_playwright(cookies_file):
-    cookies = []
-    with open(cookies_file) as f:
-        for line in f:
-            # Strip only line endings so an empty trailing value field survives
-            line = line.rstrip('\r\n')
-            # '#HttpOnly_' marks an HttpOnly cookie; any other '#' line is a comment
-            http_only = line.startswith('#HttpOnly_')
-            if http_only:
-                line = line[len('#HttpOnly_'):]
-            elif not line.strip() or line.startswith('#'):
-                continue
-            parts = line.split('\t')
-            if len(parts) != 7:
-                continue
-            domain, incl_sub, path, https_only, expires, name, value = parts
-            cookies.append({
-                'name': name,
-                'value': value,
-                'domain': domain,
-                'path': path,
-                'expires': float(expires) if expires and expires != '0' else -1,
-                'httpOnly': http_only,
-                'secure': https_only == 'TRUE',
-                # cookies.txt has no SameSite field; adjust if the site needs Lax/Strict
-                'sameSite': 'None'
-            })
-    return {'cookies': cookies, 'origins': []}
-state = netscape_to_playwright('cookies.txt')
-with open('auth.json', 'w') as out:
-    json.dump(state, out, indent=2)
-```
----
-## Comparison Table
-| Tool | Interface | Auth Handoff | Boilerplate | Best For |
-|------|-----------|-------------|-------------|---------|
-| **Playwright CLI** | Shell commands | `--save-storage` / `--load-storage` | Zero | CLI agents, shell scripts |
-| **Playwright MCP** | MCP tool calls | `--storage-state`, `--extension`, `--cdp-endpoint` | Zero | MCP agent hosts (Claude, Cursor, etc.) |
-| **browser-use** | Python + CLI | Chrome profile reuse | Low (Python) | Autonomous task agents (Python) |
-| **Chrome CDP direct** | WebSocket + HTTP | Manual cookie capture | Medium (JS) | Fine-grained control, low-level |
-| **chrome-remote-interface** | CLI REPL + JS | `Network.getAllCookies()` | Low-medium | Exploration, scripting |
-| **Browserless** | REST API (curl) | Cookie injection in JSON body | Zero (needs Docker) | Polyglot agents, Docker-friendly |
-| **Steel Browser** | REST API (curl) | Session-scoped cookie persistence | Zero (needs Docker/npx) | Multi-step auth workflows |
-| **puppeteer-extra** | JS library | Manual scripting | High | Bot-detection avoidance |
----
-## Common Pitfalls
-| Pitfall | Why It Happens | How to Avoid |
-|---------|---------------|--------------|
-| Capturing auth.json but cookies expire | Session cookies have short TTL | Check `expires` field; re-capture if expired. Use `--user-data-dir` for persistent profile instead. |
-| Playwright PDF not working | PDF command only works with Chromium | Always pass `-b chromium` or `--channel chrome` for PDF |
-| Screenshot captures login page, not dashboard | Session not loaded | Always pass `--load-storage=auth.json` |
-| Browser bot-detection blocking | Playwright leaves fingerprints | Use `--channel chrome` (real Chrome binary) instead of Chromium. Or use puppeteer-extra-stealth. |
-| MCP tools using accessibility tree but page has poor ARIA | Site has no semantic markup | Fall back to `browser_take_screenshot` + vision, or use `browser_evaluate` for DOM queries |
-| CDP WebSocket closes on page navigation | WebSocket is per-target | Re-attach after navigation using Target events |
-| Netscape cookies.txt parse error | Wrong line endings (CRLF vs LF) | Normalize to LF on Unix: `sed -i 's/\r$//' cookies.txt` |
-| `browser-use` agent gets stuck in loop | LLM hallucinating element states | Set `max_steps` limit; use `browser-use state` to inspect actual element indices |
-| auth.json committed to git | Forgot to gitignore | Add `auth.json`, `*-auth.json`, `auth/`, `.auth/` to `.gitignore` |
----
-## Best Practices
-1. **Store auth files outside the repo** — use `~/.agent/auth/{service}-auth.json` or environment-relative paths. Never commit session files. (Multiple sources)
-2. **Prefer `--user-data-dir` over `--save-storage` for long-running agents** — user data directories persist across browser restarts, handle refresh tokens, and work for sites that rotate session cookies. (Playwright MCP docs)
-3. **Use `browser_snapshot` over screenshots for text extraction** — the accessibility tree is ~10x more token-efficient than describing a screenshot and does not require a vision model. (Playwright MCP README)
-4. **Use `--channel chrome` (real Chrome) when bot detection is an issue** — websites fingerprint Chrome vs Chromium. The real Chrome binary passes more checks. (Playwright docs, chrome-for-testing)
-5. **Separate the headed auth step from the headless work step** — document these as two distinct phases in your agent code. This makes re-authentication easy when sessions expire. (browser-use docs)
-6. **For multi-step workflows, use session-based tools** — Steel Browser sessions and Playwright MCP's persistent browser maintain cookie state across page navigations automatically. One-shot REST calls lose state. (Steel Browser docs)
-7. **Test for element visibility before interaction** — use `--wait-for-selector` (CLI) or `browser_wait_for` (MCP) to avoid flaky automation on dynamic pages. (Playwright CLI docs)
-8. **Validate the captured auth immediately** — after `--save-storage`, run one screenshot with `--load-storage` and check it shows the logged-in state before using the auth file in production. (Playwright docs)
----
-## Code Examples
-### Complete Shell-Only Auth Handoff
-```bash
-#!/bin/bash
-# auth-handoff.sh - Agent auth handoff using only Playwright CLI
-AUTH_FILE="$HOME/.agent/auth/myapp-auth.json"
-BASE_URL="https://myapp.example.com"
-# Phase 1: Human auth (run once, or when session expires)
-capture_auth() {
-  mkdir -p "$(dirname "$AUTH_FILE")"
-  echo "Opening browser for login..."
-  npx playwright codegen \
-    --save-storage="$AUTH_FILE" \
-    "$BASE_URL/login"
-  echo "Auth captured: $AUTH_FILE"
-}
-# Phase 2: Agent uses auth headlessly
-take_screenshot() {
-  local url="$1"
-  local out="$2"
-  npx playwright screenshot \
-    --load-storage="$AUTH_FILE" \
-    --full-page \
-    "$url" "$out"
-}
-save_pdf() {
-  local url="$1"
-  local out="$2"
-  npx playwright pdf \
-    --load-storage="$AUTH_FILE" \
-    -b chromium \
-    "$url" "$out"
-}
-# If the auth file is missing, capture it (this check does not detect expired sessions)
-if [ ! -f "$AUTH_FILE" ]; then
-  capture_auth
-fi
-# Agent work
-take_screenshot "$BASE_URL/dashboard" /tmp/dashboard.png
-save_pdf "$BASE_URL/report/monthly" /tmp/monthly-report.pdf
-```
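The script above only tests whether the auth file exists. A staleness check could look roughly like the sketch below, which assumes the file is a Playwright storageState JSON with a top-level `cookies` array whose entries carry an `expires` Unix timestamp (`-1` for session cookies). `storage_state_is_stale` is a hypothetical name, and the check is a heuristic: an expired cookie may be unrelated to the login, and unexpired cookies do not prove the server still accepts the session.

```python
import json
import time
from pathlib import Path
from typing import Optional


def storage_state_is_stale(path: str, now: Optional[float] = None) -> bool:
    """Return True if the file is missing, unreadable, or holds an expired cookie."""
    now = time.time() if now is None else now
    file = Path(path)
    if not file.is_file():
        return True
    try:
        state = json.loads(file.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return True
    if not isinstance(state, dict):
        return True
    for cookie in state.get("cookies", []):
        expires = cookie.get("expires", -1)
        # Session cookies are stored with expires = -1 and have no expiry time.
        if expires > 0 and expires <= now:
            return True
    return False
```

A wrapper could call this before the agent work and fall back to the headed capture step when it returns `True`. Validating with a real page load, as practice 8 recommends, remains the more reliable test.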
-### Playwright MCP Agent Workflow (Conceptual)
-When an MCP agent wants to do browser work:
-```
-# Agent internal monologue:
-# 1. Check if page is accessible
-tool_call: browser_navigate({url: "https://app.example.com/dashboard"})
-tool_call: browser_snapshot()
-# → Returns accessibility tree; if login wall detected, trigger auth flow
-# 2. If login needed (persistent profile approach):
-# Agent tells user: "Please log into the browser window that just opened"
-# (The browser was started with --user-data-dir, so the user's existing login may already work)
-# 3. Once authenticated, proceed
-tool_call: browser_snapshot()  # verify dashboard loaded
-tool_call: browser_evaluate({expression: "document.title"})  # extract data
-tool_call: browser_take_screenshot({filename: "/tmp/dashboard.png"})
-```
-### Python Agent with browser-use + Cookie Export
-```python
-import asyncio
-from browser_use import Agent, Browser, BrowserConfig, ChatBrowserUse
-async def authenticated_scrape():
-    # Option A: Use existing Chrome profile (simplest for auth)
-    browser = Browser(config=BrowserConfig(
-        chrome_instance_path="/usr/bin/google-chrome",
-        headless=False,
-    ))
-    # Option B: Use previously saved Playwright storageState
-    # browser = Browser(config=BrowserConfig(storage_state="auth.json"))
-    llm = ChatBrowserUse()
-    agent = Agent(
-        task="""
-        Go to https://app.example.com/reports.
-        Find the most recent report dated this month.
-        Download it or return its URL.
-        """,
-        llm=llm,
-        browser=browser,
-        max_steps=20,
-    )
-    result = await agent.run()
-    print(result)
-    await browser.close()
-asyncio.run(authenticated_scrape())
-```
-### curl with Cookies from Playwright Auth
-```bash
-# After capturing auth.json with playwright codegen --save-storage
-# Convert the storageState cookies to a Netscape-format cookie file (inline Python)
-python3 -c "
-import json, sys
-data = json.load(open('auth.json'))
-print('# Netscape HTTP Cookie File')
-for c in data.get('cookies', []):
-    dom = c['domain']
-    sub = 'TRUE' if dom.startswith('.') else 'FALSE'
-    sec = 'TRUE' if c.get('secure') else 'FALSE'
-    exp = int(c.get('expires', 0)) if c.get('expires', -1) > 0 else 0
-    print(f\"{dom}\t{sub}\t{c['path']}\t{sec}\t{exp}\t{c['name']}\t{c['value']}\")
-" > cookies.txt
-# Use with curl
-curl -b cookies.txt https://app.example.com/api/data | jq .
-# Use with wget
-wget --load-cookies=cookies.txt -O data.json https://app.example.com/api/data
-# Use with yt-dlp
-yt-dlp --cookies cookies.txt https://app.example.com/video/123
-```
----
-## Further Reading
-| Resource | Type | Why Recommended |
-|----------|------|-----------------|
-| [Playwright CLI docs](https://playwright.dev/docs/cli) | Official Docs | Authoritative reference for all CLI commands and flags |
-| [Playwright Auth docs](https://playwright.dev/docs/auth) | Official Docs | Comprehensive guide to storageState, setup projects, session reuse |
-| [Playwright MCP on GitHub](https://github.com/microsoft/playwright-mcp) | Official Repo | Complete tool list, config options, Chrome extension setup |
-| [browser-use on GitHub](https://github.com/browser-use/browser-use) | Official Repo | Agent API, CLI reference, custom tools, production deployment |
-| [Chrome DevTools Protocol](https://chromedevtools.github.io/devtools-protocol/) | Official Spec | Complete CDP domain/method reference |
-| [chrome-remote-interface](https://github.com/cyrus-and/chrome-remote-interface) | Library | Node.js CDP wrapper with CLI REPL |
-| [Steel Browser](https://github.com/steel-dev/steel-browser) | Open Source | REST API browser service, session management |
-| [Browserless](https://github.com/browserless/browserless) | Open Source | Docker REST browser service |
-| [yt-dlp cookies guide](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#how-do-i-pass-cookies-to-yt-dlp) | Guide | Netscape cookie format, browser extension recommendations |
-| [puppeteer-extra-stealth](https://github.com/berstend/puppeteer-extra/tree/master/packages/puppeteer-extra-plugin-stealth) | Plugin | 20+ bot-detection patches for Puppeteer |
----
-*Generated by /learn from 32 sources.*
-*See `resources/cli-browser-automation-agents-sources.json` for full source metadata.*