npm - @vpxa/aikit - Versions diffs - 0.1.140 → 0.1.142 - Mend

@vpxa/aikit 0.1.140 → 0.1.142

Files changed (21)

package/package.json +2 -1
package/packages/browser/dist/index.d.ts +92 -0
package/packages/browser/dist/index.js +10 -0
package/packages/indexer/dist/index.js +1 -1
package/packages/server/dist/index.js +1 -1
package/packages/server/dist/{routes-OaSHcA6x.js → routes-gbC5Wmr9.js} +1 -1
package/packages/server/dist/{server-Cr0Y3q6C.js → server-Mioq3dZQ.js} +139 -139
package/packages/tools/dist/index.d.ts +1 -0
package/packages/tools/dist/index.js +72 -70
package/scaffold/dist/adapters/copilot.mjs +13 -12
package/scaffold/dist/definitions/bodies.mjs +7 -7
package/scaffold/dist/definitions/plugins.mjs +1 -1
package/scaffold/dist/definitions/protocols.mjs +82 -293
package/scaffold/dist/definitions/skills/aikit.mjs +1 -1
package/scaffold/dist/definitions/skills/browser-use.mjs +187 -285
package/scaffold/dist/definitions/skills/c4-architecture.mjs +9 -4
package/scaffold/dist/definitions/skills/multi-agents-development.mjs +46 -55
package/scaffold/dist/definitions/skills/present.mjs +4 -4
package/scaffold/dist/definitions/skills/repo-access.mjs +198 -1
package/scaffold/dist/definitions/tools.mjs +1 -1
package/scaffold/generated/block-docs.mjs +0 -13

package/scaffold/dist/definitions/skills/browser-use.mjs

@@ -1,6 +1,6 @@
 var e=[{file:`SKILL.md`,content:`---
 name: browser-use
-description: "Browser automation for AI agents using Playwright MCP browser tools. Triggered when: (1) repo-access skill exhausts its Strategy Ladder and auth requires browser interaction, (2) \`web_fetch\` returns login page HTML, SAML redirect, or CAPTCHA instead of content, (3) user needs to interact with web applications (fill forms, click buttons, extract data), (4) a site requires JavaScript rendering that \`web_fetch\` cannot handle, (5) user asks to browse, scrape, test, or automate a website. Zero setup — uses tools available to any MCP client with Playwright MCP server."
+description: "Browser automation for AI agents using AI Kit's owned \`browser\` MCP tool. Triggered when: (1) repo-access exhausts its Strategy Ladder and auth requires browser interaction, (2) \`web_fetch\` returns login page HTML, SAML redirect, or CAPTCHA instead of content, (3) user needs to interact with web applications (fill forms, click buttons, extract data), (4) a site requires JavaScript rendering that \`web_fetch\` cannot handle, (5) user asks to browse, scrape, test, or automate a website. Uses AI Kit's owned Chromium runtime — no external MCP server dependency."
 metadata:
   category: cross-cutting
   domain: general
@@ -8,432 +8,334 @@ metadata:
   inputs: [url, auth-error, browser-task, login-wall]
   outputs: [page-content, screenshots, extracted-data, authenticated-session]
   requires: []
-  relatedSkills: [repo-access, aikit]
+  relatedSkills: [repo-access, present, aikit]
 argument-hint: "URL or browser task description"
 ---
 # Browser Automation for AI Agents
-Drive the Playwright MCP browser to solve authentication barriers, extract data, fill forms, and interact with web applications. This skill bridges the gap between CLI-based access (which fails on login walls, SAML SSO, CAPTCHAs) and real browser interaction.
+Use AI Kit's owned \`browser\` MCP tool to solve authentication barriers, extract data, fill forms, and interact with web applications. This skill bridges CLI-based access failures (login walls, SAML SSO, OAuth, CAPTCHAs) and real browser interaction without any external browser MCP dependency.
-**Zero setup required** — all tools are provided by the Playwright MCP server. No installs, no API keys, no user configuration.
+## Runtime Model
+- Single MCP tool: \`browser({ action: ... })\`
+- Action-based dispatch across eight actions: \`open\`, \`read\`, \`act\`, \`navigate\`, \`eval\`, \`screenshot\`, \`dialog\`, \`session\`
+- Owned Chromium runtime managed by AI Kit itself
+- Install browser binaries once with \`aikit browser install\`
+- Runtime modes: \`headless\` for CI, \`ui\` for desktop browser windows, \`panel\` for VS Code-hosted browsing
+- Auto-idle shutdown closes inactive browser sessions after the configured timeout
+- No external MCP server, no separate browser tool registration, no extra setup after install
 ## When to Activate
 ### Reactive Triggers
-- \`repo-access\` skill exhausted its Strategy Ladder — SAML SSO, OAuth, or login walls block all CLI paths.
-- \`web_fetch\` returns login page HTML, SAML redirect, or CAPTCHA challenge instead of content.
-- \`http\` returns \`401\`/\`403\` and the user confirms they can access the resource in their browser.
-- Any tool output contains "CAPTCHA", "bot detection", "Cloudflare", "Please verify you are human", or similar anti-bot language.
-- User asks to interact with a web application (fill forms, click buttons, navigate, extract data).
-- User asks to take screenshots, test UI, or debug a web page.
-- A site requires JavaScript rendering that \`web_fetch\` cannot handle.
+- \`repo-access\` exhausted its Strategy Ladder and SAML SSO, OAuth, or a login wall blocks CLI access.
+- \`web_fetch\` returns login HTML, redirect markup, or a CAPTCHA challenge instead of target content.
+- \`http\` returns \`401\` or \`403\` and the user confirms they can access the site in a browser.
+- Tool output mentions "CAPTCHA", "bot detection", "Cloudflare", "verify you are human", or similar anti-bot language.
+- User asks to interact with a web application, fill forms, click buttons, navigate flows, or extract rendered content.
+- User asks to take screenshots, inspect accessibility output, or debug a page that requires JavaScript.
 ### Proactive Triggers
-- Task involves an internal/enterprise web application with SSO.
-- User asks to scrape, automate, or interact with a website.
-- User mentions a site that requires login.
+- Task involves an internal or enterprise web application with SSO.
+- User asks to browse, scrape, test, or automate a website.
+- A workflow already uses \`present({ format: 'browser' })\` and you need to open the returned local dashboard URL.
 ## When NOT to Activate
-- Public pages that \`web_fetch\` handles fine (no login, no JS rendering needed).
-- API endpoints accessible via \`http\` tool with proper auth headers.
-- Static file downloads that work with \`http\`.
-- Tasks that only need \`read_page\` on an already-open browser tab.
-## Available Browser Tools
+- Public pages that \`web_fetch\` handles correctly and do not require interaction.
+- API endpoints that are reachable via \`http\` with proper auth headers.
+- Static downloads that work through \`http\` or repo-local tooling.
+- Tasks that only need raw HTML, links, or outline extraction.
-All tools are provided by the Playwright MCP server — zero setup, always available:
+## Browser Action Reference
-| Tool | Purpose | Key Parameters |
-|------|---------|----------------|
-| \`open_browser_page\` | Open URL in the integrated browser | \`url\`, \`forceNew\` |
-| \`read_page\` | Get accessibility snapshot with element refs | \`pageId\` |
-| \`click_element\` | Click by ref, selector, or description | \`pageId\`, \`ref\`/\`selector\`, \`element\` |
-| \`type_in_page\` | Type text or press keys into elements | \`pageId\`, \`text\`/\`key\`, \`ref\`/\`selector\` |
-| \`navigate_page\` | Navigate by URL, back/forward, reload | \`pageId\`, \`url\`/\`type\` |
-| \`hover_element\` | Hover over elements (tooltips, menus) | \`pageId\`, \`ref\`/\`selector\` |
-| \`drag_element\` | Drag and drop between elements | \`pageId\`, \`fromRef\`, \`toRef\` |
-| \`handle_dialog\` | Respond to alerts, confirms, file choosers | \`pageId\`, \`acceptModal\` |
-| \`screenshot_page\` | Capture visual screenshot | \`pageId\`, \`ref\`/\`selector\` |
-| \`run_playwright_code\` | Run custom Playwright scripts for advanced automation | \`pageId\`, \`code\` |
+| Action | Purpose | Key Fields |
+|--------|---------|------------|
+| \`open\` | Open a page in AI Kit's owned browser runtime | \`url\`, \`mode?\`, \`waitUntil?\` |
+| \`read\` | Return accessibility snapshot with refs and visible structure | \`pageId\` |
+| \`act\` | Interact with the page: click, type, press, hover, drag, select | \`pageId\`, \`kind\`, selector/ref/text/key fields |
+| \`navigate\` | Go to URL, back, forward, reload, or wait for navigation | \`pageId\`, \`url?\`, \`type?\`, \`waitFor?\` |
+| \`eval\` | Run sandboxed JavaScript in the page context | \`pageId\`, \`code\` |
+| \`screenshot\` | Capture page or element screenshot | \`pageId\`, selector/ref fields |
+| \`dialog\` | Accept or dismiss modal dialogs and related prompts | \`pageId\`, \`accept\`, \`promptText?\` |
+| \`session\` | List open pages, close a page, or export cookies | \`sessionAction\`, \`pageId?\` |
 ## Core Workflow
-Every browser interaction follows this pattern:
+Every browser task follows the same loop:
 \`\`\`
-1. OPEN  → open_browser_page({ url: "<target>" })
-2. READ  → read_page({ pageId })  — get accessibility tree with element refs
-3. ACT   → click_element / type_in_page / navigate_page — interact with elements
-4. READ  → read_page({ pageId })  — verify the result
+1. OPEN  → browser({ action: 'open', url: '<target>', mode: 'ui' })
+2. READ  → browser({ action: 'read', pageId })
+3. ACT   → browser({ action: 'act', pageId, kind: 'click' | 'type' | 'press' | 'hover' | 'drag' | 'select', ... })
+4. READ  → browser({ action: 'read', pageId })
 5. LOOP  → Repeat steps 3-4 until the task is complete
 \`\`\`
-### Example: Login to a Web Application
-\`\`\`
-open_browser_page({ url: "https://example.com/login" })
-  → Returns pageId
-read_page({ pageId })
-  → Shows form with refs: @username-input, @password-input, @login-button
-type_in_page({ pageId, ref: "@username-input", text: "user@example.com" })
-  → Note: ASK the user for credentials, NEVER guess
+## Usage Examples
-type_in_page({ pageId, ref: "@password-input", text: "<user-provided>" })
+### Open and Inspect a Page
-click_element({ pageId, ref: "@login-button", element: "Login button" })
-read_page({ pageId })
-  → Verify: page shows dashboard/welcome content, not login form
 \`\`\`
-### Example: Extract Content from Authenticated Page
+const { pageId } = await browser({ action: 'open', url: 'https://example.com', mode: 'ui' })
+await browser({ action: 'read', pageId })
 \`\`\`
-open_browser_page({ url: "https://internal.company.com/docs" })
-read_page({ pageId })
-  → If login wall: follow login flow (see auth patterns)
-  → If content visible: extract what you need
+### Login to a Web Application
-run_playwright_code({
-  pageId,
-  code: \\\`return page.evaluate(() => document.querySelector('main').innerText)\\\`
-})
-  → Returns the page text content
 \`\`\`
+const { pageId } = await browser({ action: 'open', url: 'https://example.com/login', mode: 'ui' })
-### Example: Fill a Form
+await browser({ action: 'read', pageId })
+await browser({ action: 'act', pageId, kind: 'type', ref: '@username-input', text: 'user@example.com' })
+await browser({ action: 'act', pageId, kind: 'type', ref: '@password-input', text: '<user-provided>' })
+await browser({ action: 'act', pageId, kind: 'click', ref: '@login-button' })
+await browser({ action: 'read', pageId })
 \`\`\`
-open_browser_page({ url: "https://example.com/form" })
-read_page({ pageId })  → identify form fields and their refs
-type_in_page({ pageId, ref: "@name-field", text: "John Doe" })
-type_in_page({ pageId, ref: "@email-field", text: "john@example.com" })
-click_element({ pageId, ref: "@country-select", element: "country dropdown" })
-click_element({ pageId, ref: "@us-option", element: "United States option" })
-click_element({ pageId, ref: "@submit-button", element: "Submit button" })
+**Rule:** ask the user for credentials and 2FA codes. Never guess, reuse, or log them.
+### Extract Content from an Authenticated Page
-read_page({ pageId })  → verify submission success
 \`\`\`
+const { pageId } = await browser({ action: 'open', url: 'https://internal.company.com/docs', mode: 'ui' })
+await browser({ action: 'read', pageId })
-## Advanced: run_playwright_code
+await browser({
+  action: 'eval',
+  pageId,
+  code: "return page.evaluate(() => document.querySelector('main')?.innerText ?? '')",
+})
+\`\`\`
-For complex automation that basic tools can't handle, use \`run_playwright_code\`:
+### Navigate, Hover, and Capture a Screenshot
-### Extract All Links
-\`\`\`javascript
-return page.evaluate(() =>
-  Array.from(document.querySelectorAll('a[href]'))
-    .map(a => ({ text: a.textContent.trim(), href: a.href }))
-    .filter(l => l.text)
-)
 \`\`\`
-### Wait for Dynamic Content
-\`\`\`javascript
-await page.waitForSelector('.results-loaded', { timeout: 10000 })
-return page.evaluate(() => document.querySelector('.results').innerText)
+await browser({ action: 'navigate', pageId, url: 'https://example.com/dashboard' })
+await browser({ action: 'act', pageId, kind: 'hover', selector: '[data-help]' })
+await browser({ action: 'screenshot', pageId })
 \`\`\`
-### Extract Table Data
-\`\`\`javascript
-return page.evaluate(() => {
-  const rows = document.querySelectorAll('table tbody tr')
-  return Array.from(rows).map(row =>
-    Array.from(row.cells).map(cell => cell.textContent.trim())
-  )
-})
-\`\`\`
+### Session Management
-### Extract Cookies (for session transfer)
-\`\`\`javascript
-const cookies = await page.context().cookies()
-return cookies.filter(c => c.name.includes('session') || c.name.includes('auth'))
 \`\`\`
-### Scroll and Load More
-\`\`\`javascript
-let previousHeight = 0
-while (true) {
-  const height = await page.evaluate(() => document.body.scrollHeight)
-  if (height === previousHeight) break
-  previousHeight = height
-  await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight))
-  await page.waitForTimeout(1000)
-}
-return page.evaluate(() => document.body.innerText)
+await browser({ action: 'session', sessionAction: 'list' })
+await browser({ action: 'session', sessionAction: 'cookies', pageId })
+await browser({ action: 'session', sessionAction: 'close', pageId })
 \`\`\`
-## Integration with repo-access Skill
-This skill is the **browser escalation path** for repo-access. When repo-access cannot solve authentication via CLI strategies (Steps 1-5), browser-use provides the final recovery:
+Use cookie export only when the user explicitly needs session transfer back into CLI tools.
-### Scenario: SAML SSO on GitHub Enterprise
+## Security Model (HARD GATE)
-1. \`repo-access\` detects SAML SSO redirect in \`web_fetch\` output
-2. \`repo-access\` walks Strategy Ladder — all CLI paths fail
-3. **Escalate to browser-use:**
-   a. \`open_browser_page({ url: repoUrl })\` — open repo page
-   b. \`read_page\` — check if SSO redirect appears
-   c. If SSO login form → interact with it (user may need to provide credentials)
-   d. If SSO auto-completes (IdP session) → page loads with repo content
-   e. Extract content via \`read_page\` or \`run_playwright_code\`
-   f. For git clone access: extract cookies/tokens via \`run_playwright_code\` to use in CLI
+- AI Kit enforces URL allowlisting before page navigation; respect denials instead of trying alternate bypasses.
+- \`eval\` runs inside AI Kit's browser sandbox. Keep scripts minimal, purpose-built, and limited to the user-approved task.
+- Password field values are redacted by the runtime. Never ask the tool to expose them and never echo them back to the user.
+- Cookie export is gated behind \`action: 'session'\`. Only request cookies when necessary, tell the user they are sensitive, and never store them in code, commits, or logs.
+- Never screenshot or copy pages that visibly reveal passwords, tokens, or other secrets.
+- Never automate destructive or irreversible actions unless the user explicitly requested them.
+- Never bypass 2FA, CAPTCHA, or rate limits. Ask the user to complete the human step, then continue.
-### Scenario: OAuth Login Flow
+## Integration with Other Skills
-1. A service requires OAuth consent screen interaction
-2. \`open_browser_page({ url: oauthUrl })\` — open OAuth page
-3. \`read_page\` → find the "Authorize" / "Allow" button
-4. \`click_element\` → authorize
-5. \`read_page\` → URL now contains \`?code=abc123\`
-6. Extract the authorization code → return to CLI workflow for token exchange
+### repo-access
-### Scenario: 2FA / MFA Challenge
+This skill is the final browser escalation path for \`repo-access\`. Use it when CLI auth recovery fails and the target requires SSO, OAuth, or a login wall. Typical flow:
-1. Open login page, fill credentials
-2. Page shows 2FA prompt
-3. **Ask the user** for their 2FA code (NEVER guess or bypass)
-4. \`type_in_page\` → enter code
-5. Verify login succeeded via \`read_page\`
+1. \`repo-access\` exhausts Steps 1-5.
+2. Load \`browser-use\`.
+3. \`browser({ action: 'open', url: repoUrl, mode: 'ui' })\`
+4. \`browser({ action: 'read', pageId })\` to inspect login state.
+5. Use \`browser({ action: 'act', kind: 'type' | 'click', ... })\` for login fields and buttons.
+6. Use \`browser({ action: 'eval', ... })\` or \`browser({ action: 'session', sessionAction: 'cookies', ... })\` only when the user explicitly needs extracted content or session transfer.
-### Scenario: Content Behind Login Wall
+### present
-1. \`web_fetch\` returns login HTML instead of content
-2. \`open_browser_page({ url })\` → open the page
-3. If login form visible → guide through login (ask user for credentials)
-4. Once authenticated → \`read_page\` or \`run_playwright_code\` to extract content
-5. Content is now available without needing \`web_fetch\`
+When \`present({ format: 'browser' })\` returns a local dashboard URL, open it with AI Kit's browser tool instead of an external browser MCP:
-## Security Rules (HARD GATE)
+\`\`\`
+browser({ action: 'open', url: 'http://127.0.0.1:{port}', mode: 'ui' })
+\`\`\`
-- **NEVER** extract, log, or display user passwords or secrets from browser sessions.
-- **NEVER** screenshot pages containing visible credentials, tokens, or sensitive data.
-- **NEVER** automate actions the user hasn't explicitly requested (no purchasing, no sending messages, no deleting content).
-- **NEVER** bypass 2FA/MFA — always ask the user for codes.
-- **ALWAYS** ask the user for credentials rather than guessing or using stored values.
-- **ALWAYS** confirm before submitting forms or performing irreversible actions.
-- When extracting cookies via \`run_playwright_code\`, warn the user they contain auth tokens.
-- **NEVER** store extracted cookies in code, commits, or logs.
-- Close browser sessions when done if they contain authenticated state.
+This keeps the viewing workflow inside the same owned runtime.
 ## Troubleshooting
-| Problem | Solution |
+| Problem | Response |
 |---------|----------|
-| \`open_browser_page\` fails | Check the URL is valid and accessible. Try with \`forceNew: true\` |
-| Page shows "No active page" | The pageId is stale — re-open with \`open_browser_page\` |
-| Element not found by ref | Refs change on page re-render — call \`read_page\` again to get fresh refs |
-| Element not visible | Use \`run_playwright_code\` to scroll: \`await page.evaluate(() => window.scrollBy(0, 500))\` |
-| Login redirect loop | The site may need cookies from a different domain — check with \`run_playwright_code\` |
-| CAPTCHA appears | Ask the user to solve it manually in the browser panel, then continue |
-| Page loads empty/blank | Site may block headless browsers — try \`screenshot_page\` to see what rendered |
-| Dynamic content not loaded | Use \`run_playwright_code\` with \`page.waitForSelector\` before reading |
-| Form submission fails | Check for hidden fields or CSRF tokens — use \`run_playwright_code\` to inspect |
-| Multiple pages needed | Track multiple pageIds — \`open_browser_page\` returns unique IDs per page |
+| Browser runtime missing | Run \`aikit browser install\` and retry |
+| No active page or stale \`pageId\` | Re-open with \`action: 'open'\` or inspect \`action: 'session'\` \`list\` output |
+| Element refs stop matching | Re-run \`browser({ action: 'read', pageId })\` after each re-render |
+| Headless blocked by target site | Retry with \`mode: 'ui'\` or \`mode: 'panel'\` |
+| CAPTCHA appears | Ask the user to solve it manually, then continue from \`read\` |
+| Need to inspect cookies | Use \`browser({ action: 'session', sessionAction: 'cookies', pageId })\` and warn the user |
+| Need complex DOM extraction | Use \`browser({ action: 'eval', ... })\` with a small, targeted script |
 ## Decision Flow
 \`\`\`
-Need to access a web resource?
-├─ Public, no JS needed?           → web_fetch (don't use browser)
-├─ Public, needs JS rendering?     → open_browser_page → read_page
-├─ Behind login wall?              → open_browser_page → login flow → extract content
-├─ repo-access exhausted?          → browser-use is the final escalation
-├─ Need to fill forms/click?       → open_browser_page → read_page → interact
-├─ Need screenshots?               → open_browser_page → screenshot_page
-├─ CAPTCHA blocking access?        → ask user to solve in browser panel
-└─ Complex multi-step automation?  → run_playwright_code for custom scripts
+Need browser help?
+├─ Public page, no JS or auth needed?   → web_fetch
+├─ Needs JS rendering or interaction?   → browser open/read
+├─ Login wall or SSO flow?              → repo-access → browser-use
+├─ Need local dashboard viewing?        → present(browser) → browser open
+├─ Need screenshot or accessibility?    → browser screenshot/read
+└─ Need cookie/session transfer?        → browser session (with user approval)
 \`\`\`
 `},{file:`references/auth-patterns.md`,content:`# Browser Auth Patterns
-Patterns for using Playwright MCP browser tools to solve authentication challenges that block CLI-based access.
+Patterns for using AI Kit's owned \`browser\` tool to solve authentication challenges that block CLI-based access.
 ## Pattern 1: SAML SSO Recovery
-**Problem:** \`web_fetch\` returns SAML redirect HTML instead of content. \`repo-access\` Strategy Ladder exhausted.
+**Problem:** \`web_fetch\` returns SAML redirect HTML instead of content and \`repo-access\` exhausted its Strategy Ladder.
 **Solution:**
 \`\`\`
 1. Open the target URL:
-   open_browser_page({ url: targetUrl })
+   const { pageId } = await browser({ action: 'open', url: targetUrl, mode: 'ui' })
-2. Read page to check state:
-   read_page({ pageId })
-   → If SSO login form: proceed to step 3
-   → If content already visible: skip to step 5
+2. Read page state:
+   await browser({ action: 'read', pageId })
+   → If SSO login form: continue to step 3
+   → If content is already visible: skip to step 5
 3. SSO login interaction:
-   - Find username/email field → type_in_page({ ref, text: userEmail })
-   - Find password field → type_in_page({ ref, text: userPassword })
-   - Click "Sign In" → click_element({ ref, element: "Sign In button" })
-   - NOTE: ASK the user for credentials first
+   - Username/email field → browser({ action: 'act', pageId, kind: 'type', ref: usernameRef, text: userEmail })
+   - Password field → browser({ action: 'act', pageId, kind: 'type', ref: passwordRef, text: userPassword })
+   - Submit button → browser({ action: 'act', pageId, kind: 'click', ref: signInButtonRef })
+   - Ask the user for credentials first. Never guess.
-4. Handle SSO redirect chain:
-   - The browser will auto-follow redirects through the IdP
-   - read_page({ pageId }) after each redirect to check state
-   - If 2FA prompt appears → ask user for code → type_in_page
+4. Handle redirect chain:
+   - Re-run \`browser({ action: 'read', pageId })\` after redirects
+   - If 2FA prompt appears, ask the user for the code and enter it with \`kind: 'type'\`
 5. Extract content:
-   - read_page({ pageId }) → get accessible text
-   - Or: run_playwright_code → document.querySelector('main').innerText
-   - Or: screenshot_page for visual content
+   - \`browser({ action: 'read', pageId })\` for accessible text
+   - \`browser({ action: 'eval', ... })\` for targeted extraction
+   - \`browser({ action: 'screenshot', pageId })\` for visual capture
 \`\`\`
 ## Pattern 2: OAuth Consent Flow
-**Problem:** Service requires OAuth consent that can't be completed in CLI.
+**Problem:** Service requires OAuth consent that cannot be completed in CLI.
 **Solution:**
 \`\`\`
-1. open_browser_page({ url: oauthAuthorizeUrl })
+1. const { pageId } = await browser({ action: 'open', url: oauthAuthorizeUrl, mode: 'ui' })
-2. read_page({ pageId })
+2. await browser({ action: 'read', pageId })
    → Find the "Authorize" / "Allow" / "Grant access" button
-3. click_element({ pageId, ref: authorizeButtonRef, element: "Authorize button" })
+3. await browser({ action: 'act', pageId, kind: 'click', ref: authorizeButtonRef })
-4. read_page({ pageId })
-   → URL now contains ?code=abc123 (the authorization code)
+4. await browser({ action: 'read', pageId })
+   → URL now contains ?code=abc123 or the consent flow is complete
-5. Extract the code:
-   run_playwright_code({
-     pageId,
-     code: 'return page.url()'
-   })
-   → Parse the authorization code from the URL
+5. Extract the final URL when needed:
+   await browser({ action: 'eval', pageId, code: 'return page.url()' })
-6. Return code to CLI workflow for token exchange
+6. Return the authorization code or completed session to the CLI workflow
 \`\`\`
 ## Pattern 3: 2FA / MFA Challenge
-**Problem:** Login requires 2FA code that only the user can provide.
+**Problem:** Login requires a 2FA code that only the user can provide.
-**CRITICAL:** NEVER try to bypass 2FA. NEVER guess codes. ALWAYS ask the user.
+**CRITICAL:** Never bypass 2FA and never guess codes.
 **Solution:**
 \`\`\`
-1. Complete username/password entry (Pattern 1 steps 1-3)
-2. read_page({ pageId })
-   → Page shows 2FA input field
+1. Complete username/password entry from Pattern 1
-3. Ask the user for their 2FA code via elicitation
+2. await browser({ action: 'read', pageId })
+   → Confirm the page shows a 2FA input field
-4. type_in_page({ pageId, ref: totpInputRef, text: userProvidedCode })
+3. Ask the user for the code via elicitation
-5. click_element({ pageId, ref: verifyButtonRef, element: "Verify button" })
-   → Or: type_in_page({ pageId, key: "Enter" })
+4. await browser({ action: 'act', pageId, kind: 'type', ref: totpInputRef, text: userProvidedCode })
+5. await browser({ action: 'act', pageId, kind: 'press', key: 'Enter' })
-6. read_page({ pageId })
-   → Verify: page shows authenticated content, not login/2FA form
+6. await browser({ action: 'read', pageId })
+   → Verify the page shows authenticated content, not the login form
 \`\`\`
-## Pattern 4: Cookie/Token Extraction
+## Pattern 4: Cookie or Token Transfer
-**Problem:** Need to extract auth tokens from an authenticated browser session for use in CLI tools.
+**Problem:** CLI tools need authenticated session state from the browser.
 **Solution:**
 \`\`\`
-1. Complete login flow (Patterns 1-3)
-2. Extract cookies:
-   run_playwright_code({
-     pageId,
-     code: \\\`
-       const cookies = await page.context().cookies()
-       return cookies.filter(c =>
-         c.name.includes('session') ||
-         c.name.includes('auth') ||
-         c.name.includes('token')
-       )
-     \\\`
-   })
-3. Use extracted cookie values in http tool:
-   http({
-     url: apiEndpoint,
-     headers: { "Cookie": "session=<extracted-value>" }
-   })
-4. WARNING: Tell the user these tokens are ephemeral and will expire.
-   NEVER store them in code, commits, or logs.
+1. Complete login flow first
+2. Export cookies only if the user explicitly asked for session transfer:
+   await browser({ action: 'session', sessionAction: 'cookies', pageId })
+3. Use the returned cookie data with CLI tools or \`http\` as needed
+4. Tell the user the cookies are sensitive and ephemeral.
+   Never commit, log, or persist them in source files.
 \`\`\`
-## Pattern 5: Content Behind Login Wall
+## Pattern 5: Content Behind a Login Wall
-**Problem:** \`web_fetch\` returns a login page instead of content.
+**Problem:** \`web_fetch\` returns a login page instead of the target content.
 **Solution:**
 \`\`\`
-1. open_browser_page({ url: targetUrl })
+1. const { pageId } = await browser({ action: 'open', url: targetUrl, mode: 'ui' })
-2. read_page({ pageId })
-   → Login form visible
+2. await browser({ action: 'read', pageId })
+   → Confirm login form is visible
-3. Ask user for credentials (NEVER guess)
+3. Ask the user for credentials
-4. Fill and submit login form:
-   type_in_page({ pageId, ref: usernameRef, text: userEmail })
-   type_in_page({ pageId, ref: passwordRef, text: userPassword })
-   click_element({ pageId, ref: loginButtonRef, element: "Login button" })
+4. Fill and submit the form:
+   - browser({ action: 'act', pageId, kind: 'type', ref: usernameRef, text: userEmail })
+   - browser({ action: 'act', pageId, kind: 'type', ref: passwordRef, text: userPassword })
+   - browser({ action: 'act', pageId, kind: 'click', ref: loginButtonRef })
 5. Handle post-login challenges:
-   read_page({ pageId })
-   → 2FA? → Pattern 3
-   → Consent screen? → Pattern 2
-   → Content visible? → Continue
-6. Extract the content:
-   read_page({ pageId })  → accessible text
-   run_playwright_code → targeted extraction
-   screenshot_page → visual capture
+   - 2FA → Pattern 3
+   - Consent screen → Pattern 2
+   - Success → continue
+6. Extract content with \`read\`, \`eval\`, or \`screenshot\`
 \`\`\`
 ## Pattern 6: CAPTCHA Handling
-**Problem:** Target site shows CAPTCHA challenge.
+**Problem:** Target site shows a CAPTCHA or anti-bot challenge.
 **Detection signals:**
-- "Checking your browser..." (Cloudflare)
-- reCAPTCHA / hCaptcha / Turnstile widget visible
+- "Checking your browser..."
+- reCAPTCHA, hCaptcha, or Turnstile widgets
 - "Please verify you are human"
 **Solution:**
 \`\`\`
-1. open_browser_page({ url: targetUrl })
+1. const { pageId } = await browser({ action: 'open', url: targetUrl, mode: 'ui' })
-2. read_page({ pageId }) or screenshot_page({ pageId })
-   → CAPTCHA visible
+2. Inspect with:
+   - browser({ action: 'read', pageId })
+   - browser({ action: 'screenshot', pageId })
-3. ASK THE USER to solve the CAPTCHA:
-   "A CAPTCHA challenge appeared on the page. Please solve it
-    in the browser panel, then let me know when done."
+3. Ask the user to solve the CAPTCHA in the browser window or panel
-4. After user confirms:
-   read_page({ pageId })
-   → Content should now be accessible
+4. After the user confirms, continue with:
+   browser({ action: 'read', pageId })
-5. If CAPTCHA reappears → the site may be aggressively blocking
-   automation. Report to user and suggest manual access.
+5. If the CAPTCHA loops, report that manual access is required
 \`\`\`
-**Key rule:** NEVER attempt to solve CAPTCHAs programmatically. Always ask the user.
 ## Security Reminders
-- Always ask the user for credentials — NEVER guess, infer, or reuse
-- Extracted cookies/tokens are SECRETS — never log, store, or commit them
-- Tell the user when you extract auth tokens and that they expire
+- Always ask the user for credentials and 2FA codes; never guess or reuse hidden values
+- Exported cookies or tokens are secrets; never log, store, or commit them
 - Confirm before submitting forms or performing irreversible actions
-- Close authenticated sessions when the task is complete
-- Never bypass security measures (2FA, CAPTCHA, rate limits)
+- Close authenticated pages when the task is complete: \`browser({ action: 'session', sessionAction: 'close', pageId })\`
+- Respect allowlisting, sandboxing, and any runtime security denial from the browser tool
 `}];export{e as default};
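
The added lines in this diff replace ten separate Playwright MCP tools with a single `browser({ action: ... })` entry point. As a rough illustration of that action-based dispatch and of the open, read, act, read loop, here is a minimal sketch built on an in-memory stand-in. Everything in it is hypothetical: `createMockBrowser`, `requirePage`, `runLoop`, and all return shapes are assumptions made for illustration, since the diff shows only the request fields (`action`, `pageId`, `kind`, `ref`, `text`, `sessionAction`) and not the real runtime's responses.

```javascript
// Hypothetical in-memory stand-in for the `browser` tool named in the diff.
// It mimics only the action and field names quoted there; the return shapes
// are assumed, because the diff does not show what the real runtime returns.
function createMockBrowser() {
  const pages = new Map();
  let nextId = 1;

  return async function browser(request) {
    switch (request.action) {
      case 'open': {
        // Register a new page and hand back its identifier.
        const pageId = `page-${nextId++}`;
        pages.set(pageId, { url: request.url, fields: {} });
        return { pageId };
      }
      case 'read': {
        // Return a copy of the page state (stand-in for an accessibility snapshot).
        const page = requirePage(pages, request.pageId);
        return { url: page.url, fields: { ...page.fields } };
      }
      case 'act': {
        // Only `kind: 'type'` changes state in this sketch; other kinds are no-ops.
        const page = requirePage(pages, request.pageId);
        if (request.kind === 'type') page.fields[request.ref] = request.text;
        return { ok: true };
      }
      case 'session': {
        if (request.sessionAction === 'list') return { pages: [...pages.keys()] };
        if (request.sessionAction === 'close') {
          requirePage(pages, request.pageId);
          pages.delete(request.pageId);
          return { ok: true };
        }
        throw new Error(`Unsupported sessionAction: ${request.sessionAction}`);
      }
      default:
        throw new Error(`Unsupported action: ${request.action}`);
    }
  };
}

// Look up a page or fail the way a stale pageId would.
function requirePage(pages, pageId) {
  const page = pages.get(pageId);
  if (!page) throw new Error(`No active page for pageId: ${pageId}`);
  return page;
}

// Open, read, act, read again to verify, then close the page.
async function runLoop() {
  const browser = createMockBrowser();
  const { pageId } = await browser({ action: 'open', url: 'https://example.com/login', mode: 'ui' });
  await browser({ action: 'read', pageId });
  await browser({ action: 'act', pageId, kind: 'type', ref: '@username-input', text: 'user@example.com' });
  const snapshot = await browser({ action: 'read', pageId });
  await browser({ action: 'session', sessionAction: 'close', pageId });
  const remaining = await browser({ action: 'session', sessionAction: 'list' });
  return { snapshot, remaining };
}
```

Calling `runLoop()` exercises each mocked action once. The second `read` is what confirms the `act` step took effect, which is the reason the skill text repeats `read` after every interaction.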