@vpxa/aikit 0.1.140 → 0.1.142
- package/package.json +2 -1
- package/packages/browser/dist/index.d.ts +92 -0
- package/packages/browser/dist/index.js +10 -0
- package/packages/indexer/dist/index.js +1 -1
- package/packages/server/dist/index.js +1 -1
- package/packages/server/dist/{routes-OaSHcA6x.js → routes-gbC5Wmr9.js} +1 -1
- package/packages/server/dist/{server-Cr0Y3q6C.js → server-Mioq3dZQ.js} +139 -139
- package/packages/tools/dist/index.d.ts +1 -0
- package/packages/tools/dist/index.js +72 -70
- package/scaffold/dist/adapters/copilot.mjs +13 -12
- package/scaffold/dist/definitions/bodies.mjs +7 -7
- package/scaffold/dist/definitions/plugins.mjs +1 -1
- package/scaffold/dist/definitions/protocols.mjs +82 -293
- package/scaffold/dist/definitions/skills/aikit.mjs +1 -1
- package/scaffold/dist/definitions/skills/browser-use.mjs +187 -285
- package/scaffold/dist/definitions/skills/c4-architecture.mjs +9 -4
- package/scaffold/dist/definitions/skills/multi-agents-development.mjs +46 -55
- package/scaffold/dist/definitions/skills/present.mjs +4 -4
- package/scaffold/dist/definitions/skills/repo-access.mjs +198 -1
- package/scaffold/dist/definitions/tools.mjs +1 -1
- package/scaffold/generated/block-docs.mjs +0 -13
@@ -1,6 +1,6 @@
 var e=[{file:`SKILL.md`,content:`---
 name: browser-use
-description: "Browser automation for AI agents using
+description: "Browser automation for AI agents using AI Kit's owned \`browser\` MCP tool. Triggered when: (1) repo-access exhausts its Strategy Ladder and auth requires browser interaction, (2) \`web_fetch\` returns login page HTML, SAML redirect, or CAPTCHA instead of content, (3) user needs to interact with web applications (fill forms, click buttons, extract data), (4) a site requires JavaScript rendering that \`web_fetch\` cannot handle, (5) user asks to browse, scrape, test, or automate a website. Uses AI Kit's owned Chromium runtime — no external MCP server dependency."
 metadata:
 category: cross-cutting
 domain: general
@@ -8,432 +8,334 @@ metadata:
 inputs: [url, auth-error, browser-task, login-wall]
 outputs: [page-content, screenshots, extracted-data, authenticated-session]
 requires: []
-relatedSkills: [repo-access, aikit]
+relatedSkills: [repo-access, present, aikit]
 argument-hint: "URL or browser task description"
 ---
 
 # Browser Automation for AI Agents
 
-
+Use AI Kit's owned \`browser\` MCP tool to solve authentication barriers, extract data, fill forms, and interact with web applications. This skill bridges CLI-based access failures (login walls, SAML SSO, OAuth, CAPTCHAs) and real browser interaction without any external browser MCP dependency.
 
-
+## Runtime Model
+
+- Single MCP tool: \`browser({ action: ... })\`
+- Action-based dispatch across eight actions: \`open\`, \`read\`, \`act\`, \`navigate\`, \`eval\`, \`screenshot\`, \`dialog\`, \`session\`
+- Owned Chromium runtime managed by AI Kit itself
+- Install browser binaries once with \`aikit browser install\`
+- Runtime modes: \`headless\` for CI, \`ui\` for desktop browser windows, \`panel\` for VS Code-hosted browsing
+- Auto-idle shutdown closes inactive browser sessions after the configured timeout
+- No external MCP server, no separate browser tool registration, no extra setup after install
 
 ## When to Activate
 
 ### Reactive Triggers
 
-- \`repo-access\`
-- \`web_fetch\` returns login
-- \`http\` returns \`401
--
-- User asks to interact with a web application
-- User asks to take screenshots,
-- A site requires JavaScript rendering that \`web_fetch\` cannot handle.
+- \`repo-access\` exhausted its Strategy Ladder and SAML SSO, OAuth, or a login wall blocks CLI access.
+- \`web_fetch\` returns login HTML, redirect markup, or a CAPTCHA challenge instead of target content.
+- \`http\` returns \`401\` or \`403\` and the user confirms they can access the site in a browser.
+- Tool output mentions "CAPTCHA", "bot detection", "Cloudflare", "verify you are human", or similar anti-bot language.
+- User asks to interact with a web application, fill forms, click buttons, navigate flows, or extract rendered content.
+- User asks to take screenshots, inspect accessibility output, or debug a page that requires JavaScript.
 
 ### Proactive Triggers
 
-- Task involves an internal
-- User asks to scrape,
--
+- Task involves an internal or enterprise web application with SSO.
+- User asks to browse, scrape, test, or automate a website.
+- A workflow already uses \`present({ format: 'browser' })\` and you need to open the returned local dashboard URL.
 
 ## When NOT to Activate
 
-- Public pages that \`web_fetch\` handles
-- API endpoints
-- Static
-- Tasks that only need
-
-## Available Browser Tools
+- Public pages that \`web_fetch\` handles correctly and do not require interaction.
+- API endpoints that are reachable via \`http\` with proper auth headers.
+- Static downloads that work through \`http\` or repo-local tooling.
+- Tasks that only need raw HTML, links, or outline extraction.
 
-
+## Browser Action Reference
 
-|
-
-| \`
-| \`
-| \`
-| \`
-| \`
-| \`
-| \`
-| \`
-| \`screenshot_page\` | Capture visual screenshot | \`pageId\`, \`ref\`/\`selector\` |
-| \`run_playwright_code\` | Run custom Playwright scripts for advanced automation | \`pageId\`, \`code\` |
+| Action | Purpose | Key Fields |
+|--------|---------|------------|
+| \`open\` | Open a page in AI Kit's owned browser runtime | \`url\`, \`mode?\`, \`waitUntil?\` |
+| \`read\` | Return accessibility snapshot with refs and visible structure | \`pageId\` |
+| \`act\` | Interact with the page: click, type, press, hover, drag, select | \`pageId\`, \`kind\`, selector/ref/text/key fields |
+| \`navigate\` | Go to URL, back, forward, reload, or wait for navigation | \`pageId\`, \`url?\`, \`type?\`, \`waitFor?\` |
+| \`eval\` | Run sandboxed JavaScript in the page context | \`pageId\`, \`code\` |
+| \`screenshot\` | Capture page or element screenshot | \`pageId\`, selector/ref fields |
+| \`dialog\` | Accept or dismiss modal dialogs and related prompts | \`pageId\`, \`accept\`, \`promptText?\` |
+| \`session\` | List open pages, close a page, or export cookies | \`sessionAction\`, \`pageId?\` |
 
 ## Core Workflow
 
-Every browser
+Every browser task follows the same loop:
 
 \`\`\`
-1. OPEN →
-2. READ →
-3. ACT →
-4. READ →
+1. OPEN → browser({ action: 'open', url: '<target>', mode: 'ui' })
+2. READ → browser({ action: 'read', pageId })
+3. ACT → browser({ action: 'act', pageId, kind: 'click' | 'type' | 'press' | 'hover' | 'drag' | 'select', ... })
+4. READ → browser({ action: 'read', pageId })
 5. LOOP → Repeat steps 3-4 until the task is complete
 \`\`\`
 
-
-
-\`\`\`
-open_browser_page({ url: "https://example.com/login" })
-→ Returns pageId
-
-read_page({ pageId })
-→ Shows form with refs: @username-input, @password-input, @login-button
-
-type_in_page({ pageId, ref: "@username-input", text: "user@example.com" })
-→ Note: ASK the user for credentials, NEVER guess
+## Usage Examples
 
-
+### Open and Inspect a Page
 
-click_element({ pageId, ref: "@login-button", element: "Login button" })
-
-read_page({ pageId })
-→ Verify: page shows dashboard/welcome content, not login form
 \`\`\`
-
-
-
+const { pageId } = await browser({ action: 'open', url: 'https://example.com', mode: 'ui' })
+await browser({ action: 'read', pageId })
 \`\`\`
-open_browser_page({ url: "https://internal.company.com/docs" })
 
-
-→ If login wall: follow login flow (see auth patterns)
-→ If content visible: extract what you need
+### Login to a Web Application
 
-run_playwright_code({
-pageId,
-code: \\\`return page.evaluate(() => document.querySelector('main').innerText)\\\`
-})
-→ Returns the page text content
 \`\`\`
+const { pageId } = await browser({ action: 'open', url: 'https://example.com/login', mode: 'ui' })
 
-
-
+await browser({ action: 'read', pageId })
+await browser({ action: 'act', pageId, kind: 'type', ref: '@username-input', text: 'user@example.com' })
+await browser({ action: 'act', pageId, kind: 'type', ref: '@password-input', text: '<user-provided>' })
+await browser({ action: 'act', pageId, kind: 'click', ref: '@login-button' })
+await browser({ action: 'read', pageId })
 \`\`\`
-open_browser_page({ url: "https://example.com/form" })
-read_page({ pageId }) → identify form fields and their refs
 
-
-
-
-click_element({ pageId, ref: "@us-option", element: "United States option" })
-click_element({ pageId, ref: "@submit-button", element: "Submit button" })
+**Rule:** ask the user for credentials and 2FA codes. Never guess, reuse, or log them.
+
+### Extract Content from an Authenticated Page
 
-read_page({ pageId }) → verify submission success
 \`\`\`
+const { pageId } = await browser({ action: 'open', url: 'https://internal.company.com/docs', mode: 'ui' })
+await browser({ action: 'read', pageId })
 
-
+await browser({
+action: 'eval',
+pageId,
+code: "return page.evaluate(() => document.querySelector('main')?.innerText ?? '')",
+})
+\`\`\`
 
-
+### Navigate, Hover, and Capture a Screenshot
 
-### Extract All Links
-\`\`\`javascript
-return page.evaluate(() =>
-Array.from(document.querySelectorAll('a[href]'))
-.map(a => ({ text: a.textContent.trim(), href: a.href }))
-.filter(l => l.text)
-)
 \`\`\`
-
-
-
-await page.waitForSelector('.results-loaded', { timeout: 10000 })
-return page.evaluate(() => document.querySelector('.results').innerText)
+await browser({ action: 'navigate', pageId, url: 'https://example.com/dashboard' })
+await browser({ action: 'act', pageId, kind: 'hover', selector: '[data-help]' })
+await browser({ action: 'screenshot', pageId })
 \`\`\`
 
-###
-\`\`\`javascript
-return page.evaluate(() => {
-const rows = document.querySelectorAll('table tbody tr')
-return Array.from(rows).map(row =>
-Array.from(row.cells).map(cell => cell.textContent.trim())
-)
-})
-\`\`\`
+### Session Management
 
-### Extract Cookies (for session transfer)
-\`\`\`javascript
-const cookies = await page.context().cookies()
-return cookies.filter(c => c.name.includes('session') || c.name.includes('auth'))
 \`\`\`
-
-
-
-let previousHeight = 0
-while (true) {
-const height = await page.evaluate(() => document.body.scrollHeight)
-if (height === previousHeight) break
-previousHeight = height
-await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight))
-await page.waitForTimeout(1000)
-}
-return page.evaluate(() => document.body.innerText)
+await browser({ action: 'session', sessionAction: 'list' })
+await browser({ action: 'session', sessionAction: 'cookies', pageId })
+await browser({ action: 'session', sessionAction: 'close', pageId })
 \`\`\`
 
-
-
-This skill is the **browser escalation path** for repo-access. When repo-access cannot solve authentication via CLI strategies (Steps 1-5), browser-use provides the final recovery:
+Use cookie export only when the user explicitly needs session transfer back into CLI tools.
 
-
+## Security Model (HARD GATE)
 
-
-
-
-
-
-
-
-e. Extract content via \`read_page\` or \`run_playwright_code\`
-f. For git clone access: extract cookies/tokens via \`run_playwright_code\` to use in CLI
+- AI Kit enforces URL allowlisting before page navigation; respect denials instead of trying alternate bypasses.
+- \`eval\` runs inside AI Kit's browser sandbox. Keep scripts minimal, purpose-built, and limited to the user-approved task.
+- Password field values are redacted by the runtime. Never ask the tool to expose them and never echo them back to the user.
+- Cookie export is gated behind \`action: 'session'\`. Only request cookies when necessary, tell the user they are sensitive, and never store them in code, commits, or logs.
+- Never screenshot or copy pages that visibly reveal passwords, tokens, or other secrets.
+- Never automate destructive or irreversible actions unless the user explicitly requested them.
+- Never bypass 2FA, CAPTCHA, or rate limits. Ask the user to complete the human step, then continue.
 
-
+## Integration with Other Skills
 
-
-2. \`open_browser_page({ url: oauthUrl })\` — open OAuth page
-3. \`read_page\` → find the "Authorize" / "Allow" button
-4. \`click_element\` → authorize
-5. \`read_page\` → URL now contains \`?code=abc123\`
-6. Extract the authorization code → return to CLI workflow for token exchange
+### repo-access
 
-
+This skill is the final browser escalation path for \`repo-access\`. Use it when CLI auth recovery fails and the target requires SSO, OAuth, or a login wall. Typical flow:
 
-1.
-2.
-3.
-4. \`
-5.
+1. \`repo-access\` exhausts Steps 1-5.
+2. Load \`browser-use\`.
+3. \`browser({ action: 'open', url: repoUrl, mode: 'ui' })\`
+4. \`browser({ action: 'read', pageId })\` to inspect login state.
+5. Use \`browser({ action: 'act', kind: 'type' | 'click', ... })\` for login fields and buttons.
+6. Use \`browser({ action: 'eval', ... })\` or \`browser({ action: 'session', sessionAction: 'cookies', ... })\` only when the user explicitly needs extracted content or session transfer.
 
-###
+### present
 
-
-2. \`open_browser_page({ url })\` → open the page
-3. If login form visible → guide through login (ask user for credentials)
-4. Once authenticated → \`read_page\` or \`run_playwright_code\` to extract content
-5. Content is now available without needing \`web_fetch\`
+When \`present({ format: 'browser' })\` returns a local dashboard URL, open it with AI Kit's browser tool instead of an external browser MCP:
 
-
+\`\`\`
+browser({ action: 'open', url: 'http://127.0.0.1:{port}', mode: 'ui' })
+\`\`\`
 
-
-- **NEVER** screenshot pages containing visible credentials, tokens, or sensitive data.
-- **NEVER** automate actions the user hasn't explicitly requested (no purchasing, no sending messages, no deleting content).
-- **NEVER** bypass 2FA/MFA — always ask the user for codes.
-- **ALWAYS** ask the user for credentials rather than guessing or using stored values.
-- **ALWAYS** confirm before submitting forms or performing irreversible actions.
-- When extracting cookies via \`run_playwright_code\`, warn the user they contain auth tokens.
-- **NEVER** store extracted cookies in code, commits, or logs.
-- Close browser sessions when done if they contain authenticated state.
+This keeps the viewing workflow inside the same owned runtime.
 
 ## Troubleshooting
 
-| Problem |
+| Problem | Response |
 |---------|----------|
-|
-|
-| Element
-|
-|
-|
-|
-| Dynamic content not loaded | Use \`run_playwright_code\` with \`page.waitForSelector\` before reading |
-| Form submission fails | Check for hidden fields or CSRF tokens — use \`run_playwright_code\` to inspect |
-| Multiple pages needed | Track multiple pageIds — \`open_browser_page\` returns unique IDs per page |
+| Browser runtime missing | Run \`aikit browser install\` and retry |
+| No active page or stale \`pageId\` | Re-open with \`action: 'open'\` or inspect \`action: 'session'\` \`list\` output |
+| Element refs stop matching | Re-run \`browser({ action: 'read', pageId })\` after each re-render |
+| Headless blocked by target site | Retry with \`mode: 'ui'\` or \`mode: 'panel'\` |
+| CAPTCHA appears | Ask the user to solve it manually, then continue from \`read\` |
+| Need to inspect cookies | Use \`browser({ action: 'session', sessionAction: 'cookies', pageId })\` and warn the user |
+| Need complex DOM extraction | Use \`browser({ action: 'eval', ... })\` with a small, targeted script |
 
 ## Decision Flow
 
 \`\`\`
-Need
-├─ Public, no JS needed?
-├─
-├─
-├─
-├─ Need
-
-├─ CAPTCHA blocking access? → ask user to solve in browser panel
-└─ Complex multi-step automation? → run_playwright_code for custom scripts
+Need browser help?
+├─ Public page, no JS or auth needed? → web_fetch
+├─ Needs JS rendering or interaction? → browser open/read
+├─ Login wall or SSO flow? → repo-access → browser-use
+├─ Need local dashboard viewing? → present(browser) → browser open
+├─ Need screenshot or accessibility? → browser screenshot/read
+└─ Need cookie/session transfer? → browser session (with user approval)
 \`\`\`
 `},{file:`references/auth-patterns.md`,content:`# Browser Auth Patterns
 
-Patterns for using
+Patterns for using AI Kit's owned \`browser\` tool to solve authentication challenges that block CLI-based access.
 
 ## Pattern 1: SAML SSO Recovery
 
-**Problem:** \`web_fetch\` returns SAML redirect HTML instead of content
+**Problem:** \`web_fetch\` returns SAML redirect HTML instead of content and \`repo-access\` exhausted its Strategy Ladder.
 
 **Solution:**
 \`\`\`
 1. Open the target URL:
-
+const { pageId } = await browser({ action: 'open', url: targetUrl, mode: 'ui' })
 
-2. Read page
-
-→ If SSO login form:
-→ If content already visible: skip to step 5
+2. Read page state:
+await browser({ action: 'read', pageId })
+→ If SSO login form: continue to step 3
+→ If content is already visible: skip to step 5
 
 3. SSO login interaction:
--
--
--
--
+- Username/email field → browser({ action: 'act', pageId, kind: 'type', ref: usernameRef, text: userEmail })
+- Password field → browser({ action: 'act', pageId, kind: 'type', ref: passwordRef, text: userPassword })
+- Submit button → browser({ action: 'act', pageId, kind: 'click', ref: signInButtonRef })
+- Ask the user for credentials first. Never guess.
 
-4. Handle
--
--
-- If 2FA prompt appears → ask user for code → type_in_page
+4. Handle redirect chain:
+- Re-run \`browser({ action: 'read', pageId })\` after redirects
+- If 2FA prompt appears, ask the user for the code and enter it with \`kind: 'type'\`
 
 5. Extract content:
--
--
--
+- \`browser({ action: 'read', pageId })\` for accessible text
+- \`browser({ action: 'eval', ... })\` for targeted extraction
+- \`browser({ action: 'screenshot', pageId })\` for visual capture
 \`\`\`
 
 ## Pattern 2: OAuth Consent Flow
 
-**Problem:** Service requires OAuth consent that
+**Problem:** Service requires OAuth consent that cannot be completed in CLI.
 
 **Solution:**
 \`\`\`
-1.
+1. const { pageId } = await browser({ action: 'open', url: oauthAuthorizeUrl, mode: 'ui' })
 
-2.
+2. await browser({ action: 'read', pageId })
 → Find the "Authorize" / "Allow" / "Grant access" button
 
-3.
+3. await browser({ action: 'act', pageId, kind: 'click', ref: authorizeButtonRef })
 
-4.
-→ URL now contains ?code=abc123
+4. await browser({ action: 'read', pageId })
+→ URL now contains ?code=abc123 or the consent flow is complete
 
-5. Extract the
-
-pageId,
-code: 'return page.url()'
-})
-→ Parse the authorization code from the URL
+5. Extract the final URL when needed:
+await browser({ action: 'eval', pageId, code: 'return page.url()' })
 
-6. Return code to CLI workflow
+6. Return the authorization code or completed session to the CLI workflow
 \`\`\`
 
 ## Pattern 3: 2FA / MFA Challenge
 
-**Problem:** Login requires 2FA code that only the user can provide.
+**Problem:** Login requires a 2FA code that only the user can provide.
 
-**CRITICAL:**
+**CRITICAL:** Never bypass 2FA and never guess codes.
 
 **Solution:**
 \`\`\`
-1. Complete username/password entry
-
-2. read_page({ pageId })
-→ Page shows 2FA input field
+1. Complete username/password entry from Pattern 1
 
-
+2. await browser({ action: 'read', pageId })
+→ Confirm the page shows a 2FA input field
 
-
+3. Ask the user for the code via elicitation
 
-
-
+4. await browser({ action: 'act', pageId, kind: 'type', ref: totpInputRef, text: userProvidedCode })
+5. await browser({ action: 'act', pageId, kind: 'press', key: 'Enter' })
 
-6.
-→ Verify
+6. await browser({ action: 'read', pageId })
+→ Verify the page shows authenticated content, not the login form
 \`\`\`
 
-## Pattern 4: Cookie
+## Pattern 4: Cookie or Token Transfer
 
-**Problem:**
+**Problem:** CLI tools need authenticated session state from the browser.
 
 **Solution:**
 \`\`\`
-1. Complete login flow
-
-2.
-
-
-
-
-
-
-c.name.includes('auth') ||
-c.name.includes('token')
-)
-\\\`
-})
-
-3. Use extracted cookie values in http tool:
-http({
-url: apiEndpoint,
-headers: { "Cookie": "session=<extracted-value>" }
-})
-
-4. WARNING: Tell the user these tokens are ephemeral and will expire.
-NEVER store them in code, commits, or logs.
+1. Complete login flow first
+
+2. Export cookies only if the user explicitly asked for session transfer:
+await browser({ action: 'session', sessionAction: 'cookies', pageId })
+
+3. Use the returned cookie data with CLI tools or \`http\` as needed
+
+4. Tell the user the cookies are sensitive and ephemeral.
+Never commit, log, or persist them in source files.
 \`\`\`
 
-## Pattern 5: Content Behind Login Wall
+## Pattern 5: Content Behind a Login Wall
 
-**Problem:** \`web_fetch\` returns a login page instead of content.
+**Problem:** \`web_fetch\` returns a login page instead of the target content.
 
 **Solution:**
 \`\`\`
-1.
+1. const { pageId } = await browser({ action: 'open', url: targetUrl, mode: 'ui' })
 
-2.
-→
+2. await browser({ action: 'read', pageId })
+→ Confirm login form is visible
 
-3. Ask user for credentials
+3. Ask the user for credentials
 
-4. Fill and submit
-
-
-
+4. Fill and submit the form:
+- browser({ action: 'act', pageId, kind: 'type', ref: usernameRef, text: userEmail })
+- browser({ action: 'act', pageId, kind: 'type', ref: passwordRef, text: userPassword })
+- browser({ action: 'act', pageId, kind: 'click', ref: loginButtonRef })
 
 5. Handle post-login challenges:
-
-
-
-
-
-6. Extract the content:
-read_page({ pageId }) → accessible text
-run_playwright_code → targeted extraction
-screenshot_page → visual capture
+- 2FA → Pattern 3
+- Consent screen → Pattern 2
+- Success → continue
+
+6. Extract content with \`read\`, \`eval\`, or \`screenshot\`
 \`\`\`
 
 ## Pattern 6: CAPTCHA Handling
 
-**Problem:** Target site shows CAPTCHA challenge.
+**Problem:** Target site shows a CAPTCHA or anti-bot challenge.
 
 **Detection signals:**
-- "Checking your browser..."
-- reCAPTCHA
+- "Checking your browser..."
+- reCAPTCHA, hCaptcha, or Turnstile widgets
 - "Please verify you are human"
 
 **Solution:**
 \`\`\`
-1.
+1. const { pageId } = await browser({ action: 'open', url: targetUrl, mode: 'ui' })
 
-2.
-
+2. Inspect with:
+- browser({ action: 'read', pageId })
+- browser({ action: 'screenshot', pageId })
 
-3.
-"A CAPTCHA challenge appeared on the page. Please solve it
-in the browser panel, then let me know when done."
+3. Ask the user to solve the CAPTCHA in the browser window or panel
 
-4. After user confirms:
-
-→ Content should now be accessible
+4. After the user confirms, continue with:
+browser({ action: 'read', pageId })
 
-5. If CAPTCHA
-automation. Report to user and suggest manual access.
+5. If the CAPTCHA loops, report that manual access is required
 \`\`\`
 
-**Key rule:** NEVER attempt to solve CAPTCHAs programmatically. Always ask the user.
-
 ## Security Reminders
 
-- Always ask the user for credentials
--
-- Tell the user when you extract auth tokens and that they expire
+- Always ask the user for credentials and 2FA codes; never guess or reuse hidden values
+- Exported cookies or tokens are secrets; never log, store, or commit them
 - Confirm before submitting forms or performing irreversible actions
-- Close authenticated
--
+- Close authenticated pages when the task is complete: \`browser({ action: 'session', sessionAction: 'close', pageId })\`
+- Respect allowlisting, sandboxing, and any runtime security denial from the browser tool
 `}];export{e as default};
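The added skill text describes a single `browser({ action: ... })` entry point and an OPEN → READ → ACT → READ loop. A minimal sketch of that dispatch shape follows, using an in-memory stand-in rather than the real runtime. Everything beyond the action names and the `{ pageId }` result of `open` is an assumption made for illustration: `createFakeBrowser`, `requirePage`, `runLoginLoop`, and all return shapes are hypothetical and do not come from the package.

```javascript
// Hypothetical in-memory stand-in for the `browser` tool described in the diff.
// The real runtime's return shapes are not shown there; these are assumed.
const ACTIONS = ['open', 'read', 'act', 'navigate', 'eval', 'screenshot', 'dialog', 'session'];

function requirePage(pages, pageId) {
  const page = pages.get(pageId);
  if (!page) throw new Error(`No active page for pageId: ${pageId}`);
  return page;
}

function createFakeBrowser() {
  const pages = new Map(); // pageId -> { url, log }
  let nextId = 1;

  return async function browser(request) {
    if (!ACTIONS.includes(request.action)) {
      throw new Error(`Unknown action: ${request.action}`);
    }
    switch (request.action) {
      case 'open': {
        const pageId = `page-${nextId++}`;
        pages.set(pageId, { url: request.url, log: [] });
        return { pageId };
      }
      case 'read': {
        const page = requirePage(pages, request.pageId);
        return { url: page.url, log: [...page.log] };
      }
      case 'act': {
        const page = requirePage(pages, request.pageId);
        // Record the interaction instead of driving a real page.
        page.log.push(`${request.kind}:${request.ref ?? request.selector ?? ''}`);
        return { ok: true };
      }
      case 'session': {
        if (request.sessionAction === 'list') return { pages: [...pages.keys()] };
        if (request.sessionAction === 'close') {
          requirePage(pages, request.pageId);
          pages.delete(request.pageId);
          return { ok: true };
        }
        throw new Error(`sessionAction not modeled in this sketch: ${request.sessionAction}`);
      }
      default:
        // navigate, eval, screenshot, and dialog are left out of the sketch.
        throw new Error(`Action not modeled in this sketch: ${request.action}`);
    }
  };
}

// OPEN -> READ -> ACT -> READ, then close the page once the task is done.
async function runLoginLoop(browser) {
  const { pageId } = await browser({ action: 'open', url: 'https://example.com/login', mode: 'ui' });
  await browser({ action: 'read', pageId });
  await browser({ action: 'act', pageId, kind: 'click', ref: '@login-button' });
  const after = await browser({ action: 'read', pageId });
  await browser({ action: 'session', sessionAction: 'close', pageId });
  const remaining = await browser({ action: 'session', sessionAction: 'list' });
  return { after, remaining };
}
```

Calling `runLoginLoop(createFakeBrowser())` walks the loop once and leaves no open pages, which mirrors the skill's advice to re-read after each action and to close authenticated pages when the task is complete.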