@vpxa/aikit 0.1.307 → 0.1.309

Files changed (45)
  1. package/package.json +1 -1
  2. package/packages/cli/dist/index.js +4 -4
  3. package/packages/cli/dist/{init-CyjUXjQw.js → init-VP9ig7OK.js} +1 -1
  4. package/packages/cli/dist/{templates-BQ1J4HzY.js → templates-WsJg6Pkc.js} +5 -5
  5. package/packages/server/dist/bin.js +1 -1
  6. package/packages/server/dist/index.js +1 -1
  7. package/packages/server/dist/repair-json-B6Q_HRoP.js +3 -0
  8. package/packages/server/dist/repair-json-D4mft_HA.js +4 -0
  9. package/packages/server/dist/{server-B_KbLM43.js → server-DZKWh8ZG.js} +176 -170
  10. package/packages/server/dist/{server-utMi-Qu3.js → server-RV1UYywi.js} +177 -169
  11. package/packages/server/dist/{server-http-B-TDT3t-.js → server-http-DeWcQphZ.js} +1 -1
  12. package/packages/server/dist/{server-http-BbuuthEP.js → server-http-Dk16rq4T.js} +1 -1
  13. package/packages/server/dist/server-stdio-Bx_Aa99F.js +1 -0
  14. package/packages/server/dist/server-stdio-CebgeeBc.js +2 -0
  15. package/packages/server/dist/{version-check-DSWaugPC.js → version-check-CdBHTxtt.js} +1 -1
  16. package/packages/server/dist/{version-check-6qDKknz4.js → version-check-CggUKvv8.js} +1 -1
  17. package/scaffold/INSTRUCTIONS.md +273 -0
  18. package/scaffold/dist/adapters/copilot.mjs +2 -9
  19. package/scaffold/dist/adapters/hermes-agent.mjs +2 -2
  20. package/scaffold/dist/adapters/hermes.mjs +8 -4
  21. package/scaffold/dist/adapters/hooks.mjs +1 -1
  22. package/scaffold/dist/adapters/intellij.mjs +7 -3
  23. package/scaffold/dist/adapters/skills.mjs +3 -1
  24. package/scaffold/dist/adapters/zed.mjs +6 -2
  25. package/scaffold/dist/definitions/agents.mjs +2 -2
  26. package/scaffold/dist/definitions/bodies.mjs +98 -369
  27. package/scaffold/dist/definitions/flows.mjs +6 -6
  28. package/scaffold/dist/definitions/prompts.mjs +12 -12
  29. package/scaffold/dist/definitions/protocols.mjs +117 -556
  30. package/scaffold/dist/definitions/skills/adr-skill.mjs +41 -197
  31. package/scaffold/dist/definitions/skills/aikit.mjs +52 -205
  32. package/scaffold/dist/definitions/skills/brainstorming.mjs +74 -112
  33. package/scaffold/dist/definitions/skills/browser-use.mjs +128 -184
  34. package/scaffold/dist/definitions/skills/c4-architecture.mjs +45 -106
  35. package/scaffold/dist/definitions/skills/docs.mjs +236 -380
  36. package/scaffold/dist/definitions/skills/frontend-design.mjs +96 -193
  37. package/scaffold/dist/definitions/skills/lesson-learned.mjs +57 -184
  38. package/scaffold/dist/definitions/skills/multi-agents-development.mjs +98 -408
  39. package/scaffold/dist/definitions/skills/present.mjs +193 -1
  40. package/scaffold/dist/definitions/skills/react.mjs +68 -111
  41. package/scaffold/dist/definitions/skills/repo-access.mjs +24 -169
  42. package/scaffold/dist/definitions/skills/requirements-clarity.mjs +45 -94
  43. package/scaffold/dist/definitions/skills/typescript.mjs +162 -230
  44. package/packages/server/dist/server-stdio-BUb39kqq.js +0 -2
  45. package/packages/server/dist/server-stdio-Ch7yAxNk.js +0 -1
@@ -14,44 +14,51 @@ argument-hint: "Feature, component, or behavior to design"
14
14
 
15
15
  # Brainstorming Ideas Into Designs
16
16
 
17
- ## Quick Reference
17
+ Use before implementation when multiple viable approaches exist, architecture/product tradeoffs matter, or user asks for guidance. Skip for mechanical changes and code-agent execution.
18
18
 
19
- **Purpose:** Explore requirements and design space before implementation starts.
19
+ ## Guardrail
20
20
 
21
- **Use this skill when:** there are multiple viable approaches, architecture trade-offs matter, or the user is asking for guidance rather than code.
21
+ No code/scaffold handoff until design is clear enough for implementation. Planning agents use this; implementation agents request context instead.
22
22
 
23
- **Skip this skill when:** the work is already decided, the change is mechanical, or only one reasonable path exists.
23
+ ## Flow
24
24
 
25
- **HARD GATE:** Do NOT write code, scaffold files, or hand off to implementation until a design is presented and approved.
25
+ 1. Frame problem: goal, users, constraints, non-goals, success signal.
26
+ 2. Ask Why? and Simpler? unless already answered.
27
+ 3. Generate 2-4 distinct options with tradeoffs.
28
+ 4. Stress test: risks, dependencies, unknowns, reversibility, cost.
29
+ 5. Recommend one path with confidence and first step.
30
+ 6. Hand off concise design to Planner/Orchestrator.
26
31
 
27
- **Expected output:** a short design or spec with constraints, alternatives, recommendation, risks, and acceptance criteria.
32
+ ## Option Shape
28
33
 
29
- <HARD-GATE>
30
- Do NOT invoke any implementation skill, write any code, scaffold any project, or take any implementation action until you have presented a design and the user has approved it. This applies to EVERY project regardless of perceived simplicity.
31
- </HARD-GATE>
34
+ Each option includes:
35
+ - Name.
36
+ - How it works.
37
+ - Best fit.
38
+ - Costs/risks.
39
+ - What must be true.
40
+ - Verification path.
32
41
 
33
- ## When Brainstorming Is Appropriate
42
+ ## Quality Bar
34
43
 
35
- Use brainstorming when at least one of these is true:
44
+ Good brainstorming narrows choices. Avoid idea dumps. Surface disagreement, hidden constraints, and cheaper alternatives. If requirements are too vague, load requirements-clarity and score before design.
36
45
 
37
- - Multiple valid approaches exist and the choice changes cost, risk, or UX
38
- - The work introduces or changes boundaries: service, module, API, workflow, ownership
39
- - The user is asking for options, strategy, trade-offs, or "best way" guidance
40
- - Constraints are real but incomplete: performance, timeline, migration cost, reversibility, team familiarity
46
+ ## References
41
47
 
42
- Do not use brainstorming when the path is already obvious:
48
+ Load on demand:
49
+ - references/mode-selection.md — Simple vs Full mode selection criteria and examples.
50
+ - references/decision-protocol.md — Multi-Researcher and Single-Agent decision protocol details.
51
+ - references/design-quality.md — Review checklist, quality signals, and output contract formats.
52
+ - references/spec-document-reviewer-prompt.md — Prompt template for dispatching spec document review.
43
53
 
44
- - Fixing a typo, renaming a variable, or updating a pinned dependency
45
- - Implementing an already-approved spec or ADR
46
- - Mechanical migrations where the desired end state is predetermined
54
+ ## Output
47
55
 
48
- ---
49
-
50
- ## Mode Selection
56
+ Use short sections: Problem, Constraints, Options, Recommendation, Open Questions, Handoff Notes. Include decision confidence: low/medium/high.
57
+ `},{file:`references/mode-selection.md`,content:`# Mode Selection
51
58
 
52
- This skill operates in two modes. You choose the mode from the problem shape; do not ask the user which mode they want.
59
+ The brainstorming skill operates in two modes. Choose the mode from the problem shape; do not ask the user which mode they want.
53
60
 
54
- ### Simple Mode
61
+ ## Simple Mode
55
62
 
56
63
  Use when **all** of these are true:
57
64
  - Affects ≤3 files
@@ -62,7 +69,9 @@ Use when **all** of these are true:
62
69
 
63
70
  Examples: config change, utility function, bug fix with design ambiguity, small feature in existing component.
64
71
 
65
- ### Full Mode
72
+ Output: concise design note with Goal, Constraints, Options, Recommendation, Risks.
73
+
74
+ ## Full Mode
66
75
 
67
76
  Use when **any** of these are true:
68
77
  - Affects >3 files or crosses service boundaries
@@ -74,65 +83,38 @@ Use when **any** of these are true:
74
83
 
75
84
  Examples: new notification channel, new API service, platform redesign, multi-service feature, infrastructure changes.
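Reviewer note: the all/any split between Simple Mode and Full Mode can be sketched as a small predicate. This is a minimal sketch, not code from the package. It models only the two criteria visible in this diff (file count and service boundaries); the remaining criteria are elided between hunks and are left out rather than guessed. The function name `selectMode` and its parameter names are hypothetical.

```javascript
// Hypothetical helper: models only the criteria visible in this diff.
// The skill lists further criteria that are elided between hunks.
function selectMode({ filesAffected, crossesServiceBoundary }) {
  // Full Mode applies when ANY Full criterion holds.
  if (filesAffected > 3 || crossesServiceBoundary) {
    return "full";
  }
  // Simple Mode requires ALL Simple criteria; here that is just the file count.
  return "simple";
}

console.log(selectMode({ filesAffected: 2, crossesServiceBoundary: false })); // simple
console.log(selectMode({ filesAffected: 2, crossesServiceBoundary: true }));  // full
console.log(selectMode({ filesAffected: 7, crossesServiceBoundary: false })); // full
```

Checking the Full criteria first keeps the "any" rule dominant: a two-file change that crosses a service boundary still lands in Full Mode.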
76
85
 
77
- ---
78
-
79
- ## NEVER
80
-
81
- - **NEVER skip divergent thinking to jump to solutions** - premature convergence is the most common brainstorming failure. Force at least 3 alternatives before evaluating.
82
- - **NEVER present options without trade-off analysis** - listing A/B/C without pros, cons, and context is noise. Each option needs: strength, weakness, best-when.
83
- - **NEVER conflate requirements with design** - "the user wants to log in" is a requirement; "use OAuth2 + JWT" is a design. Clarify the problem before exploring the solution space.
84
- - **NEVER evaluate during the divergent phase** - "that won't work" too early kills useful exploration. Generate first, judge second.
85
- - **NEVER propose more than 5 options** - too many options create analysis paralysis. 3-5 well-differentiated choices is enough.
86
- - **NEVER present the obvious choice as the only real option with strawman alternatives** - if one option clearly dominates before analysis, say brainstorming is unnecessary and recommend directly.
87
- - **NEVER use brainstorming for single-path decisions** - if there is only one credible path, execute or hand off; do not manufacture fake alternatives.
88
- - **NEVER hide the deciding constraint** - if compliance, migration cost, latency, or org ownership is driving the decision, state it explicitly.
89
-
90
- ## Core Workflow
91
-
92
- | Stage | What you do | What good output looks like |
93
- |-------|-------------|-----------------------------|
94
- | Frame | State the problem, constraints, and success criteria | Clear decision question and explicit boundaries |
95
- | Clarify | Ask one question at a time until ambiguity drops | Requirements are separate from solutions |
96
- | Diverge | Generate 3 alternatives that differ in shape, not just labels | Options have meaningfully different trade-offs |
97
- | Evaluate | Compare options against criteria, constraints, reversibility, and delivery cost | Rejection reasons are explicit |
98
- | Recommend | Pick one option and explain why the others lost | Recommendation matches the stated constraints |
99
- | Confirm | Present the design, get approval, then hand off | Implementation can proceed without reopening core design questions |
100
-
101
- Simple Mode compresses the workflow into a short design note. Full Mode produces a fuller spec and spends more time on constraints, interfaces, migration, and rollout.
102
-
103
- ---
104
-
105
- ## Decision Protocol
86
+ Output: spec covering boundaries, interfaces, data flow, migration, error handling, test strategy.
106
87
 
107
- Use the heaviest process the problem actually needs. The protocol scales with decision complexity; it does not assume a giant agent swarm.
88
+ ## Large Project Decomposition
108
89
 
109
- ### Multi-Researcher Mode
90
+ Before asking detailed questions, assess scope. If the request includes multiple independent subsystems, decompose first. Do not brainstorm the whole platform as one blob. Break into sub-projects, define order, then brainstorm the first sub-project only.
91
+ `},{file:`references/decision-protocol.md`,content:`# Decision Protocol
110
92
 
111
- Use this when the decision is high-impact and the Orchestrator can actually dispatch multiple research passes.
93
+ Use the heaviest process the problem actually needs. The protocol scales with decision complexity.
112
94
 
113
- 1. **Independent passes** - gather 2-4 perspectives if available. More is optional, not mandatory.
114
- 2. **Cross-check blind spots** - compare where the perspectives agree, clash, and what each missed.
115
- 3. **Structured verdict** - produce one recommendation with confidence, deciding constraints, rejected alternatives, and first implementation step.
95
+ ## Multi-Researcher Mode
116
96
 
117
- Do not block waiting for an 8-agent protocol. If capacity is limited, reduce the number of perspectives and keep the structure.
97
+ Use when the decision is high-impact and you can dispatch multiple research passes.
118
98
 
119
- ### Single-Agent Fallback
99
+ 1. **Independent passes** — gather 2-4 perspectives if available. More is optional, not mandatory.
100
+ 2. **Cross-check blind spots** — compare where the perspectives agree, clash, and what each missed.
101
+ 3. **Structured verdict** — produce one recommendation with confidence, deciding constraints, rejected alternatives, and first implementation step.
120
102
 
121
- When running as a single agent (no Orchestrator dispatching multi-model research), use this structured approach:
103
+ Do not block waiting for an 8-agent protocol. If capacity is limited, reduce perspectives and keep the structure.
122
104
 
123
- 1. **Frame the decision space** - state the question, list constraints, define success criteria
124
- 2. **Generate 3 alternatives** - force diversity by thinking from different lenses:
125
- - Lens A: Simplest approach (minimum viable, least risk)
126
- - Lens B: Best long-term approach (might be more work now, pays off later)
127
- - Lens C: Unconventional approach (different framing, different trade-off)
128
- 3. **Evaluate against criteria** - for each alternative, score against the stated success criteria
129
- 4. **Recommend with rationale** - pick one, explain WHY, acknowledge what you're trading away
105
+ ## Single-Agent Fallback
130
106
 
131
- This replaces the full Multi-Model Decision Protocol when running solo.
107
+ When running as a single agent (no multi-model research dispatch):
132
108
 
133
- ### Verdict Format
109
+ 1. **Frame the decision space** — state the question, list constraints, define success criteria.
110
+ 2. **Generate 3 alternatives** — force diversity by thinking from different lenses:
111
+ - Lens A: Simplest approach (minimum viable, least risk)
112
+ - Lens B: Best long-term approach (more work now, pays off later)
113
+ - Lens C: Unconventional approach (different framing, different trade-off)
114
+ 3. **Evaluate against criteria** — for each alternative, score against the stated success criteria.
115
+ 4. **Recommend with rationale** — pick one, explain WHY, acknowledge what you are trading away.
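Reviewer note: steps 2 to 4 of the Single-Agent Fallback (three lenses, score against criteria, recommend and name what lost) could look roughly like the sketch below. The skill text does not prescribe a scoring method; the weighted sum, the criteria names, the weights, and the scores are all assumptions made up for illustration.

```javascript
// Illustrative only: criteria, integer weights, and 1-5 scores are invented.
const criteria = { simplicity: 5, longTermFit: 3, reversibility: 2 };

const alternatives = [
  { name: "Lens A: simplest", scores: { simplicity: 5, longTermFit: 2, reversibility: 5 } },
  { name: "Lens B: best long-term", scores: { simplicity: 2, longTermFit: 5, reversibility: 3 } },
  { name: "Lens C: unconventional", scores: { simplicity: 3, longTermFit: 4, reversibility: 1 } },
];

// Weighted sum of one alternative's scores over the stated criteria.
function weightedScore(alt) {
  return Object.entries(criteria).reduce(
    (sum, [criterion, weight]) => sum + weight * alt.scores[criterion],
    0
  );
}

// Rank all alternatives, return the winner plus the rejected options.
function recommend(alts) {
  const ranked = alts
    .map((alt) => ({ name: alt.name, total: weightedScore(alt) }))
    .sort((x, y) => y.total - x.total);
  const [winner, ...rejected] = ranked;
  return {
    recommendation: winner.name,
    total: winner.total,
    rejected: rejected.map((r) => ({ name: r.name, total: r.total })),
  };
}

console.log(JSON.stringify(recommend(alternatives), null, 2));
```

Keeping the rejected options in the result mirrors the Rejections row of the verdict table: the numbers alone do not explain why an option lost, so the written rationale is still required.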
134
116
 
135
- Express the result in structured text, not diagrams:
117
+ ## Verdict Format
136
118
 
137
119
  | Section | Required content |
138
120
  |---------|------------------|
@@ -143,30 +125,25 @@ Express the result in structured text, not diagrams:
143
125
  | Rejections | Why the other options lost |
144
126
  | Risks | What could fail and how reversible it is |
145
127
  | First Step | Smallest useful next move after approval |
146
-
147
- ### Large Project Decomposition
148
-
149
- Before asking detailed questions, assess scope. If the request includes multiple independent subsystems, decompose first. Do not brainstorm the whole platform as one blob.
150
-
151
- If the project is too large for a single spec, break it into sub-projects, define the order, then brainstorm the first sub-project only.
128
+ `},{file:`references/design-quality.md`,content:`# Design Quality
152
129
 
153
130
  ## Design Quality Signals
154
131
 
155
- - **Options are genuinely different** - if they only differ in naming, you have not explored the space.
156
- - **Trade-offs are real** - if one option dominates on all dimensions, the others are strawmen.
157
- - **Constraints are explicit** - "we chose X because of Y constraint" is good; "X is better" is lazy.
158
- - **Reversibility is assessed** - one-way doors need more analysis than two-way doors.
159
- - **You can explain why NOT the other options** - if you cannot articulate the rejection reason, the analysis is shallow.
132
+ - **Options are genuinely different** if they only differ in naming, you have not explored the space.
133
+ - **Trade-offs are real** if one option dominates on all dimensions, the others are strawmen.
134
+ - **Constraints are explicit** "we chose X because of Y constraint" is good; "X is better" is lazy.
135
+ - **Reversibility is assessed** one-way doors need more analysis than two-way doors.
136
+ - **You can explain why NOT the other options** if you cannot articulate the rejection reason, the analysis is shallow.
160
137
 
161
138
  ## Design Review Checklist
162
139
 
163
- Before presenting the design, verify these quality bars:
140
+ Before presenting the design, verify:
164
141
 
165
- - **Completeness** - no TODOs, placeholders, or missing constraints that block planning.
166
- - **Consistency** - sections do not contradict each other.
167
- - **Precision** - two competent developers would implement compatible solutions from this spec.
168
- - **Scope control** - the design solves the asked problem and avoids unrequested extras.
169
- - **Canonical language** - names and domain terms match the codebase or are explicitly introduced.
142
+ - **Completeness** no TODOs, placeholders, or missing constraints that block planning.
143
+ - **Consistency** sections do not contradict each other.
144
+ - **Precision** two competent developers would implement compatible solutions from this spec.
145
+ - **Scope control** the design solves the asked problem and avoids unrequested extras.
146
+ - **Canonical language** names and domain terms match the codebase or are explicitly introduced.
170
147
 
171
148
  Fix serious issues before handoff. Minor wording changes are advisory, not blockers.
172
149
 
@@ -178,28 +155,13 @@ Fix serious issues before handoff. Minor wording changes are advisory, not block
178
155
  - Separate divergent generation from evaluation.
179
156
  - Lead with your recommendation once the option space is explored.
180
157
  - Stay concrete: boundaries, interfaces, data flow, failure modes, rollout.
181
-
182
- ## Output Contract
183
-
184
- For Simple Mode, produce a concise design note with:
185
-
186
- - Goal
187
- - Constraints
188
- - 3 options
189
- - Recommendation
190
- - Risks and acceptance criteria
191
-
192
- For Full Mode, produce a spec that also covers:
193
-
194
- - Boundaries and component responsibilities
195
- - Interfaces and data flow
196
- - Migration or rollout strategy
197
- - Error handling and operational risks
198
- - Test and acceptance strategy
199
-
200
- When useful, save the approved design to \`docs/plans/YYYY-MM-DD-<topic>-design.md\` before handing off to implementation planning.
201
-
202
- `},{file:`spec-document-reviewer-prompt.md`,content:`# Spec Document Reviewer Prompt Template
158
+ - NEVER skip divergent thinking to jump to solutions. Force at least 3 alternatives before evaluating.
159
+ - NEVER present options without trade-off analysis.
160
+ - NEVER conflate requirements with design.
161
+ - NEVER evaluate during the divergent phase.
162
+ - NEVER propose more than 5 options (3-5 is enough).
163
+ - NEVER hide the deciding constraint.
164
+ `},{file:`references/spec-document-reviewer-prompt.md`,content:`# Spec Document Reviewer Prompt Template
203
165
 
204
166
  Use this template when dispatching a spec document reviewer subagent.
205
167
 
@@ -1,6 +1,6 @@
1
1
  var e=[{file:`SKILL.md`,content:`---
2
2
  name: browser-use
3
- description: "Browser automation for AI agents using AI Kit's owned \`browser\` MCP tool. Triggered when: (1) repo-access exhausts its Strategy Ladder and auth requires browser interaction, (2) \`web_fetch\` returns login page HTML, SAML redirect, or CAPTCHA instead of content, (3) user needs to interact with web applications (fill forms, click buttons, extract data), (4) a site requires JavaScript rendering that \`web_fetch\` cannot handle, (5) user asks to browse, scrape, test, or automate a website, or (6) another skill needs a standard recipe format for browser-driven workflows. Uses AI Kit's owned Chromium runtime and recipe patterns for domain-specific automation skills — no external MCP server dependency."
3
+ description: "Browser automation for AI agents using AI Kit's owned browser MCP tool. Triggered when repo-access exhausts its Strategy Ladder and auth requires browser interaction; web_fetch returns login page HTML, SAML redirect, or CAPTCHA; user needs web app interaction; site requires JavaScript rendering; user asks to browse, scrape, test, or automate a website; or another skill needs browser recipes. Uses AI Kit's owned Chromium runtime."
4
4
  metadata:
5
5
  category: cross-cutting
6
6
  domain: general
@@ -12,204 +12,59 @@ metadata:
12
12
  argument-hint: "URL or browser task description"
13
13
  ---
14
14
 
15
- # Browser Automation for AI Agents
15
+ # Browser Automation
16
16
 
17
- Use AI Kit's \`browser\` MCP tool for authentication barriers, data extraction, form interactions, network capture, and web automation. Single tool, action-based dispatch, owned Chromium runtime.
17
+ Use AI Kit browser for JS-rendered pages, auth barriers, forms, screenshots, network capture, storage/cookies, and web app verification.
18
18
 
19
- ## Runtime Preference
19
+ ## Tool Choice
20
20
 
21
- - Always use AI Kit's controlled Chromium when the agent needs to inspect, verify, or interact with a page.
22
- - Use \`mode: 'headless'\` by default. Switch to \`mode: 'ui'\` only for user-visible auth or visual debugging.
23
- - Never use system browser commands for agent work. They provide no programmatic feedback.
21
+ Start with web_fetch/http for static content. Use browser when:
22
+ - Page needs JS rendering or interaction.
23
+ - Auth/SSO/CAPTCHA/login wall blocks fetch.
24
+ - User asks to fill forms, click, scrape, inspect, or test UI.
25
+ - Need screenshot/canvas/visual verification.
24
26
 
25
- ## Quick Reference
26
-
27
- | I need to... | Action | Key params |
28
- |---|---|---|
29
- | Open a page | \`open\` | url, mode (ui/headless) |
30
- | Read page content | \`read\` | readMode: snapshot/dom/markdown/text |
31
- | Click/type/interact | \`act\` | kind: click/type/press/hover/select |
32
- | Wait for something | \`navigate\` | type: waitFor, selector |
33
- | Check network calls | \`network\` | subAction: enable → get |
34
- | Get cookies/storage | \`session\` | sessionAction: cookies/get-storage |
35
- | Take screenshot | \`screenshot\` | fullPage, selector |
36
- | Compare changes | \`diff\` | (compares to previous snapshot) |
37
-
38
- For full parameter details: \`describe_tool('browser')\`
39
-
40
- ## Principles
41
-
42
- - **Prefer \`read\` over \`screenshot\`** — snapshots are structured (ARIA tree), searchable, and token-efficient. Screenshots are opaque blobs. Use screenshots only for visual verification.
43
- - **Prefer \`headless\` over \`ui\`** — faster, no window management. Use \`ui\` only when: user needs to see the browser, auth requires manual interaction, or debugging visual issues.
44
- - **Always \`read\` after \`act\`** — actions don't return page state. You need to verify the result.
45
- - **Use \`diff\` instead of re-reading** — after an action, \`diff\` shows only what changed. Much more efficient than full \`read\`.
46
- - **Network capture BEFORE navigation** — enable network capture THEN navigate. Captures start from enable time, not retroactively.
47
- - **One page = one task** — don't reuse pages across unrelated tasks. Fresh pages avoid state contamination.
48
- - **Read snapshot before targeting** — ARIA refs are more stable than guessing CSS selectors or text matches.
49
- - **Use page-context fetch for authenticated APIs** — if the browser session already has cookies and CSRF state, \`fetch\` is usually simpler than exporting cookies.
50
-
51
- ## NEVER
52
-
53
- - **NEVER use \`file:///\` URLs** — the browser blocks local file access for security. Serve locally instead: \`npx -y serve <dir>\` then open \`http://localhost:<port>\`.
54
- - **NEVER interact without reading first** — you need the ARIA tree to know what elements exist. Blind clicks fail.
55
- - **NEVER send passwords via \`act({ kind: 'type' })\`** — tell the user to type credentials manually. Agent should never handle secrets.
56
- - **NEVER use \`screenshot\` as primary information source** — screenshots waste tokens and can't be searched. Use \`read\` with appropriate readMode.
57
- - **NEVER open system browser (\`Start-Process\`, \`open\`, \`xdg-open\`)** — provides zero feedback to the agent. Always use the owned browser.
58
- - **NEVER leave pages open** — close pages when done: \`session({ sessionAction: 'close', pageId })\`. Leaked pages consume resources.
59
- - **NEVER scrape without rate limiting** — rapid page loads trigger bot detection. Add reasonable delays between navigations.
60
- - **NEVER enable network capture after the event you care about** — you can't recover missed requests.
61
-
62
- ## Activation Signals
63
-
64
- - Activate when \`web_fetch\` returns login HTML, SAML redirects, CAPTCHA pages, or JS-heavy shells with no readable content.
65
- - Activate when \`http\` returns 401/403/407 and browser auth is a plausible recovery path.
66
- - Activate when the task requires interaction, screenshots, network capture, authenticated browser-session fetches, or previewing a locally served HTML file.
67
- - Skip it when \`web_fetch\` or \`http\` already gives the answer.
68
- - The \`browser\` tool is always callable directly. This skill exists for recipes and operating discipline, not for basic availability.
69
-
70
- ## Workflows
71
-
72
- **Two modes:**
73
- - **Script Mode** — direct sequential \`browser()\` calls for one-off tasks, debugging, and authenticated API capture.
74
- - **Recipe Mode** — reusable labeled step sequences for domain-specific automation.
75
-
76
- ### Script Mode (Default — Imperative)
77
-
78
- Direct sequential \`browser()\` calls. Best for one-off tasks, testing, API capture.
79
-
80
- ~~~text
81
- // Open → Read → Act → Read loop
82
- browser({ action: 'open', url: 'https://app.example.com', mode: 'ui' })
83
- browser({ action: 'read', pageId })
84
- browser({ action: 'act', pageId, kind: 'click', ref: '@login-button' })
85
- browser({ action: 'read', pageId }) // verify state changed
86
- ~~~
87
-
88
- **Network Intelligence pattern:**
89
-
90
- ~~~text
91
- browser({ action: 'network', pageId, subAction: 'enable', filter: { resourceTypes: ['xhr', 'fetch'] } })
92
- // ... navigate/interact to trigger API calls ...
93
- browser({ action: 'network', pageId, subAction: 'get' })
94
- browser({ action: 'network', pageId, subAction: 'export-har' })
95
- ~~~
96
-
97
- **Authenticated API calls (using page cookies/session):**
98
-
99
- ~~~text
100
- browser({ action: 'fetch', pageId, fetchUrl: 'https://app.example.com/api/data', fetchMethod: 'GET' })
101
- ~~~
102
-
103
- Executes \`fetch()\` in the page, so cookies, session state, and CSRF tokens are reused automatically.
104
-
105
- **Console capture:**
106
-
107
- ~~~text
108
- browser({ action: 'console', pageId, consoleSubAction: 'enable' })
109
- // ... trigger page actions ...
110
- browser({ action: 'console', pageId, consoleSubAction: 'get', level: 'error' })
111
- ~~~
112
-
113
- ### Recipe Mode (Declarative)
114
-
115
- Structured step-by-step format for reusable workflows and domain skills. Each step declares Action, Verify, On Failure, and Extract fields.
116
-
117
- Load [references/recipes.md](references/recipes.md) for full recipe templates and the recipe format specification.
118
-
119
- Brief recipe format:
120
-
121
- ~~~text
122
- Step N: <description>
123
- Action: browser({ ... })
124
- Verify: <condition to check after action>
125
- On Failure: <recovery strategy>
126
- Extract: <data to capture for next steps>
127
- ~~~
128
-
129
- ### Element Targeting Priority
130
-
131
- 1. **\`ref\`** (e.g., \`@F12\`) — From \`read(snapshot)\` ARIA tree. Most reliable.
132
- 2. **\`selector\`** (e.g., \`input[name='q']\`) — Playwright CSS/attribute selector. Precise.
133
- 3. **\`element\`** (e.g., \`'Submit'\`) — Text matching via \`text=\` locator. **Picks first DOM match regardless of visibility.** Fragile for complex widgets (comboboxes, ARIA roles). Last resort.
134
-
135
- **Always \`read(snapshot)\` first** to get refs before interacting.
136
-
137
- If a selector times out, assume visibility ambiguity first: narrow the selector, add \`:visible\`, or switch to a snapshot ref.
138
-
139
- ## Network Intelligence
140
-
141
- Use browser-native capture when you need to learn how a web app really talks to its backend:
142
-
143
- - **\`network\`** for passive capture of XHR/fetch traffic, timing, and HAR export.
144
- - **\`console\`** for browser-side errors after UI actions.
145
- - **\`fetch\`** for replaying authenticated requests from page context without manually exporting cookies.
146
-
147
- Headers are redacted by default. Use sensitive output only when the task explicitly requires it and never echo secrets back to the user.
27
+ Use headless by default. Use UI only for user-visible auth or visual debugging. Do not use system browser for agent-visible evidence.
148
28
 
149
- **Workflow — Reverse-engineer API:**
29
+ ## Recipes
150
30
 
151
- ~~~text
152
- 1. open target page
153
- 2. network enable (filter: xhr, fetch)
154
- 3. interact with the page (click buttons, submit forms)
155
- 4. network get → see API endpoints, methods, headers
156
- 5. fetch → replay API calls using page session
157
- ~~~
31
+ Open/read:
32
+ 1. browser open with url and waitUntil.
33
+ 2. browser read snapshot/text/markdown.
34
+ 3. Use eval only for data unavailable through accessibility/DOM reads.
158
35
 
159
- ## Session Management
36
+ Interact:
37
+ 1. Read snapshot.
38
+ 2. Act by stable role/name/ref/selector.
39
+ 3. Re-read or diff.
40
+ 4. Capture screenshot when visual result matters.
160
41
 
161
- - Cookie read/write actions require explicit confirmation.
162
- - Prefer \`fetch\` over cookie export when the goal is an authenticated API call.
163
- - Use labels for long flows, but close the page when the task ends.
42
+ Auth:
43
+ 1. Detect login/SSO/CAPTCHA.
44
+ 2. Ask user only for secrets/actions that cannot be automated safely.
45
+ 3. Export cookies/session only when needed for same task.
46
+ 4. Never print tokens/passwords.
164
47
 
165
- ## Security Model
48
+ Network:
49
+ 1. Enable network capture before action.
50
+ 2. Perform action.
51
+ 3. Inspect filtered requests/responses.
52
+ 4. Include status, URL pattern, and relevant body snippets only.
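Reviewer note: the ordering in the Network recipe (enable capture, act, then inspect) can be illustrated with a stub. The call shapes (`action`, `subAction`, `kind`, `ref`, `filter.resourceTypes`) are copied from the examples removed earlier in this diff. The `browser` function below is a local stand-in that only records calls so the sketch runs on its own; it is not the real MCP tool, and `pageId` is a hypothetical value.

```javascript
// Stub standing in for the browser MCP tool; it records calls and nothing else.
const calls = [];
function browser(params) {
  calls.push(params);
  return { ok: true };
}

const pageId = "page-1"; // hypothetical id; normally obtained from an open call

// 1. Enable capture first: per the removed text, capture starts at enable time,
//    not retroactively.
browser({
  action: "network",
  pageId,
  subAction: "enable",
  filter: { resourceTypes: ["xhr", "fetch"] },
});

// 2. Perform the action that triggers the traffic of interest.
browser({ action: "act", pageId, kind: "click", ref: "@login-button" });

// 3. Inspect the captured requests.
browser({ action: "network", pageId, subAction: "get" });

// Summarize the order in which the stub was called.
console.log(calls.map((c) => c.subAction ?? c.action).join(" -> "));
```

Swapping steps 1 and 2 would leave the capture empty for that click, which is why the recipe lists the enable step first.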
166
53
 
167
- **Hard gates — NEVER bypass:**
168
- - Credentials go via terminal input (NEVER through tool params or chat)
169
- - CAPTCHA/MFA: pause and ask user
170
- - Never store tokens in conversation
171
- - Close pages containing sensitive data when done
172
- - Verify page URL before entering credentials (phishing prevention)
173
- - Use \`headless\` mode for automated non-interactive tasks; \`ui\` for user-supervised auth
54
+ ## Safety
174
55
 
175
- **Cookie safety gate:** All cookie read/write session actions (\`cookies\`, \`set-cookie\`, \`delete-cookie\`, \`clear-cookies\`) require \`confirm: true\` as an explicit acknowledgment. Without it, the tool returns an error.
176
-
177
- ## Local File Preview
56
+ - No destructive web actions without explicit confirmation.
57
+ - No credential exfiltration or secret logging.
58
+ - Respect robots/ToS for scraping; prefer official APIs when available.
59
+ - For payments/admin/delete/send actions, ask clearly and wait.
178
60
 
179
- The browser tool blocks \`file:///\` URLs for security. To preview local HTML files, serve them via a local HTTP server first.
61
+ ## References
180
62
 
181
- **Pattern:**
182
-
183
- ~~~text
184
- // 1. Start local server (pick an unused port)
185
- // Terminal: npx -y serve <directory> -l <port>
186
- // Example: npx -y serve ./dist -l 3847
187
-
188
- // 2. Open in browser
189
- browser({ action: 'open', url: 'http://localhost:3847/my-file.html', mode: 'ui' })
63
+ Load references/recipes.md for detailed browser flows. Load references/auth-patterns.md for SSO/login/cookie handling. Load references/workflows.md for quick reference table, element targeting, script mode, session management, and local file preview.
190
64
 
191
- // 3. Read content or take screenshot
192
- browser({ action: 'read', pageId, readMode: 'markdown' })
193
- browser({ action: 'screenshot', pageId, fullPage: true })
194
-
195
- // 4. Clean up — kill the server terminal when done
196
- ~~~
197
-
198
- **Use cases:**
199
- - Preview generated HTML (viewers, reports, docs)
200
- - Visual regression testing of local builds
201
- - Inspect single-file HTML applications
202
- - Screenshot local pages for review
203
-
204
- **Important:** Always use \`mode: 'ui'\` for visual preview so the user can also see and interact with the page.
205
-
206
- ## High-Value Patterns
65
+ ## Output
207
66
 
208
- - **Dialogs:** register \`dialog\` before the action that triggers it. Prompt dialogs also need \`promptText\`.
209
- - **Labels:** assign a \`label\` on open for long-running flows so later calls stay readable.
210
- - **Batch:** use \`batch\` to reduce round-trips, but only when the needed \`pageId\` is already known.
211
- - **Diff:** first call establishes baseline, second call shows the delta.
212
- - **Preview local HTML:** serve the directory first, then open the localhost URL in the owned browser.
67
+ Report URL, actions taken, evidence, extracted data, screenshots/paths if any, blockers, and next step.
213
68
  `},{file:`references/recipes.md`,content:`# Browser Recipes & Domain Skills
214
69
 
215
70
  Reference file for reusable browser automation patterns. Load this when building domain-specific browser workflows.
@@ -560,4 +415,93 @@ browser({ action: 'eval', pageId, code: 'localStorage.getItem("authToken")' })
560
415
  4. Or use \`navigate\` to move between pages — cookies persist
561
416
 
562
417
  **Important:** The browser auto-closes after idle timeout. For long-running tasks, interact periodically to reset the idle timer.
418
+ `},{file:`references/workflows.md`,content:`# Browser Workflows Reference
419
+
420
+ ## Quick Reference
421
+
422
+ | Need | Action | Key params |
423
+ |---|---|---|
424
+ | Open a page | \`open\` | url, mode (ui/headless) |
425
+ | Read page content | \`read\` | readMode: snapshot/dom/markdown/text |
426
+ | Click/type/interact | \`act\` | kind: click/type/press/hover/select |
427
+ | Wait for something | \`navigate\` | type: waitFor, selector |
428
+ | Check network calls | \`network\` | subAction: enable -> get |
429
+ | Get cookies/storage | \`session\` | sessionAction: cookies/get-storage |
430
+ | Take screenshot | \`screenshot\` | fullPage, selector |
431
+ | Compare changes | \`diff\` | (compares to previous snapshot) |
432
+
433
+ For full parameter details: \`describe_tool('browser')\`
434
+
435
+ ## Principles
436
+
437
+ - Prefer \`read\` over \`screenshot\` — snapshots are structured (ARIA tree), searchable, token-efficient.
438
+ - Always \`read\` after \`act\` — actions don't return page state.
439
+ - Use \`diff\` instead of re-reading after an action — shows only what changed.
440
+ - Enable network capture BEFORE navigation: capture starts at enable time, so requests made earlier are not recorded.
441
+ - One page = one task — fresh pages avoid state contamination.
442
+ - Read snapshot before targeting — ARIA refs are more stable than guessing CSS selectors.
443
+
444
+ ## Element Targeting Priority
445
+
446
+ 1. **\`ref\`** (e.g., \`@F12\`) — From read(snapshot) ARIA tree. Most reliable.
447
+ 2. **\`selector\`** (e.g., \`input[name='q']\`) — Playwright CSS/attribute selector. Precise.
448
+ 3. **\`element\`** (e.g., \`'Submit'\`) — Text matching via text= locator. Picks first DOM match regardless of visibility. Fragile for complex widgets. Last resort.
449
+
450
+ Always read(snapshot) first to get refs before interacting.
451
+
452
+ ## Script Mode (Imperative)
453
+
454
+ Direct sequential browser() calls for one-off tasks and debugging:
455
+
456
+ \`\`\`
457
+ // Open -> Read -> Act -> Read loop
458
+ browser({ action: 'open', url, mode: 'headless' })
459
+ browser({ action: 'read', pageId })
460
+ browser({ action: 'act', pageId, kind: 'click', ref: '@login-button' })
461
+ browser({ action: 'read', pageId }) // verify state changed
462
+ \`\`\`
463
+
464
+ **Network Intelligence pattern:**
465
+ \`\`\`
466
+ browser({ action: 'network', pageId, subAction: 'enable', filter: { resourceTypes: ['xhr', 'fetch'] } })
467
+ // ... navigate/interact ...
468
+ browser({ action: 'network', pageId, subAction: 'get' })
469
+ browser({ action: 'network', pageId, subAction: 'export-har' })
470
+ \`\`\`
471
+
472
+ **Console capture:**
473
+ \`\`\`
474
+ browser({ action: 'console', pageId, consoleSubAction: 'enable' })
475
+ // ... trigger page actions ...
476
+ browser({ action: 'console', pageId, consoleSubAction: 'get', level: 'error' })
477
+ \`\`\`
478
+
479
+ ## Session Management
480
+
481
+ - Cookie read/write requires explicit \`confirm: true\`.
482
+ - Prefer \`fetch\` action over cookie export when the goal is an authenticated API call — reuse page cookies/session directly.
483
+ - Use labels for long flows; close the page when the task ends.
484
+ - Never store extracted cookies in code, commits, or logs. Warn the user that cookies are auth tokens and that they expire.
485
+
486
+ ## Local File Preview
487
+
488
+ The browser tool blocks \`file:///\` URLs. Serve local HTML first:
489
+
490
+ \`\`\`
491
+ // 1. Start local server
492
+ // Terminal: npx -y serve <directory> -l <port>
493
+ // 2. Open in browser
494
+ browser({ action: 'open', url: 'http://localhost:<port>/my-file.html', mode: 'ui' })
495
+ // 3. Read content or screenshot
496
+ browser({ action: 'read', pageId, readMode: 'markdown' })
497
+ // 4. Clean up — kill the server terminal when done
498
+ \`\`\`
499
+
500
+ ## High-Value Patterns
501
+
502
+ - **Dialogs:** Register \`dialog\` before the action that triggers it. Prompt dialogs also need \`promptText\`.
503
+ - **Labels:** Assign a \`label\` on open for long-running flows.
504
+ - **Batch:** Use \`batch\` to reduce round-trips, but only when \`pageId\` is already known.
505
+ - **Diff:** First call establishes baseline, second call shows the delta.
506
+ - **Security:** Never send passwords via \`act({ kind: 'type' })\`. Ask user to type credentials manually. Never use \`file:///\` URLs.
563
507
  `}];export{e as default};